The Allen Cell Types data set is a database of neuronal cell types based on multimodal characterization of single cells. It is designed to enable data-driven approaches to classification and is fully integrated with other Allen Brain Atlas resources. The database currently includes:
- electrophysiology: whole cell current clamp recordings made from Cre-positive neurons
- morphology: 3D bright-field images of the complete structure of neurons from the visual cortex
The Cell Types Jupyter notebook has many code samples to help you get started with analysis.
Cell Types API¶
The CellTypesApi class provides a Python interface for downloading data
from the Allen Cell Types Database. The following example demonstrates how to download metadata for
all cells with 3D reconstructions, then download the reconstruction and electrophysiology recordings
for one of those cells:
    from allensdk.api.queries.cell_types_api import CellTypesApi

    ct = CellTypesApi()

    # a list of dictionaries containing metadata for cells with reconstructions
    cells = ct.list_cells(require_reconstruction=True)

    # download the electrophysiology data for one cell
    ct.save_ephys_data(cells[0]['id'], 'example.nwb')

    # download the reconstruction for the same cell
    ct.save_reconstruction(cells[0]['id'], 'example.swc')
Cell Types Cache¶
The CellTypesCache class saves all of the data you can download via the
CellTypesApi in well-known locations so that you don’t have to think
about file names and directories. It also keeps track of which files you have already downloaded and reads
them from disk instead of downloading them again. The following example demonstrates how to download metadata for
all cells with 3D reconstructions, then open the electrophysiology data and the reconstruction for one of those cells:
    from allensdk.core.cell_types_cache import CellTypesCache

    ctc = CellTypesCache()

    # a list of cell metadata for cells with reconstructions, download if necessary
    cells = ctc.get_cells(require_reconstruction=True)

    # open the electrophysiology data of one cell, download if necessary
    data_set = ctc.get_ephys_data(cells[0]['id'])

    # read the reconstruction, download if necessary
    reconstruction = ctc.get_reconstruction(cells[0]['id'])
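The download-if-missing behavior described above can be illustrated with a small, generic sketch. This is not how CellTypesCache is implemented; `fetch_if_missing` and the `download` callable are hypothetical names, used only to show the pattern of checking the disk before fetching.

```python
import os
import tempfile


def fetch_if_missing(path, download):
    """Return the bytes stored at `path`, calling `download()` only if the file is absent.

    `download` is a hypothetical zero-argument callable that returns bytes;
    it stands in for a real network request.
    """
    if not os.path.exists(path):
        data = download()
        with open(path, "wb") as f:
            f.write(data)
    with open(path, "rb") as f:
        return f.read()


# Demonstration with a fake downloader that records how often it is called.
calls = []


def fake_download():
    calls.append(1)
    return b"example payload"


cache_path = os.path.join(tempfile.mkdtemp(), "example.dat")

first = fetch_if_missing(cache_path, fake_download)   # triggers the download
second = fetch_if_missing(cache_path, fake_download)  # served from disk
print(len(calls))
```

The second call never reaches the downloader because the file already exists, which is the same trade the cache makes: one well-known location per file in exchange for not repeating downloads.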
The EphysFeatureExtractor class calculates electrophysiology
features from cell recordings.
The extract_cell_features function can be used to extract the precise feature values available in the Cell Types Database:
    from allensdk.api.queries.cell_types_api import CellTypesApi
    from allensdk.ephys.extract_cell_features import extract_cell_features
    from collections import defaultdict
    from allensdk.core.nwb_data_set import NwbDataSet

    # pick a cell to analyze
    specimen_id = 324257146
    nwb_file = 'ephys.nwb'

    # download the ephys data and sweep metadata
    cta = CellTypesApi()
    sweeps = cta.get_ephys_sweeps(specimen_id)
    cta.save_ephys_data(specimen_id, nwb_file)

    # group the sweeps by stimulus
    sweep_numbers = defaultdict(list)
    for sweep in sweeps:
        sweep_numbers[sweep['stimulus_name']].append(sweep['sweep_number'])

    # calculate features
    cell_features = extract_cell_features(NwbDataSet(nwb_file),
                                          sweep_numbers['Ramp'],
                                          sweep_numbers['Short Square'],
                                          sweep_numbers['Long Square'])
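The grouping step in the example above can be tried without downloading anything. The sketch below runs the same `defaultdict` pattern on made-up sweep metadata; the helper name `group_sweep_numbers` and the sample records are illustrative assumptions, not SDK output.

```python
from collections import defaultdict


def group_sweep_numbers(sweeps):
    """Map each stimulus name to the list of sweep numbers that used it."""
    grouped = defaultdict(list)
    for sweep in sweeps:
        grouped[sweep["stimulus_name"]].append(sweep["sweep_number"])
    return grouped


# Made-up sweep metadata, for illustration only.
example_sweeps = [
    {"sweep_number": 4, "stimulus_name": "Ramp"},
    {"sweep_number": 7, "stimulus_name": "Long Square"},
    {"sweep_number": 9, "stimulus_name": "Long Square"},
    {"sweep_number": 12, "stimulus_name": "Short Square"},
]

sweep_numbers = group_sweep_numbers(example_sweeps)
print(sweep_numbers["Long Square"])  # [7, 9]
print(sweep_numbers["Noise"])        # [] (a missing stimulus yields an empty list)
```

Using `defaultdict(list)` means a stimulus type that a cell never received simply produces an empty list instead of raising a `KeyError`.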
This section provides a short description of the file formats used for Allen Cell Types data.
Morphology SWC Files¶
Morphological neuron reconstructions are available for download as SWC files. An SWC file is a whitespace-delimited text file with a standard set of columns. The file lists a set of 3D neuronal compartments, each of which has:
| Column | Data Type | Description |
| id | string | compartment ID |
| type | integer | compartment type |
| x | float | 3D compartment position (x) |
| y | float | 3D compartment position (y) |
| z | float | 3D compartment position (z) |
| radius | float | compartment radius |
| parent | string | parent compartment ID |
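Comment lines begin with a ‘#’. Reconstructions in the Allen Cell Types Database can contain the following compartment types:
| Type | Description |
| 0 | unknown |
| 1 | soma |
| 2 | axon |
| 3 | basal dendrite |
| 4 | apical dendrite |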
The Allen SDK comes with a
swc Python module that provides helper functions and classes for manipulating SWC files. Consider the following example:
    import allensdk.core.swc as swc

    file_name = 'example.swc'
    morphology = swc.read_swc(file_name)

    # subsample the morphology 3x. root, soma, junctions, and the first child of the root are preserved.
    sparse_morphology = morphology.sparsify(3)

    # compartments in the order that they were specified in the file
    compartment_list = sparse_morphology.compartment_list

    # a dictionary of compartments indexed by compartment id
    compartments_by_id = sparse_morphology.compartment_index

    # the root soma compartment
    soma = morphology.soma

    # all compartments are dictionaries of compartment properties
    # compartments also keep track of ids of their children
    for child in morphology.children_of(soma):
        print(child['x'], child['y'], child['z'], child['radius'])
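To see what the SWC text layout looks like without the SDK, the following is a minimal hand-written parser. It is a sketch rather than the `swc` module's implementation: it assumes the standard SWC column order (id, type, x, y, z, radius, parent), stores IDs as integers for simplicity, and relies on the SWC convention that a parent of -1 marks the root. The function name `parse_swc` and the sample morphology are made up for illustration.

```python
def parse_swc(text):
    """Parse SWC text into a dict of compartment dicts keyed by integer ID."""
    compartments = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cid, ctype, x, y, z, radius, parent = line.split()
        compartments[int(cid)] = {
            "id": int(cid),
            "type": int(ctype),
            "x": float(x),
            "y": float(y),
            "z": float(z),
            "radius": float(radius),
            "parent": int(parent),
            "children": [],
        }
    # Link each compartment to its parent; the root's parent (-1) is not in the dict.
    for comp in compartments.values():
        parent = compartments.get(comp["parent"])
        if parent is not None:
            parent["children"].append(comp["id"])
    return compartments


# A made-up three-compartment morphology, for illustration only.
example_swc = """\
# id type x y z radius parent
1 1 0.0 0.0 0.0 5.0 -1
2 3 4.0 0.0 0.0 0.5 1
3 3 8.0 1.0 0.0 0.4 2
"""

compartments = parse_swc(example_swc)
root = next(c for c in compartments.values() if c["parent"] == -1)
print(root["id"], root["children"])
```

For real reconstructions the SDK's own reader is the better choice, since it also provides helpers such as `sparsify` and `children_of`.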
Neurodata Without Borders¶
The electrophysiology data collected in the Allen Cell Types Database is stored in the Neurodata Without Borders (NWB) file format. This format, created as part of the NWB initiative, is designed to store a variety of neurophysiology data, including data from intra- and extracellular electrophysiology experiments, optophysiology experiments, as well as tracking and stimulus data. It has a defined schema and metadata labeling system designed so software tools can easily access contained data.
The Allen SDK provides a basic Python class for extracting data from Allen Cell Types Database NWB files. These files store data from intracellular patch-clamp recordings. A stimulus current is presented to the cell and the cell’s voltage response is recorded. The file stores both stimulus and response for several experimental trials, here called “sweeps.” The following code snippet demonstrates how to extract a sweep’s stimulus, response, sampling rate, and estimated spike times:
    from allensdk.core.nwb_data_set import NwbDataSet

    file_name = 'example.nwb'
    data_set = NwbDataSet(file_name)

    # pick the first sweep in the file
    sweep_numbers = data_set.get_sweep_numbers()
    sweep_number = sweep_numbers[0]
    sweep_data = data_set.get_sweep(sweep_number)

    # spike times are in seconds relative to the start of the sweep
    spike_times = data_set.get_spike_times(sweep_number)

    # stimulus is a numpy array in amps
    stimulus = sweep_data['stimulus']

    # response is a numpy array in volts
    response = sweep_data['response']

    # sampling rate is in Hz
    sampling_rate = sweep_data['sampling_rate']

    # start/stop indices that exclude the experimental test pulse (if applicable)
    index_range = sweep_data['index_range']
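Because spike times are given in seconds and the traces are sample arrays, a common follow-up step is to relate the two through the sampling rate. The sketch below does this with plain Python lists and made-up numbers; the helper names `sweep_time_axis` and `spike_times_to_indices` are hypothetical and are not part of the SDK.

```python
def sweep_time_axis(n_samples, sampling_rate):
    """Return the time of each sample in seconds, relative to the start of the sweep."""
    return [i / sampling_rate for i in range(n_samples)]


def spike_times_to_indices(spike_times, sampling_rate):
    """Convert spike times in seconds to the nearest sample indices."""
    return [int(round(t * sampling_rate)) for t in spike_times]


# Made-up values standing in for one sweep, for illustration only.
sampling_rate = 1000.0                             # Hz
response = [-0.070, -0.069, 0.020, -0.068, 0.015]  # volts
spike_times = [0.002, 0.004]                       # seconds

times = sweep_time_axis(len(response), sampling_rate)
spike_indices = spike_times_to_indices(spike_times, sampling_rate)
response_mv = [v * 1e3 for v in response]          # volts -> millivolts

# Print the time and membrane potential at each spike.
for i in spike_indices:
    print(times[i], response_mv[i])
```

With real sweeps the same arithmetic applies to the numpy arrays returned by `get_sweep`, where it can be vectorized instead of written as list comprehensions.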
NWB is implemented in HDF5. HDF5 files provide hierarchical data storage that mirrors the organization of a file system. Just as a file system has directories and files, an HDF5 file has groups and datasets. The best way to understand an HDF5 (and NWB) file is to open a data file in an HDF5 browser. HDFView is the recommended browser from the makers of HDF5.
There are HDF5 manipulation libraries for many languages and platforms. MATLAB and Python in particular have strong HDF5 support.
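As an illustration of the group/dataset hierarchy in Python, the sketch below uses the h5py library (an assumption: it must be installed separately) to build a tiny HDF5 file and then walk it. The group and dataset names are made up and do not reflect the actual NWB layout.

```python
import os
import tempfile

import h5py

# Build a tiny HDF5 file with made-up contents so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "toy.h5")
with h5py.File(path, "w") as f:
    sweep = f.create_group("recordings/sweep_1")  # intermediate groups are created too
    sweep.create_dataset("response", data=[0.0, 0.1, 0.2])
    sweep.create_dataset("stimulus", data=[0.0, 1.0, 0.0])

# Walk the hierarchy: groups behave like directories, datasets like files.
entries = []


def record(name, obj):
    kind = "dataset" if isinstance(obj, h5py.Dataset) else "group"
    entries.append((name, kind))


with h5py.File(path, "r") as f:
    f.visititems(record)

for name, kind in entries:
    print(kind, name)
```

Pointing the same walk at a downloaded NWB file is a quick way to list its contents when a graphical browser such as HDFView is not at hand.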