Allen Brain Cell Atlas - Data Access#

The Allen Brain Cell Atlas (ABC Atlas) aims to empower researchers worldwide to explore and analyze multiple whole-brain datasets simultaneously. As the Allen Institute and its collaborators continue to add new modalities, species, and insights to the ABC Atlas, this groundbreaking platform will keep growing, opening up endless possibilities for groundbreaking discoveries and breakthroughs in neuroscience. With the ABC Atlas, researchers everywhere can gain new insights into the brain’s complex workings, advancing our understanding of this amazing organ in ways we never thought possible. The ABC Atlas can be accessed at here and is the primary method of interacting with these data. Any questions or issues associated with the ABC Atlas are best directed to the Allen Institute Community Forum. This repository is intended for users who wish to download the ABC Atlas data and processes it locally.

Data associated with the ABC Atlas is hosted on Amazon Web Services (AWS) in an S3 bucket as a AWS Public Dataset, arn:aws:s3:::allen-brain-cell-atlas. No account or login is required for access. The purpose of this repo is to provide an overview of the available data, how to download and use it through example use cases.

The Winter 2024 public beta data release includes:

We provide a lightweight python object, the AbcProjectCache, to handle downloading of the data and managing different release versions for the user. See the Getting Started notebook for more details and examples on how to use the AbcProjectCache.

Expression matrices are stored in the anndata h5ad format and need to be downloaded to a local file system for usage. To make data transfer, download and access more efficient, the 10x transcriptomics datasets have been subdivided into smaller packages grouped by method and anatomical origin. The notebooks provide example code on how to access data across these individual files using the AbcProjectCache.

If you have used the ABC Atlas visualization to select cells from a dataset and with to use your selected cells in an analysis with the data available for download here, you find a tutorial to do so in the Using cells selected in the ABC Atlas notebook.

Available notebooks:#

Release Notes#

  • [Aging Mouse data (version 20241130), abc_atlas_access (v0.4.0)]

    • Add the Jin et al., aging mouse dataset.

      • 1.2 million cells divided into adult and aged mouse cells.

      • 847 cluster taxonomy with identified associations into the supertype and above levels of the Whole Mouse Brain taxonomy.

      • 2,449 age differential expression genes.

    • Add aging mouse to create gene expression script.

    • Update WMB cluster annotations to the latest version.

    • Separate dependencies for notebooks and base, cache package.

  • [ABC Sample ID update (version 20241115) abc_atlas_access (v0.3.0)]

    • Add abc_sample_id to cell_metadatas of the projects listed below. This allows users to use cell selections they’ve downloaded from the ABC Atlas visualization.

      • MERFISH-C57BL6J-638850(-imputed)

      • WHB-10Xv3

      • WMB-10X

      • Zhuang-ABCA-(1-4)

    • Added notebook showing how to use abc_sample_id.

  • [Summer 2024 Public Beta (version 20240831) abc_atlas_access (v0.2.0)]

    • Released ~8k imputed genes for the MERFISH-C57BL6J-638850 dataset.

    • Added new notebooks and pages for the imputed gene dataset.

    • Updated cache object for better compatibility across platforms.

      • Added automatic unittesting via GitHub Actions.

      • Modified cache to autodetect cache type (local or s3) for easier compatibility with CodeOcean/s3fs-fuse mounts.

      • Simplified notebooks’ use of the cache.

  • [abc_atlas_access (v0.1.2)]

    • Add matplotlib magic.

    • Fixed bugs in pathlib import in notebooks

    • Added script for selecting genes from expression matrices for use on CodeOcean.

  • [abc_atlas_access (v0.1.1)]

    • Fixed compatibility issue with read only local caches specifically for fuse style file mounts such as those used on CodeOcean.

    • Added local cache example to notebooks.

  • [abc_atlas_access (v0.1.0)]

    • Fixed issue with loading 10X mouse data with the get_gene_data function.

  • [Spring 2024 Public Beta (version 20240330)]

    • Added Whole Human Brain transcriptomic, taxonomy and clustering datasets from Siletti et al. 2023 to the ABC Atlas.

    • Added new AbcProjectCache python class to facilitate data download and usage.

      • Added file hashes to the manifest.json to facilitate data integrity checks in the cache class.

    • Updated all jupyter notebooks to use the new AbcProjectCache class.

    • Added new notebooks for the whole human brain datasets.

    • Re-organized jupyter-book webpages.