Abstract¶
Reproducibility is a significant challenge in neuroscience, as analysis and visualization methods are often difficult to replicate due to a lack of accessible code, separation of code from published figures, or unavailability of code altogether. This issue may arise from the complex nature of neuroscience research, the use of diverse data formats and analysis techniques, and insufficient emphasis on open-source, collaborative practices. In addition, key neuroscience analyses are typically rewritten at the start of new scientific projects, slowing down the initiation of research efforts.
Four key components are essential for reproducible analysis: accessible data, accessible computational resources, a reproducible environment, and usage documentation. The OpenScope Databook, provided by the Allen Institute’s OpenScope Project, offers a solution to these challenges by facilitating the analysis and visualization of brain data, primarily using NWB files and the DANDI Archive. Hosted on GitHub, the entire publication – including code, data access, text, references, and revisions from reviewers and contributors – is readily available for collaboration and version control, promoting transparency and collective knowledge growth. The OpenScope Databook addresses these components by leveraging a combination of open-source Python libraries, such as DANDI, Binder, Jupyter Book, Google Colab, LaTeX references, Python scripts, Git versioning, and scientific revision through approved pull requests. The entire publication can be recreated by running the code locally, on distributed servers such as Binder, DandiHub, or Google Colab, or on any host running Jupyter notebooks.
We cover several broadly used analyses across the community, providing a missing component for system neuroscience. Our key analyses are organized into chapters, including NWB basics such as downloading, streaming, and visualizing NWB files from data archives. We document essential analyses typically performed in all neuroscience laboratories, such as temporal alignment, alignment to sensory stimuli, and association with experimental metadata. We cover the two leading neuronal recording techniques: two-photon calcium imaging and electrophysiological recordings, and share example analyses of stimulus-averaged responses. Advanced first-order analyses include showing receptive fields, identifying optotagged units, current source density analysis, and cell matching across days.
This resource is actively maintained and can be updated by the community, providing a living document that will grow over time.
The OpenScope Databook: Reproducible System Neuroscience Notebooks to Facilitate Data Sharing and Collaborative Reuse with Open Science Datasets¶
Open In¶
Run any notebook in the cloud using one of the platforms below:
See How Can I Use It? for details on each platform.
How Can I Use It?¶
There are three ways to run this code: Locally, with Binder, or with Dandihub. Binder is launchable directly from any notebook page using the Launch rocket button in the top-right of the Databook.

Locally¶
You can download an individual notebook by pressing the Download button in the top-right and selecting .ipynb. Alternatively, you can clone the repo to your machine and access the files there. The repo can be found by hovering over the GitHub button in the top-right and selecting repository.
Locally (Preferred: uv)¶
Use Python 3.10. Then install and sync with:
python -m pip install uv
uv sync --frozen --extra dev --python 3.10Run notebooks/tools through uv (for example):
uv run jupyter notebook ./docsLocally (Pip-only fallback)¶
If you prefer not to use uv, create a Python 3.10 virtual environment and install from exported requirements:
python -m pip install -r requirements-ci.txt
python -m pip install -e .Locally (Conda)¶
If you prefer conda for environment isolation, create a Python 3.10 conda environment first, then follow either the uv or pip-only installation steps above. For information on installing and using conda, go here. You can create a conda environment with:
conda create -n databook_env python=3.10and you can run that environment with
conda activate databook_envLocally (Docker)¶
The Databook also includes a dockerfile. If you want to build a docker container for the Databook yourself (for some reason), you can do so by running the following command in the Databook main directory after you have docker installed and running
docker build -t openscope_databookYou can then activate the docker by running the following command. Note that, to access the databook in your host machine’s web browser, the port 8888 should be mapped to the docker container’s port.
docker run -p 8888:8888 openscope_databookInstead of building the container yourself, you can use the main docker container that we maintain, registered publicly on Docker hub with the following command
docker run -p 8888:8888 rcpeene/openscope_databook:latestLocally (Running Notebook)¶
Once your environment is setup, you can execute the notebooks in Jupyter by running the following command within the repo directory:
Jupyter notebookBinder¶
Binder will automatically set up the environment with repo2docker and then execute the code in an instance of JupyterHub where the kernel is run. JupyterHub offers a lot of utilities for interacting with Jupyter notebooks and the environment. A given notebook can be launched in Binder by hovering over the Launch button in the top-right and selecting Binder. Occasionally, Binder will have to rebuild the environment before starting JupyterLab, which can take many minutes.
Dandihub¶
Dandihub is an instance of JupyterHub hosted by the DANDI Archive. To use it, navigate to https://hub.dandiarchive.org in your browser and sign in with your GitHub account. From there you will need to set up the environment yourself — there is no pre-built OpenScope image available. Follow the Locally (uv) or Locally (Pip-only fallback) instructions above after launching your server. Once the environment is ready, navigate to the openscope_databook/docs directory which contains the OpenScope notebooks.
How Does It Work?¶
Reproducible Analysis requires four components;
Accessible Data
Accessible Computational Resources
Reproducible Environment
Documentation of Usage
The Databook leverages a number of technologies to combine those components into a web-application.
Data¶
Data is accessed from the DANDI Archive and downloaded via the DANDI Python API within notebooks. Most notebooks make use of publicly available datasets on DANDI, but for some notebooks, there is not yet sufficient publicly-available data to demonstrate our analysis. For these, it is encouraged to use your own NWB Files that are privately stored on DANDI.
Computation¶
This project utilizes Binder, as the host for the environment and the provider of computational resources. Conveniently, Binder has support for effectively replicating a computational environment from a GitHub Repo. Users of the Databook don’t have to worry about managing the environment if they prefer to use our integrated Binder functionality. However, the Databook can be run locally or on other hosts. Details about the different ways to run this code can be found in the section How Can I Use It? below.
Environment¶
As mentioned above, Binder is capable of reproducing the environment in which to run this code. There are also other options for creating the necessary environment. Instructions for running this code locally can be found in the section How Can I Use It? below.
Documentation¶
The great part about this Databook is that the usage of the code is explained within each notebook. The instructions found here should be sufficient for utilizing our code and accurately reproducing a number of different analyses on the relevant data.
Statement of Support¶
We are releasing this code to the public as a tool we expect others to use. We are actively updating and maintaining this project. Issue submissions here are encouraged. Questions can be directed to @rcpeene or @jeromelecoq. We are open to hearing input from users about what types of analysis and visualization might be useful for reproducible neuroscience, particularly when working with the NWB standard.
