Streaming an NWB File with remfile

As you might have realized, NWB files are large. They take a long time to download and a lot of space on your drive. A convenient tool to mitigate this is remfile, which lets you stream data from a remote file without downloading it. Streaming can be more efficient if you only want to examine a file quickly or need just a portion of its contents. For more extensive analysis, downloading the file is still recommended.

Environment Setup

⚠️Note: If running on a new environment, run this cell once and then restart the kernel⚠️

import warnings
warnings.filterwarnings('ignore')

try:
    from databook_utils.dandi_utils import dandi_stream_open
except ImportError:
    # databook_utils is not installed yet: clone the Databook repo and install it
    !git clone https://github.com/AllenInstitute/openscope_databook.git
    %cd openscope_databook
    %pip install -e .
import remfile
import h5py

from dandi import dandiapi
from pynwb import NWBHDF5IO

Streaming Configuration

Here you can configure the stream. Browse the DANDI Archive for a dandiset you’re interested in and set dandiset_id to its ID. Set dandi_filepath to the path of the file you want to stream within the dandiset. To find it, navigate to the file on the DANDI Archive website and click the i icon, then copy the value of the field labeled path. Don’t include a leading /.

If you’re accessing an embargoed dandiset, set authenticate to True and set dandi_api_key to your DANDI API key, which you can find by clicking your profile icon in the top-right corner of the DANDI Archive website.

dandiset_id = "000871"
dandi_filepath = "sub-644972/sub-644972_ses-1237081845-acq-1237345890-denoised-movies_image+ophys.nwb"
authenticate = False
dandi_api_key = ""
if authenticate:
    client = dandiapi.DandiAPIClient(token=dandi_api_key)
else:
    client = dandiapi.DandiAPIClient()
my_dandiset = client.get_dandiset(dandiset_id)

print(f"Got dandiset {my_dandiset}")
A newer version (0.75.1) of dandi/dandi-cli is available. You are using 0.74.3
Got dandiset DANDI:000871/draft
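
Pasting an API key into a notebook makes it easy to share the key by accident. As a minimal sketch, the authenticate and dandi_api_key values set in the configuration above could instead be derived from an environment variable. The helper name resolve_dandi_credentials is hypothetical, and the variable name DANDI_API_KEY is an assumption made for this example; use whatever name suits your setup.

```python
import os

def resolve_dandi_credentials(env=None):
    """Return (authenticate, dandi_api_key) from an environment mapping.

    Hypothetical helper: the variable name DANDI_API_KEY is an assumption
    made for this sketch, not something this notebook requires.
    """
    env = os.environ if env is None else env
    key = env.get("DANDI_API_KEY", "").strip()
    # Only authenticate when a non-empty key is actually present
    return bool(key), key

authenticate, dandi_api_key = resolve_dandi_credentials()
print(f"authenticate = {authenticate}")
```

With no key set, this falls back to the unauthenticated client, which is all that public dandisets need.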
file = my_dandiset.get_asset_by_path(dandi_filepath)
base_url = file.client.session.head(file.base_download_url)
file_url = base_url.headers['Location']

print(f"Retrieved file url {file_url}")
Retrieved file url https://dandiarchive.s3.amazonaws.com/blobs/fe1/358/fe135898-cfa7-4243-b927-e6964c31afee?response-content-disposition=attachment%3B%20filename%3D%22sub-644972_ses-1237081845-acq-1237345890-denoised-movies_image%2Bophys.nwb%22&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAUBRWC5GAEKH3223E%2F20260501%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20260501T230301Z&X-Amz-Expires=21600&X-Amz-SignedHeaders=host&X-Amz-Signature=5cf50ba725c64434491b1a6bd496619f867ce0a034c39b938301306c9e5dd817
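
The retrieved URL is a presigned S3 link, so it only works for a limited time. Its X-Amz-Date and X-Amz-Expires query parameters give the signing time and the lifetime in seconds (21600 above, i.e. 6 hours). The following sketch computes the expiry time, assuming the URL follows the standard AWS Signature Version 4 query format. The function name presigned_url_expiry and the shortened example URL are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse, parse_qs

def presigned_url_expiry(url):
    """Return the UTC time at which a presigned URL expires, or None if unsigned."""
    query = parse_qs(urlparse(url).query)
    if "X-Amz-Date" not in query or "X-Amz-Expires" not in query:
        return None
    # X-Amz-Date is a UTC timestamp such as 20260501T230301Z
    signed_at = datetime.strptime(query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
    signed_at = signed_at.replace(tzinfo=timezone.utc)
    # X-Amz-Expires is the lifetime of the signature in seconds
    lifetime = timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return signed_at + lifetime

# Shortened, illustrative URL reusing the two parameters from the output above
example_url = (
    "https://example.com/blobs/abc"
    "?X-Amz-Date=20260501T230301Z&X-Amz-Expires=21600"
)
print(presigned_url_expiry(example_url))  # 2026-05-02 05:03:01+00:00
```

If a long-running session starts failing to read data, an expired URL is one possible cause; rerunning the cell that retrieves file_url produces a fresh one.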

Streaming a File

Streaming with remfile is as easy as creating a remote file object from the URL and then opening it through the h5py and pynwb libraries.

rem_file = remfile.File(file_url)
h5py_file = h5py.File(rem_file, "r")
io = NWBHDF5IO(file=h5py_file, mode="r", load_namespaces=True)
nwb = io.read()
nwb.processing
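
The cell above leaves the h5py file and the IO object open for the rest of the session. For short, scripted reads it can be tidier to scope the stream to a with block. The sketch below wraps the same three calls in a context manager; the name streamed_nwb is hypothetical, and the third-party imports sit inside the function so that defining it does not require a network connection.

```python
from contextlib import contextmanager

@contextmanager
def streamed_nwb(file_url):
    """Yield an NWBFile streamed from file_url, closing the handles afterwards."""
    # Imported lazily so that this definition alone has no side effects
    import h5py
    import remfile
    from pynwb import NWBHDF5IO

    rem_file = remfile.File(file_url)
    with h5py.File(rem_file, "r") as h5py_file:
        with NWBHDF5IO(file=h5py_file, mode="r", load_namespaces=True) as io:
            yield io.read()

# Usage (requires network access and a valid file_url):
# with streamed_nwb(file_url) as nwb:
#     print(nwb.identifier)
```

Because data is read lazily, anything you need from the file should be read inside the with block; once the block exits, the underlying handles are closed.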

Interacting with a Remote File

Once the file has been opened remotely, you can explore it with print statements, or view the whole file as shown in Exploring an NWB File.

### uncomment these to view aspects of the file
### not all of these exist for all NWB files (a KeyError is raised if a key doesn't exist in this file)

# nwb.identifier
# nwb.processing
# nwb.acquisition["events"]
# nwb.intervals["trials"]
# nwb.stimulus["StimulusPresentation"]
# nwb.electrodes
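
Since not every file contains every field, it can help to probe for a field before using it. The helper below is a hypothetical sketch (get_field is not part of pynwb) that returns None instead of raising when a group or key is missing. It is demonstrated on a stand-in object rather than a real NWB file, on the assumption that groups such as nwb.acquisition behave like dictionaries.

```python
from types import SimpleNamespace

def get_field(nwb, group, key=None):
    """Return nwb.<group>[key], or None if the group or key is missing."""
    container = getattr(nwb, group, None)
    if container is None:
        return None
    if key is None:
        # No key requested: return the group itself (e.g. nwb.electrodes)
        return container
    return container[key] if key in container else None

# Stand-in object used only for this demonstration, not a real NWB file
fake_nwb = SimpleNamespace(
    identifier="demo",
    acquisition={"events": [1, 2, 3]},
    electrodes=None,
)
print(get_field(fake_nwb, "acquisition", "events"))  # [1, 2, 3]
print(get_field(fake_nwb, "intervals", "trials"))    # None
```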

Using Databook Utils Function

Throughout the remainder of the OpenScope Databook, whenever a file is streamed we reuse this code in the form of a local package, databook_utils. To retrieve an NWB file, you can use the function dandi_stream_open, imported as shown at the top of this notebook.

io = dandi_stream_open(dandiset_id, dandi_filepath)
nwb = io.read()
nwb