Cell Matching Across Days#
Experimental sessions are conducted across multiple days on the same mouse. When imaging the brain location on the same plane with 2-Photon Imaging, the image will not be perfectly identical. The frame may be in a slightly different location on the brain or at a slightly different depth. Additionally, cells can move relative to each other over time. Due to this, computational methods are required to identify which cell Regions of Interest (ROIs) are the same as the previous session.
We use the Allen Institute’s Nway Matching pipeline for this which is described in more detail below.
Environment Setup#
⚠️Note: If running on a new environment, run this cell once and then restart the kernel⚠️
import warnings
warnings.filterwarnings('ignore')
try:
from databook_utils.dandi_utils import dandi_download_open
except:
!git clone https://github.com/AllenInstitute/openscope_databook.git
%cd openscope_databook
%pip install -e .
import json
import os
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from time import sleep
Downloading Ophys NWB Files#
Here you can download several files for a subject that you can run nway matching on. The pipeline can take in any number of input sessions, however, the input ophys data should be from the same imaging plane of the same subject. To specify your own file to use, set dandiset_id
to be the dandiset id of the files you’re interested in. Also set input_dandi_filepaths
to be a list of the filepaths on dandi of each file you’re interested in providing to Nway Matching. When accessing an embargoed dataset, set dandi_api_key
to be your DANDI API key.
dandiset_id = "000336"
dandi_filepath_1 = "sub-621602/sub-621602_ses-1193555033-acq-1193675753_ophys.nwb"
dandi_filepath_2 = "sub-621602/sub-621602_ses-1194555869-acq-1194754135_ophys.nwb"
download_loc = "."
# here, specify the list dandi filepaths of the files you want to run cell matching on
# the subject ids should be the same, but the session ids should be different
input_dandi_filepaths = [dandi_filepath_1, dandi_filepath_2]
dandi_api_key = os.environ["DANDI_API_KEY"]
ios = [dandi_download_open(dandiset_id, filepath, download_loc, dandi_api_key=dandi_api_key) for filepath in input_dandi_filepaths]
nwbs = [io.read() for io in ios]
File already exists
Opening file
File already exists
Opening file
# these should be similar values, indicating that the images are from the same imaging plane
for nwb in nwbs:
print(nwb.lab_meta_data["metadata"].imaging_depth)
309
309
Generating Cell-Matching Input File#
The Allen Institute has created a robust pipeline for conducting cell matching between 2 or more sessions called Nway Matching. For those who are inquisitive, the general algorithm is described in page 31 of [Garrett et al., 2023]. In summary, the input images are aligned together with a euclidean image registration technique. Then the given input ROI masks are used with the Blossom Algorithm as a bipartite graph matching algorithm. This identifies which ROIs represent the same cells with high probability.
The caveat to using this pipeline is that it requires input in a specific JSON format rather than an NWB File. The following functions are used to generate a file, input.json
, which can be used to obtain important output. The input json contains an entry for each input experiment. 2 or more experiments can be used. The first experiment is called fixed and the other ones are moving. For each experiment, a unique experiment ID, a path to the experiment projection image, and a list of objects for each ROI are required.
### for each ROI in an ROI table, create a JSON object containing its information, return a list of JSON objects
def make_roi_entries(roi_table):
roi_entries = []
for roi_idx in range(len(roi_table)):
roi_entry = {
"id": int(roi_table["id"][roi_idx]),
"valid": True,
"x": 0,
"y": 0,
"z": 1,
"width": 512,
"height": 512,
# convert 2D array to 2D list of bools
"mask_matrix": [[bool(elem) for elem in row] for row in roi_table["image_mask"][roi_idx]]
}
roi_entries.append(roi_entry)
return roi_entries
### take in an image array and return it converted to uint16
def normalize_to_16b(image_array):
return image_array.astype(np.uint16)
### convert an nwb object into a 'experiment' JSON object with the required format
def nwb_to_experiment_json(nwb):
# save image here
grayscale_image = np.array(nwb.processing["ophys"]["images"]["average_image"])
image_array = normalize_to_16b(np.array(grayscale_image))
image_file = Image.fromarray(image_array)
image_file = image_file.convert("L")
image_file.save(f"./{nwb.identifier}_image.png")
roi_table = nwb.processing["ophys"]["image_segmentation"]["cell_specimen_table"]
experiment = {
"id": nwb.identifier,
"ophys_average_intensity_projection_image": f"./{nwb.identifier}_image.png",
"cell_rois": make_roi_entries(roi_table)
}
return experiment
### make the input json object, populate its experiments field with an experiment object for each nwb object
input_exps = {
"experiment_containers": {
"ophys_experiments": []
}
}
for nwb in nwbs:
experiment = nwb_to_experiment_json(nwb)
input_exps["experiment_containers"]["ophys_experiments"].append(experiment)
with open("input.json", "w") as f:
json.dump(input_exps, f, indent=2)
Running the Algorithm#
Nway Matching can be run as a command with the following parameters. From there, several output files are generated.
!python -m nway.nway_matching --input_json input.json --output_json "./output.json" --output_dir matching_output
c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\scipy\__init__.py:169: UserWarning: A NumPy version >=1.18.5 and <1.26.0 is required for this version of SciPy (detected version 1.26.4
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
WARNING:root:many=True not supported from argparse
INFO:NwayMatching:NWAY_COMMIT_SHA None
INFO:NwayMatching:Nway matching version 0.6.0
c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\scipy\__init__.py:169: UserWarning: A NumPy version >=1.18.5 and <1.26.0 is required for this version of SciPy (detected version 1.26.4
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\scipy\__init__.py:169: UserWarning: A NumPy version >=1.18.5 and <1.26.0 is required for this version of SciPy (detected version 1.26.4
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\scipy\__init__.py:169: UserWarning: A NumPy version >=1.18.5 and <1.26.0 is required for this version of SciPy (detected version 1.26.4
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
WARNING:root:many=True not supported from argparse
WARNING:root:many=True not supported from argparse
INFO:PairwiseMatching:Matching 1193675753 to 1194754135
INFO:PairwiseMatching:Matching 1193675753 to 1194754135: best registration was ['Crop', 'CLAHE', 'PhaseCorrelate']
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\carter.peene\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "C:\Users\carter.peene\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 48, in mapstar
return list(map(*args))
File "c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\nway\nway_matching.py", line 121, in pair_match_job
pair_match.run()
File "c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\nway\pairwise_matching.py", line 495, in run
segmask_moving_3d_registered = transform_mask(
File "c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\nway\pairwise_matching.py", line 384, in transform_mask
dtype=np.int)
File "c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\numpy\__init__.py", line 338, in __getattr__
raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'int'.
`np.int` was a deprecated alias for the builtin `int`. To avoid this error in existing code, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\carter.peene\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\carter.peene\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\nway\nway_matching.py", line 502, in <module>
nmod.run()
File "c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\nway\nway_matching.py", line 462, in run
self.pair_matches = pool.map(pair_match_job, pair_arg_list)
File "C:\Users\carter.peene\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 367, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "C:\Users\carter.peene\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 774, in get
raise self._value
AttributeError: module 'numpy' has no attribute 'int'.
`np.int` was a deprecated alias for the builtin `int`. To avoid this error in existing code, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
Showing Cell Matching Output#
The cell matching pipeline outputs the data as a output JSON, and a few plots as .png
images. The paths to the plots are fetched from the output JSON and displayed below. Additionally, the output json contains a list of matching ROI IDs from each image.
with open("output.json", "r") as f:
output = json.load(f)
summary_plot = Image.open(output["nway_warp_summary_plot"])
fig, ax = plt.subplots(figsize=(10,10))
ax.imshow(summary_plot)
ax.axis("off")
(-0.5, 560.5, 2951.5, -0.5)
overlay_plot = Image.open(output["nway_warp_overlay_plot"])
fig, ax = plt.subplots(figsize=(10,10))
ax.axis("off")
ax.imshow(overlay_plot)
<matplotlib.image.AxesImage at 0x1c3b53e35b0>
fraction_plot = Image.open(output["nway_match_fraction_plot"])
fig, ax = plt.subplots(figsize=(10,10))
ax.axis("off")
ax.imshow(fraction_plot)
<matplotlib.image.AxesImage at 0x1c3b7dbdf00>
Visualizing Matching Cells#
Apart from the image plots above, the main other part of the output is the nway_matches
list, which contains a list of tuples the contain ROI IDs that match between session. The tuples are at most length n, where n is the number of experiments the Nway Matching pipeline was given, but some tuples will be shorter if a match wasn’t found. The code below uses these tuples of matching IDs, along with the ROI masks in each NWB file’s ROI tables, and generates n images that use color to show the matching ROIs between sessions.
# takes an ROI mask, colors it, adds it onto image
def add_color_mask(image, mask, color):
black = [0,0,0]
color_mask = np.array([[(color if boolean else black) for boolean in row] for row in mask])
merged_image = np.where(color_mask!=black, color_mask, image)
return merged_image
# generates a "random" color based on n
def gen_color(n):
return [n*270 % 255, n*610 % 255, n*890 % 255]
# finds the table and image index that an ROI id belongs to
def get_image_num(id):
for i in range(len(nwbs)):
if id in roi_tables[i].index:
return i
### for every match in the output, generate images which consist of ROI masks that supposedly match between sessions
# roi_tables correspond to images based on index of nwb in nwds list
roi_tables = [nwb.processing["ophys"]["image_segmentation"]["cell_specimen_table"].to_dataframe() for nwb in nwbs]
images = [np.zeros((512,512,3)).astype("uint8") for nwb in nwbs]
nway_matches = output["nway_matches"]
# for every list of matching IDs, generate a color, and add the ROI masks with those IDs to their respective images
for i in range(len(nway_matches)):
match = nway_matches[i]
# unmatched ROIs will be shown white
if len(match) == 1:
color = [255,255,255]
else:
color = gen_color(i)
for id in match:
# n is the index of which image and which ROI table this ROI belongs to
n = get_image_num(id)
roi_table = roi_tables[n]
# casting mask entry from roi table into np array
mask = roi_table[roi_table.index==id].image_mask.to_numpy()[0]
# add mask to image with given color
images[n] = add_color_mask(images[n], mask, color)
fig, axs = plt.subplots(len(images), figsize=(10,10))
for i in range(len(images)):
axs[i].imshow(images[i])
fig.suptitle("Matched ROIs between sessions\n(White ROIs are unmatched)")
Text(0.5, 0.98, 'Matched ROIs between sessions\n(White ROIs are unmatched)')