Cell Matching Across Days#

Experimental sessions are conducted across multiple days on the same mouse. When imaging the brain location on the same plane with 2-Photon Imaging, the image will not be perfectly identical. The frame may be in a slightly different location on the brain or at a slightly different depth. Additionally, cells can move relative to each other over time. Due to this, computational methods are required to identify which cell Regions of Interest (ROIs) are the same as the previous session.

We use the Allen Institute’s Nway Matching pipeline for this which is described in more detail below.

Environment Setup#

⚠️Note: If running on a new environment, run this cell once and then restart the kernel⚠️

import warnings
warnings.filterwarnings('ignore')

try:
    from databook_utils.dandi_utils import dandi_download_open
except:
    !git clone https://github.com/AllenInstitute/openscope_databook.git
    %cd openscope_databook
    %pip install -e .
import json
import os

import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np

from PIL import Image
from time import sleep

Downloading Ophys NWB Files#

Here you can download several files for a subject that you can run nway matching on. The pipeline can take in any number of input sessions, however, the input ophys data should be from the same imaging plane of the same subject. To specify your own file to use, set dandiset_id to be the dandiset id of the files you’re interested in. Also set input_dandi_filepaths to be a list of the filepaths on dandi of each file you’re interested in providing to Nway Matching. When accessing an embargoed dataset, set dandi_api_key to be your DANDI API key.

dandiset_id = "000336"
dandi_filepath_1 = "sub-621602/sub-621602_ses-1193555033-acq-1193675753_ophys.nwb"
dandi_filepath_2 = "sub-621602/sub-621602_ses-1194555869-acq-1194754135_ophys.nwb"
download_loc = "."

# here, specify the list dandi filepaths of the files you want to run cell matching on
# the subject ids should be the same, but the session ids should be different
input_dandi_filepaths = [dandi_filepath_1, dandi_filepath_2]

dandi_api_key = os.environ["DANDI_API_KEY"]
ios = [dandi_download_open(dandiset_id, filepath, download_loc, dandi_api_key=dandi_api_key) for filepath in input_dandi_filepaths]
nwbs = [io.read() for io in ios]
File already exists
Opening file
File already exists
Opening file
# these should be similar values, indicating that the images are from the same imaging plane
for nwb in nwbs:
    print(nwb.lab_meta_data["metadata"].imaging_depth)
309
309

Generating Cell-Matching Input File#

The Allen Institute has created a robust pipeline for conducting cell matching between 2 or more sessions called Nway Matching. For those who are inquisitive, the general algorithm is described in page 31 of [Garrett et al., 2023]. In summary, the input images are aligned together with a euclidean image registration technique. Then the given input ROI masks are used with the Blossom Algorithm as a bipartite graph matching algorithm. This identifies which ROIs represent the same cells with high probability.

The caveat to using this pipeline is that it requires input in a specific JSON format rather than an NWB File. The following functions are used to generate a file, input.json, which can be used to obtain important output. The input json contains an entry for each input experiment. 2 or more experiments can be used. The first experiment is called fixed and the other ones are moving. For each experiment, a unique experiment ID, a path to the experiment projection image, and a list of objects for each ROI are required.

### for each ROI in an ROI table, create a JSON object containing its information, return a list of JSON objects

def make_roi_entries(roi_table):
    roi_entries = []
    for roi_idx in range(len(roi_table)):
        roi_entry = {
            "id": int(roi_table["id"][roi_idx]),
            "valid": True,
            "x": 0,
            "y": 0,
            "z": 1,
            "width": 512,
            "height": 512,
            # convert 2D array to 2D list of bools
            "mask_matrix": [[bool(elem) for elem in row] for row in roi_table["image_mask"][roi_idx]]
        }
        roi_entries.append(roi_entry)
    return roi_entries
### take in an image array and return it converted to uint16

def normalize_to_16b(image_array):
    return image_array.astype(np.uint16)
### convert an nwb object into a 'experiment' JSON object with the required format

def nwb_to_experiment_json(nwb):
    # save image here
    grayscale_image = np.array(nwb.processing["ophys"]["images"]["average_image"])
    image_array = normalize_to_16b(np.array(grayscale_image))
    image_file = Image.fromarray(image_array)
    image_file = image_file.convert("L")
    image_file.save(f"./{nwb.identifier}_image.png")

    roi_table = nwb.processing["ophys"]["image_segmentation"]["cell_specimen_table"]
    
    experiment = {
        "id": nwb.identifier, 
        "ophys_average_intensity_projection_image": f"./{nwb.identifier}_image.png",
        "cell_rois": make_roi_entries(roi_table)
    }

    return experiment
### make the input json object, populate its experiments field with an experiment object for each nwb object

input_exps = {
    "experiment_containers": {
        "ophys_experiments": []
    }
}

for nwb in nwbs:
    experiment = nwb_to_experiment_json(nwb)
    input_exps["experiment_containers"]["ophys_experiments"].append(experiment)

with open("input.json", "w") as f:
    json.dump(input_exps, f, indent=2)

Running the Algorithm#

Nway Matching can be run as a command with the following parameters. From there, several output files are generated.

!python -m nway.nway_matching --input_json input.json --output_json "./output.json" --output_dir matching_output
c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\scipy\__init__.py:169: UserWarning: A NumPy version >=1.18.5 and <1.26.0 is required for this version of SciPy (detected version 1.26.4
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
WARNING:root:many=True not supported from argparse
INFO:NwayMatching:NWAY_COMMIT_SHA None
INFO:NwayMatching:Nway matching version 0.6.0
c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\scipy\__init__.py:169: UserWarning: A NumPy version >=1.18.5 and <1.26.0 is required for this version of SciPy (detected version 1.26.4
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\scipy\__init__.py:169: UserWarning: A NumPy version >=1.18.5 and <1.26.0 is required for this version of SciPy (detected version 1.26.4
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\scipy\__init__.py:169: UserWarning: A NumPy version >=1.18.5 and <1.26.0 is required for this version of SciPy (detected version 1.26.4
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
WARNING:root:many=True not supported from argparse
WARNING:root:many=True not supported from argparse
INFO:PairwiseMatching:Matching 1193675753 to 1194754135
INFO:PairwiseMatching:Matching 1193675753 to 1194754135: best registration was ['Crop', 'CLAHE', 'PhaseCorrelate']
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Users\carter.peene\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\carter.peene\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 48, in mapstar
    return list(map(*args))
  File "c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\nway\nway_matching.py", line 121, in pair_match_job
    pair_match.run()
  File "c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\nway\pairwise_matching.py", line 495, in run
    segmask_moving_3d_registered = transform_mask(
  File "c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\nway\pairwise_matching.py", line 384, in transform_mask
    dtype=np.int)
  File "c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\numpy\__init__.py", line 338, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'int'.
`np.int` was a deprecated alias for the builtin `int`. To avoid this error in existing code, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\carter.peene\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\carter.peene\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\nway\nway_matching.py", line 502, in <module>
    nmod.run()
  File "c:\Users\carter.peene\Desktop\Projects\openscope_databook\databook_env\lib\site-packages\nway\nway_matching.py", line 462, in run
    self.pair_matches = pool.map(pair_match_job, pair_arg_list)
  File "C:\Users\carter.peene\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\carter.peene\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 774, in get
    raise self._value
AttributeError: module 'numpy' has no attribute 'int'.
`np.int` was a deprecated alias for the builtin `int`. To avoid this error in existing code, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

Showing Cell Matching Output#

The cell matching pipeline outputs the data as a output JSON, and a few plots as .png images. The paths to the plots are fetched from the output JSON and displayed below. Additionally, the output json contains a list of matching ROI IDs from each image.

with open("output.json", "r") as f:
    output = json.load(f)
summary_plot = Image.open(output["nway_warp_summary_plot"])
fig, ax = plt.subplots(figsize=(10,10))
ax.imshow(summary_plot)
ax.axis("off")
(-0.5, 560.5, 2951.5, -0.5)
../_images/366cc05bff747cc7ec150fdd0b3b9ff8a7cc7b1b98386de837c167102ed58ee4.png
overlay_plot = Image.open(output["nway_warp_overlay_plot"])
fig, ax = plt.subplots(figsize=(10,10))
ax.axis("off")
ax.imshow(overlay_plot)
<matplotlib.image.AxesImage at 0x1c3b53e35b0>
../_images/e6592d078b8d6b12d49437e52e737d2dad606111037a11c5dd4e8948d002c4b2.png
fraction_plot = Image.open(output["nway_match_fraction_plot"])
fig, ax = plt.subplots(figsize=(10,10))
ax.axis("off")
ax.imshow(fraction_plot)
<matplotlib.image.AxesImage at 0x1c3b7dbdf00>
../_images/13852a5cdb6fb6f6b4f2d967fa88596147bec8c4f93257a4a2fcaa97ea6c9f58.png

Visualizing Matching Cells#

Apart from the image plots above, the main other part of the output is the nway_matches list, which contains a list of tuples the contain ROI IDs that match between session. The tuples are at most length n, where n is the number of experiments the Nway Matching pipeline was given, but some tuples will be shorter if a match wasn’t found. The code below uses these tuples of matching IDs, along with the ROI masks in each NWB file’s ROI tables, and generates n images that use color to show the matching ROIs between sessions.

# takes an ROI mask, colors it, adds it onto image
def add_color_mask(image, mask, color):
    black = [0,0,0]
    color_mask = np.array([[(color if boolean else black) for boolean in row] for row in mask])
    merged_image = np.where(color_mask!=black, color_mask, image)
    return merged_image
# generates a "random" color based on n
def gen_color(n):
    return [n*270 % 255, n*610 % 255, n*890 % 255]
# finds the table and image index that an ROI id belongs to
def get_image_num(id):
    for i in range(len(nwbs)):
        if id in roi_tables[i].index:
            return i
### for every match in the output, generate images which consist of ROI masks that supposedly match between sessions

# roi_tables correspond to images based on index of nwb in nwds list
roi_tables = [nwb.processing["ophys"]["image_segmentation"]["cell_specimen_table"].to_dataframe() for nwb in nwbs]
images = [np.zeros((512,512,3)).astype("uint8") for nwb in nwbs]
nway_matches = output["nway_matches"]

# for every list of matching IDs, generate a color, and add the ROI masks with those IDs to their respective images
for i in range(len(nway_matches)):
    match = nway_matches[i]
    # unmatched ROIs will be shown white
    if len(match) == 1:
        color = [255,255,255]
    else:
        color = gen_color(i)

    for id in match:
        # n is the index of which image and which ROI table this ROI belongs to
        n = get_image_num(id)        
        roi_table = roi_tables[n]
        # casting mask entry from roi table into np array
        mask = roi_table[roi_table.index==id].image_mask.to_numpy()[0]
        
        # add mask to image with given color
        images[n] = add_color_mask(images[n], mask, color)
fig, axs = plt.subplots(len(images), figsize=(10,10))
for i in range(len(images)):
    axs[i].imshow(images[i])
fig.suptitle("Matched ROIs between sessions\n(White ROIs are unmatched)")
Text(0.5, 0.98, 'Matched ROIs between sessions\n(White ROIs are unmatched)')
../_images/3dc3225044da3cee98ff18b50046072a45e8ec145e5d0402b8b5934ea85205c9.png