Working with annotation volumes

Annotation volumes are useful references for understanding brain parcellations at multiple levels. For example, a cell’s location can be described as being in the lateral geniculate nucleus or in any of its parent structures in the anatomical ontology (dorsal thalamus, thalamus, diencephalon, etc.). Annotation volumes can label CCF-registered data with terms from the HOMBA ontology.

This tutorial shows how to use the human HOMBA annotation volume to (1) query labels and (2) extract regions at any level of the ontology as masks for reference and visualization.

import pandas as pd
from pathlib import Path
from atlas_utils.annotation import Annotation

First, we will read in the annotation volume from the local directory. In this tutorial, we will assume that you have already downloaded the annotation volumes using the download_atlas method (see Getting started).

The annotation image is an array where each voxel is assigned a numeric annotation value that corresponds to the most granular structure that is annotated in the volume. To convert the numeric value to a label, we will need to import the terminology.csv table.

directory = "./data/atlases/hmba-adult-human-homba-atlas/2025/"  # Change this path if the data was downloaded elsewhere

dir_path = Path(directory)
image_filepath = dir_path / "annotations_compressed_700.nii.gz"
terminology_filepath = dir_path / "terminology.csv"
annotation_image = Annotation.from_file(image_filepath, terminology_filepath)

print(f"Annotation image has dimensions: {annotation_image.npy.shape}")
Annotation image has dimensions: (260, 311, 260)
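To make the value-to-label conversion concrete, here is a minimal sketch on a toy array. It does not use atlas_utils: the toy volume, the `labels` dictionary, the helper `label_at`, and the treatment of 0 as unlabeled background are all assumptions made for illustration. The four annotation values and names match entries in this atlas's terminology.csv.

```python
import numpy as np

# Toy 2x2x2 volume (hypothetical layout). Each voxel stores one annotation value.
volume = np.array(
    [[[0, 501], [502, 502]],
     [[84, 84], [505, 0]]],
    dtype=np.uint16,
)

# annotation_value -> (abbreviation, name), as listed in terminology.csv
labels = {
    501: ("CNS", "central nervous system (neural tube)"),
    502: ("Br", "brain"),
    84: ("FB", "gray matter of forebrain (prosencephalon)"),
    505: ("Cx", "cerebral cortex"),
}


def label_at(index):
    """Return (abbreviation, name) for a voxel index, or None if unlabeled."""
    value = int(volume[tuple(index)])
    # Assumption: values missing from the lookup (here 0) mean "no annotation".
    return labels.get(value)


print(label_at((1, 0, 1)))
print(label_at((0, 0, 0)))
```

The lookup is a plain dictionary here; with the real table, the same mapping could be built from the annotation_value, abbreviation, and name columns.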

The terminology table contains the annotation label value as well as the label name, acronym, and associated metadata (e.g., color).

annotation_image.terminology.head(10)
identifier annotation_value parent_identifier name abbreviation color_hex_triplet descendant_identifiers descendants descendant_annotation_values
0 HOMBA:10154 501 NaN central nervous system (neural tube) CNS #3cb44b ['HOMBA:10154', 'HOMBA:10155', 'HOMBA:AA30000'... ['HOMBA:10154', 'HOMBA:10155', 'HOMBA:AA30000'... [501, 502, 503, 84, 504, 505, 506, 507, 508, 5...
1 HOMBA:10155 502 HOMBA:10154 brain Br #808000 ['HOMBA:10155', 'HOMBA:AA30000', 'HOMBA:10156'... ['HOMBA:10155', 'HOMBA:AA30000', 'HOMBA:10156'... [502, 503, 84, 504, 505, 506, 507, 508, 509, 5...
2 HOMBA:AA30000 503 HOMBA:10155 gray matter of brain BGM #0082c8 ['HOMBA:AA30000', 'HOMBA:10156', 'HOMBA:10158'... ['HOMBA:AA30000', 'HOMBA:10156', 'HOMBA:10158'... [503, 84, 504, 505, 506, 507, 508, 509, 510, 5...
3 HOMBA:10156 84 HOMBA:AA30000 gray matter of forebrain (prosencephalon) FB #000080 ['HOMBA:10156', 'HOMBA:10158', 'HOMBA:10159', ... ['HOMBA:10156', 'HOMBA:10158', 'HOMBA:10159', ... [84, 504, 505, 506, 507, 508, 509, 510, 511, 5...
4 HOMBA:10158 504 HOMBA:10156 telencephalon Tel #e6194b ['HOMBA:10158', 'HOMBA:10159', 'HOMBA:10160', ... ['HOMBA:10158', 'HOMBA:10159', 'HOMBA:10160', ... [504, 505, 506, 507, 508, 509, 510, 511, 512, ...
5 HOMBA:10159 505 HOMBA:10158 cerebral cortex Cx #000080 ['HOMBA:10159', 'HOMBA:10160', 'HOMBA:AA30001'... ['HOMBA:10159', 'HOMBA:10160', 'HOMBA:AA30001'... [505, 506, 507, 508, 509, 510, 511, 512, 513, ...
6 HOMBA:10160 506 HOMBA:10159 neocortex (isocortex) NCx #ffd7b4 ['HOMBA:10160', 'HOMBA:AA30001', 'HOMBA:10161'... ['HOMBA:10160', 'HOMBA:AA30001', 'HOMBA:10161'... [506, 507, 508, 509, 510, 511, 512, 513, 514, ...
7 HOMBA:AA30001 507 HOMBA:10160 regions (areas) of neocortex NCxR #000080 ['HOMBA:AA30001', 'HOMBA:10161', 'HOMBA:10172'... ['HOMBA:AA30001', 'HOMBA:10161', 'HOMBA:10172'... [507, 508, 509, 510, 511, 512, 513, 514, 515, ...
8 HOMBA:10161 508 HOMBA:AA30001 frontal cortex FCx #008080 ['HOMBA:10161', 'HOMBA:10172', 'HOMBA:10190', ... ['HOMBA:10161', 'HOMBA:10172', 'HOMBA:10190', ... [508, 509, 510, 511, 512, 513, 514, 515, 516, ...
9 HOMBA:10172 509 HOMBA:10161 prefrontal cortex PFC #f032e6 ['HOMBA:10172', 'HOMBA:10190', 'HOMBA:10191', ... ['HOMBA:10172', 'HOMBA:10190', 'HOMBA:10191', ... [509, 510, 511, 512, 513, 514, 515, 516, 517, ...
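The parent_identifier column is what links each structure to the rest of the ontology. As a rough sketch of how a structure's parent chain could be recovered from it, the snippet below walks upward from prefrontal cortex using only the ten rows shown above. The `terms` dictionary and the `lineage` helper are illustrative names, not part of atlas_utils.

```python
# identifier -> (abbreviation, parent_identifier), copied from the ten rows above.
# The root row has no parent (NaN in the table), represented here as None.
terms = {
    "HOMBA:10154": ("CNS", None),
    "HOMBA:10155": ("Br", "HOMBA:10154"),
    "HOMBA:AA30000": ("BGM", "HOMBA:10155"),
    "HOMBA:10156": ("FB", "HOMBA:AA30000"),
    "HOMBA:10158": ("Tel", "HOMBA:10156"),
    "HOMBA:10159": ("Cx", "HOMBA:10158"),
    "HOMBA:10160": ("NCx", "HOMBA:10159"),
    "HOMBA:AA30001": ("NCxR", "HOMBA:10160"),
    "HOMBA:10161": ("FCx", "HOMBA:AA30001"),
    "HOMBA:10172": ("PFC", "HOMBA:10161"),
}


def lineage(identifier):
    """Return abbreviations from the given structure up to the root."""
    chain = []
    current = identifier
    while current is not None:
        abbreviation, parent = terms[current]
        chain.append(abbreviation)
        current = parent
    return chain


print(lineage("HOMBA:10172"))
# ['PFC', 'FCx', 'NCxR', 'NCx', 'Cx', 'Tel', 'FB', 'BGM', 'Br', 'CNS']
```

Any entry in that chain is a valid, coarser description of a point that falls in prefrontal cortex.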

1. Query annotation labels from coordinates

Now that we have loaded the annotation volume and terminology, we can query an (x, y, z) coordinate and get the anatomical structure label for the voxel at that location. Coordinates can be given either as array indices or in physical space. Physical coordinates are distances in mm from the origin, which is the anterior commissure, in (x, y, z) order.

Let’s use the physical point [17.1, 0.7, -0.6] as an example.

[Figure: GPe example]

query_coordinate = [17.1, 0.7, -0.6]
acronym, name = annotation_image.get_atlas_label(query_coordinate, physical_coordinate=True)

print(f"Coordinate {query_coordinate} is in {acronym}: {name}")
Coordinate [17.1, 0.7, -0.6] is in GPe: external division of globus pallidus
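The relationship between physical and index coordinates can be sketched with a NIfTI-style 4x4 affine, which maps voxel indices to mm. This is not necessarily how get_atlas_label converts coordinates internally. The voxel spacing, the origin index, and both helper functions below are hypothetical values and names chosen for illustration, and axis flips or rotations are ignored.

```python
import numpy as np

spacing_mm = 0.5                       # hypothetical isotropic voxel size
origin_index = np.array([10, 20, 30])  # hypothetical voxel that sits at the physical origin

# Affine maps (i, j, k, 1) -> (x, y, z, 1) in mm: scale by spacing, then shift
# so that origin_index lands on (0, 0, 0).
affine = np.eye(4)
affine[:3, :3] *= spacing_mm
affine[:3, 3] = -spacing_mm * origin_index


def physical_to_index(point_mm):
    """Convert a physical point in mm to the nearest voxel index."""
    homogeneous = np.append(np.asarray(point_mm, dtype=float), 1.0)
    index = np.linalg.inv(affine) @ homogeneous
    return tuple(int(i) for i in np.rint(index[:3]))


def index_to_physical(index):
    """Convert a voxel index to the physical position of that voxel in mm."""
    homogeneous = np.append(np.asarray(index, dtype=float), 1.0)
    return tuple(float(v) for v in (affine @ homogeneous)[:3])


print(physical_to_index([0.0, 0.0, 0.0]))   # the origin maps back to origin_index
print(physical_to_index([1.0, -2.0, 0.5]))
print(index_to_physical((12, 16, 31)))
```

Rounding to the nearest voxel means that any physical point inside a voxel resolves to the same label.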

Note: physical coordinates copied from the Neuroglancer visualization need a correction factor, which is applied by setting neuroglancer_coordinate=True.

[Figure: Human GPe neuroglancer]

query_coordinate_neuroglancer = [-11, 3, -263]
acronym, name = annotation_image.get_atlas_label(
    query_coordinate_neuroglancer,
    physical_coordinate=True,
    neuroglancer_coordinate=True,
)

print(f"Coordinate {query_coordinate_neuroglancer} is in {acronym}: {name}")
Coordinate [-11, 3, -263] is in NACc: core of nucleus accumbens

1.1 Assign annotation labels to existing data table

Often, we will have sets of coordinates that define points or landmarks of interest (e.g., soma locations). Rather than labeling them individually, we can annotate the input data table with the appropriate anatomical labels.

Here, we will use a short example of randomly sampled points throughout the basal ganglia. In this example, the coordinates are array indices rather than physical distances, so we set physical_coordinate=False.

input_data_filename = "./data/example_coordinates.csv"
input_data = pd.read_csv(input_data_filename)
input_data.head(5)
identifier x y z
0 cell01 148 191 131
1 cell02 155 189 87
2 cell03 158 139 131
3 cell04 107 177 108
4 cell05 86 167 89

labeled_data = annotation_image.label_csv(input_data, physical_coordinate=False)
labeled_data.head(5)
identifier x y z abbreviation name
0 cell01 148 191 131 CaB body of caudate nucleus
1 cell02 155 189 87 PuR rostral putamen
2 cell03 158 139 131 CaT tail of caudate nucleus (caudolateral division...
3 cell04 107 177 108 GPe external division of globus pallidus
4 cell05 86 167 89 PuCv ventral subdivision of PuC
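Conceptually, labeling a table of index coordinates amounts to reading one voxel value per row and joining the result against the terminology table. The sketch below shows that idea on toy data. It is not the implementation of label_csv: the toy volume, the two placeholder structures, and the assumption that the x, y, z columns map to array axes 0, 1, 2 are all illustrative.

```python
import numpy as np
import pandas as pd

# Toy annotation volume with two labeled voxels (hypothetical values).
volume = np.zeros((4, 4, 4), dtype=np.uint16)
volume[1, 2, 3] = 1
volume[2, 2, 2] = 2

# Toy terminology table with the same column names as terminology.csv.
terminology = pd.DataFrame({
    "annotation_value": [1, 2],
    "abbreviation": ["A", "B"],
    "name": ["structure A", "structure B"],
})

points = pd.DataFrame({
    "identifier": ["p1", "p2", "p3"],
    "x": [1, 2, 0],
    "y": [2, 2, 0],
    "z": [3, 2, 0],
})

# Integer-array indexing reads the voxel value for every row in one step.
values = volume[points["x"].to_numpy(), points["y"].to_numpy(), points["z"].to_numpy()]
points["annotation_value"] = values.astype("int64")

# A left join keeps every input row; points in unlabeled voxels get NaN labels.
labeled = points.merge(terminology, on="annotation_value", how="left")
print(labeled[["identifier", "x", "y", "z", "abbreviation", "name"]])
```

The left join is a deliberate choice in this sketch: points that fall outside any annotated structure stay in the table instead of being dropped silently.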

2. Create region masks

Annotation volumes can also be used to generate region masks for visualization or as a reference. For example, we may need to create a volume for the dorsal striatum for downstream use in visualizing region boundaries. However, the annotation volumes only show the most granular structures. To create a volume for the dorsal striatum, we will need to combine and aggregate all structures that are part of the dorsal striatum.

[Figure: Dorsal striatum children structures]

This can be done using the Annotation object we created earlier and the structure acronym.

save_filename = dir_path / "dorsal_striatum_mask.nii.gz" # Or wherever you'd like to save it
annotation_image.create_region_mask('DS', save_filename)

[Figure: Dorsal striatum mask]

This creates a volume that can be used for masking and visualization. The saved mask can also be loaded from file as a new annotation and queried for region identity.
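The aggregation step described in this section, combining every fine-grained structure that belongs to a parent region, can be sketched with a single membership test over the array. This is an illustration on toy data rather than the internals of create_region_mask. The toy volume and the annotation values in `descendant_values` are hypothetical, and writing the result to a NIfTI file is left out.

```python
import numpy as np

# Toy 2x2x2 annotation volume (hypothetical values).
volume = np.array(
    [[[0, 11], [12, 20]],
     [[11, 30], [12, 0]]],
    dtype=np.uint16,
)

# Hypothetical annotation values of a parent region and its descendants,
# e.g. taken from the descendant_annotation_values column of the terminology table.
descendant_values = [10, 11, 12]

# Voxels whose value belongs to the region become 1, everything else 0.
mask = np.isin(volume, descendant_values).astype(np.uint8)

print(mask)
print(int(mask.sum()))  # number of voxels inside the region
```

The parent's own value (10 here) is included even though no voxel carries it, which matches the point made above that the volume only stores the most granular structures.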

2.1 Get descendant structure names

Related to the above example, we may need a list of the descendant structures associated with a region, e.g., which labeled structures are part of the dorsal striatum?

The annotation_image.terminology dataframe contains descendant identifiers and annotation values. To convert those identifiers into full names and acronyms, we can query a structure through the Annotation object and get its descendants back.

query_structure = 'DS'
descendant_acronym, descendant_name = annotation_image.get_descendants(query_structure)
print(f"Descendants of {query_structure}: {descendant_acronym}")
print(f"Descendants of {query_structure}: {descendant_name}")
Descendants of DS: ['DS', 'Ca', 'CaH', 'CaHld', 'CaHmv', 'CaB', 'CaBld', 'CaBmv', 'CaT', 'CaTd', 'CaTv', 'Eca', 'Pu', 'PuR', 'PuRld', 'PuRmv', 'PuM', 'PuMld', 'PuMmv', 'IPAC', 'IPACm', 'IPACl', 'PuC', 'PuCld', 'PuCmv', 'PuCv', 'AStr', 'PuMG', 'CaPu']
Descendants of DS: ['dorsal striatum (caudoputamen complex-CP, STRd)', 'caudate nucleus (mediodorsal division of the CP)', 'head of caudate nucleus', 'laterodorsal subdivision of CaH', 'medioventral subdivision of CaH', 'body of caudate nucleus', 'laterodorsal subdivision of CaB', 'medioventral subdivision of CaB', 'tail of caudate nucleus (caudolateral division of the CP)', 'dorsal subdivision of CaT', 'ventral subdivision of CaT', 'peri-caudate ependymal and subependymal zone', 'putamen (lateroventral division of the CP)', 'rostral putamen', 'laterodorsal subdivision of PuR', 'medioventral subdivision of PuR', 'middle putamen', 'laterodorsal subdivision of PuM', 'medioventral subdivision of PuM', 'interstitial nucleus of posterior limb of anterior commissure (fundus of striatum)', 'medial part of IPAC', 'Lateral part of IPAC', 'caudal putamen', 'laterodorsal subdivision of PuC', 'medioventral subdivision of PuC', 'ventral subdivision of PuC', 'amygdalostriatal transition area', 'marginal division (cell groups) of putamen', 'caudate-putamen cell bridges']
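The list above starts with the queried structure and then visits each branch in turn, which is what a depth-first traversal of the parent_identifier links would produce. The sketch below shows one way such a list could be derived, using six rows from the terminology table printed earlier in this tutorial. The `rows` list and the `descendants` helper are illustrative and may differ from how get_descendants works.

```python
from collections import defaultdict

# (identifier, abbreviation, parent_identifier) for six rows of the terminology table.
rows = [
    ("HOMBA:10158", "Tel", "HOMBA:10156"),
    ("HOMBA:10159", "Cx", "HOMBA:10158"),
    ("HOMBA:10160", "NCx", "HOMBA:10159"),
    ("HOMBA:AA30001", "NCxR", "HOMBA:10160"),
    ("HOMBA:10161", "FCx", "HOMBA:AA30001"),
    ("HOMBA:10172", "PFC", "HOMBA:10161"),
]

abbreviation_of = {identifier: abbreviation for identifier, abbreviation, _ in rows}

# Invert the child -> parent links into parent -> [children].
children = defaultdict(list)
for identifier, _, parent in rows:
    children[parent].append(identifier)


def descendants(identifier):
    """Return the structure itself followed by everything below it, depth first."""
    result = [abbreviation_of[identifier]]
    for child in children[identifier]:
        result.extend(descendants(child))
    return result


print(descendants("HOMBA:10159"))
# ['Cx', 'NCx', 'NCxR', 'FCx', 'PFC']
```

Only six rows are included, so the result is a single chain; with the full table, each structure can have several children and the same traversal would return every branch.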