Supplemental Metadata Descriptions

HMBA Spatial Basal Ganglia data supplemental columns

Below we list the names and descriptions of the columns available in the cell_supplemental_metadata table for the species in the spatial BG dataset. The cell_metadata table includes the column:

  • ‘qc_pass’: aggregate of any/all QC metrics used to filter out low-quality cells
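As a minimal sketch of how the ‘qc_pass’ flag might be used, the example below filters a toy table with pandas. The DataFrame contents and the ‘cell_label’ index name are hypothetical stand-ins, and the column is assumed to be stored as a boolean; only the ‘qc_pass’ column name comes from the description above.

```python
import pandas as pd

# Toy stand-in for the cell_metadata table.
# 'cell_label' and all values are hypothetical; only 'qc_pass' is a documented column.
cell_metadata = pd.DataFrame(
    {
        "cell_label": ["c1", "c2", "c3", "c4"],
        "qc_pass": [True, False, True, True],
    }
).set_index("cell_label")

# Keep only the cells that passed the aggregate QC flag (assumes a boolean dtype).
passing_cells = cell_metadata[cell_metadata["qc_pass"]]

# Fraction of cells retained after filtering.
fraction_passing = cell_metadata["qc_pass"].mean()

print(list(passing_cells.index))
print(fraction_passing)
```

If the column were stored as strings or integers in a given release, it would need to be converted to booleans before being used as a mask.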

Additionally, the slab_plane_coordinates table contains the columns [‘x_rotated_um’, ‘y_rotated_um’]: an unofficial transformation of [‘x’, ‘y’] that uses a manual midline annotation to correctly orient each section (dorsal up; midline left). These columns are retained for context; however, users should use the x/y_slab_mm coordinates, which line up with the manually annotated polygons.
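To illustrate why the slab coordinates are the ones to pair with the annotated polygons, here is a rough sketch that tests which cells fall inside a polygon expressed in the same millimetre frame. The column names ‘x_slab_mm’ and ‘y_slab_mm’ are assumed expansions of "x/y_slab_mm", and the polygon vertices, cell labels, and coordinate values are invented for the example. The ray-casting helper is a generic point-in-polygon test, not the method used to build the dataset.

```python
import pandas as pd


def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies inside the polygon `vertices`."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x position where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Toy stand-in for slab_plane_coordinates (all values hypothetical).
coords = pd.DataFrame(
    {
        "x_slab_mm": [1.0, 3.0, 0.5],
        "y_slab_mm": [1.0, 1.0, 1.5],
    },
    index=["c1", "c2", "c3"],
)

# Hypothetical annotated polygon, given in the same slab (mm) frame.
polygon_mm = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]

coords["in_region"] = [
    point_in_polygon(x, y, polygon_mm)
    for x, y in zip(coords["x_slab_mm"], coords["y_slab_mm"])
]
print(coords)
```

The same test run on the rotated micrometre columns would compare coordinates in a different frame and unit from the polygon, which is the mismatch the note above warns about.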

  • All species

    • ‘total_counts’: detected counts of all spots, including both genes and any control probes or codewords (controls differ slightly between MERSCOPE and Xenium platforms; see individual species for breakdown)

    • ‘total_counts_genes’: detected counts of gene transcripts

  • Human (H22.30.001) & Macaque (QM23.50.001):

    • [‘total_counts_Blank’, ‘pct_counts_Blank’]: control for MERSCOPE platform

    • ‘n_genes_by_counts’: number of unique genes per cell

    • [‘doublet_singlet_score_diff’, ‘doublet_diff_threshold’]: parameters related to SOLO doublet detection

    • [‘blanks_filter’, ‘genes_filter’, ‘counts_filter’, ‘doublets_filter’]: QC pass boolean for each metric

    • ‘qc_pass_and_singlet’: boolean indicating whether a cell passed QC AND was predicted to be a singlet. Not usually used, but retained in case it is useful in the future.

  • Macaque (QM23.50.001):

    • [‘incongruous_genes_pct’, ‘incongruous_pairs_pct’]: segmentation QC metrics based on manually defined genes that shouldn’t be expressed together

  • Marmoset (CJ23.56.004):

    • [‘total_counts_control_probe’, ‘total_counts_genomic_control’, ‘total_counts_control_codeword’, ‘total_counts_unassigned_codeword’, ‘total_counts_deprecated_codeword’]: controls for Xenium platform

    • [‘cell_area’, ‘nucleus_area’, ‘nucleus_count’]: additional metadata generated by cell segmentation

    • ‘segmentation_method’: specifies which stain or method was used by the 10X Cell Segmentation Kit algorithm; categories include [‘Segmented by boundary stain (ATP1A1+CD45+E-Cadherin)’, ‘Segmented by interior stain (18S)’, ‘Segmented by nucleus expansion of 5.0µm’]
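The per-metric filter columns listed for human and macaque can be summarized to see which QC step removes the most cells. The sketch below assumes the four ‘*_filter’ columns are stored as booleans where True means the cell passed that metric; the table contents and cell labels are invented. These columns are not listed for marmoset, so the same code would need a different column list there.

```python
import pandas as pd

# Documented per-metric QC columns for human and macaque.
filter_columns = ["blanks_filter", "genes_filter", "counts_filter", "doublets_filter"]

# Toy stand-in for cell_supplemental_metadata (all values hypothetical).
supp = pd.DataFrame(
    {
        "blanks_filter": [True, True, False, True],
        "genes_filter": [True, False, True, True],
        "counts_filter": [True, False, True, True],
        "doublets_filter": [True, True, True, False],
    },
    index=["c1", "c2", "c3", "c4"],
)

# Fraction of cells passing each individual metric.
pass_rate = supp[filter_columns].mean()

# Number of metrics each cell failed (0 means it passed every listed filter).
n_failed = (~supp[filter_columns]).sum(axis=1)

print(pass_rate)
print(n_failed)
```

This only reports the individual flags. How they combine into ‘qc_pass’ is not specified here, so the sketch does not attempt to reproduce that aggregate.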