Digital Asset Controlled Vocabulary

Label Synonym Description
FASTQ FASTQ format A (text-based) format that is used for storing a biological sequence (typically a nucleotide sequence) and its corresponding quality scores. The sequence letter and quality score are each encoded using a single ASCII character.
BAM binary alignment map format A format that is the compressed binary representation of a sequence alignment map format.
TIFF tag image file format A format that is for storing raster graphics images.
SWC SWC file format, SWC neuron morphology format A format that is used to store neuron morphology data, to share information to digitally reconstruct neurons, and to predict functional attributes using simulation environments.
NWB neurodata without borders format, NWB format, NWB file format A format that is designed to store a wide range of neurophysiology data. It is an HDF5 file that organizes data by providing a dedicated location for all data acquired during an experiment and another location for the stimulus data that was presented.
browser extensible data format BED, browser extensible data file format, BED format, BED file format A format that is a text file used to store data regarding genomic regions as coordinates and associated annotations.
bigBed bigBed format, bigBed file format A format that is created from a BED file and stores annotation items that are either simple or a linked collection of exons. The items are stored in an indexed binary format.
comma-separated values format CSV, CSV format, CSV file format A format that is a delimited text file that uses a comma to separate values. Each line of the file is a data record, and each record consists of one or more fields, separated by commas.
MTX MTX format, MTX file format A format that is associated with a 3D scene format consisting of ASCII text written using XML-style tags representing 3D object hierarchies, animation data, and global scene options.
tab-separated values format TSV, TSV format, TSV file format A format that is a text file format used to store data in a tabular structure, for example, a database table or spreadsheet data. It is also used as a means of exchanging information between databases.
H5 HDF5, hierarchical data format 5 A format that contains multidimensional arrays of scientific data.
Loom Loom format A format that is designed to hold large -omics datasets. The format is based on HDF5 in that it is an HDF5 file that contains specific groups containing the main matrix as well as row and column attributes.
PLINK.bed .bed PLINK file A format that is a binary file format for PLINK that serves as input for analysis. This is the preferred file type for representing genotype calls in PLINK.
bim .bim file, .bim format, PLINK extended MAP file A format that is a variant information file accompanying a .bed or biallelic .pgen binary genotype table.
fam .fam file, .fam format, PLINK sample information file A format that is a sample information file accompanying a .bed or biallelic .pgen binary genotype table.
brain imaging data structure BIDS, BIDS format, BIDS file format, brain imaging data structure file format A format that is a standard for organizing and describing neuroimaging and behavioral data.
H5AD H5AD format An HDF5 file format that provides a scalable way of keeping track of data together with learned annotations. It is anndata's native file format.
NGFF NGFF format, zarr file, Next-generation file format A format that is designed to hold information about multidimensional, multiscale images, high-content screening datasets and derived labeled images. These files can be hosted natively in object (or cloud) storage for direct access by a large number of users.
Nifti file format Nifti format, nii file A format that is designed for neuroimaging data and is similar to the Analyze format, used for the storage of functional magnetic resonance imaging (fMRI) and other medical images.
JSON JSON format, json file, JavaScript Object Notation, JavaScript Object Notation format A format that is an open standard file format and data interchange format that uses human-readable text consisting of attribute-value pairs and arrays to store and transmit data.
JSON-LD JSON-LD format, JSON-LD file, JavaScript Object Notation for Linked Data A format that is an open standard file format and data interchange format that uses human-readable text consisting of attribute-value pairs and arrays to store and transmit data in addition to a linked data (LD) element.
OME TIFF OME-TIFF file, OME-TIFF format A format for storing microscopy imaging data created to maximize the respective strengths of OME-XML and TIFF. It takes advantage of metadata defined in OME-XML while retaining the pixels in multi-page TIFF format for compatibility with many more applications.
mex mex format, mex file A format that is used to represent the gene-barcode matrix output by Cell Ranger. This is a sparse matrix format because the matrix of UMI counts for each barcode/gene pair is very large (~35K genes vs hundreds of thousands of barcodes) and most entries are 0.
scn scn format, scn file A format that is used in digital pathology, which is a proprietary format used by Leica Biosystems scanners to store whole slide images (WSIs).
docx docx format, docx file, Microsoft Word Open XML Document A format that is used as the default, modern Microsoft Word document format that uses Office Open XML (OOXML), a zipped archive of XML files that supports rich features like formatting, images, and tables.
PDF PDF format, pdf file, Portable Document Format A format that is used to present documents, including text formatting and images, in a manner independent of application software, hardware, and operating systems.
PNG PNG format, png file, Portable Network Graphics A format that is a raster-graphics file format that supports lossless data compression.
tar .tar file, tarball A format that is an archive file format for bundling multiple files and directories into a single file, often on Unix-like systems, to simplify backups and distribution.
gb gb format, gb file, GenBank format A format that is a standard plain-text format used for storing biological sequence information, such as DNA, RNA and protein sequences, along with associated metadata.
jpg jpg format, jpg file, JPEG format, JPEG A format that is an image format that uses lossy compression.
directory folder, directory folder A "directory file format" isn't one standard, but refers to the underlying file system's internal structure that organizes files and other directories, containing entries like inode numbers and names in Unix-like systems, or a similar mapping structure in others like FAT or NTFS. Directories, also known as folders, are special files themselves that store lists of pointers to other files and directories, enabling a hierarchical, tree-like structure for organizing data on a storage device.
bmp bmp format, bmp file, bitmap A format that is a raster graphics image file format used to store bitmap digital images, independently of the display device (such as a graphics adapter).
fasta FASTA format, fasta file A format that is a text-based format for representing either nucleotide sequences or amino acid (protein) sequences, in which nucleotides or amino acids are represented using single-letter codes.
gz gz format, gz file, gzip format, gzip A format that is a compressed file created using the standard gzip (GNU zip) compression algorithm. A gzip file normally holds a single compressed file; multiple files and directories are typically bundled into a tar archive before compression.
javascript JS, JS format, js file, JavaScript A format that contains JavaScript code for execution on web pages. JavaScript files are stored with the .js extension.
svg SVG format, svg file, Scalable Vector Graphics A format that uses an XML-based text format for describing the appearance of an image (Scalable Vector Graphics file).
txt TXT format, txt file, text file, flat file A format that is structured as a sequence of lines of electronic text.
parquet parquet format, parquet file, Apache Parquet A format that is a free and open-source column-oriented data storage format in the Apache Hadoop ecosystem.
zip zip format, zip file A format (archive file format) that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed.
xlsx XLSX format, xlsx file, Microsoft Excel Open XML Spreadsheet A format that is a modern Microsoft Excel spreadsheet format that uses Office Open XML (OOXML), a zipped archive of XML files that supports rich features like formatting, formulas, and charts.
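
Because each label in the table is paired with one or more synonyms, a common use of a vocabulary like this is to normalize free-text format names to their canonical label. The following is a minimal Python sketch of that lookup, not part of the vocabulary itself. The names VOCAB, build_index, and resolve are hypothetical, only a few rows are copied in for illustration, and case-insensitive matching is an assumption rather than a rule stated by the vocabulary.

```python
# Hypothetical subset of the table above: canonical label -> list of synonyms.
VOCAB = {
    "FASTQ": ["FASTQ format"],
    "BAM": ["binary alignment map format"],
    "comma-separated values format": ["CSV", "CSV format", "CSV file format"],
    "H5": ["HDF5", "hierarchical data format 5"],
    "NWB": ["neurodata without borders format", "NWB format", "NWB file format"],
}


def build_index(vocab):
    """Map every label and synonym (case-folded) to its canonical label."""
    index = {}
    for label, synonyms in vocab.items():
        for term in [label, *synonyms]:
            key = term.strip().casefold()
            # Refuse ambiguous terms instead of silently overwriting an entry.
            if key in index and index[key] != label:
                raise ValueError(
                    f"{term!r} maps to both {index[key]!r} and {label!r}"
                )
            index[key] = label
    return index


def resolve(term, index):
    """Return the canonical label for a term, or None if it is not listed."""
    return index.get(term.strip().casefold())


INDEX = build_index(VOCAB)
print(resolve("csv", INDEX))
print(resolve("HDF5", INDEX))
```

Raising on a term that maps to two different labels is a deliberate choice in this sketch: a controlled vocabulary is only useful for normalization if each term resolves to exactly one label.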