doubletFinder(
data,
select.genes,
proportion.artificial = 0.2,
k = NULL,
plot = FALSE
)

data
    Gene x sample matrix of raw (non-normalized) counts.
select.genes
    List of the genes with the highest variance between samples.
proportion.artificial
    The proportion (from 0 to 1) of the merged real-artificial dataset that is artificial. In other words, this argument defines the total number of artificial doublets. Default is 0.2 (20%).
k
    The number of nearest neighbours in the merged real-artificial dataset used to define each cell's neighborhood in PC space. The minimum value is 1.
plot
    Logical; whether to plot the doublet scores. Default is FALSE.
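
As a worked illustration of proportion.artificial: if p is the artificial share of the merged dataset and n_real is the number of real cells, the number of artificial doublets n_art follows from n_art / (n_real + n_art) = p. The helper name n_artificial below is hypothetical and not part of the package, and the rounding is an assumption about how a fractional result would be handled.

```r
# Hypothetical helper (not part of the package): number of artificial
# doublets implied by a given proportion.artificial.
n_artificial <- function(n_real, proportion.artificial = 0.2) {
  stopifnot(proportion.artificial >= 0, proportion.artificial < 1)
  # Solve n_art / (n_real + n_art) = p for n_art; rounding is an assumption.
  round(n_real * proportion.artificial / (1 - proportion.artificial))
}

n_artificial(1000)        # 250 artificial doublets for 1000 real cells
n_artificial(1000, 0.25)  # 333
```

With the default of 0.2, the merged dataset is therefore 1.25 times the size of the real dataset.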
A list of doublet scores per sample, together with plots depicting the doublet scores for cells and artificial doublets.

Adapted from https://www.cell.com/cell-systems/fulltext/S2405-4712(19)30073-0 (https://github.com/chris-mcginnis-ucsf/DoubletFinder).

This function generates artificial doublets from existing single-cell RNA sequencing data. First, real and artificial data are merged. Second, dimension reduction is performed on the merged real-artificial dataset using PCA. Third, the proportion of artificial nearest neighbors is defined for each real cell. Finally, real cells are rank-ordered and predicted doublets are defined via thresholding based on the expected number of doublets.
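
The four steps can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the function name sketch_doublet_scores, the n.pcs argument, the default k = 10, the log-normalization, building each artificial doublet by summing two randomly chosen real cells, and the 5% expected doublet rate are all choices made for this example only.

```r
# Illustrative sketch only; names and defaults are assumptions,
# not the package's internals.
sketch_doublet_scores <- function(counts, proportion.artificial = 0.2,
                                  k = 10, n.pcs = 5) {
  n_real <- ncol(counts)
  n_art  <- round(n_real * proportion.artificial / (1 - proportion.artificial))

  # Artificial doublets: sum the counts of two randomly chosen real cells
  # (summing rather than averaging is an assumption).
  i <- sample(n_real, n_art, replace = TRUE)
  j <- sample(n_real, n_art, replace = TRUE)
  artificial <- counts[, i, drop = FALSE] + counts[, j, drop = FALSE]

  # Step 1: merge real and artificial data.
  merged <- cbind(counts, artificial)
  is_artificial <- rep(c(FALSE, TRUE), times = c(n_real, n_art))

  # Step 2: PCA on the merged data (cells as rows) after a simple
  # library-size log-normalization.
  norm <- log1p(t(merged) / colSums(merged) * 1e4)
  pcs  <- prcomp(norm, center = TRUE)$x
  pcs  <- pcs[, seq_len(min(n.pcs, ncol(pcs))), drop = FALSE]

  # Step 3: for each real cell, the proportion of its k nearest
  # neighbours in PC space that are artificial.
  d <- as.matrix(dist(pcs))
  diag(d) <- Inf  # a cell is not its own neighbour
  sapply(seq_len(n_real), function(cell) {
    nn <- order(d[cell, ])[seq_len(k)]
    mean(is_artificial[nn])
  })
}

set.seed(1)
counts <- matrix(rpois(50 * 100, lambda = 5), nrow = 50)  # 50 genes x 100 cells
scores <- sketch_doublet_scores(counts)

# Step 4: rank-order real cells and threshold at the expected number of doublets.
expected.rate <- 0.05  # hypothetical expected doublet rate
n_expected <- round(expected.rate * ncol(counts))
predicted  <- rank(-scores, ties.method = "first") <= n_expected
```

Here scores holds one value between 0 and 1 per real cell, and predicted flags the n_expected highest-scoring cells. The full pairwise distance matrix keeps the sketch short but scales quadratically with the number of cells, so a dedicated nearest-neighbour search would be preferable for large datasets.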