Jiang Lan, Chen Huidong, Pinello Luca, Yuan Guo-Cheng
Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA.
Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA.
Genome Biol. 2016 Jul 1;17(1):144. doi: 10.1186/s13059-016-1010-4.
High-throughput single-cell technologies have great potential to discover new cell types; however, it remains challenging to detect rare cell types that are distinct from a large population. We present a novel computational method, called GiniClust, to overcome this challenge. Validation against a benchmark dataset indicates that GiniClust achieves high sensitivity and specificity. Application of GiniClust to public single-cell RNA-seq datasets uncovers previously unrecognized rare cell types, including Zscan4-expressing cells within mouse embryonic stem cells and hemoglobin-expressing cells in the mouse cortex and hippocampus. GiniClust also correctly detects a small number of normal cells that are mixed in a cancer cell population.