Zheng Ye, Caron Daniel P, Kim Ju Yeong, Jun Seong-Hwan, Tian Yuan, Mair Florian, Stuart Kenneth D, Sims Peter A, Gottardo Raphael
Department of Bioinformatics and Computational Biology, Department of Systems Biology, University of Texas MD Anderson Cancer Center, Houston, TX, USA.
Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, USA.
Nat Commun. 2025 Jul 1;16(1):5852. doi: 10.1038/s41467-025-61023-6.
Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-seq) enables paired measurement of surface protein and mRNA expression in single cells using antibodies conjugated to oligonucleotide tags. Because surface protein molecules are present at high copy number, sequencing antibody-derived tags (ADTs) allows for robust protein detection, improving cell-type identification. However, variability in antibody staining leads to batch effects in ADT expression, obscuring biological variation, reducing interpretability, and obstructing cross-study analyses. Here, we present ADTnorm, a normalization and integration method designed explicitly for ADT abundance. Benchmarking against 14 existing scaling and normalization methods, we show that ADTnorm accurately aligns populations with negative and positive expression of surface protein markers across 13 public datasets, effectively removing technical variation across batches and improving cell-type separation. ADTnorm enables efficient integration of public CITE-seq datasets, each with a unique experimental design, paving the way for atlas-level analyses. Beyond normalization, ADTnorm includes built-in utilities to aid automated threshold gating and to assess antibody staining quality for titration optimization and antibody panel selection. Applying ADTnorm to an antibody titration study, a published COVID-19 CITE-seq dataset, and a human hematopoietic progenitors study identified previously undetected phenotype-associated markers, illustrating its broad utility in biological applications.
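The idea of aligning negative and positive populations across batches can be illustrated with a toy sketch. This is not ADTnorm's algorithm or API; the abstract does not specify either. It assumes each batch's marker values split into two populations at a user-supplied cut point (the hypothetical `valley` argument), takes the median on each side as a landmark, and applies a simple affine map so that the landmarks coincide across batches. The function names `estimate_landmarks` and `align_batch`, the target values, and the data are all invented for illustration.

```python
from statistics import median


def estimate_landmarks(values, valley):
    """Return (negative_peak, positive_peak) as medians on either side of `valley`.

    `valley` is a hypothetical, user-supplied cut point between the two populations.
    """
    negative = [v for v in values if v < valley]
    positive = [v for v in values if v >= valley]
    if not negative or not positive:
        raise ValueError("valley must split the values into two non-empty groups")
    return median(negative), median(positive)


def align_batch(values, neg_peak, pos_peak, target_neg=1.0, target_pos=5.0):
    """Affine map sending neg_peak to target_neg and pos_peak to target_pos."""
    if pos_peak == neg_peak:
        raise ValueError("negative and positive peaks must differ")
    scale = (target_pos - target_neg) / (pos_peak - neg_peak)
    return [target_neg + (v - neg_peak) * scale for v in values]


# Two toy batches of one marker, stained at different intensities (made-up values).
batch_a = [0.9, 1.0, 1.1, 3.9, 4.0, 4.1]
batch_b = [1.9, 2.0, 2.1, 6.8, 7.0, 7.2]

aligned_a = align_batch(batch_a, *estimate_landmarks(batch_a, valley=2.5))
aligned_b = align_batch(batch_b, *estimate_landmarks(batch_b, valley=4.0))
```

After the mapping, both batches have their negative landmark near 1.0 and their positive landmark near 5.0, so a single threshold between them separates the populations in either batch. A real method would need to locate peaks and valleys from the data and handle markers with one or more than two populations, which this sketch does not attempt.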