Biochemistry and Molecular Biology, Wright State University, Dayton, OH, 45435, USA.
Math and Microbiology, Wright State University, Dayton, OH, 45435, USA.
Sci Rep. 2018 Jul 3;8(1):10069. doi: 10.1038/s41598-018-28168-5.
Advances in high-throughput sequencing have enabled profiling of microRNAs (miRNAs); however, no consensus pipeline for the analysis of small RNA sequencing data has been established. We built and optimized an analysis pipeline using Partek Flow, circumventing the need to analyze data via scripting languages. Our analysis assessed the effect of alignment reference, normalization method, and statistical model choice on biological data. The pipeline was evaluated using sequencing data from HaCaT cells transfected with either a non-silencing control or siRNA against ΔNp63α, a p53 family member that is highly expressed in non-melanoma skin cancer and has been shown to regulate a number of miRNAs. We posit that 1) alignment and quantification to the miRBase reference provides the most robust quantitation of miRNAs, 2) normalizing sample reads via Trimmed Mean of M-values is the most robust method for accurate downstream analyses, and 3) use of the lognormal with shrinkage statistical model effectively identifies differentially expressed miRNAs. Using our pipeline, we identified previously unrecognized regulation of miR-149-5p, miR-18a-5p, miR-19b-1-5p, miR-20a-5p, miR-590-5p, miR-744-5p and miR-93-5p by ΔNp63α. Regulation of these miRNAs was validated by RT-qPCR, substantiating our small RNA-Seq pipeline. Further analysis of these miRNAs may provide insight into ΔNp63α's role in cancer progression. By defining the optimal alignment reference, normalization method, and statistical model for analysis of miRNA sequencing data, we have established an analysis pipeline that may be carried out in Partek Flow or at the command line. In this manner, our pipeline circumvents some of the major hurdles encountered during small RNA-Seq analysis.
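To make the normalization step named above more concrete, the following is a minimal sketch of how a Trimmed Mean of M-values scaling factor could be computed for one sample against a reference sample. It is not the Partek Flow implementation and is not taken from the paper: the function names `tmm_factor` and `_quantile`, the default trim fractions (30% of M-values and 5% of A-values on each side), and the use of value quantiles for trimming are assumptions made for illustration. Selection of the reference sample and rescaling of the factors across samples are omitted.

```python
import math


def _quantile(values, q):
    """Linear-interpolation quantile of a non-empty list (hypothetical helper)."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def tmm_factor(sample, ref, m_trim=0.30, a_trim=0.05):
    """Simplified TMM scaling factor for `sample` relative to `ref`.

    `sample` and `ref` are raw read counts per miRNA, in the same order.
    Trim fractions are assumed defaults, not values stated in the abstract.
    """
    n_s, n_r = sum(sample), sum(ref)  # library sizes
    m, a, w = [], [], []
    for ys, yr in zip(sample, ref):
        if ys <= 0 or yr <= 0:
            continue  # log ratios are undefined for zero counts
        ps, pr = ys / n_s, yr / n_r
        m.append(math.log2(ps / pr))        # M: log fold change
        a.append(0.5 * math.log2(ps * pr))  # A: mean log abundance
        # Inverse of the approximate variance of M, used as the weight
        w.append(1.0 / ((n_s - ys) / (n_s * ys) + (n_r - yr) / (n_r * yr)))

    # Keep only features inside both the M and the A trimming bounds
    m_lo, m_hi = _quantile(m, m_trim), _quantile(m, 1 - m_trim)
    a_lo, a_hi = _quantile(a, a_trim), _quantile(a, 1 - a_trim)
    kept = [(wi, mi) for wi, mi, ai in zip(w, m, a)
            if m_lo <= mi <= m_hi and a_lo <= ai <= a_hi]

    # Weighted mean of the surviving M-values, returned on the linear scale
    log2_factor = sum(wi * mi for wi, mi in kept) / sum(wi for wi, _ in kept)
    return 2.0 ** log2_factor
```

In this sketch a sample that differs from the reference only in sequencing depth receives a factor of 1, while a sample in which a single highly abundant miRNA consumes most of the reads receives a factor below 1, which counteracts the apparent drop in every other miRNA.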