School of Control Science and Engineering, Shandong University, Jinan, 250061, China.
Comput Biol Med. 2022 Jul;146:105558. doi: 10.1016/j.compbiomed.2022.105558. Epub 2022 Apr 27.
MicroRNAs (miRNAs) play important regulatory roles in the pathogenesis and progression of diseases. Most existing bioinformatics methods predict only binary miRNA-disease associations, that is, whether a given miRNA and disease are associated at all. However, miRNAs and diseases can be associated in many different ways (association types). In addition, miRNA-disease-type association datasets are inherently noisy and incomplete. In this paper, a novel method based on tensor factorization and label propagation (TFLP) is proposed to alleviate these problems. First, tensor robust principal component analysis (TRPCA), an effective tensor factorization method, is applied to the original multiple-type miRNA-disease association tensor to obtain a clean and complete low-rank prediction tensor. Second, the Gaussian interaction profile (GIP) kernel is used to measure the similarity of disease pairs and of miRNA pairs. These kernel similarities are then combined with disease semantic similarity and miRNA functional similarity, respectively, to obtain an integrated disease similarity network and an integrated miRNA similarity network. Finally, the low-rank association tensor and the biological similarities, serving as auxiliary information, are introduced into label propagation, which improves prediction performance by iteratively propagating label information from labeled to unlabeled samples. Extensive experiments show that TFLP outperforms other state-of-the-art methods in predicting multiple types of miRNA-disease associations. The data and source code are available at https://github.com/nayu0419/TFLP.
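As an illustration of two of the steps named in the abstract, the following minimal Python sketch computes a GIP kernel from association profiles and runs a standard label propagation over the resulting similarity network. It is not the authors' implementation (see the repository for that). The function names `gip_kernel`, `normalize_similarity` and `label_propagation`, the bandwidth parameter `gamma_prime`, the damping factor `alpha`, the symmetric normalization and the toy matrix are all assumptions made for illustration. The sketch works on a single miRNA-by-disease slice (one association type), propagates on the miRNA side only, and omits both the TRPCA step and the integration with semantic and functional similarity.

```python
import numpy as np


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    Each row is the association profile of one entity (e.g. one miRNA
    across all diseases). The bandwidth is scaled by the mean squared
    norm of the profiles, a common convention (assumed here).
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq = sq_norms.mean()
    gamma = gamma_prime / mean_sq if mean_sq > 0 else gamma_prime
    # Squared Euclidean distance between every pair of rows.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def normalize_similarity(sim):
    """Symmetric normalization D^{-1/2} S D^{-1/2} of a similarity matrix."""
    deg = sim.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    return sim * inv_sqrt[:, None] * inv_sqrt[None, :]


def label_propagation(sim, labels, alpha=0.5, n_iter=100, tol=1e-8):
    """Iterate F <- alpha * S_norm @ F + (1 - alpha) * Y until convergence.

    `labels` (Y) holds the known associations; rows must align with `sim`.
    """
    s_norm = normalize_similarity(np.asarray(sim, dtype=float))
    y = np.asarray(labels, dtype=float)
    f = y.copy()
    for _ in range(n_iter):
        f_next = alpha * s_norm @ f + (1.0 - alpha) * y
        if np.abs(f_next - f).max() < tol:
            return f_next
        f = f_next
    return f


if __name__ == "__main__":
    # Toy slice: 4 hypothetical miRNAs x 3 hypothetical diseases, one type.
    assoc = np.array([[1, 0, 0],
                      [1, 0, 0],
                      [0, 1, 1],
                      [0, 0, 0]], dtype=float)
    mirna_sim = gip_kernel(assoc)
    scores = label_propagation(mirna_sim, assoc, alpha=0.5)
    print(np.round(scores, 3))
```

In this toy setting, the fourth miRNA has no known associations, so any nonzero score it receives comes entirely from its similarity to the other miRNAs. Applying the same routine to each association-type slice, and to the disease side with a disease similarity matrix, would be one way to extend the sketch toward the multi-type setting the abstract describes.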