Deep-4mCW2V: A sequence-based predictor to identify N4-methylcytosine sites in Escherichia coli.

Affiliations

Center for Informational Biology and School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China.

Publication information

Methods. 2022 Jul;203:558-563. doi: 10.1016/j.ymeth.2021.07.011. Epub 2021 Aug 2.

Abstract

N4-methylcytosine (4mC) is a DNA modification that can regulate several biological processes, such as transcription regulation, replication, and gene expression. Precisely recognizing 4mC sites in genomic sequences can provide specific knowledge about their genetic roles. This study aimed to develop a deep learning-based model to predict 4mC sites in Escherichia coli. In the model, DNA sequences were encoded with the word embedding technique 'word2vec'. The resulting features were fed into a 1-D convolutional neural network (CNN) to discriminate 4mC sites from non-4mC sites in the Escherichia coli genome. Evaluation on an independent dataset showed that our model achieved an overall accuracy of 0.861, about 4.3% higher than the existing model. For the convenience of researchers, the data and source code of the model can be freely downloaded from https://github.com/linDing-groups/Deep-4mCW2V.
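The encode-then-convolve pipeline described in the abstract can be sketched in plain Python to show how data might flow through it. This is not the authors' implementation: the k-mer length, embedding size, filter width, and example sequence are all assumptions chosen for illustration, and the embedding table is filled with random placeholder vectors rather than trained word2vec vectors. The function names (kmer_tokens, embed, conv1d_relu) are hypothetical.

```python
import random


def kmer_tokens(seq, k=3):
    """Split a DNA sequence into overlapping k-mers, the 'words' for embedding."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def embed(tokens, table, dim, rng):
    """Look up one vector per k-mer.

    Assumption: unseen k-mers get a random placeholder vector. A real
    pipeline would load vectors learned by word2vec instead.
    """
    rows = []
    for tok in tokens:
        if tok not in table:
            table[tok] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        rows.append(table[tok])
    return rows


def conv1d_relu(rows, kernel):
    """Valid 1-D convolution along the token axis with one filter, then ReLU."""
    width = len(kernel)
    out = []
    for start in range(len(rows) - width + 1):
        total = 0.0
        for offset in range(width):
            window_row = rows[start + offset]
            total += sum(x * w for x, w in zip(window_row, kernel[offset]))
        out.append(max(0.0, total))
    return out


rng = random.Random(0)          # fixed seed so the sketch is repeatable
dim, k, width = 4, 3, 2         # illustrative sizes, not taken from the paper
sequence = "ACGTCCGATCGGATCCA"  # made-up sequence, not from the dataset

tokens = kmer_tokens(sequence, k)
table = {}
matrix = embed(tokens, table, dim, rng)  # shape: (num_tokens, dim)
kernel = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(width)]
feature_map = conv1d_relu(matrix, kernel)
pooled = max(feature_map)       # global max pooling over positions
```

In a full model, many such filters would produce a pooled feature vector that a dense layer maps to a 4mC versus non-4mC score. The released source code at the GitHub link above is the reference for the actual architecture and hyperparameters.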
