Climate Change Data Portal
DOI | 10.1073/pnas.2021171118 |
DeepTFactor: A deep learning-based tool for the prediction of transcription factors | |
Kim G.B.; Gao Y.; Palsson B.O.; Lee S.Y. | |
Publication date | 2021 |
ISSN | 0027-8424 |
Volume | 118, Issue: 2 |
Abstract (English) | A transcription factor (TF) is a sequence-specific DNA-binding protein that modulates the transcription of a set of particular genes, and thus regulates gene expression in the cell. TFs have commonly been predicted by analyzing sequence homology with the DNA-binding domains of TFs already characterized. Thus, TFs that do not show homologies with the reported ones are difficult to predict. Here we report the development of a deep learning-based tool, DeepTFactor, that predicts whether a protein in question is a TF. DeepTFactor uses a convolutional neural network to extract features of a protein. It showed high performance in predicting TFs of both eukaryotic and prokaryotic origins, resulting in F1 scores of 0.8154 and 0.8000, respectively. Analysis of the gradients of prediction score with respect to input suggested that DeepTFactor detects DNA-binding domains and other latent features for TF prediction. DeepTFactor predicted 332 candidate TFs in Escherichia coli K-12 MG1655. Among them, 84 candidate TFs belong to the y-ome, which is a collection of genes that lack experimental evidence of function. We experimentally validated the results of DeepTFactor prediction by further characterizing genome-wide binding sites of three predicted TFs, YqhC, YiaU, and YahB. Furthermore, we made available the list of 4,674,808 TFs predicted from 73,873,012 protein sequences in 48,346 genomes. DeepTFactor will serve as a useful tool for predicting TFs, which is necessary for understanding the regulatory systems of organisms of interest. We provide DeepTFactor as a stand-alone program, available at https://bitbucket.org/kaistsystemsbiology/deeptfactor. © 2021 National Academy of Sciences. All rights reserved. |
Keywords (English) | ChIP-exo; Deep learning; Transcription factor; Transcription regulation; Y-ome |
Language | English |
Source journal | Proceedings of the National Academy of Sciences of the United States of America |
Document type | Journal article |
Item identifier | http://gcip.llas.ac.cn/handle/2XKMVOVA/181076 |
Author affiliations | Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 Plus Program), Korea Advanced Institute of Science and Technology, Daejeon, 34141, South Korea; Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Korea Advanced Institute of Science and Technology, Daejeon, 34141, South Korea; KAIST Institute for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, 34141, South Korea; KAIST Institute for Artificial Intelligence, Korea Advanced Institute of Science and Technology, Daejeon, 34141, South Korea; BioProcess Engineering Research Center, Korea Advanced Institute of Science and Technology, Daejeon, 34141, South Korea; BioInformatics Research Center, Korea Advanced Institute of Science and Technology, Daejeon, 34141, South Korea; Division of Biological Sciences, University of California San Diego, San Diego, CA 92093, United States; Department of Bioengineering, Univers... |
Recommended citation: GB/T 7714 | Kim G.B.,Gao Y.,Palsson B.O.,et al. DeepTFactor: A deep learning-based tool for the prediction of transcription factors[J],2021,118(2). |
APA | Kim G.B.,Gao Y.,Palsson B.O.,&Lee S.Y..(2021).DeepTFactor: A deep learning-based tool for the prediction of transcription factors.Proceedings of the National Academy of Sciences of the United States of America,118(2). |
MLA | Kim G.B.,et al."DeepTFactor: A deep learning-based tool for the prediction of transcription factors".Proceedings of the National Academy of Sciences of the United States of America 118.2(2021). |
Files in this item | There are no files associated with this item. |