TY - JOUR
T1 - Deciphering the functional diversity of DNA-binding transcription factors in Bacteria and Archaea organisms
AU - Flores-Bautista, Emanuel
AU - Hernandez-Guerrero, Rafael
AU - Huerta-Saquero, Alejandro
AU - Tenorio-Salgado, Silvia
AU - Rivera-Gomez, Nancy
AU - Romero, Alba
AU - Ibarra, Jose Antonio
AU - Perez-Rueda, Ernesto
N1 - Publisher Copyright: © 2020 Flores-Bautista et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2020/8
Y1 - 2020/8
N2 - DNA-binding transcription factors (TFs) play a central role in the regulation of gene expression in prokaryotic organisms, and similarities at the sequence level have been reported. The predicted abundance of these proteins varies with genome size: small genomes contain a low proportion of TFs and large genomes contain a high proportion. In this work, we analyzed a collection of 668 experimentally validated TFs across 30 different species from diverse taxonomic classes, including Escherichia coli K-12, Bacillus subtilis 168, Corynebacterium glutamicum, and Streptomyces coelicolor, among others. This collection of TFs, together with 111 hidden Markov model profiles associated with DNA-binding TFs collected from databases such as PFAM and DBD, was used to identify the repertoire of proteins putatively devoted to gene regulation in 1321 representative genomes of Archaea and Bacteria. The predicted regulatory proteins were subsequently analyzed in terms of their genomic context, allowing the prediction of functions for TFs and their neighboring genes, such as genes involved in virulence, enzymatic functions, phosphorylation mechanisms, and antibiotic resistance. The functional analysis associated with PFAM groups showed that diverse functional categories were significantly enriched in the collection of TFs and the proteins encoded by the neighboring genes, in particular small-molecule binding and amino acid transmembrane transporter activities associated with the LysR family, and proteins devoted to cellular aromatic compound metabolic processes or responses to drugs, stress, or abiotic stimuli in the MarR family. We consider that, with the increasing data derived from new technologies, novel TFs can be identified and help improve the predictions for this class of proteins in complete genomes. The complete collection of experimentally characterized and predicted TFs is available at http://web.pcyt.unam.mx/EntrafDB/.
UR - http://www.scopus.com/inward/record.url?scp=85089817268&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0237135
DO - 10.1371/journal.pone.0237135
M3 - Article
C2 - 32822422
AN - SCOPUS:85089817268
SN - 1932-6203
VL - 15
JO - PLoS ONE
JF - PLoS ONE
IS - 8
M1 - e0237135
ER -