    The transcriptional regulatory network (TRN) of E. coli consists of thousands of interactions between regulators and DNA sequences. Regulons are typically either determined from resource-intensive experimental measurement of functional binding sites or inferred from analysis of high-throughput gene expression datasets. Recently, independent component analysis (ICA) of RNA-seq compendia has been shown to be a powerful method for inferring bacterial regulons. However, it remains unclear to what extent the regulon structures predicted by ICA have a biochemical basis in promoter sequences. Here, we address this question by developing machine learning models that predict inferred regulon structures in E. coli from promoter sequence features. Models were constructed successfully (cross-validation AUROC ≥ 0.8) for 85% (40/47) of ICA-inferred E. coli regulons. We found that: 1) the presence of a high-scoring regulator motif in the promoter region was sufficient to specify regulatory activity in 40% (19/47) of the regulons; 2) additional features, such as DNA shape and extended motifs that can account for regulator multimeric binding, helped to specify regulon structure for the remaining 60% (28/47) of regulons; and 3) investigating regulons where initial machine learning models failed revealed new regulator-specific sequence features that improved model accuracy. Finally, we found that strong regulatory binding sequences underlie both the genes shared between ICA-inferred and experimental regulons and the genes in the E. coli core pan-regulon of Fur. This work demonstrates that the structure of ICA-inferred regulons can largely be understood through the strength of regulator binding sites in promoter regions, reinforcing the utility of top-down inference for regulon discovery.

    Copyright: © 2024 Qiu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
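
    The abstract's central idea, scoring each promoter against a regulator motif and judging the resulting classifier by AUROC, can be illustrated with a small sketch. This is not the authors' pipeline: the binding sites, promoter sequences, labels, and function names below are hypothetical toy values, and a plain log-odds position weight matrix stands in for the richer feature set (DNA shape, extended motifs) the abstract mentions.

```python
import math

def pwm_from_sites(sites, pseudocount=1.0):
    """Build a log-odds position weight matrix from aligned binding sites.

    Assumes a uniform background (0.25 per base); this is a simplification.
    """
    length = len(sites[0])
    pwm = []
    for i in range(length):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / 0.25) for b in "ACGT"})
    return pwm

def best_motif_score(promoter, pwm):
    """Return the highest log-odds score over all windows of the promoter."""
    width = len(pwm)
    return max(
        sum(pwm[j][promoter[i + j]] for j in range(width))
        for i in range(len(promoter) - width + 1)
    )

def auroc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties count as 0.5, which matches the usual rank-based AUROC definition.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: aligned binding sites for one regulator, and four
# promoters labeled 1 if the gene is in the inferred regulon, else 0.
sites = ["GATAAT", "GATAAT", "GATTAT", "GATAAC"]
promoters = ["CCGATAATGG", "TTGATTATCC", "CCCCGGGGCC", "GGCCGGCCGG"]
labels = [1, 1, 0, 0]

pwm = pwm_from_sites(sites)
scores = [best_motif_score(p, pwm) for p in promoters]
print(auroc(scores, labels))
```

    In this framing, a regulon whose membership is well separated by the motif score alone would correspond to the abstract's first finding; regulons that need extra features would show a lower AUROC under a motif-only score.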

    Citation

    Sizhe Qiu, Xinlong Wan, Yueshan Liang, Cameron R Lamoureux, Amir Akbari, Bernhard O Palsson, Daniel C Zielinski. Inferred regulons are consistent with regulator binding sequences in E. coli. PLoS computational biology. 2024 Jan;20(1):e1011824


    PMID: 38252668
