Monica Chagoyen, Florencio Pazos
Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC), C/ Darwin 3, 28049 Madrid, Spain.
Bioinformatics (Oxford, England) 2010 Feb 1
Gene Ontology (GO), the de facto standard for representing protein functional aspects, is being used beyond the primary goal for which it was designed: protein functional annotation. It is increasingly used to evaluate large sets of relationships between proteins, e.g. protein-protein interactions or mRNA co-expression, under the assumption that related proteins tend to have the same or similar GO terms. Nevertheless, this assumption only holds for terms representing functional groups with biological significance ('classes'), and not for those representing human-imposed aggregations or conceptualizations lacking a biological rationale ('categories'). Using a data-driven approach based on a set of high-quality functional associations, we quantify the functional coherence of GO biological process (GO:BP) terms, as well as of their explicit and implicit relationships, in order to distinguish classes from categories. We show that this quantification agrees with the distinction one would intuitively make between the two concepts. As not all GO:BP terms and relationships are equally supported by current functional associations, any detailed validation of new experimental data using GO:BP, beyond whole-system statistics, should take this imbalance into account. Supplementary data are available at Bioinformatics online.
Monica Chagoyen, Florencio Pazos. Quantifying the biological significance of gene ontology biological processes--implications for the analysis of systems-wide data. Bioinformatics (Oxford, England). 2010 Feb 1;26(3):378-84
PMID: 19965879