
Accurately predicting polyadenylation [poly(A)] sites is important for defining the ends of genes and understanding gene regulation mechanisms. Alternative polyadenylation (APA) has been demonstrated to play an important role in generating transcriptome diversity and regulating gene expression. To accurately predict poly(A) and APA sites in Chlamydomonas reinhardtii, a green alga that can be used to produce renewable energy, we proposed a novel model that combines five methods for representing the features of these sites with a combined classifier. We presented a new grouping method based on pattern assembly to classify the poly(A) sites into four groups. To generate the feature space, we used five methods: the predicted RNA secondary structure, the term frequency-inverse document frequency weight, a first-order Markov chain, the pentamer ratio and a position weight matrix. We then developed a heuristic method that forms the combined classifier by weighting multiple classifiers to predict poly(A) sites in each group. The high specificity and sensitivity of this model were demonstrated by testing the four groups of poly(A) sites and the intronic APA sites. The average prediction performance was approximately 8% higher than that of a previous prediction model. For the group without any conserved patterns, the prediction accuracy was 9% higher than that of the previous technique. However, the prediction efficiency for this group was still significantly lower than that for the other groups, indicating the importance of identifying additional signal patterns for poly(A) site prediction. We also predicted the alternative poly(A) sites in introns with good accuracy. This prediction model was designed to be easily expanded with new classifiers or new features. Therefore, this model is applicable to new data or other species.
Our model will be useful both in genome annotation because it predicts the end of a mature transcript and in genetic engineering because it enables researchers to eliminate undesirable poly(A) sites.
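
As a rough illustration of two ideas named in the abstract, the sketch below builds a position weight matrix from aligned sequences and combines several classifier scores by weighting. It is not the authors' implementation: the function names (`build_pwm`, `pwm_score`, `combine_scores`), the uniform background frequency, the pseudocount, the toy sequences and the example weights are all assumptions made for illustration, and a weighted average stands in for the heuristic weighting scheme, which the abstract does not specify.

```python
import math

ALPHABET = "ACGT"


def build_pwm(sites, pseudocount=1.0):
    """Build a log-odds position weight matrix from equal-length sequences.

    Assumes a uniform background (0.25 per base); this is an illustrative
    choice, not something stated in the abstract.
    """
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases stay finite.
        counts = {base: pseudocount for base in ALPHABET}
        for seq in sites:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / 0.25) for base in ALPHABET}
        )
    return pwm


def pwm_score(pwm, window):
    """Sum the per-position log-odds for a window as long as the matrix."""
    return sum(column[base] for column, base in zip(pwm, window))


def combine_scores(scores, weights):
    """Weighted average of per-classifier scores (hypothetical weighting)."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


# Toy aligned sequences, invented for this example (not real poly(A) data).
toy_sites = ["TGTAA", "TGTAA", "TGTAA", "TGTAT"]
pwm = build_pwm(toy_sites)

matching = pwm_score(pwm, "TGTAA")
unrelated = pwm_score(pwm, "ACGCC")

# Two hypothetical classifier outputs in [0, 1], weighted 3:1.
combined = combine_scores([0.9, 0.5], [3.0, 1.0])
```

In this sketch a window resembling the training sequences scores higher than an unrelated one, and a classifier given a larger weight pulls the combined score toward its own output. New feature methods or classifiers could be added by extending the score and weight lists, which mirrors the extensibility the abstract describes.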

Citation

Xiaohui Wu, Guoli Ji, Yong Zeng. In silico prediction of mRNA poly(A) sites in Chlamydomonas reinhardtii. Molecular Genetics and Genomics: MGG. 2012 Dec;287(11-12):895-907.


PMID: 23108961
