
With the rapid expansion in the number of published papers in the biomedical field, finding relevant articles has become a demanding task for researchers. This has led to increasing interest in text mining tools that help search the literature and identify the most relevant documents or information. One specific topic of interest is the identification of articles that might be used for extracting protein-protein interactions. Using the BioCreative III Article Classification Task dataset, composed of PubMed abstracts classified as relevant or non-relevant for describing protein-protein interactions, we compare different classification methods with different sets of features. The best results (an area under the interpolated precision-recall curve of 0.654) indicate that the proposed classification strategy could be incorporated into database curation workflows in order to prioritize articles for extraction of protein-protein interactions. Furthermore, we also analysed the use of this method for ranking documents resulting from general PubMed queries, and propose that this approach could be useful for general researchers looking for publications describing protein-protein interactions within a particular topic of interest. Copyright 2011 The Author(s). Published by Journal of Integrative Bioinformatics.
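
The evaluation measure named in the abstract, the area under the interpolated precision-recall curve, can be illustrated with a minimal sketch. It assumes one common definition: the documents are ranked by classifier score, precision is measured at each rank where a relevant document appears, and the interpolated precision at a given recall is the highest precision reached at that recall or any higher one. The function name `interpolated_pr_auc` and the input format are hypothetical, and this is not necessarily the exact procedure of the official BioCreative III evaluation script.

```python
def interpolated_pr_auc(ranked_labels):
    """Area under the interpolated precision-recall curve for a ranked list.

    ranked_labels: iterable of 0/1 relevance labels, ordered from the
    highest-scoring document to the lowest-scoring one (assumed input format).
    """
    labels = list(ranked_labels)
    total_relevant = sum(labels)
    if total_relevant == 0:
        return 0.0  # no relevant documents: the curve is undefined, report 0

    # Precision at every rank where a relevant document is retrieved.
    precisions = []
    hits = 0
    for rank, label in enumerate(labels, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)

    # Each relevant document raises recall by 1 / total_relevant.
    # Walking backwards keeps a running maximum, which is the
    # interpolated precision at that recall level.
    area = 0.0
    running_max = 0.0
    for precision in reversed(precisions):
        running_max = max(running_max, precision)
        area += running_max / total_relevant
    return area


if __name__ == "__main__":
    # Toy ranking: relevant documents at ranks 1, 3 and 4 (roughly 0.83).
    print(interpolated_pr_auc([1, 0, 1, 1, 0]))
```

A perfect ranking, with every relevant abstract placed ahead of every non-relevant one, scores 1.0 under this definition, which gives a reference point for reading the reported 0.654.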

Citation

Sérgio Matos, José Luís Oliveira. Classification methods for finding articles describing protein-protein interactions in PubMed. Journal of Integrative Bioinformatics. 2011;8(3):178


PMID: 21926441
