A cross-validation scheme for machine learning algorithms in shotgun proteomics.

Viktor Granholm, William Stafford Noble, Lukas Käll

BMC Bioinformatics, 2012

Peptides are routinely identified from mass spectrometry-based proteomics experiments by matching observed spectra to peptides derived from protein databases. The error rates of these identifications can be estimated by target-decoy analysis, which involves matching spectra to shuffled or reversed peptides. Besides estimating error rates, decoy searches can be used by semi-supervised machine learning algorithms to increase the number of confidently identified peptides. As for all machine learning algorithms, however, the results must be validated to avoid issues such as overfitting or biased learning, which would produce unreliable peptide identifications. Here, we discuss how the target-decoy method is employed in machine learning for shotgun proteomics, focusing on how the results can be validated by cross-validation, a frequently used validation scheme in machine learning. We also use simulated data to demonstrate the proposed cross-validation scheme's ability to detect overfitting.
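
The two ideas in the abstract, decoy-based error estimation and held-out scoring, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes each peptide-spectrum match (PSM) is a (score, is_decoy) pair, estimates the false discovery rate as the ratio of decoy to target matches above a threshold, and uses a simple interleaved k-fold split. The names estimate_fdr, cross_validated_scores, train, predict and demo_psms are hypothetical and chosen only for illustration.

```python
def estimate_fdr(psms, threshold):
    """Estimate the FDR among target PSMs scoring at or above `threshold`.

    `psms` is a list of (score, is_decoy) pairs (an assumed representation).
    The estimate is the simple ratio decoys / targets, capped at 1.0.
    """
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # nothing accepted, so no errors to report
    return min(1.0, decoys / targets)


def cross_validated_scores(examples, k, train, predict):
    """Score every example with a model that never saw it during training.

    `train(training_examples)` returns a model and `predict(model, example)`
    returns a score; both are caller-supplied placeholders.
    """
    # Interleaved split: example i goes to fold i % k.
    folds = [examples[i::k] for i in range(k)]
    scored = []
    for i, held_out in enumerate(folds):
        # Train on all folds except the held-out one.
        training = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        model = train(training)
        scored.extend((predict(model, ex), ex) for ex in held_out)
    return scored


# Toy data: four targets and one decoy score at or above 7.0.
demo_psms = [(9.1, False), (8.7, False), (8.2, True), (7.9, False),
             (7.5, False), (6.0, True), (5.5, False)]
demo_fdr = estimate_fdr(demo_psms, 7.0)  # 1 decoy / 4 targets = 0.25
```

Because every PSM is scored by a model trained without it, an overfitted learner cannot inflate the apparent separation between targets and decoys on its own training data, which is the failure mode that cross-validation is meant to expose.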

Citation

Viktor Granholm, William Stafford Noble, Lukas Käll. A cross-validation scheme for machine learning algorithms in shotgun proteomics. BMC Bioinformatics. 2012;13 Suppl 16:S3.

PMID: 23176259
