Faster sequence homology searches by clustering subsequences.

Shuji Suzuki, Masanori Kakuta, Takashi Ishida, Yutaka Akiyama

Bioinformatics (Oxford, England) 2015 Apr 15


Sequence homology searches are used in various fields. New sequencing technologies produce huge amounts of sequence data, which continuously increase the size of sequence databases. As a result, homology searches require large amounts of computational time, especially for metagenomic analysis. We developed a fast homology search method based on database subsequence clustering and implemented it as GHOSTZ. This method clusters similar subsequences from a database so that the seed search and ungapped extension can be performed efficiently, using the triangle inequality to reduce the number of alignment candidates. The database subsequence clustering technique achieved an approximately 2-fold increase in speed without a large decrease in search sensitivity. When measured with metagenomic data, GHOSTZ was ∼2.2-2.8 times faster than RAPSearch and ∼185-261 times faster than BLASTX. The source code is freely available for download at http://www.bi.cs.titech.ac.jp/ghostz/. Contact: akiyama@cs.titech.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
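
The triangle-inequality pruning mentioned in the abstract can be illustrated with a minimal sketch. This is not the GHOSTZ implementation: the Hamming distance, the greedy clustering rule, and the names `hamming`, `build_clusters`, and `search` are all assumptions chosen for illustration, and the abstract does not specify the distance measure the authors actually use. The idea shown is only that, once each database subsequence m stores its distance to a cluster representative r, a query q can skip m whenever |d(q, r) - d(r, m)| already exceeds the allowed distance, because the triangle inequality guarantees d(q, m) is at least that large.

```python
from typing import List, Tuple

Cluster = Tuple[str, List[Tuple[str, int]]]  # (representative, [(member, d(rep, member))])


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("subsequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def build_clusters(subseqs: List[str], radius: int) -> List[Cluster]:
    """Greedy clustering (illustrative assumption, not the paper's procedure).

    Each subsequence joins the first representative within `radius`;
    otherwise it becomes the representative of a new cluster.
    """
    clusters: List[Cluster] = []
    for s in subseqs:
        for rep, members in clusters:
            d = hamming(rep, s)
            if d <= radius:
                members.append((s, d))
                break
        else:
            clusters.append((s, [(s, 0)]))
    return clusters


def search(query: str, clusters: List[Cluster], max_dist: int):
    """Return (hits, number of member comparisons actually performed)."""
    hits: List[Tuple[str, int]] = []
    compared = 0
    for rep, members in clusters:
        d_qr = hamming(query, rep)  # one comparison per cluster
        for member, d_rm in members:
            # Triangle inequality: d(q, m) >= |d(q, r) - d(r, m)|.
            # If this lower bound is already too large, skip the member.
            if abs(d_qr - d_rm) > max_dist:
                continue
            compared += 1
            d_qm = hamming(query, member)
            if d_qm <= max_dist:
                hits.append((member, d_qm))
    return hits, compared
```

Because the bound is a true lower bound for any metric, the pruned search returns exactly the same hits as comparing the query against every subsequence; only the number of full comparisons changes. With a small toy database such as `["ACDEFG", "ACDEFH", "ACDQFH", "WWWWWW", "WWWWWY", "WYWWWY"]` clustered at `radius=2`, a query close to one cluster skips every member of the other.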

Citation

Shuji Suzuki, Masanori Kakuta, Takashi Ishida, Yutaka Akiyama. Faster sequence homology searches by clustering subsequences. Bioinformatics (Oxford, England). 2015 Apr 15;31(8):1183-90


PMID: 25432166
