Sequence homology searches are used in various fields. New sequencing technologies produce huge amounts of sequence data, which continuously increase the size of sequence databases. As a result, homology searches require large amounts of computational time, especially for metagenomic analysis. We developed a fast homology search method based on database subsequence clustering, and implemented it as GHOSTZ. This method clusters similar subsequences from a database to perform an efficient seed search and ungapped extension, using the triangle inequality to reduce the number of alignment candidates. The database subsequence clustering technique achieved an ∼2-fold increase in speed without a large decrease in search sensitivity. When measured with metagenomic data, GHOSTZ was ∼2.2-2.8 times faster than RAPSearch and ∼185-261 times faster than BLASTX.

Availability: The source code is freely available for download at http://www.bi.cs.titech.ac.jp/ghostz/
Contact: akiyama@cs.titech.ac.jp
Supplementary information: Supplementary data are available at Bioinformatics online.

© The Author 2014. Published by Oxford University Press.
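The pruning idea named in the abstract can be illustrated with a small sketch. This is not the GHOSTZ implementation: it assumes fixed-length seeds, uses Hamming distance as a stand-in metric, and applies a simple greedy clustering. The names `hamming`, `build_clusters`, and `search` are hypothetical. Each database seed stores its distance to its cluster representative, so at query time one distance to the representative gives a lower bound |d(q, r) - d(r, m)| on the distance to every member m, and members whose bound exceeds the threshold are skipped without a comparison.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("seeds must have equal length")
    return sum(x != y for x, y in zip(a, b))


def build_clusters(seeds, radius):
    """Greedy clustering (an assumption, not the published procedure).

    Each seed joins the first cluster whose representative is within
    `radius`; otherwise it starts a new cluster. The distance from each
    member to its representative is stored for later pruning.
    """
    clusters = []  # list of (representative, [(member, dist_to_rep), ...])
    for seed in seeds:
        for rep, members in clusters:
            d = hamming(seed, rep)
            if d <= radius:
                members.append((seed, d))
                break
        else:
            clusters.append((seed, [(seed, 0)]))
    return clusters


def search(query, clusters, threshold):
    """Return seeds within `threshold` of `query` and the number of
    distance computations actually performed."""
    hits = []
    computed = 0
    for rep, members in clusters:
        d_qr = hamming(query, rep)
        computed += 1
        for member, d_rm in members:
            # Triangle inequality: d(q, m) >= |d(q, r) - d(r, m)|.
            # If the lower bound already exceeds the threshold, skip.
            if abs(d_qr - d_rm) > threshold:
                continue
            if d_rm == 0:
                # Member is identical to the representative.
                d_qm = d_qr
            else:
                d_qm = hamming(query, member)
                computed += 1
            if d_qm <= threshold:
                hits.append(member)
    return hits, computed
```

Because the bound is a true lower bound for any metric, the pruning never discards a seed that lies within the threshold; it only avoids comparisons. How much work it saves depends on how tightly the database seeds cluster.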

Citation

Shuji Suzuki, Masanori Kakuta, Takashi Ishida, Yutaka Akiyama. Faster sequence homology searches by clustering subsequences. Bioinformatics (Oxford, England). 2015 Apr 15;31(8):1183-90

PMID: 25432166
