Report on Assessment of Smith-Waterman Sequence Search Tools Implemented in Bioccelerator, FDF and MasPar
Abstract
Database search performance was analysed by running a set of query protein sequences against SWISS-PROT with three implementations of the Smith-Waterman search algorithm: Bioccelerator, Fast Data Finder (FDF), and MasPar. In all three, search speed approaches an asymptote as the length of the query sequence increases. Given the configurations available for this study, FDF is the fastest, followed by Bioccelerator and then MasPar. An increase in the size of the subject database brings about a disproportionate increase in MasPar search time, while in Bioccelerator and FDF the increase is closer to proportionate.
Analysis of the ranking of search results indicated that the hit sequences are appropriately classified into groups of evolutionarily or functionally related members, and that all three implementations produce the same groups. The range of scores for the closely related group is distinctly higher than that of the other groups. For every test query except one, the three implementations pick all the related sequence entries from the SWISS-PROT database as top hits before picking up any unrelated sequences. Even though the default gap penalties (especially the gap extension penalty) differ between the implementations, the ranking order of the hits is remarkably similar among them: in general a reordering of 5 takes place, and such a reordering always occurs within a group. In 9 of the 17 query instances, the scores of the top group are similar among the implementations, while in the remaining 8 cases a difference of 2-10% can be observed.
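To illustrate why different gap extension penalties can shift scores while leaving the ranking largely intact, the following is a minimal sketch of Smith-Waterman local alignment scoring with affine gap penalties (the Gotoh recurrences). It is not the code of any of the three implementations assessed. The function name, the simple match/mismatch scoring, and the penalty values are assumptions chosen for illustration; real protein searches use a substitution matrix such as BLOSUM or PAM, and the default penalties of the tested systems are not given here.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap_open=3, gap_extend=1):
    """Return the best local alignment score of sequences a and b.

    Illustrative scoring only (assumed values): a gap of length k costs
    gap_open + (k - 1) * gap_extend.
    """
    neg_inf = float("-inf")
    m = len(b)
    h_prev = [0] * (m + 1)        # H values for the previous row
    f_prev = [neg_inf] * (m + 1)  # best score ending in a vertical gap, previous row
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * (m + 1)
        f_cur = [neg_inf] * (m + 1)
        e = neg_inf               # best score ending in a horizontal gap, this row
        for j in range(1, m + 1):
            # Open a new gap from H, or extend the gap already in progress.
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: a cell never drops below zero.
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


# Same pair of sequences, two different gap extension penalties.
print(smith_waterman_score("AAAACCCC", "AAAAGGCCCC", gap_extend=1))  # 12
print(smith_waterman_score("AAAACCCC", "AAAAGGCCCC", gap_extend=3))  # 10
```

In this toy case the higher extension penalty lowers the score of the gapped alignment but the pair still scores well above an unrelated pair, which is consistent with the observation that scores differ by a few percent while related hits stay grouped together.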
Availability
The complete report is available both as HTML and as a PDF document. A separate document (HTML or PDF) shows the cost-effectiveness of the machines tested, but note that this comparison is not final because most prices are negotiable.
Written by: Alphonse Thanaraj
Resources and further information
European Bioinformatics Institute http://www.ebi.ac.uk/
Industry Support Programme http://industry.ebi.ac.uk/
Sequence assessment report starting page http://www.ebi.ac.uk/~thanaraj/seqassess/report.html
More info:
Alphonse Thanaraj
EMBL-Outstation Hinxton
European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge CB10 1SD
United Kingdom
E-mail: thanaraj@ebi.ac.uk
Phone: +44 (0) 1223 494 650
External sites are not endorsed by EMBL-EBI