FASTA3, BLASTs and Smith&Waterman Services at the EBI
by Rodrigo Lopez, Services Programme, EMBL-Outstation Hinxton, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
Database searching is the main tool for sequence analysis. It makes it possible not only to derive homologies and similarities between newly sequenced genes but also to infer from the results some notion of family membership and function. Traditionally, searches concentrate on DNA and protein sequence databanks, and the algorithms used often revolve around Bill Pearson's fasta (1), NCBI's blast (2) and various implementations of the rigorous Smith & Waterman algorithm (3).
The EMBL-EBI has recently made three new services public. Fasta3 is an improved implementation of Bill Pearson's generic fasta program that runs on multiprocessor systems; the main effect of parallelising the software is a significant gain in speed without any loss of sensitivity. Warren Gish's Washington University blast2 (WU-Blast2) and NCBI's blast 1.4 are also available, and NCBI's blast2 will follow soon. WU-Blast2 performs a blast search that allows gaps and aligns the hits in a global manner. Compared with NCBI's blast 1.4, this results in much higher sensitivity, since the number of false positives is reduced. These first two services run on SGI Challenge servers as well as DEC 8400s. MPSrch, which ran on a (now decommissioned) MasPar, has been replaced by a Compugen Bioccelerator-2 Smith & Waterman implementation.
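To make the idea of a rigorous Smith & Waterman search concrete, the following is a minimal sketch of the local alignment scoring recurrence. It is not the Bioccelerator or MPSrch implementation: the function name, the match/mismatch scores and the linear gap penalty are assumptions chosen for illustration, and production services use substitution matrices and affine gap penalties.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Illustrative only: scores and the linear gap penalty are assumed values.
    """
    prev = [0] * (len(b) + 1)   # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment diagonally with a match or a mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Open or extend a gap in either sequence.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # Clamping at zero is what makes the alignment local.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Every cell of the matrix is evaluated, which is why this approach is rigorous but slow compared with the heuristic fasta and blast programs, and why it benefits from dedicated hardware.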
All three new services offer interactive as well as email access. Users with good network connectivity may run their searches interactively with minimal waiting time (typically one minute), while those who submit sequences by email will wait slightly longer but typically receive results within 30 minutes. Execution times vary slightly depending on network traffic and the load on the servers.
When searching interactively, the result of a search is displayed with ready-made links to the EBI's SRS server, from where it is possible to study in more depth the relationships among neighbouring database search hits.
These services are primarily based on WWW technology, and the URLs are listed at the end of this article.
About the Job Submission Interface
EMBL-EBI has adopted a straightforward approach in this matter. The main component of the interface is an HTML form which allows the user to change some of the parameters typical of fasta and blast. All the program parameters are set automatically to default values, and the user is only required to provide their email address, an optional title for the search, the type of search (interactive or email) and the sequence to be compared, either by cutting and pasting it into a window or by uploading a file from their computer. Comprehensive help can be obtained at various locations on the form.
Once a job has been submitted, a report appears on the screen. This report tells the user what to expect, depending on how the job was submitted. If the job is interactive, the user is asked to stay on that page and, after a short pause, the result of the search appears on the screen. Otherwise, confirmation of the email submission is presented along with the search parameter details, in a format suitable for input from any mailer.
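The submission flow described above can be sketched as a small data structure plus a report function. This is a hypothetical illustration only: the class name, field names and report wording are assumptions and do not reflect the actual form fields or server code used at the EBI.

```python
from dataclasses import dataclass


@dataclass
class SearchJob:
    """Hypothetical model of the fields the user must supply on the form."""
    email: str
    sequence: str
    interactive: bool = True   # False means results are returned by email
    title: str = ""            # optional title for the search


def validate(job):
    """Return a list of problems; an empty list means the job is acceptable."""
    problems = []
    if "@" not in job.email:
        problems.append("email address looks invalid")
    if not job.sequence.strip():
        problems.append("sequence is empty")
    return problems


def submission_report(job):
    """Build the on-screen report shown after a job is submitted."""
    problems = validate(job)
    if problems:
        return "Rejected: " + "; ".join(problems)
    if job.interactive:
        return "Interactive job accepted: please stay on this page for the results."
    return f"Email job accepted: results will be sent to {job.email}."


if __name__ == "__main__":
    print(submission_report(SearchJob(email="user@example.org", sequence="ACGT")))
```

All other program parameters are left out of the sketch because, as on the form, they would simply take their default values unless the user changes them.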
About the databases
The databases that can currently be searched from the EBI are listed below:
|   | swissprot | swissnew | trembl | tremblnew | gprpc | pdb | swall | EMBL | EMBL divisions | EMBLNEW | EMALL | ESTs |
|   | x         | x        | x      | x         |       | x   | x     | x    | x              | x       | x     | x    |
|   | x         | x        | x      | x         | x     | x   | x     | x    |                | x       |       | x    |
|   | x         | x        | x      | x         |       | x   |       |      |                |         |       |      |
Article by: Rodrigo Lopez
Resources and further information
European Bioinformatics Institute http://www.ebi.ac.uk/
BLAST (WU-Blast2 & NCBI's blast 1.4) http://www2.ebi.ac.uk/blast2/
FASTA3 http://www2.ebi.ac.uk/fasta3/
S&W http://www2.ebi.ac.uk/bic_sw/ or http://www2.ebi.ac.uk/genweb/
SRS server http://srs.ebi.ac.uk:5000/
Literature references:
- 1) W. R. Pearson and D. J. Lipman (1988), "Improved Tools for Biological Sequence Comparison", PNAS 85:2444-2448; W. R. Pearson (1990), "Rapid and Sensitive Sequence Comparison with FASTP and FASTA", Methods in Enzymology 183:63-98.
- 2) S. F. Altschul, W. Gish, W. Miller, E. W. Myers and D. J. Lipman (1990), "Basic Local Alignment Search Tool", J. Mol. Biol. 215:403-410.
- 3) T. F. Smith and M. S. Waterman (1981), "Comparison of Biosequences", Advances in Applied Mathematics 2:482-489.
External sites are not endorsed by EMBL-EBI