New Developments from the Barton Group
Geoff Barton, EMBL Outstation - Hinxton, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
Update to 3Dee - the database of protein structural domains
Uwe Dengler and Geoff Barton
3Dee is a curated database of domains for proteins of known three-dimensional structure. It was first made available in November 1994 from the University of Oxford, and moved to EBI with Geoff Barton in October 1997. 3Dee contains domain definitions for all proteins whose structure has been determined by X-ray or NMR methods. Unlike the SCOP, CATH and Dali/FSSP databases, 3Dee includes domains that are made up of multiple chains. The domains are clustered into families of similar sequence, and representatives of each family are then clustered by a 3D structure comparison algorithm into fold families of similar structure. The database also contains multiple structure alignments for each fold family, as well as secondary structure definitions for each domain. Over the last nine months, the software used to update 3Dee has been extensively improved and a full update of the database has been completed. This update covers structures up to August 1998, and a further update is in progress to bring the database into line with the current PDB. With these improvements, weekly updates to the database are expected to be possible from Spring 2000.
JNET: Protein secondary structure prediction at over 76% accuracy
James Cuff and Geoff Barton
The JNET protein secondary structure prediction method is a neural network predictor that works from a range of different multiple sequence alignment profiles. These include the output of a PSI-BLAST search as well as HMM profiles. In a blind test on 406 sequence non-redundant proteins, JNET predicts secondary structure at 76.4% accuracy. JNET also provides a 'confidence number' that ranges from 0 to 10 for each predicted residue. Residues with a confidence number of 5 or more are predicted at 84% accuracy on average and account for 68% of all residues. At the time of writing, JNET was the most accurate secondary structure prediction method so far reported. The JNET software is available for download from the Barton Group pages, or can be run on the JPRED server at EBI (see below).
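The trade-off between accuracy and coverage when filtering on the confidence number can be illustrated with a short Python sketch. This is not the JNET code: the function name, the H/E/C state letters and the toy data are all assumptions made for illustration. It simply measures per-residue accuracy on the residues at or above a confidence threshold, and the fraction of residues that pass the threshold.

```python
def accuracy_and_coverage(predicted, observed, confidence, threshold=5):
    """Return (accuracy, coverage) over residues with confidence >= threshold.

    predicted, observed: strings of per-residue states (hypothetical H/E/C letters).
    confidence: one integer per residue.
    """
    if not (len(predicted) == len(observed) == len(confidence)):
        raise ValueError("all inputs must have the same length")
    # Keep only the residues whose confidence reaches the threshold.
    kept = [(p, o) for p, o, c in zip(predicted, observed, confidence)
            if c >= threshold]
    if not kept:
        return 0.0, 0.0
    correct = sum(1 for p, o in kept if p == o)
    return correct / len(kept), len(kept) / len(predicted)


# Toy data, invented for this example only.
predicted = "HHHHEEECCC"
observed = "HHHCEEECCH"
confidence = [9, 8, 7, 3, 6, 6, 5, 2, 4, 1]

print(accuracy_and_coverage(predicted, observed, confidence, threshold=0))  # (0.8, 1.0)
print(accuracy_and_coverage(predicted, observed, confidence, threshold=5))  # (1.0, 0.6)
```

In the toy data, raising the threshold removes both wrong predictions but leaves only 60% of the residues, which is the same kind of trade-off reported above for JNET at a confidence number of 5 or more.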
JNET: Buried residue prediction at 86.6% accuracy
Knowledge of how buried a residue is can indicate its importance to the fold or function of a protein. The location of buried residues can also be used to guide the prediction of the tertiary structure of a protein, whether by homology modelling, fold recognition or ab initio methods. JNET (reference 1) gives three predictions of residue burial, using exposure cutoffs of 0% (total burial), 5% and 25%. The prediction accuracies for these three categories are 86.6%, 79.9% and 76.2% respectively.
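As an illustration of what the three burial categories mean, the following Python sketch labels residues as buried or exposed at each cutoff from a list of relative solvent accessibility values. The function name, the 'B'/'-' labels, the use of "less than or equal to" at the cutoff and the toy values are assumptions for this example; the article does not specify how JNET derives or encodes its burial states.

```python
# Exposure cutoffs (percent relative accessibility) named in the text.
CUTOFFS = (0.0, 5.0, 25.0)


def burial_states(relative_accessibility):
    """Map each cutoff to a string with 'B' (buried) or '-' (exposed) per residue.

    A residue is called buried at a cutoff if its relative accessibility is
    at or below that cutoff (an assumed convention).
    """
    return {
        cutoff: "".join("B" if acc <= cutoff else "-"
                        for acc in relative_accessibility)
        for cutoff in CUTOFFS
    }


# Toy per-residue relative accessibilities in percent, invented for this example.
example = [0.0, 3.2, 12.0, 40.5, 25.0]
for cutoff, states in burial_states(example).items():
    print(f"{cutoff:>4}% cutoff: {states}")
```

The stricter the cutoff, the fewer residues fall into the buried class, so the three predictions describe progressively looser definitions of burial for the same protein.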
JPRED2: Enhanced Secondary Structure Prediction Server
James Cuff and Geoff Barton
The JPRED secondary structure prediction server takes either a single sequence or a multiple protein sequence alignment, and runs a range of secondary structure prediction methods on the sequence(s). If a single sequence is given, JPRED automatically searches for homologues and builds a multiple sequence alignment to feed the prediction algorithms. JPRED includes a consensus prediction method that is more accurate than any of the constituent methods (Cuff and Barton, 1999). The results of the prediction are presented either in HTML or within the JalView multiple alignment editor/viewer. Since it was first made available in May 1998, the original JPRED server has made over 18,000 predictions for users worldwide. JPRED2 adds the new JNET secondary structure prediction method to the JPRED server. It also revises the method used to search for homologues when building the multiple alignment: this is now done with PSI-BLAST, a more sensitive iterative procedure. JPRED2 was moved to a new, faster server in December 1999.
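The idea of combining several predictions into a consensus can be sketched in Python with a plain per-residue majority vote. This is only one simple way to form a consensus and is not claimed to be the scheme JPRED uses, which this article does not describe; the function name, the tie-breaking rule and the toy predictions are assumptions.

```python
from collections import Counter


def majority_consensus(predictions, tie_state="-"):
    """Combine equal-length prediction strings by per-residue majority vote.

    When the two most common states at a position are tied, tie_state is
    emitted instead (an arbitrary choice made for this sketch).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    consensus = []
    for column in zip(*predictions):
        counts = Counter(column).most_common()
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            consensus.append(tie_state)
        else:
            consensus.append(counts[0][0])
    return "".join(consensus)


# Toy outputs from three hypothetical methods (H = helix, E = strand, - = other).
methods = ["HHHEE--", "HHEEE--", "HH-EEH-"]
print(majority_consensus(methods))  # HH-EE--
```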
ProtEST: Protein multiple sequence alignments from ESTs
James Cuff, Ewan Birney, Michele Clamp and Geoff Barton
The aim of ProtEST (reference 2) is to simplify the process of gathering sequence data from both the protein database and the available ESTs into a multiple protein sequence alignment. When provided with a protein sequence, ProtEST searches protein and EST collections, automatically clusters the EST sequences, assembles them, removes redundancies, checks for sequence errors, translates the results into protein and builds a multiple alignment. Since ProtEST gathers sequences that are not available in the SWALL protein sequence database, it provides additional data for both prediction and analysis.
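Of the steps listed above, the translation of nucleotide sequence into protein is the easiest to show in isolation. The Python sketch below translates a DNA sequence in all six reading frames using the standard genetic code. It is a generic illustration rather than ProtEST's own procedure, which is not detailed here; the function names, the frame labels and the example sequence are assumptions, and codons containing ambiguous bases are simply written as 'X'.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(sequence):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    sequence = sequence.upper()
    frames = {}
    for strand, dna in (("+", sequence), ("-", reverse_complement(sequence))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(dna[offset:])
    return frames


# Short made-up sequence, for illustration only.
for frame, protein in six_frame_translations("ATGGCCTAA").items():
    print(frame, protein)
```

A real EST pipeline would also have to cope with frameshifts and sequencing errors, which is presumably why the text lists error checking as a separate step before translation.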
Article by: Geoff Barton
Resources and further information
European Bioinformatics Institute http://www.ebi.ac.uk/
The Barton Group http://barton.ebi.ac.uk/ If you are interested in the WWW services, software or data described in this article, please follow the links from http://barton.ebi.ac.uk/. The precise addresses for each service may change, but the above URL will not.
3Dee - database of protein structural domains http://barton.ebi.ac.uk/servers/3Dee.html
JNET: protein secondary structure prediction http://barton.ebi.ac.uk/servers/jnet.html
JPRED2: enhanced secondary structure prediction http://barton.ebi.ac.uk/servers/jpred.html
ProtEST: protein multiple sequence alignments from ESTs http://barton.ebi.ac.uk/servers/protest.html
References
- 1) Cuff, J. A. and Barton, G. J. (2000), PROTEINS, in press, "Application of Multiple Sequence Alignment Profiles to Improve Protein Secondary Structure Prediction".
- 2) Cuff, J. A., Birney, E., Clamp, M. E. and Barton, G. J. (2000), Bioinformatics, in press, "ProtEST: Protein multiple sequence alignments from Expressed Sequence Tags".