Bioinformatics initiatives, software and databases
The Bio-Eye section of the "BioInformer" aims to keep you informed of new initiatives and technologies as well as the more "mundane" announcements of new software and database releases. These are not necessarily EMBL-EBI's, but news about the above topics is welcome from every person or organisation working in the field of bioinformatics.
You are welcome to send announcements to email@example.com
SWISS-PROT will need to acquire additional funds to keep up with the enormous amounts of data, and to maintain and improve the very high quality of the data and annotations in the database. The model that has been chosen is to request yearly license fees from non-academic users for access to the database. Academic users will not be affected. Read more about this important development in this press release.
The SRS software has been acquired by Lion biosciences AG and commercial users will have to pay a license fee. The evolution from a free package to a commercial one is a healthy and obvious continuation, since this will put technical and customer support in place and accelerate the future development of SRS.
Campylobacter jejuni genome finished by Sanger Centre
The Sanger Centre recently completed the sequence of the food poisoning organism Campylobacter jejuni. The sequence is 1,641,480 bp in length (with 25 polymorphic regions), and was generated from 33,824 sequencing reads.
The "Genome Analysis and Protein Family Maker" has been developed for the analysis of most of the complete bacterial genomes announced since 1995.
MView is a free tool for converting the results of a sequence database search into the form of a multiple alignment of hits stacked against the query.
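As a rough illustration of what "stacking hits against the query" means, the sketch below places each pairwise hit under the ungapped query and drops any residues the hit inserts relative to the query. This is not MView's own code or input format: the function name `stack_hits` and the tuple layout of `hits` are assumptions made for this example only.

```python
def stack_hits(query, hits):
    """Return (name, row) pairs with every hit laid out in query coordinates.

    `hits` is a hypothetical list of tuples
    (name, query_start, aligned_query, aligned_hit), where query_start is the
    0-based query position at which the pairwise alignment begins and the two
    aligned strings have equal length, with '-' marking gaps.
    """
    rows = [("query", query)]
    for name, start, aligned_query, aligned_hit in hits:
        row = ["-"] * len(query)  # unaligned query positions stay as gaps
        pos = start
        for q_char, h_char in zip(aligned_query, aligned_hit):
            if q_char == "-":
                # Insertion in the hit: there is no query column for it, so skip it.
                continue
            row[pos] = h_char
            pos += 1
        rows.append((name, "".join(row)))
    return rows


if __name__ == "__main__":
    example_hits = [
        ("hit1", 1, "KTA-YI", "KSAGYI"),
        ("hit2", 3, "AYIAK", "A-IAK"),
    ]
    for name, row in stack_hits("MKTAYIAK", example_hits):
        print(f"{name:<6}{row}")
```

Every row has the same length as the query, which is what lets the hits be read as one multiple alignment.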
SEALS (A System for Easy Analysis of Lots of Sequences) is a software package expressly designed for large-scale research projects in bioinformatics. Using a friendly, scalable command-line user interface, SEALS provides dozens of commands to help the user quickly implement standard sequence analysis protocols, design new investigations, and generally Get Things Done with dispatch.
The BioCatalog is a software directory of general interest in molecular biology and genetics. This release contains 571 molecular biology software programs.
The Dali Domain Dictionary not only provides a global structural classification of proteins, but also a comprehensive description of families of protein sequences grouped around representative proteins of known structure.
In this release, the EMBL Nucleotide Sequence Database contains 2,689,618 sequence entries comprising 1,904,091,473 nucleotides. This represents an increase of about 15.5% over release 55.
The international ImMunoGeneTics database, IMGT, is a high-quality integrated database specialising in genes important in the function of the immune system and involved in immune recognition of all vertebrate species. Several new developments to IMGT are presented.
INFOGENE is a collection of databases of known and predicted genes and proteins. Organisms that are covered in these databases are: human, mouse, drosophila and arabidopsis.
Nrdb90 is a non-redundant sequence database that facilitates (i) faster homology searches using, for example, Fasta or Blast, and (ii) unified annotation for all sequences clustered around a representative.
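The idea of clustering sequences around a representative can be sketched as a greedy pass over the sequences, longest first. The code below is only an illustration under stated assumptions: the 90% threshold is inferred from the database name, the `identity` function is a crude position-by-position stand-in for a real alignment, and none of this is the actual Nrdb90 procedure.

```python
def identity(a, b):
    """Crude stand-in for alignment identity: fraction of matching positions
    over the shorter sequence, compared from the first residue without gaps."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def greedy_cluster(seqs, threshold=0.9):
    """Assign each sequence to the first representative it matches at or above
    `threshold`; otherwise make it a new representative.

    `seqs` maps a sequence name to its residues. Returns a dict mapping each
    representative's name to the list of member names (itself included).
    """
    clusters = {}
    # Longest sequences first, so representatives tend to cover their members.
    for name in sorted(seqs, key=lambda n: -len(seqs[n])):
        for rep in clusters:
            if identity(seqs[rep], seqs[name]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            clusters[name] = [name]
    return clusters


if __name__ == "__main__":
    demo = {
        "a": "MKTAYIAKQRQISFVKSHFS",
        "b": "MKTAYIAKQRQISFVKSHFA",  # one mismatch against "a"
        "c": "GGGGGGGGGGGGGGGGGGGG",
    }
    print(greedy_cluster(demo))
```

A search then only needs to run against the representatives, and annotation can be attached once per cluster.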
Pfam is a collection of 1313 protein domain family alignments which were constructed semi-automatically using profile hidden Markov models. Pfam families contain functional annotation and cross-references to other databases.
RHdb is a database of raw data used in constructing radiation hybrid maps. This includes STS data, scores, experimental conditions, and extensive cross-references. This release contains 13 panel entries, 132 experimental conditions, 23 maps and 86,021 RH entries for 3 different species (human, rat, and mouse).
SPTR is a comprehensive protein sequence database that combines the high quality of annotation in SWISS-PROT with the completeness of the weekly updated translation of protein coding sequences from the EMBL nucleotide database.
STACK version 2.0 is an error-compensated database of alignments and clustered EST consensus sequences, generated by exhaustive comparison of all possible sequence fragments against each other.
Release 36.0 of the SWISS-PROT protein sequence database contains 74'019 sequence entries, comprising 26'840'295 amino acids abstracted from 59'911 references. This represents an increase of 7% over release 35.
TrEMBL is the protein sequence database supplementing the SWISS-PROT Protein Sequence Data Bank. TrEMBL contains the translations of all coding sequences (CDS) present in the EMBL Nucleotide Sequence Database not yet integrated in SWISS-PROT. TrEMBL 7, generated from EMBL 55, contains 193'860 sequence entries, comprising 53'601'062 amino acids.
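The release figures quoted for SWISS-PROT 36.0 and TrEMBL 7 use an apostrophe as the thousands separator. As a small worked example, the sketch below parses those figures and derives the mean entry length for each database. The helper names `parse_count` and `mean_length` are made up for this illustration, and the mean lengths are derived here, not quoted from the release notes.

```python
def parse_count(text):
    """Convert a count written with apostrophe or comma separators to an int."""
    return int(text.replace("'", "").replace(",", ""))


def mean_length(entries, residues):
    """Mean number of amino acids per sequence entry."""
    return parse_count(residues) / parse_count(entries)


if __name__ == "__main__":
    releases = {
        "SWISS-PROT 36.0": ("74'019", "26'840'295"),
        "TrEMBL 7": ("193'860", "53'601'062"),
    }
    for name, (entries, residues) in releases.items():
        print(f"{name}: {mean_length(entries, residues):.1f} amino acids per entry")
```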