BioInformer

A publication of EMBL - Outstation Hinxton, The European Bioinformatics Institute

SEALS 0.823

Introduction

SEALS (A System for Easy Analysis of Lots of Sequences) is a software package expressly designed for large-scale research projects in bioinformatics. Using a friendly, scalable command-line user interface, SEALS provides dozens of commands to help the user quickly implement standard sequence analysis protocols, design new investigations, and generally Get Things Done with dispatch. SEALS is not an automated system for genome analysis; we consider full automation neither possible nor desirable. Its goal is different: to facilitate large-scale, semi-automatic sequence analysis projects, leveraging human intelligence by simplifying laborious tasks without taking human judgement out of the loop.

Graphical interfaces are avoided, as they tend to be slow and do not scale well. Simple human-readable file formats are used exclusively, to facilitate human interaction at every stage of a process. In addition to providing user-level sequence analysis tools, SEALS aims to provide a rapid development environment, implementing a set of primitives at a level of abstraction appropriate for current research projects in genome analysis. New applications can be rapidly prototyped, and non-programmers can easily create and modify novel functions using shell scripts.

Functionality

SEALS includes such functions as

  1. Data retrieval
    Local database lookups are seamlessly integrated with remote NCBI Entrez retrievals for maximum speed. The HTTP-based Entrez retrieval supports an implementation of the Secure Sockets Layer (SSL) for secure transactions.
  2. Taxonomy analysis
    SEALS tools understand the complete NCBI taxonomy. It is possible to filter lists of sequences according to any taxonomic criterion, allowing queries such as "What is the best eukaryotic BLAST hit for my favourite sequence?", or (more fun), "What is the best eukaryotic BLAST hit for each sequence in my favourite genome?".
  3. Pattern matching
    Using Perl regular expressions in combination with Perl code snippets, SEALS tools can match and score arbitrarily complex patterns and specifications in sequences such as "tryptophan followed by a strongly hydrophobic region in the C-terminal end of a serine-rich protein".
  4. Scripting tools
    Robust scripters for popular programs such as BLAST and ClustalW provide conveniences for large-scale jobs along with a consistent user interface.
  5. BLAST parsers
    Any flavour of BLAST output, including the new PSI-BLAST, can be used with SEALS BLAST manipulation tools. Recognition of file formats is automatic and transparent.
  6. Miscellaneous tools
    Dozens of useful general widgets are provided, including file-format converters, a flat-file parser, a FASTA record sorter, DNA<->aa translation, a Netscape interface, etc.
  7. Enhancements to standard Unix command-line functionality
    • unlimited-size fileglobs
    • recursive fileglobs
    • use of URLs as input filenames
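
To make the pattern-matching item above more concrete, here is a minimal sketch of how a regular expression can be combined with a small code condition to score the example "tryptophan followed by a strongly hydrophobic region in the C-terminal end of a serine-rich protein". It is written in Python for illustration only; SEALS itself is written in Perl, and this is not SEALS code or its command syntax. The function name score_protein, the residue set VILMF, the minimum run of five residues, the 30-residue tail, and the 15% serine threshold are all assumptions made for this example.

```python
import re

# Hypothetical rule: a tryptophan (W) immediately followed by a run of at
# least five strongly hydrophobic residues. The residue set and the run
# length are assumptions chosen for illustration.
HYDROPHOBIC_AFTER_W = re.compile(r"W([VILMF]{5,})")


def score_protein(seq: str, tail: int = 30, min_serine: float = 0.15) -> int:
    """Return the length of the first hydrophobic run after W, or 0 if no match."""
    seq = seq.upper()
    if not seq:
        return 0
    # Code condition: the protein as a whole must be serine-rich.
    if seq.count("S") / len(seq) < min_serine:
        return 0
    # Regex condition: only the last `tail` residues (C-terminal end) are searched.
    match = HYDROPHOBIC_AFTER_W.search(seq[-tail:])
    return len(match.group(1)) if match else 0


if __name__ == "__main__":
    # Toy sequences invented for this example; they are not real proteins.
    proteins = {
        "hit": "MSSSKSSTSSGSSRSSDSSWVLIFVLLK",
        "not_serine_rich": "MAAAKAAGAAWVLIFVLLK",
        "motif_at_n_terminus": "WVLIFVLL" + "SSG" * 20,
    }
    for name, seq in proteins.items():
        print(name, score_protein(seq))
```

Only the first toy sequence satisfies both conditions. The second fails the serine check, and the third carries the motif outside the C-terminal window. Keeping the composition test in ordinary code and the positional motif in the regular expression mirrors the division of labour the item describes.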

Implementation & Availability

SEALS is written entirely in Perl and is under very active development. It is free software, released into the public domain.

Information by: D. Roland Walker


 

Resources and further information

  • National Center for Biotechnology Information (NCBI)
    http://www.ncbi.nlm.nih.gov/
    • SEALS homepage
      http://www.ncbi.nlm.nih.gov/Walker/SEALS/index.html
    • Download SEALS
      http://www.ncbi.nlm.nih.gov/Walker/SEALS/download.html
    • Recent citations of SEALS:
      1. Stephens RS, Kalman S, Lammel CJ, Fan J, Marathe R, Aravind L, Mitchell WP, Olinger L, Tatusov RL, Zhao Q, Koonin EV, Davis RW. Genome Sequence of an Obligate Intracellular Pathogen of Humans: Chlamydia trachomatis. Science 1998 (in press)
      2. Aravind L, Tatusov RL, Wolf YI, Walker DR, Koonin EV. Massive gene exchange between archaeal and bacterial hyperthermophiles. Trends in Genetics 1998 (in press)
      3. Galperin MY, Walker DR, Koonin EV. Analogous enzymes: independent inventions in enzyme evolution. Genome Res 1998 Aug; 8(8):779-90
      4. Cole KA, Chuaqui RF, Katz K, Pack S, Zhuang Z, Cole CE, Lyne JC, Linehan WM, Liotta LA, Emmert-Buck MR. cDNA sequencing and analysis of POV1 (PB39): a novel gene up-regulated in prostate cancer. Genomics 1998 Jul 15; 51(2):282-7
      5. Mushegian AR, Garey JR, Martin J, Liu LX. Large-scale taxonomic profiling of eukaryotic model organisms: A comparison of orthologous proteins encoded by the human, fly, nematode, and yeast genomes. Genome Res 1998 Jun; 8(6):590-8
      6. Koonin EV, Mushegian AR, Galperin MY, Walker DR. Comparison of archaeal and bacterial genomes: computer analysis of protein sequences predicts novel functions and suggests a chimeric origin for the archaea. Mol Microbiol 1997 Aug; 25(4):619-37

 

External sites are not endorsed by EMBL-EBI

 

Direct questions or comments to Bioinformer Editor. This page last modified Friday, 16 July, 1999.
ISSN 1462-1363.
More information about the BioInformer.

(c) 1997-1999 EMBL-EBI. All Rights Reserved.