Bioinformatics and computational biology related projects
This page gives an overview of the projects I have been involved in over the years. Most are related to bioinformatics and computational biology, but some, from my earlier work, fall more in the area of statistical analysis and simulation. The projects are listed in reverse chronological order, i.e. the most recent first. To start, however, I have provided a brief summary of my background and some quick links to related pages.
Executive Summary
Born in Leiden, The Netherlands, in 1967. Attended the University of Leiden and received a drs. degree (equivalent to M.Sc + 2 years research) in Biology, majoring in Bioinformatics and Ethology. Worked for the Netherlands Institute for Ecology, Centre for Estuarine and Marine Ecology (NIOO-CEMO) and the DLO-Research Institute for Agrobiology and Soil Fertility (AB-DLO, currently named Plant Research International), before joining the European Bioinformatics Institute (EMBL-EBI). Worked for ten years at the EBI in the Industry Programme and the Sequence Database Group, before moving to the University of Nebraska-Lincoln as Manager of the Bioinformatics Core Research Facility.
Quick links
- A more descriptive overview of my career to date
- Overview of publicly available presentations (post overhead transparency era)
Projects
- Alternative Splicing
European Bioinformatics Institute (2001-2005).
Put simply, alternative splicing is the process by which the splicing of a single pre-mRNA can yield different mature mRNA molecules and therefore different proteins. This happens mostly in eukaryotes, but is also observed in viruses. Three methods of creating alternate transcripts are commonly known:
- via Transcription Start Sites (promoters)
- via Splice Patterns
- via PolyA/cleavage sites
At the EBI, the Alternative Splicing Team has spent the last few years building a production-grade computational pipeline that generates a database of transcript-confirmed (EST/mRNA) alternative exons and introns, alternative splice patterns, and alternative events. Later stages of the pipeline annotate this basic data set with mappings to single nucleotide polymorphisms (SNPs), human-mouse conservation data, alternative peptides, and transcriptional expression profiles. The data can be accessed via the Alternative Splicing Database (ASD) web site, which provides an excellent query interface; flat files can be downloaded as well. Several tools used in the computational pipeline have also been made available as a Splicing Workbench.
The Alternative Splicing Database covers method 2 of creating alternate transcripts.
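As an illustration of what "transcript-confirmed alternative exons" means, the following is a minimal sketch and not the ASD pipeline itself. It assumes each transcript has already been aligned to the genome and reduced to a list of exon coordinates; the function names, the coordinate convention, and the example transcripts are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumption: an exon is an inclusive (start, end) pair of genomic coordinates.
Exon = Tuple[int, int]


def introns(exons: List[Exon]) -> List[Tuple[int, int]]:
    """Return the introns implied by consecutive exons of one transcript."""
    ordered = sorted(exons)
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


def classify_exons(transcripts: Dict[str, List[Exon]]) -> Tuple[Set[Exon], Set[Exon]]:
    """Split exons into constitutive (present in every transcript) and alternative."""
    exon_sets = [set(exons) for exons in transcripts.values()]
    all_exons = set().union(*exon_sets)
    constitutive = set.intersection(*exon_sets)
    return constitutive, all_exons - constitutive


def skipped_exons(transcripts: Dict[str, List[Exon]]) -> Set[Exon]:
    """Find exons that lie entirely inside an intron of another transcript."""
    skipped: Set[Exon] = set()
    for name, exons in transcripts.items():
        for other_name, other_exons in transcripts.items():
            if other_name == name:
                continue
            for intron_start, intron_end in introns(other_exons):
                for exon_start, exon_end in exons:
                    if intron_start <= exon_start and exon_end <= intron_end:
                        skipped.add((exon_start, exon_end))
    return skipped


# Hypothetical example: transcript t2 skips the middle exon of t1.
example = {
    "t1": [(100, 200), (300, 400), (500, 600)],
    "t2": [(100, 200), (500, 600)],
}
constitutive, alternative = classify_exons(example)
print("constitutive:", sorted(constitutive))
print("alternative:", sorted(alternative))
print("skipped:", sorted(skipped_exons(example)))
```

A real pipeline would also have to deal with alignment noise, partial ESTs, and alternative 5' and 3' splice sites, none of which this sketch attempts.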
Recently, the Alternate Transcript Diversity project was started to build upon previous work and to extend the computational pipeline to include alternative polyadenylation/cleavage sites as well as alternative transcript start sites. A pre-release of this database (with alternative splice patterns and polyA/cleavage sites) is imminent.
- XEMBL
European Bioinformatics Institute (1998-2002).
The XEMBL project brought CORBA and XML together, using the EMBL nucleotide sequence database and various DTDs to export sequence data in an easily parsable format.
This project originally started as a demonstration of CORBA: to show that CORBA is useful in bioinformatics and, by extension, the life sciences, especially in pharmaceutical or large data centre environments. Besides providing a usable demonstration of CORBA, we wanted to circumvent the cumbersome flat files that plague the bioinformatics arena. What if access to genomic data could be provided without having to wade through vast amounts of flat files? What if access to data could be as coarse- or fine-grained as the user demanded? With CORBA as middleware between the XEMBL application and the databases, this was easily achieved. (Note: I could just as well have written XEMBL to connect directly to the database, but remember that it originated as a demo application for CORBA, hence through CORBA we go.) Output from XEMBL came in two formats, both XML: BSML and AGAVE. BSML was the de facto industry standard for biological sequence data. Besides the usual setup as a web form with an underlying CGI script, XEMBL was also written as a SOAP/WSDL Web Service, the first publicly available bioinformatics-related Web Service in the world. XEMBL was also included as a drop-in service in Labbook's Genomic Suites.
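The idea of coarse- versus fine-grained access can be sketched as follows. This is only an illustration under stated assumptions: the element names, the `detail` parameter, and the example record are hypothetical, and the output is neither BSML nor AGAVE. It uses Python's standard `xml.etree.ElementTree` module rather than CORBA.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record: dict, detail: str = "coarse") -> str:
    """Serialise a sequence record to XML at the requested level of detail.

    Assumption: 'coarse' returns only the accession and description,
    while 'fine' adds the sequence itself and any annotated features.
    """
    root = ET.Element("entry", accession=record["accession"])
    ET.SubElement(root, "description").text = record["description"]
    if detail == "fine":
        seq = ET.SubElement(root, "sequence", length=str(len(record["sequence"])))
        seq.text = record["sequence"]
        for feature in record.get("features", []):
            ET.SubElement(
                root,
                "feature",
                type=feature["type"],
                start=str(feature["start"]),
                end=str(feature["end"]),
            )
    return ET.tostring(root, encoding="unicode")


# Made-up record for illustration; not a real EMBL entry.
example_record = {
    "accession": "EXAMPLE1",
    "description": "Hypothetical test sequence",
    "sequence": "atgcatgcatgc",
    "features": [{"type": "CDS", "start": 1, "end": 12}],
}

coarse_xml = record_to_xml(example_record)
fine_xml = record_to_xml(example_record, detail="fine")
print(coarse_xml)
print(fine_xml)
```

The point of the design is that the caller, not the server, decides how much of the record travels over the wire, which is what a flat-file download cannot offer.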
Unfortunately, the current situation at the European Bioinformatics Institute is such that no manpower is available to maintain the CORBA servers, hence XEMBL has been retired. XML and Web Service access to genomic data will instead be provided via a new DTD developed at the EBI and via supporting services, respectively.
- BioWurld
European Bioinformatics Institute (1996-2001).
Developer and administrator of BioWurld, a categorised and searchable database of web sites relevant to bioinformatics, computational biology, and molecular biology. Unfortunately, this site has also been retired, for the same reason as XEMBL: lack of manpower.
- BioInformer
European Bioinformatics Institute (1996-2001).
Editor of EBI’s quarterly newsletter the ‘BioInformer’ (ISSN 1462-1363), which is available online and was also available as a full-colour printed version. An archive of the ‘BioInformer’ is available on my site. The newsletter reported on research in the bioinformatics and computational biology domain, announced and reviewed workshops and conferences organised by the European Bioinformatics Institute, and contained various small news items.
- Information Dissemination
European Bioinformatics Institute (1996-2001).
During this period I was primarily involved in the Industry Programme, which aimed to promote bioinformatics approaches and techniques in the pharmaceutical and biotech industries, allowing them to adapt quickly to developments in bioinformatics and to maximise the benefits from them. I was also administrator and content manager/webmaster for the Industry Programme and EBI's primary web server.
- FSU
Plant Research International (1992-1996).
Generic crop modelling environment.
- MANAGE-N
Plant Research International (1992-1996).
Optimisation of fertiliser application in rice to increase yield.
- PROEST
NIOO-CEMO (1991-1992).
Analysis of estuarine ecosystem models.
- STAR*PC
University of Leiden (1990-1991).
See the Structure Analysis of RNA web site.
