The UK Crop Plant Bioinformatics Network

BrassicaDB Process

EMBL Release 79

(the last huzzah?)



The following notes are specific to EMBL release 79, and supplement the general details of the process described in BrassicaDB Nucleotide Sequence Process.


Thursday 10th June 2004

Extracted Brassica accessions from the EMBL flat files
                        Brassica          EMBL
EST                       43,746    21,434,594
HTG                            4        68,255
GSS                      596,930     8,471,953
Other genomic seqs         2,680     4,476,642
Total                    643,360    34,451,444
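
The extraction step can be sketched as follows. This is a minimal illustration, not the script used for this release: it assumes that an entry is selected when one of its organism (OS) lines names a species in the genus Brassica, which these notes do not confirm. The function names iter_embl_entries, is_brassica and extract_brassica are hypothetical.

```python
from typing import Iterable, Iterator, List


def iter_embl_entries(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each EMBL flat-file entry as a list of lines.

    An entry ends at a line consisting of '//'. A trailing entry
    without a terminator is silently dropped.
    """
    entry: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        entry.append(line)
        if line.strip() == "//":
            yield entry
            entry = []


def is_brassica(entry: List[str]) -> bool:
    """Assumed criterion: an OS line whose first word is 'Brassica'."""
    for line in entry:
        if line.startswith("OS"):
            words = line[2:].split()
            if words[:1] == ["Brassica"]:
                return True
    return False


def extract_brassica(lines: Iterable[str]) -> List[List[str]]:
    """Return all entries that satisfy the assumed Brassica criterion."""
    return [e for e in iter_embl_entries(lines) if is_brassica(e)]
```

Because the functions accept any iterable of lines, an open file handle on a flat file can be passed directly, so a large division file need not be read into memory in one go.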


Friday 11th June 2004

Loading the generated ace files into an empty database yields ...

Class       # objects   % change (cf r78)
DNA           643,359   +0.02%
Paper           1,061   +3.8%
Protein             0
Sequence      646,071   +0.04%

Parsed UniProt Release 1.11, released 7/6/2004 (results). Built the SPTrEMBL and EMBL BLAST databases for the BrassicaDB compute.


Monday 14th June 2004

Building the database with EMBL r79, post-79 updates to today (14/06/04), the BBSRC SSR data, UniProt 1.11 and the BrassicaDB legacy dataset gives ...

Class          # objects
Author            10,002
DNA              643,765
Gene_Product      11,113
Paper              7,261
Journal            1,207
Peptide            1,876
Protein            1,876
Sequence         646,540
Species               28


Wednesday 16th June 2004

Started the BrassicaDB BLAST compute at 11:35. This process is being hindered by a large increase in the number of EST query sequences and by hardware/software problems with our Linux cluster. For the present, we are limiting the BLASTN analysis of the SSR flanking sequences to BrassicaDB rather than EMBL.


Monday 21st June 2004

Completed the BLAST analysis at 4.20 pm.


Tuesday 22nd June 2004

Processed the BLAST output to ace files and applied them to the database. The Brassica GSS sequences had already been mapped to the TIGR v5 pseudomolecules as part of a separate development exercise. This output is in the form of a GFF file to drive GBrowse/MySQL. We decided to recode the parser for this file so that it looks up annotation in the TIGR v5 XML files, which serve as the reference source.
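
As an illustration of the GFF handling involved, here is a minimal sketch, not the parser written for this release. It assumes the standard nine tab-separated GFF columns and leaves the attributes column as a raw string, since neither the GFF dialect used nor the TIGR XML lookup is described in these notes. GffFeature and parse_gff_line are hypothetical names.

```python
from typing import NamedTuple, Optional


class GffFeature(NamedTuple):
    """One GFF record; attributes are kept unparsed (assumption)."""
    seqid: str
    source: str
    feature: str
    start: int
    end: int
    score: Optional[float]
    strand: str
    frame: str
    attributes: str


def parse_gff_line(line: str) -> Optional[GffFeature]:
    """Parse one GFF line; return None for blank lines and comments."""
    line = line.rstrip("\n")
    if not line.strip() or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, feature, start, end, score, strand, frame, attrs = cols
    return GffFeature(
        seqid, source, feature,
        int(start), int(end),
        None if score == "." else float(score),  # '.' means no score
        strand, frame, attrs,
    )
```

Keeping the attributes unparsed means the sketch works for any dialect that uses nine columns; a real parser would split that column according to the conventions of the file in question before looking up annotation.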


Thursday 24th June 2004

Forged intra-database protein object links and the links to the (frozen) Mendel-GFDb and Mendel-ESTS databases.


Wednesday 30th June 2004

Completed the recoding of the GSS GFF datafile parser. Applied the Mendel db links and the GSS acefile to the database. Tested out this development version and then committed. Started copying the binary DB files to jicbio and UK CropNet servers.


Thursday 1st July 2004

The rebuilt database is now available from both servers.