In November 2008 we took the first steps to significantly upgrade the annotation that we apply to completed BAC sequences from the UK/China collaboration in the Multinational Brassica Genome Sequencing Project (BrGSP). All future UK/China BACs and all sequences deposited as part of the BrGSP will be annotated using the new pipeline.
Our next step will be to add BLASTX searches against all of UniProt, so that gene predictions whose proteins are not present in the annotated Arabidopsis proteome also receive annotation.
GBrowse track listings:
The Genes track in the Overview panel and the PASA gene models track in the Details panel now use SNAP ab initio gene predictions trained on Brassica and corrected with PASA using ~800k raw Brassica ESTs. These represent our best models. The PASA track is mouseover-enabled, with popups giving the best Arabidopsis BLASTX hit, links to GO terms for that gene, and a link to a virtual protein translation and alignment with ClustalW/Jalview.
The following gene prediction models are still available on demand by selecting the appropriate tracks:
SNAP
Augustus
FGENESH
GlimmerHMM
GeneZilla
Genscan
Arabidopsis gene models - BLAT is used to attempt an accurate alignment between the Arabidopsis CDS corresponding to the best BLASTX hit for the PASA gene model and the Brassica genomic DNA.
Transcript assembly alignments - in a first pass the sequence was searched by BLASTN against the 95k unigene set developed in collaboration with JCVI. This comprises an oriented and annotated set of 42,642 assemblies and 51,916 singletons. The BLAST hits recovered were then re-aligned to the genomic sequence using BLAT. Clicking on a Transcript assembly feature allows the user to launch a real-time ClustalW alignment and to inspect it graphically with the Jalview applet.
KBr BACends - BLASTN hits with the complete set of ~200k KBr BES. Results are marked up to give information on overlaps and to suggest candidate extension clones or bridging clones.
B. napus BACends - BLASTN hits with ~90,000 BES from the B. napus cv Tapidor (JBnB) library.
SSRs - the sequence was searched for SSRs with the msatfinder program, and Primer3 was used to generate PCR primers for candidate amplicons. These primers can be searched directly for cross-hybridization against the entire reference sequence database using GBrowse's OligoFinder plugin.
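As a rough illustration of what the SSR search for the SSRs track involves, the sketch below finds perfect tandem repeats with a regular expression. It is not msatfinder's algorithm; the function name `find_ssrs` and the thresholds (repeat units of 2 to 6 bases, at least 5 copies) are assumptions chosen for the example, not the settings used in this pipeline.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, end, unit, copies) for perfect tandem repeats.

    Hypothetical helper for illustration only: coordinates are 0-based,
    end-exclusive, and imperfect or interrupted repeats are not detected.
    """
    seq = seq.upper()
    # Group 1 is a candidate repeat unit (shortest tried first); the
    # back-reference then requires at least min_repeats - 1 further copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), match.end(), unit, copies))
    return hits


if __name__ == "__main__":
    # Toy sequence: an (AG)8 repeat flanked by non-repetitive bases.
    example = "TTGACC" + "AG" * 8 + "CATGCA"
    for start, end, unit, copies in find_ssrs(example):
        print(f"{start}-{end}\t({unit}){copies}")
```

In a real workflow the flanking sequence around each reported repeat would then be passed to a primer design tool such as Primer3; that step is omitted here.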
A walkthrough of using this website is available here
For help, comments and bug reports please contact Martin Trick