Developed by James Estill, Dept. of Plant Biology, University of Georgia

Pipeline Annotate Wheat Sequences

PERL

TriAnnot

France

IOB Cluster: UGA

GAME XML

Annotation Pipeline

BLAST -m 8 -d MIPS
BLAST -m 8 -d RB_pln
BLAST -m 8 -d TIGRGram
BLAST -m 8 -d TREP9nr

>HEX0014K09
GCAATACT
CGGCACTT

TE Annotation

De Novo: Findmite, LTR_Struc, LTR_Seq, Find_LTR, LTR_Finder
Homology: HMMER, RepeatMasker, TE Nest, BLAST

Gene Annotation

De Novo: GENSCAN, GENID, FGENESH
Homology: BLAST, BLAT, SIM4

Individual Program Procedure

Directory of FASTA Files

Configuration File

Run Program

Raw Results

GFF Formatted
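
As an illustration of the last step of this procedure (raw results to GFF), here is a minimal Python sketch that converts BLAST `-m 8` tabular output into GFF-style lines. It is not the DAWG-PAWS converter itself. The function name, the `source` and `feature` labels, and the choice to store the bit score in the score column and the subject name in the last column are all assumptions made for the example. The only thing taken as given is the standard 12-column `-m 8` layout (query, subject, % identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score).

```python
def blast_m8_to_gff(m8_text, source="blastn", feature="match"):
    """Convert BLAST -m 8 tabular text into a list of 9-column GFF lines.

    Hypothetical helper for illustration; labels and column choices
    are assumptions, not the package's actual conventions.
    """
    gff_lines = []
    for line in m8_text.splitlines():
        # Skip blank lines and comment lines (as written by -m 9).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 12:
            raise ValueError("expected 12 tab-separated columns: " + line)
        query, subject = fields[0], fields[1]
        q_start, q_end = int(fields[6]), int(fields[7])
        s_start, s_end = int(fields[8]), int(fields[9])
        bit_score = fields[11].strip()
        # A hit is on the minus strand when query and subject
        # coordinates run in opposite directions.
        same_direction = (q_start <= q_end) == (s_start <= s_end)
        strand = "+" if same_direction else "-"
        # GFF requires start <= end on the annotated (query) sequence.
        start, end = sorted((q_start, q_end))
        gff_lines.append("\t".join([
            query, source, feature, str(start), str(end),
            bit_score, strand, ".", subject,
        ]))
    return gff_lines
```

Each input row yields one feature on the query sequence, so the result can be written to a .gff file one line at a time.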

!! THIS DOCUMENT IS UNDER CURRENT DEVELOPMENT!!

This program manual and the scripts that make up the DAWG-PAWS package are under active development, and everything is subject to change without notice. The software is provided as is, without any express or implied warranty. Use at your own risk.

File requirements:
1. Each FASTA file contains a single record
2. BAC scaffolds need to be merged into a single sequence
3. Short header
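
The first and third requirements can be checked automatically before a run. The following Python sketch is one way to do that. The function name and the 20-character header limit are assumptions for illustration, since the slides do not say how short a header must be. The sketch does not check whether scaffolds have been merged.

```python
def check_fasta(text, max_header_len=20):
    """Return a list of problems found in the text of one FASTA file.

    max_header_len is a hypothetical threshold; the slides only ask
    for a "short header" without giving a number.
    """
    problems = []
    # Every record starts with a ">" header line.
    headers = [
        line[1:].strip()
        for line in text.splitlines()
        if line.startswith(">")
    ]
    if len(headers) != 1:
        problems.append("expected 1 record, found %d" % len(headers))
    for header in headers:
        if len(header) > max_header_len:
            problems.append("header too long: " + header)
    return problems
```

An empty list means the file passes both checks, for example `check_fasta(">HEX0014K09\nGCAATACT\nCGGCACTT\n")`.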

Repeat masking with RepeatMasker and TREP
1. Softmask (using RepeatMasker)
2. Convert softmask to hardmask because many gene prediction programs are not softmask-aware
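
The softmask-to-hardmask conversion amounts to replacing every lowercase (softmasked) base with a masking character. A minimal Python sketch follows, assuming the softmasked sequence marks repeats in lowercase and that `N` is the desired hardmask character. The function name is hypothetical and this is not the package's own conversion script.

```python
def hardmask(fasta_text, mask_char="N"):
    """Replace lowercase (softmasked) bases with mask_char.

    Header lines (starting with ">") are passed through unchanged.
    """
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            out.append(line)
        else:
            # Lowercase letters mark softmasked repeats.
            masked = "".join(
                mask_char if base.islower() else base for base in line
            )
            out.append(masked)
    return "\n".join(out) + "\n"
```

For example, the sequence line `ACgtACnn` becomes `ACNNACNN`.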

Structural feature annotation: Currently includes only the annotation of gaps
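
Gap annotation can be sketched as a search for runs of `N` in the sequence, reported as GFF features. The Python example below assumes gaps are represented as runs of `N` or `n` and that it is run on the unmasked sequence, since hardmasked repeats would otherwise also be reported as gaps. The function name, the `source` label, and the `min_len` parameter are hypothetical.

```python
import re

def gaps_to_gff(seq_id, sequence, min_len=1, source="gap_sketch"):
    """Return GFF lines for each run of N/n at least min_len long."""
    lines = []
    for match in re.finditer(r"[Nn]+", sequence):
        if match.end() - match.start() < min_len:
            continue
        start = match.start() + 1  # GFF coordinates are 1-based
        end = match.end()          # and the end is inclusive
        lines.append("\t".join([
            seq_id, source, "gap", str(start), str(end),
            ".", ".", ".", ".",
        ]))
    return lines
```

Raising `min_len` skips isolated ambiguous bases so that only longer gaps are reported.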

Gene annotation:
1. Conduct gene prediction using TriAnnot pipeline
2. Run individual gene prediction programs

GenMarkHMM: can be run locally (free license required)
GENSCAN: Run on web server & convert output to .gff file
FGeneSH: Run on web server & convert output to .gff file

NCBI-Blast: Most time-consuming step in the pipeline

Transposable element annotation:
1. By homology: RepeatMasker, NCBI-Blast
2. By structural criteria: LTR-finder

De Novo LTR Annotation Software

Comparison criteria: Pub Year, Source Availability, Operating System, Speed, Parameter Control, License, TSD, LTR Dinucleotides, PBS, GAG, IN, RT, RH, PPT

Programs compared (with publication year):
LTR_Struc (2003)
LTR_Seq (2006)
find_ltr (2007)
LTR_Finder (2007)

Computation Annotation

Rating legend: Best, Good, Neutral, Bad, Crap

Preparing the computational results for Apollo
1. Audit the computational results
2. Concatenate the .gff files
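
The two steps above can be sketched in Python as follows. The slides do not say what the audit consists of, so the check used here (nine tab-separated columns with integer coordinates where 1 <= start <= end) is only an assumed stand-in, and both function names are hypothetical.

```python
def audit_line(line):
    """Return True if a GFF line passes a basic sanity check.

    Assumed criteria: 9 tab-separated columns, integer start and end,
    and 1 <= start <= end.
    """
    fields = line.split("\t")
    if len(fields) != 9:
        return False
    try:
        start, end = int(fields[3]), int(fields[4])
    except ValueError:
        return False
    return 1 <= start <= end

def concatenate_gff(gff_texts):
    """Merge several GFF texts, separating lines that fail the audit.

    Returns (merged_lines, rejected_lines).
    """
    merged, rejected = [], []
    for text in gff_texts:
        for line in text.splitlines():
            # Skip blank lines and comment/header lines.
            if not line.strip() or line.startswith("#"):
                continue
            (merged if audit_line(line) else rejected).append(line)
    return merged, rejected
```

Writing `"\n".join(merged)` to a single file gives one combined .gff, while the rejected lines can be inspected by hand before loading into Apollo.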

Recommended