lex-nederbragt
A talk I gave at the Dec 2013 Assembly Masterclass at UC Davis. Really licensed under CC0. UPDATED May 2014, for the presentation I gave at the combined SeRC Nordic Assembly Workshop in Stockholm, Sweden, May 14th 2014
A warning
The list is by no means complete
Nor do we have experience with all the programs mentioned
[Figure: Sample → DNA isolation → DNA → Sequencing → Reads → Assembly → Genome assembly, with QC at three points along the way]
Reads → Assembly → Genome assembly
QC
FastQC
PRINSEQ
Many others…
NGS QC Toolkit: www.nipgr.res.in/ngsqctoolkit.html
preqc (SGA): http://arxiv.org/abs/1307.8026
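One of the things read QC tools typically report is the mean base quality at each read position. A minimal sketch of that calculation, assuming 4-line FASTQ records and Phred+33 encoded qualities (the function name and the example reads are made up for illustration, and this is not the code of any of the tools listed):

```python
from collections import defaultdict

def mean_quality_per_position(fastq_lines, offset=33):
    """Return a list with the mean Phred quality at each read position."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for i, line in enumerate(fastq_lines):
        # The quality string is the 4th line of every 4-line FASTQ record
        if i % 4 == 3:
            for pos, char in enumerate(line.rstrip("\n")):
                totals[pos] += ord(char) - offset
                counts[pos] += 1
    return [totals[pos] / counts[pos] for pos in sorted(counts)]

# Two made-up reads; the second one is shorter and ends in a low quality base
example = [
    "@read1", "ACGT", "+", "IIII",
    "@read2", "ACG", "+", "II5",
]
print(mean_quality_per_position(example))  # [40.0, 40.0, 30.0, 40.0]
```

A drop in mean quality towards the 3' end is the usual pattern to look for before deciding on trimming.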
Grooming
Format conversion
http://en.wikipedia.org/wiki/FASTQ_format
Fastq format hell
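Much of the 'format hell' comes from the different quality encodings (ASCII offset 33 or 64, see the Wikipedia page above). A minimal sketch of guessing the offset from a sample of quality strings; the thresholds are a heuristic, the function name is made up, and the sketch assumes a non-empty sample:

```python
def guess_phred_offset(quality_strings):
    """Guess the quality offset (33 or 64) from a sample of quality strings.

    Returns None when the sample does not settle the question.
    """
    codes = [ord(char) for qual in quality_strings for char in qual]
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        # Characters below ';' (ASCII 59) do not occur in the offset-64 variants
        return 33
    if highest > 74:
        # Characters above 'J' (ASCII 74) are not expected in Illumina Phred+33 data
        return 64
    return None

print(guess_phred_offset(["IIII#"]))  # 33
print(guess_phred_offset(["hhhh"]))   # 64
```

With only a handful of high quality reads the answer can be ambiguous, so it is safer to sample many reads.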
Adapter/quality trimming
http://www.biostars.org/p/53528/
Overlap based trimming: Celera assembler
Fastx Toolkit
Seqtk
PRINSEQ
NGS QC Toolkit
Trimmomatic
BioPieces
Cutadapt
…
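The simplest form of quality trimming removes bases from the 3' end until a base of sufficient quality is found. A minimal sketch, assuming Phred+33 qualities; the tools listed above use more elaborate schemes (sliding windows, adapter matching), and this is not the algorithm of any one of them:

```python
def trim_3prime(seq, qual, min_quality=20, offset=33):
    """Remove bases from the 3' end until one reaches min_quality."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_quality:
        end -= 1
    return seq[:end], qual[:end]

print(trim_3prime("ACGTACGT", "IIIII###"))  # ('ACGTA', 'IIIII')
```

A read whose bases are all below the threshold comes back empty and would normally be discarded, together with its mate if the assembler requires intact pairs.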
Mate pair splitting and orientation
Illumina paired end reads: 150 – 600 bases
Illumina mate pair reads: 2 – 40 kilobases
454 mate pair reads: 2 – 40 kilobases
[Figure: read layouts for the three library types; labels: linker, junction, + paired end reads ‘contamination’]
Check what orientation your assembler expects for the reads!
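When an assembler expects inward-facing pairs but the library gives outward-facing reads (or the other way around), the usual fix is to reverse-complement both reads of each pair. A minimal sketch, assuming each read is a (sequence, quality) tuple; the function names are made up for illustration:

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def flip_pair(read1, read2):
    """Reverse-complement both reads of a pair; qualities are reversed to match."""
    (seq1, qual1), (seq2, qual2) = read1, read2
    return (
        (reverse_complement(seq1), qual1[::-1]),
        (reverse_complement(seq2), qual2[::-1]),
    )

print(flip_pair(("AACG", "IIH#"), ("TTTA", "ABCD")))
# (('CGTT', '#HII'), ('TAAA', 'DCBA'))
```

Whether this step is needed at all depends on the assembler, so check its documentation first.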
Preparing
Error-correction
Stand-alone or built into assembler
Merging pairs
List from Torsten Seemann’s blog: http://thegenomefactory.blogspot.no/2012/11/tools-to-merge-overlapping-paired-end.html
COPE http://sourceforge.net/projects/coperead/
SeqPrep https://github.com/jstjohn/SeqPrep
FLASH http://www.cbcb.umd.edu/software/flash
fastq-join http://code.google.com/p/ea-utils/wiki/FastqJoin
PANDAseq https://github.com/neufeld/pandaseq
mergePairs.py http://code.google.com/p/standardized-velvet-assembly-report/source/browse/trunk/mergePairs.py
Recent addition
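The idea behind merging: when the fragment is shorter than the two reads combined, the end of read 1 overlaps the reverse complement of read 2, and the pair can be joined into one longer read. A minimal sketch of that idea, using exact matching by default and ignoring qualities; the tools listed above also weigh base qualities and handle adapter read-through, which this sketch does not. Names and defaults are assumptions, and `min_overlap` must be at least 1:

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def merge_pair(read1, read2, min_overlap=10, max_mismatches=0):
    """Merge a pair into one sequence, or return None if no overlap is found.

    read2 is given as sequenced, i.e. from the opposite strand.
    """
    mate = reverse_complement(read2)
    # Try the longest overlap first: suffix of read1 against prefix of mate
    for overlap in range(min(len(read1), len(mate)), min_overlap - 1, -1):
        suffix, prefix = read1[-overlap:], mate[:overlap]
        mismatches = sum(a != b for a, b in zip(suffix, prefix))
        if mismatches <= max_mismatches:
            return read1 + mate[overlap:]
    return None

# A made-up 24-base fragment sequenced with two 16-base reads
fragment = "ACGTTGCAAGGCTTAACCGGTATC"
read1 = fragment[:16]
read2 = reverse_complement(fragment[8:])
print(merge_pair(read1, read2, min_overlap=5) == fragment)  # True
```

Taking the longest acceptable overlap first is a design choice of this sketch; with repetitive sequence a scoring scheme is more robust.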
Extend reads
http://140.116.235.124/~tliu/arf-pe/
Digital normalisation
http://arxiv.org/abs/1203.4802
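The core of digital normalisation is a single pass over the reads: a read is kept only if the median abundance of its k-mers, counted among the reads kept so far, is below a coverage cutoff. A minimal sketch of that idea; it swaps the memory-efficient probabilistic counting used in practice for an exact Python `Counter`, does not treat a k-mer and its reverse complement as the same, and its names and default values are illustrative assumptions:

```python
from collections import Counter
from statistics import median

def digital_normalisation(reads, k=20, cutoff=20):
    """Return the subset of reads kept after one normalisation pass."""
    kmer_counts = Counter()
    kept = []
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        if not kmers:
            continue  # read shorter than k
        # Median abundance of this read's k-mers among the reads kept so far
        if median(kmer_counts[kmer] for kmer in kmers) < cutoff:
            kept.append(read)
            kmer_counts.update(kmers)
    return kept

reads = ["ACGTACGTAC"] * 5 + ["TTTTGGGGCC"]
print(digital_normalisation(reads, k=4, cutoff=2))
# ['ACGTACGTAC', 'TTTTGGGGCC']
```

In the example, the repeated read is kept once and the novel read is kept, which is the intended effect: redundant coverage is dropped while low-coverage regions are retained.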
Estimate k-mer size to use
preqc (SGA)
http://arxiv.org/abs/1307.8026
What can the reads tell us about the genome?
k-mer based
preqc (SGA): http://arxiv.org/abs/1307.8026
Kmerspectrumanalyzer
khmer from Titus Brown
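These k-mer based approaches start from the k-mer spectrum: how many distinct k-mers occur once, twice, and so on in the reads. Low-abundance k-mers are mostly sequencing errors, and the position of the main peak reflects sequencing depth. A minimal sketch of building the spectrum with exact counting, which only scales to small data sets (the function name and the value of `k` are assumptions):

```python
from collections import Counter

def kmer_spectrum(reads, k=21):
    """Map each abundance to the number of distinct k-mers seen that many times."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return Counter(counts.values())

spectrum = kmer_spectrum(["ACGTA", "ACGTT"], k=4)
print(sorted(spectrum.items()))  # [(1, 2), (2, 1)]
```

For real data sets a dedicated k-mer counter is needed; the spectrum it produces is interpreted the same way.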
This talk
Reads → Assembly → Genome assembly
QC
Genome assembly
Comparing to each other
Metrics
Merging
Improvement
Visualization
Validation
Comparing to reference
Assemblathon stats
http://korflab.ucdavis.edu/datasets/Assemblathon/Assemblathon2/Basic_metrics/assemblathon_stats.pl
OR
https://github.com/lexnederbragt/sequencetools/
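The best known of the basic metrics such scripts report is the N50. A minimal sketch of the usual definition, given a list of contig or scaffold lengths; this illustrates the metric and is not taken from either script:

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly

print(n50([80, 70, 50, 40, 30, 20]))  # 70
```

Because the N50 depends on the total assembly size, comparisons between assemblies of very different sizes can be misleading.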
Gap closing
IMAGE2
Correcting bases
Quiver from Pacific Biosciences
Separate scaffolding
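Gap closing works on the runs of N that scaffolding leaves between contigs. As a minimal sketch of the first step, locating those gaps in a scaffold sequence (the function name is made up, and this is not code from any of the tools mentioned):

```python
import re

def find_gaps(sequence, min_length=1):
    """Return (start, end) positions of runs of N, 0-based with end exclusive."""
    return [
        (match.start(), match.end())
        for match in re.finditer(r"[Nn]+", sequence)
        if match.end() - match.start() >= min_length
    ]

print(find_gaps("ACGTNNNNACGTNACGT"))  # [(4, 8), (12, 13)]
```

Counting gaps and summing their lengths before and after gap closing gives a quick measure of how much was gained.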
Assembly merging/reconciliation
Mapped genomic reads
FRCbam
Mapped transcriptomic reads
Gene finding
Binning
Nederbragt et al, 2010
Genome browser(s)
IGV
Comparative measures
Log Average Probability (LAP)
Assembly Likelihood Evaluation (ALE)
See also Howison, Zapata and Dunn (2013) Toward a statistically explicit understanding of de novo sequence assembly. doi: 10.1093/bioinformatics/btt525
Reference comparison
Mauve assembly metrics
Review
Too many tools…
http://seqanswers.com/wiki/Software/list
Too many tools…
http://wwwdev.ebi.ac.uk/fg/hts_mappers
88 short-read mappers
Embargo!
Benchmarking, anyone?
All-in-one assembly pipeline
doi:10.1186/1471-2105-15-126