heller
Introduction to Bioinformatics
Tuesday, 29 January
Can we push the due date to Friday?
X
Comments from Questionnaire
I need a textbook to teach me the basics.
Comments from Questionnaire
I don't understand the purpose of mates.
I'm not really sure how to go about figuring out how much 1X of the
genome would be. How much what? Is this referring to nucleotides or contigs?
What is the purpose of a DNA library?
What do you think is the better option for everyday sequencing
use in the real world: dideoxy or shotgun genome?
Shotgun Sequence of a Genome
Drosophila genome (~100 million nt)
Shotgun Sequence of a Book
Marley was dead, to begin with. There is no doubt whatever about that. The register of his burial was signed by the clergyman, the clerk, the undertaker, and the chief mourner. Scrooge signed it. And Scrooge's name was good upon 'Change for anything he chose to put his hand to.
Old Marley was as dead as a doornail.
Shotgun Sequence of a Book
Marley was
begin wit
dead, to be
ad, to begi
There is no doub
Marley was dead
Shotgun Sequence of a Book
…you would have to oversample a sequence for the shotgun
approach to work?
Sequencing process
Drosophila genome (~100 million nt)
Suppose broken into 500 nt fragments
SAMPLE
. . .
Marley was
begin wit
dead, to be
ad, to begi
There is no doub
Marley was dead
Marley was dead, to begin with. There is no doub
Shotgun Sequence of a Book
Marley was
begin wit
dead, to be
ad, to begi
There is no doub
Marley was dead
Marley was dead, to begin with.
Shotgun Sequence of a Book
There is no doub
Contig #47
Contig #29
How to connect contigs?
How to get the snippets?
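The book example can be tried out in code. The following is a minimal sketch of greedy overlap merging, one simple assembly strategy and not necessarily the method used by any real assembler. The function names and the minimum overlap of 3 letters are assumptions made for illustration. Run on the six snippets shown on the slides, it yields two contigs with no overlap between them.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    # Drop exact duplicates, then fragments fully contained in another fragment.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return frags  # nothing left to merge: these are the contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for n, f in enumerate(frags) if n not in (best_i, best_j)]
        frags.append(merged)


snippets = ["Marley was", "begin wit", "dead, to be",
            "ad, to begi", "There is no doub", "Marley was dead"]
for contig in greedy_assemble(snippets):
    print(contig)
```

The two contigs stay separate because no snippet spans the junction between them, which is the gap the later slides address.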
Dideoxy sequencing
CGACCATCGCCTTAGTAC
DNA replication
Dideoxy sequencing
What is the sequence (5' to 3') represented by the gel? (Myers et al SQ2)
[Figure: sequencing gel with lanes G, A, T, C; terminated fragments labeled ddC]
TCGTGTACATCGTAACACGGTTAAGT
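Reading a dideoxy gel amounts to sorting the bands by fragment length: the shortest fragment travels farthest, and the lane it sits in names the base at that position of the newly made strand. The sketch below assumes the gel has already been turned into a table of band lengths per lane. The function name and the band lengths are hypothetical, chosen only so that the result matches the first nine letters of the sequence shown above.

```python
def read_gel(lanes):
    """Read the synthesized strand 5' to 3' from band lengths in each lane.

    lanes maps a base (the ddNTP used in that lane) to the lengths of the
    fragments seen there. Shortest fragment first gives the 5' end.
    """
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


# Hypothetical band lengths (in nucleotides past the primer), for illustration only.
lanes = {
    "G": [3, 5],
    "A": [7, 9],
    "T": [1, 4, 6],
    "C": [2, 8],
}
print(read_gel(lanes))
```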
Dideoxy sequencing
Study Question 4
What is high-quality sequence?
G A T C
To determine high quality sequences,… How do you know when a peak stops
being high enough?
Could you explain in more detail the fluorescence chart
with the waves?
Dideoxy sequencing
How sure are you?
For SQ3, I have been unable to identify the organism/molecule by using
BioBIKE. I have tried the function SEQUENCES-SIMILAR-TO.
DNA replication
Primer
How to provide a primer to an
unknown sequence?
G A T C
primer
plasmid
insert
~2000 nt mates
Myers et al SQ6
Why read pairs? Scaffolds?
What is the purpose of a DNA library?
primer
plasmid
insert
~40 letters
Mate pairs
How to connect contigs
Why read pairs? Scaffolds?
dead, to begin with. There is no doub
. . .
~ 150,000 nt
Bacterial Artificial Chromosome (BAC)
mates
Myers et al SQ6
Why read pairs? Scaffolds?
What is a BAC used for again?
I don't understand the purpose of mates.
Marley was dead. God bless us every one.
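One way to see the purpose of mates: two reads taken from the two ends of the same insert must lie a known distance apart, so if they land in different contigs, those contigs must be neighbors. The sketch below is an illustration under stated assumptions. The contig names, the contig texts, and the mate pair are hypothetical, and a real scaffolder would also use read orientation and insert size.

```python
def link_contigs(contigs, mate_pairs):
    """Return the pairs of contig names joined by at least one mate pair.

    contigs maps a contig name to its assembled text.
    mate_pairs is a list of (left_read, right_read) taken from one insert.
    """
    def find(read):
        # Name of the first contig containing the read, or None.
        return next((name for name, text in contigs.items() if read in text), None)

    links = set()
    for left, right in mate_pairs:
        a, b = find(left), find(right)
        if a is not None and b is not None and a != b:
            links.add((a, b))
    return links


# Hypothetical contigs and mates, for illustration only.
contigs = {
    "contig_A": "Marley was dead, to begin with.",
    "contig_B": "There is no doub",
}
mates = [("dead, to be", "is no doub")]
print(link_contigs(contigs, mates))
```

The link says only that the two contigs belong next to each other in a scaffold; the letters in the gap between them are still unknown.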
Shotgun Sequence of a Book
How are gaps between assembled contigs "closed experimentally"?
Marley was dead, to begin with. There is no doubt whatever
about that. The register of his burial was signed by the
clergyman, the clerk, the undertaker, and the chief mourner.
Scrooge signed it. And Scrooge's name was good upon
'Change for anything he chose to put his hand to.
Old Marley was as dead as a doornail.
Shotgun Sequence of a Book
Polymerase Chain Reaction (PCR)
Requires known primer sequences, one on each of the two strands.
with. Ther
Sequencing vs Assembly
Dideoxy sequencing
G A T C
GGGATATGTCAGACGGTA
AATACAAGAACCCAAGCACCCAATTAA
GTCCGATAGGCTCTTGTCG
TCTGGAAGCATTTAACCG
TAATTCTCTTTGTTATGGTGTCTGACC
TGCAGCGTCAGCGAAA
TAAATTCTGCTAGTGTCCGGTTTGC
CGGATACGCGCGAGAACTGACGACAACTCAGCGA
Sequence assembly
Contig 1 Contig 2
Finishing
What do you think is the better option for everyday sequencing
use in the real world: dideoxy or shotgun genome?
SQ10. Why not 1X? How much of the sequence would thereby be determined?
Drosophila genome (~100 million nt)
SAMPLE
. . .
How many 500 nt samples are needed to cover 100 million nt? 100,000,000 / 500 = 200,000
I'm not really sure how to go about figuring out how much 1X of the
genome would be. How much what? Is this referring to nucleotides or contigs?
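"1X" refers to nucleotides: the total length of all reads added together equals the length of the genome. A minimal sketch of that arithmetic follows, using the genome size and read length from the slides. The function name is an assumption made for illustration.

```python
def reads_needed(genome_len, read_len, coverage):
    """Number of reads whose combined length is coverage times the genome length."""
    return coverage * genome_len // read_len


genome_len = 100_000_000  # Drosophila genome, ~100 million nt
read_len = 500            # nt per read

for coverage in (1, 5, 10):
    print(f"{coverage}X: {reads_needed(genome_len, read_len, coverage):,} reads")
```

Reaching 1X does not mean every nucleotide has been read once, because reads land at random and many overlap while other stretches are missed.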
Paint the wall
How long will this take?
Wall: 40 " by 25 ", painted 1 sq " at a time
1000 paint balls?
Do more clones generate a more accurate genome assembly?
[Figure: Completeness (0 to 1) versus Oversampling (0 to 10)]
How much is painted with 1x oversampling?
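The paint-the-wall question can be explored by simulation. The sketch below assumes a wall of 1000 squares (40 by 25) and that each paint ball lands on exactly one square chosen at random, which is a simplification. The function name and the fixed random seed are assumptions made for illustration.

```python
import random


def painted_fraction(squares, balls, seed=0):
    """Throw balls at random squares and return the fraction of squares hit at least once."""
    rng = random.Random(seed)
    hit = {rng.randrange(squares) for _ in range(balls)}
    return len(hit) / squares


squares = 40 * 25  # 1000 squares of 1 sq " each
for oversampling in (1, 2, 4, 6, 8, 10):
    fraction = painted_fraction(squares, squares * oversampling)
    print(f"{oversampling}x: {fraction:.3f} of the wall painted")
```

With 1x oversampling roughly a third of the wall is typically left bare, because many paint balls land on squares that are already painted.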
Comments from Questionnaire
What are gapped palindromes? ...how we can design an
algorithm to do this in normal English
GATATCA palindrome?
GATCATCA palindrome?
A palindrome?
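In plain English, one possible algorithm is: write down the sequence, write down its partner strand read in the opposite direction, and see whether the two are the same. The sketch below assumes "palindrome" is meant in the DNA sense (a sequence equal to its own reverse complement) and that a gapped palindrome is two such matching arms separated by unrelated letters. The function names and the test sequence GGATAAATCC are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """The partner strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindrome(seq):
    """True if the sequence reads the same on both strands."""
    return seq == reverse_complement(seq)


def is_gapped_palindrome(seq, arm):
    """True if the first and last `arm` letters pair with each other,
    whatever lies in the gap between them."""
    if arm < 1 or 2 * arm > len(seq):
        return False
    return reverse_complement(seq[:arm]) == seq[-arm:]


for seq in ("GATATC", "GATATCA", "GATCATCA"):
    print(seq, is_palindrome(seq))
print("GGATAAATCC", is_gapped_palindrome("GGATAAATCC", arm=4))
```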
Problem Set 1
Group 1: Farah, Kristen, Sandrine, Sue
Group 2: Firras, Lawangin, Sonia
Group 3: Kavya, Mandi, Supriya
Group 4: Grace, Jonathan, Keith, Yordanos
Group 5: Franklin, Sonia, Trevor
Group 6: Abdallah, Celeste, Tori
Group 7: Bobby, Kathy, Moshraf, Tayab
Me
FRONT
Problem Set 2, #3g: Palindrome
SQ10: Coverage
Problem Set 3, #1: Assembly
SQ14. From figures given in the text and in Table 1, check the accuracy of each of the following statements:
a. "We produced 3.156 million reads that yielded 1.76 Gbp of sequence. . ."
b. ". . .trillions of overlaps between reads are examined."
c. ". . .to produce 654,000 of the 2-kbp mates and 497,000 of the 10-kbp mates."
Myers et al (2000)
how to read and understand the tables and the figures
they represent in reference to the sequencing.
Is it okay if we answer the questions of the tour using the internet instead of answering them just through the article?
I'm having trouble setting up the calculations in a
way to help me solve SQ14.
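One way to set up the SQ14 calculations is to turn each quoted statement into a quantity that can be compared with Table 1 of the article. The sketch below uses only the numbers quoted in the three statements; the Table 1 values are not reproduced here and still have to be looked up. It assumes that each mate pair consists of two reads, and the variable names are chosen for illustration.

```python
# Numbers quoted in statements a and c
reads = 3_156_000
total_bp = 1_760_000_000
mates_2kbp = 654_000
mates_10kbp = 497_000

# a. What average read length do the two figures imply?
mean_read_len = total_bp / reads
print(f"a. implied mean read length: {mean_read_len:.0f} nt")

# b. If every read is compared with every other read, how many comparisons is that?
read_pairs = reads * (reads - 1) // 2
print(f"b. possible read-vs-read comparisons: {read_pairs:.2e}")

# c. How many reads are tied up in mate pairs, and is that no more than the total?
reads_in_mates = 2 * (mates_2kbp + mates_10kbp)
print(f"c. reads belonging to mate pairs: {reads_in_mates:,} of {reads:,}")
```

Each printed value is then checked against the matching row of Table 1 to judge whether the statement in the text is accurate.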
SQ13. Consider the data types listed in Table 1.
Why is each important?
Myers et al (2000)
TATA boxes before genes?
Sequencing process
Drosophila genome (~100 million nt)
. . .
Focus on one nucleotide…
What’s the probability that it’s covered by one read?
What’s the probability that it’s covered by two reads?
What’s the probability that it’s covered by 200,000 reads?
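These questions can be worked out directly rather than by simulation. The sketch below reads "covered by n reads" as "covered by at least one of n randomly placed reads", which is an interpretation, and it ignores effects at the ends of the genome. A single 500 nt read covers a given nucleotide with probability 500 / 100,000,000, so the nucleotide is missed by all n reads with probability (1 - p) to the power n. The function name is an assumption made for illustration.

```python
import math


def prob_covered(genome_len, read_len, n_reads):
    """Probability that one chosen nucleotide lies in at least one of n random reads."""
    p_one_read = read_len / genome_len
    return 1 - (1 - p_one_read) ** n_reads


genome_len, read_len = 100_000_000, 500
for n in (1, 2, 200_000):
    print(f"{n:>7,} reads: {prob_covered(genome_len, read_len, n):.6f}")

# At 1X (200,000 reads) the result is very close to 1 - 1/e.
print(f"1 - 1/e = {1 - math.exp(-1):.6f}")
```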
Questometer Report
[Figure: report for Jan 17, Jan 22, Jan 24]