Introduction to Bioinformatics
Bas E. Dutilh
Systems Biology: Bioinformatic Data Analysis
Utrecht University, February 8th 2016

Info and documentation
• http://tbb.bio.uu.nl/BDA/
• http://www.google.com/ http://www.wikipedia.org/
  – … but only for guidance and hints: never take the internet for granted
• Campbell Biology, 9th or 10th edition, Pearson
• Reader
  – Printed in black and white
  – Download full color PDF at: http://tbb.bio.uu.nl/BDA/BioInf2016.pdf
  – Errata: http://tbb.bio.uu.nl/BDA/errata.html




Course evaluation
• Final mark course
  – 40% mark of Bioinformatic Data Analysis
    • Bas Dutilh
  – 10% mark of Basic Maths
    • Kirsten ten Tusscher
  – 50% mark of Mathematics/Theoretical Biology
    • Kirsten ten Tusscher and Rob de Boer
• Bioinformatic Data Analysis exam
  – Written exam
  – “Cheat sheet” allowed: one hand-written A4, double-sided is OK
  – Date: March 14th 2015 at 13:30-16:30 in Educatorium Gamma
• Bioinformatic Data Analysis bonus point
  – Complete all exercises and have them signed by your assistant
    • This has to be done in the same week as the practical
    • In case of emergency: the last chance to sign off is on Monday before the lecture
  – The maximum mark is a 10
  – Mini-article was cancelled

How would you figure out the function of a protein?
• Knock-out mouse
• X-ray structure
• Activity assay
• BLAST search


How about for all proteins in a genome?

Genome sizes
• Tb: tera base pairs (10^12)
• Gb: giga base pairs (10^9)
• Mb: mega base pairs (10^6)
• Kb: kilo base pairs (10^3)
• Chaos chaos (1.4 Tb, Friz 1968)
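The unit prefixes above can be applied mechanically. A minimal sketch in Python, assuming a hypothetical helper named format_bp (not part of the course material) that picks the largest unit that fits; the second example value is arbitrary:

```python
# Unit sizes from the slide: Kb = 10^3, Mb = 10^6, Gb = 10^9, Tb = 10^12 base pairs.
UNITS = [("Tb", 10**12), ("Gb", 10**9), ("Mb", 10**6), ("Kb", 10**3)]


def format_bp(n_bp):
    """Format a base-pair count with the largest fitting unit (hypothetical helper)."""
    for name, size in UNITS:
        if n_bp >= size:
            return f"{n_bp / size:g} {name}"
    return f"{n_bp} bp"


# Chaos chaos genome size as quoted on the slide.
print(format_bp(1_400_000_000_000))
# An arbitrary illustrative value, not taken from the slides.
print(format_bp(4_600_000))
```

Keeping the units in a list ordered from largest to smallest means the first match is always the most compact representation.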


Gene density and non-coding DNA
• Mammals (including humans) have the lowest gene density
  – Number of genes in a given length of DNA
• Introns within genes
• Noncoding DNA between genes

Components of the human genome
• 20,000 to 25,000 protein-coding genes (1.5%)
• Introns (25.9%)
• Transposable elements (44.7%)
  – DNA transposons
  – Long terminal repeat (LTR) retrotransposons
  – Short interspersed nuclear elements (SINEs)
  – Long interspersed nuclear elements (LINEs)
  – Endogenous retroviruses
  – Miniature inverted repeat transposable elements (MITEs)
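To get a feel for these percentages, they can be turned into approximate base-pair counts. A rough sketch in Python, assuming the percentages are fractions of a 3,000,000,000 bp genome (the human genome size quoted in these slides); the helper name component_sizes is hypothetical:

```python
# Assumption: the percentages are fractions of a 3,000,000,000 bp human genome.
GENOME_BP = 3_000_000_000

# Percentages as listed on the slide, written as fractions.
FRACTIONS = {
    "protein-coding genes": 0.015,
    "introns": 0.259,
    "transposable elements": 0.447,
}


def component_sizes(genome_bp, fractions):
    """Return approximate base pairs per component (hypothetical helper)."""
    return {name: round(genome_bp * frac) for name, frac in fractions.items()}


sizes = component_sizes(GENOME_BP, FRACTIONS)
for name, bp in sizes.items():
    print(f"{name}: ~{bp / 1e6:.0f} Mb")

# Whatever the three listed categories do not cover.
other_bp = GENOME_BP - sum(sizes.values())
print(f"not listed here: ~{other_bp / 1e6:.0f} Mb")
```

Under the same assumption, the protein-coding fraction divided over 20,000 to 25,000 genes comes to roughly 2 kb of coding sequence per gene.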


Largest genomes
• Largest sequenced genome: Loblolly pine (Pinus taeda), 20,000,000,000 bp (20 Gb)
• Kinugasasō (Paris japonica), 149,000,000,000 bp (149 Gb)

Smallest genomes
• Eukaryota
  – Free-living: Ostreococcus tauri (12.6 Mb)
  – Endosymbiont: Encephalitozoon intestinalis (2.3 Mb)
• Bacteria and Archaea
  – Free-living: Mycoplasma genitalium (580 kb)
  – Endosymbiont: Cand. Carsonella ruddii (160 kb)
• Viruses
  – Circoviridae (1.8 kb, only two proteins!)


Human genome
• 3,000,000,000 bp (3 Gb)
• Human Genome Project (HGP)
  – 1990-2003
  – Draft genome sequence complete in 2000
• Reference genome
  – Source: blood (female) and sperm (male)
  – Samples taken from many donors, but only a few were used to protect donor identities
  – Sequence is not from one individual
    • >70% from one male donor
• Cost HGP: $3,000,000,000
  – Target: $1,000 genome

Genetic diversity
• Phylogenetic Tree of Life
[Figure: phylogenetic tree of life. Labels: Prokaryotes; Bacteria; Archaea; Eukaryotes]


Genome sequencing
[Figure: genome sequencing workflow. Labels, in slide order:]
• Cloned genomes
• Segments in known order
• Fragment and sequence
• Assemble sequences
• Consensus genome
• Whole Genome Shotgun (WGS) approach
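The "fragment and sequence" and "assemble sequences" steps can be illustrated with a toy example. The sketch below is a simple greedy overlap merge in Python, chosen only for illustration; it is not the method used by any real assembler, and the function names and the three short reads are made up:

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (>= min_len), else 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two reads with the largest overlap (toy greedy assembler)."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left: the remaining reads stay separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Made-up overlapping fragments of the made-up sequence ATGGCGTACGTTAGC.
fragments = ["ACGTTAGC", "ATGGCGTAC", "GCGTACGTT"]
print(greedy_assemble(fragments))
```

Real genomes contain repeats and sequencing errors, so a greedy merge like this can produce wrong joins; it only shows the idea of rebuilding a consensus from overlapping fragments.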


Personal genome sequences
[Figure: comparison of personal genomes. Labels: Craig Venter; James Watson; Reference Genome; Your personal genome sequence. Difference counts shown: ~5,000,000; ~2,000,000; ~5,000,000]
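The difference counts can be put in proportion to the 3,000,000,000 bp human genome quoted earlier in these slides. A small sketch in Python; the function name count_differences and the two short sequences are made up for illustration, and the comparison assumes the sequences are already aligned and of equal length:

```python
def count_differences(seq_a, seq_b):
    """Count positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(seq_a, seq_b))


# Made-up toy sequences: they differ at two of eight positions.
print(count_differences("ACGTACGT", "ACGAACGG"))

# Scale of the numbers on the slide: ~5,000,000 differences in ~3,000,000,000 bp.
GENOME_BP = 3_000_000_000
differences = 5_000_000
print(f"about 1 difference per {GENOME_BP // differences} bp")
print(f"about {100 * differences / GENOME_BP:.2f}% of positions")
```

Real genome comparisons also involve insertions, deletions, and rearrangements, which a position-by-position count like this does not capture.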


So we have a $200 personal genome…
• … now the million dollar question is:
  What can I learn from my 3,000,000,000 A’s, C’s, G’s, and T’s?

Personalized medicine
• From reactive to proactive medicine
  – Identify high risk alleles
  – Adapt lifestyle (e.g. risk of high blood pressure)
  – Preventive screening or treatment (e.g. risk of cancer)
• Pharmacogenomics:
  – Impact of genetic variation on response to medication

Sergey Brin (co-founder, co-investor)
• LRRK2 polymorphism on chromosome 12
  – 28% risk of Parkinson’s at age 59
  – 51% at age 69
  – 74% at age 79


Biology is Big Data science
[Figure: number of sequenced genomes over time. Axis label: # sequenced genomes]
• Moore's Law: computer power doubles every ~2 years.
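Doubling every ~2 years compounds quickly. A minimal sketch in Python of the arithmetic, assuming an exact 2-year doubling time; the function name growth_factor is hypothetical:

```python
def growth_factor(years, doubling_time=2.0):
    """Multiplicative growth after `years` for a quantity that doubles every `doubling_time` years."""
    return 2 ** (years / doubling_time)


# Assumption: an exact doubling time of 2 years (the slide says "~2 years").
for years in (2, 10, 20):
    print(f"after {years} years: x{growth_factor(years):g}")
```

The same function can be reused with a different doubling_time to compare how fast another quantity, such as the number of sequenced genomes, would grow under its own doubling rate.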

Omics sciences
• The suffix -ome refers to a totality of some sort

  Unit                Totality          Field
  Gene (genetics)     Genome            Genomics
  Transcript (RNA)    Transcriptome     Transcriptomics
  Protein             Proteome          Proteomics
  Metabolite          Metabolome        Metabolomics
  Lipid               Lipidome          Lipidomics
  Microbe             Microbiome        Microbiomics (?!)

[Figure labels: DNA; RNA; Protein]


Genomics
• Identify differences in gene content between genomes
• Discover new species: “Biological Dark Matter”
• Analyze genome evolution
• Predict gene functions

Chordata ↔ Echinodermata

• 1,000,000,000,000 species on earth?
• 10,000 species cultured
• 30,000 genomes sequenced


Metagenomics
[Figure: metagenomics workflow. Labels: Sample; Filter; Microbes or viruses]


Metagenomic discovery of Lokiarchaeota
Spang et al. Nature 2015

Genetic diversity
• Phylogenetic Tree of Life
[Figure: phylogenetic tree of life. Labels: Prokaryotes; Bacteria; Archaea; Eukaryotes]


Human microbiome and virome
Image: Lisa Brown for
• In your body:
  – ~10^13 human cells
  – ~10^14 bacteria
  – ~10^15 viruses

Bioinformatics
• Bioinformatics: study of informatic processes in biotic systems
  – Paulien Hogeweg and Ben Hesper (Utrecht University, 1970)
• Bioinformatic Data Analysis: using computational methods to analyze biological data


Bioinformatics in Utrecht today
• Bring your laptop