Lives of the Scientist Genetic Basis of Differentiation Events in time and space


Lives of the Scientist

Genetic Basis of Differentiation

Events in time and space . . . driven by patterned gene expression

[Figure: Nostoc; labels N2 and NH3]

Genetic Basis of Differentiation

[Diagram, built up over four slides. Labels: NH3; Environmental Signal; Histidine Kinase; How?; Developmental Response. Later builds add: PAT; P; Response Regulator; Phistidine; ? ? ? NpR3010]

AATAAAGCTTTACAAACCAAACTCTGGCTTCAATTGTGTAACCCAAGCTTTGATTCTTTCCTCTGTTAAATCGGATTGATTATCTTCATCAAGGGCAAGACCTACAAATTTACCATCACGAACAGCTTTAGACTCACTGAATTCATAACCTTCTGTAGGCCAATAGCCAACTGTTTCACCACCATTTTCTGAAATTTTTTCCTCTAGAATACCGAGGGCATCTTGAAATGTATCAGGATAACCAACCTGGTCTCCAGGAGCAAAATAAGCAACTTTTTTGCCGATGAAGTCAATGTTATCTAACTCATCATAAAAATTTTCCCAATCACTTTGCAATTCTCCAACATTCCAGGTAGGACAACCAACAACGATATAATCGTAGTTATTGAAATCACTTGGTTCAGCTTGTGAAATATCATATAAAGTTACAACACTATCACCACCAAACTCCTTCTGAATTATTTCTGATTCAGTTTGGGTATTGCCTGTTTGAGTACCAAAAAATAAACCAATATTAGACATTTTTACTCCTTTTATGTATTTGCAAAATTATTTCAATTAAAATATTTAGTAATAATTAATTGTTAGCTAGCTAATAATTAAATTTTTATTACAATCATTGTAAAAGGCATTGAAAAAGTAAATAAAAATTTTTATTCTACGTTATTTCAAAAATATTTACTTACATATACTTAACCTTTATAGTGATGTAATATACTCTAATTCCTATTTTACTTATAAATACCATCTCAGCTTAATGTAACGAATTTTTCTGTTTATCTTTAAATACAAAAAATTCAACAAAACTACAGAAAATTAATCTTAATAACACAAAACAAGTATCAATCTGTAATACAACTAAGCTTAAATAAATTAATAGAAAGCTTCATCTATCTAATAGGTTGAGAATAGTTTATGTCTAATGACATAAATTCATTCGTGTTGATTTCATTTGGGTATATTCATCTGATTTAGGATTTACTCCATTAAGTTTGTACTCATCAATGCCCGCCTGTTGGTATCCACAATTCTCATACAGTGCGCGAGCAAAGTAATCAATCGTTCGTCGCCATATCTAACTTTGAGTCAAACAAACCAGTTGGATTACCAACCCTCAACTAATCGCTTCTTTAAGGCGAGCGATCGCACATTTAACTGTTGGTTGTCACAAGAGAACTAATACTACAGCAGTATATTTAACAACTAAGGGTGGTTCAACTTTCGCTGCGACTCCTCCAACGCGCTGAAATACACAGGACTGATGCGATCGCAAACTCTTTGACTAAATTCCATACATTATCATGACCATCTCCCAAACAAACAAGTGGGTTAACCAGATGCTGACTATTAACATCCCCTGAGTTCGGAGTTGTAGGTCTATTTGACTGGTTCAAAGCGATGATGGAACGGCTTTGTTGCATGAATTAAAAAAAGACACACCATCACCTACTTCTAGGATAGACACATCAAACGTCCCACCGCCTAAGTCAAATACCAAGATAATTTCGTTAGTTTTCTTGTCAAGTCCGTAAGCGAGGGCCGCCGCCGTGGGCTAGTTGATAATTCGCAGAACTTTAATCCCGGCAATTCTACTGGCATCTTTGGTAGCCTGCCGTTGAGAGTCATTGAAATAGGCAGGGGTGGTAATTACCGCTTGCCTCACTGGTTCCCCCAGATATGTGCTGGCATCATCTATCAGCTTGCGGACTACCTCATACCATTTCACGAAAAACCTGATACACATGTAAACTCTGAAACCCTTGCTGTATCAAAGTTTTGTAATTACGAATTACGAATTACGAATTGATATCAGCCGAGATTTCTTCGGGTGAAAATTCCTTGTTCAGAGCGGGACAGTGTAGCTTGACATTGCCATTACTGTCACGTACCACTTTGTAAGTAACTTGTTTTGCCTCTTGCGTAACTTCATCATACCTGCGCCCGATGAACCGCTTCACAGAATAAAAAGTGTTTTCTGGGTTCATTACACCCTGGCGCTT


Histidine Kinase

NpR3010 (Nostoc punctiforme)

Genes Functionally Related to His Kinase

Find similar genes (Blast) in: Anabaena PCC 7120, Trichodesmium, Synechocystis PCC 6803, . . . (13 total)

Conserved


>npun_22dec03_Contig1_revised_geneNpR3010
MWHIQDSIITLSNHNQYLTFYKNQVKNPERFCRNVNQFDSQIDFVSCDIL
ELKDGRFFEQYSKPLRLAEEIIGTVWSFRDITESQQAKEENRRIIQQEKQ
LAEDRAYFTSMIFHEFRNPLNIISYSTSLLKRHSHHWSEEKKLQCLQNLQ
TAVEQINQFTDEVLIIESVEAGKLQYELKPIDLNLFCREVLAEMSLYTKG
ASQFLLFQNK*


>npun_22dec03_Contig1_revised_geneNpR3008
LSPYLEACCLRISASVSYQRAAEDIEYLTGVEVSKSVQQRLVHRQNFELP
QVESTVEELSVDGGNIRIRTIKGQVCDWKGYKATCLHEKQAIAASFQENS
LVIDWVKSQSIAPILTCLGDGHDGIWNIVRDFAPEHQRREVLDWFHLMEN
LHKIGGSNQRLNQAKILLWQGKVDDAIAVFADCQLKQAFNFCTYLEKHRH
RIVNYQYYQAEQICSIGSGAIESTVKQIDRRTKISGAQWKSDNVPQVLAQ
RQSLSQWINLCSLNKNWDAPMKSSVERLSDYPVAR*


A new family of proteins?!

A type of transposase?

TRANSPOSON

transposase

...ATTTCTCTAGAAAGGCTGAAGGGGGGACAAGCACCCGAAAGCCTTTGTGCT...
...TAAAGAGATCTTTCCGACTTCCCCCCTGTTCGTGGGCTTTCGGAAACACGA...

...ATACAGTCAGCTTTATAGGCTTCATGTCGCCCCTTCAGCTAGAAAGGTACATA...
...TATGTCAGTCGAAATATCCGAAGTACAGCGGGGAAGTCGATCTTTCCATGTAT...

Is NpR3008 a transposase?


[Figure: the DNA sequence shown earlier, repeated in full and then split into short fragments]


Observation

* Photos courtesy of www.webshots.com and Peter Smallwood


Filters: Information reducers

Squirrel filter

Molecular filter

TCTACTTATA TTCAATCCAC AGGGCTACAC CTAGTTCTTG AAGAGTCTGT TGAATGAACA CATACATGGT TTATCTGTTT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC CACTAGTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC TTAGATAAAC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCACGCCC CTCCGTAAAC CTCTAACATG ATGTCAGCAA ATATTAAAAA TGAATAAACT TTGTTAAAGG TACAAATGAA AATTAGCAAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT CATTCTAGGG AAACCTGTAT GGTTACATGA ACTGCCTAAA AAACAAGCTA TTATATATTT TAAGAAATTA ATTGCAATTA ATTTCCTGGG CCCCAGCTGT CATTAAAAAG AGGCAAATAC AGCCAAGGAC GACAGCACTG ACCCTCAAGA AGGCACCGGC TGACAGACAG GCTGAAATTC CGCTGAGAGC AGAGTGGTAC ATTGAACCCT CCCTGCACCA GGTCTTTCCT GTGGGCACTG AGTGCAGACA ATGAATGACT GAACGAACGA TTGAATGAAA AGAAATGAGA TATGAGGCAA TCACAGCATC AGGTGACCTT AGTATCTATT CTCGGGAGCG CACGGCTCTA AAGAGGCCCA TATCCAGGCA CCTTTAGATG CAAGAAGGAG GAAACAGCTC GAAATCCCTG AGGCCGGAGG GTCAAGAACT CTCCACCGGC GGCAGCGGCC CCCCGGCCTA AGGCTGCCTG TGCTATAAAT ACGCGGCCCA TTCCCTGGGC TCGGCGGGAC AGATAACATG AATGTGCCCT

CTCCGTAAAC CTCTAAC...

Filters: Information reducers

Sequence filter

How do Biologists use Bioinformation?

Candidate genes Predicted genes

Interpolated Markov model

Gene finder

TCTACTTATA TTCAATCCAC AGGGCTACACAAGAGTCTGT TGAATGAACA CATACATGGTTTCTGTCTGC TCTGACCTCT GGCAGCTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCGTAAAC CTCTAACATG ATGTCAGCAA TGAATAAACT TTGTTAAAGG TACAAATGAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT AAACCTGTAT GGTTACATGA ACTGCCTAAA TTATATATTT TAAGAAATTA ATTGCAATTA CCCCAGCTGT CATTAAAAAG AGGCAAATAC GACAGCACTG ACCCTCAAGA AGGCACCGGC GCTGAAATTC CGCTGAGAGC AGAGTGGTAC CCCTGCACCA GGTCTTTCCT GTGGGCACTG ATGAATGACT GAACGAACGA TTGAATGAAA

What genes are in my organism?

Predicted genes

Conform to standard model

Challenge accepted beliefs
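To make "conform to standard model" concrete, here is a minimal sketch of a much cruder filter than the interpolated Markov model named on the slide: a naive scan for open reading frames on one strand. It is not how the slide's gene finder works. The function name `find_orfs`, the `min_codons` threshold and the toy sequence are all assumptions made for illustration. Anything that does not look like "ATG ... stop codon, in frame" is never reported, which is the sense in which a filter can hide candidates that challenge accepted beliefs.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end) pairs for forward-strand ORFs.

    An ORF here is an ATG followed, in the same reading frame, by a
    stop codon, with at least `min_codons` codons before the stop.
    `end` is exclusive and includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # look for the next ORF in this frame
    return sorted(orfs)

# Hypothetical toy sequence, not taken from the slides.
toy = "CCATGAAATTTGGGTAACC"
for start, end in find_orfs(toy):
    print(start, end, toy[start:end])
```

A gene that uses an unusual start codon, or that sits on the other strand, passes through this filter unseen; a real gene finder is far more sophisticated, but it still reports only what its model expects.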

Filters are powerful

globin

Highly filtered output
• Easy to grasp
• High-level insights

Filters Constrain New Discovery

Unfiltered output
• Confusing
• Basic insights

Filters are tempting

Globin

The Death of Science

Current State of Affairs

1. Need high-level filters

2. Need access to raw phenomena

AATAAAGCTTTACAAACCAAACTCTGGCTTCAATTGTGTAACCCAAGCTTTGATTCTTTCCTCTGTTAAATCGGATTGATTATCTTCATCAAGGGCAAGACCTACAAATTTACCATCACGAACAGCTTTAGACTCACTGAATTCATAACCTTCTGTAGGCCAATAGCCAACTGTTTCACCACCATTTTCTGAAATTTTTTCCTCTAGAATACCGCAACACTATCACCACCAAACTCCTTCTGAATTATTTCTGATTCAGTTTGGGTATTGCCTGTTTGAGTACCAAAAAATAAACCAATATTAGAC

3. Need ability to build new tools

ASSIGN K12-set FROM Gene-finder (K12-DNA)
ASSIGN O157-set FROM Gene-finder (O157-DNA)
CONSIDER EACH protein IN O157-set
    WHEN Constituent-of (K12-set, protein) = FALSE
        COLLECT protein
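The tool sketched above collects every protein found in O157 that is absent from K12. A minimal Python rendering of the same idea follows, assuming the two protein sets have already been produced by some gene finder. The function name `proteins_unique_to` and the protein names are hypothetical placeholders, not real annotations.

```python
def proteins_unique_to(query_set, reference_set):
    """Collect each protein of query_set that is not a constituent of reference_set."""
    reference = set(reference_set)          # set membership tests are O(1)
    return [protein for protein in query_set if protein not in reference]

# Hypothetical stand-ins for the output of a gene finder run on each genome.
k12_set = ["protA", "protB", "protC"]
o157_set = ["protA", "protX", "protB", "protY"]

print(proteins_unique_to(o157_set, k12_set))
```

The list comprehension keeps the order of the O157 set, mirroring the CONSIDER EACH ... COLLECT loop. Matching by name is itself an assumption: a real comparison would need some notion of sequence similarity to decide when two proteins count as the same.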


We need…

Biologists . . .

. . . and Programmers

Current State of Affairs

1. Need high-level filters

2. Need access to raw phenomena

3. Need ability to build new tools

Need biologist programmers

AATAAAGCTTTACAAACCAAACTCTGGCTTCAATTGTGTAACCCAAGCTTTGATTCTTTCCTCTGTTAAATCGGATTGATTATCTTCATCAAGGGCAAGACCTACAAATTTACCATCACGAACAGCTTTGARYGACTCACTGAATTCLARATAACCTTCTGTAGGCCASONATAGCCAACTGTTTCACCACCATTTTCTGAAATTTTTTCCTCT


TATTCAAAATGAATTATATCGGTAACTTTAGTACAGAAAATGACGTTAAGAATATCTGCAACTTTAAACCTGAATGATATTATTATTGGCGGGCCTCCATGCCAGGGATTTAGTATTGCTGGGCCAGCCCAAAEALAVGIASTCCTAAAGATCCTAGAAATGGTTTAGAATTTTCATCAACTTTGCACAATGGATAAAATTTCTTGAACCTAAAGCGTTTGTCATGGAAAACGTGAATTCAAAAGGATTGCTATCAAGGAAAAATGCAGAAGGTTTTAAAGTTATAGATATTATTAAGAAAACATTTGGAATTCGAGAACTTGGTTATTTTGTCGAAGTATGGGTTTTAAATGCTGCGGAATATGGCATTCCGCAAATTAGAGAACGGAATTCGATTTTTATTGTTGGCAATAAAAAAGGTAAAGTACTAGGTATTCCTAAAAAAACACATTCTCTGCAATTTTTAAGAATTCGATTTAAATAGGTCTCAATTATCGATCTTCGATGAT

ATGAGTATTATACCTGCACTAACTTTGTGGGACGCAATATCAGACTTACGAATTCGACAGAACTTAATGCGCGTGAAGGAAGTGAAGAGCAACCCTATCATTTAAAACCTCAAAATACTTATCAGACTTGGGCTAGAAATGGTAGTGGAATTCGATACGCTTTACAATCATGTTGCAATGGAACATTCTGACCGTTTAGTAGAACGTTTCCGGCATATAAAATGGGGTGAATCCAGTTCGGATGTATCTAAAGAAGAATTCGACATGGAGCTAGACGACGT

AGTGGTAATGGTGAATTATCAAACAAATCATATGATCAGAATAATCGCCGTTTAAATCCTCATAAACCGGAATTCGAATTCTCACACTATTGCTGCGTCATTCTATGCTAATTTTGTCCATCCTTTTCAACATCGAAATTTAACAGCCCGTGAAGGAGCTAGAATCCAATCTTTTCCAGATAACTATAGATTTTTTGGAAAAGAATTCGAATTCAAACTGTCGTATCTCATAAACTATTGCATCGAGAAGAAAGATTTGATGAAAAATTTCTTTGTCAATATAATCAAATCGGTAATGCTGTACCCCCTCTTCTCGCTAAAGTAATTGCACATCATCTTCTAGAGAAATTAGGAATTCGAATTCAGTTATGCCAACAACTGATAGAAATCCTCTAGTGCATGGATCAAATCTTGAACAAAAAGAGAATCATCGTACAAAATACAGAGATACTGAAAGCAGGACTTTCCTTAGAGAAATCAGAACTGAATATGACAAATGGCATAAAGCAAATATGAACCTGGAATTCGAATTCGAGTTGGACCAAAATCAGAAATTACTGACCA

AGATGATTCAATTATTACTCAAAGAGTGGAACTTCTCACTAAATATAAAGATTTTTTAGATCAGCAGCATTATGCAGAAAAATTTGATTCAAGATCCAACCTTCATTCTAGTGTTTTAGAGACCATTTATAAAGTAAATCTTTAGACGACTAGACGACGTAGCGAATTCGAATTCGAATTCATAATACGAGTCATAACGGCATATATG

GCAGCCTCACTCATTTCTGGGAGACGCTCATAATCCTTACTGAGACGACGGTACTGGTTTAACCAGCCAAATGTTCTTTCTACTACCCACCGTTTGGGCAAAACCTGAAATTCTTGATTAGTACGCCGGATTACCTCAACATGAGCTTGAATCATCAGCCAAACAGAGAGCGCAAATTTATCACCGTCATAGCCGGAATCAACCCAGATGACTTGAATTCGAATTCGAATTCGAACAACTTTTTCCAGTAATTCTGGACGCTCTTCTAACAGTTCCATCAAAGTATAGGCGGCAAGTAATCTTTCTCCAGCATTTGCTTCACTTACAACCACTTTTAACAAAAGTCCCAGACTATCAACCAAAGTTTGCCGCTTTCGTCCTTTTACCTTCTTGCCACCATCAAAACCGTACACATCCCCCTTTTTTCAGTCGTTTTTACCGACTGGCTGTCTGCCGCGATCGCCGTGGGTTGAGTTGACTTCCCCATTTTTTGACGAACTTGATCGCGCAAAGTATGATTCATTTCAGTTGAACTAGGAGGAAAATCCCCTGGAAGCATATCCCACTGAATTCGAATTCGAATTCGAATTCGAATTCGACAACCTGTTTTCAGATGGTAGTAGATAGCGTTGCATACTTCTCGCATATCAGTTGTTCGGGGATGCCCACCGCATTTAGCGGGTGGAATCAAAGGAGCTAAAATTGCCCATTCTGAGTCATTAAGGTCTGTAGAATAAGACTTTCGTCTCATTGTTTCCTATGTAAATACACTCTACAAACAGTATCTTATCGCTGCCTTTTTATCTTAGCTCTCCTTTAGATTTACTTTATAAATAGCCTCTTAGAAGAATTTCTTTATTATTTATTTAAAGATTTAGTACAAGATTTCGGGCAGAACGCTCTTATTGGTAAGTCACACACGTTCAAAGATATTTTCTTCGTACCACCAAAATATTCTGAAATGCTCAAGCGACCTTATGCGCGAATTGAGAGAAAAGATCATGATTTCGTAATTGGTGCAACTGTTCAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAACCATGCTTGAGGGATCTTCACGCGCAGCAGAGGATTTAAAAGCGAGAAATCCTAACAGTTT

ATACCTTGTGGTTATGGAATGGATAAAACTGACCAATGATGTAAATTTACGAAAATATAAAGTTGATCAAATTTATGTACTACGTCAGCAAAAAAATACTGATAGAGAGTTTAGGTATGAGTCAACTTACATAAAAAAT

Why hasn’t this happened?

Part of a bioinformatics program written in C

if (pcInFile == NULL) pfInFile = stdin;
else pfInFile = fopen(pcInFile, "r");
pfOutFile = fopen( pcOutFile, "w" );
if (pfInFile == NULL) { fprintf( stderr, "ERROR opening %s\n", pcInFile ); exit(1); }
if (pfOutFile == NULL) { fprintf( stderr, "ERROR opening %s\n", pcOutFile ); exit(1); }
fputc( fgetc(pfInFile), pfOutFile ); /* deal with first '>' in file */
for ( ; ; )
{
    if (processIdentifier( pfInFile, pfOutFile )) { }
    else { break; }
    if (processSequence( pfInFile, pfOutFile )) { }
    else { break; }
}
fclose( pfInFile );
fclose( pfOutFile );
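The fragment above alternates between an identifier line (introduced by '>') and a sequence, which is the layout of a FASTA file. What `processIdentifier` and `processSequence` actually do is not shown, so the following is only a sketch of reading such records, not a translation of the C program. The function name `read_fasta` is an assumption; the example record is the NpR3010 protein from the earlier slide.

```python
def read_fasta(text):
    """Split FASTA-formatted text into (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:              # close the previous record
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:                # sequence line of the current record
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

example = """>npun_22dec03_Contig1_revised_geneNpR3010
MWHIQDSIITLSNHNQYLTFYKNQVKNPERFCRNVNQFDSQIDFVSCDIL
ELKDGRFFEQYSKPLRLAEEIIGTVWSFRDITESQQAKEENRRIIQQEKQ
LAEDRAYFTSMIFHEFRNPLNIISYSTSLLKRHSHHWSEEKKLQCLQNLQ
TAVEQINQFTDEVLIIESVEAGKLQYELKPIDLNLFCREVLAEMSLYTKG
ASQFLLFQNK*
"""

records = read_fasta(example)
header, protein = records[0]
protein = protein.rstrip("*")   # a trailing '*' conventionally marks the stop codon
print(header, len(protein))
```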

Why hasn’t this happened?

Part of a bioinformatics program written in Perl

sub match_positions {
    my $pattern;
    local $_;
    ($pattern, $_) = @_;
    my @results;
    local $matchStart;
    my $instrumentedPattern = qr/(?{ $matchStart = pos() })$pattern/;
    while (/$instrumentedPattern/g) {
        my $nextStart = pos();
        push @results, "[$matchStart..$nextStart)";
        pos() = $matchStart+1;
    }
    return @results;
}
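For readers who do not read Perl: the routine above reports every position at which a pattern matches, including overlapping matches, as half-open intervals. It does so by restarting the search one character after each match start. A rough Python counterpart, offered as a sketch rather than an exact port, gets the same effect with a lookahead, which matches without consuming any text. The test string is a made-up example.

```python
import re

def match_positions(pattern, text):
    """Return "[start..end)" for every match of pattern, overlaps included."""
    # (?=(...)) is zero-width, so the scan advances one character at a time
    # and group 1 holds the text that would have matched at that position.
    lookahead = re.compile(f"(?=({pattern}))")
    results = []
    for m in lookahead.finditer(text):
        start = m.start()
        end = start + len(m.group(1))
        results.append(f"[{start}..{end})")
    return results

print(match_positions("ATA", "ATATATA"))
```

An ordinary `re.findall("ATA", "ATATATA")` would report only the non-overlapping matches, which is why the extra machinery is needed in either language.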

Why hasn’t this happened?

Biologists will not come to programming

Programming must come to biologists

BioLingua

Genetic Basis of Differentiation

[Diagram labels: NH3; Environmental Signal; Histidine Kinase; P; Response Regulator; ? ? ? NpR3010; Developmental Response]

Genetic Basis of Differentiation: NpR3010

[Gene map labels: HK-upstream; HK; HK-downstream; RR]

BioLingua

<1>> (GENES-DESCRIBED-BY "response regulator" IN Npun)
:: (#$Npun.NpF0304 #$Npun.NpR0355 #$Npun.NpR0450 #$Npun.NpF0484 #$Npun.NpR0589 #$Npun.NpF0832 #$Npun.NpF0906 #$Npun.NpR0956 #$Npun.NpF1084 #$Npun.NpF1085 #$Npun.NpR1109 #$Npun.NpF1184 #$Npun.NpF1278 #$Npun.NpR1450 #$Npun.NpF1453 #$Npun.NpF1516 #$Npun.NpR1633 #$Npun.NpR1678 #$Npun.NpR1683 #$Npun.NpR1688 #$Npun.NpF1776 #$Npun.NpR1779 #$Npun.NpF1800 #$Npun.NpR1903 #$Npun.NpR2091 #$Npun.NpF2162 #$Npun.NpR2263 #$Npun.NpF2346 #$Npun.NpF2364 #$Npun.NpR2420 #$Npun.NpR2902 #$Npun.NpF2972 #$Npun.NpR3053 #$Npun.NpF3084 #$Npun.NpR3197 #$Npun.NpR3241 #$Npun.NpF3659 #$Npun.NpF3676 #$Npun.NpR3733 #$Npun.NpF3829 #$Npun.NpR3907 #$Npun.NpR3959 #$Npun.NpF3972 #$Npun.NpR4101 #$Npun.NpR4160 #$Npun.NpR4165 #$Npun.NpF4214 #$Npun.NpR4435 #$Npun.NpF4460 #$Npun.NpR4503 #$Npun.NpR4743 #$Npun.NpR4768 #$Npun.NpF4909 #$Npun.NpR5015 #$Npun.NpF5034 #$Npun.NpF5044 #$Npun.NpR5135 #$Npun.NpR5136 #$Npun.NpR5316 #$Npun.NpF5361 #$Npun.NpF5636 #$Npun.NpF5682 #$Npun.NpF5759 #$Npun.NpF5763 #$Npun.NpF5788 #$Npun.NpR6014 #$Npun.NpR6015 #$Npun.NpR6228 #$Npun.NpF6321 #$Npun.NpR6360 #$Npun.NpF6363 #$Npun.pNpAF075 #$Npun.pNpBR039 #$Npun.pNpBF139 #$Npun.pNpBF146 #$Npun.pNpBR169 #$Npun.pNpBR170 #$Npun.pNpBF205 #$Npun.pNpEF003)

<2>> (GENE-UPSTREAM-OF NpF0304)
:: #$Npun.NpF0303

<3>> (GENES-UPSTREAM-OF (RESULT 1))
:: (#$Npun.NpF0303 #$Npun.NpF0356 #$Npun.NpF0451 #$Npun.NpF0483 #$Npun.NpR0590 #$Npun.NpF0831 #$Npun.NpF0905 #$Npun.NpF0957 #$Npun.NpR1083 #$Npun.NpF1084 #$Npun.NpR1110 #$Npun.NpF1183 #$Npun.NpF1277 #$Npun.NpR1451 #$Npun.NpR1452 #$Npun.NpR1515 #$Npun.NpF1634 #$Npun.NpR1679 #$Npun.NpF1684 #$Npun.NpR1689 #$Npun.NpF1775 #$Npun.NpF1780 #$Npun.NpF1799 #$Npun.NpR1904 #$Npun.NpR2092 #$Npun.NpF2161 #$Npun.NpR2264 #$Npun.NpR2345 #$Npun.NpF2363 #$Npun.NpR2421 #$Npun.NpR2903 #$Npun.NpR2971 #$Npun.NpR3054 #$Npun.NpR3083 #$Npun.NpR3198 #$Npun.NpF3242 #$Npun.NpR3658 #$Npun.NpF3675 #$Npun.NpR3734 #$Npun.NpR3828 #$Npun.NpF3908 #$Npun.NpR3960 #$Npun.NpF3971 #$Npun.NpF4102 #$Npun.NpR4161 #$Npun.NpF4166 #$Npun.NpR4213 #$Npun.NpR4436 #$Npun.NpF4459 #$Npun.NpR4504 #$Npun.NpR4744 #$Npun.NpR4769 #$Npun.NpR4908 #$Npun.NpF5016 #$Npun.NpF5033 #$Npun.NpF5043 #$Npun.NpR5136 #$Npun.NpF5137 #$Npun.NpF5317 #$Npun.NpF5360 #$Npun.NpR5635 #$Npun.NpF5681 #$Npun.NpF5758 #$Npun.NpR5762 #$Npun.NpR5787 #$Npun.NpR6015 #$Npun.NpR6016 #$Npun.NpR6229 #$Npun.NpR6320 #$Npun.NpF6361 #$Npun.NpF6362 #$Npun.pNpAF074 #$Npun.pNpBR040 #$Npun.pNpBF138 #$Npun.pNpBF145 #$Npun.pNpBR170 #$Npun.pNpBR171 #$Npun.pNpBR204 #$Npun.pNpER002)
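The results above suggest a simple rule, which is inferred here from the listed names rather than from any BioLingua documentation: for a gene named NpF…, the upstream gene is the one with the next lower number, and for a gene named NpR…, it is the one with the next higher number. This is what one would expect if F and R mark the forward and reverse strands and the numbers follow genome order. The sketch below encodes that assumed rule for chromosomal gene names only. The function name `upstream_of` and the small lookup table are illustrative; the gene names in the table are taken from the session output.

```python
import re

def upstream_of(gene, genes_by_number):
    """Return the assumed upstream neighbour of a chromosomal gene, or None."""
    m = re.fullmatch(r"Np([FR])(\d+)", gene)
    if m is None:
        raise ValueError(f"not a chromosomal gene name: {gene}")
    strand, number = m.group(1), int(m.group(2))
    # Assumption: forward-strand genes look to the lower-numbered neighbour,
    # reverse-strand genes to the higher-numbered one.
    neighbour = number - 1 if strand == "F" else number + 1
    return genes_by_number.get(neighbour)

# A tiny excerpt of the genome, keyed by gene number (names from the slide).
names = ["NpF0303", "NpF0304", "NpR0355", "NpF0356"]
genes_by_number = {int(name[3:]): name for name in names}

print(upstream_of("NpF0304", genes_by_number))
print(upstream_of("NpR0355", genes_by_number))
```

In BioLingua the biologist types one short expression and the system handles this bookkeeping over the whole genome. The point of the sketch is only to show what the expression stands for.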

Page 91: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

BioLingua:: ("two-component sensor histidine kinase [Nostoc sp. PCC 7120] gi|25531611|pir||AD2200 two- "unknown protein [Nostoc sp. PCC 7120] gi|25534386|pir||AH1981 hypothetical protein alr1403 "tmRNA-binding protein [Nostoc sp. PCC 7120] gi|22096164|sp|Q8YM70|SSRP_ANASP SsrA-binding protein "GTP-binding protein era homolog" "unknown protein [Nostoc sp. PCC 7120] gi|25533156|pir||AF2229 hypothetical protein asr3389 "ORF_ID:tlr0160~similar to ferredoxin [Thermosynechococcus elongatus BP-1] "hypothetical protein [Nostoc sp. PCC 7120] gi|25367067|pir||AH2295 hypothetical protein alr3919 "two-component hybrid sensor and regulator [Nostoc sp. PCC 7120] gi|25532444|pir||AE2276 two- "hypothetical protein [Nostoc sp. PCC 7120] gi|25358966|pir||AG2158 hypothetical protein alr2822 "two-component response regulator [Nostoc sp. PCC 7120] gi|25533086|pir||AF2158 two-component "probable two-component sensor histidine kinase [Gloeobacter violaceus] gi|35214672|dbj|BAC92039.1| "phytochrome-like protein [Tolypothrix sp. PCC 7601]" "two-component sensor histidine kinase [Nostoc sp. PCC 7120] gi|25530471|pir||AC1860 two-component NIL NIL NIL "hypothetical protein [Nostoc sp. PCC 7120] gi|25535333|pir||AI2179 hypothetical protein all2992 NIL "unknown protein [Nostoc sp. PCC 7120] gi|25535440|pir||AI2275 hypothetical protein alr3760 "transcriptional regulator [Nostoc sp. PCC 7120] gi|25302898|pir||AB2544 transcription regulator "similar to two-component sensor histidine kinase [Nostoc sp. PCC 7120] gi|25531791|pir||AD2385 "putative gluconolactonase precursor [Sinorhizobium meliloti] gi|25369832|pir||G95274 probable "similar to two-component sensor histidine kinase [Nostoc sp. PCC 7120] gi|25531791|pir||AD2385 "hypothetical protein [Nostoc sp. PCC 7120] gi|25530521|pir||AC1903 hypothetical protein asr0773 . . .

(DESCRIPTIONS-OF *)<4>>

Page 92: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

BioLingua

:: "List of length 79 suppressed"

(DEFINE RR-class AS (GENES-DESCRIBED-BY "response regulator" IN Npun) DISPLAY off)

<5>>

(INTERSECTION-OF (HK-adjacent RR-class)) <10>>

(DEFINE HK-class AS (GENES-DESCRIBED-BY "histidine kinase" IN Npun) DISPLAY off)

<6>>

:: "List of length 89 suppressed"

(DEFINE HK-upstream AS (GENES-UPSTREAM-OF HK-class) DISPLAY off)

<7>>

:: "List of length 89 suppressed"

(DEFINE HK-downstream AS (GENES-DOWNSTREAM-OF HK-class) DISPLAY off)

<8>>

:: "List of length 89 suppressed"

(DEFINE HK-adjacent AS (UNION-OF (HK-upstream HK-downstream)) DISPLAY off)

<9>>

:: "List of length 178 suppressed"

Page 93: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

BioLingua:: 22 elements in INTERSECTION> (#$Npun.pNpBF205 #$Npun.pNpBF139 #$Npun.NpR6228 #$Npun.NpR5316 #$Npun.NpF4214 #$Npun.NpF3676 #$Npun.NpF3084 #$Npun.NpR3053 #$Npun.NpR1779 #$Npun.NpR0589 #$Npun.NpF0304 #$Npun.NpR1109 #$Npun.NpF1278 #$Npun.NpF1776 #$Npun.NpF1800 #$Npun.NpR2420 #$Npun.NpR2902 #$Npun.NpR3197 #$Npun.NpR4503 #$Npun.NpF5763 #$Npun.NpF6363 #$Npun.pNpBF146)

(INTERSECTION-OF (HK-adjacent RR-class))<10>>

(DEFINE RR-candidates AS (SET-DIFFERENCE RR-class (RESULT 10)) DISPLAY off)

<11>>

:: "List of length 57 suppressed"

<12>>
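The set logic on this slide (union of the upstream and downstream neighbors, intersection with the response-regulator class, then a set difference) can be sketched in Python. This is only an illustration of the reasoning, not BioLingua itself; the gene names below are hypothetical placeholders rather than real Npun identifiers.

```python
# Toy gene sets. All identifiers are hypothetical placeholders.
rr_class = {"rr1", "rr2", "rr3", "rr4"}   # genes described as response regulators
hk_upstream = {"rr1", "x1"}               # genes immediately upstream of a histidine kinase
hk_downstream = {"rr3", "x2"}             # genes immediately downstream of a histidine kinase

# Genes adjacent to any histidine kinase (union of both neighbor sets).
hk_adjacent = hk_upstream | hk_downstream

# Response regulators that sit next to a histidine kinase.
paired_rr = hk_adjacent & rr_class

# Response regulators with no neighboring histidine kinase: the candidates.
rr_candidates = rr_class - paired_rr

print(sorted(paired_rr))
print(sorted(rr_candidates))
```

In the session itself the same arithmetic holds: 79 response regulators minus the 22 found next to a histidine kinase leaves the 57 candidates.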

Page 94: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

Histidine Kinase

NpR3010

Nostoc punctiforme

Genes Functionally Related to His Kinase

Anabaena PCC 7120

Trichodesmium

Synechocystis PCC 6803

. . . (13 total) Find similar genes

Conserved

Page 95: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

BioLingua:: 22 elements in INTERSECTION> (#$Npun.pNpBF205 #$Npun.pNpBF139 #$Npun.NpR6228 #$Npun.NpR5316 #$Npun.NpF4214 #$Npun.NpF3676 #$Npun.NpF3084 #$Npun.NpR3053 #$Npun.NpR1779 #$Npun.NpR0589 #$Npun.NpF0304 #$Npun.NpR1109 #$Npun.NpF1278 #$Npun.NpF1776 #$Npun.NpF1800 #$Npun.NpR2420 #$Npun.NpR2902 #$Npun.NpR3197 #$Npun.NpR4503 #$Npun.NpF5763 #$Npun.NpF6363 #$Npun.pNpBF146)

(INTERSECTION-OF (HK-adjacent RR-class))<10>>

(DEFINE RR-candidates AS (SET-DIFFERENCE RR-class (RESULT 10)) DISPLAY off)

<11>>

:: "List of length 57 suppressed"

(CONTEXT-OF NpF0304)<12>>

(ALL-ORTHOLOGS-OF *)<13>>

:: (<- #$Npun.NpR0302 potassium-dependent ATPase sub) 523 (-> #$Npun.NpF0303 two-component sensor histidine) 85 (-> #$Npun.NpF0304 two-component response regulat) 473 (-> #$Npun.NpF0305 hypothetical protein glr0895 [) 85 (<- #$Npun.NpR0306 primosomal protein N' [Nostoc ) > (#$Npun.NpR0302 #$Npun.NpF0303 #$Npun.NpF0304 #$Npun.NpF0305 #$Npun.NpR0306)

Page 96: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

BioLingua

:: ((#$S7942.sef0159 #$Npun.NpR0302 #$Gvi.glr0573 #$A29413.Av?3368 #$A7120.all3154) (#$S6803.sll1590 #$Npun.NpF0303 #$Gvi.gll0572 #$A29413.Av?1247 #$A7120.alr3155) (#$S6803.sll1592 #$P9313.PMT1405 #$Npun.NpF0304 #$Gvi.gll0571 #$A29413.Av?1248 #$A7120.alr3156) (#$Tery.Te?7017 #$Npun.NpF0305 #$Cwat.Cw?3050) (#$Tery.Te?2243 #$TeBP1.tll0415 #$S6803.sll0270 #$S8102.SynW1782 #$S7942.sef1895 #$PRO1375.Pro0497 #$P9313.PMT1271 #$PMED4.PMM0497 #$Npun.NpR0306 #$Gvi.gll0025 #$Cwat.Cw?3016 #$A29413.Av?5206 #$A7120.all4248))

(ALL-ORTHOLOGS-OF *)<13>>

<14>>

(CONTEXT-OF NpF0304)<12>>:: (<- #$Npun.NpR0302 potassium-dependent ATPase sub) 523 (-> #$Npun.NpF0303 two-component sensor histidine) 85 (-> #$Npun.NpF0304 two-component response regulat) 473 (-> #$Npun.NpF0305 hypothetical protein glr0895 [) 85 (<- #$Npun.NpR0306 primosomal protein N' [Nostoc ) > (#$Npun.NpR0302 #$Npun.NpF0303 #$Npun.NpF0304 #$Npun.NpF0305 #$Npun.NpR0306)

Page 97: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

A new family of proteins?!

A type of transposase?

transposase

TRANSPOSON

Is NpR3008 a transposase?

Page 98: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

BioLingua

::     Query    Q-Start  Q-End  Subject            S-Start  S-End    E-value  %ID
   1.  "Seq 1"  1        2258   #$Npun.chromosome  3706846  3704589  0.0      100.0
   2.  "Seq 1"  293      1511   #$Npun.chromosome  4008429  4009647  0.0      100.0
   3.  "Seq 1"  293      1512   #$Npun.chromosome  7932036  7930817  0.0      99.92
   4.  "Seq 1"  293      1510   #$Npun.chromosome  4228111  4229328  0.0      99.92
   5.  "Seq 1"  293      1510   #$Npun.chromosome  3971285  3972502  0.0      99.92
   6.  "Seq 1"  293      1510   #$Npun.chromosome  4027833  4029050  0.0      99.75
   7.  "Seq 1"  293      1511   #$Npun.chromosome  2121987  2123204  0.0      99.67
   8.  "Seq 1"  293      1510   #$Npun.chromosome  2136737  2135521  0.0      99.67
   9.  "Seq 1"  397      1510   #$Npun.chromosome  2030748  2031861  0.0      99.64
  10.  "Seq 1"  1537     2258   #$Npun.pNpB        42015    42737    4.6d-83  80.5
  11.  "Seq 1"  1331     1420   #$Npun.chromosome  8036134  8036045  1.8d-8   83.33
  12.  "Seq 1"  1319     1385   #$Npun.chromosome  5915424  5915358  2.7d-4   83.58
  13.  "Seq 1"  1319     1385   #$Npun.chromosome  2577387  2577453  2.7d-4   83.58
> (#$Temp27 #$Temp28 #$Temp29 #$Temp30 #$Temp31 #$Temp32 #$Temp33 #$Temp34 #$Temp35 #$Temp36 #$Temp37 #$Temp38 #$Temp39)

(BLAST extended-NpR3008 Npun) <15>>

<16>>

(DEFINE extended-NpR3008 AS (SEQUENCE-OF NpR3008 FROM -700 TO-END +700) DISPLAY off)

<14>>

:: "Results suppressed"

Page 99: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

BioLingua

:: Query Q-Start Q-End Subject S-Start S-End E-value %ID 1. "Seq 1" 1 2258 #$Npun.chromosome 3706846 3704589 0.0 100.0 2. "Seq 1" 293 1511 #$Npun.chromosome 4008429 4009647 0.0 100.0 . . .

(BLAST extended-NpR3008 Npun) <15>>

<16>>

(DEFINE extended-NpR3008 AS (SEQUENCE-OF NpR3008 FROM -700 TO-END +700) DISPLAY off)

<14>>

:: "Results suppressed"

(FOR-EACH hit IN * AS (subj S-start) = (GET-ELEMENTS (subject Subject-start) FROM hit) AS start = (- S-start 15) AS end = (+ S-start 40) AS left-end = (SEQUENCE-OF subj FROM start TO end) COLLECT left-end)

Page 100: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

BioLingua

:: Query Q-Start Q-End Subject S-Start S-End E-value %ID 1. "Seq 1" 1 2258 #$Npun.chromosome 3706846 3704589 0.0 100.0 2. "Seq 1" 293 1511 #$Npun.chromosome 4008429 4009647 0.0 100.0 . . .

(BLAST extended-NpR3008 Npun) <15>>

<16>>

(DEFINE extended-NpR3008 AS (SEQUENCE-OF NpR3008 FROM -700 TO-END +700) DISPLAY off)

<14>>

:: "Results suppressed"

(FOR-EACH hit IN * AS (subj S-start) = (GET-ELEMENTS (subject Subject-start) FROM hit) AS start = (- S-start 15) AS end = (+ S-start 40) AS left-end = (SEQUENCE-OF subj FROM start TO end) COLLECT left-end)

:: > ("TACGCTCTATCTTCAGCAAGTTGTTTTTCTTGCTGTATAATTCGGCGATTCTCTTC" "AAAGAAACGCTAGAGGGGTGCATCCCAGTTTTTATTATTCCAAAACAAATAAATAA" "AAACTGGGATGCACCCCTTATTAATGCTCTTTGGAGTCAATACTAATTTTGCCAAA" "TACCTTTGTGATAGGGGGTGCATCCCAGTTTTTATTATTCCAAAACAAATAAATAA" "AAATTAGTTTATTATGGGTGCATCCCAGTTTTTATTATTCCAAAACAAATAAATAA" "CACCGATTCACTAATGGGTGCATCCCAGTTTTTATTATTCCAAAACAAATAAATAA" "ACTATTGTAGAGACTGGGTGCATCCCAGTTTTTATTATTCCAAAACAAATAAATAA" . . .
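The FOR-EACH loop above takes, for every BLAST hit, the stretch from 15 nucleotides before the subject start to 40 nucleotides after it, which gives the 56-nucleotide strings shown. A minimal Python sketch of that windowing follows. It assumes 1-based inclusive coordinates and a sequence held as a plain string; the function name `left_end` and the toy replicon are hypothetical, and hits on the reverse strand (S-Start greater than S-End) get no special handling here.

```python
def left_end(sequence: str, s_start: int, before: int = 15, after: int = 40) -> str:
    """Return the window s_start - before .. s_start + after (1-based, inclusive).

    The window is clipped to the ends of the sequence.
    """
    start = max(1, s_start - before)
    end = min(len(sequence), s_start + after)
    # Convert 1-based inclusive coordinates to a Python slice.
    return sequence[start - 1:end]


# Toy data: a made-up 200-nt replicon and two made-up subject start positions.
toy_replicon = "ACGT" * 50
hit_starts = [60, 120]

windows = [left_end(toy_replicon, s) for s in hit_starts]
for w in windows:
    print(len(w), w)
```

Collecting the windows into one list mirrors the COLLECT clause, so the result can be handed directly to an alignment step.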

Page 101: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

BioLingua

(ALIGNMENT-OF * LINE-LENGTH 60 SEGMENT-LENGTH 60) <17>>

::
Seq 4     1 TACCTTTGT-GATAGGGGGTGCATCCCAGTTTTTATTAT--TCCAAAACAAATAAATAA---------------
Seq 7     1 -ACTATTGTAGAGACTGGGTGCATCCCAGTTTTTATTAT--TCCAAAACAAATAAATAA---------------
Seq 2     1 -AAAGAAACGCTAGAGGGGTGCATCCCAGTTTTTATTAT--TCCAAAACAAATAAATAA---------------
Seq 5     1 AAATTAGTTTATTA-TGGGTGCATCCCAGTTTTTATTAT--TCCAAAACAAATAAATAA---------------
Seq 6     1 -CACCGATTCACTAATGGGTGCATCCCAGTTTTTATTAT--TCCAAAACAAATAAATAA---------------
Seq 8     1 ----------AAACTGGGATGCA-CCCAGTCTCTACAATAGTTCTAGA-GAACACATAACGTAAATAC------
Seq 3     1 ----------AAACTGGGATGCACCCC--TTATTAATGCTCTTTGGAGTCAATAC-TAATTTTGCCAAA-----
Seq 9     1 -----------CATTGTCGCCCCTTGAAGTCATCAAGAC-----TAGGTGTATCAATGACTCCTGAAGAAGA--
Seq 12    1 ------------------GTTCAGCTTGGTAATAGCTGTAGTTAATAATGCGAGAGCGATGTTTTTCGAGATAA
Seq 1     1 ---------TACGCTCTATCTTCAGCAAGTTGTTTTTCT--TGCTGTATAATTCGGCGATTCTCTTC-------
Seq 10    1 --------------GGTCGGGAAATTGCGAGATTATTCAGTGGCGAAGTAGTGGGAGAACTACCATTGAT----
Seq 11    1 ------------TTGAACAAATTTGTTCGTGGAAATGGTAATTGGAAATTTGCTGCGGAATGCGGTGA------
Seq 13    1 ------------ATTATTAACTACAGCTATTACCAAGCTGAACAACTGTGTTCTATTGGTTCTGGTTC------
consensus 1

Page 102: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

Genetic Basis of Differentiation

NH3 N2

NH3

Nostoc + Anabaena

Not Synechocystis, Trichodesmium,…

Page 103: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

BioLingua

(DEFINE diff-cb AS (Npun Avar A7120) DISPLAY off)<18>>

:: "List of length 3 suppressed"

(DEFINE non-diff-cb AS (REMOVE-FROM-SET *loaded-organisms* diff-cb) DISPLAY off)

<19>>

:: "List of length 10 suppressed"

(DEFINE diff-cb-specific AS (COMMON-ORTHOLOGS-OF diff-cb NOT-IN non-diff-cb) DISPLAY off)

<20>>

:: "List of length 661 suppressed"
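The last query asks for orthologs shared by all three differentiating cyanobacteria and absent from every other loaded organism. One way to read that filter is sketched below in Python. It is an interpretation, not the actual behavior of COMMON-ORTHOLOGS-OF: ortholog groups are assumed to be available as a mapping from a group label to the set of organisms represented in it, and the group labels, the toy membership, and the function name `common_orthologs` are all hypothetical.

```python
def common_orthologs(groups, required, excluded):
    """Return labels of groups present in every required organism and no excluded one."""
    required, excluded = set(required), set(excluded)
    return [
        label
        for label, organisms in groups.items()
        if required <= organisms and not (organisms & excluded)
    ]


# Organism nicknames follow the slides; the loaded-organism list is shortened for the example.
diff_cb = ["Npun", "Avar", "A7120"]
loaded_organisms = diff_cb + ["S6803", "Tery"]
non_diff_cb = [org for org in loaded_organisms if org not in diff_cb]

# Hypothetical ortholog groups and their organism membership.
toy_groups = {
    "og1": {"Npun", "Avar", "A7120"},            # only in differentiating strains
    "og2": {"Npun", "Avar", "A7120", "S6803"},   # also in a non-differentiating strain
    "og3": {"Npun", "A7120"},                    # missing from Avar
}

diff_cb_specific = common_orthologs(toy_groups, diff_cb, non_diff_cb)
print(diff_cb_specific)
```

Under this reading, only groups like `og1` would survive, which is the kind of gene set the slide reports as 661 entries long.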

Page 104: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

BioLingua

• Provides knowledge in accessible form

• Provides tools accessed in common way

• Provides results that can be manipulated

• Provides a programming language that speaks to biologists

Page 105: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

The Death of Science

Page 106: Lives of the Scientist Genetic Basis of Differentiation Events in time and space
Page 107: Lives of the Scientist Genetic Basis of Differentiation Events in time and space

Credits

West Coast

- Jeff Shrager - JP Massar - Mike Travers

VCU

- Austin Hess - James Mastros - Sarah Cousins - Yue Zhao

BioLingua: http://ramsites.net/~biolingua/help

Jeff Elhai: Center for the Study of Biological Complexity Virginia Commonwealth University

Phone: 828-0794 E-mail: [email protected]