Proteomics, Systems Biology and Knowledge-Mining in Drug and Biomarker Discovery Mark Boguski December 5, 2007

Proteomics, Systems Biology and Knowledge-Mining in Drug and Biomarker Discoverymarkboguski.net/.../Proteomics_SysBiol_Biomarkers_2007.pdf · 2017. 9. 10. · Drug Targets, Biomarkers


Proteomics, Systems Biology and Knowledge-Mining in Drug and Biomarker Discovery

Mark Boguski

December 5, 2007


PAM-250


The practice of “proteomics” extends back to the early 1970s

1925-1983

Dayhoff, M.O. and Eck, R.V. (1970) MASSPEC: a computer program for complete sequence analysis of large proteins from mass spectrometry data of a single sample. Computers in Biology and Medicine 1:5-28


The practice of “proteomics” extends back to the early 1970s

1925-1983

Dayhoff, M.O. and Eck, R.V. (1970) MASSPEC: a computer program for complete sequence analysis of large proteins from mass spectrometry data of a single sample. Computers in Biology and Medicine 1:5-28

“This new method should make feasible various experiments, for example, with single organisms.”


SCIENTIFIC AMERICAN


System Biology: What are the deliverables?

Conceptual model

Kholodenko et al. Targets of EGFR in Tumor Cells JBC 274:30169, 1999

Mathematical model

Computer model (code excerpt shown on the slide, truncated):

    function z=power(x,y),z=x^y,endfunction
    function z=root(x,y),z=y^(1/x),endfunction
    function xdot=f(t,x)
    // compartment_compartment id: compartment
    compartment_compartment=3e-12;
    // k1f_v1 id: k1f reactionID: v1
    k1f_v1=0.003;

SB Markup Language (excerpt shown on the slide, truncated):

    <?xml version="1.0" encoding="UTF-8" ?>
    <sbml xmlns="http://www.sbml.org/sbml/level2" metaid="metaid_0000001" level="2" version="1">
    <model metaid="metaid_0000002" name="Kholodenko1999_EGFRsignaling">
    <notes>
    <body xmlns="http://www.w3.org/1999/xhtml">
    <p align="left">
    <font face="Arial, Helvetica, sans-serif

Validation Data
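The step from a mathematical model to a computer model can be illustrated with a minimal Python sketch. It is not the Kholodenko model itself: it assumes a single reversible binding reaction (ligand + receptor <-> complex) with mass-action kinetics. Only the forward rate constant 0.003 comes from the slide (k1f_v1); the reverse rate constant, the initial amounts, the step size and all function names are illustrative assumptions, and units are left arbitrary.

```python
def derivatives(state, k1f, k1r):
    """Mass-action rates for the reversible binding L + R <-> C."""
    ligand, receptor, complex_ = state
    v1 = k1f * ligand * receptor - k1r * complex_  # net forward flux
    return (-v1, -v1, v1)


def rk4_step(state, dt, k1f, k1r):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    s1 = derivatives(state, k1f, k1r)
    s2 = derivatives(shifted(state, s1, dt / 2), k1f, k1r)
    s3 = derivatives(shifted(state, s2, dt / 2), k1f, k1r)
    s4 = derivatives(shifted(state, s3, dt), k1f, k1r)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, s1, s2, s3, s4)
    )


def simulate(initial_state, k1f, k1r, dt, n_steps):
    """Return the list of states visited, starting with the initial state."""
    trajectory = [tuple(initial_state)]
    for _ in range(n_steps):
        trajectory.append(rk4_step(trajectory[-1], dt, k1f, k1r))
    return trajectory


K1F = 0.003  # forward rate constant, taken from the slide (k1f_v1)
K1R = 0.06   # reverse rate constant: assumed for illustration only

# Initial amounts (ligand, receptor, complex) are assumed for illustration.
trajectory = simulate((680.0, 100.0, 0.0), K1F, K1R, dt=0.01, n_steps=6000)
final_ligand, final_receptor, final_complex = trajectory[-1]
print(final_ligand, final_receptor, final_complex)
```

A fixed-step integrator keeps the sketch dependency-free; a real model with many reactions and stiff kinetics would normally be run with an established ODE solver or an SBML-aware simulator.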


The Process of a Systems Biology Analysis

Butcher et al. (2004) Nature Biotechnology 22, 1253 - 1259


[Figure: time-course plots with panels labeled cyto and nuc under control and drug conditions; x-axis: time]

“omics” Experiments

Mathematical Models

Monitoring and modeling network activity via proteomics measurements


Reverse Protein Arrays: probing pathways with antibodies

Diagram labels: Crude cell lysates; Sample dilution; Spotting; Exp. A, Exp. B, Exp. C, Exp. D; Ab1, Ab2, Ab3, Ab4, Ab5, Ab6; Assay; Analysis

Key features of arrays:

• Scalable in terms of samples and analytes

• Small sample volumes (0.5 nl per spot)

• Large degree of automation throughout the process


Reverse arrays: sample requirements & capacity

1 sample

4 dilutions in duplicate
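With four dilutions spotted in duplicate, each sample contributes eight spots per array. One common way to reduce such a series to a single number is to average the duplicates and fit a straight line of signal against relative concentration. The Python sketch below shows that idea only; it is not necessarily the analysis used in the talk. The dilution factors, the signal values and the function names are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def summarize_sample(spots):
    """Average duplicate spots per dilution, then fit signal vs. concentration.

    spots maps a relative concentration to a (replicate 1, replicate 2) pair.
    Returns (slope, intercept); the slope serves as a relative abundance score.
    """
    concentrations = sorted(spots)
    mean_signals = [mean(spots[c]) for c in concentrations]
    return fit_line(concentrations, mean_signals)


# Hypothetical fluorescence readings for one sample: 4 dilutions x 2 replicates.
example_spots = {
    1.0: (1000.0, 1040.0),
    0.75: (760.0, 780.0),
    0.5: (500.0, 540.0),
    0.25: (250.0, 290.0),
}

slope, intercept = summarize_sample(example_spots)
print(f"slope={slope:.1f} intercept={intercept:.1f}")
```

Using the slope rather than a single spot intensity makes the estimate less sensitive to one bad spot, which is one reason a dilution series is spotted at all.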


Level of detection: How much protein is needed?


Planar Waveguide Principle - for High Sensitivity Fluorescence Microarray Detection

free label

Imaging of surface-confined fluorescence

excitation of bound label

CCD camera

microarray on chip


Kinetics of ERK phosphorylation in T-cell signaling


Cell lines, disease models

- Cmpd (or RNAi) / + Cmpd (or RNAi) … incubate …

Cell lysate

“Reverse” arrays

Antibodies to Key nodes in pathways

Disease “signature” & Systems Response Profiles

Proteome & Pathway Annotations

Pathway Model

[Figure: pathway model diagram. Node labels include Growth Factor Receptors, IRS1, IP3K, PDK, PKB, TSC1, TSC2, Rheb, 14-3-3, LKB1, MO25, STRAD, AMPKα (P172), AMPK β, γ, Branch Chain Amino Acids, mTOR (S2448, T2446), Raptor, Rictor, GβL, FKBP12, p70S6K, 4E-BP1, eIF4E, eIF3, eIF4B, Translation initiation, VEGF, HIF-1α, Rho, PKCα, Ras, Raf, Mek 1/2, Erk 1/2; “P” labels mark phosphorylation sites]


Customized Signaling Network Database

Pathway nodes where antibodies are available

(Only for illustration)

Mouse-over shows Ab specificity

Hyperlinks to Ab web Reports


The Process of a Systems Biology Analysis

Butcher et al. (2004) Nature Biotechnology 22, 1253 - 1259


Curating the literature


Seventeen Years of Growth: NCBI Data and User Services

[Figure: GenBank Base Pairs (millions, axis 0 to 1,800,000) and Users per Weekday (axis 0 to 80,000), plotted by year from 1989 to 2006]

Growth of NCBI Data and User Services


Growth of NCBI Data and User Services

[Figure: GenBank Base Pairs (millions, axis 0 to 1,800,000) and Users per Weekday (axis 0 to 80,000), plotted by year from 1989 to 2006 and annotated with the names of NCBI data and user services, listed below in the order they appear on the slide]

BLAST, Entrez

GenBank at NCBI, dbEST

3D Structure, Network Entrez

WWW, dbSTS

BankIt, Genomes, Taxonomy

OMIM, GeneMap, Cn3D, UniGene

PubMed, PSI-BLAST, VAST, ePCR

Microbial Genomes, PHI-BLAST, CGAP

Human Genome, LinkOut, LocusLink, RefSeq, dbSNP

PubMed Central, BLINK, MapViewer, GEO, GeneRIFs

dbMHC, BookShelf, Human Genome-Transcripts alignmts

WGS, HLA Haplotypes, Human Genome - TPA

Entrez Genes, Mouse Composite Genome

Gnomon

PubChem, Trace Archive, CCDS, Cancer Chromosomes, Environmental Samples

Public Access, Influenza Sequences, GenSAT, GeneTests, Whole Genome Assoc


Growth of Sequence Data and Shift in Data Types

GenBank: 191 Million Bases

ESTs: 966 Million Bases

High Throughput Genomes: 12 Billion Bases

Shotgun Sequencing: 142 Billion Bases (not including 1.1 Trillion Trace Bases)

[Figure: charts for 1994, 1997, 2001 and 2006 showing the mix of data types: Standard GenBank Entries, EST, HTG, Whole Genome Sequences]


Drug Targets, Biomarkers

Computed associations

Filtering and ranking

Evaluation by human experts

MEDLINE/Embase 6,000,000 articles

Human Genome 24,000 genes

Disease Area ~10-100 disease entities

CAS/IDDB 10,000 Patents

Text-mining adapted for Biomedical Discovery

Feedback loop with different filters and/or ranking criteria

User interfaces
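The "computed associations" and "filtering and ranking" stages named on this slide can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the system described in the talk: it counts how often a gene term and a disease term occur in the same abstract, using naive case-insensitive substring matching, then keeps and ranks the pairs that pass a count threshold. The function names, the threshold and the example abstracts are hypothetical.

```python
from collections import Counter
from itertools import product


def compute_associations(abstracts, genes, diseases):
    """Count abstracts in which a gene term and a disease term co-occur."""
    counts = Counter()
    for text in abstracts:
        lowered = text.lower()
        found_genes = [g for g in genes if g.lower() in lowered]
        found_diseases = [d for d in diseases if d.lower() in lowered]
        counts.update(product(found_genes, found_diseases))
    return counts


def filter_and_rank(counts, min_count=2):
    """Drop pairs seen fewer than min_count times; rank the rest by count."""
    kept = [(pair, n) for pair, n in counts.items() if n >= min_count]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Hypothetical abstracts, invented for this example.
example_abstracts = [
    "EGFR signaling in lung cancer",
    "EGFR mutations and lung cancer",
    "VEGF expression in lung cancer",
]
example_counts = compute_associations(
    example_abstracts, genes=["EGFR", "VEGF", "mTOR"], diseases=["lung cancer"]
)
for pair, n in filter_and_rank(example_counts, min_count=1):
    print(pair, n)
```

The feedback loop on the slide corresponds to rerunning `filter_and_rank` with a different threshold or sort key after experts have reviewed the output. At the scale of millions of articles, raw co-occurrence counts would need proper entity recognition and a statistical score in place of substring matching.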


Drug Targets, Biomarkers

Computed associations

Filtering and ranking

Evaluation by human experts

Feedback loop with different filters and/or ranking criteria

MEDLINE/Embase 6,000,000 articles

Human Genome 24,000 genes

Disease Area ~10-100 disease entities

CAS/IDDB 10,000 Patents

Text-mining adapted for Biomedical Discovery

Ultralinks

User interfaces


The Ultralink – an expert system for contextual hyperlinking in knowledge management

OR

Beyond Google® and PubMed


Connecting Knowledge Corpora

• Indexing of large heterogeneous data collections (databases, full texts) to enable semantic expansion

• Information retrieval and extraction, entity recognition, semantic enrichment

• Knowledge Map (for navigating the conceptual network)

• Terminology Hub (thesauri and ontologies)

• Ontology-associated rules


Examples of entities that constitute our terminologies

• Chemical entities – IUPAC names, trivial names, trade names, compound codes…

• Biological entities – targets, genes/protein, receptors, ligands, modes and mechanisms of actions…

• Diseases, Indications, Side Effects, Contraindications

• Institutions, Affiliations, People

• Geographic locations


The Ultralink can be called from Internet Explorer

Internet Explorer Integration Plug in

Web Page; Tagged Document

1 User requests for analysis

2 Sends the document for analysis

3 Gets back tagged parts

4 Injection of specific HTML tags

GPS Lexical Analysis Server Tools: Lexical Extract, Zoning, Tagging, DocStructures, Meta-Rules, Terminology

Web Service (WSDL)


Annotations of records in PubMed

Activate UltraLink


Annotations of any web page


“Mouse-over”

“Click”

Color coding according to concept type, e.g., Yellow = Gene Name; Tan = Institution
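The tagging and color-coding idea in these slides can be sketched as a small dictionary-based tagger in Python. This is a local stand-in under stated assumptions, not the Ultralink implementation, which the slides describe as a plug-in calling a lexical analysis web service. The terminology entries, the `tag_text` function and the HTML attributes are hypothetical; only the yellow/tan color mapping comes from the slide.

```python
import html
import re

# Color per concept type, following the slide: Yellow = Gene Name, Tan = Institution.
CONCEPT_COLORS = {"gene": "yellow", "institution": "tan"}

# Hypothetical terminology; a real system would draw on thesauri and ontologies.
TERMINOLOGY = {"mTOR": "gene", "VEGF": "gene", "NCBI": "institution"}


def tag_text(text, terminology=TERMINOLOGY, colors=CONCEPT_COLORS):
    """Wrap each recognized term in a colored <span>; escape everything else."""
    if not terminology:
        return html.escape(text)
    # Longest terms first so that a longer name wins over one of its prefixes.
    terms = sorted(terminology, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    pieces = []
    last = 0
    for match in pattern.finditer(text):
        pieces.append(html.escape(text[last:match.start()]))
        term = match.group(1)
        concept = terminology[term]
        pieces.append(
            '<span class="{}" style="background-color: {}">{}</span>'.format(
                concept, colors[concept], html.escape(term)
            )
        )
        last = match.end()
    pieces.append(html.escape(text[last:]))
    return "".join(pieces)


print(tag_text("VEGF data at NCBI"))
```

Matching here is exact and case-sensitive. Handling synonyms, ambiguous abbreviations and document zones is the job of the server-side tools named on the integration slide and is outside the scope of this sketch.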


BLAST Interface


More Ultralink Examples


Ultralink technology is integrated with MS Office

Internet Explorer Integration Plug in

Office Document; Tagged Document

1 User requests for analysis

2 Sends the document for analysis

3 Gets back tagged parts

4 Injection of specific HTML tags

GPS Lexical Analysis Server Tools: Lexical Extract, Zoning, Tagging, DocStructures, Meta-Rules, Terminology

Web Service (WSDL)


Annotations on full text in a Word document


Conclusions (for Microsoft)

• Biology is inherently messy, phenomenological and contingent

• Much progress has been made in information integration, much less in standardization

• Software tools and database systems will still have to deal with the complexities and multiple dimensions of biological systems for the foreseeable future


Modeling

Omics; Complex cell systems assays

Scale: Molecules, Pathways, Cells, Tissues, Humans

Meters: 10^-9 10^-8 10^-7 10^-6 10^-5 10^-4 10^-3 10^-2 10^-1 1

Seconds: 10^-6 10^2 10^4 10^5 10^8

Adapted from: Butcher, E.C. et al. (2004) Systems biology in drug discovery. Nature Biotechnology 22, 1253 - 1259

Should we be optimistic?


“... [models], insofar as they represent informational patterns abstracted from their instantiation in a biological substrate, can never fully capture the embodied actuality, unless they are as prolix and noisy as the body itself.”

N.K. Hayles, How We Became Posthuman: Virtual Bodies in Cybernetics, Literature and Informatics (Univ. of Chicago Press, 1999)


Yes: Systems Biology approaches will lead to more rigorous definitions of biological entities and relationships and to better organization of our knowledge.
