Integrated Computational Approach for Translational Biomedical Research
Seungchan Kim, Ph.D.
CSE, Arizona State University and
MDTV/GenSIP, Translational Genomics Research Institute
AI @ ASU Lunch Bunch, Oct. 25, 2005
BY 510
AI@ASU, BY510, Oct. 25, 2005
Biomedical Problems
• Can we recognize disease subtypes?
• Can we identify molecular markers for a certain type of disease?
• Can we learn the regulatory mechanism governing cellular phenotype, i.e. disease?
• Can we find a new therapeutic target for the treatment of disease?
• Etc.
Cells: Basic Features
• All living things are made of cells.
• All cells share the same machinery for their most basic functions.
• All cells store their hereditary information in the same linear chemical code, stored in a double-stranded molecule, deoxyribonucleic acid (DNA).
• All cells replicate their hereditary information by templated polymerization.
Cells: Basic Features
• All cells transcribe portions of their hereditary information into single-stranded molecules known as ribonucleic acids (RNA).
• All cells translate RNA into protein (long polymer chains) in the same way.
• All cells use proteins to catalyze most chemical reactions.
• All cells function as biochemical factories dealing with the same basic molecular building blocks.
Prokaryotic v. Eukaryotic
• Living organisms can be classified on the basis of cell structure into two groups:
– Eukaryotes (plants, fungi, and animals)
– Prokaryotes (bacteria)
• Eukaryotes keep their DNA in a distinct membrane-bounded intracellular compartment called the nucleus.
• Prokaryotes have no distinct nuclear compartment to house their DNA.
A Typical Prokaryotic Cell
© Garland Science, Molecular Biology of The Cell, 4th Edition
A Typical Eukaryotic Cell
© Garland Science, Molecular Biology of The Cell, 4th Edition
A “Simplified” Cell
• The membrane is the lipid bilayer and associated proteins that encloses all cells.
• The nucleus is a prominent membrane-bounded organelle in a eukaryotic cell, containing DNA organized into chromosomes.
• The nuclear envelope is a double membrane surrounding the nucleus. It consists of an outer and inner membrane and is perforated by nuclear pores.
• The chromatin is the complex of DNA and various proteins that is found in the nucleus of a eukaryotic cell. It is the material that chromosomes are made of.
• The cytoplasm is the contents of the cell that are contained within its plasma membrane but, in the case of eukaryotic cells, outside the nucleus.
• The ribosomes are particles composed of ribosomal RNAs and ribosomal proteins that associate with messenger RNAs and catalyze the synthesis of protein.
[Figure labels: nucleus, chromatin, ribosomes, membrane, nuclear envelope]
DNA and its Building Blocks
• DNA is made from simple subunits, called nucleotides, each consisting of a sugar-phosphate molecule with a nitrogen-containing side-group, or base, attached to it.
• The bases are of four types:
– Adenine (A)
– Guanine (G)
– Cytosine (C)
– Thymine (T)
© Garland Science, Molecular Biology of The Cell, 4th Edition
DNA and its Building Blocks
• A single strand of DNA consists of nucleotides joined together by sugar-phosphate linkages.
• The individual sugar-phosphate units are asymmetric, giving the backbone of the strand a definite directionality or polarity.
• This directionality guides the molecular processes by which the information in DNA is interpreted and copied in cells.
© Garland Science, Molecular Biology of The Cell, 4th Edition
DNA and its Building Blocks
• Through templated polymerization, the sequence of nucleotides in an existing DNA strand controls the sequence in which nucleotides are joined together in a new DNA strand.
• Rules: {A ↔ T} | {C ↔ G}
• The new strand has a nucleotide sequence that is complementary to that of the old strand, and a backbone with opposite directionality.
© Garland Science, Molecular Biology of The Cell, 4th Edition
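The pairing rules and the opposite directionality of the new strand can be illustrated with a few lines of Python. This is only a sketch: the function name and the example sequence are made up for illustration, and both strands are assumed to be written in the conventional 5'→3' direction.

```python
# Pairing rules from the slide: A pairs with T, C pairs with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(template: str) -> str:
    """Return the strand templated by `template`, both written 5'->3'."""
    # Walk the template backwards because the new strand has the
    # opposite directionality, and pair each base as we go.
    return "".join(PAIRING[base] for base in reversed(template.upper()))


if __name__ == "__main__":
    old_strand = "ATGCCG"  # hypothetical example sequence
    new_strand = complementary_strand(old_strand)
    print(old_strand, "->", new_strand)
    # Copying the new strand again recovers the original sequence.
    print(complementary_strand(new_strand) == old_strand)
```

Applying the function twice returns the original sequence, which is the property that lets either strand serve as a template for the other.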
DNA and its Building Blocks
• A normal DNA molecule consists of two complementary strands.
• The nucleotides within each strand are linked by strong (covalent) chemical bonds.
• The complementary nucleotides on opposing strands are held together more weakly, by hydrogen bonds.
© Garland Science, Molecular Biology of The Cell, 4th Edition
DNA and its Building Blocks
• The two strands twist around each other to form a double helix.
• This is a robust structure that can accommodate any sequence of nucleotides without altering its basic structure.
© Garland Science, Molecular Biology of The Cell, 4th Edition
DNA Replication
• During the process of DNA replication, the two strands of the DNA double helix are pulled apart.
• Each strand serves as a template for synthesis of a new complementary strand by means of templated polymerization.
© Garland Science, Molecular Biology of The Cell, 4th Edition
DNA Transcription
• Each cell contains a fixed set of DNA molecules.
• A given segment of DNA serves to guide the synthesis of many identical RNA transcripts.
• These transcripts serve as working copies of the information stored in the DNA archive.
• Many different sets of RNA molecules can be made by transcribing selected parts of a long DNA sequence, allowing each cell to use its stored information differently.
© Garland Science, Molecular Biology of The Cell, 4th Edition
DNA Transcription
• All RNA in a cell is made by the process of DNA transcription.
• DNA transcription is similar to DNA replication.
• It produces a single-stranded RNA molecule that is complementary to one strand of DNA.
© Garland Science, Molecular Biology of The Cell, 4th Edition
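Transcription can be sketched the same way as templated DNA synthesis. The one difference, which the slides do not spell out, is that RNA carries uracil (U) where DNA carries thymine (T). The function name and the example sequence below are illustrative assumptions, and both molecules are written 5'→3'.

```python
# Pairing of a DNA template base with the RNA base placed opposite it.
# RNA uses uracil (U) in place of thymine (T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_strand: str) -> str:
    """Return the RNA complementary to a DNA template, both written 5'->3'."""
    # The transcript runs antiparallel to its template, so the template
    # is read in reverse while each base is paired.
    return "".join(DNA_TO_RNA[base] for base in reversed(template_strand.upper()))


if __name__ == "__main__":
    template = "GGCCAT"  # hypothetical template strand
    print(transcribe(template))
```

For the template `GGCCAT` the transcript is `AUGGCC`, which matches the other (non-template) DNA strand `ATGGCC` with U substituted for T.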
Translation
• During translation, the RNA molecules produced from transcription are used to guide the synthesis of molecules of proteins.
• Proteins are long polymer chains formed by stringing together monomeric building blocks (amino acids) drawn from a standard repertoire that is the same for all living cells.
© Garland Science, Molecular Biology of The Cell, 4th Edition
Translation
• There are only four different nucleotides in mRNA and twenty different types of amino acids in a protein.
• Therefore, translation cannot be accounted for by a direct one-to-one correspondence between a nucleotide in RNA and an amino acid in protein.
• The nucleotide sequence in mRNA is read in sets of 3 nucleotides, called codons.
• Each codon corresponds to one amino acid.
• This mapping is determined by rules known as the genetic code.
Genetic Codes
Name            3L    1L   Codons
Alanine         Ala   A    GCA GCC GCG GCU
Arginine        Arg   R    AGA AGG CGA CGC CGG CGU
Asparagine      Asn   N    AAC AAU
Aspartic acid   Asp   D    GAC GAU
Cysteine        Cys   C    UGC UGU
Glutamic acid   Glu   E    GAA GAG
Glutamine       Gln   Q    CAA CAG
Glycine         Gly   G    GGA GGC GGG GGU
Histidine       His   H    CAC CAU
Isoleucine      Ile   I    AUA AUC AUU
Leucine         Leu   L    UUA UUG CUA CUC CUG CUU
Lysine          Lys   K    AAA AAG
Methionine      Met   M    AUG
Phenylalanine   Phe   F    UUC UUU
Proline         Pro   P    CCA CCC CCG CCU
Serine          Ser   S    AGC AGU UCA UCC UCG UCU
Threonine       Thr   T    ACA ACC ACG ACU
Tryptophan      Trp   W    UGG
Tyrosine        Tyr   Y    UAC UAU
Valine          Val   V    GUA GUC GUG GUU
STOP                       UAA UAG UGA
* Only 20 different amino acids + STOP codons
• AUG acts as both the initiation codon and the codon for Methionine
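Reading an mRNA codon by codon can be sketched as follows. The table is built from the standard genetic code using the common compact ordering (bases in the order U, C, A, G; `*` marks a STOP codon). The function name and example sequence are illustrative, and the sketch simply starts at the first AUG it finds, which is a simplification of real initiation.

```python
from itertools import product

# Standard genetic code, one letter per codon, codons enumerated with
# bases in the order U, C, A, G at each of the three positions.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first STOP."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no initiation codon found
    protein = []
    # Step through complete codons only.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("GGAUGGCCAAAUGGUAAGC"))  # hypothetical mRNA
```

The example reads the codons AUG, GCC, AAA, UGG and then stops at UAA, giving the one-letter protein sequence `MAKW`.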
Mechanisms of Translation: Initiation
© Jones and Bartlett Publishers, Essential Genetics: A Genomics Perspective, 3rd Edition
Mechanisms of Translation: Elongation
© Jones and Bartlett Publishers, Essential Genetics: A Genomics Perspective, 3rd Edition
Mechanisms of Translation: Termination
© Jones and Bartlett Publishers, Essential Genetics: A Genomics Perspective, 3rd Edition
From Gene to Protein
© Garland Science, Molecular Biology of The Cell, 4th Edition
Genes and Genome
• The fragment of DNA that corresponds to one protein (by means of transcription and translation) is known as a gene.
• DNA molecules are usually very large, containing thousands of genes, and thus specify thousands of proteins.
• In all cells, the expression of individual genes is regulated: instead of manufacturing a full repertoire of all possible proteins at full tilt all the time, the cell adjusts the rate of transcription and translation of different genes independently, according to need.
• The entire genetic information encoded in an organism is called the genome.
Genotypes and Phenotypes
• The genome of one organism differs from the genome of another, although many similarities may exist.
• The genetic constitution (i.e., the genome) of an organism is called the genotype of that organism.
• The different cell types in a multi-cellular organism differ dramatically in both structure and function.
• This is because different cell types synthesize and accumulate different sets of RNA and protein molecules, without altering their genotype.
• The observable character of a cell or an organism is called the phenotype of that cell or organism.
Systems’ View
• Biology is an informational science
– Systematically perturbing and monitoring biological systems utilizing powerful new high-throughput tools
– Creation of new computational methods for modeling and analysis
– The integration of discovery science (data mining) and hypothesis-driven science (modeling & simulation)
Molecular Circuitry of Cancer
Hahn et al., Nature Reviews Cancer 2 (2002)
Wnt5a Signaling Pathway
A. T. Weeraratna et al., Cancer Cell 1 (2002)
Genome Dynamics
[Diagram: DNA, RNA, and protein linked by transcription and translation, annotated with perturbations and measurements]
• Perturbation: ectopic expression, RNA interference, increased expression, decreased expression
• Measurements
– DNA: reference DNA sequence, sequence variants, gene copy number, CpG methylation
– RNA: RNA abundance, RNA half-life
– Protein: protein interactions, protein modification, protein half-life
– Protein/DNA interactions, protein/RNA interactions
Biological Data
• Genomic data
– Sequences
– SNPs
– Gene expression microarrays
– CGH arrays
• Proteomic data
– MALDI (spectral data)
– Protein arrays
• Clinical data
– Patients
– Drug treatment
• Physiological data
– Diet
– Exercise
Gene Expression Microarrays
• A gene expression microarray measures the transcriptional activity of tens of thousands of genes simultaneously, resulting in individual snapshots of a cell’s transcriptional state at any given time.
• While it reflects one of the central dynamic processes of a biological system, it does not provide an accurate picture of other important dynamic aspects, such as the current levels of protein abundance, or the activation state or modification state of extant proteins.
• To compensate for this, other measurement technologies, e.g. protein abundance and interaction arrays, can be combined with expression data to get a comprehensive transcription, translation, and modification profile.
Single Nucleotide Polymorphisms (SNPs)
• Genome projects: multiple genomic sequences provide a reference estimate of normality.
• Single nucleotide polymorphisms (SNPs), small genetic changes or variations that can occur within a person's DNA sequence, serve as possible markers of aberration from this reference that might indicate a disease cause or a susceptibility to disease.
• Long runs of SNPs also serve to mark haplotypes, groups of closely linked alleles that tend to be inherited together, which can be useful for following specific chromosomal areas inherited by affected individuals in familial genetic studies.
• Several commercial platforms are currently available that survey genomes for SNPs at intervals approaching 20 kb and smaller.
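The idea of a SNP as a single-base deviation from a reference can be shown with a toy comparison. This is a deliberately minimal sketch: it assumes the sample is already aligned to the reference and has the same length, which real SNP platforms and callers do not require, and the function name and sequences are invented for illustration.

```python
def find_single_base_differences(reference: str, sample: str):
    """List (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned and
    of equal length (an assumption of this toy example).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


if __name__ == "__main__":
    reference = "ACGTTGCA"  # hypothetical reference sequence
    sample = "ACGTCGCA"     # hypothetical individual sequence
    print(find_single_base_differences(reference, sample))
```

Here the sample differs from the reference only at position 5, where T is replaced by C.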
Comparative Genomic Hybridization (CGH)
• CGH was first introduced by Kallioniemi et al. (Science, 1992). Its array-based form (aCGH) has proven to be a high-throughput and sensitive genomic screening tool that detects DNA gains and losses with a resolution of 1.0 to 1.5 Mb using BAC arrays.
• CGH data is read as the number of copies of a chromosomal region, and array CGH provides a list of genes and genomic elements that are overrepresented (gain) in the cell when an amplification event occurs or underrepresented (loss) when deletions occur.
• Currently, the application of chip-based technology with highly annotated oligonucleotide DNA targets of 20-mer or 60-mer length permits whole-genome surveys in clinical specimens.
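Reading CGH data as gain or loss can be sketched with a log2 ratio of test signal to reference signal per region. The cutoff of 0.3, the function name, and the region values below are assumptions chosen for illustration, not values from the talk; real analyses normalize and segment the data before calling gains and losses.

```python
import math


def call_copy_number(test_signal: float, reference_signal: float,
                     threshold: float = 0.3) -> str:
    """Label a region 'gain', 'loss', or 'normal' from its log2 ratio.

    `threshold` is an illustrative cutoff, not a recommended value.
    """
    log_ratio = math.log2(test_signal / reference_signal)
    if log_ratio > threshold:
        return "gain"
    if log_ratio < -threshold:
        return "loss"
    return "normal"


if __name__ == "__main__":
    # Hypothetical regions: (test signal, reference signal)
    regions = {"region_1": (3.0, 2.0), "region_2": (1.0, 2.0), "region_3": (2.0, 2.0)}
    for name, (test, reference) in regions.items():
        print(name, call_copy_number(test, reference))
```

With these toy values, three copies against two is called a gain, one copy against two a loss, and equal signal is called normal.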
Computational Systems Biology
[Diagram: overview of the integrated approach, linking four components]
• Data Mining & Pattern Recognition (automated & systematic; algorithmic & computational)
– Biological data: DNA, mRNA/cDNA, CGH, SNP
– Clinical and pathological information: treatment history, age, gender, race, survival, and so on
– Biological context as prior knowledge: biological process, subtype of disease
– Clustering, association studies, integration, pathways discovery
– Candidate biological components: genes, proteins
– Derived biological context: biological process, subtype of disease
– Better diagnostic markers, better drug development, more efficient drug treatment
• Network Modeling & Systems Biology
– Mathematical and computational biological process models: discrete vs. continuous, deterministic vs. stochastic
– Biological process and biological operations, leading to phenotype observation
– In-silico biological process and in-silico biological operations, leading to prediction (hypothetical observation)
– Model refinement
– Perturbation, integration, modeling
– Better treatment strategy, new drug targets
• Measurements
• Knowledge Representation & Mining
– Knowledge: literature (PubMed), clinical chart/report, chemistry (cooperative binding), genomic database, proteomic database
– Text-mining, databasing, knowledge mining
– Computable knowledge: gene-to-gene relationships, gene ontology, chemical database, genomic database, proteomic database
Data Mining & Pattern Recognition
• Unsupervised analysis: exploratory
– Subtype recognition
– Clustering analysis
– Multi-Dimensional Scaling (MDS) plot
– Contextual pattern recognition
• Supervised analysis: discriminatory
– Classification of diseases
– Rank genes according to their impact on minimizing cluster volume and maximizing center-to-center inter-cluster distance
– t-test, SAM, TNoM, SVM, Gene@Work, Strong-feature
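Of the supervised methods listed, the t-test is the simplest to illustrate. The sketch below ranks genes by the absolute value of Welch's t-statistic between two sample classes. The gene names, expression values, and class labels are invented, and the choice of Welch's unequal-variance form is an assumption; the talk does not say which variant was used.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two groups of expression values."""
    standard_error = sqrt(variance(group_a) / len(group_a)
                          + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


def rank_genes(expression, labels):
    """Return (gene, t) pairs sorted by decreasing |t| between class 1 and class 0."""
    scores = []
    for gene, values in expression.items():
        class_0 = [v for v, label in zip(values, labels) if label == 0]
        class_1 = [v for v, label in zip(values, labels) if label == 1]
        scores.append((gene, welch_t(class_1, class_0)))
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 1]  # hypothetical classes, e.g. two disease subtypes
    expression = {               # hypothetical expression values per sample
        "geneA": [1.0, 1.2, 0.9, 3.1, 2.9, 3.3],
        "geneB": [2.0, 2.5, 1.5, 2.1, 2.4, 1.6],
        "geneC": [5.0, 4.0, 4.5, 3.0, 3.6, 3.9],
    }
    for gene, t in rank_genes(expression, labels):
        print(f"{gene}: t = {t:.2f}")
```

Genes whose class means are far apart relative to their within-class spread rise to the top, which is the same intuition as the ranking criterion on the slide (tight clusters, well-separated centers).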
Clustering & MDS: melanoma
RNA interference
RNAi triggered by synthetic siRNA: a powerful new tool for gene knockdowns in mammalian cells
D. Azorsa
RNAi Synthetic Lethal Phenotype Profiling of >10,000 siRNA
Context: BxPC3 pancreatic cancer isogenic cell lines: DPC4 negative vs. DPC4 positive
[Figure: survival scatter plot, color scale from low to high]
Highlighted circles: gene targeting events that preferentially affect the survival of the BxPC3 DPC4/SMAD4 minus cell line
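Picking out the highlighted events amounts to comparing survival between the two isogenic lines for each siRNA. The sketch below flags siRNAs whose survival in the DPC4-minus line is lower than in the DPC4-positive line by at least a chosen margin. The siRNA names, survival fractions, and the 0.3 margin are all hypothetical, and reading "preferentially affect" as "reduce survival" is an assumption; the actual scoring used in the screen is not described in the slides.

```python
def synthetic_lethal_hits(survival, min_difference=0.3):
    """Return siRNAs that reduce survival preferentially in the DPC4-minus line.

    `survival` maps siRNA name -> (survival in DPC4-minus, survival in DPC4-plus),
    each expressed as a fraction of untreated control.
    """
    hits = [
        sirna
        for sirna, (dpc4_minus, dpc4_plus) in survival.items()
        if dpc4_plus - dpc4_minus >= min_difference
    ]
    # Strongest differential effect first (most negative minus-vs-plus difference).
    return sorted(hits, key=lambda s: survival[s][0] - survival[s][1])


if __name__ == "__main__":
    survival = {  # hypothetical screen results
        "siRNA_1": (0.35, 0.90),
        "siRNA_2": (0.88, 0.92),
        "siRNA_3": (0.20, 0.25),
        "siRNA_4": (0.50, 0.95),
    }
    print(synthetic_lethal_hits(survival))
```

Note that `siRNA_3` is not flagged even though it kills most cells: it is toxic in both lines, so it does not depend on DPC4 status.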
Network Modeling and Systems Biology
• Boolean networks
– S. A. Kauffman, 1969
– On/Off representation of the state of genes
– Boolean networks qualitatively capture typical genetic behavior
• Probabilistic Boolean networks
– Shmulevich et al., 2002
– Stochastic extension of Boolean network
• Others
– Differential equations, linear models, Bayesian networks …
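A Boolean network and its probabilistic extension can be sketched on a toy three-gene system. Everything specific here is hypothetical: the genes A, B, C, their update rules, and the 0.7/0.3 selection probabilities are invented for illustration. The deterministic version updates all genes synchronously and follows the trajectory until a state repeats (an attractor); the probabilistic version picks one of two candidate rules for gene C at each step.

```python
import random


def boolean_step(state):
    """Synchronous update of a hypothetical 3-gene network (A, B, C)."""
    a, b, c = state
    return (int(not c),   # A is ON when C is OFF
            a,            # B copies A
            int(a and b)) # C is ON when both A and B are ON


def find_attractor(state, step=boolean_step):
    """Follow the trajectory from `state` and return the cycle it falls into."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return seen[seen.index(state):]


def probabilistic_step(state, rng):
    """One step of a toy probabilistic Boolean network.

    Gene C has two candidate rules: (A and B) chosen with probability 0.7,
    (A or B) chosen with probability 0.3. A and B update as before.
    """
    a, b, c = state
    c_next = int(a and b) if rng.random() < 0.7 else int(a or b)
    return (int(not c), a, c_next)


if __name__ == "__main__":
    print(find_attractor((1, 0, 0)))
    rng = random.Random(0)  # fixed seed so the run is repeatable
    state = (1, 0, 0)
    for _ in range(5):
        state = probabilistic_step(state, rng)
        print(state)
```

Starting from (1, 0, 0), the deterministic network cycles through five states: (1, 0, 0) → (1, 1, 0) → (1, 1, 1) → (0, 1, 1) → (0, 0, 0) and back.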
[Figure: gene labels WNT5a, pirin, S100P, RET1]
Knowledge Repository: GO, GenMAPP, KEGG, PubMed
Knowledge Integration
• Biological database
– Genomic sequence
– Protein
– Biochemical database
• Knowledgebase
– Pathways
– Ontology
– Protein-protein interaction
– Gene-gene interaction
• Knowledge mining
– Literature
– Clinical records
• BioLog
– PubMed literature access logger, archiver, and analyzer
• Text- and context-mining
Knowledge Mining: Extracting Biological Information from Global RNAi Phenotype Data
[Workflow diagram]
• Statistically processed gene list
• Acquire current gene identifiers and information
• Gene ontology analysis, network analysis, canonical pathway analysis
• PathwayAssist™
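The slide does not say how the gene ontology analysis is scored. One common approach, shown here purely as an assumption, is a hypergeometric over-representation test: given a gene list from the screen, how surprising is its overlap with the genes annotated to one ontology term? The function name and the counts in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int,
                       selected: int, overlap: int) -> float:
    """Hypergeometric P(X >= overlap).

    population: number of genes screened
    annotated:  genes in the population annotated to the term
    selected:   size of the statistically processed gene list
    overlap:    genes in the list that carry the annotation
    """
    largest_possible = min(annotated, selected)
    favorable = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, largest_possible + 1)
    )
    return favorable / comb(population, selected)


if __name__ == "__main__":
    # Hypothetical counts: 20 genes screened, 5 annotated to a term,
    # 4 genes in the hit list, 3 of which carry the annotation.
    print(enrichment_p_value(population=20, annotated=5, selected=4, overlap=3))
```

A small p-value suggests the term is over-represented in the list; in practice the test is repeated over many terms, so a multiple-testing correction would also be needed.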
Knowledge Mining: Building Regulatory Networks from Global RNAi Phenotypes
Figure 2. Doxorubicin and Drug Resistance Molecular Interaction Network. Doxorubicin Drug Resistance Pathway
To be continued …