
BioPlanner: A Plan Adaptation Approach for the Discovery of Biological Pathways across Species

Li Jin and Keith S. Decker
Department of Computer & Information Sciences

Carl J. Schmidt
Department of Animal and Food Sciences

Outline

Introduction to Biological Pathways
Discovery of Bio-Pathways
  Prior Work
  Challenge and Our Approach
BioPlanner
  Modeling a Pathway with Planning Formalisms
  Generating Hypotheses across Species
  Evaluating Hypotheses
Preliminary Experimental Results, Conclusions and Future Work

Jin, Schmidt and Decker – IAAI09, 07/16/2009

Biological Pathways

What is a biological pathway?
  A series of actions among molecules in a cell that leads to a certain product or a change in the cell.

Common Types
  Metabolic: make chemical reactions occur, e.g. break down food into energy, build up molecules.
  Gene regulation: turn genes on and off.
  Signal transduction: a signal moves from a cell's exterior to its interior through a receptor.

[Figure: a cell with the labels Signal, Surface Receptor, and Signals to Other Cells. Source: genome.gov]

Signal Transduction Pathways

[Figure: stages of signal transduction: 1. Stimulation, 2. Transport, 3. Reception, 4. Transduction, 5. Response. Other labels: intracellular interactions, Target Cell. Copyright of Pearson Education, Inc., publishing as Benjamin Cummings.]

EGFR Signal Transduction Pathway (Epidermal growth factor receptor)

[Figure: EGFR pathway diagram. Labels: EGFR, RAS Activation, RAF Activation, MAP Kinase Cascade, MEK Activation, ERK Activation, ERK, MSK.]

Why Study Biological Pathways?

Cancer: one target, one drug
  An array of different genetic mutations can lead to the same cancer.
  Dozens of drugs for dozens of mutations, versus 2 or 3 drugs for 2 or 3 pathways.

Identify the causes of a disease
  Compare pathways in healthy people and pathways in patients.

Drugs
  Use pathway information to choose and combine existing drugs.
  Design new drugs.

Outline

Introduction to Biological Pathways
Discovery of Bio-Pathways
  Prior Work
  Challenge and Our Approach
BioPlanner
  Modeling a Pathway with HTN Formalisms
  Generating Hypotheses across Species
  Evaluating Hypotheses
Experimental Results, Conclusions and Future Work

Pathway Discovery

Finding an ordered sequence of subcellular processes that elicits a specific cellular response when applied to a subset of cellular components.

Biological laboratory studies to discover pathways
  Challenge: experiments are expensive.

Computational approaches

[Figure: a target cell whose intracellular interactions are unknown, marked "?".]

Prior Work

Planning approach
  Recast pathway discovery problems as planning problems [Khan et al, ICAPS03].
  Model changes in cellular processes as exogenous actions (triggers) [Tran and Baral, AAAI05].
  Formulate gene regulatory network intervention as decision-theoretic planning [Bryce and Kim, IJCAI07].

Expert system
  EcoCyc [Karp, Science01]: ontology, tracing pathways from one state to another.

Graph-based
  Petri Nets [Peleg et al, Bioinformatics02]: hybrid workflow and Petri Net modeling.

Algebra-calculus [Regev, PSB01]
  Molecules and domains are modeled as computational processes.
  Complementary structural and chemical determinants are modeled as communication channels.
  Chemical interactions are modeled as communication through channels.

Challenge: Not enough information exists for pathway construction for some species.

Our Approach

Generate hypothetical pathways worthy of expensive experiments by adaptation.

Planning approach
  A pathway is a sequence of processes; a plan is a sequence of actions.

Hierarchical Task Network (HTN) planning
  Bio-processes and their underlying information are hierarchical in nature.

Case-based plan adaptation
  Predict pathways from incomplete domain information of one species by adapting already well-known pathways of another species.

[Figure labels: curated information available for 990 pathways; curated information available for 30 pathways.]

Challenges and Solutions

Construction of HTN planning models for pathways
  Extract from Reactome, a knowledge base of manually curated Homo sapiens pathways [Vastrik et al 07].

Incompleteness of domain knowledge
  Adapt well-studied pathways instead of planning from scratch.

Ranking hypothetical plans
  Rank by confidence based on supporting data and the underlying adaptation or prediction methods.
  Recommend the best hypotheses.

Distributed domain knowledge base
  Multi-agent system to gather information.

Outline

Introduction to Biological Pathways
Discovery of Bio-Pathways
  Pathway Discovery Problems
  Prior Work
  Challenges and Our Approach
BioPlanner
  Modeling a Pathway with HTN Formalisms
  Generating Hypotheses across Species
  Evaluating Hypotheses
Experimental Results, Conclusions and Future Work

BioPlanner

[Figure: Information flow in BioPlanner. Labeled components: User Interface, Query, Information, External Biological KBs (Reactome, Interactome, BIND, KEGG, ...), BioMAS, Local Knowledge Base, HTN Generator, Formalisms, HTN Models, Plan Library, Cases, Plan Repair, Hypotheses, Hypothesis Evaluator, Evaluated Hypotheses.]

Planning Problems

An ST (signal transduction) pathway is modeled as an HTN planning problem (I, T, D):
  I: initial state, the conjunction of the initial configurations of a pathway's components.
    e.g. each protein is initialized to some state, such as its cellular location.
  T: task, transferring information from one location to another, initiated by a signaling molecule.
    e.g. the EGFR pathway can be considered a task to transfer information, initiated by EGF, between cells.
  D: domain theory, a collection of operators and methods.

A plan solution is a sequence of actions whose executions are the biological processes responding to stimulus events.
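The triple (I, T, D) can be pictured as a small data structure. The following is only a sketch in Python, not BioPlanner's actual representation: the class name `PlanningProblem`, the task name `signal-by`, and the compartment names are assumptions made for illustration.

```python
from dataclasses import dataclass

# A ground fact such as ("in", "EGF", "extracellular") stands for the
# predicate (in EGF extracellular). Compartment names are assumptions.
Fact = tuple

@dataclass(frozen=True)
class PlanningProblem:
    initial_state: frozenset   # I: conjunction of initial configurations
    task: Fact                 # T: the top-level signaling task
    domain: tuple = ()         # D: names of operators and methods

egfr_problem = PlanningProblem(
    initial_state=frozenset({
        ("istype", "EGF", "protein"),
        ("istype", "EGFR", "protein"),
        ("in", "EGF", "extracellular"),
        ("in", "EGFR", "plasma-membrane"),
    }),
    task=("signal-by", "EGF"),     # hypothetical task name
    domain=("!protein-bind",),     # operator named on a later slide
)
```

Keeping the state as a frozen set of ground facts makes it cheap to test preconditions by subset checks, which is one common way to prototype a planner.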

HTN Representations of Pathways (1)

Physical Entities
  (istype ?x t): variable ?x is of a type t
    e.g. (istype EGFR protein)
  (has-domain ?x d): variable ?x has a domain d
    e.g. (istype EGFR-extra domain), (has-domain EGFR EGFR-extra), (has-domain EGFR EGFR-mem), (has-domain EGFR EGFR-intra)
  (isa subtype type)
    e.g. (isa protein physical-entity)
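To illustrate how the three predicates fit together, here is a minimal Python sketch that stores the example facts as tuples and answers type queries by following `isa` links. The helper names `is_of_type` and `domains_of` are hypothetical and are not part of BioPlanner or JSHOP.

```python
# Example facts from the slide, written as (predicate, arg1, arg2) tuples.
facts = {
    ("istype", "EGFR", "protein"),
    ("istype", "EGFR-extra", "domain"),
    ("has-domain", "EGFR", "EGFR-extra"),
    ("has-domain", "EGFR", "EGFR-mem"),
    ("has-domain", "EGFR", "EGFR-intra"),
    ("isa", "protein", "physical-entity"),
}

def is_of_type(facts, entity, wanted):
    """True if entity has type `wanted` directly or through isa links."""
    frontier = [t for (p, x, t) in facts if p == "istype" and x == entity]
    seen = set()
    while frontier:
        current = frontier.pop()
        if current == wanted:
            return True
        if current in seen:
            continue
        seen.add(current)
        # Climb one level up the type hierarchy.
        frontier.extend(sup for (p, sub, sup) in facts
                        if p == "isa" and sub == current)
    return False

def domains_of(facts, entity):
    """Sorted list of the domains recorded for an entity."""
    return sorted(d for (p, x, d) in facts
                  if p == "has-domain" and x == entity)
```

With these facts, EGFR counts as a physical-entity because protein is declared a subtype of physical-entity.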

HTN Representations of Pathways (2)

Compartmentalization
  A compartment is a specific location in a cell where a physical entity functions.
  The cellular location of an entity is mapped to a predicate, (in physical-entity location).
    e.g. (in EGF plasma-membrane): EGF is present at the plasma membrane.

HTN Representations of Pathways (3)

Abstract Operators
  Biological reactions modify the states of physical entities in a cell.
  Hierarchy based on biological detail: protein binding, domain binding.
  Follow the formalism of JSHOP [Nau IEEE IS05]: (head, preconditions, delete list, add list)

    (:operator (!protein-bind ?x ?y ?loc)
      ;; preconditions
      ((istype ?x protein) (istype ?y protein) (istype ?loc compartment)
       (in ?x ?loc) (in ?y ?loc) (can-ppi ?x ?y ?loc))
      ;; delete list
      ((in ?x ?loc) (in ?y ?loc))
      ;; add list
      ((istype ?x:?y bound-protein) (in ?x:?y ?loc)))
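The effect of the !protein-bind operator on a state can be traced with a short Python sketch. It mirrors the precondition, delete-list, and add-list structure of the operator shown on this slide, but it is an illustration only: the function name `protein_bind` and the example state are assumptions, and JSHOP itself performs this step internally.

```python
def protein_bind(state, x, y, loc):
    """Apply (!protein-bind x y loc); return the new state, or None if
    the preconditions do not hold in `state`."""
    preconditions = {
        ("istype", x, "protein"), ("istype", y, "protein"),
        ("istype", loc, "compartment"),
        ("in", x, loc), ("in", y, loc),
        ("can-ppi", x, y, loc),
    }
    if not preconditions <= state:
        return None
    bound = f"{x}:{y}"  # name of the bound complex, as in ?x:?y
    delete_list = {("in", x, loc), ("in", y, loc)}
    add_list = {("istype", bound, "bound-protein"), ("in", bound, loc)}
    return (state - delete_list) | add_list

# Illustrative state; the facts are assumptions, not curated data.
state = frozenset({
    ("istype", "EGF", "protein"), ("istype", "EGFR", "protein"),
    ("istype", "plasma-membrane", "compartment"),
    ("in", "EGF", "plasma-membrane"), ("in", "EGFR", "plasma-membrane"),
    ("can-ppi", "EGF", "EGFR", "plasma-membrane"),
})
new_state = protein_bind(state, "EGF", "EGFR", "plasma-membrane")
```

After the call, the two free proteins are no longer located in the compartment and the bound complex EGF:EGFR is, so applying the same operator a second time fails its preconditions.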

HTN Representations of Pathways (4)

Task Method Model
  Hierarchical information is extracted from Reactome.
  A method is a triple (h, pre, subTasks):
    h: name of a pathway
    pre: preconditions not achieved by any other subtasks
      Catalysts activating/inactivating reactions
      Physical entities not output from any reactions
    subTasks: sub-pathways
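A method of the form (h, pre, subTasks) can be expanded recursively into primitive actions. The Python sketch below shows that idea under simplifying assumptions: preconditions are checked only against a fixed state, state changes between subtasks are ignored, and the pathway names and their nesting are invented for illustration rather than taken from Reactome.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Method:
    head: str                 # h: name of a pathway
    preconditions: frozenset  # pre: facts that no subtask produces
    subtasks: tuple           # subTasks: sub-pathways or primitive reactions

def decompose(task, methods, state):
    """Expand a task depth-first into primitive action names.

    Names starting with '!' are treated as primitive, following the
    JSHOP naming convention for operators.
    """
    if task.startswith("!"):
        return [task]
    method = methods[task]
    if not method.preconditions <= state:
        raise ValueError(f"preconditions of {task} do not hold")
    plan = []
    for sub in method.subtasks:
        plan.extend(decompose(sub, methods, state))
    return plan

# Hypothetical hierarchy loosely named after the EGFR slide.
methods = {
    "egfr-signaling": Method(
        "egfr-signaling",
        frozenset({("in", "EGF", "extracellular")}),
        ("!ras-activation", "!raf-activation", "map-kinase-cascade"),
    ),
    "map-kinase-cascade": Method(
        "map-kinase-cascade",
        frozenset(),
        ("!mek-activation", "!erk-activation"),
    ),
}
plan = decompose("egfr-signaling", methods, {("in", "EGF", "extracellular")})
```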

Generate Hypothetical Pathways

Do not plan from scratch
  Not enough information to construct pathways for some species.

Adapt a well-studied pathway to predict
  Human (reference) -> Chicken (target)
  Reference pathway -> target pathway

Adaptation Strategies [e.g. Hammond 1990; Kambhampati and Hendler 1992]
  Action Modifications
  Task Modifications

Action Modifications (1)

Strategy 1: Physical entity adaptation
  Actions fail due to missing physical entities of the reference species in a new environment.
  Replace failed physical entities with homologs (those of similar physical structures), e.g. found with BLAST.
  e-value: evaluation of confidence.

Strategy 2: Modifying an action
  Similar physical entities are missing.
  Modify an action according to the knowledge base.

[Figure labels: (react P1 P2 P3), (react P'1 P'2 P'3), Pre1, Pre'1, Pre'2.]
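Strategy 1 can be sketched as a lookup in a homolog table followed by a rewrite of the action's arguments. The Python below is a minimal illustration, not BioPlanner's code: the table `HOMOLOGS`, its entity names and e-values, and the cutoff `max_e_value` are all invented assumptions, and no real BLAST call is made.

```python
# Hypothetical homolog table: reference entity -> (target entity, e-value).
# The names and e-values are invented for illustration only.
HOMOLOGS = {
    "EGFR_human": ("EGFR_chicken", 1e-150),
    "EGF_human": ("EGF_chicken", 1e-40),
}

def adapt_entities(action, homologs, max_e_value=1e-5):
    """Replace each argument of an action with its target-species homolog.

    Returns (adapted_action, e_values), or None when some entity has no
    homolog at or below the assumed e-value cutoff.
    """
    name, *args = action
    new_args, e_values = [], []
    for entity in args:
        hit = homologs.get(entity)
        if hit is None or hit[1] > max_e_value:
            return None  # Strategy 1 fails; another strategy is needed
        new_args.append(hit[0])
        e_values.append(hit[1])
    return (name, *new_args), e_values

adapted = adapt_entities(("protein-bind", "EGF_human", "EGFR_human"), HOMOLOGS)
```

Returning the e-values alongside the adapted action keeps the evidence that a later ranking step could use.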

Task Modifications (1)

Strategy 3: Splitting an action
  One failed reaction might be achieved by multiple reactions.
  Repair a failing action by splitting it into multiple actions.
  For example, (protein-bind P'1 P'2) -> (protein-bind P'1 P'3), (protein-bind P'3 P'2)

Strategy 4: Combining actions
  For example, (protein-bind P'1 P'3), (protein-bind P'3 P'2) -> (protein-bind P'1 P'2)
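The splitting strategy can be illustrated with a small search for an intermediate protein. This Python sketch assumes that known interactions are available as a set of ordered pairs named `can_bind`; that representation, the function name `split_binding`, and the candidate list are hypothetical.

```python
def split_binding(x, y, can_bind, candidates):
    """Repair a failing (protein-bind x y) by routing through an
    intermediate z, giving (protein-bind x z) then (protein-bind z y).

    Returns a list of actions, or None if no repair is found.
    """
    if (x, y) in can_bind:
        return [("protein-bind", x, y)]  # nothing to repair
    for z in candidates:
        if (x, z) in can_bind and (z, y) in can_bind:
            return [("protein-bind", x, z), ("protein-bind", z, y)]
    return None

# Assumed interaction knowledge for the target species.
can_bind = {("P1", "P3"), ("P3", "P2")}
repair = split_binding("P1", "P2", can_bind, ["P4", "P3"])
```

Here the direct binding of P1 and P2 is unknown, so the repair goes through P3, matching the two-step example on the slide.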

Task Modifications (2)

Strategy 5: Adding a new task
  Add a new task to achieve a failed precondition.
  For example, a missing complex as a catalyst can be created by a reaction.

Strategy 6: Other task re-decomposition (future work)
  Using other alternative methods or new information sources to achieve a task.
  Planning from scratch.
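Strategy 5 amounts to inserting a producing reaction ahead of the action whose precondition failed. The Python sketch below shows one simple way to do that; the mapping `producers` from a missing entity to a reaction that creates it, and the action names, are assumptions for illustration.

```python
def add_producing_task(plan, failing_index, missing, producers):
    """Insert a reaction that creates `missing` right before the failing
    action. Returns a new plan, or None if no producer is known."""
    producer = producers.get(missing)
    if producer is None:
        return None
    return plan[:failing_index] + [producer] + plan[failing_index:]

# Hypothetical plan whose second action needs the catalyst complex C1:C2.
plan = [
    ("protein-bind", "A", "B"),
    ("catalyzed-react", "C1:C2", "A:B"),
]
producers = {"C1:C2": ("protein-bind", "C1", "C2")}
repaired = add_producing_task(plan, 1, "C1:C2", producers)
```

The original plan is left unchanged, so several alternative repairs can be generated from the same starting plan and compared later.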

Evaluation of Hypotheses

Biological assumptions
  A hypothetical pathway is preferred if it has fewer differences from the original one. The differences include:
    Participant structures
    Reactions
  A hypothesis is considered more confident than others if it is
    Found in literature or experimental resources.
    Obtained only by physical entity substitutions, without any other modifications.
    Achieved by splitting a failing reaction into two reliable reactions instead of more than two reactions.

Ranking Hypotheses

Compare two hypotheses, hp1 and hp2
  Failing actions not repaired
    hp1 is ranked more confident than hp2 if hp1 has a lower percentage of failing actions than hp2.
  Reliability
    hp1 is ranked more confident than hp2 if hp1 has a higher percentage of actions whose corresponding reactions are found in experimental or literature resources.
  Priorities of actions
    hp1 is ranked more confident than hp2 if hp1 contains a higher percentage of actions that are achieved by applying adaptation strategies of higher priorities.
  e-value: evaluation of confidence
    hp1 is ranked more confident than hp2 if hp1 has a lower average e-value of participating entities than hp2.
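The four comparison criteria can be combined into a single sort key. The slide does not say how the criteria are weighed against each other, so the Python sketch below assumes they are applied in the order listed, each one breaking ties left by the previous one. The class `Hypothesis`, its field names, and the example numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hypothesis:
    name: str
    failing_pct: float        # share of failing actions left unrepaired
    supported_pct: float      # share of actions found in literature/experiments
    high_priority_pct: float  # share of actions from higher-priority strategies
    mean_e_value: float       # average e-value of participating entities

def confidence_key(h):
    # Lower is better for failing_pct and mean_e_value; higher is better
    # for the other two, so those are negated. Lexicographic order is an
    # assumption, not something the slide specifies.
    return (h.failing_pct, -h.supported_pct,
            -h.high_priority_pct, h.mean_e_value)

def rank(hypotheses):
    """Return hypotheses ordered from most to least confident."""
    return sorted(hypotheses, key=confidence_key)

ranked = rank([
    Hypothesis("hp1", 0.0, 0.5, 0.8, 1e-30),
    Hypothesis("hp2", 0.0, 0.7, 0.2, 1e-10),
    Hypothesis("hp3", 0.1, 0.9, 0.9, 1e-50),
])
```

A weighted score would be an equally plausible reading of the slide; the lexicographic key is used here only because it needs no invented weights.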

Implementation and Experiments

BioPlanner is implemented on JSHOP2 (Nau et al. IEEE IS05).

BioPlanner has integrated data gathered from 11 knowledge resources, e.g. Reactome, KEGG, DIP.

The ST pathway HTN model currently consists of 14 operator schemas.

Around 400 action cases and 150 plan cases of human signaling pathways have been retrieved from Reactome.

Predict pathways from human for mouse, chicken, and fruit fly.

Predicting Pathways for Gallus gallus

[Figure labels: Hypothesis, Explanation.]

Repair Human Pathways for Different Species

Performance

Conclusion

Propose and rank-order hypothetical pathways using incomplete information.
Represent pathways using HTN planning formalisms.
Challenges and solutions

Future work
  Evaluating our approach further with more data available.
  Using experimental data to diagnose, modify or eliminate hypothetical plans.

Acknowledgement

This work is supported in part by the Cooperative State Research, Education, and Extension Service, U.S. Department of Agriculture, under Agreement No. 2008-35205-18734.

Professor Keith Decker
Professor Carl Schmidt


Thank you!

Questions? Comments? Suggestions?