
August 2012

This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.

BEL Framework v2.0.0


BEL Framework Overview

• Current version 2.0.0 released June 29, 2012
  – Open source
• The BEL Framework includes:
  – BEL Compiler
  – KAM store
  – Tools
  – Web and Java APIs
  – Web Server
• API = Application Programming Interface
  – Can be used by software to access information from KAMs
    • KAM Navigator uses the Web API
    • Whistle uses the Java API


Knowledge User Workflow: BEL Framework and Applications

[Diagram: knowledge user workflow. Labeled elements: BEL Documents, BEL Compiler, Encrypted portable KAM, KAM Store, BEL Framework, BEL Framework API, Application. Note on the diagram: multiple KAMs can be imported for use by the application.]


Contents

• KAMs and the KAM store
• BEL Compiler
  – Running the BEL Compiler
  – Phase I - Compiler Expansions
  – Phase II - Equivalencing
  – Phase III - Compiler Augmentations
• BEL Framework Tools


Knowledge Assembly Model (KAM)

• A knowledge base in network form
• Composed of Nodes (KamNode) and Edges (KamEdge)
• Each KamNode represents one or more BEL Terms drawn from one or more BEL Documents
• Each KamEdge represents one or more BEL Statements from one or more BEL Documents


KamNodes

• Nodes represent one or more BEL terms
• KamNodes are coalesced wherever possible by the equivalencing engine (Phase II)


KamEdges

• Represent assertions supported by one or more BEL Statements

• Querying a KamEdge will return:
  – Each BEL Statement supporting the assertion
  – Assertions are coalesced based solely on semantic triple after equivalencing, independent of Annotations
• Querying a BEL Statement will return:
  – The BEL Document the statement was recorded in
  – The list of assertions for the statement
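
The coalescing rule for KamEdges can be sketched in a few lines of Python. This is a minimal illustration and not the BEL Framework's implementation: the dictionary keys (`subject`, `relationship`, `object`, `document`, `annotations`), the function name `assemble_edges`, and the sample documents and annotations are all assumptions made for the example, and the terms are assumed to have been equivalenced already.

```python
from collections import defaultdict


def assemble_edges(statements):
    """Group statements into edges keyed by (subject, relationship, object).

    Annotations are deliberately left out of the key, so statements that
    differ only in their annotations support the same edge.
    """
    edges = defaultdict(list)
    for statement in statements:
        triple = (statement["subject"], statement["relationship"], statement["object"])
        edges[triple].append(statement)
    return dict(edges)


# Hypothetical statements; document names and annotations are made up.
statements = [
    {"subject": "kin(p(HGNC:AKT1))", "relationship": "increases",
     "object": "p(HGNC:RELA)", "document": "doc1.bel",
     "annotations": {"Tissue": "lung"}},
    {"subject": "kin(p(HGNC:AKT1))", "relationship": "increases",
     "object": "p(HGNC:RELA)", "document": "doc2.bel",
     "annotations": {"Tissue": "liver"}},
]
edges = assemble_edges(statements)
```

Both statements land on one edge, and each supporting statement still carries its own document and annotations, which is what a query on the edge would return.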


© 2012, Open BEL Community

KAM Store

• The database that stores KAMs
• Default database is Derby
  – Can configure to use MySQL or other databases
• Put KAMs into the KAM Store by:
  – Compiling a KAM (belc.cmd)
  – Importing a KAM (tools\KamManager.cmd --import)
• Access KAMs via:
  – APIs
  – Exporting a KAM (tools\KamManager.cmd --export)


KAMs Are Compiled from BEL Documents

• The BEL Compiler compiles one or more BEL Documents into a Knowledge Assembly Model (KAM)

• Multi-Phase compiler/assembler:
  1. Compiler – compiles each BEL Document into a proto-network
  2. Equivalencer – merges proto-networks by equivalencing analogous nodes across namespaces
  3. Augmenter – increases KAM computability by injecting terms and relationships from additional sources of prior knowledge (e.g. relationships connecting RNAs to their corresponding proteins)
  4. Assembler – generates final network and supporting evidence structures

• Users can change compiler parameters to control the knowledge assembly process


KAM Compilation Phases

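[Diagram: the compilation pipeline runs through the Compiler, Equivalencer, Augmenter, and Final Assembler stages and produces the Compiled KAM. Labeled inputs: BEL Documents, Namespace & Annotation Tables, Equivalence Tables, Other Prior Knowledge, Network Resources.]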


Running the BEL Compiler

• From BEL Framework folder:
  – belc.cmd (Windows)
  – belc.sh (Linux or OS X)
• Ensure that the server is not running
• Required:
  – BEL document(s)
    • Specify filename(s) with -f
    • OR specify path to folder of BEL documents with -p
  – KAM name
    • Specify with -k
  – KAM description
    • Specify with -d

>belc.cmd -f myDoc.bel -k myKAM -d "my KAM description"
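
When the compiler is driven from a script, the required options can be assembled programmatically. The helper below is a hypothetical sketch: `build_belc_command` is not part of the BEL Framework, it only builds an argument list from the four switches named on this slide, and it assumes that `-f` is repeated once per file, which the slide does not confirm.

```python
def build_belc_command(kam_name, description, files=(), folder=None, script="belc.sh"):
    """Build an argument list for the BEL Compiler launcher.

    Exactly one document source is required: a list of files (-f) or a
    folder of BEL documents (-p). Assumption: -f is given once per file.
    """
    if bool(files) == bool(folder):
        raise ValueError("specify either files or a folder, not both")
    command = [script]
    for filename in files:
        command += ["-f", filename]
    if folder:
        command += ["-p", folder]
    command += ["-k", kam_name, "-d", description]
    return command


command = build_belc_command("myKAM", "my KAM description", files=["myDoc.bel"])
```

The resulting list could be handed to `subprocess.run` from the BEL Framework folder, after checking that the server is not running.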


Phase I Expansions

• List expansions
• Inner terms
• Protein modifications
• Reactions
• Nested statements
• Reciprocal statements


List Expansion - hasMembers

• Phase I expands hasMembers relationships to individual hasMember relationships
• All hasMembers relationship statements are removed

p(PFH:"AKT Family") hasMembers \
    list(p(HGNC:AKT1),p(HGNC:AKT2),p(HGNC:AKT3))

becomes

p(PFH:"AKT Family") hasMember p(HGNC:AKT1)
p(PFH:"AKT Family") hasMember p(HGNC:AKT2)
p(PFH:"AKT Family") hasMember p(HGNC:AKT3)
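
The list expansion amounts to replacing one plural statement with one singular statement per list entry. The following Python sketch illustrates that idea on already-parsed data; the function name and the tuple representation of a statement are assumptions for the example, not the compiler's internal model.

```python
# Plural list relationships and the singular relationship each expands to.
SINGULAR = {"hasMembers": "hasMember", "hasComponents": "hasComponent"}


def expand_list_statement(subject, relationship, members):
    """Return one (subject, singular relationship, member) triple per member.

    The plural statement itself is not returned, mirroring the slide's note
    that the original list statement is removed.
    """
    if relationship not in SINGULAR:
        raise ValueError("not a list relationship: %s" % relationship)
    return [(subject, SINGULAR[relationship], member) for member in members]


expanded = expand_list_statement(
    'p(PFH:"AKT Family")',
    "hasMembers",
    ["p(HGNC:AKT1)", "p(HGNC:AKT2)", "p(HGNC:AKT3)"],
)
```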


List Expansion - hasComponents

• Phase I expands hasComponents relationships to individual hasComponent relationships
• All hasComponents relationship statements are removed

complex(NCH:"IkappaB Kinase Complex") hasComponents \
    list(p(HGNC:CHUK), p(HGNC:IKBKB), p(HGNC:IKBKG))

becomes

complex(NCH:"IkappaB Kinase Complex") hasComponent p(HGNC:CHUK)
complex(NCH:"IkappaB Kinase Complex") hasComponent p(HGNC:IKBKB)
complex(NCH:"IkappaB Kinase Complex") hasComponent p(HGNC:IKBKG)


complexAbundance Expansion

• Phase I preprocesses complexAbundance() terms and injects individual hasComponent relationships

complex(p(HGNC:GTF2E1),p(HGNC:GTF2E2))

becomes

complex(p(HGNC:GTF2E1),p(HGNC:GTF2E2))
complex(p(HGNC:GTF2E1),p(HGNC:GTF2E2)) \
    hasComponent p(HGNC:GTF2E1)
complex(p(HGNC:GTF2E1),p(HGNC:GTF2E2)) \
    hasComponent p(HGNC:GTF2E2)
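
To inject one relationship per component, a tool first has to split the term's argument list at top-level commas while ignoring commas and parentheses that sit inside nested terms or quoted values such as "NAD(+)". The Python sketch below shows one way to do that. It is an illustration only: the function names are hypothetical, it assumes a composed term whose arguments are themselves terms (not a named complex such as complex(NCH:"...")), and it applies the same splitting to composite() terms with the includes relationship.

```python
# Relationship injected for each argument of a composed term.
RELATIONSHIP_FOR = {
    "complex": "hasComponent",
    "complexAbundance": "hasComponent",
    "composite": "includes",
    "compositeAbundance": "includes",
}


def split_top_level_args(term):
    """Return (function name, argument strings) for a term like 'complex(a, b)'."""
    term = term.strip()
    open_index = term.index("(")
    if not term.endswith(")"):
        raise ValueError("not a function-style term: %r" % term)
    name, body = term[:open_index], term[open_index + 1:-1]
    args, current, depth, in_quotes = [], [], 0, False
    for ch in body:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch == "," and depth == 0:
                # A top-level comma ends the current argument.
                args.append("".join(current).strip())
                current = []
                continue
        current.append(ch)
    tail = "".join(current).strip()
    if tail:
        args.append(tail)
    return name, args


def expand_composed_term(term):
    """Return one (term, relationship, argument) triple per top-level argument."""
    name, args = split_top_level_args(term)
    return [(term, RELATIONSHIP_FOR[name], arg) for arg in args]


complex_edges = expand_composed_term("complex(p(HGNC:GTF2E1),p(HGNC:GTF2E2))")
```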


compositeAbundance Expansion

• Phase I preprocesses compositeAbundance() terms and injects individual includes relationships

composite(a(CHEBI:"deoxyribonucleic acid"), a(CHEBI:"NAD(+)")) \
    -> ribo(p(HGNC:PARP1))

becomes

composite(a(CHEBI:"deoxyribonucleic acid"), a(CHEBI:"NAD(+)"))
composite(a(CHEBI:"deoxyribonucleic acid"), a(CHEBI:"NAD(+)")) includes \
    a(CHEBI:"deoxyribonucleic acid")
composite(a(CHEBI:"deoxyribonucleic acid"), a(CHEBI:"NAD(+)")) includes \
    a(CHEBI:"NAD(+)")


Inner Terms Expansion

• Phase I expands inner terms to relate abundances to activity terms using actsIn relationships

phos(p(HGNC:DUSP1)) =| kin(p(HGNC:MAPK8))

becomes

phos(p(HGNC:DUSP1)) =| kin(p(HGNC:MAPK8))
p(HGNC:DUSP1) actsIn phos(p(HGNC:DUSP1))
p(HGNC:MAPK8) actsIn kin(p(HGNC:MAPK8))
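
As a rough illustration of the inner-term expansion, the Python sketch below derives an actsIn triple for each activity term it is given. It is not the compiler's code: the function name is hypothetical, and the set of activity functions is limited to the short forms that appear in these slides (kin, phos, ribo), so a fuller version would need the complete list from the BEL language specification.

```python
import re

# Activity function short forms used in these slides (an incomplete subset).
ACTIVITY_FUNCTIONS = ("kin", "phos", "ribo")
_ACTIVITY_TERM = re.compile(r"^(%s)\((.+)\)$" % "|".join(ACTIVITY_FUNCTIONS))


def acts_in_edges(terms):
    """Return (inner abundance, 'actsIn', activity term) for each activity term."""
    edges = []
    for term in terms:
        term = term.strip()
        match = _ACTIVITY_TERM.match(term)
        if match:
            edges.append((match.group(2), "actsIn", term))
    return edges


# Subject and object terms of: phos(p(HGNC:DUSP1)) =| kin(p(HGNC:MAPK8))
derived = acts_in_edges(["phos(p(HGNC:DUSP1))", "kin(p(HGNC:MAPK8))"])
```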


Protein Modification Expansion

• Phase I expands proteinModification() sub-terms to associate a modified protein abundance with the root protein abundance

p(HGNC:MAPK1, pmod(P,T)) => kin(p(HGNC:MAPK1))

becomes

p(HGNC:MAPK1, pmod(P,T)) => kin(p(HGNC:MAPK1))
p(HGNC:MAPK1) hasModification p(HGNC:MAPK1, pmod(P,T))
p(HGNC:MAPK1) actsIn kin(p(HGNC:MAPK1))


Variant Expansion

• Phase I expands fusion(), truncation(), and substitution() sub-terms to associate a protein variant abundance with the parent (reference) protein abundance

p(HGNC:KRAS, sub(G,12,V))

becomes

p(HGNC:KRAS, sub(G,12,V))
p(HGNC:KRAS) hasVariant p(HGNC:KRAS, sub(G,12,V))
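
Protein modification and variant expansion both link a decorated protein term back to its plain root term. A minimal Python sketch of that step follows, under stated assumptions: the function name is hypothetical, only the short forms pmod, sub, fus, and trunc are recognized, and namespace values containing commas or parentheses are not handled.

```python
import re

# p(<namespace value>, <pmod|sub|fus|trunc>( ... ))
_DECORATED_PROTEIN = re.compile(r"^p\(([^,()]+),\s*(pmod|sub|fus|trunc)\(")


def root_protein_edge(term):
    """Return (root protein, relationship, term), or None for a plain term."""
    match = _DECORATED_PROTEIN.match(term.strip())
    if match is None:
        return None
    relationship = "hasModification" if match.group(2) == "pmod" else "hasVariant"
    return ("p(%s)" % match.group(1).strip(), relationship, term.strip())


variant_edge = root_protein_edge("p(HGNC:KRAS, sub(G,12,V))")
modification_edge = root_protein_edge("p(HGNC:MAPK1, pmod(P,T))")
```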


Reaction Expansion

• Phase I expands reactants() and products() reaction sub-terms to associate the reactant and product lists with their abundances

reaction(reactants(a(CHEBI:superoxide)), \
    products(a(CHEBI:"hydrogen peroxide"),a(CHEBI:oxygen)))

becomes

reaction(reactants(a(CHEBI:superoxide)), \
    products(a(CHEBI:"hydrogen peroxide"),a(CHEBI:oxygen)))
a(CHEBI:superoxide) reactantIn \
    reaction(reactants(a(CHEBI:superoxide)), \
    products(a(CHEBI:"hydrogen peroxide"),a(CHEBI:oxygen)))
reaction(reactants(a(CHEBI:superoxide)), \
    products(a(CHEBI:"hydrogen peroxide"),a(CHEBI:oxygen))) \
    hasProduct a(CHEBI:"hydrogen peroxide")
reaction(reactants(a(CHEBI:superoxide)), \
    products(a(CHEBI:"hydrogen peroxide"),a(CHEBI:oxygen))) \
    hasProduct a(CHEBI:oxygen)


Nested Statement Expansion

• The compiler will automatically expand nested statements and create additional relationships from the subject of the statement to the object of the nested statement
  – Can be turned off using the --no-statement-expansion switch


Default Nested Statement Expansion

• Phase I expands nested statements to link the subject of the statement to the object of the nested statement

• The original statement is preserved as supporting evidence for the derived assertions

p(HGNC:CLSPN) -> (kin(p(HGNC:ATR)) => p(HGNC:CHEK1, pmod(P)))

becomes

p(HGNC:CLSPN) -> p(HGNC:CHEK1, pmod(P))
kin(p(HGNC:ATR)) => p(HGNC:CHEK1, pmod(P))
p(HGNC:ATR) actsIn kin(p(HGNC:ATR))
p(HGNC:CHEK1) hasModification p(HGNC:CHEK1, pmod(P))
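
The nested statement expansion can be sketched in Python with a statement represented as a (subject, relationship, object) tuple whose object may itself be a tuple. This is an illustration under assumptions, not the compiler's algorithm: the function name and the `couple` parameter are hypothetical, and the derived statement simply reuses the outer relationship, as the slide's example does. Setting `couple=False` imitates the behavior described for the --no-statement-expansion switch.

```python
def expand_nested(statement, couple=True):
    """Expand a nested statement into flat statements plus standalone nodes.

    couple=True: link the outer subject to the nested statement's object.
    couple=False: keep the outer subject as a standalone node only.
    """
    subject, relationship, obj = statement
    if not isinstance(obj, tuple):
        # Not nested: nothing to expand.
        return {"nodes": [], "statements": [statement]}
    inner_object = obj[2]
    statements, nodes = [obj], []
    if couple:
        statements.insert(0, (subject, relationship, inner_object))
    else:
        nodes.append(subject)
    return {"nodes": nodes, "statements": statements}


nested = (
    "p(HGNC:CLSPN)",
    "->",
    ("kin(p(HGNC:ATR))", "=>", "p(HGNC:CHEK1, pmod(P))"),
)
default_result = expand_nested(nested)
uncoupled_result = expand_nested(nested, couple=False)
```

The actsIn and hasModification statements shown in the slide's output come from the inner-term and protein-modification expansions, which would run on the flat statements returned here.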


Modified Nested Statement Expansion

• When the --no-statement-expansion switch is set, the compiler will instantiate the subject of the statement and expand the nested statement, but will not couple the two together
• The original statement is removed

p(HGNC:CLSPN) -> (kin(p(HGNC:ATR)) => p(HGNC:CHEK1, pmod(P)))

becomes

kin(p(HGNC:ATR)) => p(HGNC:CHEK1, pmod(P))
p(HGNC:CLSPN)
p(HGNC:ATR) actsIn kin(p(HGNC:ATR))
p(HGNC:CHEK1) hasModification p(HGNC:CHEK1, pmod(P))


Reciprocal Statement Expansion

• All KAM edges are directed
• Non-directed BEL relationships (positiveCorrelation, negativeCorrelation, association) are expanded to be expressed in both directions:

r(HGNC:IL8) positiveCorrelation path(MESHD:"Lung Neoplasms")

becomes

r(HGNC:IL8) positiveCorrelation path(MESHD:"Lung Neoplasms")
path(MESHD:"Lung Neoplasms") positiveCorrelation r(HGNC:IL8)
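
A short Python sketch of the reciprocal expansion: for each non-directed relationship, the reversed triple is emitted alongside the original, and duplicates are skipped so that a statement already recorded in both directions is not doubled. The function name and tuple representation are assumptions for the example.

```python
# Non-directed relationships named on this slide.
NON_DIRECTED = {"positiveCorrelation", "negativeCorrelation", "association"}


def expand_reciprocal(triples):
    """Return the triples plus the reverse of every non-directed triple."""
    result, seen = [], set()
    for triple in triples:
        candidates = [triple]
        subject, relationship, obj = triple
        if relationship in NON_DIRECTED:
            candidates.append((obj, relationship, subject))
        for candidate in candidates:
            if candidate not in seen:
                seen.add(candidate)
                result.append(candidate)
    return result


directed = expand_reciprocal(
    [("r(HGNC:IL8)", "positiveCorrelation", 'path(MESHD:"Lung Neoplasms")')]
)
```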


Phase II Equivalences

• Nodes are equivalenced based on:
  – Namespace value UUID
    • In .beleq resource file
  – Equivalent unordered list
    • complexes, composites, reactions


The BEL Framework Manages Equivalences Between External IDs

• Equivalences between terms from different vocabularies are provided to the BEL compiler
  – AKT3 in the HGNC namespace and Entrez Gene ID 10000 refer to the same gene
  – p(HGNC:AKT3) and p(EG:10000) coalesce to a single node in a KAM
• Selection of preferred namespaces (a “Dialect”) is slated for the future
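
The coalescing of equivalent terms can be pictured as keying each node by a shared identifier instead of by its namespace and value. The Python sketch below is a simplified illustration: the lookup table stands in for the equivalence resources, the UUID strings are placeholders invented for the example, and the function names are hypothetical.

```python
from collections import defaultdict

# Stand-in for the equivalence resources; the UUID values are placeholders.
EQUIVALENCE_UUIDS = {
    ("HGNC", "AKT3"): "uuid-0001",
    ("EG", "10000"): "uuid-0001",
    ("HGNC", "AKT1"): "uuid-0002",
}


def node_key(function, namespace, value):
    """Key a term by (function, UUID) when an equivalence is known."""
    uuid = EQUIVALENCE_UUIDS.get((namespace, value))
    if uuid is None:
        return (function, namespace, value)
    return (function, uuid)


def coalesce(terms):
    """Group (function, namespace, value) terms into nodes by equivalence key."""
    nodes = defaultdict(list)
    for function, namespace, value in terms:
        label = "%s(%s:%s)" % (function, namespace, value)
        nodes[node_key(function, namespace, value)].append(label)
    return dict(nodes)


nodes = coalesce([
    ("p", "HGNC", "AKT3"),
    ("p", "EG", "10000"),
    ("p", "HGNC", "AKT1"),
    ("r", "HGNC", "AKT3"),
])
```

The two protein terms for AKT3 share one node, while r(HGNC:AKT3) stays separate because the BEL function is part of the key.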


Phase III Augmentations

• Gene Scaffolding
• Protein Families
• Named Complexes
• Orthology


Network Augmentation Order

[Diagram: the order of the network augmentation stages, grouped into Basic Stages and Optional Stages. Stages shown: Protein Family Inclusion, Named Complex Inclusion, Protein Family Expansion, Named Complex Expansion, Gene Scaffolding, Orthology.]


Gene Scaffolding

• Default behavior is to insert p(), r(), and g() nodes and corresponding edges wherever a protein, RNA, or gene abundance term is detected
• The compiler will only insert missing nodes and edges
  – Can be turned off with the --no-gene-scaffolding switch

p(HGNC:KRAS, sub(G, 12, V)) -> path(MESHD:Neoplasms)

becomes

p(HGNC:KRAS, sub(G, 12, V)) -> \
    path(MESHD:Neoplasms)
p(HGNC:KRAS) hasVariant \
    p(HGNC:KRAS, sub(G, 12, V))
r(HGNC:KRAS) >> p(HGNC:KRAS)
g(HGNC:KRAS) :> r(HGNC:KRAS)
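
Gene scaffolding can be sketched as adding the two connecting edges for each gene identifier unless they are already present. The Python below is a hypothetical illustration of the "only insert missing nodes and edges" rule: the function name is invented, and the edges use the same relationship symbols as the slide (`>>` and `:>`).

```python
def scaffold_genes(edges, identifiers):
    """Append r >> p and g :> r edges for each identifier if they are missing."""
    result = list(edges)
    present = set(result)
    for identifier in identifiers:
        scaffold = (
            ("r(%s)" % identifier, ">>", "p(%s)" % identifier),
            ("g(%s)" % identifier, ":>", "r(%s)" % identifier),
        )
        for edge in scaffold:
            if edge not in present:
                present.add(edge)
                result.append(edge)
    return result


scaffolded = scaffold_genes([], ["HGNC:KRAS"])
```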


Protein Family Expansion

• The compiler will automatically include protein family members when a protein family term is identified
  – Can be turned off using the --no-protein-families switch
• The compiler can also search for protein families to include when a protein family member is identified
  – Can be enabled using the --expand-protein-families switch
• The compiler will automatically connect protein family activity terms with the corresponding family member activity terms


Protein Family Example 1 (Default Behavior)

p(HGNC:KRAS, sub(G,12,D)) -> kin(p(PFH:"MAPK JNK Family"))

becomes

p(HGNC:KRAS, sub(G,12,D)) -> kin(p(PFH:"MAPK JNK Family"))
p(HGNC:KRAS) hasVariant p(HGNC:KRAS, sub(G,12,D))
p(PFH:"MAPK JNK Family") actsIn kin(p(PFH:"MAPK JNK Family"))
p(PFH:"MAPK JNK Family") hasMember p(HGNC:MAPK8)
p(PFH:"MAPK JNK Family") hasMember p(HGNC:MAPK9)
p(PFH:"MAPK JNK Family") hasMember p(HGNC:MAPK10)

Gene scaffolding will also be added to p(HGNC:KRAS), p(HGNC:MAPK8), p(HGNC:MAPK9), and p(HGNC:MAPK10)


Protein Family Example 2 (Default Behavior)

kin(p(HGNC:AKT1)) -> p(HGNC:RELA)
kin(p(PFH:"AKT Family")) =| bp(MESHPP:Apoptosis)

becomes

kin(p(HGNC:AKT1)) -> p(HGNC:RELA)
kin(p(PFH:"AKT Family")) =| bp(MESHPP:Apoptosis)
p(HGNC:AKT1) actsIn kin(p(HGNC:AKT1))
p(PFH:"AKT Family") actsIn kin(p(PFH:"AKT Family"))
p(PFH:"AKT Family") hasMember p(HGNC:AKT1)
p(PFH:"AKT Family") hasMember p(HGNC:AKT2)
p(PFH:"AKT Family") hasMember p(HGNC:AKT3)
kin(p(HGNC:AKT1)) isA kin(p(PFH:"AKT Family"))

Gene scaffolding would then be applied to p(HGNC:AKT1), p(HGNC:AKT2), p(HGNC:AKT3), and p(HGNC:RELA)


Protein Family Example 3 (--expand-protein-families enabled)

kin(p(HGNC:AKT1)) -> p(HGNC:RELA)

becomes

kin(p(HGNC:AKT1)) -> p(HGNC:RELA)
p(HGNC:AKT1) actsIn kin(p(HGNC:AKT1))
p(PFH:"AKT Family") hasMember p(HGNC:AKT1)
p(PFH:"AKT Family") hasMember p(HGNC:AKT2)
p(PFH:"AKT Family") hasMember p(HGNC:AKT3)

Gene scaffolding would then be applied to p(HGNC:AKT1), p(HGNC:AKT2), p(HGNC:AKT3), and p(HGNC:RELA)
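
The difference between the default behavior and --expand-protein-families can be summarized in a small Python sketch. It is an illustration only: the membership table contains just the two families used in these examples, the function and parameter names are hypothetical, and the isA edges between activity terms are left out.

```python
# Family membership as shown in the examples; a real resource would be larger.
FAMILY_MEMBERS = {
    'p(PFH:"AKT Family")': ["p(HGNC:AKT1)", "p(HGNC:AKT2)", "p(HGNC:AKT3)"],
    'p(PFH:"MAPK JNK Family")': ["p(HGNC:MAPK8)", "p(HGNC:MAPK9)", "p(HGNC:MAPK10)"],
}


def family_edges(nodes, expand_from_members=False):
    """Return hasMember edges to add for the protein nodes in a network.

    Default: add members only when the family term itself is present.
    expand_from_members=True: also add the family (and all its members)
    when any single member is present.
    """
    nodes = set(nodes)
    edges = []
    for family, members in FAMILY_MEMBERS.items():
        family_present = family in nodes
        member_present = any(member in nodes for member in members)
        if family_present or (expand_from_members and member_present):
            edges.extend((family, "hasMember", member) for member in members)
    return edges


network = {"p(HGNC:AKT1)", "p(HGNC:RELA)"}
default_edges = family_edges(network)
expanded_edges = family_edges(network, expand_from_members=True)
```

With only a family member present, the default run adds nothing, while the expanded run pulls in the AKT Family and all three of its members, matching Example 3.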


Named Complex Expansion

• The compiler will automatically include named complex components when a named complex term is identified
  – Can be turned off using the --no-named-complexes switch
• The compiler can also search for named complexes to include when a named complex member is identified
  – Can be enabled using the --expand-named-complexes switch


Named Complex Expansion (Default Behavior)

kin(complex(NCH:"IkappaB Kinase Complex")) => \
    p(HGNC:NFKBIA, pmod(P,S,32))

becomes

kin(complex(NCH:"IkappaB Kinase Complex")) => \
    p(HGNC:NFKBIA, pmod(P,S,32))
complex(NCH:"IkappaB Kinase Complex") actsIn \
    kin(complex(NCH:"IkappaB Kinase Complex"))
p(HGNC:NFKBIA) hasModification p(HGNC:NFKBIA, pmod(P,S,32))
complex(NCH:"IkappaB Kinase Complex") hasComponent p(HGNC:CHUK)
complex(NCH:"IkappaB Kinase Complex") hasComponent p(HGNC:IKBKB)
complex(NCH:"IkappaB Kinase Complex") hasComponent p(HGNC:IKBKG)

Gene scaffolding would then be applied to p(HGNC:CHUK), p(HGNC:NFKBIA), p(HGNC:IKBKB), and p(HGNC:IKBKG)


BEL Framework Tools

• Found in the “tools” folder of the BEL Framework
• Two versions for each:
  – .cmd (Windows)
  – .sh (Linux, OS X)
• KamManager
  – Use with -h to get full options list
  – List KAMs in KAM store, export KAM to XGMML, delete KAM
• BelCheck
  – Check BEL document validity
• DocumentConverter
  – Convert between BEL Script and XBEL formats
• CacheManager
  – Manage cached resources
