The Pathway Tools Schema
SRI International Bioinformatics
Motivations for Understanding the Schema
Pathway Tools visualizations and analyses depend upon the software being able to find precise information in precise places within a Pathway/Genome DB
When writing complex queries to PGDBs, those queries must name classes and slots within the schema
A Pathway/Genome Database is a web of interconnected objects; each object represents a biological entity
Reference
Pathway Tools User’s Guide, Volume I Appendix A: Guide to the Pathway Tools Schema
Web of Relationships for One Enzyme
Polypeptides: Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2
Genes: sdhA, sdhB, sdhC, sdhD
Reaction: Succinate + FAD = fumarate + FADH2
Enzymatic-reaction
Protein complex: Succinate dehydrogenase
Pathway: TCA Cycle
Frame Data Model
Frame Data Model -- organizational structure for a PGDB
Knowledge base (KB, Database, DB)
Frames
Slots
Facets
Annotations
Knowledge Base
Collection of frames and their associated slots, values, facets, and annotations
AKA: Database, PGDB
Can be stored within:
An Oracle or MySQL DB
A disk file
The Pathway Tools binary program
Frames
Entities with which facts are associated
Kinds of frames:
Classes: Genes, Pathways, Biosynthetic Pathways
Instances (objects): trpA, TCA cycle
Classes have: superclass(es), subclass(es), instance(s)
A symbolic frame name (id, key) uniquely identifies each frame
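The frame concepts above can be sketched in a few lines of Python. This is only an illustration, not how Pathway Tools stores frames: the `Frame` and `KnowledgeBase` names, the `parents` field, the hyphenated ID `Biosynthetic-Pathways`, and the placement of TRPSYN-PWY under Biosynthetic Pathways are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    frame_id: str                 # symbolic name, unique within the KB
    is_class: bool = False        # True for a class, False for an instance
    parents: list = field(default_factory=list)  # superclasses, or classes of an instance
    slots: dict = field(default_factory=dict)


class KnowledgeBase:
    """Hypothetical in-memory KB: a dictionary of frames keyed by frame ID."""

    def __init__(self):
        self.frames = {}

    def add(self, frame):
        # The frame ID is the key, so a second frame with the same ID is rejected.
        if frame.frame_id in self.frames:
            raise ValueError(f"duplicate frame ID: {frame.frame_id}")
        self.frames[frame.frame_id] = frame
        return frame

    def is_a(self, frame_id, class_id):
        """True if frame_id is class_id or reaches it by following parent links."""
        stack, seen = [frame_id], set()
        while stack:
            current = stack.pop()
            if current == class_id:
                return True
            if current in seen:
                continue
            seen.add(current)
            stack.extend(self.frames[current].parents)
        return False


kb = KnowledgeBase()
kb.add(Frame("Genes", is_class=True))
kb.add(Frame("Pathways", is_class=True))
kb.add(Frame("Biosynthetic-Pathways", is_class=True, parents=["Pathways"]))
kb.add(Frame("trpA", parents=["Genes"]))
kb.add(Frame("TRPSYN-PWY", parents=["Biosynthetic-Pathways"]))  # assumed placement

print(kb.is_a("TRPSYN-PWY", "Pathways"))
print(kb.is_a("trpA", "Pathways"))
```

An instance inherits membership in every superclass of its class, which is why a query against Pathways also finds instances filed under a subclass.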
Slots
Encode attributes/properties of a frame: integer, real number, string
Represent relationships between frames: the value of a slot is the identifier of another frame
Every slot is described by a “slot frame” in a KB that defines meta information about that slot
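A minimal sketch of the two roles a slot can play, assuming a dictionary-based KB. The `SLOT_FRAMES` table stands in for slot frames, and its `value_type` key is a hypothetical piece of meta information invented for this example. The sketch shows why such metadata is useful: the string "sdhA" is both a common name and a frame ID, so the value alone does not say which role it plays.

```python
# Hypothetical slot frames: meta information recording whether a slot holds
# plain data or the identifiers of other frames.
SLOT_FRAMES = {
    "Common-Name": {"value_type": "string"},
    "Product": {"value_type": "frame"},
    "Component-Of": {"value_type": "frame"},
}

# Toy KB built from the succinate dehydrogenase example.
KB = {
    "sdhA": {"Common-Name": ["sdhA"], "Product": ["Sdh-flavo"]},
    "Sdh-flavo": {"Common-Name": ["Sdh-flavo"],
                  "Component-Of": ["Succinate dehydrogenase"]},
    "Succinate dehydrogenase": {"Common-Name": ["Succinate dehydrogenase"]},
}


def get_values(frame_id, slot):
    """Return the raw values stored in a slot (an empty list if unset)."""
    return KB[frame_id].get(slot, [])


def linked_frames(frame_id, slot):
    """Dereference a relationship slot: map each stored ID to its frame."""
    if SLOT_FRAMES[slot]["value_type"] != "frame":
        raise TypeError(f"{slot} is an attribute slot, not a relationship")
    return {value: KB[value] for value in get_values(frame_id, slot)}


print(get_values("sdhA", "Common-Name"))
print(list(linked_frames("sdhA", "Product")))
```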
Slot Links
Genes (sdhA, sdhB, sdhC, sdhD) -- product --> Polypeptides (Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2)
Polypeptides -- component-of --> Succinate dehydrogenase
Succinate dehydrogenase -- catalyzes --> Enzymatic-reaction
Enzymatic-reaction -- reaction --> Succinate + FAD = fumarate + FADH2
Succinate + FAD = fumarate + FADH2 -- in-pathway --> TCA Cycle
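The slot links named on this slide (product, component-of, catalyzes, reaction, in-pathway) can be followed one after another to get from a gene to its pathway. The sketch below assumes a dictionary-based KB; the `follow` helper is hypothetical, and the pairing of each gene with a specific polypeptide is inferred from the order of the labels on the slide.

```python
RXN = "Succinate + FAD = fumarate + FADH2"

# Toy KB: frame ID -> {slot name -> list of frame IDs}.
KB = {
    "sdhA": {"product": ["Sdh-flavo"]},
    "sdhB": {"product": ["Sdh-Fe-S"]},
    "sdhC": {"product": ["Sdh-membrane-1"]},
    "sdhD": {"product": ["Sdh-membrane-2"]},
    "Sdh-flavo": {"component-of": ["Succinate dehydrogenase"]},
    "Sdh-Fe-S": {"component-of": ["Succinate dehydrogenase"]},
    "Sdh-membrane-1": {"component-of": ["Succinate dehydrogenase"]},
    "Sdh-membrane-2": {"component-of": ["Succinate dehydrogenase"]},
    "Succinate dehydrogenase": {"catalyzes": ["Enzymatic-reaction"]},
    "Enzymatic-reaction": {"reaction": [RXN]},
    RXN: {"in-pathway": ["TCA Cycle"]},
    "TCA Cycle": {},
}


def follow(kb, start, path):
    """Follow a sequence of slot names from one frame; return the frames reached."""
    frontier = {start}
    for slot in path:
        frontier = {value
                    for frame_id in frontier
                    for value in kb.get(frame_id, {}).get(slot, [])}
    return frontier


GENE_TO_PATHWAY = ["product", "component-of", "catalyzes", "reaction", "in-pathway"]
print(follow(KB, "sdhA", GENE_TO_PATHWAY))
```

A complex query against a PGDB has the same shape: it names a starting class and then the slots to traverse, which is why the slot names have to be known in advance.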
Slots
Number of values: single valued, or multivalued (sets, bags)
Slot values: any Lisp object (integer, real, string, symbol (frame name))
Slotunits define properties of slots: datatypes, classes, constraints
Two slots are inverses if they encode opposite relationships
Example: slot Product in class Genes and slot Gene in class Polypeptides
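A minimal sketch of how a pair of inverse slots might be kept consistent, assuming a dictionary-based KB. The `INVERSES` table and the `add_value` helper are hypothetical, and pairing trpA with TRPA-MONOMER is only an illustration.

```python
# Hypothetical table of inverse slot pairs.
INVERSES = {"Product": "Gene", "Gene": "Product"}


def add_value(kb, frame_id, slot, value):
    """Add a value to a slot and mirror the link in the inverse slot, if any."""
    values = kb.setdefault(frame_id, {}).setdefault(slot, [])
    if value not in values:
        values.append(value)
    inverse = INVERSES.get(slot)
    if inverse is not None:
        # The value is itself a frame ID; record the link back to frame_id.
        back = kb.setdefault(value, {}).setdefault(inverse, [])
        if frame_id not in back:
            back.append(frame_id)


kb = {}
add_value(kb, "trpA", "Product", "TRPA-MONOMER")
print(kb["TRPA-MONOMER"]["Gene"])
```

Writing either side of the relationship updates both, so a query can start from the gene or from the polypeptide and reach the same link.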
Representation of Function
Polypeptides: Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2
Genes: sdhA, sdhB, sdhC, sdhD
Reaction: Succinate + FAD = fumarate + FADH2
Enzymatic-reaction
Protein complex: Succinate dehydrogenase
Pathway: TCA Cycle
Attributes stored on each kind of object:
Reaction: EC#, Keq
Enzymatic-reaction: Cofactors, Inhibitors
Protein: Molecular wt, pI
Gene: Left-end-position
Monofunctional Monomer
Gene --> Monomer --> Enzymatic-reaction --> Reaction --> Pathway
Bifunctional Monomer
Gene --> Monomer --> Enzymatic-reaction --> Reaction --> Pathway
Monomer --> Enzymatic-reaction --> Reaction (second function)
Monofunctional Multimer
Gene, Gene, Gene, Gene --> Monomer, Monomer, Monomer, Monomer (one monomer per gene)
Monomers --> Multimer --> Enzymatic-reaction --> Reaction --> Pathway
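The three cases above differ only in how many enzymatic reactions a protein is linked to and whether it has components. The sketch below classifies a protein on that basis, assuming a dictionary-based KB with Catalyzes and Components slots; the frame IDs and the `describe_enzyme` helper are hypothetical.

```python
# Toy KB with hypothetical frame IDs, one protein per case.
KB = {
    "monomer-A": {"Catalyzes": ["enzrxn-1"]},
    "monomer-B": {"Catalyzes": ["enzrxn-2", "enzrxn-3"]},
    "multimer-C": {
        "Catalyzes": ["enzrxn-4"],
        "Components": ["monomer-C1", "monomer-C2", "monomer-C3", "monomer-C4"],
    },
}

FUNCTION_LABELS = {0: "non-enzymatic", 1: "monofunctional", 2: "bifunctional"}


def describe_enzyme(kb, protein_id):
    """Return (functionality, structure) labels for one protein frame."""
    protein = kb[protein_id]
    n_functions = len(protein.get("Catalyzes", []))
    functionality = FUNCTION_LABELS.get(n_functions, "multifunctional")
    # A protein with any components is treated as a multimer in this sketch.
    structure = "multimer" if protein.get("Components") else "monomer"
    return functionality, structure


for protein_id in KB:
    print(protein_id, describe_enzyme(KB, protein_id))
```

Each function gets its own enzymatic-reaction frame, so a second function is added without touching the monomer or reaction frames.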
Pathway and Substrates
Pathway: Reaction, Reaction, Reaction, Reaction
Reaction -- in-pathway --> Pathway
Reaction -- left --> Reactant-1, Reactant-2
Reaction -- right --> Product-1, Product-2
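As a sketch of how the in-pathway, left, and right links combine, the function below collects every compound that appears on either side of any reaction in a pathway. It assumes a dictionary-based KB; the frame IDs Reaction-1, Reaction-2, Reaction-3, and Product-3 and the `pathway_substrates` helper are hypothetical.

```python
# Toy KB: two reactions in the pathway, one outside it.
KB = {
    "Reaction-1": {
        "in-pathway": ["Pathway"],
        "left": ["Reactant-1", "Reactant-2"],
        "right": ["Product-1", "Product-2"],
    },
    "Reaction-2": {
        "in-pathway": ["Pathway"],
        "left": ["Product-1"],
        "right": ["Product-3"],
    },
    "Reaction-3": {"in-pathway": [], "left": ["Other-1"], "right": ["Other-2"]},
}


def pathway_substrates(kb, pathway_id):
    """Union of left and right compounds over all reactions in the pathway."""
    substrates = set()
    for slots in kb.values():
        if pathway_id in slots.get("in-pathway", []):
            substrates.update(slots.get("left", []))
            substrates.update(slots.get("right", []))
    return substrates


print(sorted(pathway_substrates(KB, "Pathway")))
```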
Transcriptional Regulation
Transcription unit: trpLEDCBA
Genes: trpL, trpE, trpD, trpC, trpB, trpA
Promoter: pro001
Binding site: site001
Proteins and ligand: RpoSig70, apoTrpR, TrpR*trp, trp
Interactions: Int001, Int003, Int005
Principal Classes
Class names are capitalized, plural, separated by dashes
Genetic-Elements, with subclasses: Chromosomes, Plasmids
Genes
Transcription-Units
RNAs, with subclasses: rRNAs, snRNAs, tRNAs, Charged-tRNAs
Proteins, with subclasses: Polypeptides, Protein-Complexes
Principal Classes
Reactions, with subclasses: Transport-Reactions
Enzymatic-Reactions
Pathways
Compounds-And-Elements
Frame IDs of Instances
Instance frame ID conventions have evolved over time
Examples:
Pathways: TRPSYN-PWY, P23-PWY
Genes: AG10045
Monomers: TRPA-MONOMER, AG10045-MONOMER
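The example IDs suggest that a suffix can hint at the kind of object. The sketch below uses only the two suffixes visible in the examples; the `guess_kind` helper is hypothetical. Because the conventions have evolved over time, the result is a guess, and a frame's class in the KB remains the authoritative answer.

```python
# Suffix hints taken from the example IDs only; not an exhaustive list.
SUFFIX_HINTS = {
    "-PWY": "Pathways",
    "-MONOMER": "Monomers",
}


def guess_kind(frame_id):
    """Guess the kind of object from an ID suffix; None when no hint applies."""
    for suffix, kind in SUFFIX_HINTS.items():
        if frame_id.endswith(suffix):
            return kind
    return None


for frame_id in ["TRPSYN-PWY", "P23-PWY", "AG10045", "AG10045-MONOMER"]:
    print(frame_id, guess_kind(frame_id))
```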
Slots in Multiple Classes
Common-Name
Synonyms
Names (computed as union of Common-Name, Synonyms)
Comment
Citations
DB-Links
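A computed slot such as Names can be thought of as a function of other slots rather than stored data. The sketch below assumes a frame is a dictionary of slot values; the `names` helper and the synonym strings are hypothetical placeholders, not real EcoCyc data.

```python
def names(frame):
    """Union of Common-Name and Synonyms, common name first, no duplicates."""
    result = []
    for slot in ("Common-Name", "Synonyms"):
        for value in frame.get(slot, []):
            if value not in result:
                result.append(value)
    return result


# Placeholder values for illustration only.
gene = {
    "Common-Name": ["trpA"],
    "Synonyms": ["example-synonym-1", "trpA", "example-synonym-2"],
}
print(names(gene))
```

Searching against Names rather than Common-Name alone lets a query match an object under any name recorded for it.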
Genes Slots
Component-Of (links to replicon, transcription unit)
Left-End-Position
Right-End-Position
Centisome-Position
Transcription-Direction
Product
Proteins Slots
Molecular-Weight-Seq
Molecular-Weight-Exp
pI
Locations
Modified-Form
Unmodified-Form
Component-Of
Polypeptides Slots
Gene
Protein-Complexes Slots
Components
Reactions Slots
EC-Number
Left, Right
Substrates (computed as union of Left, Right)
DeltaG0
Keq
Spontaneous?
Enzymatic-Reactions Slots
Enzyme
Reaction
Activators
Inhibitors
Physiologically-Relevant
Cofactors
Prosthetic-Groups
Alternative-Substrates
Alternative-Cofactors
Pathways Slots
Reaction-List
Predecessors
Primaries
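One way to see why a pathway carries both a Reaction-List and Predecessors: the list says which reactions belong to the pathway, and the predecessor links say in what order they occur. The sketch below assumes each Predecessors value is a (reaction, predecessor) pair, which the slide does not specify; the reaction IDs R1 to R3 and the `order_reactions` helper are hypothetical. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def order_reactions(reaction_list, predecessors):
    """Order a pathway's reactions so each one follows its predecessors."""
    # Map every reaction to the set of reactions that must come before it.
    graph = {reaction: set() for reaction in reaction_list}
    for reaction, predecessor in predecessors:
        graph.setdefault(reaction, set()).add(predecessor)
    return list(TopologicalSorter(graph).static_order())


reaction_list = ["R3", "R1", "R2"]               # stored order carries no meaning here
predecessors = [("R2", "R1"), ("R3", "R2")]      # R1 precedes R2, R2 precedes R3
print(order_reactions(reaction_list, predecessors))
```

A cyclic pathway has no linear order, and `static_order` raises `graphlib.CycleError` in that case, so a real implementation would need a separate rule for choosing where a cycle starts.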