Cross Product Extensions to the Gene Ontology
Chris Mungall
Gene Ontology Consortium
http://www.geneontology.org
Outline
• What the Gene Ontology is used for
  – GO structure
  – Limitations of text definitions
• Cross-product extensions to the GO
  – Logical computable definitions
• Results and Examples
  – Chemical entities, proteins, cells
  – Anatomy and development
  – Relations
  – Reasoning
• Release Plan
• Conclusions
A brief introduction to the GO
• Nearing 11th birthday
• 3 ontologies, 28k classes
  – Molecular Function (MF)
  – Biological Process (BP)
  – Cellular Component (CC)
• Annotations
  – 42m statements assigning function or localization to genes across 187k species
• Standard uses of GO annotation:
  – Navigating and querying functional annotations for genes
  – Discovery; term enrichment; semantic similarity
  – >50 tools for performing hi-throughput analysis using GO
• Most uses require a simple, lightly axiomatized graph
  – is_a
  – part_of
  – Definitions are textual
Problems and limitations
• Maintenance and errors
  – Combinatorial terms
  – Tangled polyhierarchies
• Denormalized
  – Redundancy
  – Lack of reuse
Solution: normalization + reasoning
• Prior work
  – Rector et al
  – Hill et al
• Retrospective normalization
  – GO preceded OBO
• How?
  – GONG, Wroe et al
  – Ogren et al
  – Obol
[Figure: cross product {metabolism, biosynthesis} x {sulfur amino acid, cysteine} = {sulfur amino acid metabolism, sulfur amino acid biosynthesis, cysteine metabolism, cysteine biosynthesis}]
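The cross product pictured on this slide can be sketched in a few lines of Python. This is only an illustration of the combinatorics: the naming template "<chemical> <process>" is an assumption made for the example, not the Obol grammar itself.

```python
from itertools import product

# The two small term sets shown on the slide.
processes = ["metabolism", "biosynthesis"]
chemicals = ["sulfur amino acid", "cysteine"]

# Hypothetical naming template: "<chemical> <process>".
combined = [f"{chemical} {process}"
            for chemical, process in product(chemicals, processes)]

for term in combined:
    print(term)
```

Two terms on each side already give four combined classes, which is why combinatorial terms are costly to maintain by hand.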
Assigning logical definitions to GO classes
• Logical definition structure
  – An X is a G that D
    • X: defined term
    • G: genus (parent) term
    • D: differentia(e) – discriminating relationships
  – Necessary and sufficient conditions
  – Computable definition should mirror text definition
• Simple formalism, limited expressivity
  – Equivalence axioms between named classes and positive conjunctions of named class and one or more existential restrictions
    • OBO principle of Positivity
  – General template:
    • EquivalentClasses(NamedClass intersectionOf(NamedGenus [someValuesFrom(NamedObjectProperty NamedDifferentiaClass)]+))
Example: mitochondrial translation
• ‘mitochondrial translation’ =def ‘translation’ that occurs_in ‘mitochondrion’
  – (current relationships in GO are necessary conditions only)

OBO
    id: GO:0032543
    name: mitochondrial translation
    intersection_of: GO:0006412 ! translation
    intersection_of: occurs_in GO:0005739 ! mitochondrion

FOL
    X instance_of ‘mitochondrial translation’ <-> X instance_of translation & exists C,t [C instance_of mitochondrion at t & X occurs_in C at t]

OWL Manchester syntax
    Class: ‘mitochondrial translation’
    EquivalentTo: translation AND occurs_in SOME mitochondrion
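A minimal sketch of how the OBO form maps onto the Manchester form: the genus is the intersection_of line with a bare class ID, and each differentia is an intersection_of line with a relation and a class ID. The function names (`parse_logical_definition`, `to_manchester`) are hypothetical and written for this example; they are not part of OBO-Edit or any GO tooling, and the parser handles only the tags shown on this slide.

```python
STANZA = """
[Term]
id: GO:0032543
name: mitochondrial translation
intersection_of: GO:0006412 ! translation
intersection_of: occurs_in GO:0005739 ! mitochondrion
"""

def parse_logical_definition(stanza):
    """Extract genus and differentiae from the intersection_of tags of one stanza."""
    result = {"id": None, "name": None, "genus": None, "differentiae": []}
    for raw in stanza.strip().splitlines():
        line = raw.strip()
        if ":" not in line:
            continue  # skips the [Term] header
        tag, _, value = line.partition(":")
        # Text after '!' is a human-readable comment carrying the label.
        value, _, comment = value.partition("!")
        value, label = value.strip(), comment.strip()
        if tag == "id":
            result["id"] = value
        elif tag == "name":
            result["name"] = value
        elif tag == "intersection_of":
            parts = value.split()
            if len(parts) == 1:            # bare class ID: the genus
                result["genus"] = (parts[0], label)
            else:                          # relation + class ID: a differentia
                result["differentiae"].append((parts[0], parts[1], label))
    return result

def quote(label):
    """Quote labels that contain spaces, as Manchester syntax requires."""
    return f"'{label}'" if " " in label else label

def to_manchester(definition):
    """Render genus and differentiae as an EquivalentTo class expression."""
    restrictions = [f"{rel} some {quote(label)}"
                    for rel, _class_id, label in definition["differentiae"]]
    expression = " and ".join([quote(definition["genus"][1])] + restrictions)
    return f"Class: {quote(definition['name'])}\n    EquivalentTo: {expression}"

parsed = parse_logical_definition(STANZA)
manchester = to_manchester(parsed)
print(manchester)
```

The sketch uses labels rather than IDs in the output purely for readability.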
Cross Product (XP) Sets
• GO has ~28k classes
  – Retrospective assignment of logical definitions is a lot of work
  – Divide work according to ontologies directly used
• Cross Product partitions
  – X ∈ <O1 x O2 x .. x On>
    • typically n=2
    • Genus taken from O1
    • Differentiae taken from O2..n
  – Example: BP:cysteine_biosynthesis ∈ <BP x CHEBI>
    • BP:biosynthesis that has_output CHEBI:cysteine
  – Each XP set has one or more templates
    • Obol grammars
  – http://wiki.geneontology.org/index.php/Category:Cross_Products
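As a rough illustration of the partitioning rule (genus from O1, differentiae from O2..n), the sketch below assigns each logical definition to an XP set by reading the ontology prefix of its terms. The prefixed names follow the slide's BP:cysteine_biosynthesis notation; the `xp_set` helper and the second and third sample entries are assumptions made for the example.

```python
from collections import defaultdict

def xp_set(genus, differentiae):
    """Name the XP set of a definition, e.g. 'BP x CHEBI'.

    genus is a prefixed term such as 'BP:biosynthesis';
    differentiae is a list of (relation, prefixed term) pairs.
    """
    ontologies = [genus.split(":", 1)[0]]            # O1 comes from the genus
    for _relation, filler in differentiae:
        prefix = filler.split(":", 1)[0]
        if prefix not in ontologies[1:]:             # list each differentia ontology once
            ontologies.append(prefix)
    return " x ".join(ontologies)

# Defined class -> (genus, differentiae). Illustrative entries only.
definitions = {
    "BP:cysteine_biosynthesis": ("BP:biosynthesis", [("has_output", "CHEBI:cysteine")]),
    "BP:mitochondrial_translation": ("BP:translation", [("occurs_in", "CC:mitochondrion")]),
    "CC:neuron_projection": ("CC:cell_projection", [("part_of", "CL:neuron")]),
}

partitions = defaultdict(list)
for defined_class, (genus, differentiae) in definitions.items():
    partitions[xp_set(genus, differentiae)].append(defined_class)

for name, members in partitions.items():
    print(name, members)
```

Grouping by XP set in this way is what lets the definition work be divided among the ontologies directly used.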
Results: Logical definitions per XP set
Genus
MF BP CC
MF 103 241 148
BP 4046 27
CC 634 289
cell 541 25
anatomy 692
chemical 7278 3072
protein 37
quality 0
sequence 66
RNA 0
13k classes have provisional logical definitions (46% of classes)
GO Class | Logical Definition | Genus Ontology | Differentia ontology(s)
S phase of mitotic cell cycle | S phase and part_of mitosis | BP | BP
mitochondrial translation | translation and occurs_in mitochondrion | BP | CC
Oocyte differentiation | cell differentiation and results_in_acquisition_of_features_of oocyte | BP | CL
Neural plate formation | anatomical structure formation and results_in_formation_of neural plate | BP | anatomy
Interleukin-1 biosynthesis | biosynthetic process and has_output interleukin-1 | BP | PRO
L-cysteine catabolic process to taurine | catabolic process and has_input L-cysteine and has_output taurine | BP | CHEBI
group I intron catabolic process | catabolic process and has_input group I intron | BP | SO/RNAO
GO Class | Logical Definition | Genus Ontology | Differentia ontology(s)
histone deacetylase complex | protein complex and has_function histone deacetylase activity | CC | MF
acrosomal membrane | membrane and surrounds acrosome | CC | CC
neuron projection | cell projection and part_of neuron | CC | CL
virion transport vesicle | transport vesicle and realizes vesicle transport | CC | BP
snoRNP binding | binding and results_in_binding_of snoRNP | MF | CC
methionine synthase activity | catalytic activity and has_input 5-methyltetrahydrofolate and has_input L-homocysteine and has_output tetrahydrofolate and has_output L-methionine | MF | CHEBI
Nested logical definitions
• Multiple differentiae and nested descriptions allowed
  – Only named classes used
  – Spans XP sets

GO Class | Logical Definition | Genus Ontology | Differentia ontology(s)
negative regulation of RNA metabolic process | biological process and has_participant RNA metabolic process | BP | BP
RNA metabolic process | metabolic process and has_participant RNA | BP | CHEBI
Development and anatomy
• Neural plate formation = anatomical structure formation and results_in_formation_of neural plate
  – GO annotations to xenopus, zebrafish, mouse
• Where is neural plate declared?
  – Developmental structures not in scope of FMA
  – Other choices:
    • EHDAA – mouse (TS1-26)
    • ZFA – zebrafish
    • TAO – teleost
    • XAO – xenopus
  – Gross anatomical ontologies are species- or taxon-centric
Uberon: a multi-species anatomy ontology
• GO contains an implicit anatomy ontology spanning multiple species
  – GO:0007423 ! sensory organ development
    • GO:0001654 ! eye development
      – GO:0043010 ! camera-type eye development
      – GO:0048749 ! compound eye development
• Normalized to form Uberon
  – Alignments with species-centric AOs
  – 3000 classes
  – See Poster
• Current XP partitioning:
  – Uberon [most metazoa]
  – PO [plants]
  – Others
    • Fungal anatomy ontology
    • Dictyostelium anatomy ontology
[Figure: hierarchy of sensory organ development, eye development, compound eye development and camera-type eye development]
Additional relations are required for full XP set
• Core RO
  – part_of, has_participant
• Spatial relations (CC x {CC, CL})
  – membranes, pores
  – adjacent_to, surrounds, perforates
• Participation relation subtypes
  – has_input, has_output
  – ‘macro’ defined relations
    • E.g. results_in_transport_{of,to,from}
Reasoning
• Reasoning used as part of ontology development cycle
  – batch mode
  – interactive in OBO-Edit2
  – pre-reasoned: inferred relationships are asserted
• Scalability
  – GO + XPs + Referenced ontologies = 130k classes
  – In-memory reasoners do not scale
  – http://wiki.geneontology.org/index.php/OBO-Edit:Reasoner_Benchmarks
  – Solutions:
    • Segmentation by XP set
    • CHEBI slim
    • RDBMS based reasoning
Reasoner results
• 1000s of links fixed over a number of years
• inconsistencies internal to GO fixed immediately
  – Fix hierarchy of defined class
  – Fix hierarchy of referenced class
    • abductive reasoning (Bada et al, OWLED 2008)
  – Fix logical definition
• inconsistencies external to GO take longer to be resolved
  – CL
  – CHEBI
BP x CHEBI example
[Figure: is_a hierarchies. CHEBI: carbohydrate, carbohydrate phosphates, nucleoside phosphates, nucleotides. GO: transport, carbohydrate transport, "nucleotide, nucleobase or nucleoside transport", nucleotide transport.]
carbohydrate transport =def transport and results_in_movement_of carbohydrate
nucleotide transport =def transport and results_in_movement_of nucleotide
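The two definitions on this slide show how a reasoner can propose is_a links between defined classes from the hierarchy of a referenced ontology. Below is a minimal structural-subsumption sketch, not the OBO-Edit reasoner. It assumes the CHEBI terms on the slide form a single is_a chain (nucleotide under nucleoside phosphate under carbohydrate phosphate under carbohydrate), uses singular labels throughout, and ignores relation hierarchies and nested expressions.

```python
def ancestors(term, is_a):
    """Reflexive, transitive closure of is_a for one term."""
    seen, stack = {term}, [term]
    while stack:
        for parent in is_a.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def subsumes(general, specific, is_a):
    """True if every condition of `general` is implied by `specific`.

    Each definition is (genus, [(relation, filler), ...]).
    """
    general_genus, general_diffs = general
    specific_genus, specific_diffs = specific
    if general_genus not in ancestors(specific_genus, is_a):
        return False
    return all(
        any(rel == s_rel and filler in ancestors(s_filler, is_a)
            for s_rel, s_filler in specific_diffs)
        for rel, filler in general_diffs
    )

# Assumed is_a chain, read off the figure labels (child -> parents).
is_a = {
    "nucleotide": ["nucleoside phosphate"],
    "nucleoside phosphate": ["carbohydrate phosphate"],
    "carbohydrate phosphate": ["carbohydrate"],
}

definitions = {
    "carbohydrate transport": ("transport", [("results_in_movement_of", "carbohydrate")]),
    "nucleotide transport": ("transport", [("results_in_movement_of", "nucleotide")]),
}

# Compare every ordered pair of defined classes.
inferred = [
    (sub, sup)
    for sub in definitions
    for sup in definitions
    if sub != sup and subsumes(definitions[sup], definitions[sub], is_a)
]
print(inferred)
```

Under these assumptions the sketch proposes that nucleotide transport is_a carbohydrate transport. Whether such a link is wanted depends on the referenced hierarchy, which is why inconsistencies external to GO can take longer to resolve.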
Release plan: basic and extended releases
• GO is currently available in two versions
  – gene_ontology: “standard”
    • is_a, part_of, intra-ontology regulates
    • intended for basic tools
  – gene_ontology_ext: “extended”
    • http://www.geneontology.org/GO.ontology-ext.relations.shtml
    • standard + other relations and axioms
      – disjoint_from
      – has_part (Aug 1 2009)
• XP sets currently available as separate bridge files
  – http://wiki.geneontology.org/index.php/Category:Cross_Products
  – will gradually migrate into gene_ontology_ext
Pre vs post composition
• Compose class descriptions
  – During ontology development cycle?
  – At the time of annotation?
• Logically equivalent…
  – Given computable definitions, reasoners can determine equivalency
• ..But very different from practical point of view
• GO guidelines
  – pre-compose classes for any type for which scientific generalizations can be made
    • Yes: mitochondrial translation
    • Yes: oocyte nucleus
    • No: nucleus of epithelium of left ear
  – Use post-composition to extend at annotation time
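To make the equivalence point concrete, here is a small sketch that checks whether a post-composed description matches an existing pre-composed class. It compares definitions syntactically (same genus, same set of differentiae), which is far weaker than a real reasoner. The `resolve` helper, the part_of definition given for oocyte nucleus, and the treatment of "epithelium of left ear" as a single filler are all assumptions made for the example.

```python
def normalize(genus, differentiae):
    """Order-independent key for a genus-differentia description."""
    return (genus, frozenset(differentiae))

# Pre-composed classes indexed by their logical definitions (illustrative).
precomposed = {
    normalize("translation", [("occurs_in", "mitochondrion")]): "mitochondrial translation",
    normalize("nucleus", [("part_of", "oocyte")]): "oocyte nucleus",
}

def resolve(genus, differentiae):
    """Return the pre-composed class equivalent to a post-composed description, if any."""
    return precomposed.get(normalize(genus, differentiae))

# A curator post-composes two descriptions at annotation time.
hit = resolve("nucleus", [("part_of", "oocyte")])
miss = resolve("nucleus", [("part_of", "epithelium of left ear")])
print(hit, miss)
```

The first description resolves to an existing class; the second has no pre-composed counterpart and stays post-composed, in line with the guideline above.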
Related work: weaving the fabric of the OBO Foundry
• Ontology for Biomedical Investigations (OBI)
• Phenotype Ontologies
  – Mammalian Phenotype
  – Human Phenotype
  – Worm Phenotype
  – Plant trait
• Environment ontology
• FMA
• Fly anatomy ontology
  – Neuronal subtype and sense organ logical definitions using CHEBI and GO
Future applications of cross-product sets
• Demonstrated utility as part of ontology development cycle
  – How do we evaluate?
  – but what about actual applications?
• How can logical definitions (and additional axiomatisation in general) help:
  – Search and discovery
  – Visualization and presentation to users
  – Curation
  – Improve function prediction
  – Database integration
    • E.g. pathway databases
  – Term enrichment
  – Semantic similarity
• Need to educate tool developers
Conclusions
• Normalizing retrospectively is hard
  – Prospective approach recommended
  – But redundancy in effort from alternative perspective can yield valuable information
• Many of the challenges are sociotechnological
  – What if the referenced ontology
    • does not yet exist?
    • exists but is unfunded?
    • is constructed according to different principles?
    • is incomplete?
    • ..or there is a choice of two competing ontologies?
  – The OBO Foundry process is crucial
• Grant challenge: more applications needed
Acknowledgments
• GO Ontology Developers
  – Midori Harris
  – Jane Lomax
  – Jen Deegan
  – Amelia Ireland
  – Tanya Berardini
  – David Hill
• Also
  – Mike Bada
  – Colin Batchelor
• OBO
  – Alan Ruttenberg
  – Barry Smith
  – Richard Scheuermann
• OBO Ontology developers
  – Alex Diehl (GO, Cell)
  – Janna Hastings (CHEBI)
  – Paula de Matos (CHEBI)
  – David Osumi-Sutherland (Fly)
  – Melissa Haendel (Zebrafish)
  – Darren Natale (PRO)
  – Karen Eilbeck (SO)
• OBO-Edit
  – Amina Abdulla
  – Nomi Harris
  – John Day-Richter
• GO PIs
  – Suzanna Lewis
  – Mike Cherry
  – Michael Ashburner
  – Judith Blake