Semantic assets and challenges of ontologies management Vasily.Bunakov <at> stfc.ac.uk Science and Technology Facilities Council, United Kingdom The EMMC IntOP2018 Workshop in Freiburg, 6-7 November 2018

Semantic assets and challenges of ontologies management · 2018-12-05 · Semantic assets and challenges of ontologies management Vasily.Bunakov stfc.ac.uk Science and

Semantic assets and challenges of ontologies management

Vasily.Bunakov <at> stfc.ac.uk
Science and Technology Facilities Council, United Kingdom

The EMMC IntOP2018 Workshop in Freiburg, 6-7 November 2018

TOC

• STFC and SCD background
• Semantic Assets for Materials Science Task Group
• Lessons from nano-foundries metadata design
• Lessons from elsewhere
• Suggestions on further communication

STFC and SCD background

STFC in a nutshell

~1700 permanent staff
~7500 visitor scientists annually

STFC Scientific Computing Department

Operate and develop IT infrastructure:
• High Performance Computing
• Petabyte data store
• CERN LHC Tier 1 hub
• Data management and data analysis solutions

Do computational science:
• Biology and Life Sciences
• Engineering and Environment
• Computational Chemistry
• Theoretical and Computational Physics

This is where I come from

See more at www.stfc.ac.uk/SCD

Physical Sciences Data Service

• Service to provide data resources to UK Chemistry and Materials Science Community
• Extend a current service: http://cds.rsc.org/
• Provide UK Academic access to commercial chemical databases

• University of Southampton and STFC taking over the service from Jan 2019
• Initially transferring the current service from the Royal Society of Chemistry

• Plan to develop this as a Data Science platform
• Develop it as a resource hub for Physical Sciences
• Extend from Chemistry, to include Materials Science, Chemical Engineering and other related areas

• More Open Science resources
• Provide added value – common metadata, cross search, access to software, training

• Computed (simulated) datasets are identified as a possible territory for the service growth
• The advent of more machine-usable interfaces is foreseen
• Relation with NIST important

Recent EU projects with the STFC SCD contribution

• EUDAT – research data infrastructure
• EOSC – European Open Science Cloud
• VIMMP (well represented in this workshop)
• NFFA – Nanoscience Foundries and Fine Analysis
• FREYA – persistent identifiers in support of Open Science

We also contribute to a number of RDA groups, notably Research data needs of the Photon and Neutron Science community IG and Vocabulary Services IG

Semantic Assets for Materials Science Task Group

Semantic Assets for Materials Science Task Group

• Devised in the RDA Berlin plenary (April 2018), as a result of discussions between STFC and NIST
• Set up within the RDA Vocabularies Interoperability IG
• First online meeting in May 2018, followed by meetings in July and September
• Very open and inclusive group
• ~25 in the mailing list, ~10-12 a typical attendance
• Vasily Bunakov (STFC) and Zachary Trautt (NIST) co-chair

Semantic Assets Task Group scope

• Building an inventory of existing semantic assets for Materials Science: ontologies, vocabularies, controlled terms lists, metadata schemes. This can include not only vocabularies about materials per se but also cover adjacent topics, say instrumentation and chemistry, that are highly relevant for Materials community.
• Monitoring technology for vocabularies building and vocabularies maintenance / updates / curation in Materials domain
• Monitoring use cases and actual practices for semantic assets application in Materials domain. This includes using them in the actual IT services.
• Discussing forms of representation / publishing for semantic assets
• Discussing interoperability between vocabularies: a possibility for cross-walks or sensible links between terms from different vocabularies

Semantic Assets Task Group progress so far

• A good communication channel with representation from Europe and America; liaison with Japan / NIMS requires development
• First experiments with semantic assets registration using NIST platform http://schemas.nist.gov/
• Work on a common vocabulary started
• Potential for the F2F meeting in the RDA Plenary in Philadelphia (April 2019)
• Moving from the RDA Vocabularies Interoperability IG to the RDA/CODATA Materials Data, Infrastructure & Interoperability IG is possible

Lessons from NFFA metadata design

NFFA in a nutshell

• Is a Horizon 2020 project
• Gives access to distributed infrastructure for growth, nano-lithography, nano-characterization, theory and simulation and fine-analysis with synchrotron, FEL and neutron radiation sources
• “Virtual research enterprise” with proposals system and data management obligation

See more at www.nffa.eu

“What artefacts we produce” and “How we discuss them”: Stages of NFFA metadata design

[Diagram: stages of NFFA metadata design]
Artefacts: Common vocabulary; ER diagram; List of MD elements; Metadata in a serialized form (XML, JSON, RDF, …)
Discussion labels: NFFA discussion and amendments; External discussion and amendments; NFFA iterative discussion
External inputs: Other metadata, vocabularies and ontologies (CODATA-VAMAS, NOMAD, RDA, …)

An example of a semantic asset: A fragment of NFFA Common Vocabulary

• Research User. A person, a group of them, or an institution (organization) who conduct Experiment on a nanoscience Facility using a nanoscience Instrument in order to collect and analyze Raw Data, or is interested in data collected or analyzed by other Research Users on the same or other Facilities.

• Project. An activity, or a series of activities performed by one or more Research Users on one or more Facilities using one or more Instruments for taking one or more Measurements of one or more Samples during one or more Experiments. Facility, Instrument, Measurement and Sample can refer to computer simulation environment.

• Facility. An institution (organization), or a division of it that operates one or more nanoscience Instruments for Research Users. For computer simulation, Facility can be a software platform that allows to order and manage computational experiments (so that the software platform serves the purpose of managing software modules that can be considered virtual Instruments).

• Instrument. Identifiable equipment (such as a device or a stand or a line) that allows conducting an independent nanoscience research, perhaps without involvement of other Instruments. Instrument is hosted by Facility and used by Research User. Instrument produces Raw Data in the course of Experiment. Instrument can be in fact a software for computer simulation (a software module or/and a particular configuration of it).
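The vocabulary terms on this slide relate to each other much like record types, so a minimal Python sketch may help to see them as a data structure. This is an illustration only, not the NFFA metadata model: the class fields (`name`, `title`, `is_software`), the `host` method and all example values are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResearchUser:
    """A person, group or institution conducting Experiments (field is hypothetical)."""
    name: str


@dataclass
class Instrument:
    """Identifiable equipment, or software used for computer simulation."""
    name: str
    is_software: bool = False  # hypothetical flag: True for a simulation code


@dataclass
class Facility:
    """An organization (or a software platform) that operates Instruments."""
    name: str
    instruments: List[Instrument] = field(default_factory=list)

    def host(self, instrument: Instrument) -> Instrument:
        # Mirrors "Instrument is hosted by Facility" from the vocabulary.
        self.instruments.append(instrument)
        return instrument


@dataclass
class Project:
    """Activities performed by Research Users on Facilities using Instruments."""
    title: str
    users: List[ResearchUser] = field(default_factory=list)
    facilities: List[Facility] = field(default_factory=list)

    def instruments(self) -> List[Instrument]:
        # All Instruments reachable through the Project's Facilities.
        return [i for f in self.facilities for i in f.instruments]


# Hypothetical example data.
lab = Facility("Example nanoscience lab")
lab.host(Instrument("Example microscope"))
lab.host(Instrument("Example simulation code", is_software=True))
project = Project(
    "Example project",
    users=[ResearchUser("A. Researcher")],
    facilities=[lab],
)
print([i.name for i in project.instruments()])
```

The `is_software` flag is one possible way to reflect the vocabulary's remark that an Instrument can be software for computer simulation; a separate subclass would be an equally reasonable choice.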

An example of a semantic asset: ER diagram for NFFA metadata components

“No model is an island”: Mapping and gap analysis exercise

Concepts mapping

NFFA concept | CODATA-VAMAS concept                 | NOMAD concept
Experiment   | Nano-object production steps         | Series of software runs
Measurement  | Nano-object testing steps            | Software run
Sample       | Nano-object or collection of objects | Input data
Data Asset   |                                      | Output data

Models coverage / gaps

Nanotechnology aspect | NFFA model | CODATA-VAMAS model | NOMAD model
Nano-object (sample)  | Conceptual | Detailed           | Detailed
Computation           | Detailed   | Unaddressed        | Detailed
Experiment lifecycle  | Detailed   | Conceptual         | Conceptual
Data lifecycle        | Detailed   | Unaddressed        | Conceptual
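A concept mapping of this kind can be held as a small crosswalk table and queried for gaps. The sketch below transcribes the concept rows from this slide; placing the "Data Asset" row with an empty CODATA-VAMAS cell is this sketch's reading of the flattened table, and the function names `translate` and `gaps` are hypothetical.

```python
from typing import Dict, List, Optional

# Crosswalk keyed by NFFA concept. None marks a cell with no counterpart;
# the column assignment of the "Data Asset" row is an assumption.
CROSSWALK: Dict[str, Dict[str, Optional[str]]] = {
    "Experiment": {
        "CODATA-VAMAS": "Nano-object production steps",
        "NOMAD": "Series of software runs",
    },
    "Measurement": {
        "CODATA-VAMAS": "Nano-object testing steps",
        "NOMAD": "Software run",
    },
    "Sample": {
        "CODATA-VAMAS": "Nano-object or collection of objects",
        "NOMAD": "Input data",
    },
    "Data Asset": {
        "CODATA-VAMAS": None,
        "NOMAD": "Output data",
    },
}


def translate(nffa_concept: str, target_model: str) -> Optional[str]:
    """Return the counterpart of an NFFA concept in another model, if any."""
    return CROSSWALK.get(nffa_concept, {}).get(target_model)


def gaps(target_model: str) -> List[str]:
    """List NFFA concepts that have no counterpart in the target model."""
    return [c for c, row in CROSSWALK.items() if row.get(target_model) is None]


print(translate("Sample", "NOMAD"))
print(gaps("CODATA-VAMAS"))
```

Keeping the gaps explicit (as `None`) rather than omitting the keys makes the gap analysis a one-line query instead of a comparison of key sets.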

“Why do we do it at all”: A place of metadata in a (virtual) Enterprise Architecture

Use Cases / Business Analysis

Metadata design

IT Architecture development

Use Cases, IT Architecture and Metadata can be considered parts of a (virtual) Enterprise Architecture
See more about Enterprise Architecture at https://en.wikipedia.org/wiki/Enterprise_architecture

Lessons from semantic modelling beyond Materials Science

Ontology for finance

200+ organizations
7000+ professionals

Business conceptual model of how all financial instruments, business entities and processes work in the financial industry

www.edmcouncil.org

https://spec.edmcouncil.org/fibo/

FIBO is a well-governed project started circa 2010 and supported by a well-fed world-wide organization

Ontology for finance (continued): FIBO structure vs FIBO teams

• FIBO Leadership Team (FLT)
• FIBO Process Team (FPT)
• FIBO Proof-of-Concept Teams
• FIBO Foundations (FND)
• FIBO Business Entities (BE)
• FIBO Financial Business & Commerce (FBC)
• FIBO Indices and Indicators (IND)
• FIBO Securities & Equities (SEC)
• FIBO Derivatives (DER)

12 vendors are reported so far as having implemented FIBO in their IT solutions. Not all parts of the model are currently covered by FIBO teams.

Ontology Maturity Model that informs FIBO development process

“The Ontology Maturity Model” by Leo Obrst, 2009 (inspired by CMM/CMMI model for business processes maturity)

(a kind of) Ontology favoured by social science data archives

An international standard for describing surveys, questionnaires, statistical data files, and social sciences study-level information

It took 18 years from the first codification of terms to the first (incomplete) semantic representation. The official serialization is still XML Schema.

www.ddialliance.org

Ontology for bibliography (one of a few out there)

• 1960s: MARC Standards developed
• 1971: MARC becomes a national standard in the US
• 1973: MARC becomes an international standard
• 2002: library technologist Roy Tennant argued that "MARC Must Die", as it is used only within the library community, and designed to be a display, rather than a storage or retrieval format
• 2008: a report from the Library of Congress stated that MARC is "based on forty-year old techniques for data management and is out of step with programming styles of today"
• 2012: the Library of Congress announced that it had contracted with Zepheira, a data management company, to develop a linked data alternative to MARC
• 2012: the library released a draft of the new model, named BIBFRAME
• 2016: the Library of Congress released version 2.0 of BIBFRAME

The actual experiment of transforming MARC records to Linked Data by four national libraries )*

)* As presented in MTSR 2018 conference by Prof. Christos Papatheodorou, Ionian University, Corfu, Greece
Detailed description of experiment: Tallerås, K. (2017). Quality of linked bibliographic data: The models, vocabularies, and links of data sets published by four national libraries. Journal of Library Metadata, 17(2), 126–155. https://doi.org/10.1080/19386389.2017.1355166

Linked Data by 4 national libraries continued (something about semantics and interoperability)

• 3 of 1,141 unique property and class terms are used by all 4 libraries (owl:sameAs, rdf:type, and dct:language)
• 13 terms by (sets of) 3 libraries
• 34 terms by (sets of) 2 libraries

Why these three?

Set     | Triples     | Entities   | Data-level constants
BNB     | 104,139,477 | 10,126,344 | 52,671,707
BNE     | 71,199,698  | 5,763,188  | 56,681,387
BNF     | 304,587,809 | 30,671,400 | 192,224,487
DNB     | 329,261,459 | 32,673,901 | 250,613,437
Average | 202,297,111 | 19,808,708 | 138,047,754

Picture credits: “Three stones of wisdom” by http://livertising.net/blog/2013/three-stones-of-wisdom-livertising-exam-concepts/
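The overlap figures quoted on this slide (terms used by all four libraries, by three, by two) are the kind of result a simple set computation yields. The sketch below shows that computation on a tiny, made-up sample: the four dataset names come from the slide, but the term sets are hypothetical and far smaller than the 1,141 terms in the cited study, and this is not the method of that study.

```python
from collections import Counter
from typing import Dict, Set

# Hypothetical term sets per library; only the three terms shared by all
# four are taken from the slide, the rest are made up for illustration.
terms_by_library: Dict[str, Set[str]] = {
    "BNB": {"owl:sameAs", "rdf:type", "dct:language", "dct:title", "foaf:name"},
    "BNE": {"owl:sameAs", "rdf:type", "dct:language", "rdfs:label"},
    "BNF": {"owl:sameAs", "rdf:type", "dct:language", "dct:title", "rdfs:label"},
    "DNB": {"owl:sameAs", "rdf:type", "dct:language", "dct:title"},
}


def shared_by_all(term_sets: Dict[str, Set[str]]) -> Set[str]:
    """Terms that occur in every library's vocabulary usage."""
    return set.intersection(*term_sets.values())


def usage_histogram(term_sets: Dict[str, Set[str]]) -> Counter:
    """Map 'number of libraries' to how many terms are used by exactly that many."""
    per_term = Counter(t for terms in term_sets.values() for t in terms)
    return Counter(per_term.values())


print(sorted(shared_by_all(terms_by_library)))
print(dict(usage_histogram(terms_by_library)))
```

On real data the same two functions would be fed with the property and class terms extracted from each library's published triples.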

Ontologies for biology )*

Rationale for ontologies repository
• Ontologies can be complex
• Ontologies can be big
• Ontologies can change

Ontology repository use cases
• Search for terms
• Querying the hierarchy
• Querying across relations

https://www.ebi.ac.uk/ols/index (as per 1 November 2018): 216 ontologies, 5,526,032 terms, 19,119 properties

)* Simon Jupp (EU Bioinformatics Institute, Cambridge, UK). Building a repository of biomedical ontologies with Neo4j. https://www.slideshare.net/thesimonjupp/building-a-repository-of-biomedical-ontologies-with-neo4j
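Two of the repository use cases named on this slide, searching for terms and querying the hierarchy, can be sketched with a plain in-memory graph. This is a toy illustration using only the Python standard library; it is not the OLS API and not Neo4j, and the term names and the `PARENTS` structure are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical "is a" hierarchy: child term -> list of parent terms.
PARENTS: Dict[str, List[str]] = {
    "hepatocyte": ["epithelial cell"],
    "epithelial cell": ["cell"],
    "neuron": ["cell"],
    "cell": [],
}


def search(text: str) -> List[str]:
    """Search for terms: case-insensitive substring match on term labels."""
    return sorted(t for t in PARENTS if text.lower() in t.lower())


def ancestors(term: str) -> Set[str]:
    """Querying the hierarchy: every term reachable by following parent links."""
    seen: Set[str] = set()
    stack = list(PARENTS.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def descendants(term: str) -> Set[str]:
    """The inverse query: every term that has `term` among its ancestors."""
    return {t for t in PARENTS if term in ancestors(t)}


print(search("cell"))
print(sorted(ancestors("hepatocyte")))
print(sorted(descendants("cell")))
```

A graph database makes the same traversals declarative and scalable; the point here is only that the hierarchy queries are reachability questions over parent links.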

Semantic modelling and technology with no RDF involved

Flexible MDM (Master Data Management) with graph database: https://neo4j.com/case-studies/schleich/
Picture credits: https://www.ebay.co.uk/usr/bargain-vapes

We may have learned something about semantic interoperability…

• Ontologies / semantic assets development takes substantial effort. Having a proper process may help
• Having different practices of application for the same semantic asset is normal
• Having multiple semantic assets for the same domain is normal
• Semantics can be expressed and exploited using various modelling techniques and IT solutions

…but there are other flavours of interoperability beyond semantics )*

Challenge                  | Popular response
Syntactic interoperability | Common terminology, common XML schemas
Technical interoperability | Configurable and well-governed software, well-specified APIs
Semantic interoperability  | Clear identification of all concepts, connections between them, and inference rules

)* For "layered" interpretation of these interoperability aspects, see Andreas Tolk et al. Composable M&S Web Services for Net-Centric Applications. The Journal of Defense Modeling and Simulation. Vol. 3(1), pp. 27-44 (2006). https://doi.org/10.1177/875647930600300104 - kindly indicated by Zachary Trautt (NIST)

… also interoperability is not the end in itself

• There is often a trade-off between interoperability and extensibility
• Use cases and success stories are important
• Tools and technology to support semantic modelling and models reuse are important – not only for IT infrastructure, but as a communication aid and as a means of discourse

(not mutually exclusive) Solutions for Interoperability and Reproducibility of data-intensive R&D

• Sensible governance and quality documentation for IT implementations
• Metadata exchange format or self-documented data exchange formats
• APIs specifications (can be self-documented, too)
• OO design frameworks with well-defined objects for a specific domain
• DSLs (domain-specific programming languages)
• Schema languages / specifications, including for RDF
• Ontologies
• Workflows (for a smaller number of well-defined objects compared to the OO design approach – perhaps just one common object) and engines for the workflows execution )*

FA?? -> FAIR

)* See Sean Bechhofer et al. “Why linked data is not enough for scientists”. https://doi.org/10.1016/j.future.2011.08.004 They refer to www.myexperiment.org as a platform for the new kind of research discourse empowered by workflows

(Relatively) new kid on the block: SHACL

https://www.w3.org/TR/shacl/

RDF: Statements: What is being said?
RDFS: What words do we have?
OWL: What makes logical sense to say?
SPARQL: What did you say about XYZ?
SHACL: Is that word used correctly? What do you need to know from me? You can't say that here! I'd never say that!

The diagram replicates the one in Richard Cyganiak’s 2016 presentation “SHACL: Shaping the Big Ball of Data Mud” https://www.slideshare.net/cygri/shacl-shaping-the-big-ball-of-data-mud
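The role this slide gives to SHACL ("You can't say that here!") is constraint checking of data against a declared shape. The sketch below imitates that idea with plain Python structures so that it runs without any RDF library. It is not SHACL syntax and not a SHACL engine: the `SHAPE` dictionary layout, the `validate` function, the `ex:` names and the three supported constraints are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, object]

# Hypothetical data graph as plain (subject, predicate, object) triples.
DATA: List[Triple] = [
    ("ex:sample1", "rdf:type", "ex:Sample"),
    ("ex:sample1", "ex:label", "Gold nanoparticles"),
    ("ex:sample2", "rdf:type", "ex:Sample"),
    ("ex:sample2", "ex:label", 42),  # wrong value type
    ("ex:sample3", "rdf:type", "ex:Sample"),  # label missing
]

# Hypothetical "shape": every ex:Sample needs exactly one string ex:label.
SHAPE: Dict[str, object] = {
    "target_class": "ex:Sample",
    "properties": {
        "ex:label": {"min_count": 1, "max_count": 1, "value_type": str},
    },
}


def validate(data: List[Triple], shape: Dict[str, object]) -> List[str]:
    """Return a list of human-readable violations (empty if data conforms)."""
    report: List[str] = []
    targets = sorted(
        s for s, p, o in data
        if p == "rdf:type" and o == shape["target_class"]
    )
    for node in targets:
        for path, rule in shape["properties"].items():
            values = [o for s, p, o in data if s == node and p == path]
            if len(values) < rule["min_count"]:
                report.append(f"{node}: too few values for {path}")
            if len(values) > rule["max_count"]:
                report.append(f"{node}: too many values for {path}")
            for value in values:
                if not isinstance(value, rule["value_type"]):
                    type_name = rule["value_type"].__name__
                    report.append(f"{node}: {path} value {value!r} is not a {type_name}")
    return report


for line in validate(DATA, SHAPE):
    print(line)
```

In actual SHACL the shape would itself be written in RDF and evaluated by a conforming validator; the sketch only shows why a validation layer answers a different question than RDFS or OWL do.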

Communication with a wider community of semantic modellers and technologists that can be beneficial for Materials Science

• Fintech / FIBO community can advise on quality governance for the ontology development. Look online, approach them directly, or I can see what I can do
• Bio-informaticians may be able to advise on management of multiple semantic assets, and on their actual use for indexing. Look online, ask EMBL-EBI (UK) – directly or using me as a proxy
• EUON (European Ontology Network) – only one workshop so far, supported by EUDAT project. If interested, ask Yann le Franc (co-chair of the RDA Vocabularies Interoperability IG) – directly or using me as a proxy
• There are pockets of European expertise in semantic modelling & visualization tools. If interested, ask Kārlis Čerāns (University of Latvia) – directly or using me as a proxy

Picture: FOAF (friend of a friend) ontology logo

Opportunities and goals for further discussions

• Semantic Assets for Materials Science task group in RDA (next call 28th November 14:00 CET)
• EMMC International Workshop in Vienna (February 2019)
• RDA groups and RDA plenary in Philadelphia (April 2019)
• DAMDID conference and a potential workshop on informatics for materials science in Kazan or Moscow (October 2019)
• Possible synergies between EMMC and Physical Sciences Data Service (with service vision developed through 2019)
• Future EU projects