Presentation at I-Semantics 2012, Graz
monika.solanki@bcu.ac.uk, I-Semantics 2012, 5th September 2012, Graz
Realising the Potential of Algal Biomass Production through Semantic Web and Linked data
The LEAPS Framework
Monika Solanki, Knowledge Based Engineering Lab
Birmingham City University, UK
Joint work with Johannes Skarka
Karlsruhe Institute of Technology, ITAS
Outline
1 Motivation
2 Modelling Algal Biomass Knowledge
3 Lifting XML datasets to Linked data
4 System Architecture
5 Querying Linked Algal Biomass Data
6 Conclusion and Future work
Motivation
Algal biomass as biofuels
- Extensive research is being undertaken into the search for, and production of, naturally viable and sustainable energy sources.
- The idea that algal-biomass-based biofuels could serve as an alternative to fossil fuels has been embraced by councils across the globe.
- Major companies, government bodies and dedicated non-profit organisations are getting involved.
- The domain is a rich source of data, information and knowledge.
http://www.algalbiomass.org, http://www.eaba-association.eu, http://www.enalgae.eu
Algal biomass as biofuels: Observations
- No systematic analysis of the algae biomass potential for North-Western Europe
- Most of the knowledge is buried in various formats: images, spreadsheets, proprietary data sources and grey literature
- Lack of a knowledge-level infrastructure equipped with the capabilities to provide semantic grounding to the datasets for algal biomass
- Low levels of motivation among stakeholders for datasets to be interlinked, shared and reused within the biomass community
LEAPS: A Potential Solution
Linked Entities for Algal Plant Sites
- motivating the use of Semantic Web technologies and LOD for the algal biomass domain
- laying out a set of ontological requirements for knowledge representation that support the publication of algal biomass data
- elaborating on how algal biomass datasets are transformed to their corresponding RDF model representation
- interlinking the generated RDF datasets along spatial dimensions with other datasets on the Web of data
- visualising the linked datasets via an end-user LOD REST Web service
EnAlgae: Energetic Algae
- Aims to reduce CO2 emissions and dependency on unsustainable energy sources in North West Europe
- 4-year strategic initiative of the Interreg IVB NWE programme
- 19 partners and 14 observers across 7 EU states
- Coordinated set of activities focusing on sharing best practice, developing effective stakeholder engagement and encouraging transnational cooperation
http://www.enalgae.eu
EnAlgae: Some of the objectives
- Accelerate development of sustainable technologies for biomass production
- Create a network of pilot-scale algal facilities across NWE in order to address the current lack of verifiable information on algal productivity
- Maintain an up-to-date inventory in which pilots collect and share data in a standardised manner
- Combine information across the entire algal bioenergy delivery chain into a comprehensive and user-friendly Decision Support System for practitioners, policy makers and investors
http://www.enalgae.eu
SW and Linked data for Algal Biomass
- Algal biomass data manifests itself across several facets
- The value/supply chain ranges from cultivation of algae to production of biofuels and other products
- Cultivation, harvesting, processing and fuel production each involve several intermediate processes
- Every stage in the algal supply chain is governed by requirements, regulatory policies and strategies
- Each of the facets consumes and produces a large volume of unstructured data and information
Algal Supply Chain
SW, Linked data and the Algal Supply Chain
Competency questions for stage 1 datasets (data driven)
- Which are the algal operation sites with CO2 sources that have CO2 emissions of less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.
- Which are the top ten algal operation sites with the lowest impact on global warming potential?
- For a given algal operation site, which are the five most cost-effective combinations of light, water, nutrients and CO2 sources?
Modelling Algal Biomass Knowledge
Ontological requirements
Ontologies needed to represent:
- Spatiality: location of possible algae cultivation sites; location of the sources of consumables (CO2, nutrients and water)
- Geometries: area of the cultivation site (extents, polygons, linear and ring arrays)
- Units and measurements: conventional measurement units such as kg for quantities and hectares for area; bespoke units of measurement such as kg/hectare or kg/annum
- Territorial units for statistics: core concepts of the NUTS system
- Domain-specific knowledge: algae cultivation sites, CO2 sources, pipelines
Ontologies for Algal Biomass: Reuse
- Spatial data: WGS84, spatial relations, Geonames, NeoGeo
- Geometries: WGS84, extended NeoGeo
- Units and measurements: extended QUDT
http://www.w3.org/2003/01/geo/wgs84_pos
http://www.ordnancesurvey.co.uk/oswebsite/ontology/spatialrelations.owl
http://www.geonames.org/ontology/ontology_v2.2.1.rdf
http://geovocab.org/geometry
http://qudt.org/1.1/vocab/dimensionalunit
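The units-and-measurements requirement can be illustrated with a small sketch of a QUDT-style quantity structure, in which a measured property points to a quantity node, which in turn carries a numeric value and a unit. The property names quantityValue, numericValue and unit mirror those that appear in the query later in the deck; the namespace URIs, the site identifier and the unit name are placeholders assumed purely for illustration, not the project's actual vocabulary.

```python
from itertools import count

# Hypothetical namespaces: placeholders, not the project's real URIs.
SITE = "http://example.org/biomass/site#"
QUDT = "http://example.org/qudt#"
UNIT = "http://example.org/unit#"

_blank_ids = count(1)


def _blank():
    """Return a fresh blank-node label such as '_:b1'."""
    return f"_:b{next(_blank_ids)}"


def quantity_triples(subject, prop, value, unit):
    """Build the triples for one measured quantity.

    Shape: subject -prop-> quantity -quantityValue-> value node,
    and the value node carries numericValue and unit.
    """
    quantity, qvalue = _blank(), _blank()
    return [
        (subject, prop, quantity),
        (quantity, QUDT + "quantityValue", qvalue),
        (qvalue, QUDT + "numericValue", value),
        (qvalue, QUDT + "unit", unit),
    ]


# Areal yield of a made-up site, in a made-up "tonne per hectare" unit.
triples = quantity_triples(
    SITE + "site42", SITE + "hasArealYield", 35.0, UNIT + "TonnePerHectare"
)
for t in triples:
    print(t)
```

Keeping the number and the unit on the same node is what lets a query filter on the numeric value while still reporting the unit alongside it.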
Ontologies for Algal Biomass: Domain knowledge
- Ontologies for modelling spatial knowledge, units and measurements were reused
- Discovering vocabularies that conceptualise the domain knowledge for algal biomass was non-trivial
- Concepts and relationships for algal biomass had to be defined from the ground up, in accordance with the principles of ontology development
- The design was strongly guided by feedback from questionnaires made available to the stakeholders, interviews with domain experts and providers of raw datasets, and grey literature from the algal biomass and biofuels domain
Ontologies available at http://purl.org/biomass/ontologies
Designing URIs for Algal Biomass Data
Lifting XML datasets to Linked data
Lifting XML datasets to Linked data: Raw data
Lifting XML datasets to Linked data: First step
- The first part of the data processing, and the potential calculation, are performed in a GIS-based model developed for this purpose using ArcGIS
- Raw datasets of various origins and formats are transformed, using bespoke computational algorithms, to an ArcGIS-specific XML format. This step:
  - brings uniformity to the format in which the datasets are represented and to the process of transformation
  - performs important computations that are part of the final datasets
Lifting XML datasets to Linked data: Second step
- The original data sources had several limitations, and a one-to-one transformation was not possible:
  - The XML data sources related the biomass production sites and the CO2 sources via the pipeline dataset
  - In order to query for all sources that supplied CO2 to a specific site, the query would have to be made via the pipeline dataset
  - The site, source and NUTS identifiers in the datasets were string literals rather than URIs
- A bespoke parser that exploits XPath to selectively query the XML datasets and generate linked data was implemented
- It utilises a complex underlying data structure to facilitate the transformation
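The second step can be sketched roughly as follows, using the limited XPath support in Python's standard library. This is not the bespoke parser described above: the XML layout, the element names, the base URI and the property names (including the direct suppliedBy link from site to source) are all assumptions made for illustration. The sketch shows the two ideas the slide names, selecting values with XPath and replacing string identifiers with minted URIs.

```python
import xml.etree.ElementTree as ET

# Hypothetical base URI and XML layout; the real datasets are not shown in the deck.
BASE = "http://example.org/biomass/"

SAMPLE_XML = """
<pipelines>
  <pipeline id="P1"><siteID>S1</siteID><sourceID>C7</sourceID><totalCost>4200</totalCost></pipeline>
  <pipeline id="P2"><siteID>S1</siteID><sourceID>C9</sourceID><totalCost>6100</totalCost></pipeline>
</pipelines>
"""


def mint(kind, identifier):
    """Turn a string identifier into a URI, e.g. ('site', 'S1') -> .../site/S1."""
    return f"{BASE}{kind}/{identifier}"


def lift(xml_text):
    """Select pipeline records with XPath and emit (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    triples = []
    for pipe in root.findall("./pipeline"):
        pipe_uri = mint("pipeline", pipe.get("id"))
        site_uri = mint("site", pipe.findtext("siteID"))
        source_uri = mint("source", pipe.findtext("sourceID"))
        triples.append((pipe_uri, BASE + "hasSite", site_uri))
        triples.append((pipe_uri, BASE + "hasSource", source_uri))
        # Assumed shortcut: a direct site-to-source link, so that a site's
        # sources can be found without going through the pipeline record.
        triples.append((site_uri, BASE + "suppliedBy", source_uri))
        cost = float(pipe.findtext("totalCost"))
        triples.append((pipe_uri, BASE + "hasTotalCO2Cost", cost))
    return triples


triples = lift(SAMPLE_XML)
for t in triples:
    print(t)
```

Because identifiers become URIs at lifting time, the same site or source mentioned in different datasets resolves to the same resource, which is what makes later interlinking possible.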
Lifting XML datasets to Linked data: Storage and interlinking
- Four datasets were transformed and stored in distributed triple store repositories
- The NUTS regions dataset was available in RDF, but there was no SPARQL endpoint or service to query it
- We retrieved the dataset dump and curated it in our local triple store as a separate repository
- The transformed datasets interlinked resources defining sites, CO2 sources, pipelines, regions and NUTS data
System Architecture
Architecture: Main components
- Parsing modules: lifting the data from their original formats to RDF
- Ontologies
- Linking engine: producing the linked data representation of the datasets
- Triple store: OWLIM-SE 5.0
- REST Web services
- SPARQL endpoints
- Web interface
Querying Linked Algal Biomass Data
- Most queries over the datasets retrieve knowledge centred on location information
- The queries are federated across the various repositories holding the linked data
Representative query:
Which are the algal operation sites with CO2 sources that have CO2 emissions of less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.
Typical Query
    WHERE {
      SERVICE <http://localhost/repositories/biomass> {
        ?site a site:OperationSite ;
              site:inNUTSRegion ?region ;
              geo:location ?loc .
        ?loc geo:lat ?lat .
        ?loc geo:long ?long .
        ?site site:hasSiteID ?siteID ;
              site:hasArealYield ?z .
        ?z qudt:quantityValue ?y .
        ?y qudt:numericValue ?arealYield .
        ?y qudt:unit ?unit .
      }
      SERVICE <http://localhost/repositories/co2source> {
        ?source a co2:CO2Source ;
                co2:hasSourceID ?sourceID ;
                co2:hasCO2Emission ?emission .
        ?emission qudt:quantityValue ?emissionQty .
        ?emissionQty qudt:numericValue ?emissionValue .
      }
      SERVICE <http://localhost/repositories/pipeline> {
        ?pipe a pipe:Pipeline ;
              pipe:hasSiteID ?siteID ;
              pipe:hasSourceID ?sourceID ;
              pipe:hasTotalCO2Cost ?cost .
        ?cost qudt:quantityValue ?qty .
        ?qty qudt:numericValue ?totalCO2CostValue .
        ?qty qudt:unit ?totalCO2CostUnit .
      }
      SERVICE <http://localhost/repositories/region> {
        ?regionID a ramon:NUTSRegion ;
                  owl:sameAs ?related .
      }
      FILTER((?emissionValue < 130000)
          && (contains(str(?region), "UKM61"))
          && (?arealYield > 30)
          && (?totalCO2CostValue < 5000))
    }
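The logic of the query can be followed without a triple store: the pipeline records join sites to CO2 sources through the shared site and source identifiers, and the filter keeps only combinations that satisfy all four thresholds. The sketch below emulates that join and filter over in-memory records. The sample records and the region URIs are invented for illustration; only the thresholds and the field names are taken from the query.

```python
# Invented sample records standing in for the three repositories.
sites = [
    {"siteID": "S1", "region": "http://example.org/nuts/UKM61", "arealYield": 35.0},
    {"siteID": "S2", "region": "http://example.org/nuts/UKM61", "arealYield": 20.0},
    {"siteID": "S3", "region": "http://example.org/nuts/UKL12", "arealYield": 40.0},
]
sources = [
    {"sourceID": "C7", "emissionValue": 120000},
    {"sourceID": "C9", "emissionValue": 500000},
]
pipelines = [
    {"siteID": "S1", "sourceID": "C7", "totalCO2CostValue": 4200},
    {"siteID": "S1", "sourceID": "C9", "totalCO2CostValue": 3000},
    {"siteID": "S2", "sourceID": "C7", "totalCO2CostValue": 4000},
    {"siteID": "S3", "sourceID": "C7", "totalCO2CostValue": 1000},
]


def matching_sites(sites, sources, pipelines, nuts_code="UKM61"):
    """Join sites and sources via pipelines, then apply the query's FILTER conditions."""
    site_by_id = {s["siteID"]: s for s in sites}
    source_by_id = {c["sourceID"]: c for c in sources}
    results = []
    for pipe in pipelines:
        site = site_by_id.get(pipe["siteID"])
        source = source_by_id.get(pipe["sourceID"])
        if site is None or source is None:
            continue  # a pipeline with no matching site or source contributes nothing
        if (
            source["emissionValue"] < 130000
            and nuts_code in site["region"]  # mirrors contains(str(?region), "UKM61")
            and site["arealYield"] > 30
            and pipe["totalCO2CostValue"] < 5000
        ):
            results.append((site["siteID"], source["sourceID"]))
    return results


result = matching_sites(sites, sources, pipelines)
print(result)
```

With these sample records only the pair ("S1", "C7") passes: the other combinations fail on emissions, areal yield or region respectively.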
Related Efforts, Conclusions and Future Work
Related efforts
- AquaFuels: a taxonomy of algal strains, available as PDF
- BioEnerGIS: a GIS-based decision support tool, BIOPOLE, for biomass plants feeding district heating systems
- BioKDF: Bioenergy Knowledge Discovery Framework from the US Department of Energy
- Reegle: various energy-related datasets as linked open data, and a SPARQL endpoint to access the datasets
http://www.aquafuels.eu, http://www.bioenergis.eu, https://bioenergykdf.net, http://data.reegle.info
Conclusions
- Investigations into using algal biomass as an alternative source of fuel are gaining widespread momentum
- The algal biomass community currently does not employ any knowledge representation techniques to formalise and structure the valuable knowledge harnessed through its operations
- As research in the sector progresses, a wealth of information will become available that could be exploited by domain-specific applications
Summary
The LEAPS framework exploits SW and LD for the algal biomass community by:
- enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
- proposing a set of domain-specific ontologies for algal plant sites, CO2 sources and pipelines, to be shared and extended by the community
- defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation
- using a set of well-established and domain-specific ontologies as metadata to transform it further into linked data
- providing various data access options, such as a SPARQL endpoint, an interactive Google map interface and a REST API, for making the data accessible to stakeholders
Future Work
- Several other datasets need to be integrated once they become available; one of the core datasets is algal strains from Algaebase
- Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders
- Rule-based reasoning to model and infer domain-specific constraints
http://www.algaebase.org
Many Thanks
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Outline
1 Motivation
2 Modelling Algal Biomass Knowledge
3 Lifting XML datasets to Linked data
4 System Architecture
5 Querying Linked Algal Biomass Data
6 Conclusion and Future work
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Motivation
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Algal biomass as biofuels
Extensive research is being undertaken in the search andproduction of naturally viable and sustainable energysourcesThe idea that algae biomass based biofuels could serve asan alternative to fossil fuels has been embraced bycouncils across the globeMajor companies government bodies and dedicated nonprofit organisations are getting involvedThe domain is a rich source of datainformationknowledge
httpwwwalgalbiomassorghttpwwweaba-associationeu
httpwwwenalgaeeu
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Algal biomass as biofuels Observations
No systematic analysis of the algae biomass potential forNorth-Western EuropeMost of the knowledge buried in various formats of imagesspreadsheets proprietary data sources and grey literatureLack of a knowledge level infrastructure that is equippedwith the capabilities to provide semantic grounding to thedatasets for algal biomassLow levels of motivation among stakeholders for datasetsto be interlinked shared and reused within the biomasscommunity
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
LEAPS A Potential SolutionLinked Entities for Algal Plant Sites
motivate the use of Semantic Web technologies and LODfor the algal biomass domainlaying out a set of ontological requirements for knowledgerepresentation that support the publication of algalbiomass dataelaborating on how algal biomass datasets are transformedto their corresponding RDF model representationinterlinking the generated RDF datasets along spatialdimensions with other datasets on the Web of datavisualising the linked datasets via an end user LOD RESTWeb service
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
EnAlgae Energetic Algae
Aims to reduce CO2 emissions and dependency onunsustainable energy sources in North West Europe4 Year Strategic initiative of Interreg IVb NWE programme
19 partners and 14 Observers across 7 EU states
Coordinated set of activities focussing on sharing bestpractice developing effective stakeholder engagement andencouraging transnational cooperation
httpwwwenalgaeeu
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
EnAlgae Some of the objectives
Accelerate development of sustainable technologies forBiomass productionCreate a network of pilot scale algal facilities across NWEin order to address the current lack of verifiable informationon algal productivityMaintain an up to date inventory in which pilots collect andshare data in a standardised mannerCombine information across the entire algal bioenergydelivery chain into a comprehensive and user friendlyDecision Support System for practitioners policy makersand investors
httpwwwenalgaeeu
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
SW and Linked data for Algal Biomass
Algal biomass data manifests itself across several facetsThe valuesupply chain ranges from cultivation of algae toproduction of biofuels and other productsCultivation harvesting processing and fuel productionfurther involves several intermediate processesEvery stage in the algal supply chain is governed byrequirements regulatory policies and strategiesEach of the facets consumes and produces a large volumeof unstructured data and information
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Algal Supply Chain
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
SW Linked data and the Algal Supply Chain
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Competency questions for stage 1 datasetsData driven
Which are the algal operation sites with CO2 sources thathave CO2 emissions less than 130000 kgs where totalcosts of supplying CO2 is lower then 5000 GBP per ton ofCO2 areal yield is greater than 30 tons per hectare andwhich are located within the NUTS region ldquoUKM61rdquoSupplement the data with supporting information about theregionWhich are the top ten algal operation sites with the lowestimpact on global warming potentialFor a given algal operation site which are the first five mostcost effective combinations of light water nutrients andCO2 sources
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Modelling Algal BiomassKnowledge
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontological requirements
Ontologies needed to representSpatiality location of possible algae cultivation siteslocation of the sources of consumables (CO2 nutrientsand water)Geometries area of the cultivation site - extentspolygons linear and ring arraysUnits and Measurements conventional measurementunits such as Kgs for quantities and hectares for areabespoke units of measurements ie Kgshectare orKgsannumTerritorial units for statistics core concepts of the NUTSsystemDomain specific knowledge algae cultivation sites CO2sources pipelines
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Reuse
Spatial Data WGS84 spatial relations GeonamesNeoGeoGeometries WGS84 extended NeoGeoUnits and Measurements extended QUDT
httpwwww3org200301geowgs84_poshttpwwwordnancesurveycoukoswebsiteontology
spatialrelationsowlhttpwwwgeonamesorgontologyontology_v221rdf
httpgeovocaborggeometryhttpqudtorg11vocabdimensionalunit
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Reuse
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Domainknowledge
Ontologies for modelling spatial knowledge units andmeasurements were reusedDiscovering vocabularies conceptualising the domainknowledge for algal biomass was non trivialConcepts and relationships for algal biomass had to bedefined from ground-up in accordance to the principles ofontology developmentThe design was very strongly guided by feedback fromquestionnaires made available to the stakeholdersinterviews with domain experts providers of raw datasetsand grey literature from the algal biomass and biofuelsdomain
Ontologies for Algal Biomass Domainknowledge
Ontologies available at httppurlorgbiomassontologies
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Designing URIs for Algal Biomass Data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets toLinked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked dataRaw data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
First stepThe first part of the data processing and the potentialcalculation are performed in a GIS-based model which wasdeveloped for this purpose using ArcGISRaw datasets with various origins and formats -transformed using bespoke computational algorithms to anArchGIS specific XML format
brings uniformity in the format of representation of thedatasets and in the process of transformationimportant computations that are part of the final datasetsare performed
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
Second stepThe original data sources had several limitations and aone-to-one transformation was not possible
The XML data sources related the biomass production sitesand the CO2 sources via the pipeline datasetIn order to query for all sources that supplied CO2 to aspecific site the query would have to be made via thepipeline datasetThe site source and NUTS identifiers in the datasets werestring literals rather than URIs
A bespoke parser that exploits XPath to selectively querythe XML datasets and generate linked data wasimplementedIt utilises a complex underlying data structure to facilitatethe transformation
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
Four datasets were transformed and stored in distributedtriple store repositoriesThe NUTS regions dataset in RDF was available but therewas no SPARQL endpoint or service to query the datasetWe retrieved the dataset dump and curated it in our localtriple store as a separate repositoryThe transformed datasets interlinked resources definingsites CO2 sources pipelines regions and NUTS data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
System Architecture
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
System Architecture
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Architecture Main componentsParsing modules lifting the datafrom their original formats to RDF
Ontologies
Linking engine producing the linkeddata representation of the datasets
Triple store OWLIM SE 50
REST Web services
SPARQL endpoints
Web Interface
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Querying Linked Algal Biomass Data
Most queries over the datasets are based on retrievingknowledge centered around location informationThe queries are federated across the various repositoriesholding the linked dataRepresentative Query
Which are the algal operation sites with CO2 sources that haveCO2 emissions less than 130000 kgs where total costs ofsupplying CO2 is lower then 5000 GBP per ton of CO2 arealyield is greater than 30 tons per hectare and which are locatedwithin the NUTS region ldquoUKM61rdquo Supplement the data withsupporting information about the region
Typical QueryWHERE
SERVICE lthttplocalhostrepositoriesbiomassgt site a siteOperationSite
siteinNUTSRegion regiongeolocation loc locgeolat latloc geolong longsite sitehasSiteID siteIDsitehasArealYield zz qudtquantityValue yy qudtnumericValue arealYieldy qudtunit unit
SERVICE lthttplocalhostrepositoriesco2sourcegt source a co2CO2Source
co2hasSourceID sourceIDco2hasCO2Emission emissionemission qudtquantityValue emissionQtyemissionQty qudtnumericValue emissionValue
continued
Typical QuerySERVICE lthttplocalhostrepositoriespipelinegt pipe a pipePipeline
pipehasSiteID siteIDpipehasSourceID sourceIDpipehasTotalCO2Cost costcost qudtquantityValue qtyqty qudtnumericValue totalCO2CostValueqty qudtunit totalCO2CostUnit
SERVICE lthttplocalhostrepositoriesregiongt regionID a ramonNUTSRegion
owlsameAs relatedFILTER((emissionValue lt 130000)
ampamp (contains(str(region) UKM61))ampamp (arealYield gt 30)ampamp (totalCO2CostValue lt 5000))
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Related EffortsConclusions and
Future Work
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Related effortsAquaFuels a taxonomy of algal strains available as PDFBioEnerGIS a GIS based Decision support toolBIOPOLE for biomass plants feeding district heatingsystemsBioKDF Bioenergy knowledge discovery framework fromthe US department of EnergyReegle various energy related datasets as linked opendata and a SPARQL endpoint to access the datasets
httpwwwaquafuelseuhttpwwwbioenergiseuhttpsbioenergykdfnet
httpdatareegleinfo
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
ConclusionsInvestigations into using algal biomass as an alternativesource of fuel is gaining widespread momentumThe Algal biomass community currently does not employany knowledge representation techniques to formalise andstructure valuable knowledge harnessed through theiroperationsAs research in the sector progresses a wealth ofinformation will be available that could be exploited bydomain specific applications
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Summary
The LEAPS framework exploits SW and LD for the algalbiomass community
enabling the screening of data for promising individualplant sites and provides base data for more detailedplanning purposesproposing a set of domain specific ontologies for algalplant sites CO2 and pipelines to be shared and extendedby the communitydefining a linked data publishing architecture thattransforms raw data in disparate formats to a uniform XMLrepresentationusing a set of well established and domain specificontologies as metadata to transform it further into linkeddataproviding various data access options such as a SPARQLendpoint an interactive Google map interface and a RESTAPI for making the data accessible to stakeholders
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Future WorkSeveral other datasets need to be integrated once theybecome availableOne of the core datasets - algal strains from AlgaebaseMultifaceted visualisation of the integrated datasets tofacilitate the uptake of the framework by stakeholdersRule based reasoning to model and inference domainspecific constraints
httpwwwalgaebaseorg
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Many Thanks
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Motivation
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Algal biomass as biofuels
Extensive research is being undertaken in the search andproduction of naturally viable and sustainable energysourcesThe idea that algae biomass based biofuels could serve asan alternative to fossil fuels has been embraced bycouncils across the globeMajor companies government bodies and dedicated nonprofit organisations are getting involvedThe domain is a rich source of datainformationknowledge
httpwwwalgalbiomassorghttpwwweaba-associationeu
httpwwwenalgaeeu
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Algal biomass as biofuels Observations
No systematic analysis of the algae biomass potential forNorth-Western EuropeMost of the knowledge buried in various formats of imagesspreadsheets proprietary data sources and grey literatureLack of a knowledge level infrastructure that is equippedwith the capabilities to provide semantic grounding to thedatasets for algal biomassLow levels of motivation among stakeholders for datasetsto be interlinked shared and reused within the biomasscommunity
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
LEAPS A Potential SolutionLinked Entities for Algal Plant Sites
motivate the use of Semantic Web technologies and LODfor the algal biomass domainlaying out a set of ontological requirements for knowledgerepresentation that support the publication of algalbiomass dataelaborating on how algal biomass datasets are transformedto their corresponding RDF model representationinterlinking the generated RDF datasets along spatialdimensions with other datasets on the Web of datavisualising the linked datasets via an end user LOD RESTWeb service
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
EnAlgae Energetic Algae
Aims to reduce CO2 emissions and dependency onunsustainable energy sources in North West Europe4 Year Strategic initiative of Interreg IVb NWE programme
19 partners and 14 Observers across 7 EU states
Coordinated set of activities focussing on sharing bestpractice developing effective stakeholder engagement andencouraging transnational cooperation
httpwwwenalgaeeu
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
EnAlgae Some of the objectives
Accelerate development of sustainable technologies forBiomass productionCreate a network of pilot scale algal facilities across NWEin order to address the current lack of verifiable informationon algal productivityMaintain an up to date inventory in which pilots collect andshare data in a standardised mannerCombine information across the entire algal bioenergydelivery chain into a comprehensive and user friendlyDecision Support System for practitioners policy makersand investors
httpwwwenalgaeeu
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
SW and Linked data for Algal Biomass
Algal biomass data manifests itself across several facetsThe valuesupply chain ranges from cultivation of algae toproduction of biofuels and other productsCultivation harvesting processing and fuel productionfurther involves several intermediate processesEvery stage in the algal supply chain is governed byrequirements regulatory policies and strategiesEach of the facets consumes and produces a large volumeof unstructured data and information
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Algal Supply Chain
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
SW Linked data and the Algal Supply Chain
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Competency questions for stage 1 datasetsData driven
Which are the algal operation sites with CO2 sources thathave CO2 emissions less than 130000 kgs where totalcosts of supplying CO2 is lower then 5000 GBP per ton ofCO2 areal yield is greater than 30 tons per hectare andwhich are located within the NUTS region ldquoUKM61rdquoSupplement the data with supporting information about theregionWhich are the top ten algal operation sites with the lowestimpact on global warming potentialFor a given algal operation site which are the first five mostcost effective combinations of light water nutrients andCO2 sources
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Modelling Algal BiomassKnowledge
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontological requirements
Ontologies needed to representSpatiality location of possible algae cultivation siteslocation of the sources of consumables (CO2 nutrientsand water)Geometries area of the cultivation site - extentspolygons linear and ring arraysUnits and Measurements conventional measurementunits such as Kgs for quantities and hectares for areabespoke units of measurements ie Kgshectare orKgsannumTerritorial units for statistics core concepts of the NUTSsystemDomain specific knowledge algae cultivation sites CO2sources pipelines
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Reuse
Spatial Data WGS84 spatial relations GeonamesNeoGeoGeometries WGS84 extended NeoGeoUnits and Measurements extended QUDT
httpwwww3org200301geowgs84_poshttpwwwordnancesurveycoukoswebsiteontology
spatialrelationsowlhttpwwwgeonamesorgontologyontology_v221rdf
httpgeovocaborggeometryhttpqudtorg11vocabdimensionalunit
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Reuse
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass: Domain knowledge
- Ontologies for modelling spatial knowledge, units and measurements were reused
- Discovering vocabularies conceptualising the domain knowledge for algal biomass was non-trivial
- Concepts and relationships for algal biomass had to be defined from the ground up, in accordance with the principles of ontology development
- The design was strongly guided by feedback from questionnaires made available to the stakeholders, interviews with domain experts, providers of raw datasets, and grey literature from the algal biomass and biofuels domain
Ontologies available at http://purl.org/biomass/ontologies
Designing URIs for Algal Biomass Data
Lifting XML datasets to Linked data
Lifting XML datasets to Linked data: Raw data
Lifting XML datasets to Linked data
First step
- The first part of the data processing and the potential calculation are performed in a GIS-based model, which was developed for this purpose using ArcGIS
- Raw datasets with various origins and formats are transformed, using bespoke computational algorithms, to an ArcGIS-specific XML format
  - brings uniformity in the format of representation of the datasets and in the process of transformation
  - important computations that are part of the final datasets are performed
Lifting XML datasets to Linked data
Second step
- The original data sources had several limitations, and a one-to-one transformation was not possible
  - The XML data sources related the biomass production sites and the CO2 sources via the pipeline dataset
  - In order to query for all sources that supplied CO2 to a specific site, the query would have to be made via the pipeline dataset
  - The site, source and NUTS identifiers in the datasets were string literals rather than URIs
- A bespoke parser that exploits XPath to selectively query the XML datasets and generate linked data was implemented
- It utilises a complex underlying data structure to facilitate the transformation
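The lifting step can be sketched as follows, under stated assumptions. The slides do not show the ArcGIS-derived XML format, so the element and attribute names are invented. The base URI, the helper names `mint_uri` and `lift_sites`, and the property URIs are hypothetical too. The sketch uses the limited XPath subset in Python's standard `xml.etree.ElementTree` to select elements, and mints URIs from identifiers that would otherwise remain string literals.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Hypothetical XML layout; the real ArcGIS-specific format is not shown on the slides.
XML = """
<Sites>
  <Site id="S 001" nuts="UKM61"><ArealYield>31.5</ArealYield></Site>
  <Site id="S002" nuts="UKM62"><ArealYield>12.0</ArealYield></Site>
</Sites>
"""

BASE = "http://example.org/biomass/"   # assumption: placeholder base URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def mint_uri(kind, identifier):
    """Turn a string identifier into a URI instead of keeping it as a literal."""
    return f"{BASE}{kind}/{quote(identifier, safe='')}"


def lift_sites(xml_text, nuts_code=None):
    """Select Site elements with an XPath expression and emit N-Triples lines."""
    root = ET.fromstring(xml_text)
    path = ".//Site" if nuts_code is None else f".//Site[@nuts='{nuts_code}']"
    lines = []
    for site in root.findall(path):
        site_id = site.get("id")
        subject = mint_uri("site", site_id)
        region = mint_uri("nuts", site.get("nuts"))
        lines.append(f"<{subject}> <{RDF_TYPE}> <{BASE}ontology/OperationSite> .")
        lines.append(f"<{subject}> <{BASE}ontology/inNUTSRegion> <{region}> .")
        lines.append(f'<{subject}> <{BASE}ontology/hasSiteID> "{site_id}" .')
    return lines


all_lines = lift_sites(XML)
ukm61_lines = lift_sites(XML, nuts_code="UKM61")
print("\n".join(ukm61_lines))
```

Percent-encoding the identifier keeps the minted URI valid even when the raw identifier contains spaces, while the original string is preserved as a literal for joins on the ID.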
Lifting XML datasets to Linked data
- Four datasets were transformed and stored in distributed triple store repositories
- The NUTS regions dataset in RDF was available, but there was no SPARQL endpoint or service to query the dataset
- We retrieved the dataset dump and curated it in our local triple store as a separate repository
- The transformed datasets interlinked resources defining sites, CO2 sources, pipelines, regions and NUTS data
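Interlinking of this kind can be sketched as below, where records that only share string identifiers are turned into URI-to-URI links. Everything here is illustrative: the record fields, both base URIs and the `connectsSite` / `connectsSource` properties are hypothetical. The slides do not state which link properties LEAPS actually generates beyond `inNUTSRegion`, which appears in the query later in the deck.

```python
BASE = "http://example.org/biomass/"   # assumption: placeholder base URI
NUTS = "http://example.org/nuts/"      # assumption: stands in for the local NUTS repository

# Invented sample records that share only string identifiers.
sites = [
    {"site_id": "S001", "nuts": "UKM61"},
    {"site_id": "S002", "nuts": "UKM62"},
]
pipelines = [
    {"pipe_id": "P1", "site_id": "S001", "source_id": "C010"},
    {"pipe_id": "P2", "site_id": "S001", "source_id": "C011"},
    {"pipe_id": "P3", "site_id": "S002", "source_id": "C010"},
]


def interlink(site_records, pipeline_records):
    """Emit N-Triples lines that link sites to regions and pipelines to both ends."""
    lines = []
    for s in site_records:
        lines.append(
            f"<{BASE}site/{s['site_id']}> <{BASE}ontology/inNUTSRegion> "
            f"<{NUTS}{s['nuts']}> ."
        )
    for p in pipeline_records:
        pipe = f"<{BASE}pipeline/{p['pipe_id']}>"
        lines.append(f"{pipe} <{BASE}ontology/connectsSite> <{BASE}site/{p['site_id']}> .")
        lines.append(f"{pipe} <{BASE}ontology/connectsSource> <{BASE}co2source/{p['source_id']}> .")
    return lines


links = interlink(sites, pipelines)
print("\n".join(links))
```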
System Architecture
Architecture: Main components
- Parsing modules: lifting the data from their original formats to RDF
- Ontologies
- Linking engine: producing the linked data representation of the datasets
- Triple store: OWLIM-SE 5.0
- REST Web services
- SPARQL endpoints
- Web Interface
Querying Linked Algal Biomass Data
- Most queries over the datasets are based on retrieving knowledge centered around location information
- The queries are federated across the various repositories holding the linked data
Representative Query
Which are the algal operation sites with CO2 sources that have CO2 emissions less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.
Typical Query
WHERE {
  SERVICE <http://localhost/repositories/biomass> {
    ?site a site:OperationSite ;
          site:inNUTSRegion ?region ;
          geo:location ?loc .
    ?loc geo:lat ?lat .
    ?loc geo:long ?long .
    ?site site:hasSiteID ?siteID ;
          site:hasArealYield ?z .
    ?z qudt:quantityValue ?y .
    ?y qudt:numericValue ?arealYield .
    ?y qudt:unit ?unit .
  }
  SERVICE <http://localhost/repositories/co2source> {
    ?source a co2:CO2Source ;
            co2:hasSourceID ?sourceID ;
            co2:hasCO2Emission ?emission .
    ?emission qudt:quantityValue ?emissionQty .
    ?emissionQty qudt:numericValue ?emissionValue .
  }
  SERVICE <http://localhost/repositories/pipeline> {
    ?pipe a pipe:Pipeline ;
          pipe:hasSiteID ?siteID ;
          pipe:hasSourceID ?sourceID ;
          pipe:hasTotalCO2Cost ?cost .
    ?cost qudt:quantityValue ?qty .
    ?qty qudt:numericValue ?totalCO2CostValue .
    ?qty qudt:unit ?totalCO2CostUnit .
  }
  SERVICE <http://localhost/repositories/region> {
    ?regionID a ramon:NUTSRegion ;
              owl:sameAs ?related .
  }
  FILTER((?emissionValue < 130000)
      && (contains(str(?region), "UKM61"))
      && (?arealYield > 30)
      && (?totalCO2CostValue < 5000))
}
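The query joins the site, CO2 source and pipeline repositories on the shared site and source identifiers, and then applies the FILTER conditions. The sketch below reproduces that join and filter over small in-memory records, purely to illustrate the logic. The sample records and the function `matching_sites` are invented for illustration, and no SPARQL engine or LEAPS repository is involved.

```python
# Invented sample records standing in for the three repositories.
sites = [
    {"siteID": "S001", "region": "http://example.org/nuts/UKM61", "arealYield": 31.5},
    {"siteID": "S002", "region": "http://example.org/nuts/UKM61", "arealYield": 25.0},
    {"siteID": "S003", "region": "http://example.org/nuts/UKM62", "arealYield": 40.0},
]
sources = [
    {"sourceID": "C010", "emissionValue": 120000},
    {"sourceID": "C011", "emissionValue": 150000},
]
pipelines = [
    {"siteID": "S001", "sourceID": "C010", "totalCO2CostValue": 4200},
    {"siteID": "S001", "sourceID": "C011", "totalCO2CostValue": 3000},
    {"siteID": "S002", "sourceID": "C010", "totalCO2CostValue": 4000},
    {"siteID": "S003", "sourceID": "C010", "totalCO2CostValue": 1000},
]


def matching_sites(site_rows, source_rows, pipeline_rows, nuts_code="UKM61"):
    """Join pipelines to sites and sources by ID, then apply the query's filter."""
    site_by_id = {s["siteID"]: s for s in site_rows}
    source_by_id = {c["sourceID"]: c for c in source_rows}
    results = []
    for pipe in pipeline_rows:
        site = site_by_id.get(pipe["siteID"])
        source = source_by_id.get(pipe["sourceID"])
        if site is None or source is None:
            continue  # no join partner, so no solution for this pipeline
        if (source["emissionValue"] < 130000
                and nuts_code in site["region"]      # mirrors contains(str(?region), "UKM61")
                and site["arealYield"] > 30
                and pipe["totalCO2CostValue"] < 5000):
            results.append((site["siteID"], source["sourceID"]))
    return results


matches = matching_sites(sites, sources, pipelines)
print(matches)
```

With these sample records only the pair of site S001 and source C010 passes all four conditions. The other pipelines fail on emissions, areal yield or region respectively.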
Related Efforts, Conclusions and Future Work
Related efforts
- AquaFuels: a taxonomy of algal strains, available as PDF
- BioEnerGIS: a GIS based decision support tool, BIOPOLE, for biomass plants feeding district heating systems
- BioKDF: Bioenergy knowledge discovery framework from the US Department of Energy
- Reegle: various energy related datasets as linked open data and a SPARQL endpoint to access the datasets
http://www.aquafuels.eu
http://www.bioenergis.eu
https://bioenergykdf.net
http://data.reegle.info
Conclusions
- Investigation into using algal biomass as an alternative source of fuel is gaining widespread momentum
- The algal biomass community currently does not employ any knowledge representation techniques to formalise and structure valuable knowledge harnessed through their operations
- As research in the sector progresses, a wealth of information will be available that could be exploited by domain specific applications
Summary
The LEAPS framework exploits SW and LD for the algal biomass community:
- enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
- proposing a set of domain specific ontologies for algal plant sites, CO2 and pipelines, to be shared and extended by the community
- defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation
- using a set of well established and domain specific ontologies as metadata to transform it further into linked data
- providing various data access options, such as a SPARQL endpoint, an interactive Google map interface and a REST API, for making the data accessible to stakeholders
Future Work
- Several other datasets need to be integrated once they become available
  - One of the core datasets: algal strains from Algaebase
- Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders
- Rule based reasoning to model and infer domain specific constraints
http://www.algaebase.org
Many Thanks
EnAlgae: Energetic Algae
Aims to reduce CO2 emissions and dependency on unsustainable energy sources in North West Europe
4 Year Strategic initiative of Interreg IVb NWE programme
19 partners and 14 Observers across 7 EU states
Coordinated set of activities focussing on sharing best practice, developing effective stakeholder engagement and encouraging transnational cooperation
http://www.enalgae.eu
EnAlgae: Some of the objectives
Accelerate development of sustainable technologies for biomass production
Create a network of pilot scale algal facilities across NWE in order to address the current lack of verifiable information on algal productivity
Maintain an up to date inventory in which pilots collect and share data in a standardised manner
Combine information across the entire algal bioenergy delivery chain into a comprehensive and user friendly Decision Support System for practitioners, policy makers and investors
http://www.enalgae.eu
SW and Linked data for Algal Biomass
Algal biomass data manifests itself across several facets
The value/supply chain ranges from cultivation of algae to production of biofuels and other products
Cultivation, harvesting, processing and fuel production each involve several intermediate processes
Every stage in the algal supply chain is governed by requirements, regulatory policies and strategies
Each of the facets consumes and produces a large volume of unstructured data and information
Algal Supply Chain
SW, Linked data and the Algal Supply Chain
Competency questions for stage 1 datasets: Data driven
Which are the algal operation sites with CO2 sources that have CO2 emissions less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.
Which are the top ten algal operation sites with the lowest impact on global warming potential?
For a given algal operation site, which are the first five most cost effective combinations of light, water, nutrients and CO2 sources?
Modelling Algal Biomass Knowledge
Ontological requirements
Ontologies needed to represent
Spatiality: location of possible algae cultivation sites; location of the sources of consumables (CO2, nutrients and water)
Geometries: area of the cultivation site - extents, polygons, linear and ring arrays
Units and Measurements: conventional measurement units such as kg for quantities and hectares for area; bespoke units of measurement, e.g. kg/hectare or kg/annum
Territorial units for statistics: core concepts of the NUTS system
Domain specific knowledge: algae cultivation sites, CO2 sources, pipelines
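The units and measurements requirement can be illustrated with a small Python sketch. It mirrors the shape that the query later in the deck navigates (a quantity, then its value, then a numeric value and a unit), but the class names, the `per` helper and the sample figure are assumptions made for illustration and are not taken from the LEAPS ontologies or from QUDT itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A measurement unit identified by a short label (hypothetical model)."""
    label: str


@dataclass(frozen=True)
class QuantityValue:
    """A numeric value paired with the unit it is expressed in."""
    numeric_value: float
    unit: Unit


def per(numerator: Unit, denominator: Unit) -> Unit:
    """Build a bespoke derived unit such as kg/ha from two conventional units."""
    return Unit(f"{numerator.label}/{denominator.label}")


# Conventional units named on the slide.
KG = Unit("kg")
HECTARE = Unit("ha")
ANNUM = Unit("a")

# Bespoke units of the kind the slide mentions.
KG_PER_HECTARE = per(KG, HECTARE)
KG_PER_ANNUM = per(KG, ANNUM)

# Made-up sample value, for illustration only.
areal_yield = QuantityValue(30000.0, KG_PER_HECTARE)
print(areal_yield.numeric_value, areal_yield.unit.label)
```

Keeping the unit attached to every value is what lets a later filter compare like with like, for example a yield threshold that is only meaningful in one specific unit.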
Ontologies for Algal Biomass: Reuse
Spatial Data: WGS84, spatial relations, Geonames, NeoGeo
Geometries: WGS84, extended NeoGeo
Units and Measurements: extended QUDT
http://www.w3.org/2003/01/geo/wgs84_pos
http://www.ordnancesurvey.co.uk/oswebsite/ontology/spatialrelations.owl
http://www.geonames.org/ontology/ontology_v2.2.1.rdf
http://geovocab.org/geometry
http://qudt.org/1.1/vocab/dimensionalunit
Ontologies for Algal Biomass: Domain knowledge
Ontologies for modelling spatial knowledge, units and measurements were reused
Discovering vocabularies conceptualising the domain knowledge for algal biomass was non trivial
Concepts and relationships for algal biomass had to be defined from the ground up in accordance with the principles of ontology development
The design was very strongly guided by feedback from questionnaires made available to the stakeholders, interviews with domain experts, providers of raw datasets and grey literature from the algal biomass and biofuels domain
Ontologies available at http://purl.org/biomass/ontologies
Designing URIs for Algal Biomass Data
Lifting XML datasets to Linked data
Lifting XML datasets to Linked data: Raw data
Lifting XML datasets to Linked data: First step
The first part of the data processing and the potential calculation are performed in a GIS-based model which was developed for this purpose using ArcGIS
Raw datasets with various origins and formats are transformed using bespoke computational algorithms to an ArcGIS specific XML format
  this brings uniformity in the format of representation of the datasets and in the process of transformation
  important computations that are part of the final datasets are performed
Lifting XML datasets to Linked data: Second step
The original data sources had several limitations and a one-to-one transformation was not possible
  The XML data sources related the biomass production sites and the CO2 sources via the pipeline dataset
  In order to query for all sources that supplied CO2 to a specific site, the query would have to be made via the pipeline dataset
  The site, source and NUTS identifiers in the datasets were string literals rather than URIs
A bespoke parser that exploits XPath to selectively query the XML datasets and generate linked data was implemented
It utilises a complex underlying data structure to facilitate the transformation
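The second step can be sketched roughly as follows. This is not the LEAPS parser. It is a minimal Python illustration of the idea described above: select records with XPath, mint URIs from the string identifiers, and emit RDF triples. The XML layout, the `example.org` namespaces and the `mint_uri` and `lift_sites` helpers are all assumptions. Only the names `OperationSite`, `hasSiteID` and `inNUTSRegion` are borrowed from the query shown later in the deck.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces; the real LEAPS URIs are not given in the slides.
BASE = "http://example.org/biomass/"
SITE_NS = BASE + "ontology/site#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical XML layout standing in for the ArcGIS specific export.
SAMPLE_XML = """
<sites>
  <site><id>S1</id><nuts>UKM61</nuts></site>
  <site><id>S2</id><nuts>UKM62</nuts></site>
</sites>
"""


def mint_uri(kind, identifier):
    """Turn a string identifier into a URI so that other data can link to it."""
    return f"{BASE}{kind}/{identifier}"


def lift_sites(xml_text):
    """Select site records with an XPath expression and return N-Triples lines."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    for site in root.findall("./site"):
        site_id = site.findtext("id")
        nuts_code = site.findtext("nuts")
        subject = mint_uri("site", site_id)
        region = mint_uri("region", nuts_code)
        triples.append(f"<{subject}> <{RDF_TYPE}> <{SITE_NS}OperationSite> .")
        # The original identifier is kept as a literal for lookups.
        triples.append(f'<{subject}> <{SITE_NS}hasSiteID> "{site_id}" .')
        # The NUTS code becomes a URI instead of staying a string literal.
        triples.append(f"<{subject}> <{SITE_NS}inNUTSRegion> <{region}> .")
    return triples


for line in lift_sites(SAMPLE_XML):
    print(line)
```

Minting URIs at this stage is what later allows a site to be connected directly to a region resource, rather than matched on a string.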
Lifting XML datasets to Linked data
Four datasets were transformed and stored in distributed triple store repositories
The NUTS regions dataset in RDF was available, but there was no SPARQL endpoint or service to query the dataset
We retrieved the dataset dump and curated it in our local triple store as a separate repository
The transformed datasets interlinked resources defining sites, CO2 sources, pipelines, regions and NUTS data
System Architecture
Architecture: Main components
Parsing modules lifting the data from their original formats to RDF
Ontologies
Linking engine producing the linked data representation of the datasets
Triple store: OWLIM-SE 5.0
REST Web services
SPARQL endpoints
Web Interface
Querying Linked Algal Biomass Data
Most queries over the datasets are based on retrieving knowledge centered around location information
The queries are federated across the various repositories holding the linked data
Representative Query
Which are the algal operation sites with CO2 sources that have CO2 emissions less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.
Typical Query
    WHERE {
      SERVICE <http://localhost/repositories/biomass> {
        ?site a site:OperationSite ;
              site:inNUTSRegion ?region ;
              geo:location ?loc .
        ?loc geo:lat ?lat .
        ?loc geo:long ?long .
        ?site site:hasSiteID ?siteID ;
              site:hasArealYield ?z .
        ?z qudt:quantityValue ?y .
        ?y qudt:numericValue ?arealYield .
        ?y qudt:unit ?unit .
      }
      SERVICE <http://localhost/repositories/co2source> {
        ?source a co2:CO2Source ;
                co2:hasSourceID ?sourceID ;
                co2:hasCO2Emission ?emission .
        ?emission qudt:quantityValue ?emissionQty .
        ?emissionQty qudt:numericValue ?emissionValue .
      }
      SERVICE <http://localhost/repositories/pipeline> {
        ?pipe a pipe:Pipeline ;
              pipe:hasSiteID ?siteID ;
              pipe:hasSourceID ?sourceID ;
              pipe:hasTotalCO2Cost ?cost .
        ?cost qudt:quantityValue ?qty .
        ?qty qudt:numericValue ?totalCO2CostValue .
        ?qty qudt:unit ?totalCO2CostUnit .
      }
      SERVICE <http://localhost/repositories/region> {
        ?regionID a ramon:NUTSRegion ;
                  owl:sameAs ?related .
      }
      FILTER((?emissionValue < 130000)
        && (contains(str(?region), "UKM61"))
        && (?arealYield > 30)
        && (?totalCO2CostValue < 5000))
    }
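The logic of the federated query can be restated as a minimal Python sketch: sites and CO2 sources are joined through the pipeline records on their shared identifiers, and the four thresholds from the representative query are then applied. The record classes, the `matching_sites` function and the sample data are hypothetical and only illustrate the join. They do not reproduce how the triple store evaluates the query.

```python
from dataclasses import dataclass


@dataclass
class Site:
    site_id: str
    region: str          # region identifier that contains the NUTS code
    areal_yield: float   # tons per hectare


@dataclass
class Source:
    source_id: str
    emission: float      # kg of CO2


@dataclass
class Pipeline:
    site_id: str
    source_id: str
    total_co2_cost: float  # GBP per ton of CO2


def matching_sites(sites, sources, pipelines, nuts_code="UKM61",
                   max_emission=130000, max_cost=5000, min_yield=30):
    """Join sites and sources via pipelines, then apply the query's filters."""
    sites_by_id = {s.site_id: s for s in sites}
    sources_by_id = {s.source_id: s for s in sources}
    results = []
    for pipe in pipelines:
        site = sites_by_id.get(pipe.site_id)
        source = sources_by_id.get(pipe.source_id)
        if site is None or source is None:
            continue  # pipeline refers to a record that is not in the data
        if (source.emission < max_emission
                and nuts_code in site.region
                and site.areal_yield > min_yield
                and pipe.total_co2_cost < max_cost):
            results.append((site.site_id, source.source_id))
    return results


# Made-up records, for illustration only.
sites = [Site("S1", "http://example.org/region/UKM61", 32.0),
         Site("S2", "http://example.org/region/UKM61", 25.0)]
sources = [Source("C1", 120000.0), Source("C2", 200000.0)]
pipelines = [Pipeline("S1", "C1", 4000.0),
             Pipeline("S1", "C2", 3000.0),
             Pipeline("S2", "C1", 1000.0)]

print(matching_sites(sites, sources, pipelines))  # [('S1', 'C1')]
```

The pipeline record plays the same role here as the shared `siteID` and `sourceID` variables in the query: it is the only place where a site and a source are related.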
Related Efforts, Conclusions and Future Work
Related efforts
AquaFuels: a taxonomy of algal strains, available as PDF
BioEnerGIS: a GIS based Decision support tool BIOPOLE for biomass plants feeding district heating systems
BioKDF: Bioenergy knowledge discovery framework from the US Department of Energy
Reegle: various energy related datasets as linked open data and a SPARQL endpoint to access the datasets
http://www.aquafuels.eu
http://www.bioenergis.eu
https://bioenergykdf.net
http://data.reegle.info
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
SW and Linked data for Algal Biomass
Algal biomass data manifests itself across several facets.
The value/supply chain ranges from the cultivation of algae to the production of biofuels and other products.
Cultivation, harvesting, processing and fuel production each involve several intermediate processes.
Every stage in the algal supply chain is governed by requirements, regulatory policies and strategies.
Each of the facets consumes and produces a large volume of unstructured data and information.

Algal Supply Chain

SW, Linked data and the Algal Supply Chain

Competency questions for stage 1 datasets: data driven
Which are the algal operation sites with CO2 sources that have CO2 emissions less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.
Which are the top ten algal operation sites with the lowest impact on global warming potential?
For a given algal operation site, which are the first five most cost-effective combinations of light, water, nutrients and CO2 sources?

Modelling Algal Biomass Knowledge

Ontological requirements
Ontologies needed to represent:
Spatiality: location of possible algae cultivation sites; location of the sources of consumables (CO2, nutrients and water)
Geometries: area of the cultivation site (extents, polygons, linear and ring arrays)
Units and Measurements: conventional measurement units such as kg for quantities and hectares for area; bespoke units of measurement, i.e. kg/hectare or kg/annum
Territorial units for statistics: core concepts of the NUTS system
Domain specific knowledge: algae cultivation sites, CO2 sources, pipelines

Ontologies for Algal Biomass: Reuse
Spatial Data: WGS84, spatial relations, Geonames, NeoGeo
Geometries: WGS84, extended NeoGeo
Units and Measurements: extended QUDT
http://www.w3.org/2003/01/geo/wgs84_pos
http://www.ordnancesurvey.co.uk/oswebsite/ontology/spatialrelations.owl
http://www.geonames.org/ontology/ontology_v2.2.1.rdf
http://geovocab.org/geometry
http://qudt.org/1.1/vocab/dimensionalunit

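As a rough illustration of how a measurement with a bespoke unit can be attached to a site, the sketch below builds the chain of triples that the query later in the talk walks through (site, quantity, quantity value, numeric value and unit). The QUDT namespace is assumed to be the QUDT schema namespace; the SITE and UNIT namespaces, the unit name TonPerHectare and the blank-node naming are hypothetical placeholders, not the published LEAPS ontology terms.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Assumption: QUDT schema namespace. SITE and UNIT are hypothetical placeholders.
QUDT = "http://qudt.org/schema/qudt#"
SITE = "http://example.org/biomass/site#"
UNIT = "http://example.org/biomass/unit#"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"


def quantity_triples(subject: str, prop: str, value: float,
                     unit: str, key: str) -> List[Triple]:
    """Return the four triples that attach a numeric value and a unit to a subject."""
    quantity = f"_:quantity_{key}"   # blank node for the quantity
    qvalue = f"_:value_{key}"        # blank node for the quantity value
    literal = f'"{value}"^^<{XSD_DOUBLE}>'
    return [
        (subject, prop, quantity),
        (quantity, QUDT + "quantityValue", qvalue),
        (qvalue, QUDT + "numericValue", literal),
        (qvalue, QUDT + "unit", unit),
    ]


# Illustrative values only: a made-up site with an areal yield of 32.5 tons per hectare.
EXAMPLE = quantity_triples(SITE + "S17", SITE + "hasArealYield",
                           32.5, UNIT + "TonPerHectare", "S17_yield")
```

Keeping the numeric value and the unit on the same node means that a filter on the number can always be paired with the unit it was recorded in.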
Ontologies for Algal Biomass: Domain knowledge
Ontologies for modelling spatial knowledge, units and measurements were reused.
Discovering vocabularies that conceptualise the domain knowledge for algal biomass was non-trivial.
Concepts and relationships for algal biomass had to be defined from the ground up, in accordance with the principles of ontology development.
The design was strongly guided by feedback from questionnaires made available to the stakeholders, interviews with domain experts, providers of raw datasets, and grey literature from the algal biomass and biofuels domain.
Ontologies available at http://purl.org/biomass/ontologies

Designing URIs for Algal Biomass Data

Lifting XML datasets to Linked data

Lifting XML datasets to Linked data: Raw data

Lifting XML datasets to Linked data
First step:
The first part of the data processing and the potential calculation are performed in a GIS-based model, which was developed for this purpose using ArcGIS.
Raw datasets of various origins and formats are transformed, using bespoke computational algorithms, to an ArcGIS-specific XML format. This step:
  brings uniformity to the representation format of the datasets and to the transformation process
  performs important computations that are part of the final datasets

Lifting XML datasets to Linked data
Second step:
The original data sources had several limitations, and a one-to-one transformation was not possible:
  The XML data sources related the biomass production sites and the CO2 sources via the pipeline dataset.
  In order to query for all sources that supplied CO2 to a specific site, the query would have to be made via the pipeline dataset.
  The site, source and NUTS identifiers in the datasets were string literals rather than URIs.
A bespoke parser that exploits XPath to selectively query the XML datasets and generate linked data was implemented.
It utilises a complex underlying data structure to facilitate the transformation.

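The lifting step can be pictured with a minimal sketch, which is not the parser used in LEAPS. It uses the limited XPath support in Python's standard library to select pipeline records and mints URIs from the string-literal identifiers. The XML layout, the base URI and the property names connectsSite and connectsSource are all assumptions made for illustration, since the slides do not show the actual ArcGIS export or the ontology terms.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical XML layout; the real ArcGIS-specific format is not shown in the slides.
SAMPLE_XML = """
<pipelines>
  <pipeline id="P1"><siteID>S17</siteID><sourceID>C42</sourceID></pipeline>
  <pipeline id="P2"><siteID>S17</siteID><sourceID>C43</sourceID></pipeline>
  <pipeline id="P3"><siteID>S20</siteID><sourceID>C42</sourceID></pipeline>
</pipelines>
"""

# Hypothetical base URI used to mint resource and property URIs.
BASE = "http://example.org/biomass/"


def mint_uri(kind: str, literal_id: str) -> str:
    """Turn a string-literal identifier into a URI reference in N-Triples syntax."""
    return f"<{BASE}{kind}/{literal_id}>"


def lift_pipelines(xml_text: str) -> List[str]:
    """Select pipeline records via XPath and emit one N-Triples line per link."""
    root = ET.fromstring(xml_text)
    lines = []
    # XPath predicate: only pipeline elements that carry an id attribute.
    for pipe in root.findall("./pipeline[@id]"):
        pipeline = mint_uri("pipeline", pipe.get("id"))
        site = mint_uri("site", pipe.findtext("siteID"))
        source = mint_uri("co2source", pipe.findtext("sourceID"))
        lines.append(f"{pipeline} <{BASE}def/connectsSite> {site} .")
        lines.append(f"{pipeline} <{BASE}def/connectsSource> {source} .")
    return lines


NTRIPLES = lift_pipelines(SAMPLE_XML)
```

Because sites and sources now have URIs rather than literals, the same resources can be referenced from the site, CO2 source and pipeline repositories.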
Lifting XML datasets to Linked data
Four datasets were transformed and stored in distributed triple store repositories.
The NUTS regions dataset was available in RDF, but there was no SPARQL endpoint or service to query it.
We retrieved the dataset dump and curated it in our local triple store as a separate repository.
The transformed datasets interlinked resources defining sites, CO2 sources, pipelines, regions and NUTS data.

System Architecture
Architecture: Main components
Parsing modules: lifting the data from their original formats to RDF
Ontologies
Linking engine: producing the linked data representation of the datasets
Triple store: OWLIM-SE 5.0
REST Web services
SPARQL endpoints
Web Interface

Querying Linked Algal Biomass Data
Most queries over the datasets retrieve knowledge centred on location information.
The queries are federated across the various repositories holding the linked data.
Representative query:
Which are the algal operation sites with CO2 sources that have CO2 emissions less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.
Typical Query
    # Prefix declarations and the SELECT clause are not shown.
    WHERE {
      SERVICE <http://localhost/repositories/biomass> {
        ?site a site:OperationSite ;
              site:inNUTSRegion ?region ;
              geo:location ?loc .
        ?loc geo:lat ?lat .
        ?loc geo:long ?long .
        ?site site:hasSiteID ?siteID ;
              site:hasArealYield ?z .
        ?z qudt:quantityValue ?y .
        ?y qudt:numericValue ?arealYield .
        ?y qudt:unit ?unit .
      }
      SERVICE <http://localhost/repositories/co2source> {
        ?source a co2:CO2Source ;
                co2:hasSourceID ?sourceID ;
                co2:hasCO2Emission ?emission .
        ?emission qudt:quantityValue ?emissionQty .
        ?emissionQty qudt:numericValue ?emissionValue .
      }
      SERVICE <http://localhost/repositories/pipeline> {
        ?pipe a pipe:Pipeline ;
              pipe:hasSiteID ?siteID ;
              pipe:hasSourceID ?sourceID ;
              pipe:hasTotalCO2Cost ?cost .
        ?cost qudt:quantityValue ?qty .
        ?qty qudt:numericValue ?totalCO2CostValue .
        ?qty qudt:unit ?totalCO2CostUnit .
      }
      SERVICE <http://localhost/repositories/region> {
        ?regionID a ramon:NUTSRegion ;
                  owl:sameAs ?related .
      }
      FILTER((?emissionValue < 130000)
          && (contains(str(?region), "UKM61"))
          && (?arealYield > 30)
          && (?totalCO2CostValue < 5000))
    }

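The logic of the federated query can be mirrored in plain Python to make the join explicit: pipeline records connect sites and CO2 sources through the shared site and source identifiers, and the four filter conditions are then applied to each joined row. This is only a sketch of the join, not how the triple store evaluates the query. The record layout, the field names and all sample values are made up for illustration.

```python
from typing import Dict, List, Tuple


def matching_sites(sites: List[Dict], sources: List[Dict], pipelines: List[Dict],
                   region_code: str = "UKM61", max_emission: float = 130000,
                   min_yield: float = 30, max_cost: float = 5000) -> List[Tuple[str, str]]:
    """Join sites and CO2 sources via pipelines, then apply the query's filters."""
    site_by_id = {s["siteID"]: s for s in sites}
    source_by_id = {c["sourceID"]: c for c in sources}
    results = []
    for pipe in pipelines:
        site = site_by_id.get(pipe["siteID"])
        source = source_by_id.get(pipe["sourceID"])
        if site is None or source is None:
            continue  # pipeline refers to a record missing from another repository
        if (source["emission"] < max_emission
                and region_code in site["region"]      # mirrors contains(str(?region), ...)
                and site["arealYield"] > min_yield
                and pipe["totalCO2Cost"] < max_cost):
            results.append((site["siteID"], source["sourceID"]))
    return results


# Made-up records, for illustration only.
SITES = [
    {"siteID": "S1", "region": "http://example.org/nuts/UKM61", "arealYield": 35.0},
    {"siteID": "S2", "region": "http://example.org/nuts/UKM61", "arealYield": 25.0},
    {"siteID": "S3", "region": "http://example.org/nuts/UKL12", "arealYield": 40.0},
]
SOURCES = [
    {"sourceID": "C1", "emission": 120000},
    {"sourceID": "C2", "emission": 200000},
]
PIPELINES = [
    {"siteID": "S1", "sourceID": "C1", "totalCO2Cost": 4000},
    {"siteID": "S1", "sourceID": "C2", "totalCO2Cost": 3000},
    {"siteID": "S2", "sourceID": "C1", "totalCO2Cost": 1000},
    {"siteID": "S3", "sourceID": "C1", "totalCO2Cost": 1000},
]
```

With these sample records only the pair of site S1 and source C1 passes all four conditions; each of the other rows fails exactly one of them.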
Related Efforts, Conclusions and Future Work

Related efforts
AquaFuels: a taxonomy of algal strains, available as PDF
BioEnerGIS: a GIS-based decision support tool, BIOPOLE, for biomass plants feeding district heating systems
BioKDF: Bioenergy Knowledge Discovery Framework from the US Department of Energy
Reegle: various energy-related datasets as linked open data, and a SPARQL endpoint to access the datasets
http://www.aquafuels.eu
http://www.bioenergis.eu
https://bioenergykdf.net
http://data.reegle.info

Conclusions
Investigation into using algal biomass as an alternative source of fuel is gaining widespread momentum.
The algal biomass community currently does not employ any knowledge representation techniques to formalise and structure the valuable knowledge harnessed through its operations.
As research in the sector progresses, a wealth of information will become available that could be exploited by domain-specific applications.

Summary
The LEAPS framework exploits SW and LD for the algal biomass community by:
enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
proposing a set of domain-specific ontologies for algal plant sites, CO2 and pipelines, to be shared and extended by the community
defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation
using a set of well-established and domain-specific ontologies as metadata to transform it further into linked data
providing various data access options, such as a SPARQL endpoint, an interactive Google map interface and a REST API, for making the data accessible to stakeholders

Future Work
Several other datasets need to be integrated once they become available.
  One of the core datasets: algal strains from Algaebase
Multifaceted visualisation of the integrated datasets, to facilitate the uptake of the framework by stakeholders.
Rule-based reasoning to model and infer domain-specific constraints.
http://www.algaebase.org

Many Thanks
Algal Supply Chain
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
SW Linked data and the Algal Supply Chain
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Competency questions for stage 1 datasetsData driven
Which are the algal operation sites with CO2 sources thathave CO2 emissions less than 130000 kgs where totalcosts of supplying CO2 is lower then 5000 GBP per ton ofCO2 areal yield is greater than 30 tons per hectare andwhich are located within the NUTS region ldquoUKM61rdquoSupplement the data with supporting information about theregionWhich are the top ten algal operation sites with the lowestimpact on global warming potentialFor a given algal operation site which are the first five mostcost effective combinations of light water nutrients andCO2 sources
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Modelling Algal BiomassKnowledge
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontological requirements
Ontologies needed to representSpatiality location of possible algae cultivation siteslocation of the sources of consumables (CO2 nutrientsand water)Geometries area of the cultivation site - extentspolygons linear and ring arraysUnits and Measurements conventional measurementunits such as Kgs for quantities and hectares for areabespoke units of measurements ie Kgshectare orKgsannumTerritorial units for statistics core concepts of the NUTSsystemDomain specific knowledge algae cultivation sites CO2sources pipelines
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Reuse
Spatial Data WGS84 spatial relations GeonamesNeoGeoGeometries WGS84 extended NeoGeoUnits and Measurements extended QUDT
httpwwww3org200301geowgs84_poshttpwwwordnancesurveycoukoswebsiteontology
spatialrelationsowlhttpwwwgeonamesorgontologyontology_v221rdf
httpgeovocaborggeometryhttpqudtorg11vocabdimensionalunit
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Reuse
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Domainknowledge
Ontologies for modelling spatial knowledge units andmeasurements were reusedDiscovering vocabularies conceptualising the domainknowledge for algal biomass was non trivialConcepts and relationships for algal biomass had to bedefined from ground-up in accordance to the principles ofontology developmentThe design was very strongly guided by feedback fromquestionnaires made available to the stakeholdersinterviews with domain experts providers of raw datasetsand grey literature from the algal biomass and biofuelsdomain
Ontologies for Algal Biomass Domainknowledge
Ontologies available at httppurlorgbiomassontologies
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Designing URIs for Algal Biomass Data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets toLinked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked dataRaw data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
First stepThe first part of the data processing and the potentialcalculation are performed in a GIS-based model which wasdeveloped for this purpose using ArcGISRaw datasets with various origins and formats -transformed using bespoke computational algorithms to anArchGIS specific XML format
brings uniformity in the format of representation of thedatasets and in the process of transformationimportant computations that are part of the final datasetsare performed
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
Second stepThe original data sources had several limitations and aone-to-one transformation was not possible
The XML data sources related the biomass production sitesand the CO2 sources via the pipeline datasetIn order to query for all sources that supplied CO2 to aspecific site the query would have to be made via thepipeline datasetThe site source and NUTS identifiers in the datasets werestring literals rather than URIs
A bespoke parser that exploits XPath to selectively querythe XML datasets and generate linked data wasimplementedIt utilises a complex underlying data structure to facilitatethe transformation
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
Four datasets were transformed and stored in distributedtriple store repositoriesThe NUTS regions dataset in RDF was available but therewas no SPARQL endpoint or service to query the datasetWe retrieved the dataset dump and curated it in our localtriple store as a separate repositoryThe transformed datasets interlinked resources definingsites CO2 sources pipelines regions and NUTS data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
System Architecture
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
System Architecture
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Architecture Main componentsParsing modules lifting the datafrom their original formats to RDF
Ontologies
Linking engine producing the linkeddata representation of the datasets
Triple store OWLIM SE 50
REST Web services
SPARQL endpoints
Web Interface
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Querying Linked Algal Biomass Data
Most queries over the datasets are based on retrievingknowledge centered around location informationThe queries are federated across the various repositoriesholding the linked dataRepresentative Query
Which are the algal operation sites with CO2 sources that haveCO2 emissions less than 130000 kgs where total costs ofsupplying CO2 is lower then 5000 GBP per ton of CO2 arealyield is greater than 30 tons per hectare and which are locatedwithin the NUTS region ldquoUKM61rdquo Supplement the data withsupporting information about the region
Typical QueryWHERE
SERVICE lthttplocalhostrepositoriesbiomassgt site a siteOperationSite
siteinNUTSRegion regiongeolocation loc locgeolat latloc geolong longsite sitehasSiteID siteIDsitehasArealYield zz qudtquantityValue yy qudtnumericValue arealYieldy qudtunit unit
SERVICE lthttplocalhostrepositoriesco2sourcegt source a co2CO2Source
co2hasSourceID sourceIDco2hasCO2Emission emissionemission qudtquantityValue emissionQtyemissionQty qudtnumericValue emissionValue
continued
Typical QuerySERVICE lthttplocalhostrepositoriespipelinegt pipe a pipePipeline
pipehasSiteID siteIDpipehasSourceID sourceIDpipehasTotalCO2Cost costcost qudtquantityValue qtyqty qudtnumericValue totalCO2CostValueqty qudtunit totalCO2CostUnit
SERVICE lthttplocalhostrepositoriesregiongt regionID a ramonNUTSRegion
owlsameAs relatedFILTER((emissionValue lt 130000)
ampamp (contains(str(region) UKM61))ampamp (arealYield gt 30)ampamp (totalCO2CostValue lt 5000))
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Related EffortsConclusions and
Future Work
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Related effortsAquaFuels a taxonomy of algal strains available as PDFBioEnerGIS a GIS based Decision support toolBIOPOLE for biomass plants feeding district heatingsystemsBioKDF Bioenergy knowledge discovery framework fromthe US department of EnergyReegle various energy related datasets as linked opendata and a SPARQL endpoint to access the datasets
httpwwwaquafuelseuhttpwwwbioenergiseuhttpsbioenergykdfnet
httpdatareegleinfo
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
ConclusionsInvestigations into using algal biomass as an alternativesource of fuel is gaining widespread momentumThe Algal biomass community currently does not employany knowledge representation techniques to formalise andstructure valuable knowledge harnessed through theiroperationsAs research in the sector progresses a wealth ofinformation will be available that could be exploited bydomain specific applications
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Summary
The LEAPS framework exploits SW and LD for the algalbiomass community
enabling the screening of data for promising individualplant sites and provides base data for more detailedplanning purposesproposing a set of domain specific ontologies for algalplant sites CO2 and pipelines to be shared and extendedby the communitydefining a linked data publishing architecture thattransforms raw data in disparate formats to a uniform XMLrepresentationusing a set of well established and domain specificontologies as metadata to transform it further into linkeddataproviding various data access options such as a SPARQLendpoint an interactive Google map interface and a RESTAPI for making the data accessible to stakeholders
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Future WorkSeveral other datasets need to be integrated once theybecome availableOne of the core datasets - algal strains from AlgaebaseMultifaceted visualisation of the integrated datasets tofacilitate the uptake of the framework by stakeholdersRule based reasoning to model and inference domainspecific constraints
httpwwwalgaebaseorg
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Many Thanks
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
SW Linked data and the Algal Supply Chain
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Competency questions for stage 1 datasetsData driven
Which are the algal operation sites with CO2 sources thathave CO2 emissions less than 130000 kgs where totalcosts of supplying CO2 is lower then 5000 GBP per ton ofCO2 areal yield is greater than 30 tons per hectare andwhich are located within the NUTS region ldquoUKM61rdquoSupplement the data with supporting information about theregionWhich are the top ten algal operation sites with the lowestimpact on global warming potentialFor a given algal operation site which are the first five mostcost effective combinations of light water nutrients andCO2 sources
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Modelling Algal BiomassKnowledge
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontological requirements
Ontologies needed to representSpatiality location of possible algae cultivation siteslocation of the sources of consumables (CO2 nutrientsand water)Geometries area of the cultivation site - extentspolygons linear and ring arraysUnits and Measurements conventional measurementunits such as Kgs for quantities and hectares for areabespoke units of measurements ie Kgshectare orKgsannumTerritorial units for statistics core concepts of the NUTSsystemDomain specific knowledge algae cultivation sites CO2sources pipelines
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Reuse
Spatial Data WGS84 spatial relations GeonamesNeoGeoGeometries WGS84 extended NeoGeoUnits and Measurements extended QUDT
httpwwww3org200301geowgs84_poshttpwwwordnancesurveycoukoswebsiteontology
spatialrelationsowlhttpwwwgeonamesorgontologyontology_v221rdf
httpgeovocaborggeometryhttpqudtorg11vocabdimensionalunit
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Reuse
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Domainknowledge
Ontologies for modelling spatial knowledge units andmeasurements were reusedDiscovering vocabularies conceptualising the domainknowledge for algal biomass was non trivialConcepts and relationships for algal biomass had to bedefined from ground-up in accordance to the principles ofontology developmentThe design was very strongly guided by feedback fromquestionnaires made available to the stakeholdersinterviews with domain experts providers of raw datasetsand grey literature from the algal biomass and biofuelsdomain
Ontologies for Algal Biomass Domainknowledge
Ontologies available at httppurlorgbiomassontologies
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Designing URIs for Algal Biomass Data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets toLinked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked dataRaw data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
First stepThe first part of the data processing and the potentialcalculation are performed in a GIS-based model which wasdeveloped for this purpose using ArcGISRaw datasets with various origins and formats -transformed using bespoke computational algorithms to anArchGIS specific XML format
brings uniformity in the format of representation of thedatasets and in the process of transformationimportant computations that are part of the final datasetsare performed
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
Second stepThe original data sources had several limitations and aone-to-one transformation was not possible
The XML data sources related the biomass production sitesand the CO2 sources via the pipeline datasetIn order to query for all sources that supplied CO2 to aspecific site the query would have to be made via thepipeline datasetThe site source and NUTS identifiers in the datasets werestring literals rather than URIs
A bespoke parser that exploits XPath to selectively querythe XML datasets and generate linked data wasimplementedIt utilises a complex underlying data structure to facilitatethe transformation
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
Four datasets were transformed and stored in distributedtriple store repositoriesThe NUTS regions dataset in RDF was available but therewas no SPARQL endpoint or service to query the datasetWe retrieved the dataset dump and curated it in our localtriple store as a separate repositoryThe transformed datasets interlinked resources definingsites CO2 sources pipelines regions and NUTS data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
System Architecture
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
System Architecture
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Architecture Main componentsParsing modules lifting the datafrom their original formats to RDF
Ontologies
Linking engine producing the linkeddata representation of the datasets
Triple store OWLIM SE 50
REST Web services
SPARQL endpoints
Web Interface
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Querying Linked Algal Biomass Data
Most queries over the datasets are based on retrievingknowledge centered around location informationThe queries are federated across the various repositoriesholding the linked dataRepresentative Query
Which are the algal operation sites with CO2 sources that haveCO2 emissions less than 130000 kgs where total costs ofsupplying CO2 is lower then 5000 GBP per ton of CO2 arealyield is greater than 30 tons per hectare and which are locatedwithin the NUTS region ldquoUKM61rdquo Supplement the data withsupporting information about the region
Typical QueryWHERE
SERVICE lthttplocalhostrepositoriesbiomassgt site a siteOperationSite
siteinNUTSRegion regiongeolocation loc locgeolat latloc geolong longsite sitehasSiteID siteIDsitehasArealYield zz qudtquantityValue yy qudtnumericValue arealYieldy qudtunit unit
SERVICE lthttplocalhostrepositoriesco2sourcegt source a co2CO2Source
co2hasSourceID sourceIDco2hasCO2Emission emissionemission qudtquantityValue emissionQtyemissionQty qudtnumericValue emissionValue
continued
Typical QuerySERVICE lthttplocalhostrepositoriespipelinegt pipe a pipePipeline
pipehasSiteID siteIDpipehasSourceID sourceIDpipehasTotalCO2Cost costcost qudtquantityValue qtyqty qudtnumericValue totalCO2CostValueqty qudtunit totalCO2CostUnit
SERVICE lthttplocalhostrepositoriesregiongt regionID a ramonNUTSRegion
owlsameAs relatedFILTER((emissionValue lt 130000)
ampamp (contains(str(region) UKM61))ampamp (arealYield gt 30)ampamp (totalCO2CostValue lt 5000))
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Related EffortsConclusions and
Future Work
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Related effortsAquaFuels a taxonomy of algal strains available as PDFBioEnerGIS a GIS based Decision support toolBIOPOLE for biomass plants feeding district heatingsystemsBioKDF Bioenergy knowledge discovery framework fromthe US department of EnergyReegle various energy related datasets as linked opendata and a SPARQL endpoint to access the datasets
httpwwwaquafuelseuhttpwwwbioenergiseuhttpsbioenergykdfnet
httpdatareegleinfo
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
ConclusionsInvestigations into using algal biomass as an alternativesource of fuel is gaining widespread momentumThe Algal biomass community currently does not employany knowledge representation techniques to formalise andstructure valuable knowledge harnessed through theiroperationsAs research in the sector progresses a wealth ofinformation will be available that could be exploited bydomain specific applications
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Summary
The LEAPS framework exploits SW and LD for the algalbiomass community
enabling the screening of data for promising individualplant sites and provides base data for more detailedplanning purposesproposing a set of domain specific ontologies for algalplant sites CO2 and pipelines to be shared and extendedby the communitydefining a linked data publishing architecture thattransforms raw data in disparate formats to a uniform XMLrepresentationusing a set of well established and domain specificontologies as metadata to transform it further into linkeddataproviding various data access options such as a SPARQLendpoint an interactive Google map interface and a RESTAPI for making the data accessible to stakeholders
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Future WorkSeveral other datasets need to be integrated once theybecome availableOne of the core datasets - algal strains from AlgaebaseMultifaceted visualisation of the integrated datasets tofacilitate the uptake of the framework by stakeholdersRule based reasoning to model and inference domainspecific constraints
httpwwwalgaebaseorg
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Many Thanks
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Competency questions for stage 1 datasetsData driven
Which are the algal operation sites with CO2 sources thathave CO2 emissions less than 130000 kgs where totalcosts of supplying CO2 is lower then 5000 GBP per ton ofCO2 areal yield is greater than 30 tons per hectare andwhich are located within the NUTS region ldquoUKM61rdquoSupplement the data with supporting information about theregionWhich are the top ten algal operation sites with the lowestimpact on global warming potentialFor a given algal operation site which are the first five mostcost effective combinations of light water nutrients andCO2 sources
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Modelling Algal BiomassKnowledge
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontological requirements
Ontologies needed to representSpatiality location of possible algae cultivation siteslocation of the sources of consumables (CO2 nutrientsand water)Geometries area of the cultivation site - extentspolygons linear and ring arraysUnits and Measurements conventional measurementunits such as Kgs for quantities and hectares for areabespoke units of measurements ie Kgshectare orKgsannumTerritorial units for statistics core concepts of the NUTSsystemDomain specific knowledge algae cultivation sites CO2sources pipelines
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Reuse
Spatial Data WGS84 spatial relations GeonamesNeoGeoGeometries WGS84 extended NeoGeoUnits and Measurements extended QUDT
httpwwww3org200301geowgs84_poshttpwwwordnancesurveycoukoswebsiteontology
spatialrelationsowlhttpwwwgeonamesorgontologyontology_v221rdf
httpgeovocaborggeometryhttpqudtorg11vocabdimensionalunit
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Reuse
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Domainknowledge
Ontologies for modelling spatial knowledge units andmeasurements were reusedDiscovering vocabularies conceptualising the domainknowledge for algal biomass was non trivialConcepts and relationships for algal biomass had to bedefined from ground-up in accordance to the principles ofontology developmentThe design was very strongly guided by feedback fromquestionnaires made available to the stakeholdersinterviews with domain experts providers of raw datasetsand grey literature from the algal biomass and biofuelsdomain
Ontologies for Algal Biomass Domainknowledge
Ontologies available at httppurlorgbiomassontologies
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Designing URIs for Algal Biomass Data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets toLinked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked dataRaw data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data: First step
The first part of the data processing and the potential calculation are performed in a GIS-based model, which was developed for this purpose using ArcGIS
Raw datasets with various origins and formats are transformed, using bespoke computational algorithms, to an ArcGIS-specific XML format
  This brings uniformity to the representation format of the datasets and to the transformation process
  Important computations that are part of the final datasets are performed
Lifting XML datasets to Linked data: Second step
The original data sources had several limitations, and a one-to-one transformation was not possible
  The XML data sources related the biomass production sites and the CO2 sources only via the pipeline dataset
  To query for all sources that supplied CO2 to a specific site, the query would have to be made via the pipeline dataset
  The site, source and NUTS identifiers in the datasets were string literals rather than URIs
A bespoke parser that uses XPath to selectively query the XML datasets and generate linked data was implemented
  It utilises a complex underlying data structure to facilitate the transformation
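The XPath-based lifting described in the second step can be sketched roughly as follows. This is an illustration only, not the LEAPS parser: it uses Python's standard xml.etree.ElementTree, which supports a limited XPath subset, and the element names (Sites, Site, SiteID, NUTS), the base URI http://example.org/biomass/ and the predicate names are all assumptions, since the slides do not show the ArcGIS XML schema or the LEAPS URI scheme. It shows string identifiers being turned into URIs and emitted as N-Triples lines.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fragment; the real ArcGIS export schema is not given in the slides.
SAMPLE_XML = """
<Sites>
  <Site><SiteID>S1</SiteID><NUTS>UKM61</NUTS></Site>
  <Site><SiteID>S2</SiteID><NUTS>UKM62</NUTS></Site>
</Sites>
"""

BASE = "http://example.org/biomass/"  # assumed base URI, not the LEAPS namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def lift_sites(xml_text):
    """Return N-Triples lines for every Site element selected via XPath."""
    root = ET.fromstring(xml_text)
    triples = []
    for site in root.findall("./Site"):
        site_id = site.findtext("SiteID")
        nuts_id = site.findtext("NUTS")
        # Mint URIs from what were plain string identifiers in the XML.
        site_uri = f"<{BASE}site/{site_id}>"
        region_uri = f"<{BASE}nuts/{nuts_id}>"
        triples.append(f"{site_uri} <{RDF_TYPE}> <{BASE}ontology/OperationSite> .")
        triples.append(f'{site_uri} <{BASE}ontology/hasSiteID> "{site_id}" .')
        triples.append(f"{site_uri} <{BASE}ontology/inNUTSRegion> {region_uri} .")
    return triples


if __name__ == "__main__":
    for line in lift_sites(SAMPLE_XML):
        print(line)
```

Keeping the identifier as a literal alongside the minted URI means the original string is still available for joins against datasets that have not yet been lifted.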
Lifting XML datasets to Linked data
Four datasets were transformed and stored in distributed triple store repositories
The NUTS regions dataset was available in RDF, but there was no SPARQL endpoint or service to query it
We retrieved the dataset dump and curated it in our local triple store as a separate repository
The transformed datasets interlinked resources defining sites, CO2 sources, pipelines, regions and NUTS data
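As a rough illustration of interlinking sites and CO2 sources, the sketch below joins hypothetical pipeline records on their site and source identifiers and emits one direct link per pair. The slides do not say whether LEAPS materialises such direct links; the record layout, the base URI and the suppliedBy predicate are assumptions introduced only for this example.

```python
from collections import defaultdict

# Hypothetical pipeline records: each relates one site ID to one CO2 source ID.
PIPELINES = [
    {"pipe": "P1", "site": "S1", "source": "C7"},
    {"pipe": "P2", "site": "S1", "source": "C9"},
    {"pipe": "P3", "site": "S2", "source": "C7"},
]


def sources_by_site(pipelines):
    """Group the CO2 source IDs reachable from each site via a pipeline."""
    index = defaultdict(set)
    for record in pipelines:
        index[record["site"]].add(record["source"])
    return {site: sorted(sources) for site, sources in index.items()}


def direct_links(pipelines, base="http://example.org/biomass/"):
    """Emit one N-Triples line per (site, source) pair; predicate is hypothetical."""
    lines = []
    for site, sources in sorted(sources_by_site(pipelines).items()):
        for source in sources:
            lines.append(
                f"<{base}site/{site}> <{base}ontology/suppliedBy> "
                f"<{base}co2source/{source}> ."
            )
    return lines


if __name__ == "__main__":
    for line in direct_links(PIPELINES):
        print(line)
```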
System Architecture
Architecture: Main components
Parsing modules lifting the data from their original formats to RDF
Ontologies
Linking engine producing the linked data representation of the datasets
Triple store: OWLIM SE 5.0
REST Web services
SPARQL endpoints
Web Interface
Querying Linked Algal Biomass Data
Most queries over the datasets retrieve knowledge centered on location information
The queries are federated across the various repositories holding the linked data
Representative Query
Which are the algal operation sites with CO2 sources that have CO2 emissions of less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2 and the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.
Typical Query
    WHERE {
      SERVICE <http://localhost/repositories/biomass> {
        ?site a site:OperationSite ;
              site:inNUTSRegion ?region ;
              geo:location ?loc .
        ?loc geo:lat ?lat .
        ?loc geo:long ?long .
        ?site site:hasSiteID ?siteID ;
              site:hasArealYield ?z .
        ?z qudt:quantityValue ?y .
        ?y qudt:numericValue ?arealYield .
        ?y qudt:unit ?unit .
      }
      SERVICE <http://localhost/repositories/co2source> {
        ?source a co2:CO2Source ;
                co2:hasSourceID ?sourceID ;
                co2:hasCO2Emission ?emission .
        ?emission qudt:quantityValue ?emissionQty .
        ?emissionQty qudt:numericValue ?emissionValue .
      }
      SERVICE <http://localhost/repositories/pipeline> {
        ?pipe a pipe:Pipeline ;
              pipe:hasSiteID ?siteID ;
              pipe:hasSourceID ?sourceID ;
              pipe:hasTotalCO2Cost ?cost .
        ?cost qudt:quantityValue ?qty .
        ?qty qudt:numericValue ?totalCO2CostValue .
        ?qty qudt:unit ?totalCO2CostUnit .
      }
      SERVICE <http://localhost/repositories/region> {
        ?regionID a ramon:NUTSRegion ;
                  owl:sameAs ?related .
      }
      FILTER((?emissionValue < 130000)
        && (contains(str(?region), "UKM61"))
        && (?arealYield > 30)
        && (?totalCO2CostValue < 5000))
    }
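The FILTER at the end of the query is a conjunction of four conditions over the joined bindings. The sketch below restates that logic in Python over a few made-up result rows; the row values and region URIs are purely illustrative and are not LEAPS data. The dictionary keys mirror the variable names used in the query, and the substring test stands in for contains(str(?region), "UKM61").

```python
# Illustrative joined bindings; the values are invented for the example.
ROWS = [
    {"siteID": "S1", "region": "http://example.org/nuts/UKM61",
     "emissionValue": 120000, "arealYield": 35.0, "totalCO2CostValue": 4200},
    {"siteID": "S2", "region": "http://example.org/nuts/UKM61",
     "emissionValue": 150000, "arealYield": 40.0, "totalCO2CostValue": 3000},
    {"siteID": "S3", "region": "http://example.org/nuts/UKM62",
     "emissionValue": 90000, "arealYield": 32.0, "totalCO2CostValue": 1000},
]


def matches(row, region_code="UKM61"):
    """Apply the four FILTER conditions of the representative query to one row."""
    return (
        row["emissionValue"] < 130000
        and region_code in row["region"]   # substring test on the region URI
        and row["arealYield"] > 30
        and row["totalCO2CostValue"] < 5000
    )


def select_sites(rows):
    """Return the site IDs of all rows that satisfy every condition."""
    return [row["siteID"] for row in rows if matches(row)]


if __name__ == "__main__":
    print(select_sites(ROWS))
```

In this made-up data only the first row passes: the second exceeds the emission threshold and the third lies in a different NUTS region.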
Related Efforts, Conclusions and Future Work
Related efforts
AquaFuels: a taxonomy of algal strains, available as PDF
BioEnerGIS: a GIS-based decision support tool, BIOPOLE, for biomass plants feeding district heating systems
BioKDF: Bioenergy Knowledge Discovery Framework from the US Department of Energy
Reegle: various energy-related datasets as linked open data, and a SPARQL endpoint to access the datasets
http://www.aquafuels.eu
http://www.bioenergis.eu
https://bioenergykdf.net
http://data.reegle.info
Conclusions
Investigations into using algal biomass as an alternative source of fuel are gaining widespread momentum
The algal biomass community currently does not employ any knowledge representation techniques to formalise and structure valuable knowledge harnessed through their operations
As research in the sector progresses, a wealth of information will be available that could be exploited by domain-specific applications
Summary
The LEAPS framework exploits Semantic Web and Linked Data technologies for the algal biomass community by
  enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
  proposing a set of domain-specific ontologies for algal plant sites, CO2 and pipelines, to be shared and extended by the community
  defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation
  using a set of well-established and domain-specific ontologies as metadata to transform it further into linked data
  providing various data access options, such as a SPARQL endpoint, an interactive Google map interface and a REST API, for making the data accessible to stakeholders
Future Work
Several other datasets need to be integrated once they become available
  One of the core datasets: algal strains from Algaebase
Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders
Rule-based reasoning to model and infer domain-specific constraints
http://www.algaebase.org
Many Thanks
Modelling Algal Biomass Knowledge
Ontological requirements
Ontologies needed to represent
  Spatiality: location of possible algae cultivation sites, location of the sources of consumables (CO2, nutrients and water)
  Geometries: area of the cultivation site (extents, polygons, linear and ring arrays)
  Units and Measurements: conventional measurement units such as kg for quantities and hectares for area; bespoke units of measurement such as kg/hectare or kg/annum
  Territorial units for statistics: core concepts of the NUTS system
  Domain-specific knowledge: algae cultivation sites, CO2 sources, pipelines
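The need for bespoke units such as kg/hectare can be illustrated with a small sketch in which every measurement carries its unit, and a derived quantity gets a derived unit label. This is not the QUDT model or the LEAPS ontology: the Quantity class, the per helper and the sample figures are hypothetical and serve only to show why a value without a unit is not enough.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A numeric value paired with a unit label (hypothetical helper type)."""
    value: float
    unit: str


def per(numerator, denominator):
    """Divide two quantities and build the composite unit label, e.g. kg/ha."""
    return Quantity(
        numerator.value / denominator.value,
        f"{numerator.unit}/{denominator.unit}",
    )


# Invented figures: 60000 kg of biomass from a 2 ha cultivation site.
biomass = Quantity(60000.0, "kg")
site_area = Quantity(2.0, "ha")
areal_yield = per(biomass, site_area)

if __name__ == "__main__":
    print(areal_yield)
```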
Ontologies for Algal Biomass Reuse
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Ontologies for Algal Biomass Domainknowledge
Ontologies for modelling spatial knowledge units andmeasurements were reusedDiscovering vocabularies conceptualising the domainknowledge for algal biomass was non trivialConcepts and relationships for algal biomass had to bedefined from ground-up in accordance to the principles ofontology developmentThe design was very strongly guided by feedback fromquestionnaires made available to the stakeholdersinterviews with domain experts providers of raw datasetsand grey literature from the algal biomass and biofuelsdomain
Ontologies for Algal Biomass Domainknowledge
Ontologies available at httppurlorgbiomassontologies
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Designing URIs for Algal Biomass Data
Lifting XML datasets to Linked data
Lifting XML datasets to Linked data: Raw data
Lifting XML datasets to Linked data: First step
The first part of the data processing and the potential calculation are performed in a GIS-based model, which was developed for this purpose using ArcGIS
Raw datasets with various origins and formats are transformed, using bespoke computational algorithms, to an ArcGIS-specific XML format
  This brings uniformity to the representation format of the datasets and to the transformation process
  Important computations that are part of the final datasets are performed at this stage
Lifting XML datasets to Linked data: Second step
The original data sources had several limitations, so a one-to-one transformation was not possible
  The XML data sources related the biomass production sites and the CO2 sources only via the pipeline dataset
  To query for all sources that supplied CO2 to a specific site, the query had to go through the pipeline dataset
  The site, source and NUTS identifiers in the datasets were string literals rather than URIs
A bespoke parser that uses XPath to selectively query the XML datasets and generate linked data was implemented
It uses a complex underlying data structure to facilitate the transformation
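The second step can be illustrated with a minimal Python sketch. It is not the LEAPS parser: the XML layout, the example.org base URI and the property names (hasSite, hasSource, suppliesCO2To) are all assumptions made for illustration, since the slides do not show the ArcGIS export schema or the parser's data structure. The sketch selects pipeline records with an XPath expression, mints URIs from the string identifiers, and adds a direct source-to-site link as one possible way to avoid routing every query through the pipeline dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout: element and attribute names are invented here,
# because the real ArcGIS-specific export format is not shown in the slides.
SAMPLE_XML = """
<pipelines>
  <pipeline id="P1">
    <siteID>S12</siteID>
    <sourceID>C7</sourceID>
  </pipeline>
  <pipeline id="P2">
    <siteID>S12</siteID>
    <sourceID>C9</sourceID>
  </pipeline>
</pipelines>
"""

BASE = "http://example.org/biomass/"          # placeholder namespace
HAS_SITE = f"<{BASE}ontology/hasSite>"        # hypothetical property
HAS_SOURCE = f"<{BASE}ontology/hasSource>"    # hypothetical property
SUPPLIES = f"<{BASE}ontology/suppliesCO2To>"  # hypothetical property


def mint_uri(kind, identifier):
    """Turn a string-literal identifier into a URI reference."""
    return f"<{BASE}{kind}/{identifier.strip()}>"


def lift_pipelines(xml_text):
    """Select pipeline records via XPath and return (s, p, o) triples."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    for pipe in root.findall("./pipeline"):
        pipe_uri = mint_uri("pipeline", pipe.get("id"))
        site_uri = mint_uri("site", pipe.findtext("siteID"))
        source_uri = mint_uri("source", pipe.findtext("sourceID"))
        triples.append((pipe_uri, HAS_SITE, site_uri))
        triples.append((pipe_uri, HAS_SOURCE, source_uri))
        # Direct link, so that sources of a site can be found without
        # traversing the pipeline resource.
        triples.append((source_uri, SUPPLIES, site_uri))
    return triples


def to_ntriples(triples):
    """Serialise the triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(lift_pipelines(SAMPLE_XML)))
```

Because the identifiers become URIs, the same site resource is shared by every pipeline record that mentions it, which is what makes interlinking with other datasets possible.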
Lifting XML datasets to Linked data
Four datasets were transformed and stored in distributed triple store repositories
The NUTS regions dataset was available in RDF, but there was no SPARQL endpoint or service to query it
We retrieved the dataset dump and curated it in our local triple store as a separate repository
The transformed datasets interlinked resources defining sites, CO2 sources, pipelines, regions and NUTS data
System Architecture
Architecture: Main components
Parsing modules, lifting the data from their original formats to RDF
Ontologies
Linking engine, producing the linked data representation of the datasets
Triple store: OWLIM-SE 5.0
REST Web services
SPARQL endpoints
Web Interface
Querying Linked Algal Biomass Data
Most queries over the datasets retrieve knowledge centred on location information
The queries are federated across the various repositories holding the linked data
Representative Query
Which are the algal operation sites with CO2 sources that have CO2 emissions of less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.
Typical Query
WHERE {
  SERVICE <http://localhost/repositories/biomass> {
    ?site a site:OperationSite ;
          site:inNUTSRegion ?region ;
          geo:location ?loc .
    ?loc geo:lat ?lat .
    ?loc geo:long ?long .
    ?site site:hasSiteID ?siteID ;
          site:hasArealYield ?z .
    ?z qudt:quantityValue ?y .
    ?y qudt:numericValue ?arealYield .
    ?y qudt:unit ?unit .
  }
  SERVICE <http://localhost/repositories/co2source> {
    ?source a co2:CO2Source ;
            co2:hasSourceID ?sourceID ;
            co2:hasCO2Emission ?emission .
    ?emission qudt:quantityValue ?emissionQty .
    ?emissionQty qudt:numericValue ?emissionValue .
  }
  SERVICE <http://localhost/repositories/pipeline> {
    ?pipe a pipe:Pipeline ;
          pipe:hasSiteID ?siteID ;
          pipe:hasSourceID ?sourceID ;
          pipe:hasTotalCO2Cost ?cost .
    ?cost qudt:quantityValue ?qty .
    ?qty qudt:numericValue ?totalCO2CostValue .
    ?qty qudt:unit ?totalCO2CostUnit .
  }
  SERVICE <http://localhost/repositories/region> {
    ?regionID a ramon:NUTSRegion ;
              owl:sameAs ?related .
  }
  FILTER((?emissionValue < 130000)
      && (contains(str(?region), "UKM61"))
      && (?arealYield > 30)
      && (?totalCO2CostValue < 5000))
}
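The logic of the federated query can be sketched over plain Python records. This is only an illustration of the join and the filter, not the way the triple store evaluates the query: the three lists stand in for the biomass, co2source and pipeline repositories, all sample values and region URIs are invented, and the field names simply mirror the query variables. Pipeline records join sites to sources through the shared siteID and sourceID, and the four filter conditions are then applied to each joined row.

```python
# Invented sample data standing in for the three repositories.
sites = [
    {"siteID": "S1", "region": "http://example.org/nuts/UKM61", "arealYield": 35.0},
    {"siteID": "S2", "region": "http://example.org/nuts/UKM61", "arealYield": 20.0},
    {"siteID": "S3", "region": "http://example.org/nuts/UKL12", "arealYield": 40.0},
]
sources = [
    {"sourceID": "C1", "emissionValue": 120000.0},
    {"sourceID": "C2", "emissionValue": 500000.0},
]
pipelines = [
    {"siteID": "S1", "sourceID": "C1", "totalCO2CostValue": 4000.0},
    {"siteID": "S1", "sourceID": "C2", "totalCO2CostValue": 3000.0},
    {"siteID": "S2", "sourceID": "C1", "totalCO2CostValue": 1000.0},
    {"siteID": "S3", "sourceID": "C1", "totalCO2CostValue": 1000.0},
]


def matching_sites(sites, sources, pipelines, nuts_code="UKM61"):
    """Join sites and sources via pipelines, then apply the query's filter."""
    site_by_id = {s["siteID"]: s for s in sites}
    source_by_id = {s["sourceID"]: s for s in sources}
    results = []
    for pipe in pipelines:
        site = site_by_id.get(pipe["siteID"])
        source = source_by_id.get(pipe["sourceID"])
        if site is None or source is None:
            continue  # no join partner, so the row is dropped
        if (
            source["emissionValue"] < 130000
            and nuts_code in site["region"]
            and site["arealYield"] > 30
            and pipe["totalCO2CostValue"] < 5000
        ):
            results.append((site["siteID"], source["sourceID"]))
    return results


if __name__ == "__main__":
    print(matching_sites(sites, sources, pipelines))
```

With the sample data above, only the pair of site S1 and source C1 satisfies all four conditions.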
Related Efforts, Conclusions and Future Work
Related efforts
AquaFuels: a taxonomy of algal strains, available as PDF
BioEnerGIS: a GIS-based decision support tool, BIOPOLE, for biomass plants feeding district heating systems
BioKDF: Bioenergy Knowledge Discovery Framework from the US Department of Energy
Reegle: various energy-related datasets as linked open data, and a SPARQL endpoint to access the datasets
http://www.aquafuels.eu
http://www.bioenergis.eu
https://bioenergykdf.net
http://data.reegle.info
Conclusions
Investigations into using algal biomass as an alternative source of fuel are gaining widespread momentum
The algal biomass community currently does not employ any knowledge representation techniques to formalise and structure the valuable knowledge harnessed through its operations
As research in the sector progresses, a wealth of information will become available that could be exploited by domain-specific applications
Summary
The LEAPS framework exploits Semantic Web (SW) and Linked Data (LD) technologies for the algal biomass community by
  enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
  proposing a set of domain-specific ontologies for algal plant sites, CO2 sources and pipelines, to be shared and extended by the community
  defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation
  using a set of well-established and domain-specific ontologies as metadata to transform it further into linked data
  providing various data access options, such as a SPARQL endpoint, an interactive Google Maps interface and a REST API, for making the data accessible to stakeholders
Future Work
Several other datasets need to be integrated once they become available
  One of the core datasets: algal strains from AlgaeBase
Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders
Rule-based reasoning to model and infer domain-specific constraints
http://www.algaebase.org
Many Thanks
Ontologies for Algal Biomass Domainknowledge
Ontologies for modelling spatial knowledge units andmeasurements were reusedDiscovering vocabularies conceptualising the domainknowledge for algal biomass was non trivialConcepts and relationships for algal biomass had to bedefined from ground-up in accordance to the principles ofontology developmentThe design was very strongly guided by feedback fromquestionnaires made available to the stakeholdersinterviews with domain experts providers of raw datasetsand grey literature from the algal biomass and biofuelsdomain
Ontologies for Algal Biomass Domainknowledge
Ontologies available at httppurlorgbiomassontologies
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Designing URIs for Algal Biomass Data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets toLinked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked dataRaw data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
First stepThe first part of the data processing and the potentialcalculation are performed in a GIS-based model which wasdeveloped for this purpose using ArcGISRaw datasets with various origins and formats -transformed using bespoke computational algorithms to anArchGIS specific XML format
brings uniformity in the format of representation of thedatasets and in the process of transformationimportant computations that are part of the final datasetsare performed
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
Second stepThe original data sources had several limitations and aone-to-one transformation was not possible
The XML data sources related the biomass production sitesand the CO2 sources via the pipeline datasetIn order to query for all sources that supplied CO2 to aspecific site the query would have to be made via thepipeline datasetThe site source and NUTS identifiers in the datasets werestring literals rather than URIs
A bespoke parser that exploits XPath to selectively querythe XML datasets and generate linked data wasimplementedIt utilises a complex underlying data structure to facilitatethe transformation
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
Four datasets were transformed and stored in distributedtriple store repositoriesThe NUTS regions dataset in RDF was available but therewas no SPARQL endpoint or service to query the datasetWe retrieved the dataset dump and curated it in our localtriple store as a separate repositoryThe transformed datasets interlinked resources definingsites CO2 sources pipelines regions and NUTS data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
System Architecture
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
System Architecture
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Architecture Main componentsParsing modules lifting the datafrom their original formats to RDF
Ontologies
Linking engine producing the linkeddata representation of the datasets
Triple store OWLIM SE 50
REST Web services
SPARQL endpoints
Web Interface
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Querying Linked Algal Biomass Data
Most queries over the datasets are based on retrievingknowledge centered around location informationThe queries are federated across the various repositoriesholding the linked dataRepresentative Query
Which are the algal operation sites with CO2 sources that haveCO2 emissions less than 130000 kgs where total costs ofsupplying CO2 is lower then 5000 GBP per ton of CO2 arealyield is greater than 30 tons per hectare and which are locatedwithin the NUTS region ldquoUKM61rdquo Supplement the data withsupporting information about the region
Typical QueryWHERE
SERVICE lthttplocalhostrepositoriesbiomassgt site a siteOperationSite
siteinNUTSRegion regiongeolocation loc locgeolat latloc geolong longsite sitehasSiteID siteIDsitehasArealYield zz qudtquantityValue yy qudtnumericValue arealYieldy qudtunit unit
SERVICE lthttplocalhostrepositoriesco2sourcegt source a co2CO2Source
co2hasSourceID sourceIDco2hasCO2Emission emissionemission qudtquantityValue emissionQtyemissionQty qudtnumericValue emissionValue
continued
Typical QuerySERVICE lthttplocalhostrepositoriespipelinegt pipe a pipePipeline
pipehasSiteID siteIDpipehasSourceID sourceIDpipehasTotalCO2Cost costcost qudtquantityValue qtyqty qudtnumericValue totalCO2CostValueqty qudtunit totalCO2CostUnit
SERVICE lthttplocalhostrepositoriesregiongt regionID a ramonNUTSRegion
owlsameAs relatedFILTER((emissionValue lt 130000)
ampamp (contains(str(region) UKM61))ampamp (arealYield gt 30)ampamp (totalCO2CostValue lt 5000))
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Related EffortsConclusions and
Future Work
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Related effortsAquaFuels a taxonomy of algal strains available as PDFBioEnerGIS a GIS based Decision support toolBIOPOLE for biomass plants feeding district heatingsystemsBioKDF Bioenergy knowledge discovery framework fromthe US department of EnergyReegle various energy related datasets as linked opendata and a SPARQL endpoint to access the datasets
httpwwwaquafuelseuhttpwwwbioenergiseuhttpsbioenergykdfnet
httpdatareegleinfo
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
ConclusionsInvestigations into using algal biomass as an alternativesource of fuel is gaining widespread momentumThe Algal biomass community currently does not employany knowledge representation techniques to formalise andstructure valuable knowledge harnessed through theiroperationsAs research in the sector progresses a wealth ofinformation will be available that could be exploited bydomain specific applications
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Summary
The LEAPS framework exploits SW and LD for the algalbiomass community
enabling the screening of data for promising individualplant sites and provides base data for more detailedplanning purposesproposing a set of domain specific ontologies for algalplant sites CO2 and pipelines to be shared and extendedby the communitydefining a linked data publishing architecture thattransforms raw data in disparate formats to a uniform XMLrepresentationusing a set of well established and domain specificontologies as metadata to transform it further into linkeddataproviding various data access options such as a SPARQLendpoint an interactive Google map interface and a RESTAPI for making the data accessible to stakeholders
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Future WorkSeveral other datasets need to be integrated once theybecome availableOne of the core datasets - algal strains from AlgaebaseMultifaceted visualisation of the integrated datasets tofacilitate the uptake of the framework by stakeholdersRule based reasoning to model and inference domainspecific constraints
httpwwwalgaebaseorg
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Many Thanks
Ontologies for Algal Biomass Domainknowledge
Ontologies available at httppurlorgbiomassontologies
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Designing URIs for Algal Biomass Data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets toLinked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked dataRaw data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
First stepThe first part of the data processing and the potentialcalculation are performed in a GIS-based model which wasdeveloped for this purpose using ArcGISRaw datasets with various origins and formats -transformed using bespoke computational algorithms to anArchGIS specific XML format
brings uniformity in the format of representation of thedatasets and in the process of transformationimportant computations that are part of the final datasetsare performed
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
Second stepThe original data sources had several limitations and aone-to-one transformation was not possible
The XML data sources related the biomass production sitesand the CO2 sources via the pipeline datasetIn order to query for all sources that supplied CO2 to aspecific site the query would have to be made via thepipeline datasetThe site source and NUTS identifiers in the datasets werestring literals rather than URIs
A bespoke parser that exploits XPath to selectively querythe XML datasets and generate linked data wasimplementedIt utilises a complex underlying data structure to facilitatethe transformation
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
Four datasets were transformed and stored in distributedtriple store repositoriesThe NUTS regions dataset in RDF was available but therewas no SPARQL endpoint or service to query the datasetWe retrieved the dataset dump and curated it in our localtriple store as a separate repositoryThe transformed datasets interlinked resources definingsites CO2 sources pipelines regions and NUTS data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
System Architecture
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
System Architecture
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Architecture Main componentsParsing modules lifting the datafrom their original formats to RDF
Ontologies
Linking engine producing the linkeddata representation of the datasets
Triple store OWLIM SE 50
REST Web services
SPARQL endpoints
Web Interface
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Querying Linked Algal Biomass Data
Most queries over the datasets are based on retrievingknowledge centered around location informationThe queries are federated across the various repositoriesholding the linked dataRepresentative Query
Which are the algal operation sites with CO2 sources that haveCO2 emissions less than 130000 kgs where total costs ofsupplying CO2 is lower then 5000 GBP per ton of CO2 arealyield is greater than 30 tons per hectare and which are locatedwithin the NUTS region ldquoUKM61rdquo Supplement the data withsupporting information about the region
Typical QueryWHERE
SERVICE lthttplocalhostrepositoriesbiomassgt site a siteOperationSite
siteinNUTSRegion regiongeolocation loc locgeolat latloc geolong longsite sitehasSiteID siteIDsitehasArealYield zz qudtquantityValue yy qudtnumericValue arealYieldy qudtunit unit
SERVICE lthttplocalhostrepositoriesco2sourcegt source a co2CO2Source
co2hasSourceID sourceIDco2hasCO2Emission emissionemission qudtquantityValue emissionQtyemissionQty qudtnumericValue emissionValue
continued
Typical QuerySERVICE lthttplocalhostrepositoriespipelinegt pipe a pipePipeline
pipehasSiteID siteIDpipehasSourceID sourceIDpipehasTotalCO2Cost costcost qudtquantityValue qtyqty qudtnumericValue totalCO2CostValueqty qudtunit totalCO2CostUnit
SERVICE lthttplocalhostrepositoriesregiongt regionID a ramonNUTSRegion
owlsameAs relatedFILTER((emissionValue lt 130000)
ampamp (contains(str(region) UKM61))ampamp (arealYield gt 30)ampamp (totalCO2CostValue lt 5000))
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Related EffortsConclusions and
Future Work
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Related effortsAquaFuels a taxonomy of algal strains available as PDFBioEnerGIS a GIS based Decision support toolBIOPOLE for biomass plants feeding district heatingsystemsBioKDF Bioenergy knowledge discovery framework fromthe US department of EnergyReegle various energy related datasets as linked opendata and a SPARQL endpoint to access the datasets
httpwwwaquafuelseuhttpwwwbioenergiseuhttpsbioenergykdfnet
httpdatareegleinfo
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
ConclusionsInvestigations into using algal biomass as an alternativesource of fuel is gaining widespread momentumThe Algal biomass community currently does not employany knowledge representation techniques to formalise andstructure valuable knowledge harnessed through theiroperationsAs research in the sector progresses a wealth ofinformation will be available that could be exploited bydomain specific applications
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Summary
The LEAPS framework exploits SW and LD for the algalbiomass community
enabling the screening of data for promising individualplant sites and provides base data for more detailedplanning purposesproposing a set of domain specific ontologies for algalplant sites CO2 and pipelines to be shared and extendedby the communitydefining a linked data publishing architecture thattransforms raw data in disparate formats to a uniform XMLrepresentationusing a set of well established and domain specificontologies as metadata to transform it further into linkeddataproviding various data access options such as a SPARQLendpoint an interactive Google map interface and a RESTAPI for making the data accessible to stakeholders
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Future WorkSeveral other datasets need to be integrated once theybecome availableOne of the core datasets - algal strains from AlgaebaseMultifaceted visualisation of the integrated datasets tofacilitate the uptake of the framework by stakeholdersRule based reasoning to model and inference domainspecific constraints
httpwwwalgaebaseorg
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Many Thanks
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Designing URIs for Algal Biomass Data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets toLinked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked dataRaw data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
First stepThe first part of the data processing and the potentialcalculation are performed in a GIS-based model which wasdeveloped for this purpose using ArcGISRaw datasets with various origins and formats -transformed using bespoke computational algorithms to anArchGIS specific XML format
brings uniformity in the format of representation of thedatasets and in the process of transformationimportant computations that are part of the final datasetsare performed
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
Second stepThe original data sources had several limitations and aone-to-one transformation was not possible
The XML data sources related the biomass production sitesand the CO2 sources via the pipeline datasetIn order to query for all sources that supplied CO2 to aspecific site the query would have to be made via thepipeline datasetThe site source and NUTS identifiers in the datasets werestring literals rather than URIs
A bespoke parser that exploits XPath to selectively querythe XML datasets and generate linked data wasimplementedIt utilises a complex underlying data structure to facilitatethe transformation
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
Four datasets were transformed and stored in distributedtriple store repositoriesThe NUTS regions dataset in RDF was available but therewas no SPARQL endpoint or service to query the datasetWe retrieved the dataset dump and curated it in our localtriple store as a separate repositoryThe transformed datasets interlinked resources definingsites CO2 sources pipelines regions and NUTS data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
System Architecture
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
System Architecture
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Architecture Main componentsParsing modules lifting the datafrom their original formats to RDF
Ontologies
Linking engine producing the linkeddata representation of the datasets
Triple store OWLIM SE 50
REST Web services
SPARQL endpoints
Web Interface
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Querying Linked Algal Biomass Data
Most queries over the datasets are based on retrievingknowledge centered around location informationThe queries are federated across the various repositoriesholding the linked dataRepresentative Query
Which are the algal operation sites with CO2 sources that haveCO2 emissions less than 130000 kgs where total costs ofsupplying CO2 is lower then 5000 GBP per ton of CO2 arealyield is greater than 30 tons per hectare and which are locatedwithin the NUTS region ldquoUKM61rdquo Supplement the data withsupporting information about the region
Typical QueryWHERE
SERVICE lthttplocalhostrepositoriesbiomassgt site a siteOperationSite
siteinNUTSRegion regiongeolocation loc locgeolat latloc geolong longsite sitehasSiteID siteIDsitehasArealYield zz qudtquantityValue yy qudtnumericValue arealYieldy qudtunit unit
SERVICE lthttplocalhostrepositoriesco2sourcegt source a co2CO2Source
co2hasSourceID sourceIDco2hasCO2Emission emissionemission qudtquantityValue emissionQtyemissionQty qudtnumericValue emissionValue
continued
Typical QuerySERVICE lthttplocalhostrepositoriespipelinegt pipe a pipePipeline
pipehasSiteID siteIDpipehasSourceID sourceIDpipehasTotalCO2Cost costcost qudtquantityValue qtyqty qudtnumericValue totalCO2CostValueqty qudtunit totalCO2CostUnit
SERVICE lthttplocalhostrepositoriesregiongt regionID a ramonNUTSRegion
owlsameAs relatedFILTER((emissionValue lt 130000)
ampamp (contains(str(region) UKM61))ampamp (arealYield gt 30)ampamp (totalCO2CostValue lt 5000))
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Related EffortsConclusions and
Future Work
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Related effortsAquaFuels a taxonomy of algal strains available as PDFBioEnerGIS a GIS based Decision support toolBIOPOLE for biomass plants feeding district heatingsystemsBioKDF Bioenergy knowledge discovery framework fromthe US department of EnergyReegle various energy related datasets as linked opendata and a SPARQL endpoint to access the datasets
httpwwwaquafuelseuhttpwwwbioenergiseuhttpsbioenergykdfnet
httpdatareegleinfo
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
ConclusionsInvestigations into using algal biomass as an alternativesource of fuel is gaining widespread momentumThe Algal biomass community currently does not employany knowledge representation techniques to formalise andstructure valuable knowledge harnessed through theiroperationsAs research in the sector progresses a wealth ofinformation will be available that could be exploited bydomain specific applications
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Summary
The LEAPS framework exploits SW and LD for the algalbiomass community
enabling the screening of data for promising individualplant sites and provides base data for more detailedplanning purposesproposing a set of domain specific ontologies for algalplant sites CO2 and pipelines to be shared and extendedby the communitydefining a linked data publishing architecture thattransforms raw data in disparate formats to a uniform XMLrepresentationusing a set of well established and domain specificontologies as metadata to transform it further into linkeddataproviding various data access options such as a SPARQLendpoint an interactive Google map interface and a RESTAPI for making the data accessible to stakeholders
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Future WorkSeveral other datasets need to be integrated once theybecome availableOne of the core datasets - algal strains from AlgaebaseMultifaceted visualisation of the integrated datasets tofacilitate the uptake of the framework by stakeholdersRule based reasoning to model and inference domainspecific constraints
httpwwwalgaebaseorg
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Many Thanks
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets toLinked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked dataRaw data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
First stepThe first part of the data processing and the potentialcalculation are performed in a GIS-based model which wasdeveloped for this purpose using ArcGISRaw datasets with various origins and formats -transformed using bespoke computational algorithms to anArchGIS specific XML format
brings uniformity in the format of representation of thedatasets and in the process of transformationimportant computations that are part of the final datasetsare performed
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
Second stepThe original data sources had several limitations and aone-to-one transformation was not possible
The XML data sources related the biomass production sitesand the CO2 sources via the pipeline datasetIn order to query for all sources that supplied CO2 to aspecific site the query would have to be made via thepipeline datasetThe site source and NUTS identifiers in the datasets werestring literals rather than URIs
A bespoke parser that exploits XPath to selectively querythe XML datasets and generate linked data wasimplementedIt utilises a complex underlying data structure to facilitatethe transformation
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
Four datasets were transformed and stored in distributedtriple store repositoriesThe NUTS regions dataset in RDF was available but therewas no SPARQL endpoint or service to query the datasetWe retrieved the dataset dump and curated it in our localtriple store as a separate repositoryThe transformed datasets interlinked resources definingsites CO2 sources pipelines regions and NUTS data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Lifting XML datasets to Linked data
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
System Architecture
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
System Architecture
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Architecture Main componentsParsing modules lifting the datafrom their original formats to RDF
Ontologies
Linking engine producing the linkeddata representation of the datasets
Triple store OWLIM SE 50
REST Web services
SPARQL endpoints
Web Interface
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Querying Linked Algal Biomass Data
Most queries over the datasets are based on retrievingknowledge centered around location informationThe queries are federated across the various repositoriesholding the linked dataRepresentative Query
Which are the algal operation sites with CO2 sources that haveCO2 emissions less than 130000 kgs where total costs ofsupplying CO2 is lower then 5000 GBP per ton of CO2 arealyield is greater than 30 tons per hectare and which are locatedwithin the NUTS region ldquoUKM61rdquo Supplement the data withsupporting information about the region
Typical QueryWHERE
SERVICE lthttplocalhostrepositoriesbiomassgt site a siteOperationSite
siteinNUTSRegion regiongeolocation loc locgeolat latloc geolong longsite sitehasSiteID siteIDsitehasArealYield zz qudtquantityValue yy qudtnumericValue arealYieldy qudtunit unit
SERVICE lthttplocalhostrepositoriesco2sourcegt source a co2CO2Source
co2hasSourceID sourceIDco2hasCO2Emission emissionemission qudtquantityValue emissionQtyemissionQty qudtnumericValue emissionValue
continued
Typical QuerySERVICE lthttplocalhostrepositoriespipelinegt pipe a pipePipeline
pipehasSiteID siteIDpipehasSourceID sourceIDpipehasTotalCO2Cost costcost qudtquantityValue qtyqty qudtnumericValue totalCO2CostValueqty qudtunit totalCO2CostUnit
SERVICE lthttplocalhostrepositoriesregiongt regionID a ramonNUTSRegion
owlsameAs relatedFILTER((emissionValue lt 130000)
ampamp (contains(str(region) UKM61))ampamp (arealYield gt 30)ampamp (totalCO2CostValue lt 5000))
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Related EffortsConclusions and
Future Work
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Related effortsAquaFuels a taxonomy of algal strains available as PDFBioEnerGIS a GIS based Decision support toolBIOPOLE for biomass plants feeding district heatingsystemsBioKDF Bioenergy knowledge discovery framework fromthe US department of EnergyReegle various energy related datasets as linked opendata and a SPARQL endpoint to access the datasets
httpwwwaquafuelseuhttpwwwbioenergiseuhttpsbioenergykdfnet
httpdatareegleinfo
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
ConclusionsInvestigations into using algal biomass as an alternativesource of fuel is gaining widespread momentumThe Algal biomass community currently does not employany knowledge representation techniques to formalise andstructure valuable knowledge harnessed through theiroperationsAs research in the sector progresses a wealth ofinformation will be available that could be exploited bydomain specific applications
monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz
Summary
The LEAPS framework exploits SW and LD for the algalbiomass community
enabling the screening of data for promising individualplant sites and provides base data for more detailedplanning purposesproposing a set of domain specific ontologies for algalplant sites CO2 and pipelines to be shared and extendedby the communitydefining a linked data publishing architecture thattransforms raw data in disparate formats to a uniform XMLrepresentationusing a set of well established and domain specificontologies as metadata to transform it further into linkeddataproviding various data access options such as a SPARQLendpoint an interactive Google map interface and a RESTAPI for making the data accessible to stakeholders
Future Work
- Several other datasets need to be integrated once they become available
  - One of the core datasets: algal strains from Algaebase
- Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders
- Rule-based reasoning to model and infer domain-specific constraints
http://www.algaebase.org
Many Thanks
Lifting XML datasets to Linked data: Raw data
Lifting XML datasets to Linked data
First step
- The first part of the data processing and the potential calculation are performed in a GIS-based model, which was developed for this purpose using ArcGIS
- Raw datasets with various origins and formats are transformed, using bespoke computational algorithms, to an ArcGIS-specific XML format
  - this brings uniformity in the format of representation of the datasets and in the process of transformation
  - important computations that are part of the final datasets are performed
Lifting XML datasets to Linked data
Second step
- The original data sources had several limitations, and a one-to-one transformation was not possible
  - The XML data sources related the biomass production sites and the CO2 sources via the pipeline dataset
  - In order to query for all sources that supplied CO2 to a specific site, the query would have to be made via the pipeline dataset
  - The site, source and NUTS identifiers in the datasets were string literals rather than URIs
- A bespoke parser that exploits XPath to selectively query the XML datasets and generate linked data was implemented
- It utilises a complex underlying data structure to facilitate the transformation
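The lifting step can be illustrated with a minimal sketch. It is not the LEAPS parser: the XML element names (`site`, `id`, `nuts`, `arealYield`), the `http://example.org/leaps/` base URI and the flat `hasArealYield` literal are all assumptions made for illustration. It only shows the general idea of selecting elements with an XPath expression and minting URIs for identifiers that were string literals in the XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical URIs: the real LEAPS namespaces are not given in the slides.
BASE = "http://example.org/leaps/"
SITE = BASE + "ontology/site#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"

# Hypothetical XML layout standing in for the GIS export.
SAMPLE_XML = """<sites>
  <site><id>S1</id><nuts>UKM61</nuts><arealYield>32.5</arealYield></site>
  <site><id>S2</id><nuts>UKM62</nuts><arealYield>28.0</arealYield></site>
</sites>"""


def ref(uri):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return "<" + uri + ">"


def lift_sites(xml_text):
    """Turn each <site> element into N-Triples lines, minting URIs from string IDs."""
    root = ET.fromstring(xml_text)
    triples = []
    # ElementTree supports a limited XPath subset; './/site' selects every site element.
    for site in root.findall(".//site"):
        site_id = site.findtext("id")
        nuts_code = site.findtext("nuts")
        areal_yield = site.findtext("arealYield")
        subject = ref(BASE + "site/" + site_id)
        # The NUTS code becomes a URI instead of staying a string literal.
        region = ref(BASE + "nuts/" + nuts_code)
        literal = '"' + areal_yield + '"^^' + ref(XSD_DECIMAL)
        triples.append(" ".join([subject, ref(RDF_TYPE), ref(SITE + "OperationSite"), "."]))
        triples.append(" ".join([subject, ref(SITE + "hasSiteID"), '"' + site_id + '"', "."]))
        triples.append(" ".join([subject, ref(SITE + "inNUTSRegion"), region, "."]))
        triples.append(" ".join([subject, ref(SITE + "hasArealYield"), literal, "."]))
    return triples


if __name__ == "__main__":
    for line in lift_sites(SAMPLE_XML):
        print(line)
```

Emitting N-Triples as plain strings keeps the sketch free of third-party dependencies; a production parser would more likely use an RDF library and model quantities with QUDT, as the query slides suggest.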
Lifting XML datasets to Linked data
- Four datasets were transformed and stored in distributed triple store repositories
- The NUTS regions dataset in RDF was available, but there was no SPARQL endpoint or service to query the dataset
- We retrieved the dataset dump and curated it in our local triple store as a separate repository
- The transformed datasets interlinked resources defining sites, CO2 sources, pipelines, regions and NUTS data
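One way to picture the interlinking between sites and CO2 sources is the sketch below. It assumes pipeline records carry a site identifier and a source identifier as strings, and it derives, for each site URI, the source URIs that supply it. The record layout and the `http://example.org/leaps/` base URI are hypothetical and not taken from the LEAPS datasets.

```python
from collections import defaultdict

BASE = "http://example.org/leaps/"  # hypothetical base URI

# Hypothetical pipeline records: each one connects a site to a CO2 source.
PIPELINES = [
    {"pipeID": "P1", "siteID": "S1", "sourceID": "C7"},
    {"pipeID": "P2", "siteID": "S1", "sourceID": "C3"},
    {"pipeID": "P3", "siteID": "S2", "sourceID": "C7"},
]


def link_sites_to_sources(pipelines):
    """Map each site URI to the sorted source URIs reachable through a pipeline record."""
    links = defaultdict(set)
    for record in pipelines:
        site_uri = BASE + "site/" + record["siteID"]
        source_uri = BASE + "source/" + record["sourceID"]
        links[site_uri].add(source_uri)
    return {site: sorted(sources) for site, sources in links.items()}


if __name__ == "__main__":
    for site, sources in link_sites_to_sources(PIPELINES).items():
        print(site, "->", sources)
```

Because identifiers become URIs, the same site or source resource can be referenced from the site, source and pipeline repositories alike.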
System Architecture
Architecture: Main components
- Parsing modules: lifting the data from their original formats to RDF
- Ontologies
- Linking engine: producing the linked data representation of the datasets
- Triple store: OWLIM-SE 5.0
- REST Web services
- SPARQL endpoints
- Web Interface
Querying Linked Algal Biomass Data
- Most queries over the datasets are based on retrieving knowledge centered around location information
- The queries are federated across the various repositories holding the linked data
Representative Query
Which are the algal operation sites with CO2 sources that have CO2 emissions less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.
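The four criteria of the representative query can be written down as a simple predicate. The sketch below applies them to in-memory records instead of a triple store. The `Candidate` record and its field names are hypothetical; only the thresholds and the region code come from the query text.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical flat record joining a site, its CO2 source and the pipeline cost."""
    site_id: str
    nuts_region: str              # region code or URI, e.g. containing "UKM61"
    emission_kg: float            # CO2 emission of the supplying source
    co2_cost_gbp_per_ton: float   # total cost of supplying CO2
    areal_yield_t_per_ha: float   # areal yield of the site


def matches(candidate, region_code="UKM61"):
    """Apply the four criteria stated in the representative query."""
    return (
        candidate.emission_kg < 130000
        and region_code in candidate.nuts_region
        and candidate.areal_yield_t_per_ha > 30
        and candidate.co2_cost_gbp_per_ton < 5000
    )


if __name__ == "__main__":
    rows = [
        Candidate("S1", "UKM61", 120000, 4200, 32.5),
        Candidate("S2", "UKM62", 90000, 3000, 35.0),
    ]
    print([row.site_id for row in rows if matches(row)])
```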
Typical Query
WHERE {
SERVICE <http://localhost/repositories/biomass> { ?site a site:OperationSite ;
site:inNUTSRegion ?region ; geo:location ?loc . ?loc geo:lat ?lat . ?loc geo:long ?long . ?site site:hasSiteID ?siteID ; site:hasArealYield ?z . ?z qudt:quantityValue ?y . ?y qudt:numericValue ?arealYield . ?y qudt:unit ?unit }
SERVICE <http://localhost/repositories/co2source> { ?source a co2:CO2Source ;
co2:hasSourceID ?sourceID ; co2:hasCO2Emission ?emission . ?emission qudt:quantityValue ?emissionQty . ?emissionQty qudt:numericValue ?emissionValue }
SERVICE <http://localhost/repositories/pipeline> { ?pipe a pipe:Pipeline ;
pipe:hasSiteID ?siteID ; pipe:hasSourceID ?sourceID ; pipe:hasTotalCO2Cost ?cost . ?cost qudt:quantityValue ?qty . ?qty qudt:numericValue ?totalCO2CostValue . ?qty qudt:unit ?totalCO2CostUnit }
SERVICE <http://localhost/repositories/region> { ?regionID a ramon:NUTSRegion ;
owl:sameAs ?related }
FILTER((?emissionValue < 130000)
&& (contains(str(?region), "UKM61"))
&& (?arealYield > 30)
&& (?totalCO2CostValue < 5000))
}
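The shape of such a federated query, one `SERVICE` block per repository followed by a shared `FILTER`, can be assembled programmatically. The helper names `service_block` and `federated_query` below are hypothetical, and the sketch leaves out the `PREFIX` declarations a real query would need. It is a minimal illustration of the structure, not the LEAPS implementation.

```python
def service_block(endpoint, patterns):
    """Render one SERVICE clause for a remote repository."""
    body = "\n".join("    " + pattern for pattern in patterns)
    return f"  SERVICE <{endpoint}> {{\n{body}\n  }}"


def federated_query(variables, services, filter_expr):
    """Assemble a SELECT query with one SERVICE block per (endpoint, patterns) pair."""
    select = "SELECT " + " ".join("?" + name for name in variables)
    blocks = "\n".join(service_block(endpoint, patterns) for endpoint, patterns in services)
    return f"{select}\nWHERE {{\n{blocks}\n  FILTER({filter_expr})\n}}"


if __name__ == "__main__":
    query = federated_query(
        ["siteID", "sourceID"],
        [
            ("http://localhost/repositories/biomass",
             ["?site site:hasSiteID ?siteID ."]),
            ("http://localhost/repositories/pipeline",
             ["?pipe pipe:hasSiteID ?siteID ;", "      pipe:hasSourceID ?sourceID ."]),
        ],
        "?arealYield > 30",
    )
    print(query)
```

Variables shared between blocks, such as `?siteID` and `?sourceID`, are what join the results coming back from the separate repositories.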
System Architecture
Architecture: Main components
Parsing modules lifting the data from their original formats to RDF
Ontologies
Linking engine producing the linked data representation of the datasets
Triple store: OWLIM-SE 5.0
REST Web services
SPARQL endpoints
Web interface
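As a rough illustration of what a parsing module of this kind might do, the following Python sketch lifts a small XML record to N-Triples using only the standard library. The XML element names (`site`, `id`, `nutsRegion`), the `http://example.org/...` namespaces and the function name `lift_sites` are all hypothetical; they are assumptions made for the example and are not taken from the LEAPS implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces, used for illustration only.
SITE_NS = "http://example.org/leaps/site#"
DATA_NS = "http://example.org/leaps/data/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def lift_sites(xml_text):
    """Convert each <site> element into a few N-Triples lines."""
    triples = []
    root = ET.fromstring(xml_text)
    for site in root.iter("site"):
        site_id = site.findtext("id")
        region = site.findtext("nutsRegion")
        subject = f"<{DATA_NS}site/{site_id}>"
        # Type the resource, then attach its identifier and region.
        triples.append(f"{subject} <{RDF_TYPE}> <{SITE_NS}OperationSite> .")
        triples.append(f'{subject} <{SITE_NS}hasSiteID> "{site_id}" .')
        triples.append(
            f"{subject} <{SITE_NS}inNUTSRegion> <{DATA_NS}region/{region}> ."
        )
    return triples


# Hypothetical input record (assumed XML layout).
SAMPLE = """<sites>
  <site><id>S1</id><nutsRegion>UKM61</nutsRegion></site>
</sites>"""

if __name__ == "__main__":
    for line in lift_sites(SAMPLE):
        print(line)
```

A real lifting step would also need to escape literals and validate identifiers before building URIs; the sketch leaves that out to keep the mapping from XML fields to triples visible.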
Querying Linked Algal Biomass Data
Most queries over the datasets retrieve knowledge centered around location information.
The queries are federated across the various repositories holding the linked data.
Representative query:
Which algal operation sites have CO2 sources with CO2 emissions of less than 130000 kg, a total cost of supplying CO2 lower than 5000 GBP per ton of CO2 and an areal yield greater than 30 tons per hectare, and are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.
Typical Query

WHERE {
  SERVICE <http://localhost/repositories/biomass> {
    ?site a site:OperationSite ;
          site:inNUTSRegion ?region ;
          geo:location ?loc .
    ?loc geo:lat ?lat .
    ?loc geo:long ?long .
    ?site site:hasSiteID ?siteID ;
          site:hasArealYield ?z .
    ?z qudt:quantityValue ?y .
    ?y qudt:numericValue ?arealYield .
    ?y qudt:unit ?unit . }
  SERVICE <http://localhost/repositories/co2source> {
    ?source a co2:CO2Source ;
            co2:hasSourceID ?sourceID ;
            co2:hasCO2Emission ?emission .
    ?emission qudt:quantityValue ?emissionQty .
    ?emissionQty qudt:numericValue ?emissionValue . }
  SERVICE <http://localhost/repositories/pipeline> {
    ?pipe a pipe:Pipeline ;
          pipe:hasSiteID ?siteID ;
          pipe:hasSourceID ?sourceID ;
          pipe:hasTotalCO2Cost ?cost .
    ?cost qudt:quantityValue ?qty .
    ?qty qudt:numericValue ?totalCO2CostValue .
    ?qty qudt:unit ?totalCO2CostUnit . }
  SERVICE <http://localhost/repositories/region> {
    ?regionID a ramon:NUTSRegion ;
              owl:sameAs ?related . }
  FILTER((?emissionValue < 130000)
    && (contains(str(?region), "UKM61"))
    && (?arealYield > 30)
    && (?totalCO2CostValue < 5000))
}
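To make the FILTER conditions concrete, the Python sketch below applies the same four thresholds to a few in-memory records. The records, the `http://example.org/nuts/...` region URIs and the helper name `matches_filter` are invented for illustration; in the query itself the filtering is expressed in SPARQL over the federated repositories, not in application code.

```python
def matches_filter(row):
    """Mirror of the query's FILTER: all four conditions must hold."""
    return (
        row["emissionValue"] < 130000
        and "UKM61" in row["region"]  # plays the role of contains(str(?region), "UKM61")
        and row["arealYield"] > 30
        and row["totalCO2CostValue"] < 5000
    )


# Hypothetical joined rows (site + CO2 source + pipeline), illustration only.
ROWS = [
    {"siteID": "S1", "region": "http://example.org/nuts/UKM61",
     "emissionValue": 120000, "arealYield": 35, "totalCO2CostValue": 4200},
    # Emissions above the 130000 threshold, so this row is rejected.
    {"siteID": "S2", "region": "http://example.org/nuts/UKM61",
     "emissionValue": 150000, "arealYield": 40, "totalCO2CostValue": 3000},
    # Located in a different NUTS region, so this row is rejected.
    {"siteID": "S3", "region": "http://example.org/nuts/UKK11",
     "emissionValue": 90000, "arealYield": 45, "totalCO2CostValue": 2500},
]

selected = [row["siteID"] for row in ROWS if matches_filter(row)]
print(selected)
```

Only the first record satisfies every condition, which is the behaviour expected from a conjunction of the four comparisons.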
Related Efforts, Conclusions and Future Work
Related efforts
AquaFuels: a taxonomy of algal strains, available as PDF
BioEnerGIS: a GIS-based decision support tool, BIOPOLE, for biomass plants feeding district heating systems
BioKDF: Bioenergy Knowledge Discovery Framework from the US Department of Energy
Reegle: various energy-related datasets as linked open data, and a SPARQL endpoint to access the datasets
http://www.aquafuels.eu
http://www.bioenergis.eu
https://bioenergykdf.net
http://data.reegle.info
Conclusions
Investigation into using algal biomass as an alternative source of fuel is gaining widespread momentum.
The algal biomass community currently does not employ any knowledge representation techniques to formalise and structure valuable knowledge harnessed through their operations.
As research in the sector progresses, a wealth of information will be available that could be exploited by domain-specific applications.
Summary
The LEAPS framework exploits Semantic Web (SW) and Linked Data (LD) technologies for the algal biomass community by:
enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
proposing a set of domain-specific ontologies for algal plant sites, CO2 sources and pipelines, to be shared and extended by the community
defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation
using a set of well-established and domain-specific ontologies as metadata to transform it further into linked data
providing various data access options, such as a SPARQL endpoint, an interactive Google map interface and a REST API, for making the data accessible to stakeholders
Future Work
Several other datasets need to be integrated once they become available.
One of the core datasets: algal strains from Algaebase.
Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders.
Rule-based reasoning to model and infer domain-specific constraints.
http://www.algaebase.org
Many Thanks