Data and Ontologies
Colin Batchelor, Royal Society of Chemistry Leah McEwen, Cornell University
Overview
Thinking about chemical safety
Representing experiments
Ontologies and the IUPAC Colour books
Gaps
What can ontologies do for my use case?
Provide a controlled vocabulary for referents – what your data describes, where it came from.
Provide a shared vocabulary for integrating with other people’s data.
Safety
Recall the noun classes in Dyirbal (a language spoken in Queensland):
(1) Men, most animate objects
(2) Women, fire and dangerous things
(3) Edible fruit and vegetables
(4) Things not mentioned in the first three classes
What dangerous things should we be identifying?
Caution! Dangerous things!
“Although we have encountered no problems in handling Cu-azido during this work, it should be treated with great caution owing to their potential explosive nature. Thus, it should only be handled in small amounts.” doi:10.1039/b702988h
“Whilst no problems were encountered in the course of this work, perchlorate mixtures are potentially explosive and should therefore be handled with appropriate care.” doi:10.1039/b304841a
“Thallium compounds are highly toxic; they should therefore be handled with extreme caution and all operations must be carried out in an efficient fume hood.” doi:10.1039/b102192n
“we have not seen deflagration or detonation of any unconfined samples in the ignition experiments, some salts with high-oxygen and high-nitrogen content are known to be explosives, so appropriate precautions are advisable with new compounds” doi:10.1039/b602086k
“Metal azide complexes are potentially explosive. Only a small amount of material should be prepared and should be handled with caution” doi:10.1039/b106314f
“perchlorate salts of metal complexes are potentially explosive” doi:10.1039/b005671p
“Special caution was taken in the handling of fluoranthene and in the preparation of the MIPs.” doi:10.1039/b502706c
“anhydrous HF is an extremely corrosive and low boiling gas (19.5 °C) and should be handled in a well ventilated hood with protective gloves, face mask and clothing.” doi:10.1039/b206168f
A ‘simple’ taxonomy of dangerous things
Dangerous chemicals: perchlorates, thallium compounds, polonium, azides
… which have the disposition to take part in…
Dangerous processes: explosion, decarbonylation
… under certain experimental conditions…
… and these processes can be prevented (blocking the dispositions) or mitigated by…
Safety measures
Fume hoods mitigate carbon monoxide emission (part of decarbonylation)
Protective clothing prevents burning.
Greener solvents improve waste handling.
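The taxonomy above can be sketched as a handful of subject/predicate/object statements. This is only an illustration in plain Python: the relation names (has_disposition_to, prevents, mitigates, part_of) and the identifiers are assumptions made for the example and are not drawn from any published ontology. The explosion dispositions follow the cautionary quotations earlier in the talk.

```python
# Hypothetical identifiers and relation names, chosen for illustration only.
TRIPLES = [
    ("perchlorate", "has_disposition_to", "explosion"),
    ("azide", "has_disposition_to", "explosion"),
    ("protective_clothing", "prevents", "burning"),
    ("fume_hood", "mitigates", "carbon_monoxide_emission"),
    ("carbon_monoxide_emission", "part_of", "decarbonylation"),
]


def hazards_of(chemical):
    """Dangerous processes that a chemical is disposed to take part in."""
    return sorted(
        o for s, p, o in TRIPLES
        if s == chemical and p == "has_disposition_to"
    )


def measures_for(process):
    """Safety measures that prevent or mitigate a process or one of its parts."""
    parts = {s for s, p, o in TRIPLES if p == "part_of" and o == process}
    targets = parts | {process}
    return sorted(
        (s, p, o) for s, p, o in TRIPLES
        if p in ("prevents", "mitigates") and o in targets
    )
```

Asking measures_for("decarbonylation") finds the fume hood through the part_of link, which is the kind of inference a risk assessment would want an ontology to support.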
How do we express all this in an ontological framework?
Representing processes (1)
RDF is based on binary relations.
A process: “Brutus stabbed Caesar”
Simplest RDF form: ORCID:Marcus_Junius_Brutus ONT:stabbed ORCID:Gaius_Julius_Caesar .
Unsafe workplace practices, 44 BC
The Death of Caesar, Vincenzo Camuccini (1771–1844), via Wikipedia.
Representing processes (1)
A naïve representation of: “Brutus stabbed Caesar”:
Subject: ORCID:Marcus_Junius_Brutus
Predicate: ONT:stabbed
Object: ORCID:Gaius_Julius_Caesar
Representing processes (2)
But what about “Brutus stabbed Caesar in the Senate on the Ides of March”?
Make the focus of the RDF the process rather than the participants.
Hence (next page):
Representing processes (2)
_:e1 a ONT:stabbing;
ONT:has_agent ORCID:Marcus_Junius_Brutus;
ONT:has_patient ORCID:Gaius_Julius_Caesar;
ONT:has_location ONT:Senate;
ONT:at_time "-0043-03-15T14:00:00"^^xsd:dateTime .
We can now add arbitrarily many facts about this event without minting too many new predicates; better for OWL; better for risk assessment.
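A minimal sketch of the difference between the two representations, using plain Python tuples in place of an RDF store. The prefixed names are the ones used in the slides; the tuple encoding, the describe helper, and the ONT:has_instrument and ONT:dagger names are assumptions introduced for illustration.

```python
# Naive form: one binary relation between the two participants.
naive = [
    ("ORCID:Marcus_Junius_Brutus", "ONT:stabbed", "ORCID:Gaius_Julius_Caesar"),
]

# Event-centred form: the process _:e1 is the subject of every statement.
event = [
    ("_:e1", "rdf:type", "ONT:stabbing"),
    ("_:e1", "ONT:has_agent", "ORCID:Marcus_Junius_Brutus"),
    ("_:e1", "ONT:has_patient", "ORCID:Gaius_Julius_Caesar"),
    ("_:e1", "ONT:has_location", "ONT:Senate"),
]


def describe(triples, node):
    """Collect every predicate/object pair attached to one node."""
    return {p: o for s, p, o in triples if s == node}


# A further fact is one more triple about the same event.
# ONT:has_instrument and ONT:dagger are hypothetical names.
event.append(("_:e1", "ONT:has_instrument", "ONT:dagger"))
```

In the naive form there is nowhere to attach the location or the instrument without inventing a predicate such as "stabbed in the Senate". In the event-centred form, describe(event, "_:e1") returns everything known about the stabbing in one lookup.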
Example synthesis
[Figure: flow diagram of an example synthesis. Process labels: purging, addition, addition with stirring, stirring and heating, cooling, precipitation, washing, drying. Material labels: dinitrogen, pivaloyl chloride, tert-butyl alcohol, trifluoromethanesulfonic acid, diethyl ether (twice), air. Apparatus labels: flask, heating mantle, ice bath, filter. Output: PRODUCT.]
Chemical processes: OreChem
OreChem has a planning/enactment split.
From OreChem’s perspective, the account of planning is prospective and the account of enactment is retrospective.
Planning relates processes to other processes. Such-and-such a process follows another.
Enactment relates the products of processes to other products of processes. Such-and-such an artefact is produced from another.
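The planning/enactment split can be illustrated with two small sets of relations. This is a sketch in plain Python, not OreChem's actual vocabulary: the relation names follows and produced_from, and the step and artefact labels, are assumptions made for the example.

```python
# Planning view (prospective): processes related to other processes.
plan = [
    ("addition", "follows", "purging"),
    ("heating", "follows", "addition"),
]

# Enactment view (retrospective): artefacts related to other artefacts.
enactment = [
    ("reaction_mixture", "produced_from", "purged_flask"),
    ("crude_product", "produced_from", "reaction_mixture"),
]


def chain(relations, start):
    """Walk a linear, acyclic relation forward from start.

    Each triple reads (later, relation, earlier), so the walk looks up
    which item comes after the current one.
    """
    successor = {earlier: later for later, _, earlier in relations}
    ordered = [start]
    while ordered[-1] in successor:
        ordered.append(successor[ordered[-1]])
    return ordered
```

The same traversal works on both views: chain(plan, "purging") gives the intended order of steps, while chain(enactment, "purged_flask") gives the provenance of what was actually made.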
Method signatures
A sample preparation step takes a material entity and converts it into some other. (m → m)
Detection methods and measurement methods take material entities and produce data. (m → d)
Data transformation methods take data and transform it into other data (d → d)
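The three signatures can be written down as function types. The sketch below uses Python type hints; the Material and Data classes and the wash, weigh and to_milligrams functions are hypothetical stand-ins, and the "measurement" simply reads a stored mass rather than talking to an instrument.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Material:
    name: str
    mass_g: float


@dataclass(frozen=True)
class Data:
    values: Tuple[float, ...]


SamplePreparation = Callable[[Material], Material]   # m -> m
Measurement = Callable[[Material], Data]             # m -> d
DataTransformation = Callable[[Data], Data]          # d -> d


def wash(m: Material) -> Material:
    """Sample preparation: a material entity in, a material entity out."""
    return Material("washed " + m.name, m.mass_g)


def weigh(m: Material) -> Data:
    """Measurement: a material entity in, data out (placeholder reading)."""
    return Data((m.mass_g,))


def to_milligrams(d: Data) -> Data:
    """Data transformation: data in, data out."""
    return Data(tuple(v * 1000.0 for v in d.values))


def pipeline(m: Material) -> Data:
    # The types only line up in this order: m -> m, then m -> d, then d -> d.
    return to_milligrams(weigh(wash(m)))
```

Because the signatures constrain how steps compose, a type checker would reject, for example, washing the output of a measurement.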
Ontology scope
Process signature    Molecular scale    Laboratory scale
m → m                RXNO               CHMO
m → d                CHMO               CHMO
d → d                CHEMINF
Chemical processes: CHMO
CHMO, the Chemical Methods Ontology, provides classes that describe both the processes (in a planning view) and the artefacts (in an enactment view).
From the safety discussion: where are the gaps?
• Connections to chemical hazard information resources (GHS and Bretherick’s)
• More thorough description of common lab apparatus
• Combined processes (addition while stirring)
• Other chemical hazard and safety concepts
Ontologies and the IUPAC Colour Books: an important question
Is an ontology the best tool for codifying a colour book? If it contains terminological recommendations, then these definitions can be used for the ontology. Examples follow.
Red Book: counterexample
This specifies an algorithm for relating names to structures rather than a definition. It could be part of a recipe for generating definitions for an identified set of hydride molecules.
Gold Book example
“Phototransistor = A bipolar transistor with its base-collector junction acting as a photodiode, which, if irradiated, controls the response of the device.”
This is a classical Aristotelian (genus–differentia) definition and is well suited to an ontology.
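A genus–differentia definition splits cleanly into a parent class and a distinguishing condition, which is why it maps well onto an ontology class. The sketch below is only an illustration: the Definition class and its field names are assumptions, and the text is the Gold Book wording quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    term: str         # the class being defined
    genus: str        # the parent class it belongs to
    differentia: str  # what sets it apart from its siblings

    def render(self) -> str:
        """Reassemble the definition in 'term = A genus differentia' form."""
        return f"{self.term} = A {self.genus} {self.differentia}"


phototransistor = Definition(
    term="Phototransistor",
    genus="bipolar transistor",
    differentia=(
        "with its base-collector junction acting as a photodiode, "
        "which, if irradiated, controls the response of the device."
    ),
)
```

In an ontology the genus would become the superclass assertion and the differentia would become the restriction that distinguishes this class from other bipolar transistors.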
Green Book examples
Definitions of fundamental constants would fit well into an appropriate ontology.
Notational recommendations would fit better into an automated article checker or writing assistant.
Ontologies and the IUPAC Colour Books: an overview
Ontologies and the IUPAC Colour Books: where are the gaps?
[Figure: coverage diagram. Labels: ChEBI, CHMO, OPSIN, OBO ontologies; five regions labelled "gap".]
What if nothing already exists?
Are there vocabulary recommendations of the right sort?
(Suited to an ontology rather than some other sort of tool.)
If not, and in any case these will be incomplete, try…
a Gedankenexperiment
… following the event-based approach described earlier to divide your domain into processes and their participants.
www.irampp.org/blog