orcid-0000-0002-2668-4821
I am an adjunct professor at the University of North Carolina at Chapel Hill. When I stopped by yesterday for a business meeting, I was informed that I had been lined up to give a talk to the students at 1pm. I had 20 minutes to prepare and assembled a mish-mash of information that might be of value to Citizen Chemists, those who might want to contribute to chemistry on the internet.
Taming the Wild, Wild West of Chemistry on the Internet. Maybe YOU Can Help?
Citizen Scientists Enable the Web
Who is writing about chemical compounds on Wikipedia?
Who is writing critical reviews of Chemistry online?
Who is blogging about chemistry on the web?
For Synthesis… TotallySynthetic.com
Org Prep Daily (Blog)
Molbank (Open Access Journal)
Synthetic Pages (Website)
Encyclopedic Articles (Wikipedia)
Chemistry online – An Overview
• Encyclopedic articles (Wikipedia)
• Chemical vendor databases
• Metabolic pathway databases
• Property databases
• Chemical Synthesis procedures
• Scientific publications
• Chemical vendors
• Blogs
• Wikis
• Open Notebook Science
What and who do you trust?
Compounds and Identifiers
What is ChemSpider? ChemSpider is:
Building a Structure Centric Community for Chemists
>23 million compounds, ca. 250 data sources
A deposition and curation platform
A publishing platform for the community
Grows daily – more depositions, more links, more data sources
Search Cholesterol
Linked across the internet
Link off a structure in ChemSpider
• Chemical suppliers
• Other publications
• Analytical Data
• Related Reactions
• Wikipedia
• Patents
• “Everything”
Linked to Millions of Articles
Answering Questions for Chemists
Questions a chemist might ask…
• What is the melting point of n-butanol?
• What is the chemical structure of Xanax?
• Chemically, what is phenolphthalein?
• What are the stereocenters of cholesterol?
• Where can I find publications about xylene?
• What are the different trade names for Ketoconazole?
• What is the NMR spectrum of Aspirin?
• What are the safety handling issues for Thymol Blue?
What is the structure of Flibanserin?
Complex Data and Information
Various Searches
Structure searching
Substructure searching
Subset searching – choose from 200 data sources
Property searching
Searches are used in various ways by different types of chemists…
ChemSpider Searches
Antony Williams vs Identifiers
Passport ID
Dad, Tony, others
SSN
Green Card
License
5 email addresses
ChemSpiderman (blog, Twitter account, Facebook, Friendfeed)
OpenID…
Aspirin vs Chemical Identifiers
Aspirin names and synonyms
• Text searches depend on correct association
• 335 suggested identifiers for Aspirin just on PubChem!
• Disambiguation dictionaries are necessary
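The point about disambiguation dictionaries can be sketched in a few lines of Python. This is only an illustration, not how ChemSpider or PubChem store synonyms: the `SYNONYMS` table, `normalize`, and `resolve` are hypothetical names introduced here, and the three entries are a tiny sample of the hundreds of names mentioned above. Each name maps to a single structure identifier, here the Standard InChIKey for aspirin.

```python
from typing import Optional

# Standard InChIKey for aspirin, used as the single structure identifier.
ASPIRIN_KEY = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"

# Hypothetical disambiguation dictionary: many names, one structure.
SYNONYMS = {
    "aspirin": ASPIRIN_KEY,
    "acetylsalicylic acid": ASPIRIN_KEY,
    "2-acetoxybenzoic acid": ASPIRIN_KEY,
}


def normalize(name: str) -> str:
    """Lower-case the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def resolve(name: str) -> Optional[str]:
    """Return the structure identifier for a name, or None if unknown."""
    return SYNONYMS.get(normalize(name))


print(resolve("  Aspirin "))
print(resolve("Acetylsalicylic   Acid"))
print(resolve("not a chemical"))
```

A text search is only as good as this table: a wrong name-to-structure association in the dictionary silently sends every query for that name to the wrong compound.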
The Final Search Strategy
All Those Names, One Structure
Connections Can Lead Anywhere
The InChI Identifier
Multiple Layers
InChI Strings Hash to InChIKeys
Oleoylethanolamine
InChI=1S/C20H39NO2/c1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-20(23)21-18-19-22/h9-10,22H,2-8,11-19H2,1H3,(H,21,23)/b10-9-
BOWVQLFMWHZBEF-KTKRTIGZSA-N
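As a rough illustration of the "multiple layers" idea, the Python sketch below takes the two oleoylethanolamine strings above apart. It does not compute the hash that turns an InChI into an InChIKey; it only splits the published strings. The function names `split_inchi` and `split_inchikey` are hypothetical, and the layer splitter assumes a simple Standard InChI in which each layer prefix letter occurs once.

```python
import re

INCHI = ("InChI=1S/C20H39NO2/c1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-"
         "20(23)21-18-19-22/h9-10,22H,2-8,11-19H2,1H3,(H,21,23)/b10-9-")
INCHIKEY = "BOWVQLFMWHZBEF-KTKRTIGZSA-N"


def split_inchi(inchi: str) -> dict:
    """Split an InChI string into its '/'-separated layers.

    Assumes each layer prefix letter appears at most once.
    """
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    version, formula, *rest = inchi[len("InChI="):].split("/")
    layers = {"version": version, "formula": formula}
    for layer in rest:
        layers[layer[0]] = layer[1:]  # first character names the layer
    return layers


# 14 letters, hyphen, 8 letters + standard flag + version letter,
# hyphen, one protonation letter.
KEY_PATTERN = re.compile(r"^([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])$")


def split_inchikey(key: str) -> dict:
    """Split an InChIKey into its blocks and flag characters."""
    match = KEY_PATTERN.match(key)
    if match is None:
        raise ValueError("not an InChIKey")
    skeleton, other, flag, version, protonation = match.groups()
    return {
        "connectivity_block": skeleton,  # hash of formula and connectivity
        "other_layers_block": other,     # hash of stereo and other layers
        "standard": flag == "S",         # S = Standard InChI
        "version": version,
        "protonation": protonation,      # N = neutral
    }


layers = split_inchi(INCHI)
print(layers["formula"], layers["b"])
print(split_inchikey(INCHIKEY))
```

The `b` layer here carries the double-bond stereochemistry, and it contributes to the second block of the key rather than the first.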
Search Engine Dependencies
Vancomycin
Who will curate?
How would you clean such a large dataset?
Chemistry on the Internet
Much of the information is based on assertions. User beware!
The quality of the information available varies widely. How does the user know what is and is not “correct”?
Caution! Question Everything!
Question Everything online: www.dhmo.org
Vancomycin on ChemSpider
Vancomycin
Search Molecular SKELETON
Search Full Molecule
Full Skeleton Search: 104 Hits
Full Molecule Search: 4 Hits
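One way to picture the difference between the two hit counts is with InChIKeys: the first block of the key depends only on formula and connectivity, so matching on it alone gives a skeleton-style search, while matching the whole key gives a full-molecule search. The slides do not say that ChemSpider implements its searches this way, so treat the Python below as a sketch. The `RECORDS` list is hypothetical, and the second key in it uses a made-up second block purely for illustration.

```python
# Hypothetical record list of (label, InChIKey) pairs.
RECORDS = [
    ("record A", "BOWVQLFMWHZBEF-KTKRTIGZSA-N"),  # key shown in the slides
    ("record B", "BOWVQLFMWHZBEF-AAAAAAAASA-N"),  # made-up second block
    ("record C", "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"),  # aspirin, unrelated
]


def skeleton_hits(query_key: str, records: list) -> list:
    """Match on the first 14 characters only (formula and connectivity)."""
    skeleton = query_key.split("-")[0]
    return [label for label, key in records if key.split("-")[0] == skeleton]


def full_hits(query_key: str, records: list) -> list:
    """Match on the complete key, including the stereo-sensitive block."""
    return [label for label, key in records if key == query_key]


query = "BOWVQLFMWHZBEF-KTKRTIGZSA-N"
print("skeleton:", skeleton_hits(query, RECORDS))
print("full:", full_hits(query, RECORDS))
```

The skeleton match always returns at least as many records as the full match, which mirrors the 104 versus 4 hits above: many deposited records share the connectivity but differ in how the stereochemistry is drawn.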
The EXPERTS must get it right?!
Wikipedia, C&E News, PubChem
C&E News (from ACS)
“Lathosterol”
“Lathosterol” Removed
“Lathosterol” on PubChem
Crowd-sourcing Chemistry Curation
Crowd-sourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate
Citizen Scientists
Become a Data Source
Synthesis Procedures
Links to Data or Deposit Data
Your Blog Posted Online?
Upload Spectral Data, OPEN Data?
Semantic Mark-up for Chemistry
Semantic mark-up for chemistry is here
RSC Project Prospect (structure linking, IUPAC Gold Book ontology and other ontologies), based on the OSCAR system
ChemSpider Journal of Chemistry
Nature publishing group compound linking
ChemMantis and CJOC
Name-Structure Pairs
Deposit Structures
Species – linked to Wikipedia
In Development: ChemSpider Synthesis
ChemSpider Synthesis will be a home for all things “synthetic”
An online resource for synthetic procedures from blogs, other online resources, RSC supplementary info, other publishers etc.
Public peer-review and feedback for synthetic procedures
Online Journals and Live Data
ChemSpider Everywhere: Embed
ChemSpider Everywhere: Spectral Game
ChemSpider Everywhere: Crowdsourced Curation of Spectra
ChemSpider Everywhere: ChemMobi
ChemSpider Everywhere
Linked from Wikipedia
Linked from Open Notebook Science sites
Linked from Blogs using Structure/Spectra
Integrated into structure drawing packages such as ACD/ChemSketch, Symyx Draw, Open Source applets
Where is ChemSpider Lacking?
ChemSpider is limited to “defined chemicals”. No support for:
• Polymers
• Minerals
• Markush structures
ChemSpider is very dependent on InChIs:
• Stereochemistry around non-carbon centers
• Organometallics are not correctly represented
There are millions of errors on ChemSpider
What’s next?
Keep cleaning and depositing data
Enable discovery via the semantic web (RDF)
Integrate software: Symyx JDraw, NMRShiftDB
Integrate RSC content – a massive archive!
Integrate RSC publishing workflows and databases
Continue Building Community for Chemistry
Building a Public ADME/Tox database
Delivering ChemSpider Synthetic Pages
Delivering ChemSpider Analytical Data
Delivering ChemSpider Education
Project Focus
People Make Change Happen
You are invited…
• Curate ChemSpider data and link to us
• Deposit your data with us: Structures, Spectra, Synthesis procedures
ChemSpider Synthesis is under development
People Make Change Happen
ChemSpider was a “hobby project”
Housed in a basement and running off three servers – one bought, two built
Sensitive to weather and power stability
Went live at ACS Spring 2007 in Chicago
ca. 6000 visitors a day, >50,000 transactions daily
Organizations Scale Innovation
Thank you
[email protected]: ChemSpiderman
www.chemspider.com/blog
SLIDES: www.slideshare.net/AntonyWilliams