Collection 4

Chemical Information Sources/Cheminformatics

Introduction

Cheminformatics is the application of information technology to the investigation of chemistry research problems and to the organization and analysis of chemical data. Cheminformaticians work with huge amounts of data and develop systems to organize and evaluate data to give new insights for further chemical research. There is a fine line between theoretical chemistry/computational chemistry and cheminformatics. Cheminformatics has had its biggest impact in the pharmaceutical industry, although its techniques and tools are beginning to be applied to other areas of chemistry.

How Can Cheminformatics Help?

Cheminformatics can help chemists and other scientists produce and manage information. In silico analysis using cheminformatics techniques can reduce the risks of developing a drug. Techniques such as virtual screening, library design, and docking figure into the analysis. Physical properties that might affect whether a substance could be developed as a drug are often examined in cheminformatics as features that can be compared among large numbers of substances. An example is clogP, the calculated logarithm of the octanol-water partition coefficient, which measures how lipophilic ("fatty") a compound is. Sometimes, inferences can be drawn about a related set of properties, as when Chris Lipinski formulated his now famous Rule of Five, which says that drug-like compounds tend to have 5 or fewer hydrogen-bond donors, 10 or fewer hydrogen-bond acceptors, a calculated logP less than or equal to 5, and a molecular weight up to 500. Compounds that exceed these values tend to have poor absorption or permeation.
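The Rule of Five thresholds can be expressed as a short filter. The following is a minimal sketch, assuming the four properties have already been computed by some other tool; the `Descriptors` container and the function name `rule_of_five_violations` are hypothetical, and the aspirin values are approximate figures used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    """Precomputed properties for one compound (hypothetical container)."""
    h_bond_donors: int
    h_bond_acceptors: int
    clogp: float
    mol_weight: float


def rule_of_five_violations(d: Descriptors) -> List[str]:
    """Return the names of the Rule of Five criteria that the compound exceeds."""
    violations = []
    if d.h_bond_donors > 5:
        violations.append("h_bond_donors")
    if d.h_bond_acceptors > 10:
        violations.append("h_bond_acceptors")
    if d.clogp > 5:
        violations.append("clogp")
    if d.mol_weight > 500:
        violations.append("mol_weight")
    return violations


# Approximate values for aspirin, for illustration only.
aspirin = Descriptors(h_bond_donors=1, h_bond_acceptors=4,
                      clogp=1.2, mol_weight=180.16)
print(rule_of_five_violations(aspirin))  # []
```

Returning the list of exceeded criteria, rather than a single yes/no answer, makes it easy to compare large numbers of substances by which property causes the problem.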

Techniques Used In Cheminformatics

•• Representation of molecular structures (2D, 3D, protein structures, 3-point pharmacophores, fragments)
    •• Graph isomorphism: determining if 2 graphs are identical, e.g., by comparing connection tables
    •• Line notations, e.g., SMILES
•• Representation of chemical reactions
•• Molecular Modeling (simulations) and Molecular Diversity
•• Structure-Activity Relationships (QSAR, QSPR)
•• Combinatorial Chemistry and High-Throughput Screening
•• Calculation of physicochemical effects
•• Topological Indices
•• Statistics
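The graph isomorphism item in the list above can be illustrated with a small example. The sketch below is a brute-force check that is only practical for very small molecules; real systems use canonical numbering or far more efficient matching. It assumes a simplified connection table (a list of element symbols plus a list of bonded atom-index pairs), ignores bond orders and hydrogens, and the function names are hypothetical.

```python
from itertools import permutations


def canonical_bonds(bonds):
    """Store each bond as an unordered pair so (0, 1) and (1, 0) compare equal."""
    return {frozenset(b) for b in bonds}


def is_isomorphic(atoms_a, bonds_a, atoms_b, bonds_b):
    """Brute-force test: try every relabelling of A's atoms onto B's atoms."""
    if sorted(atoms_a) != sorted(atoms_b):
        return False
    edges_a = canonical_bonds(bonds_a)
    edges_b = canonical_bonds(bonds_b)
    if len(edges_a) != len(edges_b):
        return False
    n = len(atoms_a)
    for perm in permutations(range(n)):
        # perm[i] is the index in B that atom i of A is mapped to.
        if any(atoms_a[i] != atoms_b[perm[i]] for i in range(n)):
            continue
        mapped = {frozenset(perm[i] for i in bond) for bond in edges_a}
        if mapped == edges_b:
            return True
    return False


# Heavy-atom skeleton of ethanol, numbered in two different ways.
ethanol_1 = (["C", "C", "O"], [(0, 1), (1, 2)])
ethanol_2 = (["O", "C", "C"], [(0, 1), (1, 2)])
# Dimethyl ether has the same atoms but different connectivity.
ether = (["C", "O", "C"], [(0, 1), (1, 2)])

print(is_isomorphic(*ethanol_1, *ethanol_2))  # True
print(is_isomorphic(*ethanol_1, *ether))      # False
```

The two ethanol tables describe the same structure despite different atom numbering, while dimethyl ether is rejected because no relabelling reproduces its bonds.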

InChI--The IUPAC International Chemical Identifier

An InChI [1] is a character string generated by a computer algorithm to represent a chemical structure. It is used in software applications and databases where chemical structures need to be represented as machine-readable strings of information. InChIs are unique to the compound they describe and can encode absolute stereochemistry. InChI has been called the bar code for chemistry and chemical structures. The InChI format and algorithm are non-proprietary and the software is open source, with ongoing development done by the community.

Steve Heller wrote in a September 15, 2010 posting on CHMINF-L that virtually all major publishers are now supporting InChI and are adding the InChI/InChIKey to the chemicals reported in journal articles. InChIs and InChIKeys are searchable in Google, Yahoo, Bing, and other search engines. The two major NIH databases (PubChem and NCI) have over 60 million InChIs, while ChemSpider has well over 20 million. All the major commercial and open source structure drawing programs have embedded InChI generation in their products. InChIs are freely usable and allow a more advanced representation of chemical information than other codes (such as the SMILES code). InChIs are unambiguous (i.e., conversion of a chemical structure using the standardized algorithm leads to only one InChI), and they are precisely indexed by major search engines such as Google.
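To show what such a machine-readable string looks like, the sketch below splits the standard InChI of ethanol into its slash-separated parts. It is a simplified illustration, not a full parser: it assumes the second part is the chemical formula and that each layer prefix letter occurs only once, which does not hold for every InChI. The function name `split_inchi` is hypothetical.

```python
def split_inchi(inchi):
    """Split an InChI string into its version, formula, and prefixed layers."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    version = parts[0]
    formula = parts[1] if len(parts) > 1 else ""
    layers = {}
    for part in parts[2:]:
        if part:
            # The first character names the layer, e.g. 'c' for connections
            # and 'h' for hydrogens; the rest is the layer content.
            layers[part[0]] = part[1:]
    return version, formula, layers


ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
version, formula, layers = split_inchi(ethanol)
print(version)  # 1S
print(formula)  # C2H6O
print(layers)   # {'c': '1-2-3', 'h': '3H,2H2,1H3'}
```

Generating an InChI from a structure requires the official InChI software; this example only reads a string that already exists.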

Standards for Coding Chemical Data

In order for cheminformatics to succeed, certain standards had to be developed, although a format developed by a dominant company often became a standard coding method once it was made public, as in the case of MDL's SDF format or, more recently, their CTfile format [2]. In the field of crystallography, the CIF format is widely used for small molecules and mmCIF for macromolecules. Even for such things as the color of molecules in a 3D depiction, it is important to follow standards. For example, the CPK (Corey-Pauling-Koltun) representation for color coding requires:
•• Carbon: grey or black (although some use green)
•• Hydrogen: white
•• Oxygen: red
•• Nitrogen: blue
•• Sulfur: yellow
•• Phosphorus: orange
•• Chlorine: green
•• Sodium: blue
•• Iron: purple
•• Bromine: brown
•• Zinc: brown
•• Calcium: dark grey
•• Other metals: dark grey
•• Unknown: deep pink
CPK models have their atomic radii defined to reflect the space which molecules take up when they pack in solids or associate in liquids.
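In software, a color convention like this is typically stored as a lookup table. The following is a minimal sketch using the colors listed above, keyed by element symbol. The `OTHER_METALS` set is a hypothetical and deliberately incomplete stand-in for the "other metals" rule, and carbon is fixed to grey although the convention also allows black.

```python
# Colors taken from the CPK list above; keys are element symbols.
CPK_COLORS = {
    "C": "grey",
    "H": "white",
    "O": "red",
    "N": "blue",
    "S": "yellow",
    "P": "orange",
    "Cl": "green",
    "Na": "blue",
    "Fe": "purple",
    "Br": "brown",
    "Zn": "brown",
    "Ca": "dark grey",
}

# Hypothetical, incomplete set of further metals for the "other metals" rule.
OTHER_METALS = {"Li", "K", "Mg", "Al", "Cu", "Ni", "Co", "Mn"}


def cpk_color(symbol):
    """Return the CPK color name for an element symbol."""
    if symbol in CPK_COLORS:
        return CPK_COLORS[symbol]
    if symbol in OTHER_METALS:
        return "dark grey"
    return "deep pink"  # fallback for unknown elements


print(cpk_color("O"))   # red
print(cpk_color("Cu"))  # dark grey
print(cpk_color("Xx"))  # deep pink
```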

Current Issues in Cheminformatics

•• What is a small molecule?
•• What is an adequate representation of a sample?
•• Property calculations vs. measurements
•• Scoring functions for drug-like molecules
•• Docking for ligand binding prediction
•• Calculating diversity and similarity
•• Where do cheminformatics and bioinformatics merge?
•• Toxicology, ADME (Absorption, Distribution, Metabolism, Excretion), and other pieces of the puzzle for drugs
•• Depictions of structure and visualization of data
•• Electronic notebooks

Summary

Cheminformatics (or, as it is more commonly known in Europe, chemoinformatics) has almost as long a history as the computer itself. It is the application of computer technology and methods to chemistry. Related fields are molecular modeling and computational chemistry. Cheminformatic techniques have found particular applications in the drug industry, but are now beginning to penetrate other areas of chemistry.

CIIM Link for further study

SIRCh Link for Cheminformatics
•• Cheminformatics Introductory Resources

References
[1] http://www.inchi-trust.org/
[2] http://www.mdl.com/downloads/public/ctfile/ctfile.jsp

Article Sources and Contributors
Chemical Information Sources/Cheminformatics  Source: http://en.wikibooks.org/w/index.php?oldid=2014823  Contributors: Adrignola, Gary Dorman Wiggins

License
Creative Commons Attribution-Share Alike 3.0 Unported
//creativecommons.org/licenses/by-sa/3.0/