Lecture 10 Cheminformatics of TCM

Y.Z. Chen, Department of Pharmacy

National University of Singapore Tel: 65-6616-6877; Email: phacyz@nus.edu.sg; Web: http://bidd.nus.edu.sg

Content

• TCM ingredients and databases

• Digital representation of TCM ingredients

• Molecular descriptors

• TCM ingredient classification by molecular descriptors

TCM Ingredients

Pharmacology & Therapeutics 2000, 86:191-198

Medicinal Herb Databases at BIDD

Comparison with existing TCM databases:

• Formula: TCM-ID: 1000; TCHFL: 270

• Herb: TCM-ID: 1200; TCSHL: 520; TCMD: 1500

• Compound: TCM-ID: 9000; CNPD: 3000; TCMD: 6800

(Figure: diagram linking TCM Formula, Herb, Compound, Protein, Function and Structure entries, with the databases TCMD, TCHF Library, CNPD and TCSH Library marked)

TCM-ID: Traditional Chinese Medicine - Information Database

Only database providing integrated and comprehensive info about:

• TCM formula, constituent herbs, herbal ingredients, effect on proteins

• Molecular structure

• Function at the formula, herb and compound levels

TCM-ID Database at BIDD

http://bidd.nus.edu.sg/group/TCMsite/Default.aspx

PUBCHEM Database

http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=90781

Representation of Herbal Ingredients by SMILES

• Simplified Molecular Input Line Entry System (SMILES)

• Widely used and computationally efficient

• Uses atomic symbols and a set of intuitive rules

• Uses hydrogen-suppressed molecular graphs (HSMG)

SMILES Bonds

• SINGLE*: -

• DOUBLE: =

• TRIPLE: #

• AROMATIC*: :

* can be omitted

Butanols

(Figure: structure drawings of 2-Butanol, iso-Butanol and tert-Butanol)

SMILES Branches

• Represented by enclosure in parentheses

• Can be nested or stacked

• Examples:

– CC(O)CC is 2-Butanol

– OCC(C)C is iso-Butanol

– OC(C)(C)C is tert-Butanol
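The branch rule can be illustrated with a small stack-based reader. This is only a sketch under stated assumptions: it handles ring-free SMILES built from single-letter atoms plus Cl and Br, ignores bond order, and the function names `parse_branches` and `degrees` are made up for this example.

```python
def parse_branches(smiles):
    """Return (atoms, bonds) for a ring-free SMILES, using a stack for branches."""
    atoms, bonds = [], []
    stack = []    # atoms to return to when a branch closes
    prev = None   # index of the atom that the next atom bonds to
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "(":
            stack.append(prev)       # remember the branch point
        elif ch == ")":
            prev = stack.pop()       # go back to the branch point
        elif ch in "-=#":
            pass                     # bond order is ignored in this sketch
        else:
            # try the two-letter halogens first, then a single-letter atom
            sym = smiles[i:i + 2] if smiles[i:i + 2] in ("Cl", "Br") else ch
            atoms.append(sym)
            if prev is not None:
                bonds.append((prev, len(atoms) - 1))
            prev = len(atoms) - 1
            i += len(sym) - 1
        i += 1
    return atoms, bonds


def degrees(smiles):
    """Pair each heavy atom with its number of heavy-atom neighbours."""
    atoms, bonds = parse_branches(smiles)
    deg = [0] * len(atoms)
    for a, b in bonds:
        deg[a] += 1
        deg[b] += 1
    return list(zip(atoms, deg))


for name, smi in [("2-Butanol", "CC(O)CC"),
                  ("iso-Butanol", "OCC(C)C"),
                  ("tert-Butanol", "OC(C)(C)C")]:
    print(name, degrees(smi))
```

In tert-Butanol the two stacked branches leave the second atom with four heavy-atom neighbours, while the branched carbon in 2-Butanol and iso-Butanol has three.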

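SMILES Bonds

• Ethene: C=C

• Chloroethene: ClC=C

• 1,1-Dichloroethene: ClC(Cl)=C

• cis-1,2-Dichloroethene: ClC=CCl (double-bond geometry is not encoded in this string; the cis isomer can be written explicitly as Cl/C=C\Cl)

• Trichloroethene: ClC(Cl)=CCl

• Perchloroethene: ClC(Cl)=C(Cl)Cl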
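As a quick check on the chloroethene series, the heavy atoms in each string can be counted with a regular expression. This is a minimal sketch: the pattern covers only the uppercase organic-subset symbols, and the helper name `heavy_atom_counts` is an assumption of this example.

```python
import re
from collections import Counter

# Two-letter symbols must come first so "Cl" is not read as "C" plus "l".
ATOM = re.compile(r"Cl|Br|[BCNOPSFI]")


def heavy_atom_counts(smiles):
    """Count each heavy-atom symbol in a simple SMILES string."""
    return Counter(ATOM.findall(smiles))


series = {
    "Ethene": "C=C",
    "Chloroethene": "ClC=C",
    "1,1-Dichloroethene": "ClC(Cl)=C",
    "1,2-Dichloroethene": "ClC=CCl",
    "Trichloroethene": "ClC(Cl)=CCl",
    "Perchloroethene": "ClC(Cl)=C(Cl)Cl",
}

for name, smi in series.items():
    counts = heavy_atom_counts(smi)
    print(f"{name:20s} {smi:16s} C={counts['C']} Cl={counts['Cl']}")
```

Every member keeps two carbons, and the chlorine count rises from zero to four along the series.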

SMILES Atoms

• Use normal chemical symbols

• Add punctuation symbols if necessary

• No super- or subscripts

SMILES Symbols

• String of alphanumeric characters and certain punctuation symbols

• Terminates at the first space encountered when read left to right

• The ORGANIC SUBSET:

B, C, N, O, P, S, F, Cl, Br, I
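The two rules above (stop at the first space, and the organic subset) can be sketched as a simple checker. It is an illustration rather than a full SMILES validator: it also accepts lowercase aromatic atoms, bond symbols, parentheses and ring digits, it rejects bracket atoms, and the names `read_smiles` and `uses_only_organic_subset` are hypothetical.

```python
import re

# Organic-subset atoms, aromatic lowercase forms, bonds, branches, ring digits.
TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]|[-=#:()]|\d")


def read_smiles(text):
    """Return the SMILES part of a line: everything before the first space."""
    return text.split(" ", 1)[0]


def uses_only_organic_subset(text):
    """True if the SMILES consists only of tokens matched by TOKEN."""
    smiles = read_smiles(text)
    pos = 0
    while pos < len(smiles):
        match = TOKEN.match(smiles, pos)
        if match is None:
            return False
        pos = match.end()
    return bool(smiles)


print(read_smiles("CC(O)CC 2-butanol"))
print(uses_only_organic_subset("ClC(Cl)=CCl"))
print(uses_only_organic_subset("[Fe++]"))
```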

Other SMILES Atoms

• Aliphatic or nonaromatic carbon: C

• Atom in aromatic ring: lowercase letter

• Designate ring closure with pairs of matching digits, e.g.

c1ccccc1 (or C1=CC=CC=C1) is Benzene, whereas

C1CCCCC1 is Cyclohexane

SMILES Charges

• Specify attached hydrogens and charges in square brackets

• Number of attached hydrogens is the symbol H followed by optional digit

• [H+]: proton

• [OH-]: hydroxyl anion

• [OH3+]: hydronium cation

• [Fe++]: iron(II) cation

• [NH4+]: ammonium cation
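The bracket notation can be read mechanically. The sketch below assumes a simplified bracket atom of the form element, optional H count, optional charge; it ignores isotopes, chirality and atom classes, and the name `parse_bracket_atom` is made up for this example.

```python
import re

# element, optional "H" with optional digit, optional charge signs with optional digit
BRACKET = re.compile(r"\[([A-Z][a-z]?)(?:H(\d*))?(?:(\++|-+)(\d*))?\]")


def parse_bracket_atom(token):
    """Return (element, attached_hydrogens, charge) for a simple bracket atom."""
    match = BRACKET.fullmatch(token)
    if match is None:
        raise ValueError(f"not a simple bracket atom: {token!r}")
    element, h_digits, signs, magnitude = match.groups()
    if h_digits is None:
        hydrogens = 0                      # no H written inside the brackets
    else:
        hydrogens = int(h_digits) if h_digits else 1
    charge = 0
    if signs:
        size = int(magnitude) if magnitude else len(signs)
        charge = size if signs[0] == "+" else -size
    return element, hydrogens, charge


for token in ["[H+]", "[OH-]", "[OH3+]", "[Fe++]", "[NH4+]"]:
    print(token, parse_bracket_atom(token))
```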

SMILES Cyclic Structures

• Break one single or one aromatic bond in each ring

• Number in any order

– Designate ring-breaking atoms by the same digit following the atomic symbol
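Matching ring-closure digits can be paired with a short scan. This sketch assumes single-digit ring labels only and skips digits inside square brackets (where they are hydrogen or charge counts); `ring_closures` is a hypothetical helper name.

```python
def ring_closures(smiles):
    """Pair ring-closure digits; return a list of (digit, open_pos, close_pos)."""
    open_at = {}          # digit -> position where that ring was opened
    pairs = []
    in_bracket = False
    for pos, ch in enumerate(smiles):
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif ch.isdigit() and not in_bracket:
            if ch in open_at:
                pairs.append((ch, open_at.pop(ch), pos))   # second digit closes the ring
            else:
                open_at[ch] = pos                          # first digit opens it
    if open_at:
        raise ValueError(f"unclosed ring digit(s): {sorted(open_at)}")
    return pairs


print(ring_closures("C1CCCCC1"))             # cyclohexane
print(len(ring_closures("c1ccc2ccccc2c1")))  # naphthalene
```

The number of pairs equals the number of ring bonds that were broken to write the string, one per ring.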

Representation of Herbal Ingredients by Molecular Descriptor

• Molecular descriptors are numerical values that characterize properties of molecules

• Examples:

– Physicochemical properties (empirical)

– Values from algorithms, such as 2D fingerprints

• Vary in complexity of encoded information and in compute time

Molecular Descriptors

• Constitutional

– MW, N atoms

• Topological

– Connectivity, Wiener index

• Electrostatic

– Polarity, polarizability, partial charges

• Geometrical Descriptors

– Length, width, molecular volume

• Quantum Chemical

– HOMO and LUMO energies

– Vibrational frequencies

– Bond orders

– Total energy
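As one concrete topological descriptor, the Wiener index is the sum of shortest-path bond counts over all pairs of heavy atoms. The sketch below computes it by breadth-first search on a hand-written adjacency list; the graph encoding and the name `wiener_index` are choices made for this example.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path bond counts over all unordered atom pairs."""
    total = 0
    for start in adjacency:
        dist = {start: 0}
        queue = deque([start])
        while queue:                       # breadth-first search from `start`
            node = queue.popleft()
            for nxt in adjacency[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
    return total // 2                      # each pair was counted from both ends


# Hydrogen-suppressed carbon skeletons written as adjacency lists.
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}

print(wiener_index(n_butane))   # 10
print(wiener_index(isobutane))  # 9
```

The more compact, branched skeleton gets the smaller index, which is why this descriptor is often read as a rough measure of branching.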

Molecular Descriptors

• van der Waals volume

– The sum of the non-overlapping volume of the van der Waals sphere of each atom of the molecule

• Molecular surface

– The area of the surface contours generated by rolling a probing sphere against the surface atoms of the molecule

Molecular Descriptors

• Molecular size vectors

– Define ranges for distances and angles

(Figure: atom pattern with labels C(u), O(s1), O(s1), A, A, [O,S] and O, annotated with distance ranges 3.6 - 4.6 Å, 3.3 - 4.3 Å and 6.8 - 7.8 Å)

Molecular Descriptors for Large Data Sets

• Descriptors representing properties of complete molecules

– Examples: LogP, Molar Refractivity

• Descriptors calculated from 2D graphs

– Examples: Topological Indexes, 2D fingerprints

• Descriptors requiring 3D representations

– Example: Pharmacophore descriptors

Molecular Descriptors Calculated From 2D Structures

• Simple counts of features

– Lipinski Rule of Five (H bonds, MW, etc.)

– Number of ring systems

– Number of rotatable bonds

• Not likely to discriminate sufficiently when used alone

• Combined with other descriptors for best effect
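The Rule of Five mentioned above reduces to four threshold checks on simple descriptors. The sketch below assumes the descriptor values have already been computed elsewhere; the `Descriptors` class, the function name, and both example records are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float        # molecular weight in Da
    logp: float              # octanol-water log P
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d):
    """Count how many of Lipinski's four thresholds are exceeded."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


# Made-up descriptor values, not taken from any database.
small = Descriptors(mol_weight=350.0, logp=2.1, h_bond_donors=2, h_bond_acceptors=5)
large = Descriptors(mol_weight=720.0, logp=6.3, h_bond_donors=6, h_bond_acceptors=12)

print(rule_of_five_violations(small))  # 0
print(rule_of_five_violations(large))  # 4
```

A count like this is coarse on its own, which matches the slide's point that such features work best combined with other descriptors.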

Physicochemical Properties

• Hydrophobicity

– LogP: the logarithm of the partition coefficient between n-octanol and water

• ClogP (Leo and Hansch): based on a small set of values from a small set of simple molecules

– BioByte: http://www.biobyte.com/

– Daylight’s MedChem Help page: http://www.daylight.com/dayhtml/databases/medchem/medchem-help.html

– Isolating carbon: one not doubly or triply bonded to a heteroatom

Molecular Descriptor LogP

Octanol-Water Partition Coefficients

• P = C(octanol) / C(water)

• log P is analogous to ΔrG = -RT ln Keq

• Describes hydrophobic versus hydrophilic character

• The larger P is, the more hydrophobic the compound

(Figure: a solute partitioning between an octanol phase and an H2O phase)
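The definition of P and its free-energy analogy can be worked through numerically. This sketch treats P as an equilibrium constant for water-to-octanol transfer, so that ΔG = -RT ln P; the function names and the example concentrations are assumptions of this illustration.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)


def log_p(c_octanol, c_water):
    """log10 of the partition coefficient P = C(octanol) / C(water)."""
    return math.log10(c_octanol / c_water)


def transfer_free_energy(log_p_value, temperature=298.15):
    """Free energy (J/mol) of water-to-octanol transfer, from dG = -RT ln P."""
    return -R * temperature * math.log(10) * log_p_value


# Example concentrations in the same (arbitrary) units for both phases.
value = log_p(c_octanol=100.0, c_water=1.0)
print(value)
print(transfer_free_energy(value))
```

A positive log P gives a negative transfer free energy, meaning the solute prefers the octanol phase.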

TCM Ingredient Classification by Molecular Descriptors

J. Chem. Inf. Model., Vol. 47, No. 6, 2007

Classification of TCM ingredients of specific chemical classes by the decision tree method
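The core step of a decision tree is choosing a threshold on one descriptor that best separates the classes. The sketch below shows that single step using Gini impurity; whether the cited study used this criterion is not stated here, and the descriptor values and class labels are toy data invented for the example.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best single-descriptor split."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for (lo, _), (hi, _) in zip(pairs, pairs[1:]):
        if lo == hi:
            continue                       # no threshold fits between equal values
        threshold = (lo + hi) / 2
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy data: one made-up descriptor and two made-up classes "A" and "B".
descriptor = [1.0, 1.5, 2.0, 4.0, 4.5, 5.0]
classes = ["A", "A", "A", "B", "B", "B"]

print(best_split(descriptor, classes))  # (3.0, 0.0)
```

A full tree repeats this search on each resulting subset and over every available descriptor.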

Distribution of TCM ingredients of specific chemical classes without using the decision tree method

Acknowledgement

Current Group Members:

• Computer-Aided Drug Design: CY Ung, XH Ma, XH Liu, Pankaj Kumar, F Zhu, X Liu, J Jia

• Protein Function, Interaction, Network: HL Zhang, CY Ung, XH Ma, F Zhu, WK Teo, Z Shi

• Databases and Servers: J Jia

• Medicinal Herb: CY Ung, Pankaj Kumar, Cao Jinyi (undergraduate students)

• Microarray and biomarkers: J Jia, ZQ Tang

Former Members:

PhD: ZW Cao (Prof SCBIT, Tongji U), ZL Ji (Assoc Prof Xiamen U), X Chen (Assoc Prof Zhejiang U), CW Yap (Assist Prof NUS), LY Han (Postdoc NIH), CJ Zheng (Postdoc NIH), HH Lin (Postdoc Harvard), J Cui (Postdoc U Georgia), H Li (Postdoc Einstein College Med)

Research Fellow/Assistant: ZR Li (Assoc Prof SiChuan U), Y Xue (Prof SiChuan U), W Liu (Assoc Prof DUT), D Mi (Assoc Prof DUT), CZ Cai (Prof ChongQing U), DG Zhi (Postdoc, Berkeley)

MSc: Y.J. Guo (Postdoc NIH), L.Z. Sun (RA, U Tenn.), J.F. Wang (MSU), L.X. Yao (Columbia), S Ong (Washington U), H Zhou (local company), B Xie (local company)

BSc: W.K. Yeo (IMCB, Novartis)
