PowerMV Chemical Data Mining Environment. S. Stanley Young, Jun Feng and Jack Liu, NISS. MPDM, McMaster University, 4 June 2005.

Page 1: PowerMV   Chemical Data Mining Environment

PowerMV Chemical Data Mining Environment

S. Stanley Young, Jun Feng and Jack Liu

NISS

MPDM, McMaster University
4 June 2005

Page 2: PowerMV   Chemical Data Mining Environment

Outline

1. PowerMV, a chemistry data mining environment.

2. rSVD, robust singular value decomposition: Liu, Hawkins, Ghosh, Young, PNAS (2003).

3. Space-filling designs.

4. PharmID: complex problem.

Page 3: PowerMV   Chemical Data Mining Environment

Environment

20k lines of interface code; 300k lines of algorithm code. ~0.75 man-years.

Page 4: PowerMV   Chemical Data Mining Environment

SD File, 50 years old and bad!

1 -ISIS- 3D

 20 22  0  0  0  0  0  0  0  0999 V2000
    2.0680    1.5173    0.0000 C   0  0  0  0  0
    2.9340    2.0173    0.0000 N   0  0  0  0  0
    2.9340    3.0173    0.0000 C   0  0  0  0  0
    3.8000    1.5173    0.0000 C   0  0  0  0  0
    3.8000    0.5173    0.0000 C   0  0  0  0  0
    4.6660    0.0173    0.0000 C   0  0  0  0  0
    4.6660   -0.9827    0.0000 N   0  0  0  0  0
    5.5321   -1.4827    0.0000 C   0  0  0  0  0
    5.5321   -2.4827    0.0000 C   0  0  0  0  0
    4.6660   -2.9827    0.0000 S   0  0  0  0  0
    3.8000   -2.4827    0.0000 C   0  0  0  0  0
    3.8000   -1.4827    0.0000 C   0  0  0  0  0
    2.9061   -0.9480    0.0000 C   0  0  0  0  0
    2.0000   -1.4619    0.0000 C   0  0  0  0  0
    2.0000   -2.5035    0.0000 C   0  0  0  0  0
    2.9061   -3.0173    0.0000 C   0  0  0  0  0
    6.4260   -3.0173    0.0000 C   0  0  0  0  0
    7.3321   -2.5035    0.0000 C   0  0  0  0  0
    7.3321   -1.4619    0.0000 C   0  0  0  0  0
    6.4260   -0.9480    0.0000 C   0  0  0  0  0
  1  2  1  0  0  0
  2  3  1  0  0  0
  2  4  1  0  0  0
  4  5  1  0  0  0
  5  6  1  0  0  0
Etc.
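As a rough illustration of what reading such a record involves, the sketch below parses a V2000 connection table into atoms and bonds. It is not PowerMV's parser. The names `Atom`, `parse_v2000` and `FRAGMENT` are hypothetical, and the code assumes whitespace-separated fields, whereas the real format is fixed-width and fields can run together. The fragment reuses the first three atoms and two bonds from the slide with the counts line shortened to match, so it is not a complete molecule.

```python
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    x: float
    y: float
    z: float
    symbol: str


def parse_v2000(lines: List[str]) -> Tuple[List[Atom], List[Tuple[int, int, int]]]:
    """Parse a V2000 connection table, starting at its counts line.

    Assumption: fields are separated by whitespace. The real format is
    fixed-width, so this sketch can fail on files where fields touch.
    """
    counts = lines[0].split()
    n_atoms, n_bonds = int(counts[0]), int(counts[1])

    atoms = []
    for line in lines[1:1 + n_atoms]:
        x, y, z, symbol = line.split()[:4]
        atoms.append(Atom(float(x), float(y), float(z), symbol))

    bonds = []
    for line in lines[1 + n_atoms:1 + n_atoms + n_bonds]:
        first, second, order = (int(tok) for tok in line.split()[:3])
        bonds.append((first, second, order))  # atom indices are 1-based
    return atoms, bonds


# Shortened fragment built from the slide's first atoms and bonds.
FRAGMENT = """\
  3  2  0  0  0  0  0  0  0  0999 V2000
    2.0680    1.5173    0.0000 C   0  0  0  0  0
    2.9340    2.0173    0.0000 N   0  0  0  0  0
    2.9340    3.0173    0.0000 C   0  0  0  0  0
  1  2  1  0  0  0
  2  3  1  0  0  0
""".splitlines()

atoms, bonds = parse_v2000(FRAGMENT)
print([a.symbol for a in atoms], bonds)
```

Even this small reader has to know the counts line, the block order and the 1-based indexing, which hints at why the slide calls the format "bad".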

Page 5: PowerMV   Chemical Data Mining Environment

Smiles Chemical Notation

CC1=CC=C(C=C1)S(=O)(=O)N(CC(O)=O)C2=CC(Cl)=CC=C2

CC1CCC(CC1)C(C)(C)C(CC(C)C)C2CCCC(C)C2
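To show how compact the notation is compared with an SD record, the sketch below counts heavy atoms directly from a SMILES string. It is a deliberately small tokenizer, not a full SMILES parser. The name `count_heavy_atoms` is hypothetical, and the code assumes the string uses only the organic subset (B, C, N, O, P, S, F, Cl, Br, I and their aromatic lower-case forms) with no bracket atoms.

```python
import re
from collections import Counter

# Two-letter symbols must be tried before one-letter ones so that "Cl" is
# not read as carbon. Lower-case letters denote aromatic atoms.
ATOM_TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def count_heavy_atoms(smiles: str) -> Counter:
    """Count non-hydrogen atoms per element in an organic-subset SMILES."""
    if "[" in smiles:
        # Bracket atoms (charges, isotopes, explicit H) need a real parser.
        raise ValueError("bracket atoms are not supported by this sketch")
    # Bond symbols, ring-closure digits and parentheses are simply skipped.
    return Counter(tok.capitalize() for tok in ATOM_TOKEN.findall(smiles))


first = count_heavy_atoms("CC1=CC=C(C=C1)S(=O)(=O)N(CC(O)=O)C2=CC(Cl)=CC=C2")
second = count_heavy_atoms("CC1CCC(CC1)C(C)(C)C(CC(C)C)C2CCCC(C)C2")
print(dict(first))
print(dict(second))
```

Hydrogens are implicit in SMILES, so they do not appear in the counts.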

Page 6: PowerMV   Chemical Data Mining Environment

Viewer

Page 7: PowerMV   Chemical Data Mining Environment

Molecule Blow Up

Page 8: PowerMV   Chemical Data Mining Environment

3D View

Page 9: PowerMV   Chemical Data Mining Environment

Compute Molecular Descriptors, Drug-Like Properties

Page 10: PowerMV   Chemical Data Mining Environment

Chemical Graphs and Properties

Page 11: PowerMV   Chemical Data Mining Environment

Multiple Descriptor Types

Numerical Topology!

Page 12: PowerMV   Chemical Data Mining Environment

Similarity Searching, Motivating Paper
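The slides do not say which similarity measure PowerMV uses. As a sketch of the general idea, the code below ranks a small library against a target with the Tanimoto coefficient on binary fingerprints, a common choice in chemical similarity searching. The function names, compound names and fingerprint bits are all made up for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # set of "on" bit positions


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def nearest_neighbors(
    target: Fingerprint, library: Dict[str, Fingerprint], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k library entries most similar to the target."""
    scored = [(name, tanimoto(target, fp)) for name, fp in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical fingerprints; real ones come from computed descriptors.
library = {
    "cmpd_A": frozenset({1, 2, 3, 4}),
    "cmpd_B": frozenset({1, 2, 9}),
    "cmpd_C": frozenset({7, 8}),
}
target = frozenset({1, 2, 3})
print(nearest_neighbors(target, library, k=2))
```

The ranked list corresponds to the target and neighbor views shown on the following slides.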

Page 13: PowerMV   Chemical Data Mining Environment

Select a Target Compound

Page 14: PowerMV   Chemical Data Mining Environment

Search Dialog

Page 15: PowerMV   Chemical Data Mining Environment

Structure Comparison Window

Target

Neighbor

Annotation

Page 16: PowerMV   Chemical Data Mining Environment

Statistical Methods

Page 17: PowerMV   Chemical Data Mining Environment

Free download: www.niss.org/PowerMV

Summary

• Display SD.

• Compute descriptors.

• Cluster.

• Statistical analysis.

• Similarity search.

Become a NISS affiliate!

20k lines of display code

300k lines of algorithm code

Page 18: PowerMV   Chemical Data Mining Environment

Contact Information

Stan Young [email protected] www.niss.org 919 685 9328

Jun Feng [email protected]

Become a NISS Affiliate!

Page 19: PowerMV   Chemical Data Mining Environment

Questions / Comments ?

Page 20: PowerMV   Chemical Data Mining Environment

Computed Properties – Rule of 5
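The slide does not list the criteria. The sketch below checks the commonly cited Lipinski thresholds (molecular weight at most 500, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors) against already-computed properties. The class and function names are hypothetical and are not PowerMV's, the property values are made up, and allowing one violation is a common convention assumed here rather than something stated in the deck.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ComputedProperties:
    """Hypothetical container; field names are not taken from PowerMV."""
    mol_weight: float      # daltons
    log_p: float           # log10 octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(p: ComputedProperties) -> List[str]:
    """List which of the four thresholds the compound exceeds."""
    violations = []
    if p.mol_weight > 500:
        violations.append("molecular weight > 500")
    if p.log_p > 5:
        violations.append("logP > 5")
    if p.h_bond_donors > 5:
        violations.append("H-bond donors > 5")
    if p.h_bond_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


def passes_rule_of_five(p: ComputedProperties, max_violations: int = 1) -> bool:
    # Assumption: a compound is flagged only when it breaks more than
    # max_violations of the thresholds.
    return len(rule_of_five_violations(p)) <= max_violations


# Made-up property values for illustration only.
small = ComputedProperties(mol_weight=320.0, log_p=2.1,
                           h_bond_donors=2, h_bond_acceptors=5)
large = ComputedProperties(mol_weight=720.0, log_p=6.3,
                           h_bond_donors=6, h_bond_acceptors=12)
print(passes_rule_of_five(small), rule_of_five_violations(large))
```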

Page 21: PowerMV   Chemical Data Mining Environment

Structure Comparison

Page 22: PowerMV   Chemical Data Mining Environment

Neighbor List

Page 23: PowerMV   Chemical Data Mining Environment

Attribute Comparison

Page 24: PowerMV   Chemical Data Mining Environment

Annotated Databases

1. Stockwell ACL

2. ChemBank

3. Exploratory Centers for Chemoinformatics Research

4. Chemical Entities of Biological Interest

5. Commercial, e.g. GVK, ACS, MDDR, Ashgate

Page 25: PowerMV   Chemical Data Mining Environment

Rule of Five

Page 26: PowerMV   Chemical Data Mining Environment

Example: CNS Compounds

Page 27: PowerMV   Chemical Data Mining Environment

Current NISS Research

PharmID: Pharmacophore Identification

Finds multiple binding modes! Selectivity.

1. NISS research.

2. Jun Feng, CompChem.

3. Seeking collaborators.

4. Become a NISS affiliate.