PowerMV Chemical Data Mining Environment. S. Stanley Young, Jun Feng and Jack Liu, NISS. MPDM, McMaster University, 4 June 2005. Outline: PowerMV, a chemistry data mining environment. rSVD, robust singular value decomposition, Liu, Hawkins, Ghosh, Young, PNAS (2003). Space-filling designs.
1
PowerMV Chemical Data Mining Environment
S. Stanley Young, Jun Feng and Jack Liu
NISS
MPDM, McMaster University, 4 June 2005
2
Outline
1. PowerMV, a chemistry data mining environment.
2. rSVD, robust singular value decomposition, Liu, Hawkins, Ghosh, Young, PNAS (2003).
3. Space-filling designs.
4. PharmID: complex problem.
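The slides name rSVD but do not describe the algorithm. As a rough illustration of what "robust" can mean here, the sketch below fits a rank-1 factorization by alternating L1 (least absolute deviation) fits, each solved with a weighted median. This is one common way to make an SVD-like fit resistant to outliers; it is an assumption for illustration, not a statement of the authors' implementation. The function names and the toy matrix are hypothetical.

```python
def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    return pairs[-1][0]


def l1_rank1(matrix, n_iter=20):
    """Approximate matrix[i][j] by u[i] * v[j], minimising absolute errors.

    With v fixed, the best u[i] under an L1 criterion is the weighted median
    of matrix[i][j] / v[j] with weights |v[j]|; the roles swap for v.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    u = [0.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(n_iter):
        cols = [j for j in range(n_cols) if v[j] != 0.0]
        if cols:
            for i in range(n_rows):
                u[i] = weighted_median([matrix[i][j] / v[j] for j in cols],
                                       [abs(v[j]) for j in cols])
        rows = [i for i in range(n_rows) if u[i] != 0.0]
        if rows:
            for j in range(n_cols):
                v[j] = weighted_median([matrix[i][j] / u[i] for i in rows],
                                       [abs(u[i]) for i in rows])
    return u, v


# Toy data: an exact rank-1 matrix (rows 1..4 times columns 1..3)
# with one grossly corrupted cell at position (0, 0).
data = [[a * b for b in (1.0, 2.0, 3.0)] for a in (1.0, 2.0, 3.0, 4.0)]
data[0][0] = 100.0

u, v = l1_rank1(data)
print("fitted value at the corrupted cell:", u[0] * v[0])
```

On this toy matrix the fitted value at the corrupted cell stays close to the uncorrupted value of 1, whereas a least-squares fit would be pulled toward the outlier.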
3
Environment
20k lines of interface code; 300k lines of algorithm code.
~0.75 man-years.
4
SD File, 50 years old and bad!
1 -ISIS- 3D
 20 22  0  0  0  0  0  0  0  0999 V2000
    2.0680    1.5173    0.0000 C   0  0  0  0  0
    2.9340    2.0173    0.0000 N   0  0  0  0  0
    2.9340    3.0173    0.0000 C   0  0  0  0  0
    3.8000    1.5173    0.0000 C   0  0  0  0  0
    3.8000    0.5173    0.0000 C   0  0  0  0  0
    4.6660    0.0173    0.0000 C   0  0  0  0  0
    4.6660   -0.9827    0.0000 N   0  0  0  0  0
    5.5321   -1.4827    0.0000 C   0  0  0  0  0
    5.5321   -2.4827    0.0000 C   0  0  0  0  0
    4.6660   -2.9827    0.0000 S   0  0  0  0  0
    3.8000   -2.4827    0.0000 C   0  0  0  0  0
    3.8000   -1.4827    0.0000 C   0  0  0  0  0
    2.9061   -0.9480    0.0000 C   0  0  0  0  0
    2.0000   -1.4619    0.0000 C   0  0  0  0  0
    2.0000   -2.5035    0.0000 C   0  0  0  0  0
    2.9061   -3.0173    0.0000 C   0  0  0  0  0
    6.4260   -3.0173    0.0000 C   0  0  0  0  0
    7.3321   -2.5035    0.0000 C   0  0  0  0  0
    7.3321   -1.4619    0.0000 C   0  0  0  0  0
    6.4260   -0.9480    0.0000 C   0  0  0  0  0
  1  2  1  0  0  0
  2  3  1  0  0  0
  2  4  1  0  0  0
  4  5  1  0  0  0
  5  6  1  0  0  0
Etc.
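To make the record layout concrete, here is a minimal sketch of reading a V2000 connection table like the one above. It assumes a three-line header, a counts line whose first two 3-character fields hold the atom and bond counts, and bond records with 3-character fields. The function name and the three-atom fragment are hypothetical; the fragment reuses the first three atoms and two of the bonds from the slide and is not a complete molecule.

```python
def parse_molfile(text):
    """Return (atoms, bonds) from a V2000 molfile string.

    atoms: list of (symbol, x, y, z)
    bonds: list of (first_atom, second_atom, bond_order), atom numbers 1-based
    """
    lines = text.splitlines()
    counts = lines[3]                      # line 4 is the counts line
    n_atoms = int(counts[0:3])             # fixed-width field: atom count
    n_bonds = int(counts[3:6])             # fixed-width field: bond count

    atoms = []
    for line in lines[4:4 + n_atoms]:
        x, y, z, symbol = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))

    bonds = []
    first_bond = 4 + n_atoms
    for line in lines[first_bond:first_bond + n_bonds]:
        # Bond fields are fixed width, so slice rather than split.
        bonds.append((int(line[0:3]), int(line[3:6]), int(line[6:9])))
    return atoms, bonds


# Hypothetical three-atom fragment built from the first records on the slide.
MOLFILE = "\n".join([
    "fragment",                                  # header line 1: name
    "  sketch",                                  # header line 2: program line
    "",                                          # header line 3: comment
    "  3  2  0  0  0  0  0  0  0  0999 V2000",   # counts line
    "    2.0680    1.5173    0.0000 C   0  0  0  0  0",
    "    2.9340    2.0173    0.0000 N   0  0  0  0  0",
    "    2.9340    3.0173    0.0000 C   0  0  0  0  0",
    "  1  2  1  0  0  0",
    "  2  3  1  0  0  0",
    "M  END",
])

atoms, bonds = parse_molfile(MOLFILE)
for symbol, x, y, z in atoms:
    print(symbol, x, y, z)
print(bonds)
```

The fixed-width slicing for counts and bonds is deliberate: in this format adjacent fields can run together (as in `0999` on the counts line), so splitting on whitespace is not reliable there.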
5
Smiles Chemical Notation
CC1=CC=C(C=C1)S(=O)(=O)N(CC(O)=O)C2=CC(Cl)=CC=C2
CC1CCC(CC1)C(C)(C)C(CC(C)C)C2CCCC(C)C2
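For comparison with the SD record, the sketch below shows how much can be read directly off a SMILES string with a few lines of code: it counts heavy atoms by element. It assumes only "organic subset" atoms written without square brackets, which holds for the two strings on this slide; bracket atoms, charges and isotopes are not handled. The pattern and function name are hypothetical.

```python
import re
from collections import Counter

# Two-letter halogens must be tried before the single-letter symbols,
# otherwise "Cl" would be read as carbon followed by a stray "l".
ATOM_PATTERN = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def heavy_atom_counts(smiles):
    """Count non-hydrogen atoms per element in a bracket-free SMILES string."""
    tokens = ATOM_PATTERN.findall(smiles)
    # Lower-case (aromatic) symbols are folded into their element symbol.
    return Counter(token.capitalize() for token in tokens)


first = heavy_atom_counts("CC1=CC=C(C=C1)S(=O)(=O)N(CC(O)=O)C2=CC(Cl)=CC=C2")
second = heavy_atom_counts("CC1CCC(CC1)C(C)(C)C(CC(C)C)C2CCCC(C)C2")

print(dict(first), sum(first.values()))
print(dict(second), sum(second.values()))
```

Digits, parentheses and bond symbols are skipped because they encode ring closures, branching and bond order rather than atoms.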
6
Viewer
7
Molecule Blow Up
8
3D View
9
Compute Molecular Descriptors, Drug-Like Properties
10
Chemical Graphs and Properties
11
Multiple Descriptor Types
Numerical Topology!
12
Similarity Searching, Motivating Paper
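The slides do not state which similarity measure PowerMV uses. A minimal sketch of similarity searching, assuming binary fingerprints stored as sets of "on" bit positions and the Tanimoto coefficient (a common choice for this task), might look like the following. The compound names and fingerprints are made up for illustration.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_neighbors(target_bits, library, k=3):
    """Return the k library entries most similar to the target.

    library maps a compound name to its set of on-bit positions.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    scored = [(tanimoto(target_bits, bits), name)
              for name, bits in library.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:k]


# Hypothetical fingerprints; real ones would come from computed descriptors.
library = {
    "cmpd_A": {1, 2, 3, 4},
    "cmpd_B": {2, 3, 4, 5, 6},
    "cmpd_C": {7, 8, 9},
    "cmpd_D": {1, 2, 3},
}
target = {1, 2, 3, 4, 5}

for score, name in nearest_neighbors(target, library, k=3):
    print(name, round(score, 3))
```

The ranked output corresponds to the neighbor list shown for a selected target compound in the following slides.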
13
Select a Target Compound
14
Search Dialog
15
Structure Comparison Window
Target
Neighbor
Annotation
16
Statistical Methods
17
Free download: www.niss.org/PowerMV
Summary
• Display SD.
• Compute descriptors.
• Cluster.
• Statistical analysis.
• Similarity search.
Become NISS affiliate!
20k display code
300k algorithm code
18
Contact Information
Stan Young [email protected] www.niss.org 919 685 9328
Jun Feng [email protected]
Become a NISS Affiliate!
19
Questions / Comments ?
20
Computed Properties – Rule of 5
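As a reminder of what the Rule of 5 checks, here is a minimal sketch that applies Lipinski's four thresholds (molecular weight over 500, log P over 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors) to already computed properties. The function name and the example property values are hypothetical; how PowerMV computes or reports these properties is not shown in the slides.

```python
def rule_of_five_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Return the names of the Rule of 5 criteria a compound violates."""
    checks = {
        "molecular weight > 500": mol_weight > 500,
        "log P > 5": log_p > 5,
        "H-bond donors > 5": h_donors > 5,
        "H-bond acceptors > 10": h_acceptors > 10,
    }
    return [name for name, violated in checks.items() if violated]


# Hypothetical property values for two compounds.
small = rule_of_five_violations(mol_weight=339.8, log_p=3.1,
                                h_donors=1, h_acceptors=5)
large = rule_of_five_violations(mol_weight=812.0, log_p=6.4,
                                h_donors=3, h_acceptors=12)

print("small:", small)
print("large:", large)
```

A compound is usually flagged as a poor oral-absorption candidate when it violates more than one criterion, so the length of the returned list is the number to watch.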
21
Structure Comparison
22
Neighbor List
23
Attribute Comparison
24
Annotated Data Bases
1. Stockwell ACL
2. ChemBank
3. Exploratory Centers for Chemoinformatics Research
4. Chemical Entities of Biological Interest
5. Commercial, e.g. GVK, ACS, MDDR, Ashgate
25
Rule of Five
26
Example: CNS Compounds
27
Current NISS Research
PharmID: Pharmacophore Identification
Finds multiple binding modes! Selectivity.
1. NISS research.
2. Jun Feng, CompChem.
3. Seeking collaborators.
4. Become NISS affiliate.