Johann Gasteiger Handbook of Chemoinformatics From Data to Knowledge in 4 Volumes Volume 2

From Data to Knowledge in 4 Volumes Volume 2 · V Databases/Data Sources 491 Introduction 493 V.1 Overview of Databases/Data Sources 496 Gary D. Wiggins 1.1 Introduction 496 1.2 Commercial


Johann Gasteiger

Handbook of Chemoinformatics

From Data to Knowledge in 4 Volumes

Volume 2


V Databases/Data Sources 491

Introduction 493

V.1 Overview of Databases/Data Sources 496

Gary D. Wiggins

1.1 Introduction 496

1.2 Commercial Database Vendors and Databases 497

1.2.1 Common Features of Vendor Systems 497

1.2.1.1 Front-End Search Software 497

1.2.1.2 Database Search Costs 497

1.2.1.3 Data Analysis Tools 498

1.2.2 STN International and CAS Databases 498

1.2.3 SciFinder 499

1.2.4 Other Vendors and Databases 499

1.2.4.1 Major Chemical Database Vendors 500

1.2.4.2 Hybrid Publishers/Vendors 501

1.2.4.3 Beilstein and Gmelin 502

1.2.4.4 Knovel Databases 502

1.2.4.5 Handbooks, Encyclopedias, Physical Property Data Compilations 503

1.2.4.6 Cambridge Structural Database 503

1.2.5 Electronic Journals 503

1.2.6 Free Internet Sources 504

1.2.7 The Future 505

References 505

V.2 Bibliographic Databases 507

Andreas Barth

2.1 Introduction 507

2.2 Abstracting and Indexing in Bibliographic Databases 508

2.2.1 Metadata, Data Structures and Representation 508

2.2.2 Subject Indexing 508

2.2.3 Retrieval Functions 510

2.2.4 Thesaurus Function 512

2.3 Important Bibliographic Databases in Chemistry 512

2.3.1 The Chemical Abstracts Plus File (CAplus) 512

2.3.2 Databases with Relevance to Chemistry 513

2.3.3 SCISEARCH (Science Citation Index) 515

2.4 Search Strategies and Examples 515

2.4.1 Concepts for Searching 515

2.4.2 Example: Simple Search in CAplus 516

2.4.3 Example: Search in More than one Database 518

2.5 Analysis and Postprocessing 518


2.6 Summary 519

References 522

V.3 Databases of Chemical Structures 523

C. Gregory Paris

3.1 Introduction 523

3.1.1 Why Store and Search Chemical Structures in a Database? 524

3.1.2 Context of Chemical Structure Handling and Databases 524

3.2 Components of a Chemical Structure Database 525

3.2.1 Structural Data in Searchable Format 526

3.2.2 Graphical Query Language 527

3.2.3 Generic Search Engine 528

3.2.4 What is Not a Database? 529

3.3 Chemical Structure Representation 530

3.3.1 Representation of 2D Structures 530

3.3.2 Representation of Additional 2D Features 531

3.3.3 2D versus 3D Information 532

3.3.4 Representation of 3D Model Features 532

3.3.5 Representation of Additional 3D Features 533

3.3.6 Limitations of Chemical Structure Representation 534

3.4 Chemical Structure Retrieval Strategies 534

3.4.1 Exact 2D Structure Search 535

3.4.2 Exact 2D Substructure Search 535

3.4.3 Fuzzy 2D Substructure Searching 535

3.4.4 R-Group and Markush Searching 536

3.4.5 2D Similarity Searching 536

3.4.6 3D Pharmacophoric Searching 537

3.4.6.1 Rigid 3D Pharmacophoric Search 538

3.4.6.2 Rigid 3D Vector Searching 538

3.4.6.3 Conformationally Flexible 3D Pharmacophore Search 539

3.4.7 3D Volume-based Searching and Docking 540

3.4.8 3D Similarity Searching 540

3.5 Query Formulation, 2D and 3D 541

3.5.1 Published Pharmacophores 542

3.5.2 Binding-site Based Query Formulation 543

3.5.2.1 Empty, Known Site 543

3.5.2.2 Co-crystallized Site and Ligand 543

3.5.2.3 Implicit Bound-ligand Conformations 544

3.5.2.4 Site-directed Computed Ligand Ensembles 544

3.5.2.5 Ensembles of Macromolecular Structures 544

3.5.3 Ligand-based Query Formulation 545

3.5.3.1 Conformational Analysis of Single Lead Compounds 545

3.5.3.2 Ensemble Modeling and Multistructure Template Assembly 545

3.5.3.3 Automated QSAR Designs: Catalyst 546

3.6 Examples of Chemical Structure Databases 546


3.6.1 CAS and Beilstein 546

3.6.2 Cambridge Structural Database 547

3.6.3 Protein Data Bank 548

3.6.4 Databases of Commercial Reagents 549

3.6.5 Databases of Pharmaceutical Interest 549

3.6.6 Proprietary Corporate Databases 550

3.7 Conclusion and Outlook 551

Additional Sources 551

References 552

V.4 The CAS Information System: Applying Scientific Knowledge and Technology for Better Information 556

William Fisanick and Eric R. Shively

4.1 Introduction 556

4.1.1 History of CAS 557

4.2 Major CAS Systems 559

4.2.1 Chemical Registry System 559

4.2.1.1 Development and General Description 559

4.2.1.2 Key Registry Record Components 561

4.2.1.3 Special Substance Classes 568

4.2.1.4 Nomenclature 574

4.2.1.5 Molecular Properties 575

4.2.2 Document Database System 578

4.2.2.1 Information Flow 578

4.2.2.2 Selection and Assignment 578

4.2.2.3 Abstract Data 579

4.2.2.4 Index Data 579

4.2.2.5 Citations 579

4.3 Access to the CAS Database 580

4.3.1 Publications 580

4.3.1.1 Research Resources 580

4.3.1.2 Current Awareness Bulletins 581

4.3.2 Online/Web Access 581

4.3.2.1 System Architecture 581

4.3.2.2 CAS Databases on STN 582

4.3.2.3 Fundamental Search Mechanisms 590

4.3.3 Delivery Tools 597

4.3.3.1 STN/STN on the Web 597

4.3.3.2 STN Express/STN Express with Discover 598

4.3.3.3 SciFinder/SciFinder Scholar 599

4.3.3.4 STN Easy 603

4.4 Linking and Retrieval Integration 603

4.4.1 STN Data 603

4.4.2 ChemPort 604

4.4.3 eScience 604


4.5 Conclusion 604

References 606

V.5 The Beilstein Database 608

Alexander J. Lawson

5.1 Historical Background 608

5.2 The Beilstein Information System 612

5.3 The CrossFire Revolution 615

5.3.1 Opening up Interaction with End Users, Directly 615

5.3.2 The Search and Retrieval Performance of CrossFire 616

5.3.2.1 Structure Searching 616

5.3.2.2 Text and Numerical Searching 617

5.3.3 Point and Click Access to Relevance in Context 618

5.3.4 Abstracts, Reactions, and Ecopharm Extensions to the Beilstein File 619

5.3.5 Incorporation in a Fully Integrated Environment 622

5.4 The Future 626

References and Notes 627

V.6 Databases in Inorganic Chemistry 629

Jürgen Vogt, Natalia Vogt, and Axel Schunk

6.1 Introduction 629

6.2 Features for Retrieving Inorganic and Organometallic Compounds 630

6.3 The Databases of Chemical Abstracts 631

6.4 INSPEC Database 631

6.5 Gmelin Database 632

6.5.1 Literature Coverage 632

6.5.2 Stereochemistry and 3D Structures 633

6.5.3 Ligand Search System 634

6.5.4 Reaction Retrieval 634

6.5.5 Abstracts 635

6.5.6 Multifile Search 636

6.5.7 Information about Catalysts 636

6.6 Crystallographic Databases 636

6.7 Database for Gas-phase Compounds 637

6.8 Landolt-Börnstein 640

6.9 Resume 642

References 642

V.7 The Cambridge Structural Database (CSD) of Small Molecule Crystal Structures 645

Frank H. Allen, Karen J. Lipscomb, and Gary Battle

7.1 Introduction 645

7.2 The Cambridge Structural Database 646


7.2.1 Information Content of the CSD 646

7.2.2 Data Acquisition 648

7.2.3 Data Processing, Validation and Annotation 648

7.2.4 Statistics 649

7.3 The CSD Software System 650

7.3.1 Searching the CSD Using ConQuest 650

7.3.2 Visualizing Crystal Structures: Mercury 651

7.3.3 Analysis and Display of Geometrical Data: Vista 654

7.4 Research Applications of the CSD 655

7.4.1 Overview and Leading References 655

7.4.2 The DBUse Bibliography of CSD Applications 656

7.5 CSD Applications Examples 656

7.5.1 Molecular Dimensions 656

7.5.2 Conformational Analysis 657

7.5.3 Intermolecular Interactions: Weak Hydrogen-bonds 658

7.6 IsoStar - A Knowledge Base of Intermolecular Interactions 660

7.7 CSD System Releases and Data Availability 663

7.8 CCDC Applications Software 663

7.8.1 Protein-Ligand Interactions 663

7.8.2 Crystal Structures from Powder Diffraction Data 664

7.9 Conclusions 665

References 665

V.8 Databases of Chemical Reactions 667

Engelbert Zass

8.1 Introduction 667

8.2 Sources of Reaction Information 668

8.2.1 Interfaces to Reaction Databases 669

8.2.2 General Organic Reaction Databases 670

8.2.2.1 Theilheimer and Journal of Synthetic Methods 670

8.2.2.2 Cheminform RX 671

8.2.2.3 Current Chemical Reactions (CCR) and Related ISI Databases 671

8.2.2.4 Methods in Organic Synthesis (MOS) 673

8.2.2.5 Reference Library of Synthetic Methodology (RefLib) 673

8.2.2.6 ChemReact 673

8.2.2.7 Science of Synthesis 674

8.2.2.8 CASREACT 675

8.2.2.9 CrossFire Beilstein 676

8.2.2.10 Summary 678

8.2.3 Special Organic Reaction Databases 680

8.2.3.1 Organic Syntheses 680

8.2.3.2 Comprehensive Heterocyclic Chemistry (CHC) 680

8.2.3.3 Encyclopedia of Reagents in Organic Synthesis (EROS) 680

8.2.3.4 Protecting Groups 681

8.2.3.5 Biotransformations and Metabolic Reactions 683


8.2.3.6 Solid-Phase Reactions 683

8.2.4 Inorganic Reaction Databases 683

8.2.5 Printed Reaction Information Sources 684

8.2.6 Catalog Databases 685

8.3 Reaction Searching 686

8.3.1 Reaction Centers 687

8.3.2 Stereochemistry 688

8.3.3 Reagents and Solvents 689

8.3.4 Multistep Reactions 690

8.3.5 Postprocessing and Exporting of Search Results 693

8.3.6 Current Awareness about Reactions 693

8.4 Conclusion 694

References 695

V.9 Spectroscopic Databases 700

Reinhard Neudert and Antony N. Davies

9.1 Introduction 700

9.2 Data Collections 701

9.2.1 Reference Data Generation 701

9.2.2 Reference Quality Spectroscopic Data 702

9.2.3 NMR Reference Data 703

9.2.4 Mass Spectrometry Reference Data 705

9.2.5 Infrared Spectroscopy Reference Data 705

9.3 Library Searches 707

9.3.1 1H NMR Search 708

9.3.2 Searching Mass Spectrometry Reference Databases 709

9.4 Spectrum Prediction Using Reference Databases 712

9.4.1 NMR Prediction 712

9.4.2 IR Spectrum Prediction 716

9.5 Structure Generators Using Databases 717

9.5.1 Partially Automated Structure Elucidation 718

9.5.2 Fully Automated Structure Elucidation 718

9.6 Conclusions and Outlook 720

References 720

V.10 Databases on Environmental Information 722

Kristina Voigt

10.1 Introduction 722

10.2 Definition and Description of Environmental Databases 722

10.3 Classification of Environmental Databases 723

10.3.1 Fact-based Databases 723

10.3.1.1 Chemical Name Directories (CN) 724

10.3.1.2 Numerical Databases (NU) 725

10.3.1.3 Metadatabases (MD) 729

10.3.1.4 Research Databases (RD) 729


10.3.1.5 Catalogs of Chemical Substances (CA), Material Safety Data Sheets (MSDS) 730

10.3.2 Text-based Databases 732

10.3.2.1 Bibliographic Databases (BI) 732

10.3.2.2 Full-text Databases (FT) 734

10.3.3 Integrated Databases (INDB) 735

10.3.3.1 Structural Databases (ST) 735

10.3.3.2 Reaction Databases (RE) 735

10.4 DAIN - Metadatabase of Internet Resources for Environmental Chemicals 736

10.4.1 General Description of DAIN 736

10.4.2 Content-related Evaluation of DAIN 738

10.5 Conclusions and Outlook 739

References 741

V.11 Patent Databases 743

Jürgen Vogt

11.1 Introduction 743

11.2 Patents as Important Source of Novel Chemical Information 743

11.3 Databases of the Chemical Abstracts Service 744

11.4 World Patents Index 746

11.5 Beilstein and Gmelin Database 748

11.6 Full Text Databases 750

11.7 Other Patent Databases 751

11.8 Summary 754

References 755

V.12 Databases in Biochemistry and Molecular Biology 756

Alexander von Homeyer and Martin Reitz

12.1 Introduction 756

12.2 Classification 757

12.3 Data in Biochemistry and Molecular Biology 758

12.4 Growth of Data 759

12.5 Annotation and Documentation 760

12.6 Entry Coding 760

12.7 Redundancy 762

12.8 Formats 763

12.9 Search Options and Algorithms 764

12.10 Sequence Databases 768

12.10.1 Nucleotide Sequence Databases 768

12.10.2 Protein Sequence Databases 769

12.11 Motif, Domain and Family Databases 771

12.12 Macromolecular 3D Structure Databases 775

12.13 Molecular Interaction Databases 778


12.14 Small Molecule Databases 778

12.15 Post-translational Modification Databases 780

12.16 Genome Databases 781

12.17 Metabolism Databases 781

12.18 Miscellaneous Specialized Databases 785

12.19 Meta-databases 786

12.20 Summary and Future Trends 789

References 789

V.13 Chemistry on the Internet 794

Alexei Tarkhov

13.1 Introduction 794

13.2 Overview of Internet Technologies 795

13.2.1 Basics of Networking 795

13.2.2 Client-Server Technologies 796

13.2.3 Differences Between Stand-alone and Client-Server-based Applications 798

13.2.3.1 Component-based Software 799

13.2.3.2 Web-based Applications 800

13.2.4 Network Limitations 802

13.2.4.1 Data Transport 802

13.2.4.2 Text Encoding 802

13.2.4.3 Size/Speed Problem 802

13.2.4.4 Connectivity Problems 803

13.2.4.5 Security Problems 803

13.3 Chemical Information on the Internet 804

13.3.1 Representation of Chemical Structures 805

13.3.2 Chemical MIME 809

13.3.3 Chemical Markup Language 809

13.3.4 Chemical Databases 810

13.3.5 Publishing on the Internet 811

13.3.6 Searching the Internet 814

13.3.7 Chemistry Internet Portals - ChemWeb.com 821

13.4 Chemical Computations on the Internet 822

13.4.1 Traditional Network-oriented Applications 822

13.4.2 Web Interfaces to Chemical Computational Tools 824

13.4.2.1 Server Side 824

13.4.2.2 Client Side 826

13.4.3 Distributed Systems 828

13.4.4 Distributed Calculations 829

13.4.5 On-line Chemical Services 832

13.5 Chemical Education on the Internet 835

13.6 Future Perspectives 838

References 840


V.14 Laboratory Information Management Systems (LIMS) 844

Markus Hemmer

14.1 Introduction 844

14.2 LIMS and Regulatory Compliance 846

14.2.1 Good Automated Laboratory Practice (GALP) 846

14.3 FDA 21 CFR Part 11-compliant Data Management 848

14.4 LIMS Characteristics 849

14.5 Why Use a LIMS? 849

14.6 The Basic LIMS 850

14.6.1 A Functional Model 852

14.6.1.1 Sample Tracking 852

14.6.1.2 Sample Analysis 852

14.6.1.3 Information Structure 853

14.7 The Modern LIMS 853

14.7.1 The Planning System 853

14.7.1.1 Basic Data 854

14.7.1.2 Product Standards 854

14.7.2 Controlling System 855

14.7.3 Laboratory Processing 856

14.7.4 The Assurance System 858

14.7.5 Automatic Test Programs (ATP) 858

14.7.6 Offline Client 859

14.8 Additional LIMS Modules 859

14.8.1 Stability Management 859

14.8.2 Complaints Management 861

14.8.3 Reference Substance Module 861

14.8.4 Recipe Administration 862

14.9 LIMS and Knowledge Management in the Analytical Laboratory 863

Further Reading 864

VI Searching Chemical Structures 865

Introduction 867

VI.1 Two-dimensional Structure and Substructure Searching 868

Jun Xu

1.1 Introduction 868

1.2 2D Structure Representation 870

1.2.1 Connection Table 870

1.2.2 Linear Notation 871

1.2.3 Structure Representation Data Exchange Formats 871

1.3 Structure Searching 872

1.3.1 Molecular Indices 872

1.3.2 Canonic Linear Notation 873

1.3.3 Structure Search 875


1.4 Substructure Searching 875

1.4.1 Structure Normalization 875

1.4.2 Substructure Search Query 877

1.4.3 Atom-by-atom Match 877

1.4.4 Backtracking Algorithm (Using the Example from Figure 1-9) 878

1.4.4.1 Step 1 878

1.4.4.2 Step 2 878

1.4.5 Techniques to Enhance Atom-by-atom Match Performance 879

1.5 Substructure Mapping and Other Structure Perceptions 880

1.5.1 Finding the Smallest Set of Smallest Rings (SSSR) 880

1.5.2 Determination of Topological RS Chirality 881

1.6 Summary and Conclusions 883

References 884

VI.2 Current State of the Art of Markush Topological Search Systems 885

Andrew H. Berks

2.1 Introduction 885

2.2 Markush DARC 886

2.3 MARPAT 888

2.3.1 Database Content Issues 889

2.3.1.1 Old Content Issues 889

2.3.1.2 Current Content Issues 889

2.3.2 System Issues 891

2.3.2.1 Translation Capability 891

2.3.2.2 Other Search Engine Issues 892

2.3.2.3 Client Software 896

2.3.2.4 Display Issues 897

2.3.2.5 Costs 901

2.4 Summary and Future Directions 902

References 903

VI.3 Similarity Searching in Chemical Structure Databases 904

Peter Willett

3.1 Introduction 904

3.2 Similarity Searching in 2D Databases 906

3.2.1 Fragment Substructures 906

3.2.2 Topological Indices 907

3.2.3 Graph-based Approaches 907

3.2.4 Weighting Schemes and Similarity Coefficients 908

3.3 Similarity Searching in 3D Databases 909

3.3.1 Fragment Substructures 910

3.3.2 Alignment-based Approaches 911

3.4 Conclusions 911

References 913
