1
The Structure Elucidation Challenge Between 2003 and 2011, 112 different datasets were submitted from various industries as challenges to the ACD/Structure Elucidator Suite system. [3] The structures determined ranged from 100 to more than 1000 Da (Figure 5) and the software had excellent success determining the correct structures (Figure 6). In one recent example, Structure Elucidator Suite was able to determine the structure of a large, symmetrical dimer known as Bacillusin A [4] (containing 86 skeletal units) even with the presence of ambiguous connectivities. [5] Conclusions Modern CASE systems such as Structure Elucidator Suite provide the necessary capability accurately elucidate a novel chemical structure for complex molecules based on readily available NMR data sets. This allows organizations to avoid expensive, labor-intensive, and time-consuming synthetic efforts. The routine use of CASE tools should also prevent the introduction of many improper structures into the literature, or into compound registries. Introduction Computer-Assisted Structure Elucidation (CASE) has come to be a broadly accepted method for the derivation of novel or difficult chemical structures, especially those of natural products or drug metabolites and impurities. For completely error-free structure determination, one must first evaluate all isomers that truly match connectivity criteria defined by experiment but it is impossible to do this manually without bias. Modern NMR techniques give ever-increasing amounts of valuable data to substantiate structural evaluations, including lower limits of detection, and more information on connectivities. However, errors in published structures are rampant [1] giving weight to the argument for computer-assisted evaluation of structures. [2] CASE Methodology CASE follows a simple set of steps to comprehensively evaluate structures against available data: 1. Interpret experimental data to extract knowledge: Molecular Formula Integrals Chemical shifts Multiplicities Connectivities Known fragments Known exclusions 2. Search structure space to derive all possible structures 3. Rank-order based on set criteria Predicted chemical shift Data Interpretation The Case for CASE: Computer-Assisted Structure Elucidation References 1. Nicolaou, KC; Snyder, SA. Angew. Chem. Int. Ed. Engl., 2005, Feb 4:44(7); 1012-44. 2. Elyashberg, ME; Williams, AJ; Blinov, KA. Nat. Prod. Rep., 2010, 27(9): 1296-1328. 3. Moser, A; Elyashberg, ME; Williams, AJ; Blinov, KA; DiMartino, JC. J. Cheminf., 2012 4:5. 4. Ravu, RR, et al. J. Nat Prod., 2015, 78(4): 924-8. 5. http://www.acdlabs.com/comm/elucidation/2015_05.php Advanced Chemistry Development, Inc. Tel: (416) 368-3435 Fax: (416) 368-5596 Toll Free: 1-800-304-3988 Email: [email protected] www.acdlabs.com Reprints: [email protected] Structure Generation Ranking the Best Candidates Using the advanced algorithms of ACD/Structure Elucidator Suite, the entirety of the chemical space can be queried for possible structures which agree with the experimental data. The software efficiently evaluates all structural possibilities, and then ranks the best candidates for review based on the differences between the experimental and predicted chemical shifts. Rapid Identification of Structures With the right databases, known structures can also be quickly identified from previous work. The following DBs are available using ACD/Labs software: ACD/HNMR and CNMR DB 320,000 assigned chemical structures ACD/Structure Elucidator DB 425,000 structures 2,300,000 structural fragments ChemSpider DB 22,000,000 predicted compounds with 13 C, 1 H, 19 F, 31 P, and 15 N NMR Dictionary of Natural Products Over 220,000 compounds Marinlit > 21,000 compounds (marine sources) AntiBase > 35,000 compounds (microorganisms and higher fungi) Cl Br OH Br OH OH O C 10 H 17 Br 2 ClO 2 , 50,502,293 C 15 H 22 O 2 , 138,136,211,624 O O O O C 15 H 20 O 1 , 37,568,150,635 C 12 H 12 O 3 , 68,930,547,646 O O H OH N H N H 2 O H O C 13 H 20 O 3 , 14,431,269,166 C 11 H 12 N 2 O 2 , 310 11 <n10 12 Figure 1: The large number of possible isomers for some relatively small organic molecules. Figure 2: In ACD/Labs NMR software, peak assignments are synchronized throughout a set of related NMR experiments with NMRSync. During interpretation, the software displays correlations for assigned spectra and structures, highlighting those which are likely to be erroneous. Figure 3: A Molecular Connectivity Diagram (MCD) is created automatically during data interpretation. This can be used both for de novo elucidation, or to perform an alternative check on proposed structures. Figure 4: A list of potential structures, ranked by their associated chemical shift deviations. Structural assignments are color-coded based on their agreement with the experimental data. Figure 6: The outcomes of structural evaluations performed by ACD/Structure Elucidator Suite from 2003-2011. Figure 5: The distribution of molecular weight in structural challenges submitted to ACD/Labs. CH 3 33 CH 3 33 C H 3 32 CH 3 32 C H 3 34 CH 3 34 26 26 28 28 20 20 30 30 22 22 18 18 31 31 27 27 21 21 19 19 23 23 29 29 4 4 6 6 25 25 16 16 8 8 14 14 15 15 12 12 9 9 11 11 13 13 24 24 10 10 2 2 7 7 17 17 3 3 5 5 1 1 O O O O O O O H OH O H OH O H OH O H OH O H OH OH O H Figure 5: The elucidated structure of Bacillusin A

The Case for CASE: Computer-Assisted Structure ElucidationSMASH 2015 Keywords SMASH 2015, ACD/Structure Elucidator Suite, structure elucidator, structure elucidation, case, nmr Created

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The Case for CASE: Computer-Assisted Structure ElucidationSMASH 2015 Keywords SMASH 2015, ACD/Structure Elucidator Suite, structure elucidator, structure elucidation, case, nmr Created

The Structure Elucidation Challenge

Between 2003 and 2011, 112 different datasets were

submitted from various industries as challenges to the

ACD/Structure Elucidator Suite system. [3] The structures

determined ranged from 100 to more than 1000 Da (Figure 5)

and the software had excellent success determining the

correct structures (Figure 6).

In one recent example, Structure Elucidator Suite was able to

determine the structure of a large, symmetrical dimer known

as Bacillusin A [4] (containing 86 skeletal units) even with the

presence of ambiguous

connectivities. [5]

Conclusions

Modern CASE systems such as Structure Elucidator Suite

provide the necessary capability accurately elucidate a novel

chemical structure for complex molecules based on readily

available NMR data sets. This allows organizations to avoid

expensive, labor-intensive, and time-consuming synthetic

efforts. The routine use of CASE tools should also prevent the

introduction of many improper structures into the literature, or

into compound registries.

IntroductionComputer-Assisted Structure Elucidation (CASE) has

come to be a broadly accepted method for the derivation

of novel or difficult chemical structures, especially those of

natural products or drug metabolites and impurities. For

completely error-free structure determination, one must

first evaluate all isomers that truly match connectivity

criteria defined by experiment – but it is impossible to do

this manually without bias.

Modern NMR techniques give ever-increasing amounts of

valuable data to substantiate structural evaluations,

including lower limits of detection, and more information

on connectivities. However, errors in published structures

are rampant [1] giving weight to the argument for

computer-assisted evaluation of structures. [2]

CASE Methodology

CASE follows a simple set of steps to comprehensively

evaluate structures against available data:

1. Interpret experimental data to extract knowledge:

• Molecular Formula

• Integrals

• Chemical shifts

• Multiplicities

• Connectivities

• Known fragments

• Known exclusions

2. Search structure space to derive all possible structures

3. Rank-order based on set criteria

• Predicted chemical shift

Data Interpretation

The Case for CASE:

Computer-Assisted Structure Elucidation

References1. Nicolaou, KC; Snyder, SA. Angew. Chem. Int. Ed. Engl., 2005, Feb 4:44(7); 1012-44.

2. Elyashberg, ME; Williams, AJ; Blinov, KA. Nat. Prod. Rep., 2010, 27(9): 1296-1328.

3. Moser, A; Elyashberg, ME; Williams, AJ; Blinov, KA; DiMartino, JC. J. Cheminf., 2012 4:5.

4. Ravu, RR, et al. J. Nat Prod., 2015, 78(4): 924-8.

5. http://www.acdlabs.com/comm/elucidation/2015_05.php

Advanced Chemistry Development, Inc.

Tel: (416) 368-3435

Fax: (416) 368-5596

Toll Free: 1-800-304-3988

Email: [email protected]

www.acdlabs.com

Reprints:

[email protected]

Structure Generation

Ranking the Best CandidatesUsing the advanced algorithms of ACD/Structure Elucidator

Suite, the entirety of the chemical space can be queried for

possible structures which agree with the experimental data.

The software efficiently evaluates all structural possibilities,

and then ranks the best candidates for review based on the

differences between the experimental and predicted chemical

shifts.

Rapid Identification of StructuresWith the right databases, known structures can also be

quickly identified from previous work. The following DBs are

available using ACD/Labs software:

• ACD/HNMR and CNMR DB

• 320,000 assigned chemical structures

• ACD/Structure Elucidator DB

• 425,000 structures

• 2,300,000 structural fragments

• ChemSpider DB

• 22,000,000 predicted compounds with 13C, 1H, 19F,

31P, and 15N NMR

• Dictionary of Natural Products

• Over 220,000 compounds

• Marinlit

• > 21,000 compounds (marine sources)

• AntiBase

• > 35,000 compounds (microorganisms and higher

fungi)

Cl

Br

OH

Br

OH OH

O

C10H17Br2ClO2, 50,502,293 C15H22O2, 138,136,211,624

O

O

O

O

C15H20O1, 37,568,150,635 C12H12O3, 68,930,547,646

O

OH

OH

NH

NH2

OHO

C13H20O3, 14,431,269,166 C11H12N2O2, 31011<n1012

Figure 1: The large number of possible isomers for some

relatively small organic molecules.

Figure 2: In ACD/Labs NMR software, peak assignments

are synchronized throughout a set of related NMR

experiments with NMRSync. During interpretation, the

software displays correlations for assigned spectra and

structures, highlighting those which are likely to be

erroneous.

Figure 3: A Molecular Connectivity Diagram (MCD) is

created automatically during data interpretation. This can

be used both for de novo elucidation, or to perform an

alternative check on proposed structures.

Figure 4: A list of potential structures, ranked by their

associated chemical shift deviations. Structural

assignments are color-coded based on their agreement

with the experimental data.

Figure 6: The outcomes of structural evaluations performed

by ACD/Structure Elucidator Suite from 2003-2011.

Figure 5: The distribution of molecular weight in structural

challenges submitted to ACD/Labs.

CH333

CH333

CH3 32

CH332

CH3 34

CH334

26

26

28

28

20

20

30

30

22

22

18

18

31

31

27

27

21

21

19

19

23

23

29

29

4

4

6

6

25

25

16

16

8

8

14

14

15

15

12

12

9

9

11

11

13

13

24

24

10

10

2

2

7

7

17

17

3

3

5

5

11

O

O

O

O

OO

OH

OH

OH

OH

OH

OH

OH

OH

OH

OHOH

OH

Figure 5: The elucidated structure of Bacillusin A