Greg Challis

Department of Chemistry

Lecture 1: Methods for in silico analysis of cryptic natural product biosynthetic gene clusters

Microbial Genomics and Secondary Metabolites Summer School, MedILS, Split, Croatia, 25-29 June 2007

Overview

• Introduction

cryptic (orphan) gene clusters in microbial genomes

• Clusters encoding nonribosomal peptide synthetases (NRPSs)

domains, modules, substrate specificity, predicting products

• Clusters encoding modular polyketide synthases (PKSs)

domains, modules, substrate specificity, predicting products

• Clusters encoding other biosynthetic systems

terpene synthases, iterative PKSs

Introduction

‘Cryptic’ (orphan) biosynthetic gene clusters

• Present in many of the 300 or so sequenced microbial genomes, e.g.

Streptomyces avermitilis
Streptomyces coelicolor
Bacillus subtilis
Pseudomonas fluorescens
Pseudomonas syringae
Nostoc punctiforme
Aspergillus nidulans

• May prove a valuable new source of bioactive metabolites

• Polyketide synthases

• Nonribosomal peptide synthetases

• Terpene synthases

Genome sequence of the model antibiotic-producer Streptomyces coelicolor M145

[Figure: structures of prodiginines, actinorhodin, calcium-dependent antibiotic and methylenomycin A]

Gene clusters directing complex metabolite biosynthesis in the S. coelicolor genome

Bentley et al. Nature (2002) 417, 141-147

Part 1: Nonribosomal peptide synthetase analysis


Recap of NRPS organisation and function: the gramicidin S synthetase as an example

module 1: A-PCP-E
module 2: C-A-PCP
module 3: C-A-PCP
module 4: C-A-PCP
module 5: C-A-PCP-TE

Genes: grsA (synthetase 1), grsB (synthetase 2), grsT

A = Adenylation
PCP = Peptidyl carrier protein
C = Condensation
E = Epimerisation
TE = Thioesterase

[Figure: thioester-bound amino acid and peptide intermediates on the PCP domains of modules 1-5]

Recap of NRPS organisation and function: the gramicidin S synthetase as an example (continued)

[Figure: peptide intermediates bound to the TE and PCP domains]

For further information see Lars Robbel’s poster

Nonribosomal peptide synthetases encoded by the S. coelicolor genome

A new S. coelicolor NRPS gene cluster

Genes labelled on the cluster map: cchA, cchB, cchH, cchI, cchJ

• Formyl-tetrahydrofolate-dependent formyl transferase (cchA)
• Flavin-dependent monooxygenase (cchB)
• Non-ribosomal peptide synthetase (cchH)
• Esterase (cchJ)
• MbtH-like protein (cchK)
• Export functions
• Ferric-siderophore import

Challis and Ravel FEMS Microbiol. Lett. (2000) 187, 111-114

Prediction of domain and module structure

Conserved Domain (CD) search

(http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi)

Deduced domain and module organization

Domain order: A E C A E C A, with three carrier-domain thiols (SH)

Module 1: A-PCP-E
Module 2: C-A-PCP-E
Module 3: C-A-PCP
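The step from a domain list to modules can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the lecture: it assumes that each condensation (C) domain opens a new module, and it writes the carrier domains as PCP on the assumption that each SH label on the slide marks one. The function name `split_nrps_modules` is hypothetical.

```python
def split_nrps_modules(domains):
    """Group an ordered list of NRPS domain labels into modules.

    Assumption: the first domain opens module 1 and every
    condensation (C) domain opens a new module.
    """
    modules = []
    for domain in domains:
        if domain == "C" or not modules:
            modules.append([])
        modules[-1].append(domain)
    return modules


# Domain order read from the slide, with PCP assumed at each SH label.
domain_order = ["A", "PCP", "E", "C", "A", "PCP", "E", "C", "A", "PCP"]
modules = split_nrps_modules(domain_order)

for number, module in enumerate(modules, start=1):
    print(f"Module {number}: {'-'.join(module)}")
```

In practice the domain list would come from the hits of a conserved-domain search, sorted by their start position in the protein.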

Prediction of A-domain selectivity pocket residues

GrsA     DASVWEMFMALLTGASLYIILKDTINDFVKFEQYINQKEITVITLPPTYVVHL-----DPERILSIQTLITAGSATSPSLVNKWKEK--VTYINAYGPTETTI
Ncs1-M1  DIAVWELLAAFVGGARLVIAEHRLRGVVPHLPELMTDHRVTVAHFVPSVLEELLGWMADGGRVG-LRLVVCGGEAVPPSQRDRLLALSGARMVHAYGPTETTI

Selectivity pocket residues:

GrsA     D A W T I A A I
Ncs1-M1  D I W H V G A I

Stachelhaus, Mootz and Marahiel Chem. Biol. (1999) 6, 493-505
Challis, Ravel and Townsend Chem. Biol. (2000) 7, 211-224
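The extraction of pocket residues from the alignment can be sketched as follows. Two things here are assumptions that do not appear on the slide: that the GrsA fragment shown starts at residue 235, and that the pocket positions are GrsA residues 235, 236, 239, 278, 299, 301, 322 and 330 (the first eight positions of the published specificity code). The function names are hypothetical.

```python
# Aligned fragments copied from the slide.
GRSA = ("DASVWEMFMALLTGASLYIILKDTINDFVKFEQYINQKEITVITLPPTYVVHL-----DP"
        "ERILSIQTLITAGSATSPSLVNKWKEK--VTYINAYGPTETTI")
NCS1_M1 = ("DIAVWELLAAFVGGARLVIAEHRLRGVVPHLPELMTDHRVTVAHFVPSVLEELLGWMADG"
           "GRVG-LRLVVCGGEAVPPSQRDRLLALSGARMVHAYGPTETTI")

# Assumptions (not stated on the slide): fragment start and pocket positions.
GRSA_START = 235
POCKET_POSITIONS = [235, 236, 239, 278, 299, 301, 322, 330]


def pocket_columns(reference, start, positions):
    """Map reference residue numbers to alignment columns, skipping gaps."""
    wanted = set(positions)
    columns = {}
    residue_number = start - 1
    for column, symbol in enumerate(reference):
        if symbol == "-":
            continue  # gaps do not advance the residue numbering
        residue_number += 1
        if residue_number in wanted:
            columns[residue_number] = column
    return [columns[position] for position in positions]


def extract_code(aligned_sequence, columns):
    """Read the residues found at the given alignment columns."""
    return "".join(aligned_sequence[column] for column in columns)


columns = pocket_columns(GRSA, GRSA_START, POCKET_POSITIONS)
print(extract_code(GRSA, columns))     # DAWTIAAI
print(extract_code(NCS1_M1, columns))  # DIWHVGAI
```

Both outputs agree with the pocket residues listed on the slide, which suggests the assumed numbering is consistent with the alignment shown.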

Empirical correlation between specificity pocket residues and substrate

Ser

Orn

hTyr

Cys (ACV)

HPG

Leu, Ile, Val

Glu (Fengycin)

Leu (Eucarya)

Threonine

Asp, Asn, Gln

Valine

Ala, Dab

Cysteine

Trp, Phe

Tyr

Val, Ala (Eucarya)

Proline

Glu, Gln

Challis, Ravel and Townsend Chem. Biol. (2000) 7, 211-224
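One simple way to use such a correlation is to count identical positions between a query pocket code and reference codes from domains of known substrate. The sketch below is only an illustration of that idea, not the published method. The reference table holds just the GrsA code from the slide; a real table would hold many experimentally characterised domains. The function names are hypothetical.

```python
def pocket_identity(code_a, code_b):
    """Count positions at which two equal-length pocket codes agree."""
    if len(code_a) != len(code_b):
        raise ValueError("pocket codes must have the same length")
    return sum(a == b for a, b in zip(code_a, code_b))


def rank_references(query, references):
    """Return (identity, name) pairs, best match first, ties by name."""
    scored = [(pocket_identity(query, code), name)
              for name, code in references.items()]
    return sorted(scored, key=lambda item: (-item[0], item[1]))


# Reference table: only the GrsA code shown on the slide.
references = {"GrsA": "DAWTIAAI"}

ranking = rank_references("DIWHVGAI", references)
print(ranking)  # [(4, 'GrsA')]
```

A low identity to every reference is itself informative: it signals that the substrate cannot be assigned from the table with confidence.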

Prediction of substrates and possible products for the S. coelicolor cryptic NRPS

[Figure: predicted substrates and possible peptide products]

Challis and Ravel FEMS Microbiol. Lett. (2000) 187, 111-114

[Figure: predicted intermediates bound to Module 1, Module 2 and Module 3 (domain order A E C A E C A)]

Part 2: Modular polyketide synthase analysis

Recap of modular PKS organisation and function: the erythromycin synthase as an example

• Three large modular enzymes (DEBS 1-3), encoded by eryAI, eryAII, and eryAIII, assemble 6-DEB

• Each module performs one chain extension

[Figure: structure of 6-deoxyerythronolide B (6-DEB), annotated "TE cyclizes"]

[Figure: successive steps of one chain-extension cycle on a module containing KS, AT, DH, ER, KR and ACP domains, with loss of CO2 (-CO2)]

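The link between a module's reductive domains and the state of the β-carbon it leaves behind can be sketched as below. This is a simplified illustration under two assumptions that are not stated on the slides: every domain present is active, and KR, DH and ER act in that order. The function name and the example modules are hypothetical.

```python
def beta_carbon_state(module_domains):
    """Predict the beta-carbon state left by one modular PKS module.

    Assumptions: all listed domains are active, and processing runs
    KR (ketone to alcohol), then DH (alcohol to alkene), then ER
    (alkene to methylene).
    """
    domains = set(module_domains)
    if "KR" not in domains:
        return "keto"
    if "DH" not in domains:
        return "hydroxyl"
    if "ER" not in domains:
        return "alkene"
    return "methylene"


# Hypothetical example modules, one per possible outcome.
examples = [
    ["KS", "AT", "ACP"],
    ["KS", "AT", "KR", "ACP"],
    ["KS", "AT", "DH", "KR", "ACP"],
    ["KS", "AT", "DH", "ER", "KR", "ACP"],
]

for module in examples:
    print("-".join(module), "->", beta_carbon_state(module))
```

Inactive reductive domains are common in real synthases, so a prediction of this kind should be checked against the active-site motifs of each domain.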

Gene clusters directing complex metabolite biosynthesis in the S. coelicolor genome

Bentley et al. Nature (2002) 417, 141-147

A new S. coelicolor modular PKS cluster

Genes encoding a modular PKS

Prediction of domains and modules in CpkA

Conserved Domain (CD) search

(http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi)

Prediction of domains and modules in CpkB

Prediction of domains and modules in CpkC

Prediction of domains and modules in CpkABC

Pawlik, Kotowska, Chater, Kuczek and Takano Arch. Microbiol. (2007) 187, 87-99

Prediction of AT domain substrate selectivity

Haydock et al. FEBS Lett. (1995) 374, 246-248
Banskota et al. J. Antibiot. (2006) 59, 168-176
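A motif-based check of AT domain selectivity might look like the sketch below. The motif pair used (HAFH for malonyl-CoA, YASH for methylmalonyl-CoA) is a heuristic commonly cited in the PKS literature; it is an assumption here and does not appear on the slide. The test strings are made-up toy sequences, not real AT domains, and a plain substring search stands in for a proper alignment against characterised domains.

```python
# Assumed heuristic motifs (not taken from the slide).
AT_MOTIFS = {
    "HAFH": "malonyl-CoA",
    "YASH": "methylmalonyl-CoA",
}


def predict_at_substrate(at_sequence):
    """Return the substrate suggested by a single motif hit.

    Returns "undetermined" when no motif, or more than one, is found.
    """
    sequence = at_sequence.upper()
    hits = [substrate for motif, substrate in AT_MOTIFS.items()
            if motif in sequence]
    return hits[0] if len(hits) == 1 else "undetermined"


# Toy sequences invented for illustration only.
print(predict_at_substrate("AAAAHAFHAAAA"))
print(predict_at_substrate("AAAAYASHAAAA"))
print(predict_at_substrate("AAAAAAAAAAAA"))
```

Returning "undetermined" for ambiguous or missing hits keeps the sketch from over-calling a substrate when the evidence is weak.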

Prediction of KR domain stereoselectivity

Caffrey ChemBioChem (2003) 4, 654-657
Reid et al. Biochemistry (2003) 42, 72-79

Prediction of substrates and possible products for the S. coelicolor cryptic PKS

[Figure: alternative predicted product structures]

Non-linear enzymatic logic can complicate things!

[Figure: structure of an example metabolite]

Haynes and Challis, Curr. Op. Drug Discov. Develop. (2007) 10, 203-218

Non-linear enzymatic logic can complicate things! (continued)

[Figure: PKS assembly line with a loading module and Modules 1-7 (KS, AT, KR, DH, ER, ACP and TE domains) and the ACP-bound intermediates; labels on the figure include "Module 5 + 6 + 7", "3" and "BorI, BorJ"]

Haynes and Challis, Curr. Op. Drug Discov. Develop. (2007) 10, 203-218

Part 3: Analysis of other biosynthetic systems

Terpene synthases

C10: monoterpene synthase
C15: sesquiterpene synthase
C20: diterpene synthase
C30: triterpene synthase

[Figure: substrate structures, drawn as diphosphates (OPP)]
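The correspondence between substrate size and terpene synthase class can be written as a small lookup. The class names and carbon counts are those on the slide; the five-carbon isoprene unit used to count building blocks is standard terpene chemistry rather than something stated here. The function names are hypothetical.

```python
ISOPRENE_UNIT_CARBONS = 5  # standard C5 building block, not from the slide

# Carbon counts and synthase classes as listed on the slide.
TERPENE_CLASSES = {
    10: "monoterpene synthase",
    15: "sesquiterpene synthase",
    20: "diterpene synthase",
    30: "triterpene synthase",
}


def terpene_synthase_class(carbon_count):
    """Return the synthase class for a substrate of the given size."""
    return TERPENE_CLASSES.get(carbon_count, "unclassified")


def isoprene_units(carbon_count):
    """Return how many C5 units make up a substrate of the given size."""
    units, remainder = divmod(carbon_count, ISOPRENE_UNIT_CARBONS)
    if remainder:
        raise ValueError("carbon count is not a multiple of five")
    return units


for carbons in sorted(TERPENE_CLASSES):
    print(f"C{carbons}: {isoprene_units(carbons)} units, "
          f"{terpene_synthase_class(carbons)}")
```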

Iterative polyketide synthases – type III PKSs

[Figure: DpgA from Amycolatopsis orientalis (4 x acyl-CoA substrate) and RppA from Streptomyces griseus (5 x acyl-CoA substrate, with the product leading to melanin), shown with substrate and product structures]

Conclusions

• Reasonably confident in silico predictions of domain / module organisation and substrate specificity of modular PKS / NRPS can be made

• Non-linear enzymatic logic can complicate the reliable prediction of product structure(s)

• For other types of biosynthetic system, reasonably confident predictions of substrate specificity can sometimes be made

• Prediction of chain length and substrate specificity in some iterative PKS systems, especially type III and fungal type I, remains difficult
