About the cellulases distribution Renaud Berlemont, UCI Adam Martiny Lab


About the GHx classification

• CAZYdb: Glycoside Hydrolases, …

• Structure – sequence alignments: Families (>100) / Clans (14)

• « Convergence – Divergence »

Some statements

• Biochemically confirmed « cellulases » = CMCases

• Many cellulases are active on other substrates (e.g. xylan)

• Many « cellulases » are non-cellulolytic !?

• CMCases ≠ Cellulases

• Cellulose production:
– GH8 (Romling, 2002) – Biofilm / Interaction (w. plant)
– GH5 (Berlemont, 2009) – Biofilm
– GH6 (Delbrassine, in prep) – Cell differentiation
– GH6 (Tunicate, animal)
– GH9 (KORrigan, plant)

• Best-studied cellulose degraders all belong to the Firmicutes group (e.g. Clostridium)

• ~20 genomes of cellulose degraders have been completely sequenced

Question 2

How are extracellular enzyme genes distributed among microbial taxa?

Hypothesis 2a

Some extracellular enzymes are broadly distributed across taxa while others are constrained to a small number of taxa.

Hypothesis 2b

The occurrence of different extracellular enzyme genes among taxa will be correlated. Some genes will show patterns of over-dispersion while others will show co-occurrence.
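One way to put Hypothesis 2b into numbers is to score each pair of enzyme genes on a genome-by-gene presence/absence table. The sketch below uses the phi coefficient (Pearson correlation for binary data). The choice of statistic, the function name and the toy genomes are assumptions for illustration, not the analysis used in this study. Positive values suggest co-occurrence, negative values suggest over-dispersion (genes that tend not to share a genome).

```python
from math import sqrt

def phi_coefficient(a, b):
    """Phi coefficient between two equal-length presence/absence (0/1) vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Fill the 2x2 contingency table.
    n11 = sum(1 for x, y in zip(a, b) if x and y)
    n10 = sum(1 for x, y in zip(a, b) if x and not y)
    n01 = sum(1 for x, y in zip(a, b) if not x and y)
    n00 = sum(1 for x, y in zip(a, b) if not x and not y)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # a gene present (or absent) everywhere: undefined, reported as 0
    return (n11 * n00 - n10 * n01) / denom

# Hypothetical presence/absence of three GH families in six genomes.
presence = {
    "GH5": [1, 1, 1, 0, 0, 0],
    "GH9": [1, 1, 1, 0, 0, 0],  # always found together with GH5
    "GH6": [0, 0, 0, 1, 1, 1],  # never found together with GH5
}

co_occurring = phi_coefficient(presence["GH5"], presence["GH9"])
over_dispersed = phi_coefficient(presence["GH5"], presence["GH6"])
```

With real data, the strong sampling bias between bacterial groups would need to be taken into account before reading much into such scores.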

pSEED - FigFams

• Sequenced genomes (patricbrc db: 4089)

In order to analyze as many sequenced genomes as possible

pSEED - FigFams

« FIGfams are sets of protein sequences that are similar along the full length of the proteins. Proteins are thought of as implementing one or more abstract functional roles, and all of the members of a single FIGfam are believed to implement precisely the same set of functional roles ».

« Unambiguous coherent annotation system » … 3.2.1.4: 1,4-beta-D-endoglucanase, 1,4-beta-D-glucan-4-glucanohydrolase, beta-1,4-endoglucan hydrolase, beta-1,4-endoglucanase, endoglucanase, …

Methodology

[Flow diagram, labels in order: FigFam IDs; CAZYdb; E.C. 3.2.1.4; GHx; Pfam (pro. + euk.), InterPro (pro.); PfGHx.FASTA, IprGHx.FASTA; pSEED; PEG IDs; home-made script (SEQ, PEG ID); GH families; FigFam IDs]

Several FigFam IDs correspond to one GHx family because signal peptides and accessory domains are not conserved …
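The mapping step can be sketched as two lookups, PEG ID to FigFam ID and FigFam ID to GH family, with several FigFam IDs collapsing onto one family. All identifiers below are made-up placeholders, not real pSEED or FigFam entries, and the dictionaries merely stand in for whatever tables the home-made script uses.

```python
from collections import defaultdict

# Hypothetical lookup tables (made-up identifiers, for illustration only).
PEG_TO_FIGFAM = {
    "peg.0001": "FIG_A",
    "peg.0002": "FIG_B",
    "peg.0003": "FIG_C",
    "peg.0004": "FIG_X",  # not a glycoside hydrolase
}
# Several FigFam IDs can point to the same GH family, e.g. when signal
# peptides or accessory domains differ between otherwise related proteins.
FIGFAM_TO_GH = {
    "FIG_A": "GH5",
    "FIG_B": "GH5",
    "FIG_C": "GH6",
}

def gh_families_for_pegs(peg_ids):
    """Group PEG IDs by GH family, skipping PEGs with no GH assignment."""
    by_family = defaultdict(list)
    for peg in peg_ids:
        figfam = PEG_TO_FIGFAM.get(peg)
        family = FIGFAM_TO_GH.get(figfam)
        if family is not None:
            by_family[family].append(peg)
    return dict(by_family)

grouped = gh_families_for_pegs(["peg.0001", "peg.0002", "peg.0003", "peg.0004"])
```

Here two different FigFam IDs end up under GH5, which is the many-to-one situation described on the slide.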

Methodology

[Flow diagram, labels in order: FigFam IDs; GHx; pSEED; genome annotations; GHx occurrence in sequenced genomes; alignment; statistics; bacterial groups; CBM2 …; occurrence / list; FigFam IDs; genome annotations (pSEED); GHx distribution]

A huge data-set

A Actinobacteria
B Aquificae
C Bactero./Chlorobi
D Chlam./Verruco.
E Chloroflexi
F Chrysiogenetes
G Cyanobacteria
H Deferribacteres
I Deinoco./Thermus
J Dictyoglomi
K Elusimicrobia
L Fibrob./Acidobact.
M Firmicutes
N Fusobacteria
O Nitrospirae
P Gemmatimonadetes
Q Planctomycetes
R Proteobacteria
S Spirochaetes
T Synergistetes
U Tenericutes
V Thermodesulfobact.
W Thermotogae

Huge bias: A + C + M + R (Actinobacteria, Bactero./Chlorobi, Firmicutes, Proteobacteria) = 88% of the sequenced genomes…

Average Gene Content (AGC)

[Figure legend: bacterial groups A–W, as listed above]

Life style (auto- vs. heterotrophic); host association

“HKG” (house keeping); multi-function
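Reading Average Gene Content as the mean number of GHx genes per sequenced genome within a bacterial group (an assumption, since the slides do not define AGC), it could be computed as in the sketch below. Group letters follow the A–W legend; the per-genome counts are invented.

```python
from collections import defaultdict

def average_gene_content(genomes):
    """Mean GHx gene count per genome for each bacterial group.

    `genomes` is an iterable of (group, ghx_gene_count) pairs, one per genome.
    """
    totals = defaultdict(int)
    n_genomes = defaultdict(int)
    for group, count in genomes:
        totals[group] += count
        n_genomes[group] += 1
    return {group: totals[group] / n_genomes[group] for group in totals}

# Invented counts: (group letter, number of GHx genes found in that genome).
toy_genomes = [("A", 4), ("A", 2), ("M", 1), ("M", 0), ("M", 2), ("U", 0)]
agc = average_gene_content(toy_genomes)
```

Averaging per group rather than summing raw counts keeps the heavily sequenced groups from dominating the comparison.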

GHx distribution in Genomes

Life Style

Autotrophic: Aquificae, Cyanobacteria, Chrysiogenetes, Nitrospirae

Host-associated: Chlam./Verruco., Elusimicrobia, Fibrob./Acidobact.*, Fusobacteria, Spirochaetes, Tenericutes

GHx distribution in Genomes

GHx functions

« House keeping »

GH6: endoglucanase; cellobiohydrolase

GH18: … endo-β-N-acetylglucosaminidase …

Q: Planctomycetes

U: Tenericutes - Mycoplasma

GHx distribution in Genomes

GHx functions

GHx families « specialization »

GH6: endoglucanase; cellobiohydrolase

GH5: chitosanase; β-mannosidase; cellulase; glucan β-1,3-glucosidase; licheninase; glucan endo-1,6-β-glucosidase; mannan endo-β-1,4-mannosidase; endo-β-1,4-xylanase; cellulose β-1,4-cellobiosidase; β-1,3-mannanase; xyloglucan-specific endo-β-1,4-glucanase; mannan transglycosylase; endo-β-1,6-galactanase; endoglycoceramidase

How can we tell whether an enzyme from GH5 is a cellulase?

Complex architectures

GH5: chitosanase (EC 3.2.1.132); β-mannosidase (EC 3.2.1.25); cellulase (EC 3.2.1.4); glucan β-1,3-glucosidase (EC 3.2.1.58); licheninase (EC 3.2.1.73); glucan endo-1,6-β-glucosidase (EC 3.2.1.75); mannan endo-β-1,4-mannosidase (EC 3.2.1.78); endo-β-1,4-xylanase (EC 3.2.1.8); cellulose β-1,4-cellobiosidase (EC 3.2.1.91); β-1,3-mannanase (EC 3.2.1.-); xyloglucan-specific endo-β-1,4-glucanase (EC 3.2.1.151); mannan transglycosylase (EC 2.4.1.-); endo-β-1,6-galactanase (EC 3.2.1.164); endoglycoceramidase (EC 3.2.1.123)

GH6: endoglucanase (EC 3.2.1.4); cellobiohydrolase (EC 3.2.1.91)

GH6 = « cellulase »

GH5 = Multifunction
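The contrast between GH6 and GH5 can be expressed as a simple check on the EC numbers listed for each family: a family whose activities all fall within a set of cellulase activities is labelled as cellulase-specific, anything else as multifunction. The EC lists are copied from the slide; treating EC 3.2.1.4 and EC 3.2.1.91 as the "cellulase" set, and the two-way labelling itself, are assumptions made for this sketch.

```python
# Assumed definition of "cellulase" activities for this sketch.
CELLULASE_ECS = {"3.2.1.4", "3.2.1.91"}

# EC numbers per family, as listed on the slide.
FAMILY_ECS = {
    "GH5": {"3.2.1.132", "3.2.1.25", "3.2.1.4", "3.2.1.58", "3.2.1.73",
            "3.2.1.75", "3.2.1.78", "3.2.1.8", "3.2.1.91", "3.2.1.-",
            "3.2.1.151", "2.4.1.-", "3.2.1.164", "3.2.1.123"},
    "GH6": {"3.2.1.4", "3.2.1.91"},
}

def classify_family(ec_numbers):
    """Return 'cellulase' if every listed activity is a cellulase activity."""
    if ec_numbers and ec_numbers <= CELLULASE_ECS:
        return "cellulase"
    return "multifunction"

labels = {family: classify_family(ecs) for family, ecs in FAMILY_ECS.items()}
```

For a GH5 hit, family membership alone therefore cannot settle whether the enzyme is a cellulase.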

Free cellulases from GH6 are associated with cellulose production in actinomycetes!

?

Is there an efficient combination of enzymes?

Some genes are abundant (GH5, 10, 16, 18, 19). Are these genes really involved in plant cell wall (PCW) breakdown?

Why are Fibrobacteria so efficient?

Multi-domain

The keys to success in Fibrobacteria

Things to remember…

• Huge dataset

• Distribution of GHx amongst taxa

• Not all GHx are equivalent
– Multifunction, house keeping and specialized GHx families

• Not all taxa are equivalent
– Life style, metabolism

• Future: « Multi-domain »

What’s next

Looking at the GHx distribution in subgroups (e.g. Proteobacteria, Firmicutes, …)

Detailed table of the GHx distribution amongst (sub)-taxa

Potential publication?

• What is the phylogenetic distribution of GHx’s and CBM-GHx’s?

• Catabolism regulation analysis in Actinobacteria: CebR (GHx vs. CBM-GHx)
– Presence/absence of regulating sequences upstream of the GHx-coding sequences

• Environmental factors: “life style”, “metabolism”, …

• Gene gain/loss: 16S rRNA vs. presence/absence of GHx’s

Does the cellulose degradation potential vary across environments?

Some case studies …

GHx distribution in metagenomes

[Figure: % of CBM-linked GHx. Study labels: Warnecke 2007; Hess 2011. Taxa labels: "Bacteroidetes, Fibrobacteria, Clostridia, …" and "Spirochaetes, Fibrobacter, Bacteroidetes, …"]

… vs. our study

Using the SSU…

[Figure: samples L1, L2, L3, L4, L5, L6 and PL; y-axis: percent of hits to bacterial SSU-rRNA sequences (0 to 120); legend: Fibrobacter/Acidobacter, Bacteroidetes, Cyanobacteria, Firmicutes, Gammaproteobacteria, Betaproteobacteria, Alphaproteobacteria, Actinobacteria, Others]

… vs. our study

[Figure: study labels: Reno 2012 (probably); Warnecke 2007; Hess 2011. Taxa labels: "Actinobacteria, Alphaproteobacteria, Bacteroidetes, …", "Bacteroidetes, Fibrobacteria, Clostridia, …" and "Spirochaetes, Fibrobacter, Bacteroidetes, …"]

Metagenome clustering

Environment selects for different populations (with different GHx)

[Figure: clustering of metagenomes by 16S rRNA and by GHx. Samples: Leaf Litter; Leaf Litter (tr. 1); Leaf Litter (tr. 2); Cow Rumen; Termites; Wood-feeding insects; Human metagenome; GOS]
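Clustering of metagenomes starts from a pairwise dissimilarity between samples. The sketch below computes the Bray-Curtis dissimilarity between GHx count profiles and reports the closest pair. The metric is an assumption (the slides do not say which one was used), the helper names are hypothetical, and the profiles are invented numbers.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative count vectors."""
    if len(u) != len(v):
        raise ValueError("profiles must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0  # two empty profiles are treated as identical
    return sum(abs(a - b) for a, b in zip(u, v)) / total

def closest_pair(profiles):
    """Return (dissimilarity, sample_a, sample_b) for the most similar pair."""
    names = sorted(profiles)
    pairs = [(bray_curtis(profiles[a], profiles[b]), a, b)
             for i, a in enumerate(names) for b in names[i + 1:]]
    return min(pairs)

# Invented counts of [GH5, GH6, GH8, GH9] hits per sample.
profiles = {
    "Leaf Litter": [10, 6, 4, 5],
    "Leaf Litter (tr. 1)": [9, 6, 5, 5],
    "Cow Rumen": [12, 0, 1, 2],
}

distance, first, second = closest_pair(profiles)
```

In this toy example the two leaf litter samples sit closest together, which is the kind of grouping by environment the slide title describes.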

?

Things to remember…

• Different recipes for efficient PCW breakdown

• Depending on the ecosystem

• Leaf litter ≠ Cow Rumen
– Bacterial content
– GH content

• Depending on the ecosystem, bacteria display different strategies to access plant polymers (LL: leaf litter; CR: cow rumen)
– [GH6, GH8, GH9] LL > [GH6, GH8, GH9] CR
– [CBM-GHx] LL > [CBM-GHx] CR

What’s next

• Leaf Litter Metagenome
– 22 samples ~ready to be sequenced (TruSeq TM DNA - Illumina) (first year)
– samples to be prepared (second year)
– Compare: [GHx/16S rRNA in sequenced genomes] vs. [GHx/16S rRNA in Leaf Litter]
– Compare different treatments, metagenomes
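The planned comparison normalizes GHx counts by 16S rRNA counts so that a set of sequenced genomes and a metagenome can be put on the same scale. A minimal sketch, with invented hit counts and a hypothetical function name:

```python
def ghx_per_16s(ghx_hits, ssu_hits):
    """GHx hits per 16S rRNA hit, for each GH family."""
    if ssu_hits <= 0:
        raise ValueError("need at least one 16S rRNA hit to normalize")
    return {family: hits / ssu_hits for family, hits in ghx_hits.items()}

# Invented counts, for illustration only.
genomes = ghx_per_16s({"GH5": 200, "GH6": 50}, ssu_hits=1000)
leaf_litter = ghx_per_16s({"GH5": 30, "GH6": 15}, ssu_hits=100)

# Ratio > 1: the family is enriched in leaf litter relative to sequenced genomes.
enrichment = {family: leaf_litter[family] / genomes[family] for family in genomes}
```

Because the number of 16S rRNA gene copies varies between genomes, such a ratio is only a rough proxy for GHx genes per cell.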

Nitrogen fertilization

Nemergut, 2008, The effects of chronic nitrogen fertilization on alpine tundra soil microbial communities: implications for carbon and nitrogen cycling.

[Figure: 16S rRNA, GHx, GHy and GHz panels, with control samples]

24 samples

• TruSeq TM DNA (Illumina)

• 24 samples

• 22 samples ready to be sequenced

Complex architectures

[Figure: example domain architectures: Cel5-CBM2; Xyl8-Cel5-CBM2; Cel5]

Number of FigFam IDs corresponding to a 2-domain protein

Plant Cell Wall

Number of FigFam IDs ≠ number of genes

Metagenome clustering

Environment selects for different GHx potential

[Figure: clustering of metagenomes by 16S rRNA and by GHx. Samples: Leaf Litter; Leaf Litter (tr. 1); Leaf Litter (tr. 2); Cow Rumen; Termites; Wood-feeding insects; Human metagenome; GOS]
