Examination of the Transcriptional Regulation and Downstream Targets of the Transcription Factor AtMYB61 by Michael Prouse A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Department of Cell & Systems Biology University of Toronto © Copyright by Michael Prouse 2013


Examination of the Transcriptional Regulation and Downstream Targets of the Transcription Factor AtMYB61

by

Michael Prouse

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Department of Cell & Systems Biology University of Toronto

© Copyright by Michael Prouse 2013


Examination of the transcriptional regulation and downstream targets of the transcription factor AtMYB61

Michael Prouse

Doctor of Philosophy

Department of Cell & Systems Biology, University of Toronto

2013

Thesis Abstract

The mechanisms behind how a transcription factor elicits a given phenotype can be complex. The aim of the research presented herein was to provide experimental evidence to characterise the upstream and downstream regulation of the Arabidopsis thaliana R2R3-MYB transcription factor, AtMYB61. To address this aim, three separate experiments were undertaken.

First, three direct downstream target genes of AtMYB61 were predicted based on a two-stage complete transcriptome analysis, using publicly available microarray datasets in combination with a custom microarray dataset comparing the transcriptomes of WT, atmyb61 and 35S::MYB61 plants. These candidate target genes encode the following proteins: a KNOTTED1-like transcription factor, a caffeoyl-CoA 3-O-methyltransferase and a pectin-methylesterase. AtMYB61 bound the 5′ non-coding regulatory regions of these target genes, as determined by electrophoretic mobility shift assay (EMSA).

Second, the preferred DNA-binding sites of recombinant AtMYB61 protein were assessed with a cyclic amplification and selection of targets (CASTing) assay. Key interactions between amino acids in the AtMYB61 DNA-binding site and nucleotides in the preferred DNA targets were predicted by molecular modeling. While recombinant AtMYB61 was sufficient to drive gene expression from CASTing-identified target DNA sequences in yeast, it did so in a manner that was not entirely consistent with predicted DNA-binding affinities determined by a nitrocellulose filter binding assay.

Finally, the molecular components that function upstream to modulate AtMYB61 expression were determined. AtMYB61 was found to be de-repressed by sucrose through a mechanism involving its second intron. An over-represented motif was conserved within the second intron of Brassicaceae AtMYB61 homologues, and this motif functioned as a binding target for a putative sugar-mediated repressor, as determined by EMSA. Putative AtMYB61 repressor proteins that bound this motif in the absence of sucrose were affinity purified and characterised using LC-MS/MS, and the proteins were identified based on their MS fingerprints.


Acknowledgements

I thank my supervisor, Dr. Malcolm Campbell, for his ongoing mentorship and guidance over the five years that I have had the pleasure to be in his laboratory. His tremendous support and optimism have shaped me into the scientist I am today. Our father-son relationship was something that I will always treasure, and for that I thank him. I would also like to thank my committee members, Dr. Darrell Desveaux and Dr. Keiko Yoshioka, and examiners, Dr. Dinesh Christendat, Dr. Daphne Goring and Dr. Shelley Hepworth, for keeping my goals in sight and obtainable and for the constructive criticisms that I needed to receive to reach the next level.

I would also like to thank the members of the Cell Systems Biology program, with whom I spent countless hours discussing science, projects, and ideas. I would also like to thank my lab mates, whom I have treated as my family and with whom I have shared some of my fondest memories: Katharina Braeutigam, Thomas Cannam, Erin Hamanishi, Katrina Hiiback, Hungwei Hou, Julia Nowak, Joan Ouellette, Sherosha Raj, Julia Romano, Joseph Skaf, Michael Stokes, Heather Wheeler, and Olivia Wilkins. To my longtime office mates Michael Stokes and Rohan Patel, I thank you for all the laughs, pranks and great times that we shared over the years.

I am also grateful to my parents, Doris and Robert Prouse, who have constantly been there for me throughout my life and have provided me with the guidance, unconditional love, and support that I needed to succeed. Finally, I would like to thank my wife, Diana: without you always loving and supporting me, I would never have made it this far. You are everything to me and I can't wait to start our new family together.

“I'm a great believer in luck, and I find the harder I work, the more I have of it.”

—Stephen Leacock


Table of Contents

Thesis Abstract ........................................................................................................................... ii

Acknowledgements .................................................................................................................... iv

Table of Contents ....................................................................................................................... v

List of Abbreviations ................................................................................................................... x

List of Tables ............................................................................................................................. xii

List of Figures........................................................................................................................... xiii

Chapter 1 ................................................................................................................................... 1

1. Introduction ........................................................................................................................ 2

1.1 Transcription Factors ..................................................................................................... 2

1.2 The Nature of MYB Proteins .......................................................................................... 3

1.2.1 The MYB Transcription Factor Superfamily .............................................................. 3

1.2.2 Animal MYB Proteins ................................................................................................ 4

1.2.3 Plant MYB Proteins .................................................................................................. 6

1.2.4 Single MYB Repeat Proteins .................................................................................... 7

1.2.5 Expansion and Diversification of the MYB Family ..................................................... 7

1.3 DNA targets of MYB family members ............................................................................. 8

1.3.1 Animal MYB DNA-Binding Sites ............................................................................... 8

1.3.2 Plant MYB DNA-Binding Sites .................................................................................. 9

1.3.3 The DNA Targets of Single MYB Repeat Proteins .................................................. 14

1.4 The Nature of DNA-Binding by MYB Proteins .............................................................. 15

1.4.1 Relationship Between the MYB DNA-Binding Domain and DNA-Binding Specificity .......................................................................................................... 15

1.4.2 Involvement of MYB Repeats in DNA Binding ........................................................ 17

1.4.3 The Nature of DNA Binding By Animal MYB Proteins ............................................. 17

1.4.4 The Nature of DNA Binding By Plant MYB Proteins ............................................... 21

1.5 Future of Plant MYB-DNA Interaction Studies .............................................................. 24


1.5.1 Determining the Breadth of MYB DNA Targets in vitro ........................................... 24

1.5.2 Emerging Approaches for Plant MYB Target Discovery and Analysis in vivo .......... 25

1.6 Transcriptional Regulation of MYB proteins ................................................................. 29

1.6.1 Regulators Effecting MYB Gene Expression in Networks ....................................... 29

1.6.2 The Role of Introns on MYB Transcriptional Regulation ......................................... 30

1.7 Research Hypotheses and Aims ............................................................................. 31

1.8 Acknowledgements ................................................................................................. 33

Chapter 2 ................................................................................................................................. 34

2 AtMYB61, an R2R3-MYB Transcription Factor, is a Pleiotropic Regulator of Plant Carbon Acquisition and Resource Allocation ....................................................................... 35

2.1 Abstract ....................................................................................................................... 35

2.2 Introduction .................................................................................................................. 35

2.3 Materials and Methods ................................................................................................. 37

2.3.1 Plant Material, Seed Sterilization and Growth Conditions ................................... 37

2.3.2 RNA Isolation and Quantitative PCR .................................................................. 38

2.3.3 Secondary Thickened Hypocotyls Stained with Phloroglucinol ........................... 38

2.3.4 Transmission Electron Microscopy ..................................................................... 39

2.3.5 Microarray Analysis ............................................................................................ 39

2.3.6 Bioinformatic Analyses to Identify AtMYB61 Targets .......................................... 40

2.3.7 Electrophoretic Mobility Shift Assay (EMSA) ...................................................... 41

2.3.8 Transcriptional Activation Assay ........................................................................ 41

2.3.9 Fibre Quality Analysis ........................................................................................ 42

2.4 Results and Discussion ................................................................................................ 42

2.4.1 AtMYB61 Modulates the Expression of a Specific Set of Target Genes ............. 42

2.4.2 AtMYB61 Regulates Genes with Specific Target Motifs in Their Promoters ....... 47

2.4.3 AtMYB61 Regulates Genes Which Themselves Contribute to AtMYB61-Related Phenotypes .......................................................................................... 52

2.5 Conclusion ................................................................................................................... 54


2.6 Acknowledgements ...................................................................................................... 54

Chapter 3 ................................................................................................................................. 55

3 Interactions between the R2R3-MYB Transcription Factor, AtMYB61, and Target DNA Binding Sites ....................................................................................................................... 56

3.1 Abstract ....................................................................................................................... 56

3.2 Introduction .................................................................................................................. 56

3.3 Materials and Methods ................................................................................................. 59

3.3.1 Ethics Statement ................................................................................................ 59

3.3.2 Expression of Recombinant Protein in Bacteria ................................................. 59

3.3.3 Antibody Production and Western Blot Analysis ................................................. 59

3.3.4 Cyclic Amplification and Selection of Targets (CASTing) ................................... 60

3.3.5 Nitrocellulose Filter-Binding Assay ..................................................................... 60

3.3.6 Electrophoretic Mobility Shift Assay (EMSA) ...................................................... 61

3.3.7 Molecular Modelling ........................................................................................... 61

3.3.8 Transcriptional Activation Assay ........................................................................ 61

3.4 Results and Discussion ................................................................................................ 62

3.4.1 AtMYB61 Bound a Discrete Subset of DNA Target Sequences ......................... 62

3.4.2 AtMYB61 Bound to DNA Target Sequences with Varying Degrees of Affinity .... 66

3.4.3 The Affinity of AtMYB61 to Specific Target DNA Sequences Was Predicted by Molecular Interactions Determined in silico ....................................................... 69

3.4.4 The Affinity of AtMYB61 to Specific Target DNA Sequences Did Not Correlate with AtMYB61-Driven Transcriptional Activation with Each of the Target Sequences ........................................................................................................ 71

3.4.5 CASTing Target Sequences Were Found in the Promoter Regions of Three Putative Direct Downstream Targets of AtMYB61 ............................................. 76

3.5 Conclusion ................................................................................................................... 78

3.6 Acknowledgements ...................................................................................................... 78

3.7 Supplemental Figures and Tables ............................................................................... 79

Chapter 4 ................................................................................................................................. 83


4 Novel Regulation of an R2R3-MYB Transcription Factor, AtMYB61, by a Non-Hexokinase Sugar-Signalling Pathway ................................................................................ 84

4.1 Abstract ....................................................................................................................... 84

4.2 Introduction .................................................................................................................. 84

4.3 Materials and Methods ................................................................................................. 86

4.3.1 Plant Material and Culture .................................................................................. 86

4.3.2 Phylogenetic Analysis of AtMYB61 Brassicaceae Homologues ......................... 87

4.3.3 Analysis of Transgenic Plants Containing Promoter::Reporter Fusions .............. 87

4.3.4 Semi-Quantitative PCR ...................................................................................... 88

4.3.5 Quantitative, Real-Time, Reverse Transcriptase Polymerase Chain Reaction (qRT-PCR) ........................................................................................................ 88

4.3.6 Electrophoretic Mobility Shift Assay (EMSA) ...................................................... 90

4.3.7 Streptavidin Biotin Pull-Down Assay .................................................................. 90

4.3.8 Mass Spectrometry ............................................................................................ 91

4.4 Results and Discussion ................................................................................................ 91

4.4.1 AtMYB61 Expression is Regulated by Sugars .................................................... 91

4.4.2 AtMYB61 Acts in a Pathway Independent of the Hexokinase Sugar Signalling Pathway ............................................................................................................ 94

4.4.3 AtMYB61 Expression is Sugar Derepressed, Involving an Intragenic Sequence within the 5′ Coding Region Containing Two Introns ......................... 97

4.4.4 Affinity Purification Coupled with Mass Spectrometry Uncovers a Suite of Putative AtMYB61 Repressor Proteins that Bind the Conserved Second Intron Motif in a Sucrose-Dependent Manner .................................................. 103

4.4.5 A Subset of Putative AtMYB61 Repressor Genes Are Sugar Sensitive ............ 106

4.4.6 rmx Loss-of-Function Mutant Phenocopies Constitutive AtMYB61 Overexpression ............................................................................................... 108

4.5 Conclusion ................................................................................................................. 110

4.6 Acknowledgements .................................................................................................... 111

4.7 Supplemental Figures and Tables ............................................................................. 112

Chapter 5 ............................................................................................................................... 134


5 General Conclusions and Future Directions .................................................................... 135

5.1 General Conclusions ................................................................................................. 135

5.2 Future Directions ....................................................................................................... 137

Molecular Characterisations of Plant Transcription Factors ........................................ 137

ChIP-Seq .................................................................................................................... 137

Characterisations of Putative AtMYB61 Repressors ................................................... 138

Appendices ............................................................................................................................ 140

A The Wound-, Pathogen-, and Ultraviolet B-Responsive MYB134 Gene Encodes an R2R3 MYB Transcription Factor that Regulates a Suite of Genes Involved in Proanthocyanidin Synthesis in Poplar ................................................................................ 141

A.1 Abstract ..................................................................................................................... 141

A.2 Introduction ............................................................................................................... 141

A.3 Materials and Methods .............................................................................................. 144

A.3.1 EMSA .............................................................................................................. 144

A.4 Results and Discussion ............................................................................................. 145

A.4.1 MYB134 Binds to Promoter Regions of PA Biosynthetic Genes ...................... 145

A.5 Conclusion ................................................................................................................ 148

A.6 Acknowledgements ................................................................................................... 149

B Study Labels ................................................................................................................... 150

References ............................................................................................................................. 151

Copyright Acknowledgements ................................................................................................ 181


List of Abbreviations

35S    Cauliflower Mosaic Virus 35S promoter

61P    AtMYB61 promoter

61PN    AtMYB61 promoter and 5′ intragenic sequences

2-DG    2-deoxyglucose

3-OMG    3-O-methylglucose

aba    abscisic acid loss-of-function mutant

abi    abscisic acid insensitive loss-of-function mutant

ABRC    Arabidopsis Biological Resource Center

AC-1    AtMYB61 preferred target sequence (ACCTAC)

AC elements    adenosine and cytosine enriched sequences

ACT    ACTIN

AMV    avian myeloblastosis virus

AGRIS    Arabidopsis Gene Regulatory Information Server

ANR2    ANTHOCYANIDIN REDUCTASE2

AtHXK    Arabidopsis thaliana HEXOKINASE

atmyb61    Arabidopsis thaliana MYB61 loss-of-function mutant

BERF1    Barley Ethylene Response Factor1

BEIL1    Barley Ethylene Insensitive Like1

BGRF1    Barley Growth Regulating Factor1

bHTH    basic helix-turn-helix

bHLH    basic helix-loop-helix

C1    COLORED1

CAST    cyclic amplification and selection of targets

CCoAOMT7    caffeoyl-CoA 3-O-methyltransferase

ChIP-chip    chromatin immunoprecipitation on chip

ChIP-seq    chromatin immunoprecipitation followed by high throughput sequencing

Col-0    wild-type Arabidopsis thaliana Columbia

CPC    CAPRICE

DEPC    diethylpyrocarbonate

DFR1    DIHYDROFLAVONOL REDUCTASE1

DOF    DNA binding with one Finger

EMSA    electrophoretic mobility shift assay

FLP    FOUR LIPS

gin    glucose insensitive loss-of-function mutant

GL1    GLABRA1

GL3    GLABRA3

GR    glucocorticoid receptor

GS1b    GLUTAMATE SYNTHETASE-1B

GSNO    S-nitrosoglutathione

GTFs    general transcription factors

GUS    β-glucuronidase

hxk    hexokinase loss-of-function mutant

IBP    indicator binding protein group

IFN-γ    human interferon-γ

irx11    irregular xylem11/knat-7 loss-of-function mutant

Kd    dissociation constant

KNAT7    KNOTTED1-like transcription factor

LACC    Local Animal Care Committee

LC-MS/MS    liquid chromatography tandem mass spectrometry

LCR    locus control region

MBS    MYB binding site

MIAME    minimum information about a microarray experiment

MEME    Multiple Em for Motif Elicitation

MHL    mannoheptulose

MS    Murashige Skoog

MSA    M phase-specific activator element

MUG    methylumbelliferone-glucuronide

NASC    Nottingham Arabidopsis Stock Centre

NBS    non-binding site of AtMYB61

PA    proanthocyanidins

PAL1    PHENYLALANINE AMMONIA-LYASE1

PBF    Pyrimidine-box Binding Factor

PCR    polymerase chain reaction

PDB    Protein Data Bank

PG    phenolic glycosides

PME    pectin-methylesterase

PHYRE    Protein Homology/analogY Recognition Engine

PLACE    PLAnt Cis-Element database

qRT-PCR    Quantitative, real-time, reverse transcriptase polymerase chain reaction

R    MYB repeat

RAmy1a    RICE ALPHA-AMYLASE

rmx    repressor of myb expression loss-of-function mutant

RMX    REPRESSOR OF MYB EXPRESSION

SBEI    STARCH-BRANCHING ENZYME I

SELEX    systematic evolution of ligands by exponential enrichment

SMH    single MYB histone group

SNP    sodium nitroprusside

Sus3    sucrose synthase 3

TAIR    The Arabidopsis Information Resource

TRANSFAC    Transcription Factor Database

TRFL    TRF1/2-LIKE genes

UACC    University of Toronto Animal Care Committee

UTR    untranslated regions

WBS    WER-binding site

WER    WEREWOLF

WT    wild-type


List of Tables

1 Introduction

1.1 DNA binding specificities of members of the MYB superfamily ..................................... 12

2 AtMYB61, an R2R3-MYB transcription factor, is a pleiotropic regulator of plant carbon acquisition and resource allocation

2.1 Genes that share transcript abundance profiles with AtMYB61 determined by Pearson correlation coefficient, across the AtGenExpress developmental baseline dataset. ......................................................................................................................... 44

2.2 Genes that share transcript abundance profiles with AtMYB61 determined by Pearson correlation coefficient, across the AtMYB61 microarray dataset ..................... 46

2.3 AC elements within the promoters of putative downstream targets. .............................. 50

3 Interactions between the R2R3-MYB transcription factor, AtMYB61, and target DNA binding sites

3.1 Alignment of AtMYB61 binding sites ............................................................................. 64

3.2 AtMYB61 consensus sequence was derived from a comparison of 89 sequences recovered from 5 cycles of CASTing ............................................................................. 65

3.3 Dissociation constants (Kd) in mol/L and associated errors of CASTing targets ........... 67

3.4 Dissociation constants (Kd) in mol/L and associated errors of mutated ACCTAC (AC1 element) sequences ............................................................................................ 68

S3.1 Relative binding of CASTing targets and mutated AC1 sequences to AtMYB61 .......... 80

4 Novel regulation of an R2R3-MYB transcription factor, AtMYB61, by a non-hexokinase sugar-signalling pathway

4.1 List of putative repressors of AtMYB61 expression (RMX) that bound AtMYB61 second intron repeat ................................................................................................... 105

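S4.1 AtMYB61 second intron repeat motif identified within all Arabidopsis thaliana genes ....................................................................................................................... 118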

S4.2 AtMYB61 second intron repeat motif identified within all Arabidopsis thaliana intergenic regions ....................................................................................................... 127

S4.3 AtMYB61 second intron repeat motif identified within all Arabidopsis thaliana introns and corresponding transcript response to sugar ......................................................... 129


List of Figures

1 Introduction

1.1 Schematic representation of an R2R3-MYB transcription factor...................................... 5

1.2 Phylogenetic relationships and subgroup designations for 87 MYB superfamily members ...................................................................................................................... 10

2 AtMYB61, an R2R3-MYB transcription factor, is a pleiotropic regulator of plant carbon acquisition and resource allocation

2.1 Transcript abundance of a subset of genes in the Arabidopsis thaliana transcriptome is influenced by the presence or absence of AtMYB61 activity ..................................... 43

2.2 AtMYB61 binds to the promoters of putative downstream targets, to motifs that are over-represented in these promoters and is sufficient to activate transcription from these motifs .................................................................................................................. 48

2.3 AtMYB61 binding to the 5′ non-coding sequences of the three putative target genes as determined by EMSA ............................................................................................... 51

2.4 AtMYB61 downstream target genes have an impact on secondary wall formation and xylem formation in secondary thickened hypocotyls ...................................................... 53

3 Interactions between the R2R3-MYB transcription factor, AtMYB61, and target DNA binding sites

3.1 Cyclic amplification and selection of targets (CASTing) recovered a suite of hexamer target sequences that bound to AtMYB61 ..................................................................... 63

3.2 Relative binding affinities of AtMYB61 to CASTing targets and to mutated ACCTAC motif determined by nitrocellulose filter-binding assays are confirmed by electrophoretic mobility shift assays (EMSAs) ............................................................... 70

3.3 Molecular modelling of AtMYB61 with target sequences confirm binding preferences determined by nitrocellulose filter-binding assays and EMSAs ..................................... 72

3.4 AtMYB61-mediated activation of promoter activity in Saccharomyces cerevisiae in an AC dependent fashion .................................................................................................. 74

3.5 Sequences recovered from the CASTing assay were found in all three promoter regions of predicted direct downstream targets of AtMYB61, namely KNOTTED1-like transcription factor (KNAT7, At1g62990); caffeoyl-CoA 3-O-methyltransferase (CCoAOMT7, At4g26220), and pectin-methylesterase (PME, At2g45220) ................... 77

S3.1 AtMYB61 antibody generation and validation .............................................................. 79

4 Novel regulation of an R2R3-MYB transcription factor, AtMYB61, by a non-hexokinase sugar-signalling pathway

4.1 Sugar regulation of AtMYB61 expression in dark-grown wild-type seedlings, 7 days post-germination ........................................................................................................... 92


4.2 Promoter-reporter and qRT-PCR analysis of AtMYB61 expression in response to sugars ........................................................................................................................... 93

4.3 qRT-PCR analysis of AtMYB61 and HXK-2 expression in wild-type (WT) and glucose insensitive (gin) loss-of-function mutants ...................................................................... 96

4.4 Analysis of AtMYB61 promoter-reporter fusion constructs that contain or do not contain AtMYB61 5′ intragenic sequences in response to sucrose ............................... 98

4.5 Phylogenetic footprinting identifies a conserved repeat motif in the second intron of AtMYB61 Brassicaceae homologues ............................................................................ 99

4.6 EMSA shows AtMYB61 second intron motif bound differentially by proteins in nuclear extracts from seedlings grown in the absence or presence of sucrose in the dark, consistent with the derepression model ...................................................................... 101

4.7 Affinity purification coupled with LC-MS/MS determines putative AtMYB61 repressor proteins that bound the second intron repeat .............................................................. 104

4.8 qRT-PCR of putative repressors of AtMYB61 expression loss-of-function mutants (rmx) that had AtMYB61 misexpression in seedlings grown in the absence of sucrose in the dark, validating the repressor hypothesis .......................................................... 107

4.9 Phenotypes of Arabidopsis thaliana wild-type (WT) plants, AtMYB61 loss-of-function mutants (atmyb61), AtMYB61 over-expressor mutants (35S::MYB61) and At2g43970 loss-of-function mutants (rmx3)................................................................................... 109

S4.1 Sequence alignment of the second intron of Brassicaceae AtMYB61 homologues .... 112

S4.2 Sequence alignment of AtMYB61 and AtMYB50 reveals no second intron repeat within AtMYB50 second intron .................................................................................... 113

S4.3 EMSA shows AtMYB61 second intron motif bound differentially by proteins in nuclear extracts from seedlings grown in the absence or presence of sucrose in the dark, consistent with the derepression model ............................................................. 114

S4.4 Validation of biotinylation of AtMYB61 second intron and second intron repeat ......... 115

S4.5 Semi-quantitative PCR of AtMYB61 expression in repressors of AtMYB61 expression loss-of-function mutant (rmx) seedlings grown in the absence or presence of sucrose in the dark .................................................................................. 116

S4.6 At2g43970 and At1g09540 share inverse transcript abundance profiles across development ............................................................................................................... 117

A Appendix. The wound-, pathogen-, and ultraviolet B-responsive MYB134 gene encodes an R2R3 MYB transcription factor that regulates a suite of genes involved in proanthocyanidin synthesis in Poplar

A.1 MYB134 binds to the promoters of putative downstream target genes ........................ 146


Chapter 1

Introduction

This chapter contains the following publication in its entirety:

Prouse M.B., and Campbell M.M. (2012) The interaction between MYB proteins and

their target DNA binding sites. Biochimica Et Biophysica Acta-Gene Regulatory

Mechanisms. 1819: 67-77.

Contributions: MBP, MMC designed research; MBP, MMC analyzed data; MBP, MMC

wrote and edited manuscript.

MBP contributed specifically to each figure and table in this chapter.

Copyright: Sections 1.1 to 1.6 inclusive are copyrighted by Elsevier B.V.


1. Introduction

1.1 Transcription Factors

In eukaryotic organisms, gene expression is subject to complex patterns of spatial and

temporal regulation. The first step of transcriptional regulation of any gene is

orchestrated by the activity of sequence-specific transcription factors, proteins that

function to reconfigure gene expression in response to external and internal cues.

Sequence-specific transcription factors frequently have a modular structure –

comprising a DNA-binding domain together with a transcriptional regulatory domain

(Colladovides et al., 1991). The DNA-binding domains of transcription factors are highly

conserved, while their transcriptional regulatory domains are variable (Schwechheimer

and Bevan, 1998). Sequence-specific transcription factors can act as transcriptional

activators, repressors, or both (Maniatis et al., 1987).

In eukaryotes, transcription factors that promote transcription are termed activator

proteins. Transcriptional activators can promote transcription of protein coding genes in

numerous ways. Activator proteins can bind a cognate target DNA site to directly or

indirectly recruit RNA polymerase II and general transcription factors (GTFs) that in turn

carry out transcription of a gene (Schwechheimer and Bevan, 1998; Lee and Young,

2000). Activator proteins can also affect the rate of transcription of a gene through

interactions with RNA polymerase II and GTFs (Lee and Young, 2000). Finally,

activator proteins can promote the acetylation of histone proteins making the DNA more

accessible for transcription (Cosma et al., 1999). Transcriptional activators accomplish

these tasks by directly or indirectly recruiting other proteins with this catalytic activity to

the DNA target.

Sequence-specific transcription factors that reduce transcription are transcriptional

repressors. These proteins act in three ways: (i) by binding to a cognate DNA site to

block the binding of general transcription factors or activators; (ii) by blocking

transcription by means of inhibitory interaction with general transcription factors or

activators; or (iii) by altering the higher-order DNA structure in a way to inhibit


transcription (HannaRose and Hansen, 1996). Repressors can reduce the rate of

transcription, or suppress it altogether.

Large families, or superfamilies of activator and repressor proteins have evolved in

eukaryotes. These are categorised based on the similarities of the DNA-binding

domain, with several such groups composed of one hundred or more members (Pabo

and Sauer, 1992; Yanhui et al., 2006). The MYB superfamily is one of the largest and

most diverse families of sequence-specific transcription factors (Rosinski and Atchley,

1998; Riechmann et al., 2000).

Much is known about the specifics of the interaction between animal MYB proteins and

their cognate DNA binding sites. By contrast, the knowledge of the details of MYB-DNA

interactions in plants is rather incomplete. This introduction will consider the current

state of knowledge with respect to MYB-DNA interactions in animals, and contrast this

with what is known in plants, suggesting means by which the gap in knowledge in plants

can be addressed. Moreover, this introduction will address how MYB proteins are

regulated to elicit their downstream responses.

1.2 The Nature of MYB Proteins

1.2.1 The MYB Transcription Factor Superfamily

The MYB superfamily is found in all major eukaryotic lineages, and is thought to be

more than 1 billion years old (Lipsick, 1996; Rosinski and Atchley, 1998; Kranz et al.,

2000; Wilkins et al., 2009). MYB proteins acquired their name from v-MYB, the

oncogenic component of avian myeloblastosis virus (AMV), where the sequence-

specific MYB domain was initially discovered (Peters et al., 1987). The cellular

counterpart of v-MYB is c-MYB, a MYB protein that plays a critical role in controlling the

proliferation and differentiation of hematopoietic cells (Mucenski et al., 1991). c-MYB

mutations that alter target gene expression drastically reduce the proliferation of

hematopoietic cells (Gewirtz and Calabretta, 1988). In keeping with this, homozygous

c-MYB knock-out lines of mice die before reaching day 15 of the fetal lifecycle due to

the inability to sustain hepatic erythropoiesis (Mucenski et al., 1991).


MYB superfamily members are characterised by a highly conserved DNA-binding

domain, referred to as the MYB domain, which consists of up to four imperfect amino

acid repeats (R1, R2, R3 and R4) of 50-53 amino acids (Fig. 1.1)(Rosinski and Atchley,

1998). Each of the MYB repeats, within the MYB domain, gives rise to a helix-helix-

turn-helix secondary structure (Fig. 1.1). The MYB domain is predominantly found

within the N-terminus of MYB-proteins (Fig.1.1)(Stracke et al., 2001); however, MYB

domains recently have also been discovered within the C-termini of MYB-proteins

(Linger and Price, 2009). Each MYB repeat consists of several highly conserved

tryptophan residues that are regularly spaced forming a hydrophobic core (Fig.

1.1)(Ogata et al., 1994). In contrast to the MYB domain, the C-terminal region of MYB

proteins is characteristically highly variable from one MYB protein to another, and

usually functions as either an activation or repression domain (Jin and Martin, 1999;

Kranz et al., 2000; Stracke et al., 2001; Jia et al., 2004). This gives rise to a wide range

of variability both structurally and functionally within the MYB superfamily.
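
The regular spacing of the conserved tryptophans can be illustrated with a short Python sketch. The spacing window of 18 to 19 residues between tryptophans is an assumption taken from commonly described MYB repeats, not a value stated in this chapter. The function name and the toy sequence are hypothetical and serve only as an illustration.

```python
import re

# Assumption (not from the text): three conserved tryptophans (W) per repeat,
# each separated by 18-19 residues of any type.
W_CORE = re.compile(r"W[A-Z]{18,19}W[A-Z]{18,19}W")

def find_tryptophan_cores(protein):
    """Return (start, matched segment) for each stretch with regularly spaced W residues."""
    return [(m.start(), m.group()) for m in W_CORE.finditer(protein.upper())]

# Synthetic toy sequence, NOT a real MYB protein: W-x19-W-x18-W flanked by two residues.
toy = "GS" + "W" + "E" * 19 + "W" + "K" * 18 + "W" + "LT"
for start, segment in find_tryptophan_cores(toy):
    print(start, len(segment))
```

A scan of this kind only flags candidate repeats. Confirming a MYB domain would still require alignment against characterised MYB repeats.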

In animals, the MYB superfamily is relatively small, generally comprising four or five

proteins (Lipsick, 1996; Konig et al., 1998; Rosinski and Atchley, 1998; Wong et al.,

1998). Animal MYB superfamily members regulate gene expression related to cell

division or a discrete subset of cellular differentiation events (Biedenkapp et al., 1988;

Golay et al., 1991; Howe and Watson, 1991). By contrast, the MYB superfamily in

plants has expanded dramatically, with 100-200 MYB family members commonly found

in individual plant species (Dubos et al., 2010). In plants, MYB proteins regulate a vast

array of biochemical, cellular and developmental processes (Martin and PazAres, 1997;

Jin and Martin, 1999; Dubos et al., 2010).

1.2.2 Animal MYB Proteins

As is the case with c-MYB, animal MYB superfamily members contain three MYB

repeats (Howe et al., 1990; Luscher and Eisenman, 1990; Ogata et al., 1994), although

there are some notable exceptions, including human SNAPc 190

and TRF1 (Konig et al., 1998; Wong et al., 1998). In all annotated vertebrate genomes,


Figure 1.1. Schematic representation of an R2R3-MYB transcription factor. The primary structure, secondary structure and protein-DNA model are indicated for an R2R3-MYB transcription factor. MYB proteins are classified depending on the number of adjacent MYB repeats (R). Each MYB repeat gives rise to a helix-helix-turn-helix secondary structure that is involved in sequence-specific binding. The model of an R2R3-MYB transcription factor binding to the major groove of its target sequence was generated with PyMOL. H, helix; T, turn; W, tryptophan; X, amino acid; red, helix secondary structure; green, turn secondary structure; yellow, DNA target.


there are only three MYB proteins with three MYB repeats: A-MYB, B-MYB, and c-MYB

(Lipsick, 1996; Rosinski and Atchley, 1998). A-MYB and B-MYB proteins are R1R2R3-

MYB nuclear transcription factors expressed in hematopoietic cells, epithelial cells, and

fibroblasts (Nomura et al., 1988). A-MYB negatively regulates cellular proliferation

(Golay et al., 1991), while B-MYB positively regulates cell growth control, differentiation,

and cancer (Sala and Watson, 1999).

1.2.3 Plant MYB Proteins

In comparison to animals, the MYB superfamily is greatly expanded in plants (Stracke et

al., 2001; Jia et al., 2004; Wilkins et al., 2009). For example, of the over 1600

sequence-specific transcription factors identified in the genome of the model

dicotyledonous plant, Arabidopsis thaliana, almost 10% are members of the MYB

transcription factor family (Riechmann et al., 2000; Dubos et al., 2010). In contrast to

animals, Arabidopsis thaliana has 5 three-repeat MYB proteins, and 126 two-repeat

(R2R3) MYB proteins (Martin and PazAres, 1997; Arabidopsis Genome, 2000;

Riechmann et al., 2000; Stracke et al., 2001; Yanhui et al., 2006; Dubos et al., 2010),

while the monocotyledonous plant rice (Oryza sativa) has 109 predicted R2R3-MYB

proteins (Yanhui et al., 2006). In addition, single-repeat MYBs have also been identified

in plants and animals in increasing numbers (Baranowskij et al., 1994; Carre and Kay,

1995; Feldbrugge et al., 1997; Konig and Rhodes, 1997; Schaffer et al., 1998; Koering

et al., 2000; Alabadi et al., 2001; Chen et al., 2001; Hwang et al., 2001; Nishikawa et al.,

2001; Lu et al., 2002; Mohrmann et al., 2002; Li and de Lange, 2003; Marian et al.,

2003; Maxwell et al., 2003; Court et al., 2005; Xue, 2005; Fukuzawa et al., 2006; Lira et

al., 2007; Ko et al., 2008; Liao et al., 2008; Pitt et al., 2008; Ehrenkaufer et al., 2009; Ko

et al., 2009; Rawat et al., 2009; Lang and Juan, 2010; Yi et al., 2010; Yu et al., 2010).

Although single-repeat MYB proteins have been identified in both animals and plants,

the majority of single repeat MYB proteins have not been characterised in plants.

As their name implies, R2R3-MYB proteins have two MYB repeats (Stracke et al.,

2001). R2R3-MYB proteins comprise the largest group of MYB transcription factors in

the MYB superfamily and appear to be specific to plants (Dubos et al., 2010). Plant

R2R3-MYB proteins regulate a myriad of processes, including primary and secondary


metabolism; regulation of cell fate and identity; regulation of plant development; and

responses to biotic and abiotic stresses (Pazares et al., 1987; Martin and PazAres,

1997; Glover et al., 1998; Jin and Martin, 1999; Martin et al., 2002; Patzlaff et al.,

2003a; Patzlaff et al., 2003b; Gomez-Maldonado et al., 2004; Jia et al., 2004; Liang et

al., 2005; Dubos et al., 2010). While analogous processes, such as regulation of cell

fate and identity, can be found in animals, the precise functions associated with R2R3-

MYB proteins appear to be plant specific (Martin and PazAres, 1997; Jin and Martin,

1999; Dubos et al., 2010).

1.2.4 Single MYB Repeat Proteins

Single MYB repeat proteins can be classified into the following two groups: 1) proteins

with the MYB domain at the C-terminus (Initiator Binding Protein (IBP) group), and 2) proteins

with the MYB domain at the N-terminus (Single MYB Histone (SMH) group). The IBP group

of proteins includes RTBP1 from rice, AtTRP1 and AtTBP1 from Arabidopsis thaliana

(Konig et al., 1998; Chen et al., 2001; Hwang et al., 2001), as well as the highly

characterized telomeric DNA-binding proteins TRF1, TRF2, RAP1 and Taz1. SMH

proteins are a novel group of single MYB proteins that have only been identified in

plants. The SMH group of proteins includes PcMYB1 from Petroselinum crispum, AtTRB1,

AtTRB2 and AtTRB3 from Arabidopsis thaliana, and Smh1 from maize. AtTRB1, AtTRB2

and AtTRB3 have been studied in detail; all share a single MYB repeat more similar to R2

than R1 and R3 (Marian et al., 2003). In Arabidopsis thaliana, single-repeat MYB

proteins CAPRICE (CPC), TRYPTICHON (TRY), ETC1 (ENHANCER OF TRY and

CPC) and ETC2 have been identified (Schellmann et al., 2002; Kirik et al., 2004).

1.2.5 Expansion and Diversification of the MYB Family

Two theories of how the MYB superfamily evolved have been constructed based on

parsimony (Lipsick, 1996). The first is formulated on the premise that three-repeat MYB

proteins are closely related to vertebrate c-MYB and other similar three-repeat MYB

proteins in other eukaryotic groups, such as ciliates and slime molds (Braun and

Grotewold, 1999; Yang et al., 2003b). These primitive proteins are predicted to have

existed before the divergence between animals and plants (Yang et al., 2003b). This


theory proposes that R2R3-MYB proteins originated recently from three-repeat MYB

proteins due to the loss of the R1 MYB repeat (Braun and Grotewold, 1999; Dias et al., 2003).

The second theory postulates that, within an ancient R2R3 predecessor, there was a

domain duplication and subsequent gain of R1, suggesting that R2R3 is a precursor of

MYB3R (Jiang et al., 2004a). Common to both theories, there was a vast expansion of

R2R3-MYB proteins in plants via duplications of entire genes (Lipsick, 1996); however,

the expansion was restricted for the three-repeat MYB proteins in both animals and

plants. Comparisons of DNA-binding specificities and functional roles between MYB

proteins with different repeats could help elucidate the nature of the evolutionary

pathway for MYB proteins.

1.3 DNA targets of MYB family members

1.3.1 Animal MYB DNA-Binding Sites

The DNA target of animal three-repeat MYB transcription factors was first determined

by isolation of chicken genomic DNA fragments bound by v-MYB on filters (Biedenkapp

et al., 1988) and by comparison of putative MYB binding sites within the SV40 enhancer

region (Nakagoshi et al., 1990). Binding-site selection methods with c-MYB protein

added minor extensions to the c-MYB consensus sequence. The c-MYB

consensus sequence was found to be (T/C)AAC(G/T)G(A/C/T)(A/C/T) and was termed

MYB binding site I (MBSI) (Howe et al., 1990; Weston, 1992). Mutational assays

validated by NMR structural data revealed that the MBSI sequence is bipartite. The

first half-site, (T/C)AAC, makes the majority of its specific contacts with R3, and the second

half-site, (G/T)G(A/C/T)(A/C/T), makes specific contacts with R2 (Tanikawa et al., 1993;

Ogata et al., 1994; Ording et al., 1994). Following identification of the c-MYB DNA-

binding site, mammalian A-MYB and B-MYB were subsequently shown to bind MBSI

(Mizuguchi et al., 1990; Watson et al., 1993; Ma and Calabretta, 1994; Jin and Martin,

1999).
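
The bipartite nature of MBSI lends itself to a small worked example. The Python sketch below scans a DNA sequence on both strands for the MBSI consensus and reports the two half-sites separately. The function names, the dictionary layout and the example sequence are hypothetical. Scanning the reverse complement is an assumption made for illustration and is not a method described in this chapter.

```python
import re

# MBSI consensus from the text, (T/C)AAC(G/T)G(A/C/T)(A/C/T), written as two
# named groups: the R3-contacting half-site and the R2-contacting half-site.
# The lookahead allows overlapping sites to be reported.
MBSI = re.compile(r"(?=(?P<r3_half>[TC]AAC)(?P<r2_half>[GT]G[ACT][ACT]))")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_mbsi(seq):
    """List MBSI matches on both strands.

    'offset' is the 0-based start within the strand that was scanned, so
    minus-strand offsets refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, scanned in (("+", seq), ("-", reverse_complement(seq))):
        for m in MBSI.finditer(scanned):
            hits.append({
                "strand": strand,
                "offset": m.start(),
                "r3_half": m.group("r3_half"),
                "r2_half": m.group("r2_half"),
            })
    return hits

# Made-up sequence for illustration only.
for hit in find_mbsi("GGTAACGGTCAA"):
    print(hit)
```

Read on the opposite strand, the first six positions of MBSI, (T/C)AACNG, correspond to CNGTT(A/G), which is why both orientations are scanned in this sketch.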


1.3.2 Plant MYB DNA-Binding Sites

Although R1R2R3-MYB proteins in plants share the same functionality as animal

R1R2R3-MYB family members, their DNA-binding specificities are different (Howe and

Watson, 1991; Weston, 1992; Ito, 2005). All three characterised animal three-repeat

MYB proteins bind to the same sequence MBSI ((T/C)AAC(G/T)G(A/C/T)(A/C/T)) and

have similar functions in cell-cycle control (Biedenkapp et al., 1988; Golay et al., 1991;

Howe and Watson, 1991). In comparison, plant three-repeat MYB proteins, such as

tobacco MYBA1, MYBA2, and MYBB have an important role at the G2/M phase of the

cell-cycle, by regulating transcription of cyclin B and other cell-cycle genes that are

expressed at a similar time in the cell-cycle (Ito et al., 1998). Through a yeast one-

hybrid screen, NtMYBA1, NtMYBA2, and NtMYBB were found to bind to AACGG. This

consensus sequence is known as the M phase-specific activator (MSA) element, and

was identified previously in tobacco.

Relatively few of the possible plant R2R3-MYB DNA targets have been characterised;

but some common elements of plant MYB-DNA interactions have emerged (Fig. 1.2,

Table 1.1). Recognition of plant MYB DNA targets was first determined with studies

conducted on the Maize P protein, an R2R3-MYB protein involved in flavonoid

biosynthesis (Grotewold et al., 1994). Through binding-site selection assays and

EMSAs, P was shown to bind to ACC(A/T)ACC(A/C/T). This contrasted with the animal

MYB DNA consensus sequence of ((T/C)AAC(G/T)G(A/C/T)(A/C/T)), but was a

harbinger for the majority of plant MYB proteins, which recognise MBSI

((T/C)AAC(G/T)G(A/C/T)(A/C/T)), MBSII (AGTTAGTTA), and MBSIIG

((C/T)ACC(A/T)A(A/C)C). Nevertheless, it is important to note that not all plant MYB

proteins, especially within the R2R3-MYB family, recognise these motifs (Romero et al.,

1998). Many R2R3-MYB transcription factors recognise AC elements, DNA motifs that

are enriched in adenosine and cytosine residues (Grotewold et al., 1994; Sablowski et

al., 1994; Sablowski et al., 1995; Moyano et al., 1996; Sainz et al., 1997; Uimari and

Strommer, 1997; Tamagnone et al., 1998; Jin et al., 2000; Sugimoto et al., 2000; Yang

et al., 2001; Patzlaff et al., 2003a; Patzlaff et al., 2003b; Fukuzawa et al., 2006). Some

R2R3-MYB proteins function as transcriptional activators at these sites (Patzlaff et al.,

2003a; Patzlaff et al., 2003b), while others function as transcriptional repressors


Figure 1.2. Phylogenetic relationships and subgroup designations for 87 MYB superfamily members. The unrooted phylogenetic tree was generated using the amino acid sequences of the MYB proteins in Table 1.1. Whole MYB protein sequences were downloaded from The Arabidopsis Information Resource (TAIR; http://Arabidopsis.org) and from the National Center for Biotechnology Information protein database (NCBI Entrez; http://www.ncbi.nlm.nih.gov/sites/entrez). The phylogenetic analysis included 9 three-repeat MYB proteins (R1R2R3-MYB proteins), 50 two-repeat MYB proteins (R2R3-MYB proteins) and 28 one-repeat MYB proteins (R1-MYB proteins). The full-length amino acid sequences were aligned using Multiple Alignment using Fast Fourier Transform (MAFFT) with the G-INS-i algorithm (Katoh et al., 2005). A neighbour-joining tree was constructed using Molecular Evolutionary Genetics Analysis 4 (MEGA 4) (Tamura et al., 2007) with the Jones-Taylor-Thornton substitution model and a Gamma parameter of 1.0 to account for the uneven rates of substitution across the length of the MYB proteins. Pairwise gap deletion was used, along with 1000 bootstrap replicates. DNA-binding sites for MYB proteins were obtained from the literature. MYB proteins are annotated by colour based on DNA sequence recognition. Red, blue, green, orange, purple and grey represent MYB proteins that bind CNGTT(A/G), ACC(A/T)A(A/C), TTAGGG, AAAATATCT, GATA and TATCCA respectively. Black represents MYB proteins that do not bind to an assigned group. N indicates adenosine, guanine, cytosine or thymine. * indicates that the MYB protein DNA-binding specificity differs slightly from the consensus sequence of its group. Refer to Table 1.1 for specific details on DNA sequences bound by the MYB proteins.
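
The grouping by DNA sequence recognition described in this caption can be sketched in Python. The example below converts the bracketed consensus notation into regular expressions and reports which group consensus sequences occur in a concrete binding site. The function names are hypothetical, and matching on either strand is an assumption made for illustration. It is not the procedure used to annotate the figure.

```python
import re

# Group consensus sequences as listed in the Figure 1.2 caption.
GROUP_CONSENSUS = [
    "CNGTT(A/G)", "ACC(A/T)A(A/C)", "TTAGGG",
    "AAAATATCT", "GATA", "TATCCA",
]

def consensus_to_regex(consensus):
    """Convert '(A/T)' alternatives to character classes; N matches any base."""
    pattern = re.sub(
        r"\(([ACGT/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )
    return pattern.replace("N", "[ACGT]")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def assign_groups(site):
    """Return every group consensus found in a concrete site (no brackets or N) on either strand."""
    site = site.upper()
    strands = (site, reverse_complement(site))
    return [
        consensus for consensus in GROUP_CONSENSUS
        if any(re.search(consensus_to_regex(consensus), s) for s in strands)
    ]

# Binding sites taken from Table 1.1 (At MYB1 and At MYB6).
print(assign_groups("CAGTTA"))
print(assign_groups("ACCTACCA"))
```

An empty list corresponds to the black, unassigned category. Sites marked with * in Table 1.1 may also return an empty list, because they deviate from the group consensus.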


Table 1.1. DNA binding specificities of members of the MYB superfamily. The information in the table represents the current state of knowledge pertaining to the DNA targets of MYB proteins, as determined from the literature. N indicates adenosine, guanine, cytosine or thymine. * indicates that the MYB protein DNA-binding specificity differs slightly from the consensus sequence of its group.

Group | MYB Protein | Binding Site | Species | MYB Repeat | References
1. CNGTT(A/G) | *p85 | AACGGT | Drosophila melanogaster | R1R2R3 | Beall et al., 2002
 | | GCAGTTT | | |
 | At MYB1 | CAGTTA | Arabidopsis thaliana | R2R3 | Urao et al., 1993
 | At MYB2 | CAGTTA | Arabidopsis thaliana | R2R3 | Urao et al., 1993
 | | AAACCA | | | Hoeren et al., 1998
 | *AtMYB46 | GTT(A/T)GTT(A/G) | Arabidopsis thaliana | R2R3 | Ramirez et al., 2011
 | At MYB66 or WER | CNGTT(A/G)G | Arabidopsis thaliana | R2R3 | Koshino-Kimura et al., 2005; Ryu et al., 2005
 | | AGTAGTTA | | |
 | At MYB77 | AAAAAACGGTTA | Arabidopsis thaliana | R2R3 | Romero et al., 1998
 | *At MYB98 | ANGTTAC | Arabidopsis thaliana | R2R3 | Punwani et al., 2007; Punwani et al., 2008
 | *At MYBGL1 | AAAGTTAGTTA | Arabidopsis thaliana | R2R3 | Oppenheimer et al., 1991; Telfer et al., 1997
 | DUO1 | CGGTTA | Arabidopsis thaliana | R2R3 | Borg et al., 2011
 | Eh MYB10 | CCGTTA | Entamoeba histolytica | R2R3 | Menese et al., 2010
 | gMYB2 | CTGT(A/T)G | Giardia lamblia | R2R3 | Sun et al., 2002
 | | GTTT(G/T)(G/T) | | | Yang et al., 2003
 | | CTGTTG | | | Huang et al., 2008
 | | CTGTAG | | |
 | | CAGTAG | | |
 | | GTGTAG | | |
 | GmMYB76 | AAAAAACCGTTATA | Glycine max | R2R3 | Liao et al., 2008
 | | ATCCTTTTTTCCGG | | |
 | GmMYB92 | AAAAAACCGTTATA | Glycine max | R2R3 | Liao et al., 2008
 | | ATCCTTTTTTCCGG | | |
 | | GGTAGGTGAGA | | |
 | GmMYB177 | AAAAAACCGTTATA | Glycine max | R1 | Liao et al., 2008
 | | ATCCTTTTTTCCGG | | |
 | Hv MYBGa | GTTTGTTA | Hordeum vulgare | R2R3 | Gubler et al., 1995
 | Nt MYB1 | CAGTT(A/G) | Nicotiana tabacum | R2R3 | Yang and Klessig, 1996
 | | G(G/T)T(A/T)GGT(A/G) | | |
 | Nt MYBAS1 | GCNGTT(A/G) | Nicotiana tabacum | R2R3 | Yang et al., 2001
 | NtMYBJS1 | AACAACCAC | Nicotiana tabacum | R2R3 | Galis et al., 2006
 | | ACCAACCCC | | |
 | GAMYB | TAACCACC | Oryza sativa | R2R3 | Chen et al., 2006
 | | ATTCAGTTA | Oryza sativa | R2R3 | Aya et al., 2009
 | *OsMYB5 | TGTT | Oryza sativa | R2R3 | Suzuki et al., 1998
 | MYB.Ph3 | A(A/G/T)(A/G/T)C(C/G)GTTA | Petunia hybrida | R2R3 | Solano et al., 1997
 | | AGTTAGTTA | | |
 | PsMYB26 | AAAAAACGGTTA | Pisum sativum | R2R3 | Uimari and Strommer, 1997
 | | AAAAGTTAGGTTA | | |
 | PiMyb2R1 | CNGTTG | Phytophthora infestans | R2R3 | Xiang et al., 2010
 | v-MYB | (A/G/T)(A/G/T)C(A/C)GTT(A/G) | Avian myeloblastosis virus | R1R2R3 | Howe et al., 1990; Weston et al., 1992
 | Dd MYB | CNGTT(A/G) | Dictyostelium discoideum | R1R2R3 | Stobergrasser et al., 1992
 | c-MYB | (A/G/T)(A/G/T)C(A/C)GTT(A/G) | Gallus gallus domesticus | R1R2R3 | Howe et al., 1990; Weston et al., 1992
 | A-MYB | AACCGTTA | Homo sapiens | R1R2R3 | Ma and Calabretta, 1994
 | B-MYB | GTCAGTTA | Mus musculus | R1R2R3 | Watson et al., 1993
 | Nt MYBA1 | T(A/G)(A/G)CCGTT(A/G)GA | Nicotiana tabacum | R1R2R3 | Ito et al., 2001
 | Nt MYBA2 | T(A/G)(A/G)CCGTT(A/G)GA | Nicotiana tabacum | R1R2R3 | Ito et al., 2001
 | Nt MYBB | T(A/G)(A/G)CCGTT(A/G)GA | Nicotiana tabacum | R1R2R3 | Ito et al., 2001
2. ACC(A/T)A(A/C) | PcMYB1 | AACCTAAC | Petroselinum crispum | R1 | Feldbrugge et al., 1997
 | At MYB6 | ACCTACCA | Arabidopsis thaliana | R2R3 | Li and Parish, 1995
 | At MYB7 | ACCTACCA | Arabidopsis thaliana | R2R3 | Li and Parish, 1995
 | AtMYB13 | (T/C)ACC(A/T)AAC | Arabidopsis thaliana | R2R3 | Sugimoto et al., 2000
 | At MYB15 | CACCTA(A/C)CG | Arabidopsis thaliana | R2R3 | Romero et al., 1998
 | AtMYB58 | ACCTACC | Arabidopsis thaliana | R2R3 | Zhou et al., 2009
 | | ACCAACC | | |
 | | ACCTAAC | | |
 | AtMYB63 | ACCTACC | Arabidopsis thaliana | R2R3 | Zhou et al., 2009
 | | ACCAACC | | |
 | | ACCTAAC | | |
 | At MYB84 | CACCTA(A/C)CG | Arabidopsis thaliana | R2R3 | Romero et al., 1998
 | AtMYB85 | ACCTACC | Arabidopsis thaliana | R2R3 | Zhong et al., 2008
 | | ACCTAAC | | |
 | Am MYB305 | (C/T)ACC(A/T)A(A/C)C | Antirrhinum majus | R2R3 | Sablowski et al., 1994; Moyano et al., 1996; Romero et al., 1998
 | | (C/T)AAC(A/T)AAC | | |

Group MYB Protein Binding Site Species MYB REPEAT References

1. CNGTT(A/G) *p85 AACGGT Drosophila melanogaster R1R2R3 Beall et al., 2002

GCAGTTT

At MYB1 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993

At MYB2 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993

AAACCA Hoeren et al., 1998

*AtMYB46 GTT(A/T)GTT(A/G) Arabidopsis thaliana R2R3 Ramirez et al., 2011

At MYB66 or WER CNGTT(A/G)G Arabidopsis thaliana R2R3 Koshino-Kimura et al., 2005; Ryu et al., 2005

AGTAGTTA

At MYB77 AAAAAACGGTTA Arabidopsis thaliana R2R3 Romero et al., 1998

*At MYB98 ANGTTAC Arabidopsis thaliana R2R3 Punwani et al., 2007; Punwani et al., 2008

*At MYBGL1 AAAGTTAGTTA Arabidopsis thaliana R2R3 Oppenheimer et al., 1991; Telfer et al., 1997

DUO1 CGGTTA Arabidopsis thaliana R2R3 Borg et al., 2011

Eh MYB10 CCGTTA Entamoeba histolytica R2R3 Menese et al., 2010

gMYB2 CTGT(A/T)G Giardia lamblia R2R3 Sun et al., 2002

GTTT(G/T)(G/T) Yang et al., 2003

CTGTTG Huang et al., 2008

CTGTAG

CAGTAG

GTGTAG

GmMYB76 AAAAAACCGTTATA Glycine max R2R3 Liao et al., 2008

ATCCTTTTTTCCGG

GmMYB92 AAAAAACCGTTATA Glycine max R2R3 Liao et al., 2008

ATCCTTTTTTCCGG

GGTAGGTGAGA

GmMYB177 AAAAAACCGTTATA Glycine max R1 Liao et al., 2008

ATCCTTTTTTCCGG

Hv MYBGa GTTTGTTA Hordeum vulgare R2R3 Gubler et al., 1995

Nt MYB1 CAGTT(A/G) Nicotiana tabacum R2R3 Yang and Klessig, 1996

G(G/T)T(A/T)GGT(A/G)

Nt MYBAS1 GCNGTT(A/G) Nicotiana tabacum R2R3 Yang et al., 2001

NtMYBJS1 AACAACCAC Nicotiana tabacum R2R3 Galis et al., 2006

ACCAACCCC

GAMYB TAACCACC Oryza sativa R2R3 Chen et al., 2006

ATTCAGTTA Oryza sativa R2R3 Aya et al., 2009

*OsMYB5 TGTT Oryza sativa R2R3 Suzuki et al., 1998

MYB.Ph3 A(A/G/T)(A/G/T)C(C/G)GTTA Petunia hybrida R2R3 Solano et al., 1997

AGTTAGTTA

PsMYB26 AAAAAACGGTTA Pisum sativum R2R3 Uimari and Strommer, 1997

AAAAGTTAGGTTA

PiMyb2R1 CNGTTG Phytophthora infestans R2R3 Xiang et al., 2010

v-MYB (A/G/T)(A/G/T)C(A/C)GTT(A/G) Avian myeloblastosis virus R1R2R3 Howe et al., 1990; Weston et al., 1992

Dd MYB CNGTT(A/G) Dictyostelium discoideum R1R2R3 Stobergrasser et al., 1992

c-MYB (A/G/T)(A/G/T)C(A/C)GTT(A/G) Gallus gallus domesticus R1R2R3 Howe et al., 1990; Weston et al., 1992

A-MYB AACCGTTA Homo sapien R1R2R3 Ma and Calabretta, 1994

B-MYB GTCAGTTA Mus musculus R1R2R3 Watson et al., 1993

Nt MYBA1 T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001

Nt MYBA2 T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001

Nt MYBB T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001

2.ACC(A/T)A(A/C) PcMYB1 AACCTAAC Petroselinum crispum R1 Feldbrugge et al., 1997

At MYB6 ACCTACCA Arabidopsis thaliana R2R3 Li and Parish, 1995

At MYB7 ACCTACCA Arabidopsis thaliana R2R3 Li and Parish, 1995

AtMYB13 (T/C)ACC(A/T)AAC Arabidopsis thaliana R2R3 Sugimoto et al., 2000

At MYB15 CACCTA(A/C)CG Arabidopsis thaliana R2R3 Romero et al, 1998

AtMYB58 ACCTACC Arabidopsis thaliana R2R3 Zhou et al., 2009

ACCAACC

ACCTAAC

AtMYB63 ACCTACC Arabidopsis thaliana R2R3 Zhou et al., 2009

ACCAACC

ACCTAAC

At MYB84 CACCTA(A/C)CG Arabidopsis thaliana R2R3 Romero et al, 1998

AtMYB85 ACCTACC Arabidopsis thaliana R2R3 Zhong et al., 2008

ACCTAAC

Am MYB305 (C/T)ACC(A/T)A(A/C)C Antirrhinum majus R2R3 Sablowski et al., 1994; Moyano et al., 1996;Romero et al., 1998

(C/T)AAC(A/T)AAC

Group MYB Protein Binding Site Species MYB REPEAT References

1. CNGTT(A/G) *p85 AACGGT Drosophila melanogaster R1R2R3 Beall et al., 2002

GCAGTTT

At MYB1 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993

At MYB2 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993

AAACCA Hoeren et al., 1998

*AtMYB46 GTT(A/T)GTT(A/G) Arabidopsis thaliana R2R3 Ramirez et al., 2011

At MYB66 or WER CNGTT(A/G)G Arabidopsis thaliana R2R3 Koshino-Kimura et al., 2005; Ryu et al., 2005

AGTAGTTA

At MYB77 AAAAAACGGTTA Arabidopsis thaliana R2R3 Romero et al., 1998

*At MYB98 ANGTTAC Arabidopsis thaliana R2R3 Punwani et al., 2007; Punwani et al., 2008

*At MYBGL1 AAAGTTAGTTA Arabidopsis thaliana R2R3 Oppenheimer et al., 1991; Telfer et al., 1997

DUO1 CGGTTA Arabidopsis thaliana R2R3 Borg et al., 2011

Eh MYB10 CCGTTA Entamoeba histolytica R2R3 Menese et al., 2010

gMYB2 CTGT(A/T)G Giardia lamblia R2R3 Sun et al., 2002

GTTT(G/T)(G/T) Yang et al., 2003

CTGTTG Huang et al., 2008

CTGTAG

CAGTAG

GTGTAG

GmMYB76 AAAAAACCGTTATA Glycine max R2R3 Liao et al., 2008

ATCCTTTTTTCCGG

GmMYB92 AAAAAACCGTTATA Glycine max R2R3 Liao et al., 2008

ATCCTTTTTTCCGG

GGTAGGTGAGA

GmMYB177 AAAAAACCGTTATA Glycine max R1 Liao et al., 2008

ATCCTTTTTTCCGG

Hv MYBGa GTTTGTTA Hordeum vulgare R2R3 Gubler et al., 1995

Nt MYB1 CAGTT(A/G) Nicotiana tabacum R2R3 Yang and Klessig, 1996

G(G/T)T(A/T)GGT(A/G)

Nt MYBAS1 GCNGTT(A/G) Nicotiana tabacum R2R3 Yang et al., 2001

NtMYBJS1 AACAACCAC Nicotiana tabacum R2R3 Galis et al., 2006

ACCAACCCC

GAMYB TAACCACC Oryza sativa R2R3 Chen et al., 2006

ATTCAGTTA Oryza sativa R2R3 Aya et al., 2009

*OsMYB5 TGTT Oryza sativa R2R3 Suzuki et al., 1998

MYB.Ph3 A(A/G/T)(A/G/T)C(C/G)GTTA Petunia hybrida R2R3 Solano et al., 1997

AGTTAGTTA

PsMYB26 AAAAAACGGTTA Pisum sativum R2R3 Uimari and Strommer, 1997

AAAAGTTAGGTTA

PiMyb2R1 CNGTTG Phytophthora infestans R2R3 Xiang et al., 2010

v-MYB (A/G/T)(A/G/T)C(A/C)GTT(A/G) Avian myeloblastosis virus R1R2R3 Howe et al., 1990; Weston et al., 1992

Dd MYB CNGTT(A/G) Dictyostelium discoideum R1R2R3 Stobergrasser et al., 1992

c-MYB (A/G/T)(A/G/T)C(A/C)GTT(A/G) Gallus gallus domesticus R1R2R3 Howe et al., 1990; Weston et al., 1992

A-MYB AACCGTTA Homo sapien R1R2R3 Ma and Calabretta, 1994

B-MYB GTCAGTTA Mus musculus R1R2R3 Watson et al., 1993

Nt MYBA1 T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001

Nt MYBA2 T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001

Nt MYBB T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001

2.ACC(A/T)A(A/C) PcMYB1 AACCTAAC Petroselinum crispum R1 Feldbrugge et al., 1997

At MYB6 ACCTACCA Arabidopsis thaliana R2R3 Li and Parish, 1995

At MYB7 ACCTACCA Arabidopsis thaliana R2R3 Li and Parish, 1995

AtMYB13 (T/C)ACC(A/T)AAC Arabidopsis thaliana R2R3 Sugimoto et al., 2000

At MYB15 CACCTA(A/C)CG Arabidopsis thaliana R2R3 Romero et al, 1998

AtMYB58 ACCTACC Arabidopsis thaliana R2R3 Zhou et al., 2009

ACCAACC

ACCTAAC

AtMYB63 ACCTACC Arabidopsis thaliana R2R3 Zhou et al., 2009

ACCAACC

ACCTAAC

At MYB84 CACCTA(A/C)CG Arabidopsis thaliana R2R3 Romero et al, 1998

AtMYB85 ACCTACC Arabidopsis thaliana R2R3 Zhong et al., 2008

ACCTAAC

Am MYB305 (C/T)ACC(A/T)A(A/C)C Antirrhinum majus R2R3 Sablowski et al., 1994; Moyano et al., 1996;Romero et al., 1998

(C/T)AAC(A/T)AAC

Group MYB Protein Binding Site Species MYB REPEAT References

1. CNGTT(A/G) *p85 AACGGT Drosophila melanogaster R1R2R3 Beall et al., 2002

GCAGTTT

At MYB1 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993

At MYB2 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993

AAACCA Hoeren et al., 1998

*AtMYB46 GTT(A/T)GTT(A/G) Arabidopsis thaliana R2R3 Ramirez et al., 2011

Table 1.1 continued.

Group MYB Protein Binding Site Species MYB REPEAT References

1. CNGTT(A/G) *p85 AACGGT Drosophila melanogaster R1R2R3 Beall et al., 2002

GCAGTTT

At MYB1 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993

At MYB2 CAGTTA Arabidopsis thaliana R2R3 Urao et al., 1993

AAACCA Hoeren et al., 1998

*AtMYB46 GTT(A/T)GTT(A/G) Arabidopsis thaliana R2R3 Ramirez et al., 2011

At MYB66 or WER CNGTT(A/G)G Arabidopsis thaliana R2R3 Koshino-Kimura et al., 2005; Ryu et al., 2005

AGTAGTTA

At MYB77 AAAAAACGGTTA Arabidopsis thaliana R2R3 Romero et al., 1998

*At MYB98 ANGTTAC Arabidopsis thaliana R2R3 Punwani et al., 2007; Punwani et al., 2008

*At MYBGL1 AAAGTTAGTTA Arabidopsis thaliana R2R3 Oppenheimer et al., 1991; Telfer et al., 1997

DUO1 CGGTTA Arabidopsis thaliana R2R3 Borg et al., 2011

Eh MYB10 CCGTTA Entamoeba histolytica R2R3 Menese et al., 2010

gMYB2 CTGT(A/T)G Giardia lamblia R2R3 Sun et al., 2002

GTTT(G/T)(G/T) Yang et al., 2003

CTGTTG Huang et al., 2008

CTGTAG

CAGTAG

GTGTAG

GmMYB76 AAAAAACCGTTATA Glycine max R2R3 Liao et al., 2008

ATCCTTTTTTCCGG

GmMYB92 AAAAAACCGTTATA Glycine max R2R3 Liao et al., 2008

ATCCTTTTTTCCGG

GGTAGGTGAGA

GmMYB177 AAAAAACCGTTATA Glycine max R1 Liao et al., 2008

ATCCTTTTTTCCGG

Hv MYBGa GTTTGTTA Hordeum vulgare R2R3 Gubler et al., 1995

Nt MYB1 CAGTT(A/G) Nicotiana tabacum R2R3 Yang and Klessig, 1996

G(G/T)T(A/T)GGT(A/G)

Nt MYBAS1 GCNGTT(A/G) Nicotiana tabacum R2R3 Yang et al., 2001

NtMYBJS1 AACAACCAC Nicotiana tabacum R2R3 Galis et al., 2006

ACCAACCCC

GAMYB TAACCACC Oryza sativa R2R3 Chen et al., 2006

ATTCAGTTA Oryza sativa R2R3 Aya et al., 2009

*OsMYB5 TGTT Oryza sativa R2R3 Suzuki et al., 1998

MYB.Ph3 A(A/G/T)(A/G/T)C(C/G)GTTA Petunia hybrida R2R3 Solano et al., 1997

AGTTAGTTA

PsMYB26 AAAAAACGGTTA Pisum sativum R2R3 Uimari and Strommer, 1997

AAAAGTTAGGTTA

PiMyb2R1 CNGTTG Phytophthora infestans R2R3 Xiang et al., 2010

v-MYB (A/G/T)(A/G/T)C(A/C)GTT(A/G) Avian myeloblastosis virus R1R2R3 Howe et al., 1990; Weston et al., 1992

Dd MYB CNGTT(A/G) Dictyostelium discoideum R1R2R3 Stobergrasser et al., 1992

c-MYB (A/G/T)(A/G/T)C(A/C)GTT(A/G) Gallus gallus domesticus R1R2R3 Howe et al., 1990; Weston et al., 1992

A-MYB AACCGTTA Homo sapiens R1R2R3 Ma and Calabretta, 1994

B-MYB GTCAGTTA Mus musculus R1R2R3 Watson et al., 1993

Nt MYBA1 T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001

Nt MYBA2 T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001

Nt MYBB T(A/G)(A/G)CCGTT(A/G)GA Nicotiana tabacum R1R2R3 Ito et al., 2001

2.ACC(A/T)A(A/C) PcMYB1 AACCTAAC Petroselinum crispum R1 Feldbrugge et al., 1997

At MYB6 ACCTACCA Arabidopsis thaliana R2R3 Li and Parish, 1995

At MYB7 ACCTACCA Arabidopsis thaliana R2R3 Li and Parish, 1995

AtMYB13 (T/C)ACC(A/T)AAC Arabidopsis thaliana R2R3 Sugimoto et al., 2000

At MYB15 CACCTA(A/C)CG Arabidopsis thaliana R2R3 Romero et al., 1998

AtMYB58 ACCTACC Arabidopsis thaliana R2R3 Zhou et al., 2009

ACCAACC

ACCTAAC

AtMYB63 ACCTACC Arabidopsis thaliana R2R3 Zhou et al., 2009

ACCAACC

ACCTAAC

At MYB84 CACCTA(A/C)CG Arabidopsis thaliana R2R3 Romero et al., 1998

AtMYB85 ACCTACC Arabidopsis thaliana R2R3 Zhong et al., 2008

ACCTAAC

Am MYB305 (C/T)ACC(A/T)A(A/C)C Antirrhinum majus R2R3 Sablowski et al., 1994; Moyano et al., 1996; Romero et al., 1998

(C/T)AAC(A/T)AAC

Am MYB308 (C/T)ACC(A/T)A(A/C)C Antirrhinum majus R2R3 Tamagnone et al., 1998

Am MYB340 (C/T)ACC(A/T)A(A/C)C Antirrhinum majus R2R3 Moyano et al., 1996

DcMYB1 ACC(A/T)(A/T)CC Daucus carota R2R3 Maeda et al., 2005

Eg MYB1 (C/T)ACC(A/T)A(A/C)C Eucalyptus gunnii R2R3 Legay et al., 2007

Eg MYB2 CACCTACC Eucalyptus gunnii R2R3 Goicoechea et al., 2005

TACCTAAC

NlMYB305 TCACCTAAC Nicotiana langsdorffii R2R3 Liu et al., 2009

GCACCTAAT

NtMYB2 ATCTCACCTACCA Nicotiana tabacum R2R3 Sugimoto et al., 2000

PtMYB1 ACCTACC Pinus taeda R2R3 Patzlaff et al., 2003b

ACCAACC

ACCTAAC

PtMYB4 ACCTACC Pinus taeda R2R3 Patzlaff et al., 2003a

ACCAACC

ACCTAAC

Pt MYB134 ACCTAC Populus tremuloides R2R3 Mellway et al., 2009

*Le MYBI TCTAATCTCATCC Solanum lycopersicum R2R3 Rose et al., 1999

ZmMYB31 ACC(T/A)ACC Zea mays R2R3 Fornale et al., 2010

Zm MYBC1 A(A/C)C(A/T)A(A/C)C Zea mays R2R3 Sainz et al., 1997

GTT(A/T)GTT(A/G)

ZmMYB-IF35 ACC(A/T)ACC(A/C/T) Zea mays R2R3 Heine et al., 2007

Zm P ACC(A/T)ACC(A/C/T) Zea mays R2R3 Grotewold et al., 1994

3.TTAGGG At TBP1 TTTAGGG Arabidopsis thaliana R1 Hwang et al., 2001

At TRP1 TTTAGGG Arabidopsis thaliana R1 Chen et al., 2001

hTRF1 TTTAGGG Homo sapiens R1 Nishikawa et al., 2001; Court et al., 2005

Rap1 TTAGGG Homo sapiens R1 Li et al., 2003

LaTBP1 TTTAGGG Leishmania amazonensis R1 Lira et al., 2007

TGTGTGGG

Ng TRF1 TTTAGGG Nicotiana glutinosa R1 Ko et al., 2008

RTBP1 TTTAGGG Oryza sativa R1 Ko et al., 2009

*Tbf1p TAGGGTTGG Saccharomyces cerevisiae R1 Koering et al., 2000

Smhl TTTAGGG Zea mays R1 Marian et al., 2003

Rap1 TTAGGG Saccharomyces cerevisiae R2R3 Konig and Rhodes, 1997

ACA(C/T)CCCAT(C/T) Lascaris et al., 1999

ACACCC(A/G)(C/T)ACA(C/T)(A/C) Lieb et al., 2001

4.AAAATATCT *CCA1 AA(A/C)AATCT Arabidopsis thaliana R1 Carre et al., 1995

LHY AAAATATCT Arabidopsis thaliana R1 Schaffer et al., 1998

RVE1 AAAATATCT Arabidopsis thaliana R1 Maxwell et al., 2003; Rawat et al., 2009

RVE2 AAAATATCT Arabidopsis thaliana R1 Maxwell et al., 2003; Rawat et al., 2009

TOC1 AAAATATCT Arabidopsis thaliana R1 Alabadi et al., 2001

5.GATA MYBSt1 GGATA Solanum tuberosum R1 Baranowskij et al., 1994

TaMYB80 AGATAC Triticum aestivum R1 Xue et al., 2005

GGAATATNC

Tv MYB1 ANAACGATA Trichomonas vaginalis R2R3 Ong et al., 2006

TAACGA

TATCGT

Tv MYB2 CGATA Trichomonas vaginalis R2R3 Ong et al., 2007

TATCGTC

6.TATCCA Os MYBS1 TATCCA Oryza sativa R1 Lu et al., 2002

Os MYBS2 TATCCA Oryza sativa R1 Lu et al., 2002

Os MYBS3 TATCCA Oryza sativa R1 Lu et al., 2002

7.Miscellaneous Ca Rap1 GGTGT Candida albicans R1 Yu et al., 2010

GGATG

dd MYBE CACCCCAC Dictyostelium discoideum R1 Fukuzawa et al., 2006

Adf-1 (G(C/T)(C/T))x4 Drosophila funebris R1 Lang et al., 2010

Zeste (T/C/G)GAGTG(A/G/C) Drosophila melanogaster R1 Mohrmann et al., 2002

Eh Mybdr CCCCCC Entamoeba histolytica R1 Ehrenkaufer et al., 2009

Gm MYB176 TAGT(A/T)(A/T) Glycine max R1 Yi et al., 2010

Tbf1 ACAGGGTT Schizosaccharomyces pombe R1 Pitt et al., 2008

At MYBCDC5 CTCAGCG Arabidopsis thaliana R2R3 Hirayama and Shinozaki, 1996

(Jin et al., 2000). Compendia of plant MYB DNA-binding sites can be found in

databases such as The Arabidopsis Gene Regulatory Information Server (AGRIS)

(http://arabidopsis.med.ohio-state.edu/) and the Transcription Factor Database

(TRANSFAC) (http://www.gene-regulation.com/pub/databases.html). These databases

contain many of the MYB DNA-binding sites reported in the literature, most of which

have been experimentally validated, and all of which are reported here (Fig. 1.2, Table

1.1). Plant MYB DNA-binding sites were determined on a protein-by-protein basis

(Luscher and Eisenman, 1990; Ramsay et al., 1991; Urao et al., 1993; Baranowskij et

al., 1994; Grotewold et al., 1994; Sablowski et al., 1994; Gubler et al., 1995; Li and

Parish, 1995; Moyano et al., 1996; Yang and Klessig, 1996; Sainz et al., 1997; Solano

et al., 1997; Uimari and Strommer, 1997; Romero et al., 1998; Suzuki et al., 1998;

Tamagnone et al., 1998; Wang and Tobin, 1998; Rose et al., 1999; Chen et al., 2001;

Ito et al., 2001; Yang et al., 2001; Patzlaff et al., 2003a; Patzlaff et al., 2003b; Koshino-

Kimura et al., 2005; Heine et al., 2007; Legay et al., 2007; Punwani et al., 2007; Liao et

al., 2008; Aya et al., 2009; Ko et al., 2009; Liu et al., 2009; Mellway et al., 2009), and

generally reside approximately 500 bp upstream of the transcriptional start site (Fig. 1.2,

Table 1.1).
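
The notation used for binding sites in Table 1.1 can be made concrete with a short script. The following is a minimal Python sketch, not a method taken from the cited studies: it converts a degenerate site such as CNGTT(A/G) into a regular expression and scans both strands of a sequence for matches. The function names (site_to_regex, scan_promoter) and the promoter fragment are hypothetical and serve only as an illustration.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def site_to_regex(site):
    """Translate Table 1.1 notation, e.g. 'CNGTT(A/G)', into a regex string."""
    pattern = []
    # Each token is either a parenthesised set of alternatives or a single letter.
    for token in re.findall(r"\([ACGT/]+\)|[ACGTN]", site.upper()):
        if token.startswith("("):
            pattern.append("[" + token[1:-1].replace("/", "") + "]")
        elif token == "N":
            pattern.append("[ACGT]")
        else:
            pattern.append(token)
    return "".join(pattern)


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def scan_promoter(seq, site):
    """Return sorted (start, strand, match) tuples; start is 0-based on the forward strand."""
    seq = seq.upper()
    # The lookahead allows overlapping matches to be reported.
    regex = re.compile("(?=(" + site_to_regex(site) + "))")
    hits = [(m.start(), "+", m.group(1)) for m in regex.finditer(seq)]
    for m in regex.finditer(reverse_complement(seq)):
        # Convert the reverse-strand coordinate back to the forward strand.
        start = len(seq) - m.start() - len(m.group(1))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Hypothetical promoter fragment, for illustration only.
promoter = "TTGACAGTTAGCATACCTACCAAT"
group1_hits = scan_promoter(promoter, "CNGTT(A/G)")
group2_hits = scan_promoter(promoter, "ACC(A/T)A(A/C)")
```

In practice the input would be the region upstream of a gene's transcriptional start site, and a match would indicate only a candidate site that still requires experimental validation.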

1.3.3 The DNA Targets of Single MYB Repeat Proteins

In contrast to two- and three-repeat MYB proteins, single-repeat MYB proteins bind

predominantly to the telomeric sequence TTAGGG and display similar sequence

identity; however, not all single-repeat MYB proteins bind this sequence and, moreover,

they do not share the same functional roles (Table 1.1). Single-repeat MYB proteins

are involved in telomere binding and circadian clock regulation (Martin and Paz-Ares,

1997). These functionalities have been conserved during the evolution of yeast,

animals and plants (Bilaud et al., 1996; Lipsick, 1996).

Both C-terminal and N-terminal single-MYB repeat proteins bind to the double-stranded

DNA of the telomeric repeat TTTAGGG. AtTRB1, AtTRB2 and AtTRB3 all bind telomeric

DNA containing a minimum of two repeats, (TTTAGGG)2. In this regard,

these single-MYB repeat proteins are divergent from R2R3-MYB and R1R2R3-MYB

proteins both in terms of the primary sequence of the MYB domains, and also,

consistent with their divergent DNA-binding domain, in terms of their cognate DNA

target binding sites. By contrast, some single-repeat MYB proteins seem to bind to

DNA targets that coincide with those of R2R3-MYB proteins (Feldbrugge et al., 1997; Lu et

al., 2002; Liao et al., 2008). In this regard, they function as competitors for the same

DNA targets.

The rice single-MYB repeat proteins OsMYBS1, OsMYBS2, and OsMYBS3 can form

dimers to bind to the sequence TATCCA with different binding affinities, as determined

by EMSAs with competition (Lu et al., 2002). Mutational assays showed that

nucleotides CCA are more important for OsMYBS1 and OsMYBS3 binding than the

TAT nucleotides. In contrast, the TAT sequence seems to be more important for OsMYBS2

binding. Moreover, all three of these MYB proteins alter alpha-amylase gene

expression. OsMYBS1 had a higher transactivation ability than OsMYBS2 and

OsMYBS3. OsMYBS3 acted as a transcriptional repressor in both yeast and barley

aleurone cells. These results demonstrate differential binding affinities and

transactivation ability of three MYB proteins for the same target DNA sequence.

By competing with each other, single-repeat MYB proteins and R2R3-MYB proteins

provide a means to fine-tune the expression of genes whose regulatory regions

contain the sites of competition (Konig and Rhodes, 1997; Lu et al.,

2002; Liao et al., 2008).
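
How competition for a shared site could tune expression can be illustrated with a simple equilibrium model. The sketch below assumes 1:1, mutually exclusive binding of two proteins to one site and uses free protein concentrations; it ignores dimer formation and cooperativity. All concentrations and dissociation constants are hypothetical, since the studies cited above report relative rather than absolute affinities for these proteins.

```python
def competitive_occupancy(conc_a, kd_a, conc_b, kd_b):
    """Fractional occupancy of one DNA site by two competing proteins, A and B.

    Assumes simple 1:1 binding, mutually exclusive occupancy and free
    (unbound) protein concentrations, all in the same units as the Kd values.
    """
    a = conc_a / kd_a
    b = conc_b / kd_b
    denom = 1.0 + a + b
    return a / denom, b / denom


# Hypothetical values: an activator that binds tightly and a repressor that binds weakly.
theta_activator, theta_repressor = competitive_occupancy(
    conc_a=10e-9, kd_a=5e-9,
    conc_b=10e-9, kd_b=20e-9,
)
print(f"activator: {theta_activator:.2f}, repressor: {theta_repressor:.2f}")
```

Under this model, changing either the relative abundance or the relative affinity of the two proteins shifts the share of sites held by the activator, which is one way the fine-tuning described above could arise.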

1.4 The Nature of DNA-Binding by MYB Proteins

1.4.1 Relationship Between the MYB DNA-Binding Domain and DNA-

Binding Specificity

The MYB superfamily has been categorised based both on the number of MYB repeats

and on the amino acid sequence of the MYB domain (Stracke et al., 2001; Jia et al.,

2004). In other families of transcription factors, overall sequence conservation is low

and variability in DNA-binding specificity is high (Treisman et al., 1992; Klug and

Schwabe, 1995). Contrary to this, members of the plant R2R3-MYB family share higher

amino acid sequence similarity, especially in their recognition helices, and display

similar DNA-recognition patterns (Romero et al., 1998). These similarities in recognition

specificity are heightened between members of the same phylogenetic group.

R2R3-MYB family members from different species have been previously classified into

different phylogenetic clades (group A, B, and C) based on sequence similarities

(Romero et al., 1998). These clades were then analysed for DNA-binding specificities.

It was shown that members of group A bind the MBS type I sequence

(C(A/C/G/T)GTT(A/G)), members of group B bind equally to both MBS type I and type II

(G(G/T)T(A/T)GTT(A/G)), and most members of group C bind MBS type IIG

((C/T)ACC(A/T)A(A/C)C). For example, AtMYB6 and AtMYB7 are both members of

group C and share 90% amino acid sequence identity (Romero et al., 1998). AtMYB6

and AtMYB7 both bind to the MBS type IIG sequence (Li and Parish, 1995)(Fig. 1.2,

Table 1.1).
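
The three MBS types described above can be written as regular expressions and used to label a given site. The following minimal Python sketch takes the consensus sequences directly from the text; the function name classify_site is hypothetical. It tests only the strand as written, which matters because the reverse complement of a type IIG site resembles a type II site.

```python
import re

# Consensus sequences as given in the text, rewritten as regular expressions.
MBS_TYPES = {
    "type I": "C[ACGT]GTT[AG]",       # C(A/C/G/T)GTT(A/G)
    "type II": "G[GT]T[AT]GTT[AG]",   # G(G/T)T(A/T)GTT(A/G)
    "type IIG": "[CT]ACC[AT]A[AC]C",  # (C/T)ACC(A/T)A(A/C)C
}


def classify_site(seq):
    """Return the MBS types whose consensus occurs in seq (given strand only)."""
    seq = seq.upper()
    return [name for name, pattern in MBS_TYPES.items() if re.search(pattern, seq)]


for example in ("CAGTTA", "GGTAGTTA", "CACCTAAC"):
    print(example, classify_site(example))
```

A site that returns an empty list matches none of the three consensus sequences on the given strand, and a site may in principle return more than one type if it is long enough to contain several motifs.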

Well-characterised DNA-binding sites can be extracted from the literature for 87

proteins from the MYB superfamily (Fig. 1.2, Table 1.1). Characterisation of DNA

targets was derived from both in vivo and in vitro protein-DNA-binding assays (see

captions for Fig. 1.2 and Table 1.1). DNA-binding sites for these proteins can be

categorised into seven groups based on DNA-binding specificities. Examination of the

protein-sequence-similarity-derived phylogenetic relationships between 87 MYB

proteins reveals that, in general, MYB proteins that share similar protein sequences bind to

similar DNA sequences (Fig. 1.2, Table 1.1). That is, in general, similar protein structure

implies similar DNA-binding sequences recognised by MYB proteins; however, there

are instances in the phylogenetic tree and in other studies where this is not the case (Li

and Parish, 1995)(Fig. 1.2, Table 1.1).

Members of the MYB superfamily do not always share similar DNA-binding sites based

on similar structure. Although Romero et al. had shown a correlation between MYB

protein structure and DNA-binding specificity, there were MYB family members that

could not be predicted (Romero et al., 1998). For example, Group C MYB family

members generally prefer type IIG sequences; however, two Group C MYB proteins,

AtMYB2 and AtMYBGL1, bound DNA with different patterns. AtMYB2 bound to type I

sequences (Urao et al., 1993), and AtMYBGL1 bound only to type II sequence (Romero

et al., 1998). These examples show that the DNA-binding sites of MYB proteins can generally

be predicted; however, there are examples of MYB proteins similar in structure and

function that bind to different cognate DNA target sites. These exceptions highlight

the importance of conducting DNA-binding site experiments for individual MYB proteins.

1.4.2 Involvement of MYB Repeats in DNA Binding

Both R2 and R3-MYB repeats are necessary for DNA binding, either by R1R2R3-MYB

or R2R3-MYB proteins (Ogata et al., 1994). Neither R2 nor R3 can alone bind DNA

specifically (Ogata et al., 1994). This implies that the R2 and R3 repeats bind

cooperatively to their cognate DNA target sequence. The resolved structure of the c-

MYB-DNA complex has shown that the C-terminal recognition helices of R2 and

R3 contact each other directly prior to sequence-specific binding (Ogata et al.,

1994; Ogata et al., 1995; Tahirov et al., 2001; Tahirov et al., 2002). Furthermore, the

phosphate backbone interacts simultaneously with the amino acids in both the R2 and

R3 repeats to aid in DNA-binding.

As R1 is not necessary for the specific recognition of DNA target sequences, both

R1R2R3 and R2R3-MYB proteins bind DNA in a similar manner. By contrast, single-

repeat MYB proteins, which only contain one MYB DNA-binding repeat, bind DNA in a

different manner than R1R2R3 and R2R3-MYB proteins (Hwang et al., 2001). This first

became clear when S. cerevisiae Rap1 was found to contain two MYB repeats in its

MYB-DNA-binding domain and its orthologous MYB counterpart, Homo sapiens RAP1,

only contained one MYB repeat (Konig and Rhodes, 1997). It was subsequently found

that S. cerevisiae Rap1 binds DNA as a monomer because it contains two MYB

repeats. In contrast, Homo sapiens RAP1 contains only one MYB repeat and does not

bind DNA directly; instead, it tethers itself to TRF2 to reach its DNA targets.

1.4.3 The Nature of DNA Binding By Animal MYB Proteins

The nature of DNA binding by any MYB protein has been most extensively examined

using c-MYB and its cognate target, MBSI. Mutational studies on c-MYB have shown

that R1 can be deleted without significant loss of DNA-binding ability, and that both R2

and R3 are essential for MYB-DNA recognition and binding (Anton and Frampton, 1988;

Sakura et al., 1989; Frampton et al., 1991). Although R1 is not involved in the direct

recognition of DNA sequence motifs, it does enhance the stability of DNA binding by the

R2R3 repeats without significantly altering the DNA-R2R3 conformation (Tanikawa et

al., 1993; Ogata et al., 1994).

c-MYB-DNA interactions were validated structurally with the resolution of the NMR

solution structures and X-ray crystal structures of c-MYB DNA-binding domain in the

free and DNA-bound states (Ogata et al., 1994; Ogata et al., 1995; Tahirov et al., 2001;

Tahirov et al., 2002). The third (C-terminal) helix of each of R2 and R3 was

subsequently found to act as the recognition helix (Fig. 1.1) (Ogata et al., 1994). In

keeping with this, the recognition helix of R3 interacts with the core of the DNA

consensus sequence ((T/C)AAC), while the recognition helix of R2 interacts less

specifically with nucleotides surrounding the core recognition motif

((G/T)G(A/C/T)(A/C/T)) (Ogata et al., 1995). The binding of R2 and R3 to their consensus

sequence ((T/C)AAC(G/T)G(A/C/T)(A/C/T)) widens the major groove and causes a

bend in the local helical axis (Ogata et al., 1994). Several interhelical interactions occur

between the helices of R2 and R3, stabilizing the MYB-DNA interaction. Moreover, R2

and R3 bound the major groove continuously, similar to transcription factor IIIA (TFIIIA)-

type Zn fingers (Pavletich and Pabo, 1991; Fairall et al., 1993; Pavletich and Pabo,

1993). Contrary to TFIIIA-type Zn fingers, the recognition helices of c-MYB R2 and R3

are more closely packed together in the major groove. This type of direct interaction

between the recognition helices from different DNA-binding units is unique among

DNA-binding domain complexes (Ogata et al., 1994).
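
The degenerate c-MYB consensus quoted above stands for a finite set of concrete sequences. As a minimal Python sketch (the function name expand_consensus is hypothetical), the alternatives at each position can be expanded to list every sequence the consensus covers.

```python
import re
from itertools import product


def expand_consensus(site):
    """List every concrete sequence covered by notation such as '(T/C)AAC(G/T)G'."""
    tokens = re.findall(r"\([ACGT/]+\)|[ACGT]", site.upper())
    # A parenthesised token contributes several choices; a plain letter contributes one.
    choices = [t[1:-1].split("/") if t.startswith("(") else [t] for t in tokens]
    return ["".join(combo) for combo in product(*choices)]


sites = expand_consensus("(T/C)AAC(G/T)G(A/C/T)(A/C/T)")
print(len(sites), "sequences, for example", sites[0])
```

The number of sequences is the product of the number of alternatives at each position, here 2 x 2 x 3 x 3. Such a list says nothing about relative affinity; it only enumerates what the consensus permits.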

Not all amino acid residues within the DNA-binding site of transcription factors partake

in DNA recognition and binding. Within the MYB protein family, certain key residues are

critical for these tasks (Ogata et al., 1994; Solano et al., 1997). For example, for c-

MYB, the three key base contacts are governed by residues Lys128 (R2), Lys182 (R3),

and Asn183 (R3), which are found to be fully conserved in all known animal and plant

MYB proteins (Ogata et al., 1994; Ogata et al., 1995).

Each MYB DNA-binding domain contains several conserved regularly spaced

tryptophan residues that participate in a hydrophobic cluster (Anton and Frampton,

1988; Saikumar et al., 1990). Through mutational and structural studies on c-MYB, this

hydrophobic cluster was determined to be essential for both the stability of MYB-protein

interaction and for sequence-specific binding to its consensus sequence

((T/C)AAC(G/T)G(A/C/T)(A/C/T)). Mutational and structural studies on animal c-MYB

have aided in providing critical knowledge on the molecular mechanisms behind MYB-

DNA interactions. Moreover, these studies allow one to generate testable hypotheses

on MYB-DNA interactions in other organisms where orthologous MYB proteins reside.

A cysteine residue located in the DNA recognition helix of R2 has remained completely

conserved in animals, fungi, and plants during the evolution of MYB domains (Heine et

al., 2004). R1R2R3-MYB domains have a single cysteine residue (Cys130) that is

included in the hydrophobic core. Cys130 of c-MYB needs to be reduced to allow for

sequence-specific DNA-binding. When reduced, Cys130 accomplishes this by

structurally stabilising the three helices of the R2-MYB repeat during sequence-specific

DNA-binding (Graesser et al., 1992; Guehmann et al., 1992; Melcher, 2000). In

contrast, most R2R3-MYB domains contain two cysteine residues (Cys49 and Cys53)

with the equivalent position as Cys130 in R1R2R3 MYB (Heine et al., 2004).

c-MYB has been extensively studied with regards to dynamics of DNA binding

(Tanikawa et al., 1993; Ebneth et al., 1994; Ogata et al., 1994; Ogata et al., 1995). The

c-MYB R2R3-domain was shown to bind tightly to the MYB binding site

((T/C)AAC(G/T)G(A/C/T)(A/C/T)) with a binding constant of 1.5 × 10^-9 M (± 28%) (Tanikawa et

al., 1993; Ebneth et al., 1994). Mutational analyses have shown that specific residues

within the R2R3-MYB repeats of c-MYB bound to specific nucleotides with different

affinities (Tanikawa et al., 1993). High affinity interactions within the R2R3-MYB

repeats of c-MYB are unevenly distributed across the MYB-binding site,

AACTGAC. The first adenosine, the third cytosine, and the fifth guanine are involved in

high affinity binding with c-MYB, in which any base substitutions reduce the binding

affinity by more than 500-fold in comparison to binding to an unmutated MYB-binding

site sequence. In contrast to this, the interaction with the second adenosine is involved

in lower affinity binding, with an affinity reduction in the range of 6 to 15-fold when

subjected to base change. The seventh cytosine shows an interesting interaction in that

only guanine substitution abolishes the binding affinity. Taken together, these affinity data

show that the second and third MYB-repeats cover the AACTGAC region from the

major groove of DNA in an orientation that allows the third MYB-repeat to cover the core

AAC sequence. Moreover, the results show that the third MYB-repeat recognises the

core AAC sequence with high binding affinity; however, the second repeat recognises

the GAC sequence with lower binding affinity.
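
The fold reductions in affinity reported above can be expressed as binding free-energy penalties using the standard relation ΔΔG = RT ln(fold reduction). The following minimal Python sketch applies it to the quoted values; the temperature of 298.15 K is an assumption, as the assay temperature is not stated here, and the function name ddg_from_fold is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_fold(fold_reduction, temperature_k=298.15):
    """Free-energy penalty (kcal/mol) for a given fold reduction in binding affinity."""
    return R_KCAL * temperature_k * math.log(fold_reduction)


# Fold reductions quoted in the text for substitutions in AACTGAC.
for label, fold in (("500-fold", 500), ("6-fold", 6), ("15-fold", 15)):
    print(f"{label}: {ddg_from_fold(fold):.2f} kcal/mol")
```

On this scale the high-affinity positions cost more than 3.5 kcal/mol when substituted, whereas the second adenosine costs roughly 1 to 1.6 kcal/mol, which restates the difference between the two classes of contact in energetic terms.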

MYB-DNA kinetic studies also found that mutating the R1 repeat does not affect the

DNA recognition of c-MYB but does affect the stability of the MYB-DNA complex.

Furthermore, the N-terminal acidic activation region upstream of the first MYB repeat

was found to reduce the binding affinity by interfering with R1 binding to DNA. NMR, X-

ray crystallography, and surface plasmon resonance studies have validated these c-

MYB-DNA binding kinetic results (Ogata et al., 1994; Ogata et al., 1995; Oda et al.,

1999). Further studies on c-MYB DNA affinity indicated that when c-MYB binds DNA,

the orientations of R2 and R3 are immobilised by sequence-specific binding and their

conformations are slightly changed. No significant conformational changes occur in R1

during MYB DNA-binding, further emphasising that R1 is not involved in DNA-binding

site recognition (Ogata et al., 1995).

In a comparison between the binding kinetics of the three vertebrate MYB proteins (A-,

B- and c-MYB), both A- and c-MYB bound the MYB recognition site with similar binding

constants and specificity; however, B-MYB formed DNA-protein complexes of lower

stability, dissociated rapidly under competitive conditions and showed less tolerance to

DNA-binding site variations (Bergholtz et al., 2001). These studies on animal MYB

proteins have granted insight into the molecular mechanisms behind MYB-DNA

interactions in general because R2R3-MYB proteins bind DNA in a similar fashion

(Ogata et al., 1994; Solano et al., 1997).

The kinetics of single-repeat animal MYB proteins binding to their cognate DNA sequences

have also been examined. For example, the human single-repeat MYB protein TRF1

binds the telomeric sequence TTAGGG with high affinity (Kd =

3.2 ± 0.5 × 10–9 M) and specificity as a monomer (Konig et al., 1998). The recorded

DNA binding affinity lies in the range of various homeodomains that also bind

specifically to DNA as monomers (Affolter et al., 1990; Florence et al., 1991; Ades and

Sauer, 1994; Carra and Privalov, 1997). Although the interaction of TRF1 and the

telomeric sequence is specific, both specificity and affinity increase significantly when TRF1 binds as a

homodimer (Bianchi et al., 1997).

1.4.4 The Nature of DNA Binding By Plant MYB Proteins

To date, some of the specifics of plant MYB interaction with target DNA have relied on

model-building based on the c-MYB binding to DNA, as no crystal structure has been

generated yet for any plant multi-repeat MYB protein. For example, the R2R3-domain of

PAP1/AtMYB75 was modelled according to the known structural data of c-MYB

(Zimmermann et al., 2004). A conserved amino acid signature found within several

MYB proteins was hypothesised to predict new MYB/BHLH interactions for Arabidopsis

thaliana proteins. Consistent with this hypothesis, analysis of the predicted 3D model of

PAP1/AtMYB75 showed that the amino acids of the conserved motif are surface-

exposed on helices 1 and 2 of the R3 repeat, forming hydrophobic and charged residue

patterns (Zimmermann et al., 2004). These surface-exposed amino acids are thought

to stabilise the protein-protein interactions (Zimmermann et al., 2004). This model was

validated by mutational assays (Zimmermann et al., 2004).

The Petunia MYB.Ph3 structure was also modelled after c-MYB. MYB.Ph3, a plant

R2R3-MYB transcription factor involved in the regulation of flavonoid biosynthetic

pathway in petunia flowers (Avila et al., 1993; Sablowski et al., 1994; Solano et al.,

1995), shows divergence in binding specificity compared to c-MYB (Solano et al., 1997).

MYB.Ph3 can bind two types of sites: MBSI ((T/C)AAC(G/T)G(A/C/T)(A/C/T)) and

MBSII (AGTTAGTTA) (Solano et al., 1997; Romero et al., 1998). Modeling predicted

that a single residue substitution in the R2 repeat of MYB.Ph3 (Leu71→Glu) would

change its DNA recognition to that of c-MYB, and the reciprocal substitution in c-MYB,

Glu132→Leu, would change c-MYB specificity to that of MYB.Ph3 (Solano et al., 1997).

This model was experimentally validated via mutational assays. Even though it was

previously found that these residues do not directly bind DNA (Ogata et al., 1994), the

MYB.Ph3 Leu71 and c-MYB Glu132 residues interact with residues that do interact with

DNA, enabling them to impact DNA-specificity indirectly (Solano et al., 1997). By

contrast, other studies had found that P and v-MYB DNA-binding domains, which are

conserved among animal and plant MYB domains, are necessary for the high affinity

DNA-binding activity of these proteins to their respective DNA target sites but are not

sufficient for their unique DNA-binding site recognition of P and v-MYB (Williams and

Grotewold, 1997). Furthermore, Williams and Grotewold (1997) found that chimeric

MYB domains have novel DNA-binding specificities. Resolution of these differences will

require crystal or solution structures for plant MYB proteins.

As is the case with c-MYB, both Cys49 and Cys53 are thought to be essential for the

DNA-binding or transcriptional activity of plant MYB proteins, forming an intramolecular

disulfide bond with each other under non-reducing conditions. This disulfide bond has

been hypothesised to impair DNA binding under non-reducing conditions, causing

R2R3-MYB proteins to be functionally active only under reducing conditions. Consistent

with this, the same two cysteines are conserved in the R2-MYB repeat of the R2R3-

MYB protein WEREWOLF (WER) (Koshino-Kimura et al., 2005). WER cannot bind to

its DNA-binding sites within its downstream target promoter regions without the addition

of dithiothreitol (a reducing agent). The dithiothreitol is thought to abolish the disulfide

bond, leading to the sequence specific binding of WER to its downstream targets. Nitric

oxide (NO) was shown to modify the DNA-binding activity of AtMYB2 by a

posttranslational modification of its conserved Cys53 (Serpa et al., 2007). AtMYB2

bound to the core binding site AAACCA in an EMSA; however, the addition of NO

donors, such as SNP (sodium nitroprusside) and GSNO (S-nitrosoglutathione), inhibited

sequence specific binding of AtMYB2. The NO-mediated inhibitory effect was reversed

by DTT, demonstrating that sequence specific DNA-binding of AtMYB2 is inhibited by S-

nitrosylation of Cys53 as a result of NO action. The role of cysteine residues in MYB

proteins illustrates the divergence of DNA binding mechanisms between animal

and plant MYB proteins. Despite some similarities in DNA binding, given the

divergence of target DNA-binding sites of R2R3-MYB proteins relative to R1R2R3-MYB

proteins, it follows that the residues critical for DNA recognition and binding within the

binding site of many plant MYB proteins differ from those of animal MYB proteins.

These examples illustrate the inherent flexibility of DNA recognition by the MYB

superfamily of transcription factors: a single residue change can alter the

DNA recognition by a particular MYB protein.

Despite the vast knowledge of plant MYB transcription factor function at the gross

morphological level, little is known about the dynamics of MYB protein-DNA

interactions. Nevertheless, some general themes regarding plant MYB-DNA binding

kinetics are emerging (Solano et al., 1997; Lu et al., 2002; Liao et al., 2008). Most plant

MYB proteins display considerable inherent flexibility in their ability to recognise target

sites (Fig. 1.2, Table 1.1). For example, Petunia protein MYB.Ph3 bound to both MBSI

and MBSII sites with the same affinity, inducing similar DNA-bending/distortions in both

cases (Solano et al., 1997). Affinities for these two plant MYB binding sites vary among

other plant MYB proteins; however, certain MYB proteins have been shown to only bind

one of these sequences (Urao et al., 1993; Grotewold et al., 1994; Sablowski et al.,

1994; Gubler et al., 1995; Li and Parish, 1995; Moyano et al., 1996). The maize R2R3-

MYB C1 protein bound to its target sequences in the a1 (dihydroflavonol reductase)

promoter (Sainz et al., 1997). In EMSAs, the affinity of binding was

reduced by mutations in the C1 DNA-binding domain or in the a1 sequences recognised

and bound by C1. Maize transient assays determined that C1 directly activated the a1

gene. The two C1 binding sites were also bound by the maize P

protein. One of the sites (ACC(A/T)ACC) was bound with higher affinity by P (Kd = 52

± 4 × 10–9 M) relative to C1 (Kd = 330 ± 50 × 10–9 M). In contrast, the other site

(AACTACCGG) is bound with similar low affinities by P (Kd = 860 ± 150 × 10–9 M) and

C1 (Kd = 780 ± 70 × 10–9 M). These results allow a greater understanding of the

mechanism behind the anthocyanin biosynthetic pathway in maize. In another example,

all three Soy-MYB proteins, GmMYB76, GmMYB92, and GmMYB177, bound to the

MBSI sequence (Liao et al., 2008). GmMYB92 could also bind sequences MRE4

(TCTCACCTACC) and mMRE1 (CCGGAAAAAAGGAT). Unlike GmMYB92,

GmMYB76 and GmMYB177 bound to the mMRE1 sequence with weak affinity.
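
To make the reported dissociation constants easier to compare, the sketch below computes the fold difference in affinity between P and C1 and the expected site occupancy under a simple 1:1 binding model. The Kd values are those quoted above (in nM); the 100 nM protein concentration and the function names are assumptions chosen only for illustration.

```python
# Kd values (nM) for maize P and C1 on the two a1 promoter sites, as quoted in the text.
KD_NM = {
    ("P", "ACC(A/T)ACC"): 52.0,
    ("C1", "ACC(A/T)ACC"): 330.0,
    ("P", "AACTACCGG"): 860.0,
    ("C1", "AACTACCGG"): 780.0,
}

def fold_difference(kd_weak_nm: float, kd_tight_nm: float) -> float:
    """How many times tighter the second binder is than the first."""
    return kd_weak_nm / kd_tight_nm

def fraction_bound(protein_nm: float, kd_nm: float) -> float:
    """Occupancy for simple 1:1 binding, assuming free protein is in excess of DNA."""
    return protein_nm / (protein_nm + kd_nm)

p_vs_c1 = fold_difference(KD_NM[("C1", "ACC(A/T)ACC")], KD_NM[("P", "ACC(A/T)ACC")])
print(f"P binds ACC(A/T)ACC about {p_vs_c1:.1f}-fold more tightly than C1")

ASSUMED_PROTEIN_NM = 100.0  # hypothetical concentration, not taken from the source
for (protein, site), kd in KD_NM.items():
    occupancy = fraction_bound(ASSUMED_PROTEIN_NM, kd)
    print(f"{protein} on {site}: {occupancy:.2f} of sites occupied")
```

Under this model a site is half occupied when the free protein concentration equals Kd, which is why the roughly sixfold difference at ACC(A/T)ACC matters most at protein concentrations between the two Kd values.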

It is important to note that, while the aforementioned studies have provided profoundly

useful insights into plant R2R3-MYB interactions with DNA sequences, they also

provide a rather incomplete picture of the specific interactions that are possible. Given

the sheer number of plant MYB proteins, the correspondingly large number of

downstream DNA targets for these proteins, and the breadth of processes controlled by

the MYB family members in plants, the complexity of plant MYB-DNA interactions

characterised to date is the tip of the proverbial iceberg. Clearly, there is a need for

more extensive analysis of these important interactions. One might expect considerable

inroads to be made in the future, with the emergence of new technologies to probe

DNA-protein interactions.

1.5 Future of Plant MYB-DNA Interaction Studies

1.5.1 Determining the Breadth of MYB DNA Targets in vitro

The identification of in vitro MYB DNA-binding sequences in a rapid and high-

throughput manner is required in the future to identify all variants of their DNA targets.

Transcription factors, including MYB proteins, are promiscuous in that they can

interact with, and initiate transcription from, multiple target sequences (Solano et al., 1997;

Patzlaff et al., 2003a; Meijsing et al., 2009). Well-established protocols based on

recombinant MYB transcription factor DNA-binding domains have been used to enrich

for target sequences from libraries of random DNA sequences (Howe and Watson,

1991; Weston, 1992; Grotewold et al., 1994; Jackson et al., 2001). These experiments

include cyclic amplification and selection of targets (CASTing) (Wright et al., 1991) and

systematic evolution of ligands by exponential enrichment (SELEX) (Roche et al.,

1992). Both of these procedures have determined numerous MYB in vitro DNA binding

motifs for several MYB transcription factors, and their underlying principles can now be

scaled to accommodate high-throughput approaches. Microarray based technologies,

such as protein-binding microarrays, have been developed to identify transcription

factor sequence specificities (Seong and Choi, 2003; Mukherjee et al., 2004; Berger et

al., 2006; Kim et al., 2009). Binding sites identified by this technology have correlated

with in vivo transcription factor-bound DNA sequences identified by ChIP experiments

(Mukherjee et al., 2004; Badis et al., 2009; Grove et al., 2009).
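
Selection experiments such as CASTing and SELEX end with a pool of enriched sequences that must be summarised as a motif. The sketch below shows one common way to do this, assuming the selected sites have already been aligned to equal length; the sequences are hypothetical and the function names are illustrative.

```python
from collections import Counter

def position_frequency_matrix(sites: list[str]) -> list[Counter]:
    """Count the bases observed at each position of a set of aligned sites."""
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("sites must be aligned and of equal length")
    # zip(*sites) yields one column (tuple of bases) per position.
    return [Counter(column) for column in zip(*sites)]

def consensus(pfm: list[Counter]) -> str:
    """Most frequent base at each position."""
    return "".join(column.most_common(1)[0][0] for column in pfm)

# Hypothetical selected sequences, not data from any experiment.
selected = ["ACCAACC", "ACCTACC", "ACCAACC", "ACCTACC", "ACCAACC"]
pfm = position_frequency_matrix(selected)
print(consensus(pfm))   # most frequent base per position
print(dict(pfm[3]))     # base counts at the one variable position
```

A plain consensus hides positions that tolerate more than one base, so the per-position counts are usually reported alongside it.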

Two types of protein-binding microarrays have emerged: double-stranded DNA

microarrays and transcription factor microarrays. Double-stranded DNA microarrays

contain all possible double-stranded 11bp sequences (approximately 4.2 million

sequences) in roughly 240,000 oligonucleotides (Godoy et al., 2011). Recombinant

protein from a transcription factor of interest is flowed over the double-stranded DNA

microarray and washed with increasing concentrations of salts. This technology allows

the accurate quantification of binding affinities to all possible DNA-binding sites

recognised by the transcription factor of interest in just one hybridisation step.
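
The figures quoted for the double-stranded DNA microarray can be checked with a little arithmetic. The sketch below assumes that every 11-mer is counted once on a single strand and that 11-mers overlap within each probe; both are simplifying assumptions, and the resulting probe length is a lower bound rather than a description of any actual array design.

```python
import math

K = 11                # length of the binding sites to be covered
N_PROBES = 240_000    # approximate number of oligonucleotides on the array

n_kmers = 4 ** K                       # every possible 11-mer: 4,194,304
kmers_per_probe = n_kmers / N_PROBES   # about 17.5 distinct 11-mers per probe

# A stretch of L bases contains L - K + 1 overlapping k-mers, so each probe
# needs at least this many variable bases to carry its share.
min_variable_length = math.ceil(kmers_per_probe) + K - 1

print(n_kmers, round(kmers_per_probe, 2), min_variable_length)
```

Because a double-stranded probe presents each 11-mer together with its reverse complement, the true lower bound may be smaller than this estimate.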

Transcription factor DNA-binding enrichment, based on a protein array, allows for the

capture of multiple transcription factors and then discovery of their binding sites (Linnell

et al., 2004). A library of random oligonucleotides is flowed over captured proteins to

identify the transcription factors' DNA-binding sites. The array is then washed with

increasing salt concentrations to allow for the identification of relative binding affinities.

This specific protein-binding microarray has a slight advantage over the double-

stranded DNA microarray because multiple transcription factors can hybridise onto a

chip, allowing for the identification of binding preferences for transcription factor families

(Gong et al., 2008). Both these techniques are powerful means to identify in vitro DNA-

binding sites of proteins of interest in a time-efficient manner (Linnell et al., 2004;

Gong et al., 2008); however, there are limitations to these experiments. One limitation

is that protein-DNA complexes that have weak affinity for each other will be washed

away with low concentrations of salts, biasing the results. Another limitation is that the

whole structure of the protein is not available to bind to its preferred DNA targets

because a portion of the protein is hybridised to the array. Not knowing the DNA

binding domain of the protein of interest could lead to misleading results.

1.5.2 Emerging Approaches for Plant MYB Target Discovery and Analysis in vivo

Crucially, in vitro MYB DNA-binding sites might differ from those preferred in vivo

(Barbulescu et al., 2001; Verrijdt et al., 2003). These differences are a result of in vivo

protein-protein interactions and post-translational modifications altering DNA binding

specificity, as well as conformational differences between in vitro recombinant DNA-

binding domains and in vivo native conformations of these domains. The in vivo

availability of transcription factor DNA-binding sites is also controlled by the packaging

of genomic DNA in chromatin. Therefore, alternative in vivo approaches are necessary

to map MYB-DNA binding sites in the genome accurately.

Transient expression assays and yeast one-hybrid assays are now a staple in

identifying that a particular MYB binds to a specific DNA target in vivo (Patzlaff et al.,

2003a; Patzlaff et al., 2003b; Xie et al., 2010). These procedures involve the

expression of a transcription factor of interest within organisms, such as plants or

yeasts, to see if it is sufficient to enable the transactivation of an artificial gene

comprising a tandem repeat of its putative DNA-binding site fused to a minimal

promoter, upstream of a reporter gene. These experiments, with the right controls,

confirm that a specific transcription factor of interest interacts with, and activates transcription

from, its putative DNA binding site in vivo. For example, transient expression

assays in Arabidopsis thaliana protoplasts showed that the R2R3-MYB transcription

factors AtMYB11, AtMYB12 and AtMYB111 are functionally similar to the

structurally similar maize P protein (Mehrtens et al., 2005; Stracke et al., 2007).

AtMYB11, AtMYB12, AtMYB111, and P protein had similar target gene specificity,

regulating a myriad of flavonoid biosynthetic genes. Furthermore, all activated target

gene promoters in vivo in the presence of a MYB recognition element. Transient

expression assays and yeast one-hybrid assays are well established experiments to

validate if a particular protein activates transcription from a particular motif; however,

chromatin immunoprecipitation (ChIP) followed by either whole-genome tiled microarray

analysis (ChIP-chip) or high-throughput signature sequencing (ChIP-seq) can identify

novel in vivo DNA targets of proteins of interest, resulting in more biologically significant

results. ChIP-chip and ChIP-seq have proven to be powerful tools by which to identify in

vivo binding sites of sequence-specific transcription factors in the context of chromatin

(Massie and Mills, 2008), which avoids many caveats of the aforementioned techniques

(Solomon et al., 1988).

Recently, ChIP identified in vivo DNA-binding target sites for a select group of MYB

proteins (Wang et al., 2000; Berge et al., 2007; Georlette et al., 2007; Hara et al., 2009;

Morohashi and Grotewold, 2009; Fornale et al., 2010; Xie et al., 2010). In plants, ChIP-

chip identified in vivo binding sites for two Arabidopsis thaliana two-MYB-repeat

proteins, FOUR LIPS (FLP; AtMYB124) and AtMYB88 (Xie et al., 2010). FLP and

MYB88 were shown to directly bind promoters of cell cycle genes, including CDKA:1

(At3g48750), CELL DIVISION CYCLE6a and 6b (CDC6a At2g29680 and 6b

At1g07270), Cyclind4:1 (At5g65420), a cyclin-like gene, CYCLINT:1 (CYCT:1,

At1g35440), CDKD1:3 (At1g8040) and CYCB1:3 (At3g11520). These results were

consistent with a role for FLP/MYB88 in suppressing DNA replication and cell cycle progression

within the stomata. Systematic evolution of ligands by exponential enrichment (SELEX)

and EMSA, along with ChIP-chip, helped identify that this group bound to the core

consensus sequence of (A/T/G)(A/T/G)C(C/G)(C/G). Similarly, Zea mays MYB31 was

shown by SELEX and ChIP to bind to the sequence ACC(T/A)ACC within the promoters of

two lignin genes, ZmCOMT and ZmF5H, resulting in the repression of lignin biosynthetic gene

expression (Fornale et al., 2010). Furthermore, ChIP-chip was performed on the

trichome developmental selectors GLABRA3 (GL3) and GLABRA1 (GL1), encoding

basic helix-loop-helix (bHLH) and MYB transcription factors respectively. ChIP-chip

identified 20 novel in vivo GL3/GL1 direct targets such as SCL8 and MYC1 (involved in

the control of gene expression), SIM (a cyclin-dependent kinase inhibitor), and RBR1 (a

negative regulator of the cell cycle transcription factor E2F) (Morohashi and Grotewold,

2009). Recently, ChIP coupled with high-throughput sequencing was employed to

determine that the R2R3-MYB P1 protein has a broader suite of direct target genes

outside of the already known flavonoid biosynthetic genes (Morohashi et al., 2012).

ChIP-chip and ChIP-seq are difficult procedures to conduct because they require an

antibody that specifically recognises a transcription factor of interest. Although these

procedures are the most effective way of determining true in vivo DNA-binding targets

and sites, they have not been used to study the majority of the MYB superfamily.

Epitope tagging is the process of making the product of a gene of interest

immunoreactive to an already synthesised antibody (Massie and Mills, 2008). This can

be done by inserting a polynucleotide encoding an epitope into a gene of interest and

expressing the gene in an appropriate host. This protein from the gene of interest can

now be located via an antibody that has already been generated. This method could be

used as an alternative to generating novel antibodies before conducting a ChIP-chip.

When an antibody cannot be generated to a protein of interest, this method is best used

to determine in vivo DNA-protein binding data on the protein of interest.

Other methods can also identify in vivo MYB DNA-binding sites and downstream

targets. A glucocorticoid receptor (GR)-mediated inducible system has successfully

been used to define direct target genes of several putative transcription factors

(Sablowski and Meyerowitz, 1998; Wagner et al., 1999; Samach et al., 2000). In a GR-

mediated inducible system a fusion protein between a protein of interest and the rat

glucocorticoid receptor hormone binding domain is engineered. This fusion protein is

retained in the cytoplasm in absence of the synthetically made steroid hormone

dexamethasone. Upon addition of dexamethasone the protein of interest-GR fusion

protein enters the nucleus and binds to the protein of interest's downstream target

genes. The addition of translational inhibitors such as cycloheximide prevents the synthesis of

secondary regulators, blocking indirect downstream effects of the protein of interest. By assaying genome-wide expression

changes on microarrays, one can determine direct target genes of a protein of interest.

The GR inducible system was used to show that transcription of the single-repeat MYB gene CAPRICE

(CPC) is regulated directly by WER (an R2R3-MYB transcription factor)

(Ryu et al., 2005). Using EMSAs, two WER-binding sites (WBSs; WBSI and WBSII)

were verified in the CPC promoter. WER-WBSI binding was further validated in vivo

using yeast one-hybrid assays. In another example, the involvement of AtMYB80 in tapetal

and pollen development was examined (Phan et al., 2011). Using the GR system, it

was determined that the expression of 79 genes changed when the function of the R2R3-MYB transcription factor

AtMYB80 was restored in the myb80 mutant following dexamethasone

induction (Phan et al., 2011). Thirty-two of these genes were analyzed using ChIP, and

three were identified as direct targets of AtMYB80. These genes were shown to encode

a glyoxal oxidase (GLOX1), a pectin methylesterase (VANGUARD1), and an A1

aspartic protease (UNDEAD) and corresponded with in vitro binding data. This

procedure is a powerful way of identifying direct targets of transcription factors. When

the GR-inducible system is coupled with in silico processes, such as promoter analyses,

and DNA-binding experiments, such as EMSAs, yeast one-hybrid assays and

protoplast assays, it has proven quite useful in identifying in vivo DNA binding sites.
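
The logic of separating direct from indirect targets in a GR-inducible experiment can be expressed as a set comparison. The sketch below assumes two lists of dexamethasone-responsive genes, one obtained without and one with cycloheximide; the gene names are placeholders and the function name is illustrative.

```python
def classify_targets(
    dex_responsive: set[str], dex_chx_responsive: set[str]
) -> tuple[set[str], set[str]]:
    """Split dexamethasone-responsive genes into candidate direct and indirect targets.

    Genes that still respond when translation is blocked do not need a newly
    synthesised intermediate regulator, so they are candidate direct targets.
    """
    direct = dex_responsive & dex_chx_responsive
    indirect = dex_responsive - dex_chx_responsive
    return direct, indirect

# Placeholder gene names, not results from any experiment.
responsive_dex = {"geneA", "geneB", "geneC", "geneD"}
responsive_dex_chx = {"geneA", "geneC", "geneE"}

direct, indirect = classify_targets(responsive_dex, responsive_dex_chx)
print(sorted(direct), sorted(indirect))
```

Genes that respond only in the presence of cycloheximide (geneE here) fall into neither class and would need a cycloheximide-only control to interpret.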

The use of both expression data and DNA-binding site experiments is required to

reduce the number of false-positive MYB targets. The forkhead box A homolog PHA-4

regulates organogenesis of the pharynx in Caenorhabditis elegans (Gaudet and Mango,

2002). Expression of PHA-4 targets correlated with its binding sites in promoter

regions, and the timing of target expression correlated with binding affinity between

PHA-4 and its target sequence (Gaudet et al., 2004). The data suggested that PHA-4

regulates pharyngeal organ development by combining PHA-4 binding affinity and

cooperating factors to regulate gene expression temporally. ChIP-seq data for PHA-4

validated this assessment; 87% of the associated genes were expressed when PHA-4

binding was present, and this expression was reduced to 60% when PHA-4 binding was

not present (Zhong et al., 2010). Thus, using both expression data and

DNA-binding site experiments is a powerful means of validating that the binding of a

factor activates the expression of its putative target genes. The use of both expression

data and MYB-DNA binding site assays in tandem will aid in generating a more

biologically significant MYB network.

Another problem with the identification of MYB binding sites is that some MYB binding

sites are not proximal to the predicted target genes. In an early ChIP-seq study, there

were a large number of bound sites observed for the human interferon-γ (IFN-γ)

responsive transcription factor STAT1 (Robertson et al., 2007). Before stimulating the

cells with IFN-γ, 10,000 binding sites were identified. Binding sites for STAT1 increased

fourfold after stimulating the cells with IFN-γ. In both conditions, approximately 50% of

the total sites were intragenic and 25% were intergenic. Most binding sites

were not located near STAT1-regulated genes, which suggested that bound sites were

not directly regulating nearby genes. This can be explained by chromatin looping - a

mechanism for transcriptional control that involves bringing regulatory elements into

proximity of target genes (Vakoc et al., 2005). Future chromosome conformation

capture studies could determine whether interactions between distant regulatory regions, such as locus control regions (LCRs), and

MYB target genes are required for high-level transcription.

1.6 Transcriptional Regulation of MYB proteins

1.6.1 Regulators Affecting MYB Gene Expression in Networks

MYB expression has been shown to be transcriptionally regulated by a suite of

regulators in animals and plants (Dubos et al., 2010). These regulators of MYB

expression are governed by biotic and abiotic stimuli (Martin and Paz-Ares, 1997). In

Arabidopsis thaliana, MYB transcription factors have been identified as direct targets of 87 other

regulators (http://arabidopsis.med.ohio-state.edu/). For instance, AGL15, a MADS

family protein that regulates embryo development, directly binds to 29 different MYB

genes, although DNA binding does not always imply transcriptional regulation (Zheng et

al., 2009). Regulators have mainly been shown to modulate expression from MYB

promoter regions 500bp upstream of the transcriptional start site; however, other

regions of MYB genes, including introns, can be involved in modulating MYB expression

(Table 1.1, Fig. 1.2).

1.6.2 The Role of Introns on MYB Transcriptional Regulation

Introns impact multiple steps in the expression of genes in plants and animals (Le Hir et

al., 2003; Rose, 2008). In Arabidopsis thaliana, approximately 80% of genes contain

introns (Rose, 2002; Carmel et al., 2007). Introns can contain regulatory sequences

that allow binding of activator and repressor proteins to these sites to modulate

transcription (Rippe et al., 1989; Bruhat et al., 1990; Deyholos and Sieburth, 2000).

Examples of intragenic regulation of gene expression are growing. Prominent among

these examples is the intron-mediated regulation of animal Myb expression.

A direct link between animal Myb intragenic sequences and gene expression has been

reported (Dooley et al., 1996). Human c-Myb was transcriptionally regulated by nuclear

protein complexes that bind to a conserved motif of c-Myb intron 1. A 70 kDa protein

bound to this intragenic motif, and was associated with transcriptionally active leukemia

cells. Furthermore, a 20 kDa repressor protein (with a c-Jun domain) in transcriptionally

silent cells bound to another motif within intron 1, demonstrating complex regulation

of transcription during leukemic cell growth and differentiation.

Intragenic modulation of MYB expression is not limited to animals. In plants, the

Arabidopsis thaliana R2R3-MYB gene GLABRA1 (GL1) regulates trichome

development (Oppenheimer et al., 1991). The first intron of GL1 plays a role in

patterning trichomes (Wang et al., 2004). This intron operates as an enhancer in

trichome cells and a repressor in nontrichome cells, generating a trichome-specific

pattern of MYB gene expression. A motif was identified (CA/CGTTA) in the first intron

of GL1 and the position of the motif was conserved between closely related MYB

proteins AtWER and GaMYB2. This conserved intragenic motif was critical in the

regulation of trichome patterning and it was suggested that this motif might be a binding

site for activators and repressors that regulate transcription of this gene. These studies

suggest that MYB introns can play regulatory roles; however, more research is needed

to elucidate the molecular components involved in such regulation.

Novel regulatory proteins that bind MYB intragenic regions can be determined through

pull-down assays followed by mass spectrometry. For a streptavidin biotin pull-down

assay, the intragenic DNA sequence of interest is biotinylated and bound to streptavidin

beads. Nuclear protein extracts are then passed over the complex and washed to

remove proteins that bind non-specifically. The bound nuclear proteins are eluted from

the complex and subsequently identified using a mass-spectrometry approach (Hewel

et al., 2010). This technique can identify novel proteins that bind to MYB intragenic

regions. Further analyses, including genetic over-expression and loss-of-function

approaches, can be used in a complementary manner to validate the role of such

proteins in the regulation of MYB expression.

1.7 Research Hypotheses and Aims

The past decades have seen remarkable inroads made into our understanding of the

molecular interactions between sequence-specific transcription factors and their DNA

targets in general, and MYB proteins and their binding sites more specifically. This is

particularly true for animal R1R2R3-MYB transcription factors. Insight into the

specificities of R1R2R3-MYB interactions with DNA target sequences is, in turn,

providing greater understanding of the molecular mechanisms that control cellular

processes (Howe and Watson, 1991; Weston, 1992; Grotewold et al., 1994; Solano et

al., 1997; Patzlaff et al., 2003b) (Fig. 1.2, Table 1.1), and is aiding in the development of

diagnostics and therapeutics for when those mechanisms go awry (Vicente et al., 2009;

Stenman et al., 2010). By contrast, comparable understanding of the molecular

mechanisms that proceed from the interaction of plant R2R3-MYB proteins with their

cognate DNA targets is much less complete, with only a handful of MYB-DNA

interactions characterised at the molecular level for any given plant species.

Consequently, the precise means by which plant R2R3-MYB proteins coordinate gene

expression and are regulated to give rise to plant phenotype is still at a relatively

nascent stage. This said, with the emergence of new techniques that enable the

dissection of protein-DNA interactions more rapidly (Gertz et al., 2005; Vavouri and

Elgar, 2005), and/or with higher resolution (Mardis, 2007; Massie and Mills, 2008),

and/or for a larger number of proteins (Huang, 2003; Seong and Choi, 2003; Godoy et

al., 2011), the characterisation of new MYB-DNA interactions and MYB regulatory

proteins, particularly those that occur in plant species, should proceed apace. Given

this, the coming decade promises to provide great insights into the means by which

members of this remarkable family of proteins convert molecular information into whole

organism responses.

Recently, the R2R3-MYB transcription factor AtMYB61 was shown to be involved in

carbon acquisition and resource allocation (Penfield et al., 2001; Newman et al., 2004;

Liang et al., 2005; Romano et al., 2012). The aim of the research presented in this

thesis is to better characterise the molecular function of an R2R3-MYB family member,

focusing on the interplay between AtMYB61 and its DNA target sequences. In addition

to shedding light on a particular transcription factor, the project should establish a

pipeline for the characterisation of the molecular functions of any plant transcription

factor. The aims are:

(1) To test the hypothesis that AtMYB61 binds to the 5′ non-coding regulatory

regions of a distinct set of predicted target genes to modify transcription.

(2) To test the hypothesis that AtMYB61 binds to a preferred DNA sequence to

activate transcription.

(3) To test the hypothesis that AtMYB61 is regulated by an intragenic motif

within the 5′ coding region of its second intron in a sucrose dependent manner.

To address these aims, three separate experiments were undertaken. First,

electrophoretic mobility shift assays (EMSAs) were used to examine the binding of

AtMYB61 to the upstream regulatory regions of predicted downstream targets. Second, a

cyclic amplification and selection of target sequences (CASTing) assay was conducted

on AtMYB61 recombinant protein and compared to yeast activation assays to determine

the preferred DNA-binding sites of AtMYB61. Finally, an EMSA and a streptavidin-biotin

pull-down assay followed by mass spectrometry were used to examine whether putative repressors

bind to the conserved motif within the second intron of AtMYB61 in a sugar-dependent

manner.

1.8 Acknowledgements

We are very grateful to Dr. Katharina Bräutigam, Joseph Skaf, and Heather Wheeler for

fruitful discussions and extensive assistance with earlier drafts of this manuscript. This

work was generously supported by a Natural Science and Engineering Research

Council of Canada (NSERC) Canadian Graduate Scholarship (CGSD) awarded to MP,

and by funding from the University of Toronto and NSERC to MMC.


Chapter 2

AtMYB61, an R2R3-MYB transcription factor, is a pleiotropic regulator of plant carbon acquisition and resource allocation

This chapter is an extract of material originally contained in the following publication:

Romano, J., Dubos, C., Prouse, M.B., Wilkins, O., Hong, H., Poole, M., Kang, K., Li, E.,

Douglas, C.J., Western, T.L., Mansfield, S.D., and Campbell, M.M. (2012) AtMYB61, an

R2R3-MYB transcription factor, is a pleiotropic regulator of plant carbon acquisition and

resource allocation. New Phytologist. 195: 774-786.

Contributions: MBP, JMR, CD, MMC designed research; MBP, JMR, CD, and HH

performed research; JMR, CD, MBP, MP, and OW analysed data; MBP, JMR, CD, and

MMC wrote manuscript with editorial assistance from MBP, HH, and OW.

MBP contributed specifically to Fig. 2.2, Fig. 2.3, Table 2.3.

Copyright: The material in this chapter is copyrighted by Wiley and Wiley.


2 AtMYB61, an R2R3-MYB Transcription Factor, is a Pleiotropic Regulator of Plant Carbon Acquisition and Resource Allocation

2.1 Abstract

Throughout their lifetimes, plants must coordinate the regulation of various facets of

growth and development. Previous evidence has determined that the Arabidopsis

thaliana R2R3-MYB, AtMYB61, functions as a coordinate regulator of multiple aspects

of plant resource allocation. Using a combination of cell biology and transcriptome

analysis, in conjunction with over-expression and loss-of-function genetics, the role of

AtMYB61 in conditioning resource allocation was explored. Putative downstream

targets of AtMYB61 were predicted and include genes that encode the following

proteins: a KNOTTED1-like transcription factor (KNAT7, At1g62990); a caffeoyl-CoA 3-

O-methyltransferase (CCoAOMT7, At4g26220); and a pectin-methylesterase (PME,

At2g45220). Statistically over-represented motifs were identified in the 5′ non-coding

regions of the putative target genes, and these correspond to previously characterised

AC element motifs that function as R2R3-MYB targets. The consensus motif functions

as a bona fide target for AtMYB61 binding as determined by an electrophoretic mobility

shift assay. Binding of AtMYB61 to the gene regulatory sequences of the putative target

genes, which contain multiple copies of these motifs, was confirmed via electrophoretic

mobility shift assays. Altogether, these experiments provide an assessment of the ability of

AtMYB61 to bind to gene regulatory sequences present in the 5′ non-coding sequences

of the three putative downstream targets: KNAT7, CCoAOMT7 and a PME,

substantiating its role as a potential regulator of the transcription of these genes.

Together with the analysis of the regulation of AtMYB61 expression, these studies

provide insights into the transcriptional regulatory circuit downstream of AtMYB61.

2.2 Introduction

Plants have evolved mechanisms that enable them to contend with fluctuations in their

capacity to fix carbon (Halford and Paul, 2003; Koch, 2004; Gibson, 2005; Rogers et al.,

2005; Coupe et al., 2006; Rolland et al., 2006; Solfanelli et al., 2006; Shimazaki et al.,


2007; Hanson and Smeekens, 2009). Some of these mechanisms control the aperture

of stomata and regulate the uptake of CO2 for photosynthesis (Hetherington and

Woodward, 2003; Coupe et al., 2006; Shimazaki et al., 2007). Plants have also evolved

mechanisms to appropriately modulate the allocation of carbon to various facets of plant

growth, development and metabolism (Osuna et al., 2007; Smith and Stitt, 2007; Stitt et

al., 2007; Usadel et al., 2008; Gibon et al., 2009; Sulpice et al., 2009; Graf et al., 2010).

Although a body of evidence suggests a link between the pathways that modulate

carbon acquisition through stomata with those involved in resource allocation (Tallman,

2004; Coupe et al., 2006; Shimazaki et al., 2007; Liang et al., 2010; Romano et al.,

2012), little is known about the specific factors involved.

AtMYB61 (At1g09540), which encodes a member of the Arabidopsis thaliana R2R3-

MYB family of transcription factors, is a gene that controls resource acquisition and

allocation (Penfield et al., 2001; Newman et al., 2004; Liang et al., 2005; Romano et al.,

2012). AtMYB61 expression was both sufficient and necessary to bring about

reductions in stomatal aperture with consequent effects on gas exchange (Liang et al.,

2005). Analysis of loss-of-function atmyb61 mutants showed that AtMYB61 was also

necessary for the deposition of seed coat mucilage (Penfield et al., 2001). Other

experiments have revealed that AtMYB61 plays a role in the control of lignification and

photomorphogenesis (Newman et al., 2004; Dubos et al., 2005). In keeping with its role

as a regulator of resource allocation, AtMYB61 was shown to be expressed in sink

tissues, notably xylem, roots and developing seeds (Romano et al., 2012). Loss of

AtMYB61 function decreases xylem formation, induces qualitative changes in xylem cell

structure and decreases lateral root formation; in contrast, over-expression of AtMYB61

has the opposite effect on these traits.

The link between AtMYB61 and its role in the regulation of carbon allocation is not

obvious. We show here that AtMYB61 orchestrates changes in transcriptome activity

that modify plant resource allocation. Together with our previous results (Newman et

al., 2004; Liang et al., 2005; Romano et al., 2012), these new data support the

hypothesis that AtMYB61 binds to the upstream regulatory regions of predicted

downstream targets, modifying transcription to control both resource acquisition through


stomata, as well as resource allocation, largely into non-recoverable carbon sinks,

throughout plant growth and development.

2.3 Materials and Methods

2.3.1 Plant Material, Seed Sterilization and Growth Conditions

All wild-type (WT) and mutant Arabidopsis thaliana seeds were in the Columbia-0

background. Plants over-expressing AtMYB61 under the control of the Cauliflower

Mosaic Virus 35S promoter (35S::MYB61) were as described previously (Newman et

al., 2004; Liang et al., 2005). Similarly, AtMYB61 loss-of-function mutants (atmyb61)

have been described previously (Penfield et al., 2001; Liang et al., 2005). One

independently transformed AtMYB61 overexpressing line (35S::MYB61) and at least

two loss-of-function alleles (atmyb61-1 and atmyb61-2) were used in all experiments,

and results are representative. T-DNA insertional mutant lines corresponding to either

AtMYB61 or putative downstream targets of AtMYB61 were obtained from the

Arabidopsis Biological Resource Center (ABRC) (Alonso et al., 2003). Homozygous T-

DNA lines were obtained by PCR screening using the left border A T-DNA primer and a

right border gene-specific primer. Insertion sites were sequenced for all mutants to

verify insertional mutagenesis (data not shown), and quantitative PCR was conducted to

show that the mutants were loss-of-function (data not shown).

For primary bolt and hypocotyl analyses, seeds were germinated and plants were grown

on soil. Seeds were sown on dampened soil and then cold stratified for 3 d before

placement in a growth chamber at 21°C with a regime of 12 h of light (120 μmol m−2 s−1)

and 12 h of dark. This growth regime is referred to as short-day conditions herein.

To induce secondary growth in the hypocotyl, the primary inflorescence and secondary

inflorescences were continually removed from plants grown under short days for 10 wk

(Chaffey et al., 2002). Hypocotyl sections were fixed, coated or stained for transmission

electron microscopy, scanning electron microscopy and bright- and dark-field

microscopy, respectively.


2.3.2 RNA Isolation and Quantitative PCR

In order to verify that the insertional mutants identified above were loss-of-function

mutants, transcript accumulation corresponding to the mutagenised gene

was determined in the mutants. Primary and secondary inflorescences were excised

with a scalpel and immediately frozen in liquid nitrogen. Approximately 1 g (fresh

weight) of ground tissue was used per RNA extraction. TRIzol reagent (Invitrogen) was

used following the manufacturer's recommendations. The RNA pellet was dissolved in

30 μl diethylpyrocarbonate (DEPC)-treated water. RNA quantity and purity were

analysed using a NanoDrop spectrophotometer (Thermo Scientific, Wilmington, DE,

USA), and RNA integrity was assessed by loading 1 μg of RNA onto a 1% agarose, 0.5×

TBE (Tris-borate-EDTA) gel. First-strand cDNA was generated using 5 μg total RNA

with oligo dT primer with SuperScript II (Invitrogen). Standard curves, quantitative PCR

and melt curves were conducted with a Bio-Rad Chromo4 Real-Time PCR detector

using SYBR Green fluorescent dye (Bio-Rad). In order to avoid the generation of a

reverse transcription-polymerase chain reaction (RT-PCR) amplicon from genomic

DNA, primers were designed so that the 3′-end of at least one of the primers spanned

an intron splice site. The relative mRNA levels were determined by normalizing the

PCR threshold cycle number of each gene with that of the TUBULIN-4 reference gene

(At1g04820). Primer sequences and quantitative PCR amplification conditions are

available on request.
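
The normalisation step can be illustrated with a minimal sketch. It assumes the widely used ΔCt approach with an amplification efficiency of 2 (perfect doubling per cycle); the exact calculation used in this study is not specified in the text, and the function name and threshold cycle values below are hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        efficiency: float = 2.0) -> float:
    """Relative transcript level of a target gene, normalised to a reference gene.

    Assumes the delta-Ct method: level = efficiency ** -(Ct_target - Ct_reference).
    """
    delta_ct = ct_target - ct_reference
    return efficiency ** (-delta_ct)


# Hypothetical threshold cycles (not measured values from this study).
# The reference gene plays the role of TUBULIN-4 in the text.
wt_level = relative_expression(ct_target=24.0, ct_reference=20.0)
mutant_level = relative_expression(ct_target=29.0, ct_reference=20.0)

# Fold change of the mutant relative to WT; a value well below 1 is
# what would be expected for a loss-of-function allele.
fold_change = mutant_level / wt_level
print(wt_level, mutant_level, fold_change)
```

A later threshold cycle for the target in the mutant, with an unchanged reference gene, translates into a lower normalised transcript level.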

2.3.3 Secondary Thickened Hypocotyls Stained with Phloroglucinol

Secondary thickened hypocotyl sections (c. 1 mm) were stained with phloroglucinol as

described previously (Newman et al., 2004). Sections were viewed with an Olympus

SZX16 microscope under both bright and dark field. Images were captured with a

QImaging MicroPublisher 3.3RTV digital camera utilizing QCapture version 2.7

software. Measurements employed to calculate the area of xylem and area of phloem

were calculated using ImageJ 1.38x (Collins, 2007).


2.3.4 Transmission Electron Microscopy

Segments (1 cm) of primary inflorescence stems from stage 6.30 plants, or secondary

thickened hypocotyls, were fixed in 2% glutaraldehyde in 0.1 M Sorensen's phosphate

buffer (pH 7.4) for 72 h at room temperature, and postfixed in 1% osmium tetroxide in

0.1 M phosphate buffer for 1 h in the dark. Samples were then dehydrated through an

ascending graded series of ethanol (30%, 50%, 70%, 80%, 90%, 100%), infiltrated with

Spurr's epoxy resin and polymerized overnight at 65°C. Semi-thin sections (0.5–1 μm)

were cut using glass knives on a Leica EM UC6 ultramicrotome (Leica, Allendale, NJ,

USA), stained with 0.1% toluidine blue and 0.025% methylene blue, and examined by

light microscopy to determine the quality of fixation and orientation of the samples.

Ultrathin sections (60–90 nm) were cut using a diamond knife, stained with 3% uranyl

acetate in 50% methanol, poststained with Reynolds' lead citrate and examined using a

Hitachi H7000 transmission electron microscope (Hitachi, Mississauga, ON, Canada)

operated at 75 kV. Pictures were taken using Kodak 4489 electron microscope film,

and negatives were scanned using an Epson Perfection 1680 scanner (Epson,

Markham, ON, Canada) at 1200 dpi.

2.3.5 Microarray Analysis

Total RNA was extracted using described methods (Newman et al., 2004) from 7-d-old

Arabidopsis thaliana seedlings grown in the dark in liquid MS medium as described

above. Each pool of RNA was derived from hundreds of seedlings. Three biological

replicates were collected for each condition (genotype × sucrose presence/absence) for

RNA extraction. As there were three genotypes (WT, atmyb61, 35S::MYB61) and two

conditions (presence and absence of sucrose), there were 18 RNA samples in total.

The quality of the total RNA was assessed using an Agilent Bioanalyser (Agilent,

Mississauga, ON, Canada) at the Genomic Arabidopsis Resource Network (GARNet)

microarray facility at the Nottingham Arabidopsis Stock Centre. Hybridization to the 18

Affymetrix GeneChip Arabidopsis ATH1 Arrays (Affymetrix, Santa Clara, CA, USA),

scanning of the hybridized arrays and raw data collection were performed at the

GARNet facility at the Nottingham Arabidopsis Stock Centre according to standard

Affymetrix protocols (http://affymetrix.com). The data for the RNA quality control, the


raw data for the triplicated microarray experiments and the detailed description of the

MIAME (minimum information about a microarray experiment)-compliant experimental

conditions are publicly available at:

http://ssbdjc2.nottingham.ac.uk/narrays/experimentpage.pl?experimentid=14.

2.3.6 Bioinformatic Analyses to Identify AtMYB61 Targets

To identify AtMYB61 targets, a two-stage complete transcriptome analysis was

undertaken. In the first stage, publicly available, complete Arabidopsis thaliana

transcriptome microarray data were used to identify those genes sharing the same

transcript abundance profile as AtMYB61 across multiple stages of development.

Genes were identified whose transcript abundance profiles had a Pearson correlation

coefficient > 0.8 when compared with the transcript abundance of AtMYB61 across the

66 microarrays comprising the AtGenExpress 'Developmental Baseline' dataset

(http://web.uni-frankfurt.de/fb15/botanik/mcb/AFGN/atgenex.htm,

http://bar.utoronto.ca/ntools/cgi-bin/ntools_expression_angler.cgi). The 58 genes

identified in this manner (Supporting Information Table S1) should fit into one of two

categories: (1) genes regulated in parallel with AtMYB61, and (2) genes regulated by

AtMYB61. To select genes in the latter category, a second stage of analysis was

undertaken.
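
The co-expression filter used in this first stage can be sketched in a few lines. This is only an illustration of the Pearson correlation cut-off, not the Expression Angler implementation; the gene names and expression values are hypothetical, and five data points stand in for the 66 arrays of the developmental dataset.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def coexpressed(bait, profiles, threshold=0.8):
    """Names of genes whose profile correlates with the bait above the threshold."""
    return sorted(gene for gene, profile in profiles.items()
                  if pearson(bait, profile) > threshold)


# Hypothetical transcript abundance profiles (arbitrary units).
bait = [1.0, 2.0, 3.0, 4.0, 5.0]          # stands in for the AtMYB61 profile
profiles = {
    "geneA": [2.1, 3.9, 6.2, 8.0, 9.8],   # rises and falls with the bait
    "geneB": [5.0, 4.0, 3.0, 2.0, 1.0],   # opposite trend
    "geneC": [3.0, 1.0, 4.0, 1.0, 5.0],   # only weakly related
}

hits = coexpressed(bait, profiles)
print(hits)
```

Only the gene whose profile tracks the bait passes the cut-off; an anti-correlated or weakly correlated gene is excluded.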

The second stage of transcriptome analysis identified genes whose transcript

abundance was influenced by the presence or absence of AtMYB61. A complete

transcriptome microarray dataset was generated using WT, atmyb61 and 35S::MYB61

grown at a time point and under conditions that allow the comparison of the impact of

AtMYB61 on transcriptome activity (seedlings grown in the dark in the absence or

presence of sucrose). In this dataset, genes that are either direct or indirect targets of

AtMYB61 should have reduced transcript abundance in atmyb61 mutants and elevated

expression in 35S::MYB61 over-expressing plants in comparison with WT. Using these

criteria to generate a 'bait' transcript abundance profile for use in the Expression Angler

co-expression tool, 31 genes were identified that had a Pearson correlation coefficient

> 0.8 across the 18 microarrays in the 'AtMYB61 dataset' (Fig. 2.1, Table 2.2). Groups

of genes with transcript abundance profiles that had high Pearson correlation


coefficients relative to AtMYB61 were identified using Expression Angler

(http://bar.utoronto.ca/ntools/cgi-bin/ntools_expression_angler.cgi) (Toufighi et al.,

2005). The calculations of the Pearson correlation coefficients were based on raw

expression values across all 18 GeneChips generated in our study. Both gene lists,

generated from the microarray and Expression Angler analyses, were then compared

using Venn Selector (http://bar.utoronto.ca/ntools/cgi-bin/ntools_venn_selector.cgi).

Three genes were identified in the intersection set.
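
The intersection step can be reproduced from the gene identifiers listed in Tables 2.1 and 2.2. The sketch below uses plain Python sets rather than Venn Selector, and the variable names are illustrative.

```python
# AGI identifiers as listed in Table 2.1 (AtGenExpress developmental baseline).
developmental = set("""
At1g63300 At5g58930 At5g40960 At1g62990 At3g11690 At1g14380 At2g43060 At5g60820
At4g33330 At1g07750 At4g26220 At3g51000 At5g54530 At4g28370 At4g14930 At5g14500
At4g32350 At3g53520 At3g17940 At1g63690 At4g37530 At4g30500 At1g01900 At3g14720
At5g39190 At5g66460 At5g65710 At1g76550 At1g12550 At1g19190 At5g59310 At5g66660
At5g27360 At5g23720 At3g54200 At1g49450 At3g45130 At1g75390 At2g24170 At1g29050
At2g43050 At2g38710 At5g15490 At3g18440 At1g11070 At1g72220 At3g53400 At2g16990
At5g66170 At5g62150 At1g03920 At3g13640 At3g51300 At1g16490 At2g38360 At1g54160
At2g45220
""".split())

# AGI identifiers as listed in Table 2.2 (AtMYB61 microarray dataset).
atmyb61_arrays = set("""
At1g26270 At3g59480 At1g11210 At4g26220 At2g16720 At3g61440 At1g21590 At2g44840
At2g12290 At1g62990 At3g52870 At5g47230 At4g17980 At1g77590 At3g06390 At3g62730
At3g48690 At1g29240 At5g02200 At4g36780 At1g77220 At3g28290 At3g04870 At1g72410
At5g26340 At4g05090 At4g36930 At1g63870 At3g18250 At1g53570 At2g45220
""".split())

# Genes passing the correlation cut-off in both datasets.
shared = sorted(developmental & atmyb61_arrays)
print(shared)
```

The genes common to both lists are At1g62990, At2g45220 and At4g26220.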

The 5′ noncoding sequences (1000 bp) for the three genes with Pearson correlation

coefficients > 0.8 in both datasets were obtained by bulk download from The

Arabidopsis Information Resource (TAIR,

http://www.arabidopsis.org/tools/bulk/sequences/index.jsp). Over-represented

sequence motifs in the 5′ noncoding sequences were identified using option 2 of

Promomer (http://bar.utoronto.ca/ntools/cgi-bin/BAR_Promomer.cgi) (Toufighi et al.,

2005) with the following parameters: base pairs in the element = 6, minimum

percentage of genes in which the identified element should occur = 75. Bootstrap

analysis (n = 1000) with a randomized dataset allows the validation of the significance

of an over-represented motif within the sequences being queried.
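
The kind of screen described above can be sketched as follows. This is not the Promomer implementation; it is a simplified illustration that collects every 6-bp word on both strands and keeps those present in at least 75% of the input sequences. The sequences are short hypothetical stand-ins for 1000-bp upstream regions, and the bootstrap step is omitted.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers_on_both_strands(seq, k=6):
    """Set of all k-bp words found on either strand of a sequence."""
    seq = seq.upper()
    words = set()
    for strand in (seq, reverse_complement(seq)):
        for i in range(len(strand) - k + 1):
            words.add(strand[i:i + k])
    return words


def shared_kmers(sequences, k=6, min_fraction=0.75):
    """Words occurring in at least min_fraction of the sequences (counted once per sequence)."""
    counts = Counter()
    for seq in sequences:
        counts.update(kmers_on_both_strands(seq, k))
    needed = min_fraction * len(sequences)
    return {word: n for word, n in counts.items() if n >= needed}


# Hypothetical upstream fragments; three of four carry ACCTAA on one strand.
upstream = [
    "GGGACCTAAGGG",   # motif on the given strand
    "TTTTTAGGTCCC",   # motif on the opposite strand (TTAGGT)
    "CACACCTAACAC",   # motif on the given strand
    "GATCGATCGATC",   # no motif
]

hits = shared_kmers(upstream)
print(sorted(hits.items()))
```

Because both strands are scanned, a word and its reverse complement are always reported together, which is why orientation has to be tracked separately when individual elements are mapped.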

2.3.7 Electrophoretic Mobility Shift Assay (EMSA)

Recombinant AtMYB61 protein was produced in Escherichia coli using the coding

sequence cloned in frame into the NdeI and BamHI sites of the pET15b vector

(Novagen, EMD Millipore, Mississauga, ON, Canada). Recombinant AtMYB61 protein

was produced, extracted and affinity purified as described previously for pine MYB

proteins (Patzlaff et al., 2003b). EMSA conditions were exactly as described previously

(Patzlaff et al., 2003b; Gomez-Maldonado et al., 2004), but using recombinant AtMYB61

protein instead of pine MYB protein.

2.3.8 Transcriptional Activation Assay

Transcriptional activation assays using yeast were performed as described previously

(Patzlaff et al., 2003b), but substituting the AtMYB61 coding sequence for the

pine MYB sequences.


2.3.9 Fibre Quality Analysis

Secondary thickened hypocotyls were subjected to fibre quality

analysis according to published methods (Chaffey et al., 2002). Fibre quality analysis

enables the determination of cell types liberated from secondary xylem following

maceration, by documenting cell lengths, widths and frequencies as suspended cells

pass through a flow chamber.

2.4 Results and Discussion

2.4.1 AtMYB61 Modulates the Expression of a Specific Set of Target

Genes

As a transcription factor, AtMYB61 should exert its control over facets of the plant

transpiration stream by modulating the expression of specific target genes (Ptashne and

Gann, 1997). To identify such targets, a two-stage complete transcriptome analysis

was undertaken, using publicly available microarray datasets in combination with a

custom microarray dataset comparing the transcriptomes of wild-type (WT), atmyb61

and 35S::MYB61 plants (see the Materials and Methods section; Fig. 2.1; Tables 2.1,

2.2). Three genes were shared between both tiers of data mining, emerging from the

sequential filtering of the publicly available microarray data and the AtMYB61-specific

microarray data. These genes are strong candidates for direct targets of AtMYB61: At1g62990,

At2g45220 and At4g26220.

The nature of the gene products encoded by the three putative AtMYB61 targets is

consistent with their role in xylem development. At1g62990 encodes the homeobox

protein AtKNAT7. AtKNAT7 is expressed in xylem fibres, and xylem vessels of

AtKNAT7 loss-of-function mutants (irregular xylem11, irx11) have thin, weak cell walls

resulting in collapsed vessels (Brown et al., 2005; Zhong et al., 2008). At2g45220

encodes a pectin methylesterase (AtPME), a class of enzymes with demonstrable roles


Figure 2.1. Transcript abundance of a subset of genes in the Arabidopsis thaliana transcriptome is influenced by the presence or absence of AtMYB61 activity. Clustergram of transcript abundance of genes that share the same transcript abundance profile as AtMYB61 in 7 d old, dark-grown wild-type (WT), atmyb61, and 35S::MYB61 seedlings. Each row shows transcript abundance data for a given gene in 7 d old, dark-grown seedlings as determined by Affymetrix ATH1 GeneChip microarrays. Three biological replicates were analysed for each genotype. Green indicates low transcript abundance, whereas red indicates high transcript abundance. Genes that share the same transcript abundance profile as AtMYB61 as determined by Expression Angler, using a Pearson correlation coefficient >0.8 as the cut-off, are characterised by having low transcript abundance in the atmyb61 mutant and high transcript abundance in the 35S::MYB61 overexpressor.


Table 2.1. Genes that share transcript abundance profiles with AtMYB61, as determined by Pearson correlation coefficient across the AtGenExpress developmental baseline dataset.

Pearson correlation coefficient relative to AtMYB61   Arabidopsis Gene Identifier (AGI)   Gene Product Description

0.911 At1g63300 unknown protein

0.906 At5g58930 unknown protein

0.906 At5g40960 unknown protein

0.898 At1g62990 KNOTTED-LIKE HOMEOBOX 7

0.894 At3g11690 unknown protein

0.894 At1g14380 IQ67 DOMAIN PROTEIN 28

0.892 At2g43060 transcription factor

0.890 At5g60820 C3HC4-type RING finger

0.887 At4g33330 PGSIP3; transferase, transferring glycosyl groups

0.883 At1g07750 cupin family protein

0.880 At4g26220 caffeoyl-CoA 3-O-methyltransferase, putative

0.879 At3g51000 epoxide hydrolase, putative

0.878 At5g54530 unknown protein

0.864 At4g28370 protein binding / zinc ion binding

0.857 At4g14930 acid phosphatase survival protein SurE, putative

0.855 At5g14500 aldose 1-epimerase family protein

0.853 At4g32350 unknown protein

0.844 At3g53520 UDP-GLUCURONIC ACID DECARBOXYLASE 1

0.844 At3g17940 aldose 1-epimerase family protein

0.841 At1g63690 protease-associated domain-containing protein

0.840 At4g37530 peroxidase, putative

0.837 At4g30500 unknown protein

0.836 At1g01900 ATSBT1.1/SBTI1.1; serine-type endopeptidase

0.835 At3g14720 ATMPK19; MAP kinase

0.832 At5g39190 GERMIN-LIKE PROTEIN 2

0.831 At5g66460 (1-4)-beta-mannan endohydrolase, putative

0.830 At5g65710 HAESA-Like 2

0.829 At1g76550 pyrophosphate-dependent 6-phosphofructose-1-kinase

0.828 At1g12550 oxidoreductase family protein

0.828 At1g19190 hydrolase

0.828 At5g59310 LIPID TRANSFER PROTEIN 4

0.825 At5g66660 unknown protein

0.825 At5g27360 SFP2; carbohydrate transmembrane transporter

0.823 At5g23720 PROPYZAMIDE-HYPERSENSITIVE 1

0.821 At3g54200 unknown protein

0.820 At1g49450 transducin family protein / WD-40 repeat family protein

0.819 At3g45130 lanosterol synthase 1


0.819 At1g75390 basic leucine-zipper 44

0.816 At2g24170 endomembrane protein 70, putative

0.815 At1g29050 unknown protein

0.815 At2g43050 ATPMEPCRD; enzyme inhibitor/ pectinesterase

0.815 At2g38710 AMMECR1 family

0.815 At5g15490 UDP-glucose 6-dehydrogenase, putative

0.813 At3g18440 aluminum-activated malate transporter 9

0.813 At1g11070 proline-rich family protein

0.812 At1g72220 zinc finger (C3HC4-type RING finger) family protein

0.810 At3g53400 CONSERVED PEPTIDE UPSTREAM ORF 47

0.810 At2g16990 tetracycline transporter

0.807 At5g66170 unknown protein

0.807 At5g62150 peptidoglycan-binding LysM domain-containing protein

0.806 At1g03920 protein kinase, putative

0.805 At3g13640 ATRLI1; transporter

0.804 At3g51300 RHO-RELATED PROTEIN FROM PLANTS 1

0.801 At1g16490 MYB DOMAIN PROTEIN 58

0.800 At2g38360 PRENYLATED RAB ACCEPTOR 1.B4

0.800 At1g54160 NUCLEAR FACTOR Y, SUBUNIT A5

0.800 At2g45220 pectinesterase family protein


Table 2.2. Genes that share transcript abundance profiles with AtMYB61, as determined by Pearson correlation coefficient across the AtMYB61 microarray dataset.

Pearson correlation coefficient relative to AtMYB61   Arabidopsis Gene Identifier (AGI)   Gene Product Description

0.937 At1g26270 phosphatidylinositol 3- and 4-kinase family protein

0.916 At3g59480 pfkB-type carbohydrate kinase family protein

0.912 At1g11210 unknown protein

0.908 At4g26220 caffeoyl-CoA 3-O-methyltransferase, putative

0.873 At2g16720 MYB DOMAIN PROTEIN 7

0.864 At3g61440 CYSTEINE SYNTHASE C1

0.862 At1g21590 protein kinase family protein

0.859 At2g44840 ETHYLENE-RESPONSIVE BINDING FACTOR 13

0.854 At2g12290 unknown protein

0.853 At1g62990 KNOTTED-LIKE HOMEOBOX 7

0.849 At3g52870 calmodulin-binding family protein

0.843 At5g47230 ETHYLENE RESPONSIVE ELEMENT BINDING FACTOR 5

0.837 At4g17980 NAC domain containing protein 71

0.835 At1g77590 LONG CHAIN ACYL-COA SYNTHETASE 9

0.833 At3g06390 integral membrane family protein

0.833 At3g62730 unknown protein

0.833 At3g48690 ATCXE12/CXE12; carboxylesterase

0.827 At1g29240 unknown protein

0.826 At5g02200 FAR-RED-ELONGATED HYPOCOTYL1-LIKE

0.825 At4g36780 transcription regulator

0.822 At1g77220 unknown protein

0.812 At3g28290 unknown protein

0.812 At3g04870 ZETA-CAROTENE DESATURASE

0.809 At1g72410 COP1-interacting protein-related

0.805 At5g26340 STP13/MSS1; hexose:hydrogen symporter

0.802 At4g05090 inositol monophosphatase family protein

0.801 At4g36930 SPATULA

0.800 At1g63870 disease resistance protein (TIR-NBS-LRR class), putative

0.800 At3g18250 unknown protein

0.800 At1g53570 MAP3KA

0.800 At2g45220 pectinesterase family protein


in reconfiguring plant cell wall chemistry (Pelloux et al., 2007). At4g26220 encodes a

caffeoyl-CoA O-methyltransferase (AtCCoAOMT7), which, based on the extent of

sequence similarity, is probably involved in the genesis of the monolignol precursors

used to build the lignin polymer, as are related homologues (Do et al., 2007).

2.4.2 AtMYB61 Regulates Genes with Specific Target Motifs in Their

Promoters

Consistent with the three genes functioning as downstream targets of AtMYB61,

recombinant AtMYB61 protein bound to 300-bp DNA regions residing upstream of the

TATA box for each of the putative target genes (Fig. 2.2). Candidate AtMYB61 binding

sites in the gene regulatory regions of these three genes were identified by algorithm-

based screening for over-represented motifs in the three DNA sequences. The most

over-represented DNA motif in the gene regulatory sequences showed high similarity to

canonical R2R3-MYB binding sites known as AC elements (Fig. 2.2). Three such AC

elements were found in each of the upstream regions of AtPME and AtKNAT7, whereas

four elements were found in the AtCCoAOMT7 upstream noncoding sequences

(Fig. 2.2; Table 2.3). Recombinant AtMYB61 bound to this element, but could not bind

to a mutated version of the element, confirming that this is the likely target of AtMYB61

binding in these genes (Fig. 2.2). Non-AC-element-containing DNA could not be bound

by AtMYB61 (Fig. 2.2), nor could it compete with AC elements for AtMYB61 binding

(Fig. 2.3). The AC element was also an effective competitor for recombinant AtMYB61

bound to the 300-bp upstream regulatory sequences (Fig. 2.2). Moreover, expression

of AtMYB61 in yeast transactivated an artificial target gene comprising a tandem repeat

of the AC element fused to a yeast minimal promoter, upstream of the reporter β-

galactosidase (Fig. 2.2). These binding data are in accordance with the literature

surrounding MYB–DNA interactions (Prouse and Campbell, 2012). Thus, AtMYB61

activity appeared to promote the expression of target genes containing the AC element.

Strikingly, evidence suggests that AtKNAT7 is the target of other transcription factors

and, on that basis, is thought to be a component of a transcriptional network that

regulates xylem differentiation (Zhong et al., 2007; Zhong et al., 2008). AtKNAT7

appears to function as a common target for several transcriptional networks that are


Figure 2.2. AtMYB61 binds to the promoters of putative downstream targets, to motifs that are over-represented in these promoters and is sufficient to activate transcription from these motifs. (a) Schematic representation of the 5′ noncoding sequences of the three putative AtMYB61 downstream target genes identified as the intersection set of genes found to be co-regulated with AtMYB61 in both the AtGenExpress developmental dataset and an AtMYB61-specific microarray experiment, as determined by Expression Angler. +/− indicate the orientation of canonical R2R3-MYB binding site motifs relative to the sense coding strand, and numbers indicate the position of these motifs relative to the putative transcriptional start (indicated by an arrow). Blue horizontal lines under the sequences correspond to the location of the DNA sequence used as the target in the electrophoretic mobility shift assay (EMSA) conducted in (b). (b) AtMYB61 binding of the 5′ noncoding sequences of the three putative target genes as determined by EMSA. Recombinant AtMYB61 binds to all three 5′ noncoding sequences, as determined by a gel shift of the probe, and can be outcompeted with increasing quantities of unlabelled DNA corresponding to a canonical R2R3-MYB binding site, known as an AC element. (c) Left: over-represented motif in the 5′ noncoding sequences of the three genes outlined above, as determined by the Promomer algorithm (Toufighi et al., 2005) (average = 2.9; Z-score = 13; significance = 0.001). Right: AtMYB61 binding to the AC-rich motif as determined by EMSA. Recombinant AtMYB61 binds to the AC-rich motif (AC: 5′ attgttcttcctggggtgaccgtccACCTAAcgctaaaagccgtcgcgggataagcctgtctg 3′), but not to a mutated version of the putative binding motif (NBS: 5′ attgttcttcctggggtgaccgtgcATGGATcgctaaaagccgtcgcgggataagcctgtctg 3′). (d) AtMYB61-mediated activation of promoter activity in Saccharomyces cerevisiae. AC (5′ gaagacgaggtaccagccACCTAAcccACCTAAcccACCTAAcgctgttctcgagcctcatct 3′) and NBS (5′ gaagacgaggtaccagTCCATGGATcgccATGGATcgccATGGATcctgttctcgagccctcatct 3′) sequences are triplicated within the segment. Left: schematic representation of the effector (top) and reporter (bottom) constructs used in this study (CYC1: minimal yeast promoter). Right: quantitative analysis of β-galactosidase activity in yeast (noninducible medium: glucose, open bars; inducible medium: galactose, closed bars). Error bars represent standard deviation. *Significantly different from control, P < 0.05, t-test.


Table 2.3. AC elements within the promoters of putative downstream targets. Table of the orientation and location of AC elements within the upstream non-coding regions of the putative targets.

AGI         Gene         AC Element   Orientation   Location
At1g62990   AtKNAT7      ACCTAA       Antisense     558
At1g62990   AtKNAT7      ACCTAA       Antisense     665
At1g62990   AtKNAT7      ACCTAA       Antisense     704
At2g45220   AtPME        ACCAAC       Antisense     139
At2g45220   AtPME        ACCAAT       Antisense     143
At2g45220   AtPME        ACCAAT       Sense         151
At4g26220   AtCCoAOMT7   ACCAAA       Antisense     82
At4g26220   AtCCoAOMT7   ACCAAC       Sense         128
At4g26220   AtCCoAOMT7   ACCAAA       Antisense     165
At4g26220   AtCCoAOMT7   ACCAAA       Antisense     235
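
The orientation and location columns can be illustrated with a small scan for an AC element on both strands. The sketch uses the AC and NBS probe sequences given in the Figure 2.2 legend; the function name is hypothetical, and the reported positions are 0-based offsets within the probe rather than the promoter coordinates listed in the table.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_motif(seq, motif="ACCTAA"):
    """Return (orientation, 0-based offset) for each motif occurrence on either strand."""
    seq = seq.upper()
    motif = motif.upper()
    # Reverse complement of the motif: how an antisense copy reads on the given strand.
    antisense = motif.translate(COMPLEMENT)[::-1]
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append(("Sense", i))
        elif window == antisense:
            hits.append(("Antisense", i))
    return hits


# Probe sequences from the Figure 2.2 legend.
AC_PROBE = "attgttcttcctggggtgaccgtccACCTAAcgctaaaagccgtcgcgggataagcctgtctg"
NBS_PROBE = "attgttcttcctggggtgaccgtgcATGGATcgctaaaagccgtcgcgggataagcctgtctg"

print(find_motif(AC_PROBE))    # one AC element, on the given strand
print(find_motif(NBS_PROBE))   # mutated probe: no AC element on either strand
```

The mutated probe differs from the AC probe only around the element, so the absence of a hit on either strand is consistent with its use as a non-binding control.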


Figure 2.3. AtMYB61 binding to the 5′ non-coding sequences of the three putative target genes as determined by EMSA. Recombinant AtMYB61 bound to the 5′ non-coding sequences of all three genes (AtKNAT7, AtCCoAOMT7 and AtPME), as determined by a gel shift of the probe (arrows), and could not be outcompeted with increasing quantities of unlabelled DNA corresponding to a random binding site (NBS: 5′ attgttcttcctggggtgaccgtgcATGGATcgctaaaagccgtcgcgggataagcctgtctg 3′).


involved in xylem differentiation, including one that involves AtMYB61. As such,

AtKNAT7 could be viewed as a regulatory module that is co-opted by several gene

regulatory networks.

2.4.3 AtMYB61 Regulates Genes Which Themselves Contribute to

AtMYB61-Related Phenotypes

To determine whether the putative AtMYB61 targets contribute to any of the xylem-

related traits in which AtMYB61 is involved (Romano et al., 2012), the phenotypes of the

loss-of-function mutants for the target genes (atknat7/irx11, atpme and atccoaomt7)

were compared with atmyb61 and WT. Loss-of-function mutations in each of the three

target genes generated xylem-related phenotypes that at least partially phenocopied

atmyb61 phenotypes. For example, secondary thickening of xylem vessel cell walls

was reduced in atknat7/irx11 and atpme mutants relative to WT, like atmyb61 (Fig. 2.4).

As with atmyb61 mutants, the xylem : phloem ratio was reduced relative to WT in

secondary thickened hypocotyls of atknat7/irx11, atpme and atccoaomt7 mutants

(Fig. 2.4). Strikingly, the atknat7/irx11, atpme and atccoaomt7 mutants had far fewer

fibre cells and disproportionately more vessel cells relative to WT (Fig. 2.4). Unlike

atmyb61 mutants, the atknat7/irx11, atpme and atccoaomt7 mutants were able to make

vessels, and fusiform cambial cells were not the predominant cell type. These findings

are in keeping with the hypothesis that AtMYB61 functions upstream of AtKNAT7,

AtPME and AtCCoAOMT7, as AtMYB61 activity promotes the differentiation of both

vessels and fibres, whereas the differentiation of vessels more prominently occurs in the

atknat7/irx11, atpme and atccoaomt7 mutants. This suggests that AtKNAT7, AtPME

and AtCCoAOMT7 are involved in pathways governing fibre differentiation in secondary

hypocotyl development, whereas AtMYB61 sits upstream of both fibre and vessel

differentiation pathways in the development of this anatomical region.


Figure 2.4. AtMYB61 downstream target genes have an impact on secondary wall formation and xylem formation in secondary thickened hypocotyls. Transmission electron micrographs (×2000) of cross-sections obtained from primary inflorescence stems of growth stage 6.03 Arabidopsis thaliana plants grown under 12 h light : 12 h dark conditions, for (a) wild-type (WT), (b) atmyb61, (c) atknat7, (d) atpme and (e) atccoaomt7 genotypes. All plants were grown until the inflorescence stems were an equivalent length (26 cm), and cross-sections were made at 0.5 cm from the base of the stem (adjacent to the rosette). Bars, 10 μm. (f–j) Secondary thickened hypocotyls from mature plants after 10 wk of growth with continuous removal of primary and secondary inflorescences under 12 h light : 12 h dark conditions. Sections were stained with phloroglucinol to reveal alterations of lignified xylem cells to phloem cells. Sections are (f) WT, (g) atmyb61, (h) atknat7, (i) atpme and (j) atccoaomt7 genotypes. (k) Quantitative assessment of the ratio of xylem area : phloem area obtained from multiple measurements (biological replicates, n > 10) of secondary thickened hypocotyl cross-sections obtained as already described. (l) Fibre quality analysis of secondary thickened hypocotyls from plants after 10 wk of growth with continuous removal of the primary and secondary inflorescence under 12 h light : 12 h dark conditions. Results are shown as the ratio of length to diameter to reflect particular cell types. Length : diameter (L : D) ratios of 10 indicate vessels, of 17.5 indicate fibres and of 20 indicate cambial cells. Bars represent ± SE. *Significantly different from WT (P < 0.05). Data from experiments performed in triplicate with 5–20 seedlings per genotype per experiment, depending on the nature of the experiment. (f-j) Bars, 50 μm.


2.5 Conclusion

These findings suggest that AtMYB61 functions as a pleiotropic regulator of carbon

acquisition and allocation of the plant via a small gene network. Three direct

downstream targets of AtMYB61 were predicted based on comparative transcriptome

analyses between microarrays that examined changes in gene expression that were

modulated by differences in AtMYB61 activity and sugar, and those that examined the

co-expression of AtMYB61 across plant development and in different organs. These

predicted direct downstream targets of AtMYB61 are: a KNOTTED1-like transcription

factor (KNAT7, At1g62990); a caffeoyl-CoA 3-O-methyltransferase (CCoAOMT7,

At4g26220), and a pectin-methylesterase (PME, At2g45220). AtMYB61 bound the

putative downstream targets’ promoter regions in an AC-motif-dependent fashion.

Expression of AtMYB61 protein in yeast was sufficient to drive the transactivation of a

reporter gene comprising a tandem repeat of an AC element fused to a yeast minimal

promoter, upstream of the reporter lac-Z. Together, these results suggest that

AtMYB61 binds to promoter regions of downstream targets to modulate transcription and

thereby regulate the allocation of carbon to non-recoverable sinks when conditions are

favourable to do so.

2.6 Acknowledgements

We are most grateful to Astrid Patzlaff, Christine Surman and Joan Ouellette for

excellent technical assistance. This work was generously supported by funding from

the Natural Science and Engineering Research Council of Canada (NSERC) and the

Canada Foundation for Innovation (CFI) to S.D.M., by a Canadian Graduate

Scholarship (CGSD) from NSERC awarded to M.B.P. and O.W., by an NSERC

Discovery Grant and the NSERC Green Crops Network to C.J.D., and by funding from

the University of Toronto, CFI and NSERC to M.M.C. Research infrastructure was

provided by the Centre for Analysis of Genome Evolution and Function at the University

of Toronto.


Chapter 3

Interactions between the R2R3-MYB transcription factor, AtMYB61, and target DNA binding sites

This chapter is the equivalent of the following submitted manuscript in its entirety:

Prouse M.B., and Campbell M.M. (2013) Interactions between the R2R3-MYB

transcription factor, AtMYB61, and target DNA binding sites. PLOS ONE. 8(5): e65132.

Contributions: MBP, MMC designed research; MBP, MMC analyzed data; MBP, MMC

wrote and edited manuscript.

MBP contributed specifically to each figure and table in this chapter.

Copyright: The material in this chapter is copyrighted by PLOS.


3 Interactions between the R2R3-MYB Transcription Factor, AtMYB61, and Target DNA Binding Sites

3.1 Abstract

Despite the prominent roles played by R2R3-MYB transcription factors in the regulation

of plant gene expression, little is known about the details of how these proteins interact

with their DNA targets. For example, while Arabidopsis thaliana R2R3-MYB protein

AtMYB61 is known to alter transcript abundance of a specific set of target genes, little is

known about the specific DNA sequences to which AtMYB61 binds. To address this

gap in knowledge, DNA sequences bound by AtMYB61 were identified using cyclic

amplification and selection of targets (CASTing). The DNA targets identified using this

approach corresponded to AC elements, sequences enriched in adenosine and

cytosine nucleotides. The preferred target sequence that bound with the greatest

affinity to AtMYB61 recombinant protein was ACCTAC, the AC-I element. Mutational

analyses based on the AC-I element showed that ACC nucleotides in the AC-I element

served as the core recognition motif, critical for AtMYB61 binding. Molecular modelling

predicted interactions between AtMYB61 amino acid residues and corresponding

nucleotides in the DNA targets. The affinity between AtMYB61 and specific target DNA

sequences did not correlate with AtMYB61-driven transcriptional activation with each of

the target sequences. CASTing-selected motifs were found in the regulatory regions of

genes previously shown to be regulated by AtMYB61. Taken together, these findings

are consistent with the hypothesis that AtMYB61 regulates transcription from specific

cis-acting AC elements in vivo. The results shed light on the specifics of DNA binding

by an important family of plant-specific transcriptional regulators.

3.2 Introduction

Much of plant growth and development is shaped by sequence-specific transcription

factors, proteins that act in response to external and internal cues to modulate gene

expression. The MYB family is the largest family of plant sequence-specific

transcription factors, with greater than 100 family members in individual plant species


(Martin and Paz-Ares, 1997; Arabidopsis Genome, 2000; Riechmann et al., 2000;

Stracke et al., 2001; Dubos et al., 2010). MYB transcription factors are recognised by

the presence of the MYB domain, which comprises characteristic helix-helix-turn-helix

repeats of approximately 50 amino acids. The MYB domain binds DNA in a sequence-

specific manner and is highly conserved in yeast, vertebrates, and plants (Rosinski and

Atchley, 1998). The MYB domain is normally found near the amino terminus of the

protein, and generally contains either 1, 2, or 3 of the 50 amino-acid MYB repeat.

R2R3-MYB proteins have two such repeats, and comprise the largest sub-family of the

plant and animal MYB family. Moreover, R2R3-MYB proteins are plant specific,

regulating facets of plant growth, development and metabolism (Lipsick, 1996; Martin

and Paz-Ares, 1997; Glover et al., 1998; Jin and Martin, 1999; Stracke et al., 2001;

Martin et al., 2002; Patzlaff et al., 2003a; Gomez-Maldonado et al., 2004; Newman et

al., 2004; Liang et al., 2005).

While members of the R2R3-MYB family are being characterised in increasing

numbers, these investigations largely focus on the involvement of a particular MYB in

the manifestation of a specific plant phenotype. That is, most of these analyses do not

extend to a more detailed examination of MYB function at the molecular level.

Nevertheless, some general themes with respect to R2R3-MYB function at the

molecular level are emerging (Prouse and Campbell, 2012). For example, many R2R3-

MYB transcription factors bind to DNA motifs that are enriched in adenosine (A) and

cytosine (C) residues (Patzlaff et al., 2003b; Gomez-Maldonado et al., 2004), where

guanine (G) residues are either absent or depleted (Hatton et al., 1995; Prouse and

Campbell, 2012). These motifs have been variously referred to as AC elements, H

boxes, or PAL boxes (Lois et al., 1989; Joos and Hahlbrock, 1992; Leyva et al., 1992;

Hauffe et al., 1993; Hatton et al., 1995; Logemann et al., 1995; Bell-Lelong et al., 1997;

Seguin et al., 1997; Lacombe et al., 2000; Lauvergeat et al., 2002). Some R2R3-MYB

proteins function as transcriptional activators at these sites (Patzlaff et al., 2003a;

Patzlaff et al., 2003b), while others function as transcriptional repressors (Jin et al.,

2000). AC elements are relatively short, comprising 5 or 6 nucleotides, where 3

residues form a relatively invariant core (Ogata et al., 1993; Ogata et al., 1994; Ogata et

al., 1995). R2R3-MYB proteins bind to AC elements in a manner that relies on specific


amino acid residues in the R2R3-MYB domain (Ogata et al., 1993; Ogata et al., 1994;

Ogata et al., 1995; Tahirov et al., 2001; Tahirov et al., 2002). To date, the details of

such interactions have been relatively scant, aside from their putative involvement in the

regulation of plant-specific gene expression.

AtMYB61, a member of the Arabidopsis thaliana R2R3-MYB family of transcription

factors, illustrates the involvement of R2R3-MYB family members in the regulation of

plant-specific processes. AtMYB61 is a pleiotropic regulator of three major facets of the

plant transpiration system: xylem cell differentiation; lateral root outgrowth; and,

stomatal aperture (Liang et al., 2005; Romano et al., 2012). AtMYB61 modifies gene

expression in response to diurnal cues so as to appropriately modify the aperture of

stomata (Liang et al., 2005), the pore-like structures on leaf surfaces that enable gas

exchange. Thus, AtMYB61 plays a role in modifying the capacity to take up carbon

dioxide for photosynthesis, while limiting the loss of water from the plant body.

AtMYB61 also alters gene expression in response to sugars, resulting in modification of

plant architecture and cell wall structure (Penfield et al., 2001; Newman et al., 2004;

Dubos et al., 2005). As is the case for most R2R3-MYB transcription factors, the

precise mechanisms that enable AtMYB61 to bring about important changes in plant

function are unknown. Furthermore, although AtMYB61 has been shown to bind to

certain consensus motifs (Romano et al., 2012), the preferred binding of AtMYB61 has

not yet been determined quantitatively.

Given that R2R3-MYB proteins are involved in a rich variety of plant-specific processes

(Dubos et al., 2010), it would be desirable to have a more detailed understanding of

R2R3-MYB and DNA motif interactions. The work described herein focuses on the

interplay between AtMYB61 and its DNA target sequences. Cyclic amplification and

selection of targets (CASTing), which enables identification of a transcription factor’s

DNA-binding sites from a pool of random oligonucleotides, was used to identify target

DNA-binding sites for AtMYB61 (Wright et al., 1991). The sequences identified served

as a useful foundation to examine mechanisms responsible for AtMYB61 sequence-specific

binding, and to develop hypotheses about the roles these may play in shaping AtMYB61

function in vivo.


3.3 Materials and Methods

3.3.1 Ethics Statement

Antibody generation was carried out in strict accordance with the Province of Ontario’s

Animals for Research Act, and the requirements of the federal Canadian Council on

Animal Care. The protocol was approved at the University of Toronto, which involved

full committee review by the Local Animal Care Committee (LACC), followed by

approval by the University of Toronto Office of Research Ethics, the University

Veterinarian, and finally the University of Toronto Animal Care Committee (UACC)

(Permit Number: 20007080, approved 14/01/08). All efforts were made to minimise

suffering.

3.3.2 Expression of Recombinant Protein in Bacteria

Recombinant AtMYB61 protein was produced in E. coli using the coding sequence

cloned in frame into the NdeI and BamHI sites of the pET15b vector (Novagen).

Recombinant AtMYB61 protein was produced, extracted and affinity purified as

described previously for pine MYB proteins (Patzlaff et al., 2003b).

3.3.3 Antibody Production and Western Blot Analysis

Anti-AtMYB61 polyclonal antibodies were produced against the recombinant fusion

protein in rabbits as described previously (Harlow, 1988). Affinity-purified recombinant

antigen was gel-purified on a 10% SDS-PAGE gel and shipped in phosphate buffered

saline to University of Toronto BioScience Support Laboratories for antibody production.

In brief, 2 rabbits were each injected a total of 4 times with 300 μg of antigen per

injection over a 6 week period. Production bleeds were performed after nitrocellulose

dot blot assays indicated acceptable titre.

For western blot analysis, total soluble protein extracts were separated by SDS-PAGE

and transferred to Bio-Rad Laboratories Nitrocellulose Trans-Blot Transfer Medium

(0.45µm) by electrophoretic transfer (BioRad, Mississauga, ON, Canada).

Chemiluminescent western blot analysis was performed on the filters with Invitrogen’s


Western Breeze Chemiluminescent kit as described by the manufacturer (Invitrogen,

Burlington, ON, Canada). The primary antibody was used at a final dilution of

1:20,000.

3.3.4 Cyclic Amplification and Selection of Targets (CASTing)

The CASTing assay was completed according to Wright et al. (Wright et al., 1991).

CASTing was completed by incubating 15 μg of double stranded random

oligonucleotides (27-mers) flanked by two constant priming sequences with the

AtMYB61 full length recombinant protein. This complex was added to a Protein G

Dynabead (Invitrogen, Burlington, ON, Canada) plus post-injection AtMYB61 antibody

complex, causing the complex to immunoprecipitate. The immunoprecipitated complex

was then washed 3 times, resuspended in 100 μL PCR buffer, boiled and then PCR

amplified for 30 cycles with 15 pmol of forward and reverse primers. Of the amplified

selected targets, 10 μL were kept for analysis and 90 μL were used to continue with

the next cycle. This cycle was repeated four more times to select for AtMYB61

consensus DNA target sequences. The selected targets were then cloned into

Invitrogen’s pCR4 TOPO vector and sequenced (Invitrogen, Burlington, ON, Canada).

3.3.5 Nitrocellulose Filter-Binding Assay

The nitrocellulose filter-binding assay was conducted as described by Hall and Kranz

(Hall and Kranz, 2008). The CASTing targets that were over-represented were ordered

from Invitrogen and PCR amplified (Invitrogen, Burlington, ON, Canada). These PCR

products were purified by Qiagen nucleotide purification according to the manufacturer’s

instructions (Qiagen, Toronto, ON, Canada). The purified PCR products were then radioactively

labelled with 32P via primer extension and purified again by the same Qiagen

procedure. The counts per minute (CPM) were

measured with a liquid scintillation counter to determine the incorporation of 32P into the

probe. The radioactively labelled probes were combined in a binding reaction with

recombinant AtMYB61 protein and passed through BioRad nitrocellulose filters (0.2µm)

(BioRad, Mississauga, ON, Canada). The relative binding of recombinant AtMYB61

protein to the CASTing motifs and mutated AC-I sequences was recorded. The

dissociation constants (Kd) of the CASTing targets for AtMYB61 were determined using the

GRAFIT program, which linearised the nonlinear regression via Scatchard plots to

calculate the ligand concentration at which half of the binding sites of AtMYB61 were occupied.
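
The Scatchard linearisation used to estimate Kd can be illustrated with a minimal sketch. It assumes a simple one-site binding model with protein in excess over labelled probe, so that the fraction of probe bound is θ = [P]/(Kd + [P]); plotting θ/[P] against θ then gives a straight line of slope -1/Kd. The function names and the titration values are hypothetical, the data are generated from the model rather than taken from the experiments, and the code is not the GRAFIT procedure itself.

```python
# Illustrative sketch only: one-site binding model, synthetic titration data.

def fraction_bound(protein_conc, kd):
    """Fraction of probe bound for a one-site model with protein in excess."""
    return protein_conc / (kd + protein_conc)


def scatchard_kd(protein_concs, fractions):
    """Estimate Kd from a Scatchard plot (theta/[P] versus theta).

    For the one-site model, theta/[P] = 1/Kd - theta/Kd, so the
    least-squares slope of the fitted line equals -1/Kd.
    """
    xs = list(fractions)
    ys = [theta / p for theta, p in zip(fractions, protein_concs)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -1.0 / slope


# Hypothetical protein titration (mol/L) and noise-free model output.
ASSUMED_KD = 9.12e-9
concs = [1e-9, 3e-9, 1e-8, 3e-8, 1e-7, 3e-7]
thetas = [fraction_bound(p, ASSUMED_KD) for p in concs]

estimated_kd = scatchard_kd(concs, thetas)
print(f"Estimated Kd: {estimated_kd:.3e} M")
```

With real, noisy measurements the Scatchard transformation distorts the error structure, so a direct nonlinear fit of the binding curve is often used as a cross-check.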

3.3.6 Electrophoretic Mobility Shift Assay (EMSA)

Recombinant AtMYB61 protein was produced, extracted and affinity purified as

described previously for pine MYB proteins (Patzlaff et al., 2003b). EMSA conditions

were exactly as described previously (Patzlaff et al., 2003b; Gomez-Maldonado et al.,

2004) but using recombinant AtMYB61 protein in place of pine MYB protein.

3.3.7 Molecular Modelling

The tertiary structure of AtMYB61 was predicted using the tool Protein

Homology/analogY Recognition Engine (PHYRE) (McDonnell et al., 2006;

www.sbg.bio.ic.ac.uk/phyre/html/index.html). PHYRE proposed that the resolved

structure that shared the most homology to AtMYB61 was the animal c-MYB DNA-

binding domain, which was resolved previously with its DNA consensus motif (AACNG)

by heteronuclear multidimensional NMR (Ogata et al., 1994). This solution structure

was used to predict a 3D protein model of AtMYB61 with an E-value of 3.8e-13 and an

estimated precision of 100%. The two protein sequences were 44% alike using amino

acid sequence alignment. The PDB (Protein Data Bank) file recovered from the PHYRE

analysis (PDB ID = c1msfC) was used to superimpose the predicted AtMYB61 structure

with the c-MYB structure using DaliLite (Holm and Park, 2000). The c-MYB protein was

resolved along with its DNA binding sequence allowing one to predict the binding

domain of AtMYB61 using homology. The PDB files for the AC-I and NBS nucleotide

motifs were created using the make-na server (http://structure.usc.edu/make-na/server.html). Using

Pymol (Seeliger and de Groot, 2010) the two structures were modelled and

superimposed (DeLano, 2002). Polar interactions were determined using Pymol.

3.3.8 Transcriptional Activation Assay

Transcriptional activation assays using yeast were as described previously (Patzlaff et

al., 2003b), but substituting the AtMYB61 coding sequence in place of pine MYB


sequences. Transcriptional activation assays were conducted with three biologically

independent replicates per condition.

3.4 Results and Discussion

3.4.1 AtMYB61 Bound a Discrete Subset of DNA Target Sequences

To generate an antibody of adequate specificity for the cyclic amplification and selection

of targets (CASTing) assay, antibodies were raised against a non-conserved region in

the AtMYB61 C-terminus (Fig. S3.1). CASTing was initiated with a pool of 63-base-pair

double-stranded oligonucleotides, where each oligonucleotide consisted of a segment

of 27 random nucleotides flanked by designed sequences for PCR priming. A 15 μg

(2.21 × 10¹⁴ DNA molecules) pool of “randomers” was incubated with AtMYB61 full-length

recombinant protein (Fig. 3.1a). Assuming the average protein-binding site is a

hexamer, the 27-bp degenerate core of each double-stranded oligomer contained 21

possible positions. Therefore, in the initial round of CASTing, 21 × 10¹⁴ unique sites

were available for binding.
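
The size of the starting pool can be checked with a short calculation. The sketch below assumes an average mass of 650 g/mol per base pair of double-stranded DNA, a common approximation that is not stated in the text; the function name is illustrative.

```python
# Back-of-envelope check of the number of molecules in the starting pool.
AVOGADRO = 6.02214076e23          # molecules per mole
ASSUMED_G_PER_MOL_PER_BP = 650.0  # assumed average mass of one base pair


def dsdna_molecules(mass_g, length_bp, g_per_mol_per_bp=ASSUMED_G_PER_MOL_PER_BP):
    """Number of double-stranded DNA molecules in a given mass."""
    molar_mass = length_bp * g_per_mol_per_bp  # g/mol for the whole duplex
    return mass_g / molar_mass * AVOGADRO


# 15 micrograms of 63-bp duplex, as used in the first CASTing round.
pool = dsdna_molecules(mass_g=15e-6, length_bp=63)
print(f"{pool:.2e} molecules")
```

Under this assumption the result is close to 2.2 × 10¹⁴ molecules, in line with the pool size quoted above.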

Five CASTing cycles were undertaken to enrich the pool of oligonucleotides in DNA

binding-sites bound by AtMYB61. The enriched oligonucleotides were cloned into

pCR4 TOPO (Invitrogen, Burlington, ON, Canada) and sequenced. Following

enrichment, 89 CASTing-derived oligonucleotides were sequenced. Sequences were

subjected to analysis to discover over-represented motifs using MEME (Multiple EM for

Motif Elicitation) (Bailey et al., 2006) (Table 3.1, Table 3.2, Fig. 3.1b). MEME was run

with minimum and maximum motif widths of 6, any number of repetitions

of a single motif distributed among the sequences, and no restrictions on the number of

motifs identified. Following MEME analysis, all CASTing-enriched sequences contained

over-represented motifs characterised by an abundance of adenosine and cytosine

residues. These over-represented motifs had a conserved set of ACC nucleotides

present at the beginning of the motifs, suggesting that these nucleotides may be

essential for recognition and binding (Table 3.1, Table 3.2, Fig. 3.1b). These motifs

correspond to canonical AC elements, also known as H-boxes or PAL-boxes (Table 3.1,

Table 3.2, Fig. 3.1b).
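
A search for the over-represented hexamers can be sketched as a simple pattern scan. The example below is not MEME; it only looks for the degenerate consensus ACC[A/T][A/C/T][A/C/T] summarised in Table 3.2 on both strands of a sequence, using one of the CASTing-derived sequences from Table 3.1 (group 16) as input. The function names are illustrative.

```python
import re

# Degenerate consensus from Table 3.2: A C C W H H (W = A/T, H = A/T/C).
# The lookahead lets overlapping matches be reported as well.
CONSENSUS = re.compile(r"(?=(ACC[AT][ACT][ACT]))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_ac_elements(seq):
    """Return (strand, 0-based start, hexamer) for every consensus match.

    Start positions on the '-' strand are given in the coordinates of the
    reverse-complemented sequence.
    """
    seq = seq.upper()
    hits = [("+", m.start(), m.group(1)) for m in CONSENSUS.finditer(seq)]
    rc = reverse_complement(seq)
    hits += [("-", m.start(), m.group(1)) for m in CONSENSUS.finditer(rc)]
    return hits


# Group 16 of Table 3.1, written without the spaces around the motif.
casting_seq_16 = "GACACAAGACACACCTACACCCCCCCC"
for strand, start, hexamer in find_ac_elements(casting_seq_16):
    print(strand, start, hexamer)
```

The same scan could be applied to promoter sequences, although a match to the consensus does not by itself show that the site is bound.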


Figure 3.1. Cyclic amplification and selection of targets (CASTing) recovered a suite of hexamer target sequences that bound to AtMYB61. (a) 27-bp random sequences flanked by two primer sites (63 bp in total) were used in the CASTing assay. (b) Sequence logo of CASTing targets discovered by MEME. The ACC motif was conserved among all target sequences. Two nucleotides upstream and downstream of the over-represented hexamer target sequences were included to analyse whether the over-represented motifs could be extended beyond a hexameric sequence.


Table 3.1. Alignment of AtMYB61 binding sites obtained from CASTing Assay

Seven hexamer targets were determined to be over-represented by MEME (Multiple EM for Motif Elicitation).

Group AtMYB61 Site

ACCACC

1 ACCCCAGAGTCCC ACCACC CGACCCCC

2 ACCCAAACACCACGCCCTAG ACCACC C

3 GCTAAACGTTCATTCCCCT ACCACC CC

4 A ACCACC TCAACAAACCCCGGCCGCCC

5 ACCAC ACCACC ACCCACCCCCCCCCCC

6 G ACCACC CTCCAACCTATACCGGCCCC

7 CCAAACTCGACCGTTCCCGC ACCACC C

8 GCACCCC ACCACC ACCATACCTACCCC

9 ACCCGATCAGGCCCTCC ACCACC CCCC

10 CCACACCCCACCCCGAACG ACCACC GC

11 ACCAACGGACTAGCTCCCAC ACCACC C

12 C ACCACC CCACCATACAATCCCTAGGC

13 ACCAC ACCACC ACCCCACCCTAGGACC

14 ACCACC ACTACCCGGACCCGGCCCCCC

15 ACACGAGATAACGACCCG ACCACC CCC

ACCTAC

16 GACACAAGACAC ACCTAC ACCCCCCCC

17 GCAGCCC ACCTAC ACTCCCGCTCCCCC

18 GCACCCCACCACCACCAT ACCTAC CCC

19 ACCCCCCCTAATTG ACCTAC GGCAGGC

20 CAG ACCTAC CCCCGCCCCCAACCCGCC

21 CACCCACCGTCCAACG ACCTAC ACCCC

22 GCGCACCCCACCCCCC ACCTAC GGCCC

ACCACA

23 ACCACA ATGCAGCCGTACTTCGACCCC

24 ACCACA CCACCACCCACCCCCCCCCCC

25 A ACCACA TCAACAAACCCCGGCCGCCC

26 CAACCCCTCCA ACCACA CCTCCCCGCC

27 CC ACCACA CTCTGCATTCTTGACCGCC

ACCATA

28 GGGTAATGTC ACCATA GCCCCCCCCCC

29 GCACCCCACCACC ACCATA CCTACCCC

30 CA ACCATA CACAACGCCCCGACCCCCC

31 CACCACCCC ACCATA CAATCCCTAGGC

32 CAGGCACCCCCAACCCCCC ACCATA CC

ACCAAT

33 AAAGGGTATACACAGGT ACCAAT GGCC

34 AACCTTAGGG ACCAAT CAATAAGGGAC

35 ACCAAT GAAGAGACCCCTAACCATTAC

36 ATGTGTAG ACCAAT GGCATAATCTGCA

37 GTCGAGTCG ACCAAT GCAGCACGCAGC

ACCAAC

38 CAG ACCAAC CTCATACCCCCCCCTGCC

39 CC ACCAAC CCTCCCTCCCAATGCCCGC

40 ACCAAC GGACTAGCTCCCACACCACCC

41 AACATGCTGTGCAACCAA ACCAAC ACC

ACCAAA

42 ACCAAA AGATCAACCCCCCCCCGTACC

43 AACATGCTGTGCA ACCAAA CCAACGCC

44 ACACATAAACAGCA ACCAAA CCAGCCC

45 AACATGCTGTGCA ACCAAA CCAACACC


Table 3.2. AtMYB61 consensus sequence was derived from a comparison of 89 sequences recovered from 5 cycles of CASTing. The number of occurrences of each base at each position of the hexameric sequence is provided. -/+ indicate the bases 5' or 3' of the hexameric consensus sequence. The counts for the bases 5' or 3' of the hexameric consensus sequence do not add up to 45 in some cases because primer sites were excluded from the analysis. W corresponds to A/T, H corresponds to A/T/C, – corresponds to a zero value.

        -2   -1    A    C    C    W    H    H   +1   +2
G        3   11    –    –    –    –    –    –    9    7
A       10    8   45    –    –   38   20   14   10    4
T        2    3    –    –    –    7    5    5    2    4
C       20   17    –   45   45    –   20   26   24   27
Total             45   45   45   45   45   45
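
The degenerate consensus in Table 3.2 can be reproduced from the base counts. The sketch below takes the counts for the six hexamer positions, checks that each column sums to the 45 aligned sites, and assigns the IUPAC code for the set of bases observed at each position. Treating any non-zero count as observed is an assumption made for illustration, and the function names are hypothetical.

```python
# Base counts at the six hexamer positions, transcribed from Table 3.2
# (a dash in the table is entered as zero).
COUNTS = {
    "G": [0, 0, 0, 0, 0, 0],
    "A": [45, 0, 0, 38, 20, 14],
    "T": [0, 0, 0, 7, 5, 5],
    "C": [0, 45, 45, 0, 20, 26],
}

# IUPAC nucleotide codes keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def column_totals(counts):
    """Sum the counts of all four bases at each position."""
    n_positions = len(next(iter(counts.values())))
    return [sum(col[i] for col in counts.values()) for i in range(n_positions)]


def consensus(counts):
    """IUPAC consensus built from every base with a non-zero count."""
    n_positions = len(next(iter(counts.values())))
    codes = []
    for i in range(n_positions):
        observed = frozenset(base for base, col in counts.items() if col[i] > 0)
        codes.append(IUPAC[observed])
    return "".join(codes)


print(column_totals(COUNTS))
print(consensus(COUNTS))
```

A stricter rule, such as a minimum frequency per base, would give a narrower consensus at the last three positions; the presence-based rule is used here because it matches the W and H codes given in the table.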


AC elements, also known as PAL boxes or H-boxes, play key roles in regulating

transcription for a variety of genes, particularly those encoding enzymes implicated in

phenylpropanoid metabolism (Lois et al., 1989; Joos and Hahlbrock, 1992; Leyva et al.,

1992; Hauffe et al., 1993; Hatton et al., 1995; Logemann et al., 1995; Bell-Lelong et al.,

1997; Seguin et al., 1997; Lacombe et al., 2000; Lauvergeat et al., 2002). R2R3-MYB

proteins are known to bind AC elements and activate transcription from these motifs in

yeast and in planta (Prouse and Campbell, 2012). For example, pine (Pinus taeda)

MYB1 (Patzlaff et al., 2003a) and MYB4 (Patzlaff et al., 2003b) and eucalyptus

(Eucalyptus grandis) MYB2 (Goicoechea et al., 2005), were all able to bind to AC

elements present in the promoters of lignin biosynthetic genes. Similarly, pine (Pinus

taeda) MYB1 and MYB4 bound AC elements present in the gene regulatory sequences

of a pine gene encoding GLUTAMINE SYNTHETASE1b (GS1b) (Gomez-Maldonado et

al., 2004). R2R3-MYB binding to AC elements is predicted to play a role in dictating

xylem-localised expression of the aforementioned genes (Patzlaff et al., 2003a; Patzlaff

et al., 2003b; Gomez-Maldonado et al., 2004; Goicoechea et al., 2005). Given the

xylem-localised expression of AtMYB61 (Romano et al., 2012), it is likely that it

functions in an equivalent manner to drive AC-element-mediated expression in

Arabidopsis thaliana.

3.4.2 AtMYB61 Bound to DNA Target Sequences with Varying Degrees of

Affinity

The relative binding affinities of recombinant AtMYB61 protein to the CASTing-derived

sequences were determined (Table S3.1). Dissociation constants for each CASTing

target were calculated with the GRAFIT software program using Scatchard plots (Table

3.3). The CASTing target that bound with the highest affinity (Kd = 9.12E-09 M) was ACCTAC

(AC-I) (Table 3.3). Since the AC-I motif was the preferred target of AtMYB61, a

mutational assay was conducted on this motif to examine which nucleotides were

essential for binding (Table 3.4). A guanine was substituted for one nucleotide at a

time, at each successive position along the motif. A nitrocellulose filter-binding assay was used to

calculate the Kds of the mutated AC-I motifs (Table 3.4). Binding diminished when a

mutation was present in the first three nucleotides of the AC-I motif (Kd>5.00E-06 M);


Table 3.3. Dissociation constants (Kd) in mol/L and associated errors of CASTing targets. Relative binding affinities of the CASTing targets to AtMYB61 were determined by a nitrocellulose filter-binding assay. The relative binding affinities were used to determine the dissociation constants of the CASTing targets using the GRAFIT program, which linearised the nonlinear regression via Scatchard plots to calculate the ligand concentration at which half of the binding sites of AtMYB61 are occupied. ACCTAC bound with the greatest affinity to AtMYB61. The non-binding site (NBS) did not bind to recombinant AtMYB61.

Sequence   Kd (mol/L)   Error (mol/L)
ACCTAC     9.12E-09     3.11E-09
ACCAAT     1.21E-08     3.42E-09
ACCAAA     1.68E-08     4.07E-09
ACCATA     1.83E-08     5.06E-09
ACCAAC     7.37E-08     1.53E-08
ACCACA     8.08E-08     6.93E-09
ACCACC     6.90E-07     2.27E-08
NBS        >5.00E-06


Table 3.4. Dissociation constants (Kd) in mol/L and associated errors of mutated ACCTAC (AC-I element) sequences. A guanine was substituted for one nucleotide at a time, at each successive position along the AC-I motif. Relative binding affinities of the mutated AC-I elements to AtMYB61 were determined by a nitrocellulose filter-binding assay. The relative binding affinities were used to determine the dissociation constants of the mutated sequences using the GRAFIT program, which linearised the nonlinear regression via Scatchard plots to calculate the ligand concentration at which half of the binding sites of AtMYB61 are occupied. The position of the substituted guanine is given for each mutated sequence.

Sequence   Substituted position   Kd (mol/L)   Error (mol/L)
ACCTAC     none                   9.12E-09     3.11E-09
GCCTAC     1                      >5.00E-06
AGCTAC     2                      >5.00E-06
ACGTAC     3                      >5.00E-06
ACCGAC     4                      7.19E-07     2.12E-07
ACCTGC     5                      7.97E-08     1.83E-08
ACCTAG     6                      5.60E-08     5.09E-09


however, when a mutation was present in the last three nucleotides of the AC-I motif,

binding was reduced but not completely abolished (Table 3.4). The relative binding

affinities of recombinant AtMYB61 protein to CASTing targets and mutated motifs were

validated by EMSAs (Fig. 3.2). EMSAs were conducted at a protein concentration of

5 × 10⁻⁸ M because this was the protein concentration at which not all the targets

had reached their binding maximum as determined by nitrocellulose filter-binding assay (Fig. 3.2,

Table S3.1). This enabled detection of differential binding via EMSAs.
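
Why this protein concentration discriminates between targets can be illustrated with the dissociation constants in Table 3.3. The sketch assumes a simple one-site model with protein in excess over probe, in which the fraction of probe bound is [P]/(Kd + [P]); this model and the names used are assumptions for illustration rather than the analysis performed here.

```python
# Dissociation constants (mol/L) transcribed from Table 3.3.
KD_M = {
    "ACCTAC": 9.12e-9,
    "ACCAAT": 1.21e-8,
    "ACCAAA": 1.68e-8,
    "ACCATA": 1.83e-8,
    "ACCAAC": 7.37e-8,
    "ACCACA": 8.08e-8,
    "ACCACC": 6.90e-7,
}

PROTEIN_M = 5e-8  # protein concentration used in the EMSAs (mol/L)


def fraction_bound(protein_conc, kd):
    """Predicted fraction of probe bound under a one-site model."""
    return protein_conc / (protein_conc + kd)


occupancy = {site: fraction_bound(PROTEIN_M, kd) for site, kd in KD_M.items()}

# List targets from highest to lowest predicted occupancy.
for site, theta in sorted(occupancy.items(), key=lambda item: -item[1]):
    print(f"{site}: {theta:.2f}")
```

Under these assumptions the predicted occupancy ranges from roughly 0.85 for ACCTAC to below 0.1 for ACCACC, so the targets are spread across the binding curve rather than all being saturated.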

AtMYB61 bound its preferred target AC-I (ACCTAC) with a dissociation constant of 9.12E-09

M (Table 3.3), which is similar to the tight binding of the vertebrate c-MYB R2R3 domain

to the MYB binding site ((T/C)AAC(G/T)G(A/C/T)(A/C/T)) (dissociation constant = 1.5E-09

M ± 28%) (Tanikawa et al., 1993; Ebneth et al., 1994). Tanikawa et al. found that AACG

nucleotides in the c-MYB binding site were critical for binding (Tanikawa et al., 1993).

The second adenosine, fourth cytosine, and sixth guanine were particularly important in

determining binding specificity. If any of these core nucleotides were mutated, binding

affinity decreased by greater than 500-fold. The third adenosine was not as crucial; if it

was mutated, binding affinity decreased by up to 15-fold. Consistent with

this, AtMYB61 had a set of core recognition nucleotides – ACC – that could not be

mutated without abolishing binding (Fig. 3.2b, Table 3.4). Moreover, mutation of the

latter half of the binding site, occurring at residues TAC, reduced binding but did not

abolish it completely.
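
The effect of each guanine substitution can be expressed as a fold change in Kd relative to the unmutated AC-I element, using the values in Table 3.4. In the sketch below the names are illustrative, and sequences reported only as Kd > 5.00E-06 M are treated as lower bounds, so the fold change for those sequences is also a lower bound.

```python
WILD_TYPE_KD = 9.12e-9  # ACCTAC (AC-I), mol/L, from Table 3.4

# Mutated sequence -> (Kd in mol/L, True if the value is only a lower bound)
MUTANT_KD = {
    "GCCTAC": (5.00e-6, True),
    "AGCTAC": (5.00e-6, True),
    "ACGTAC": (5.00e-6, True),
    "ACCGAC": (7.19e-7, False),
    "ACCTGC": (7.97e-8, False),
    "ACCTAG": (5.60e-8, False),
}


def fold_change(kd, reference=WILD_TYPE_KD):
    """Ratio of a mutant Kd to the reference Kd (higher means weaker binding)."""
    return kd / reference


for seq, (kd, is_lower_bound) in MUTANT_KD.items():
    prefix = ">" if is_lower_bound else ""
    print(f"{seq}: {prefix}{fold_change(kd):.0f}-fold higher Kd than ACCTAC")
```

On this scale, substitutions in the ACC core raise Kd by more than 500-fold, whereas substitutions in the TAC half raise it by between about 6-fold and 80-fold.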

3.4.3 The Affinity of AtMYB61 to Specific Target DNA Sequences Was

Predicted by Molecular Interactions Determined in silico

Computational analysis of the 3-dimensional structure of the N-terminal DNA-binding

region of AtMYB61 was conducted in order to validate the role of this domain in

sequence-specific binding. Previously, the structure of the N-terminal DNA-binding

domain of animal c-MYB bound to its DNA consensus motif (AACNG) was solved by

heteronuclear multidimensional NMR (Ogata et al., 1994). The animal c-MYB DNA-binding

region contains a conserved R2R3-MYB domain that exhibits high similarity to plant

R2R3-MYB DNA binding domains. This NMR structure was used as a template to

model the structure of AtMYB61. The AC-I (ACCTAC) and NBS (GAGACC) nucleotide


Figure 3.2. Relative binding affinities of AtMYB61 to CASTing targets and to mutated ACCTAC motifs determined by nitrocellulose filter-binding assays are confirmed by electrophoretic mobility shift assays (EMSAs). (a) EMSA of recombinant AtMYB61 protein binding to 6 labelled CASTing target sequences. The protein concentration used was 5 × 10⁻⁸ M, because at this concentration not all targets had reached their binding maximum as determined by nitrocellulose filter-binding assay, allowing differential binding to be observed. (b) EMSA validating relative binding affinities of AtMYB61 to mutated ACCTAC motifs. The protein concentration used was 5 × 10⁻⁸ M. Mutations were made by substituting a single guanine nucleotide at each position along the AC-I element. Black arrow indicates the gel shift of the probe. Non-binding site (NBS) is a sequence that does not bind AtMYB61, acting as a negative control. Probes were engineered for the EMSA reaction by inserting the hexamer CASTing sequence or mutated AC-I element sequence into the underlined area.


models were then docked into the predicted binding sites of the AtMYB61 model (Fig.

3.3).

Based on the model of AtMYB61, the molecular interactions shared between the

binding sites of AtMYB61 to its targets supported in vitro binding data (Fig. 3.3). For

example, there were more hydrogen bonds shared between AtMYB61 DNA-binding

domain and AC-I compared to NBS (Fig. 3.3b–d). Based on the model of AtMYB61

bound to AC-I, several specific intermolecular interactions are predicted to create

binding specificity. These include hydrogen bonds between the following residues:

asparagine-59 (R3 helix) hydrogen of AtMYB61 with adenosine-1 nitrogen of AC-I;

asparagine-106 (R3 helix) oxygen of AtMYB61 with adenosine-1 hydrogen of AC-I; asparagine-59

(R3 helix) oxygen of AtMYB61 with cytosine-2 hydrogen of AC-I; asparagine-102 (R3

helix) oxygen of AtMYB61 with cytosine-3 hydrogen of AC-I; arginine-56 (R2 helix)

oxygen of AtMYB61 with cytosine-3 hydrogen of AC-I; arginine-54 (R2 helix) hydrogen

of AtMYB61 with thymidine-4 oxygen of AC-I; and lysine-51 (R2 helix) hydrogen of AtMYB61 with

adenosine-5 nitrogen of AC-I (Fig. 3.3b, c). The leucine-55 (R2 helix) methyl group of

AtMYB61 is predicted to form a non-polar interaction with the thymidine-4 methyl group of AC-I.

Cytosine-6 remained unbound in the model (Fig. 3.3c). In comparison, the NBS model

had only one hydrogen bond present, involving asparagine-59 (R3 helix) oxygen of

AtMYB61 with adenosine-2 hydrogen of NBS (Fig. 3.3d).

3.4.4 The Affinity of AtMYB61 to Specific Target DNA Sequences Did Not

Correlate with AtMYB61-Driven Transcriptional Activation with

Each of the Target Sequences

Previous studies have shown that AtMYB61 protein is sufficient to drive transcription in

yeast from promoter sequences that contain AC elements (Romano et al., 2012).

Consequently, yeast transcriptional activation assays were used to determine the

relationship between AtMYB61 affinity to specific DNA sequences and its capacity to

drive transcription (Fig. 3.4). Reporter constructs comprised the coding sequence for

β-galactosidase under the control of the yeast minimal CYC1 promoter fused to triple

repeats of a given CASTing target or a mutated AC-I motif (Fig. 3.4). The minimal


Figure 3.3. Molecular modelling of AtMYB61 with target sequences confirms binding preferences determined by nitrocellulose filter-binding assays and EMSAs. (a) Pymol model of the ACCTAC motif docked into the binding site of AtMYB61. Molecular modelling was completed by using the online program PHYRE (Protein Homology/analogY Recognition Engine) to predict a structure of AtMYB61 using homology to the c-MYB DNA-binding domain. The PDB (Protein Data Bank) file recovered from the PHYRE analysis was used to superimpose the predicted AtMYB61 structure with the c-MYB structure using DaliLite. Using Pymol, the 3D sequence model (ACCTAC) was docked into the predicted binding site of AtMYB61. The AC-I element model is displayed in yellow, the loop secondary structure of the inferred AtMYB61 model is displayed in green, and the helix secondary structure of the inferred AtMYB61 model is displayed in red. (b) Model of the AtMYB61 binding site with the first three nucleotides (ACC) of the ACCTAC sequence indicates that these nucleotides are essential for binding. The AC-I (ACCTAC) nucleotide model was docked into the predicted binding site of AtMYB61. The specific hydrogen bonds between the amino acids of the AtMYB61 binding site and the ACC nucleotides of AC-I were predicted by Pymol and are listed as follows: asparagine-59 (R3 helix) hydrogen to adenosine-1 nitrogen; asparagine-106 (R3 helix) oxygen to adenosine-1 hydrogen; asparagine-59 (R3 helix) oxygen to cytosine-2 hydrogen; asparagine-102 (R3 helix) oxygen to cytosine-3 hydrogen; and arginine-56 (R2 helix) oxygen to cytosine-3 hydrogen. This confirms the binding data determined by the nitrocellulose filter-binding assay and EMSAs, reiterating that the ACC motif is the core recognition motif of AtMYB61. (c) Model of the AtMYB61 binding site with the TAC nucleotides of the ACCTAC sequence indicates that these nucleotides are less essential for binding. The AC-I (ACCTAC) nucleotide model was docked into the predicted binding site of AtMYB61. The molecular interactions between the amino acids of the AtMYB61 binding site and the TAC nucleotides of AC-I were analysed by Pymol and are listed as follows: the leucine-55 (R2 helix) methyl group was predicted to form a non-polar interaction with the thymidine-4 methyl group; the arginine-54 (R2 helix) hydrogen was predicted to form a hydrogen bond with the thymidine-4 oxygen; the lysine-51 (R2 helix) hydrogen was predicted to form a hydrogen bond with the adenosine-5 nitrogen; and cytosine-6 remained unbound in the model. (d) Model of the AtMYB61 binding site with the non-binding site (GAGACC) predicts that this motif is not recognised by AtMYB61. The non-binding site model was docked into the AtMYB61 binding site via Pymol and hydrogen bonding was analysed. Only one hydrogen bond was predicted, between the AtMYB61 asparagine-59 (R3 helix) oxygen and the non-binding site adenosine-2 hydrogen. Yellow dashed lines indicate hydrogen bonds identified by Pymol, and blue dashed lines indicate non-polar interactions.


Figure 3.4. AtMYB61-mediated activation of promoter activity in Saccharomyces cerevisiae in an AC-dependent fashion. (a) The sequence of the oligonucleotides cloned into the reporter vector using EcoRI and SalI sites. Each AC element or mutated ACI element is triplicated within the segment. (b) Schematic representations of the Effector (pYES2TRP::AtMYB61) and Reporter (pLacZi::AC) constructs used in this assay (CYC1: minimal yeast promoter). (c) Quantitative analysis of β-galactosidase activity in yeast after induction. The measurements in liquid assay were made from three independent biological replicates. Activation by AtMYB61 protein of artificial genes comprising a minimal CYC1 promoter fused to a tandem AC element or mutated ACI element upstream of the lacZ gene, upon growth of the yeast in galactose (light grey bars), gave rise to β-galactosidase activity that was significantly greater than that of the controls (each vector alone, or both together after growth on non-inducing glucose; dark grey bars), as determined by analysis of variance (P < 0.005). Error bars represent standard deviations. * indicates statistical significance, P < 0.005, determined by t-test. Underlined bases correspond to a substituted guanine.


CYC1 promoter is unable to support transcription, so reporter expression would be

contingent on the capacity of AtMYB61 to bind to the fused motifs, which would function

as gene regulatory sequences. The expression of AtMYB61 was under the control of

the galactose-inducible GAL1 promoter. As determined by the quantification of

β-galactosidase activity, when AtMYB61 protein was induced by galactose, the protein

was able to activate transcription from the CASTing target sequences but not from the

mutated AC-I elements (Fig. 3.4). The extent of transcriptional activation varied for

each CASTing target (Fig. 3.4c). Notably, CASTing target sequences ACCATA,

ACCAAT, and ACCAAA supported greater amounts of β-galactosidase induction

relative to the AC-I element, which bound with the greatest affinity to AtMYB61 (Fig.

3.4c).

Previously, R2R3-MYB proteins have been shown to bind to AC elements and activate

transcription in yeast and in planta; however, these studies did not correlate binding

affinity with ability to activate transcription (Jin et al., 2000; Patzlaff et al., 2003a;

Patzlaff et al., 2003b; Gomez-Maldonado et al., 2004). Yeast activation assays

determined that the affinity of AtMYB61 to specific target DNA sequences did not

correlate with AtMYB61-driven transcriptional activation with each of the target

sequences. This is consistent with results obtained using the glucocorticoid receptor

(GR), where no correlation between in vitro binding affinities and in vivo transcriptional

activities was observed (Meijsing et al., 2009). GR target sequences, differing by as

little as a single nucleotide, differentially affected GR DNA binding and transcriptional

activity, with no correlation between these parameters. Similarly, binding affinity of

AtMYB61 to specific target DNA sequences did not correlate with AtMYB61-driven

transcriptional activation with each of the target sequences. It may be that the conformation

of AtMYB61 changes when binding to a specific DNA sequence, altering its ability to

activate transcription.

3.4.5 CASTing Target Sequences Were Found in the Promoter Regions of Three Putative Direct Downstream Targets of AtMYB61

Previous experiments identified three putative direct downstream target genes of

AtMYB61 (Fig. 3.5)(Romano et al., 2012). These gene targets encode the following


Figure 3.5. Sequences recovered from the CASTing assay were found in all three promoter regions of predicted direct downstream targets of AtMYB61, namely KNOTTED1-like transcription factor (KNAT7, At1g62990), caffeoyl-CoA 3-O-methyltransferase (CCoAOMT7, At4g26220), and pectin-methylesterase (PME, At2g45220). The three putative AtMYB61 direct downstream target genes were identified by Romano et al. (2012) as the intersection set of genes found to be co-regulated with AtMYB61 in both the AtGenExpress developmental dataset and the AtMYB61-specific microarray experiment. The 1000 bp upstream regulatory regions of the three genes were examined. +/- indicates the orientation of CASTing target sequences relative to the sense coding strand, whereas numbers indicate the position of these motifs relative to the putative transcriptional start (indicated by an arrow). Triangles represent ACCAAA, squares represent ACCAAT, and circles represent ACCATA.


gene products: a KNOTTED1-like transcription factor (KNAT7, At1g62990); a caffeoyl-

CoA 3-O-methyltransferase (CCoAOMT7, At4g26220); and a pectin-methylesterase

(PME, At2g45220). The CASTing targets were identified in the 1000 bp 5′ non-coding

regions of the three putative direct target genes (Fig. 3.5). AtMYB61 bound to the 5′

gene regulatory sequences of all three putative direct target genes in an AC dependent

manner (Romano et al., 2012). These data support the hypothesis that AtMYB61 binds

to AC elements in a distinct set of target genes to modify gene expression.
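
The promoter search described in this section can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used to produce Fig. 3.5: the function names, the toy sequence, and the convention of reporting the sense-strand coordinate of the first matched base (with the base immediately upstream of the transcriptional start numbered -1) are all assumptions made for the example.

```python
# Hypothetical sketch: scan an upstream region for the three CASTing-derived
# motifs named in Fig. 3.5, on both strands.
MOTIFS = ("ACCAAA", "ACCAAT", "ACCATA")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_upstream(upstream: str, motifs=MOTIFS):
    """Return sorted (motif, strand, position) hits.

    `upstream` is the sense-strand sequence ending immediately 5' of the
    transcriptional start, so its last base is at position -1 (assumed
    convention). A '-' hit means the motif reads on the antisense strand;
    its position is still the sense-strand coordinate of the leftmost base.
    """
    upstream = upstream.upper()
    n = len(upstream)
    hits = []
    for motif in motifs:
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            start = upstream.find(pattern)
            while start != -1:
                hits.append((motif, strand, start - n))
                # Step by one base so overlapping matches are not missed.
                start = upstream.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    toy_upstream = "TTACCAAAGGCATTGGTCGTATGGTAA"  # made-up sequence, not a real promoter
    for motif, strand, position in scan_upstream(toy_upstream):
        print(motif, strand, position)
```

In practice the input would be the 1000 bp region upstream of each candidate gene, and the reported coordinates would need to follow whatever numbering the figure actually uses.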

3.5 Conclusion

Despite the size and importance of the plant R2R3-MYB family of transcriptional

regulators, little is known about the molecular functioning of given family members. The

work described herein casts greater light on the interaction between an R2R3-MYB

family member and its cognate DNA targets. The findings support the hypothesis that

AtMYB61 is recruited to target genes via its interactions with a set of unique sequences,

and thereby modifies gene expression. Surprisingly, the affinity of AtMYB61 to specific

target DNA sequences did not correlate with AtMYB61-driven transcriptional activation

with each of the target sequences, suggesting that the conformation of AtMYB61 may

be altered allosterically when bound to specific target sequences. These findings point

to additional complexities in the regulation of plant gene expression, and argue for the

need for greater exploration of the molecular intricacies involved in the interactions

between plant transcription factors and their DNA targets.

3.6 Acknowledgements

We are very grateful to Ms. Joan Ouellette for technical assistance, Mr. Ke Wu for

assistance on the CASTing assay and to Ms. Stephanie Tung, Ms. Kate Lee and Ms.

Trisha Min for assistance with the yeast experiments. This work was generously

supported by a Natural Sciences and Engineering Research Council of Canada

(NSERC) Canada Graduate Scholarship (CGSD) awarded to MP, and funding from

NSERC to M.M.C.


3.7 Supplemental Figures and Tables

Figure S3.1. AtMYB61 antibody generation and validation. (a) Amino acid sequence similarity of AtMYB61 along with its closest family member AtMYB50. The two proteins have conserved N-terminal amino acid sequences but unique C-terminal domains, which was the domain selected to generate AtMYB61 antibodies against (highlighted region). (b) A chemiluminescence western-blot of full length AtMYB61 recombinant protein (Lane 1), of antibody alone (Lane 2), and AtMYB61 recombinant protein immunoprecipitated with prebleed serum (Lane 3) and with AtMYB61 specific antiserum (Lane 4) validate AtMYB61 antibody specificity. Western-blot was done with 1:20 000 dilution of post-injected serum. Western-blot shows greater quantities of AtMYB61 protein eluted off the Magnetic Dynabeads Protein G post-injected antibody complex compared to the Magnetic Dynabeads Protein G pre-injected antibody complex, showing that the immunoprecipitation was successful.


Supplemental Table S3.1. Relative binding of CASTing targets and mutated AC1

sequences to AtMYB61

ACCAAC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe

Trial 1 16203 20451 26456 111235 223153 310225 325212 335456 460122

Trial 2 15145 18513 23513 92214 242153 315212 321021 324658 458213

Trial 3 13142 19088 20578 114285 231026 288232 304666 307279 446521

Average 14830 19350 23515 105911 232110 304556 316966 322464 454952

Binding 0.0352182 0.045954 0.055845 0.251517 0.551214 0.723257 0.752728 0.765785

ACCACC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe

Trial 1 32891 42654 47895 54112 112356 220167 328989 363354 481234

Trial 2 33564 41258 45654 55333 111589 226896 322644 312578 495242

Trial 3 38289 41124 46524 49992 117458 236446 328227 341592 475863

Average 34914 41678 46691 53145 113801 227836 326619 339174 484113

Binding 0.0774535 0.092459 0.103578 0.117897 0.252453 0.505425 0.724564 0.752415

ACCAAA 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe

Trial 1 26442 31996 176351 255645 319665 321348 345631 347562 496372

Trial 2 25854 31254 169856 251335 314556 318964 342654 342556 489653

Trial 3 22542 26978 141629 241274 310182 310808 331499 342380 481234

Average 24946 30076 162612 249418 314801 317040 339928 344166 489086.33

Binding 0.0551982 0.066549 0.359815 0.55189 0.696565 0.701519 0.752165 0.761542

ACCAAT 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe

Trial 1 20155 46254 211558 266242 312585 357456 354231 362645 487651

Trial 2 18982 45335 208334 263423 311225 349978 344580 350024 480225

Trial 3 19003 41241 196044 266540 305857 297827 292797 300952 479852

Average 19380 44276 205312 265401 309888 335086 330536 337873 482576

Binding 0.0431234 0.098522 0.456848 0.590556 0.689546 0.745615 0.735489 0.751816

ACCACA 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe

Trial 1 17245 24288 37524 126997 224568 308521 334568 337851 462254

Trial 2 16670 22853 36293 124789 219895 305452 325586 329987 461235

Trial 3 11815 24571 35957 117110 140819 296508 325909 318383 459978

Average 15243 23904 36591 122965 195093 303493 328687 328740 461155

Binding 0.0354852 0.055647 0.085182 0.286255 0.454165 0.706512 0.765162 0.765285


ACCATA 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe

Trial 1 21258 28269 171654 242588 324689 320115 345456 345571 475821

Trial 2 20199 27855 169558 234560 311471 319524 339887 340129 474458

Trial 3 18837 22493 154411 185371 266696 315799 309027 325918 468521

Average 20097 26205 165207 220839 300951 318479 331456 337206 472933

Binding 0.0456419 0.059512 0.375182 0.50152 0.683453 0.723257 0.752728 0.765785

ACCTAC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe

Trial 1 42125 51226 251665 288968 332541 331224 361547 358702 489213

Trial 2 40242 52874 248552 287110 325574 312123 358990 348873 486237

Trial 3 37496 45365 232628 250710 288998 323556 288460 312273 480411

Average 39954 49821 244281 275596 315704 322300 336332 339949 485287

Binding 0.0882924 0.110098 0.539824 0.609024 0.697657 0.712234 0.743241 0.751234

GCCTAC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe

Trial 1 13586 23558 25334 34135 55101 68951 62548 66352 478921

Trial 2 11440 21040 22114 37526 57241 69524 60177 63874 476621

Trial 3 5666 15369 22859 37855 50639 65307 54646 56274 465312

Average 10230 19989 23435 36505 54326 67927 59123 62166 473618

Binding 0.0232423 0.0454123 0.0532423 0.0829349 0.123423 0.154321 0.134321 0.141234

AGCTAC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe

Trial 1 14448 18542 23868 38512 51224 85304 71452 74289 468621

Trial 2 16273 17520 21264 37246 49254 82330 68871 72555 461255

Trial 3 13322 10614 21663 29321 42945 81802 53817 68382 458913

Average 14680 15558 22264 35026 47807 83145 64713 71741 462929

Binding 0.0345132 0.0365768 0.0523423 0.0823432 0.11239 0.195465 0.152134 0.168657

ACGTAC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe

Trial 1 27881 31456 44586 49853 78921 69889 81227 83556 498533

Trial 2 26648 35213 41001 44571 69246 71526 82254 84470 491524

Trial 3 22840 22499 32838 35768 78246 62399 71100 90315 489255

Average 25789 29722 39475 43397 75471 67937 78526 86113 493104

Binding 0.0565421 0.0651652 0.0865465 0.0951456 0.165465 0.148949 0.172165 0.188798


ACCGAC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe

Trial 1 8014 18520 43558 51201 63599 133563 262147 335864 475561

Trial 2 4861 14332 41002 49664 54248 119211 246610 312247 472608

Trial 3 3577 9157 43689 42203 49600 113920 210525 333435 467823

Average 5484 14003 42749 47689 55815 122231 239760 327182 471997

Binding 0.012591 0.03215 0.098156 0.109498 0.12816 0.280651 0.55051 0.75123

ACCTGC 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe

Trial 1 17265 34552 36521 92234 233458 290142 330121 388914 466258

Trial 2 16998 32621 37229 88841 225895 289521 322449 311258 462135

Trial 3 9756 34186 31294 94421 207276 256933 308889 269165 459532

Average 14673 33786 35014 91832 222209 278865 320486 323112 462641

Binding 0.034285 0.07895 0.081816 0.214575 0.51922 0.651598 0.74885 0.75499

ACCTAG 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe

Trial 1 18554 34858 44578 155441 225898 321580 335412 344521 465852

Trial 2 15587 33654 41876 159872 241014 333148 301512 335215 462344

Trial 3 9779 35282 30872 139987 206279 220897 328389 300488 461247

Average 14639 34597 39108 151766.76 224397 291874.9 321771 326741 463147

Binding 0.034285 0.08102 0.091588 0.355419 0.52551 0.683535 0.75355 0.76519

NBS 1.00E-09 5.00E-09 1.00E-08 5.00E-08 1.00E-07 5.00E-07 1.00E-06 5.00E-06 Probe

Trial 1 19985 37885 34458 43528 51985 68555 55512 71445 465664

Trial 2 17753 37521 32141 41152 47880 64118 52998 66529 463887

Trial 3 17636 34577 34782 42211 52763 67915 53950 74516 460014

Average 18457 36661 33793 42296 50875 66862 54153 70830 463188

Binding 0.043123 0.08565 0.078952 0.098818 0.11886 0.156212 0.12652 0.16548

This table includes nitrocellulose filter-binding data, collected in triplicate, determining the relative binding of AtMYB61 to the CASTing targets and to the mutated ACCTAC motifs. The 60 bp DNA probes were present in excess amounts. The probe concentration for each sequence was 1065 nM and the total amount of DNA added to each reaction was 124.41 ng. The protein concentrations are given in the header row of each block and vary from 1.00E-09 M to 5.00E-06 M. The cpm of each sample was measured by a liquid scintillation counter. If AtMYB61 bound to a sequence then it would reach a binding max of ~0.75 binding. If AtMYB61 did not bind to a sequence, then the binding would not increase with the increase in protein concentration.
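
As a reading aid for Table S3.1, the following Python sketch recomputes the per-concentration mean of the three trials for the ACCTAC block and finds the lowest protein concentration at which the reported relative binding reaches half of the ~0.75 plateau. The numbers are transcribed from the ACCTAC and GCCTAC blocks above; the half-plateau threshold and the function names are assumptions introduced for illustration, and the sketch makes no claim about how the Binding row itself was derived from the raw counts.

```python
# Protein concentrations (M) from the header row of each block of Table S3.1.
PROTEIN_M = [1e-9, 5e-9, 1e-8, 5e-8, 1e-7, 5e-7, 1e-6, 5e-6]

# Raw counts (cpm) for the ACCTAC probe, Trials 1-3, transcribed from the table.
ACCTAC_TRIALS = [
    [42125, 51226, 251665, 288968, 332541, 331224, 361547, 358702],
    [40242, 52874, 248552, 287110, 325574, 312123, 358990, 348873],
    [37496, 45365, 232628, 250710, 288998, 323556, 288460, 312273],
]

# Reported "Binding" rows for a bound motif and a mutated motif.
ACCTAC_BINDING = [0.0882924, 0.110098, 0.539824, 0.609024,
                  0.697657, 0.712234, 0.743241, 0.751234]
GCCTAC_BINDING = [0.0232423, 0.0454123, 0.0532423, 0.0829349,
                  0.123423, 0.154321, 0.134321, 0.141234]

# Assumed threshold: half of the ~0.75 maximum described in the table note.
HALF_PLATEAU = 0.75 / 2


def column_means(trials):
    """Mean cpm across trials at each protein concentration."""
    return [sum(column) / len(column) for column in zip(*trials)]


def first_conc_reaching(binding, level, concentrations=PROTEIN_M):
    """Lowest tested concentration whose relative binding is >= level, else None."""
    for conc, value in zip(concentrations, binding):
        if value >= level:
            return conc
    return None


if __name__ == "__main__":
    print([round(m) for m in column_means(ACCTAC_TRIALS)])
    print(first_conc_reaching(ACCTAC_BINDING, HALF_PLATEAU))
    print(first_conc_reaching(GCCTAC_BINDING, HALF_PLATEAU))
```

For the ACCTAC block, the recomputed means agree with the Average row to within about 1 cpm, and the mutated GCCTAC probe never reaches the assumed half-plateau level at any tested concentration.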


Chapter 4

Novel regulation of an R2R3-MYB transcription factor, AtMYB61, by a non-hexokinase sugar-signalling pathway

This chapter comprises the following manuscript-in-preparation in its entirety:

Michael B. Prouse, Christian Dubos, Cécile Vriet, & Malcolm M. Campbell (2013) Novel

Regulation of an R2R3-MYB transcription factor AtMYB61 by a non-hexokinase sugar-

signalling pathway.

Contributions: MBP, CD, MMC designed research; MBP, CD, CV performed

research; MBP, CD, CV, MMC analysed data; MBP, MMC wrote manuscript with

editorial assistance from CD, CV.

MBP contributed specifically to Fig. 4.5, Fig. 4.6, Fig. 4.7, Fig. 4.8, Fig. 4.9, Fig. 4.10,

Fig. S4.1, Fig. S4.2, Fig. S4.3, Fig. S4.4, Table 4.1, and Table S4.1.


4 Novel Regulation of an R2R3-MYB Transcription Factor, AtMYB61, by a Non-Hexokinase Sugar-Signalling Pathway

4.1 Abstract

AtMYB61, a member of the R2R3-MYB family of transcription factors in Arabidopsis

thaliana, alters gene expression, resulting in pleiotropic modifications of carbon

allocation throughout the plant body. Here, we demonstrate that AtMYB61 expression

is modulated by photosynthate through a novel sugar-signalling pathway that does not

appear to directly involve hexokinase. Analysis of promoter-reporter fusion constructs

that contained or did not contain AtMYB61 5′ intragenic sequences determined that

AtMYB61 expression is de-repressed by soluble sugars in a mechanism involving

intragenic sequences. Phylogenetic footprinting identified a repeat motif, termed

second intron repeat, that was conserved across the second intron of Brassicaceae

AtMYB61 homologues. Nuclear proteins from seedlings grown in the presence or

absence of sugars bound differentially to the second intron repeat consistent with the

derepression model. Second intron repeat binding proteins were identified and

characterised using a combination of loss-of-function genetics and transcriptome

analysis. Taken together, these findings uncover a novel protein activity that binds a

conserved repeat motif within the AtMYB61 second intron, and that is suggested to regulate

sugar-mediated gene expression in other genes that contain this repeat. The elucidation of the

upstream regulation of AtMYB61 thereby uncovers a novel sugar signalling pathway

that makes use of intragenic sequences as regulatory elements, which appears to act in

a pathway independent of hexokinase.

4.2 Introduction

As sessile photoautotrophs, plants must balance their requirements for carbon against

their ability to fix carbon through photosynthesis. Consequently, plants have evolved

mechanisms that enable them to contend with fluctuations in their capacity to fix carbon.

Some of these mechanisms control the aperture of stomata, dynamic pores found on


the surfaces of plant leaves that control water loss from the plant and regulate the

uptake of CO2 for photosynthesis. Such mechanisms modulate carbon acquisition

relative to prevailing environmental conditions, thereby creating variations in the levels

of photosynthate, the sugars derived from photosynthesis. Accordingly, plants have

also evolved mechanisms to perceive and respond to photosynthate (Koch, 1996;

Smeekens, 2000; Rolland et al., 2002; Halford and Paul, 2003). These mechanisms

appropriately modulate the allocation of carbon to various facets of plant growth,

development and metabolism. While a significant body of evidence suggests a link

between the signalling pathways that modulate carbon acquisition and those involved in

resource allocation, rather little is known about the specific factors involved.

We found that a gene encoding a member of the Arabidopsis thaliana R2R3-MYB family

of transcription factors, AtMYB61 (At1g09540), was expressed in guard cells in a

manner consistent with involvement in the control of stomatal aperture (Liang et al.,

2005). Over-expression and loss-of-function mutant analyses revealed that AtMYB61

expression was both sufficient and necessary to bring about reductions in stomatal

aperture with consequent effects on gas exchange (Liang et al., 2005). Taken together,

the data provided evidence that AtMYB61 encodes a transcription factor implicated in

the closure of stomata. Aside from its involvement in the control of stomatal aperture,

we recently found that AtMYB61 is sufficient and necessary to allocate carbon to the

two other major components of the plant transpiration system – the water conducting

xylem cells and the root system (Romano et al., 2012). Thus, it appears that AtMYB61

regulates processes related to the acquisition and allocation of carbon, perhaps

functioning to balance carbon supply with demand. Ideally, such a chemostat would be

informed by the level of carbon itself.

We show here that AtMYB61 expression is modulated by photosynthate. The results

suggest that AtMYB61 integrates signals derived from the perception of sugars, but that

this does not directly involve the hexokinase sugar-signalling pathway. Conversion of

the sugar signals into a transcriptional response is dependent on intragenic sequences

located within one of the two introns of AtMYB61. A novel protein activity that binds to

a repeat motif found within this intron is uncovered, and proposed to regulate sugar-

mediated gene expression in a suite of genes that contain repeats of the same motif,


predominantly in intragenic regions. AtMYB61 thereby uncovers a novel sugar-

signalling pathway that makes use of intragenic non-coding sequences as cis-acting

elements, and a novel repressor protein in the regulatory pathway.

4.3 Materials and Methods

4.3.1 Plant Material and Culture

Wild-type (WT) Arabidopsis thaliana seeds (Col-0) were obtained from the Nottingham

Arabidopsis Stock Centre (NASC). The AtMYB61 promoter/5′ coding sequence

(containing introns)::uidA (61PN::GUS) fusion, AtMYB61 promoter (not containing

introns)::GFP (61P::GFP) fusion and AtMYB61 promoter/5′ coding sequence

(containing introns)::GFP (61PN::GFP) fusion were constructed and stably transformed

into Arabidopsis thaliana plants as described previously (Newman et al., 2004; Liang et

al., 2005). Arabidopsis thaliana seeds were sterilised and grown according to standard

protocols (Newman et al., 2004), except where indicated below. Growth stages were

assigned based on published standards (Boyes et al., 2001).

Plants over-expressing AtMYB61 under the control of the Cauliflower Mosaic Virus 35S

promoter (35S::MYB61) were as described previously (Newman et al., 2004; Liang et

al., 2005). Similarly, AtMYB61 loss-of-function mutants (atmyb61) and glucose

insensitive mutants (gin) have been described previously (Zhou et al., 1998; Arenas-

Huertero et al., 2000; Penfield et al., 2001; Moore et al., 2003; Liang et al., 2005).

A 35S::MYB61 line and the loss-of-function alleles atmyb61-1, gin1-3, gin6-1

and gin2-1 were used in all

experiments, and results are representative. T-DNA insertional mutant lines

corresponding to either AtMYB61 or putative downstream targets of AtMYB61 were

obtained from the Arabidopsis Biological Resource Center (ABRC) (Alonso et al., 2003).

Homozygous T-DNA lines were obtained by PCR screening using the left border T-DNA

primer and a right border gene-specific primer

(http://signal.salk.edu/tdnaprimers.2.html). Insertion sites were sequenced for all

mutants to verify insertional mutagenesis (data not shown), and quantitative PCR was

conducted to show that the mutants were loss-of-function.


For primary bolt and hypocotyl analyses, seeds were germinated and plants were grown

on soil. Seeds were sown on dampened soil and then cold stratified for 3 d before

placement in a growth chamber at 21°C with a regime of 12 h of light (120 μmol m−2 s−1)

and 12 h of dark. This growth regime is referred to as short-day conditions herein.

4.3.2 Phylogenetic Analysis of AtMYB61 Brassicaceae Homologues

Sequences of Brassicaceae AtMYB61 homologues were obtained from the online

Phytozome tool (http://phytozome.net). Sequence alignments were conducted by

aligning the intragenic and flanking exonic sequences of Brassicaceae AtMYB61 homologues

(Arabidopsis thaliana gene At1g09540; Arabidopsis lyrata gene 919710; Capsella

rubella gene Carubv10009497m.g; Brassica rapa gene Bra020016; and Thellungiella

halophila gene Thhalv10008000m.g) by using the online ClustalW2 tool

(http://ebi.ac.uk/Tools/msa/clustalw2/). Sequence alignments of the Brassicaceae AtMYB61

homologues were used in Berkeley's online WebLogo tool

(http://weblogo.berkeley.edu/logo.cgi) to obtain sequence logos.
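
To make the link between an alignment and a sequence logo concrete, the sketch below computes the per-column information content (in bits) that determines stack height in a DNA logo. It is a simplified illustration under stated assumptions: the alignment is a made-up, ungapped toy example rather than the Brassicaceae intron sequences, the function name is hypothetical, and the calculation omits the small-sample correction that logo tools may apply.

```python
import math
from collections import Counter


def column_information(alignment):
    """Information content (bits) for each column of an ungapped DNA alignment.

    Uses 2 - H, where H is the Shannon entropy of the base frequencies in the
    column; a fully conserved column scores 2 bits. No small-sample correction
    is applied (a simplifying assumption).
    """
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    info = []
    for column in zip(*alignment):
        total = len(column)
        counts = Counter(column)
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        info.append(2.0 - entropy)
    return info


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACCA", "ACTC"]  # hypothetical sequences
    for position, bits in enumerate(column_information(toy_alignment), start=1):
        print(position, round(bits, 3))
```

Columns with high information content correspond to the tall, conserved positions that phylogenetic footprinting is intended to reveal.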

4.3.3 Analysis of Transgenic Plants Containing Promoter::Reporter Fusions

For the analysis of GUS expression, seedlings were germinated in the dark in liquid

Murashige Skoog (MS) medium as described previously (Newman et al., 2004). Liquid

MS medium contained no carbon source, and sugars were supplemented to a final

concentration of 30mM with either sucrose, glucose, fructose, maltose, turanose,

palatinose, or raffinose. 2-deoxyglucose and 3-O-methylglucose were added to the MS

medium, in the dark, to a final concentration of 30mM, after the seeds had germinated.

In the experiment with mannoheptulose (MHL), the MS medium was supplemented with

30mM sucrose and 100 mM MHL. Seedlings were grown for 7 days in the dark prior to

analysis by confocal scanning laser microscopy. These seedlings were mounted in an

aqueous solution of 10 μg/ml propidium iodide on a microscope slide and examined

using a Zeiss LSM 510 confocal laser scanning microscope according to published

protocols (Matsumoto, 2002). Histochemical localisation and quantitative fluorometric

(methylumbelliferone-glucuronide, MUG) assay of GUS activity was conducted as


described previously (Gallagher, 1992). Quantitative GUS analyses used protein

extracts obtained from 37 seedlings that had been frozen in liquid nitrogen and then

ground to a fine powder. Three biological replicates were obtained per condition

examined, and each protein extract was measured in duplicate.

4.3.4 Semi-Quantitative PCR

Semi-quantitative PCR on loss-of-function mutants of sugar-sensitive putative repressors

of AtMYB61 expression (rmx), WT Col, atmyb61-1, and 35S::MYB61 was conducted

by extracting RNA from 400 mg of 7 day-old dark-grown seedlings, grown in the

presence or absence of 30mM sucrose. RNA was extracted from frozen tissue using

the RNeasy Plant Mini Kit (Qiagen, Toronto, ON, Canada) according to the manufacturer's

instructions. cDNA synthesis was primed with oligo(dT) using SuperScript II

Reverse Transcriptase (Invitrogen, Burlington, ON, Canada) following the

manufacturer's instructions. AtMYB61 was amplified using primers

F61Bam (5′-GGATCCATGGGGAGACATTCTTTGCTGTTAC-3′) and R61Eco

(5′-GAATTCTAAAGGGACTGACCAAAAGAGAC-3′). Semi-quantitative PCR was

performed on first strand cDNA using the MJ Research PTC Thermal Cycler (Bio-Rad,

Mississauga, ON, Canada). Semi-quantitative PCR conditions were: 90°C for 2 min,

and then 35 cycles of the following: 90°C for 40 sec, 65°C for 1 min, 72°C for 2 min, and

then 72°C for 5 min. The data were normalized to an actin control gene (ACT-11) that

was amplified using primers ACT1 (5′-GCC-AAAGCAGTGATCTCTTTGCTC-3′) and

ACT2 (5′-GTGTTGGAC-TCTGGAGATGGTGTG-3′), using the above reaction

conditions with either 25 or 35 amplification cycles. Results were analysed for

AtMYB61 misexpression to validate these nuclear proteins, which bound to the second

intron and second intron repeat, as putative AtMYB61 repressors.

4.3.5 Quantitative, Real-Time, Reverse Transcriptase Polymerase Chain Reaction (qRT-PCR)

qRT-PCR for sugar regulation of AtMYB61 was conducted by extracting RNA from

400mg of seedlings that had been grown in the dark for 7 days in liquid MS medium

supplemented with sugars as described above. RNA was extracted from frozen tissue


using the RNeasy Plant Mini Kit (Qiagen, Toronto, ON, Canada) according to

manufacturer‘s instructions. The extracted RNA was subjected to DNase digestion

(DNA-free Kit, Ambion, Burlington, ON, Canada), precipitation (GlycoBlue, Ambion,

Burlington, ON, Canada), and purification (RNeasy Plant Mini Kit, Qiagen, Toronto, ON,

Canada). cDNA synthesis was accomplished using the RETROscript Kit (Ambion,

Burlington, ON, Canada) according to the manufacturer's instructions. Three sets of

polymerase chain reaction (PCR) primers were designed using Primer Express 2.0

(Applied Biosystems). These primers spanned an intron/exon boundary in order to

circumvent the amplification of genomic DNA. The first set of primers generated an

amplicon corresponding to AtMYB61 (At1g09540U: 5′-TGG AAA CAG ATG GTC ACA

GAT TG-3′; At1g09540L: 5′-ATG CTT GAG TTC CAT AGA TTC TTG ATC-3′), the

second to HEXOKINASE-2 (AtHXK2) (AtHXK2U: 5′-ACA AAT GCA GCC TAT GTC

GAA CGT G-3′; AtHXK2L: 5′-TGT TCG GGG TCC TTA TGA TGA ATG G-3′) and the

third set amplified the internal qPCR control, AtTUBULIN4 (At5g44340U: 5′-AAC GCT

GAC GAG TGT ATG GTT TT-3′; At5g44340L: 5′-CCA AAG GTA GGA TTA GCG AGC

TT-3‘). qRT-PCR was performed on first strand cDNA using the QuantiTect SYBR

Green PCR Kit (Qiagen, Toronto, ON, Canada). qRT-PCR conditions were: 50°C for 2

min, 95°C for 10 min, and then 40 cycles of 95°C for 15 s and 60°C for 1 min.

Quantification was performed using the ABI PRISM 7700 Sequence Detection System

(Applied Biosystems).
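
The text does not state how relative transcript abundance was calculated from the SYBR Green data. One widely used approach when a target gene is normalised to an internal control is the comparative threshold-cycle (2^-ΔΔCt) calculation, sketched below purely as an illustration. The function name and the Ct values are hypothetical, and the calculation assumes near-100% amplification efficiency for both amplicons.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript by the comparative Ct method.

    Each target Ct is first normalised to the reference gene measured in the
    same sample (delta Ct); the treated sample is then compared with the
    control sample (delta-delta Ct). Assumes the product doubles every cycle.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


if __name__ == "__main__":
    # Hypothetical Ct values: target vs. reference gene, with and without sugar.
    fold = relative_expression(ct_target_treated=24.0, ct_ref_treated=18.0,
                               ct_target_control=26.0, ct_ref_control=18.0)
    print(fold)
```

If primer efficiencies differ appreciably from 100%, an efficiency-corrected model would be more appropriate than this simple form.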

qRT-PCR for loss-of-function mutants of sugar-sensitive putative repressors of AtMYB61

expression (rmx) was conducted by extracting RNA from 7 day-old dark-grown

rmx seedlings, grown in the presence or absence of 30 mM sucrose. RNA was

extracted from frozen tissue using the RNeasy Plant Mini Kit (Qiagen, Toronto, ON,

Canada) according to the manufacturer's instructions.

cDNA was synthesised from the RNA using SuperScript II Reverse Transcriptase

(Invitrogen, Burlington, ON, Canada) initiated using an oligo(dT) primer, following the

manufacturer's instructions. qRT-PCR was performed on first strand cDNA using the

iCycler iQ real-time PCR detection system (Bio-Rad, Mississauga, ON, Canada). The

AtMYB61 amplicon was generated using primers F61Bam

(5′-GGATCCATGGGGAGACATTCTTTGCTGTTAC-3′) and R61Eco

(5′-GAATTCTAAAGGGACTGACCAAAAGAGAC-3′). qRT-PCR conditions were: 90°C for 2

min, and then 35 cycles of the following: 90°C for 40 sec, 65°C for 1 min, 72°C for 2

min, and then 72°C for 5 min. The data were normalized to an actin control gene (ACT-

11) that was amplified using primers ACT1 (5′-GCC-AAAGCAGTGATCTCTTTGCTC-3′)

and ACT2 (5′-GTGTTGGAC-TCTGGAGATGGTGTG-3′), using the above reaction

conditions with either 25 or 35 amplification cycles.

4.3.6 Electrophoretic Mobility Shift Assay (EMSA)

Nuclear extracts were purified from 7 day-old dark-grown wild-type Columbia

Arabidopsis thaliana seedlings grown in the absence or presence of sucrose according

to Saleh et al. (Saleh et al., 2008). The 90bp intron-two repeat was amplified and

purified using a Qiagen nucleotide removal column, according to the manufacturer's

instructions (Qiagen, Toronto, ON, Canada). The purified PCR products were then

radioactively labelled with 32P via incorporation of a radiolabelled nucleotide following

primer extension (Sablowski et al., 1994; Hatton et al., 1995). The labelled

oligonucleotide was then subjected to a final purification using the Qiagen nucleotide

purification kit, according to the manufacturer's instructions (Qiagen, Toronto, ON,

Canada). Radioactivity levels were measured via a liquid scintillation counter to

measure the incorporation of 32P into the probe. Affinity of binding was assessed using

competition assays with the unlabelled AtMYB61 intron-two repeat sequence. EMSA

conditions were exactly as described previously (Patzlaff et al., 2003b), but using

nuclear extracts purified from 7 day-old dark-grown wild-type Columbia Arabidopsis

thaliana seedlings grown in the absence or presence of sucrose, in place of

pine MYB protein.

4.3.7 Streptavidin Biotin Pull-Down Assay

For the streptavidin-biotin pull-down assay, the second intron and the second intron

repeat were biotinylated and immobilised on M280 Streptavidin Dynabeads (Invitrogen,

California, USA). Nuclear extracts from 7 day-old, dark-grown wild-type Columbia

Arabidopsis thaliana seedlings grown in either the absence or presence of sucrose,

according to Saleh et al. (Saleh et al., 2008), were exposed to biotinylated complexes


that were confirmed by the Chemiluminescent Biofisher Biotin Detection Kit (Nepean,

Ontario, Canada). 0.1mg/ml of Poly-R478 was used in each reaction to reduce non-

specific binding. The proteins that bound the biotinylated complexes were subjected to

mass spectrometry.

4.3.8 Mass Spectrometry

Liquid chromatography tandem mass spectrometry (LC-MS/MS) with the Orbital Mass

Spectrometer was conducted on peptides purified from the streptavidin biotin pull-down

assay as previously described (Hewel et al., 2010). The confidence of each protein

identification was calculated using the StatQuest program (Kislinger et al., 2003). Proteins

were identified against the Arabidopsis thaliana subset of the UniProt database.

All spectra were also searched separately against a human/mouse database; no significant

identifications were obtained, supporting the identifications made in the Arabidopsis

thaliana samples.

4.4 Results and Discussion

4.4.1 AtMYB61 Expression is Regulated by Sugars

Arabidopsis thaliana seedlings that have been germinated and grown in the dark are

etiolated, and rely completely on seed reserves or exogenous sugar as a source of

carbon (Roldan et al., 1999). Such seedlings serve as a useful model to examine the

effects of sugars on plant cells. AtMYB61 transcript abundance in dark-grown

Arabidopsis thaliana seedlings increased when the seedlings were grown in the

presence of the metabolisable sugars sucrose, glucose or fructose as revealed by

quantitative, real-time, reverse-transcriptase, polymerase chain reaction (qRT-PCR)

(Fig. 4.1). This effect was not osmotic as the presence of sorbitol (a non-metabolisable

sugar alcohol) did not induce an increase in transcript abundance.

To investigate how AtMYB61 expression is shaped by sugars, qualitative and

quantitative changes in the activity of the β-glucuronidase (GUS) reporter gene driven

by a translational fusion with the AtMYB61 promoter and 5′ intragenic sequences

(61PN::GUS) were examined (Fig. 4.2ab). Metabolisable sugars (sucrose, glucose,


Figure 4.1. Sugar regulation of AtMYB61 expression in dark-grown wild-type seedlings, 7 days post-germination. qRT-PCR analysis of AtMYB61 expression in response to sugars was conducted on wild-type seedlings grown in the dark for 7 days. Sucrose, glucose and fructose all induced AtMYB61 expression. Sorbitol acted as an osmotic control and did not induce AtMYB61 expression. * indicates a significant difference from the no-sugar control (p<0.05, t-test).


Figure 4.2. Promoter-reporter and qRT-PCR analysis of AtMYB61 expression in response to sugars. (a) 61PN::GUS expression within the hypocotyl xylem of 7 day-old dark-grown seedlings in response to sugars. In response to metabolisable sugars (sucrose, glucose, fructose and maltose), AtMYB61 gene regulatory sequences were sufficient to drive GUS expression in the hypocotyls of 7 day-old dark-grown seedlings. A mannitol control confirmed that this effect was not due to osmotic regulation. Turanose and palatinose controls validated that the effect was not due to sucrose sensing alone. A raffinose control showed that this effect was not due to a sucrose translocation effect. 3-O-methylglucose (3-OMG) and 2-deoxyglucose (2-DG) controls showed that this effect was not due to the detection of hexose sugars involving the hexokinase (HXK) pathway. (b) Quantitative analysis of 61PN::GUS expression in response to the same sugars and controls presented in (a). Bars in (a) represent 100µm. * in (b) indicates a significant difference from the no-sugar control (P < 0.05).


fructose or maltose) significantly increased expression, whereas an equivalent change

in osmotic conditions using sorbitol did not (Fig. 4.2ab). The disaccharides, turanose

and palatinose, which can interact with extracellular sucrose sensors (Loreti et al., 2000;

Sinha et al., 2002), failed to increase expression. Similarly, raffinose, which is

translocated with sucrose in the phloem, but not hydrolysed (Haritatos et al., 2000), did

not increase expression (Fig. 4.2ab).

Sugars have been shown to modify the expression of a few members of the R2R3-MYB

family, AtMYB61 being one of this subset. Previously, it was shown

that AtMYB61 was diurnally regulated, to account for light-to-dark transitions in stomatal

aperture (Liang et al., 2005). Furthermore, AtMYB61 expression was shown to be

modulated by two amino acids implicated in nitrogen partitioning and signalling,

glutamate and glycine (Dubos et al., 2005). It is striking that AtMYB61 activity is up-

regulated by the most significant product of photosynthesis, sucrose, and that it is

down-regulated by two amino acids that are significant by-products of photorespiration,

glutamate and glycine. It may be that AtMYB61 is poised to respond to the abundance

of different carbon skeletons in plants, and thereby modulate carbon acquisition via

stomata and carbon allocation in sink tissues.

4.4.2 AtMYB61 Acts in a Pathway Independent of the Hexokinase Sugar

Signalling Pathway

Hexokinase (HXK) is important as a sugar sensor in plants (Jang et al., 1997;

Smeekens, 2000; Xiao et al., 2000; Rolland et al., 2002; Halford and Paul, 2003; Moore

et al., 2003; Gibson, 2005). Experiments using 3-O-methylglucose (3-OMG), which is

transported into plant cells but not metabolised by HXK, and 2-deoxyglucose (2-DG)

and mannose, which are phosphorylated by HXK but not metabolised further, can be

used to examine the involvement of HXK in sugar signalling (Jang et al., 1997; Pego et

al., 2000). GUS expression driven by the AtMYB61 promoter was not increased in

dark-grown plants provided with 3-O-methylglucose (3-OMG) or 2-deoxyglucose (2-DG)

(Fig. 4.2ab), showing that the sugar-sensing pathway did not simply entail detection of

hexose sugars, nor involve direct signalling via HXK (Jang et al., 1997; Gibson, 2000;

Smeekens, 2000). AtMYB61 promoter-mediated expression was increased by sucrose


even in the presence of the specific HXK inhibitor mannoheptulose (MHL) (Jang et al.,

1997; Chiou and Bush, 1998; Smeekens, 2000) (Fig. 4.2ab). The ability of sucrose to

increase AtMYB61 expression in the presence of MHL supports the hypothesis that

signalling directly by HXK is unlikely to be involved in AtMYB61 expression. AtMYB61

expression was not simply a response to the presence of carbon-based metabolites, as

acetate, pyruvate, succinate and trehalose, which are implicated in carbon metabolite

signalling (Graham et al., 1994), did not induce expression (data not shown).

The relationship between AtMYB61 expression and HXK sugar signalling was also

examined using the Arabidopsis thaliana loss-of-function mutants involved in the

hexokinase sugar signalling pathway: glucose insensitive2 (gin2) (Moore et al., 2003),

glucose insensitive1 (gin1) (Zhou et al., 1998), and glucose insensitive6 (gin6) (Arenas-

Huertero et al., 2000) (Fig. 4.3ab). As determined by qRT-PCR, AtMYB61 transcript

abundance increased in response to sucrose and glucose in wild-type plants (Fig. 4.3a).

This was also observed in gin2, gin1, and gin6 mutants. Moreover, the largest increase

in AtMYB61 transcript abundance was observed when sucrose was added to the

medium. In contrast, transcript abundance of HXK2, which is regulated through the

HXK signalling pathway, increased dramatically in response to glucose, and this

increase was significantly less in the gin2 mutant (Fig. 4.3b). Together, these results

suggest that AtMYB61 transcript abundance is not modulated via the HXK1 signalling

pathway.

Transcript abundance data support the hypothesis that, under most circumstances,

AtMYB61 is likely to function independently of HXK. That is, AtMYB61 and

AtHXK1/GIN2 (At4g29130) have transcript abundance profiles that are slightly

negatively correlated in the AtGenExpress developmental dataset (RAGE=-0.236). This

indicates that the two genes are likely to have inverse transcript abundance relative to

each other, in those instances when their expression is coincident at all. Thus, the HXK

pathway and a distinct “AtMYB61 pathway” are likely to operate non-redundantly, and

the pathway that is deployed is likely to be contingent on the developmental context. A

novel sugar-signalling pathway that does not involve hexokinase has been predicted

(Chiou and Bush, 1998; Tiessen et al., 2003; Dekkers et al., 2004), but the components


Figure 4.3. qRT-PCR analysis of AtMYB61 and HXK-2 expression in wild-type (WT) and glucose insensitive (gin) loss-of-function mutants. (a) qRT-PCR of AtMYB61 expression in response to sugars (glucose and sucrose) in WT and glucose insensitive mutants (gin1, gin6 and gin2) reveals that AtMYB61 acts in a sugar signalling pathway independent of HXK. (b) qRT-PCR of HXK-2 expression in response to sugars (glucose and sucrose) in wild-type and gin1, gin6 and gin2 mutants confirms that HXK-2 acts in the HXK sugar signalling pathway.


of this signalling pathway have yet to be elucidated. It may be that AtMYB61 is a

component of this pathway. One might be able to capitalise on this information to

uncover additional components of the uncharacterised AtMYB61-related sugar-

signalling pathway.
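The co-expression comparison between AtMYB61 and AtHXK1/GIN2 discussed above rests on a correlation coefficient computed across matched samples. The text does not specify which coefficient was used, so the sketch below assumes a Pearson correlation; the function name and the expression values are hypothetical and are not taken from the AtGenExpress dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long expression profiles."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares about the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression values for two genes across the same five samples
# (illustration only, not values from any dataset).
gene_a = [120.0, 80.0, 200.0, 150.0, 60.0]
gene_b = [40.0, 55.0, 30.0, 45.0, 50.0]
r = pearson_r(gene_a, gene_b)
print(f"r = {r:.3f}")  # a negative r indicates inverse transcript profiles
```

A coefficient close to zero, such as the slightly negative value reported above, indicates only a weak inverse relationship, which is why the interpretation in the text is hedged.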

4.4.3 AtMYB61 Expression is Sugar Derepressed, Involving an Intragenic

Sequence within the 5′ Coding Region Containing Two Introns

AtMYB61 gene regulatory sequences comprising the 5′ intragenic region (61PN::GFP)

were sufficient to drive the expression of GFP in the xylem of seedlings grown in the

presence of sucrose but not in the absence of sucrose (Fig. 4.4ab). In contrast,

AtMYB61 gene regulatory sequences without the 5′ intragenic region (61P::GFP)

constitutively expressed GFP in the seedlings grown in the presence and absence of

sucrose. The most parsimonious hypothesis for this finding is that AtMYB61 expression

is de-repressed by soluble sugars in a mechanism involving intragenic sequences.

Sequence comparison of Brassicaceae AtMYB61 homologues (Arabidopsis thaliana,

Arabidopsis lyrata, Capsella rubella, Brassica rapa, and Thellungiella halophila)

uncovered a highly conserved motif (CTCTGTTTT) in intron-two, repeated 4 times (Fig.

4.5; Fig. S4.1). The repeats within the second introns of AtMYB61 homologues occur 4

times - 3 times in the sense direction and once in the antisense direction. Scanning the

Arabidopsis thaliana genome for this repeat, with an occurrence cutoff of 3 times within

500bp, identified 83 genes and 15 intergenic regions (Table S4.1, S4.2). Of the 98

instances, 45 of these occurrences were in introns (Table S4.3). That is, when this

motif is repeated 3 or more times within a 500bp region of the Arabidopsis thaliana

genome, 45.9% of these occurrences are within introns (Table S4.3). Notably, introns

comprise only 15.6% of the Arabidopsis thaliana genome (Kaul et al., 2000). Of the 45

occurrences of this repeat within Arabidopsis thaliana introns, 21 are within sugar-

responsive genes (Table S4.3). Neither Arabidopsis thaliana splicing prediction tools

(http://cbs.dtu.dk/services/NetPGene) (Hebsgaard et al., 1996) nor miRNA and siRNA

prediction tools (http://www.athamap.de/) (Steffens et al., 2004, 2005; Galuschka et al.,

2007; Bulow et al., 2009; Bulow et al., 2010) suggest that the motif is likely to

be a splice site or a miRNA or siRNA binding site.
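The genome scan described above (the motif occurring at least 3 times within 500bp, in either orientation) can be sketched as follows. This is a minimal illustration and not the procedure used in this study: the function names are hypothetical, the test sequence is synthetic, and the scan assumes an upper- or lower-case sequence containing only A, C, G, T or N.

```python
MOTIF = "CTCTGTTTT"


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def motif_positions(seq, motif=MOTIF):
    """0-based start positions of the motif in sense or antisense orientation."""
    seq = seq.upper()
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] in targets]


def clustered_spans(positions, motif_len=len(MOTIF), min_hits=3, window=500):
    """(start, end) spans in which min_hits motif copies fit within `window` bp.

    Overlapping spans are reported separately; merging them is left out
    of this sketch.
    """
    positions = sorted(positions)
    spans = []
    for i in range(len(positions) - min_hits + 1):
        first = positions[i]
        last = positions[i + min_hits - 1]
        if last + motif_len - first <= window:
            spans.append((first, last + motif_len))
    return spans


# Synthetic test sequence: two sense copies and one antisense copy.
test_seq = MOTIF + "GGGGG" + MOTIF + "GGGGG" + reverse_complement(MOTIF)
hits = motif_positions(test_seq)
print(hits)                   # [0, 14, 28]
print(clustered_spans(hits))  # [(0, 37)]

# Share of the 98 reported occurrences that fall within introns.
print(round(100 * 45 / 98, 1))  # 45.9
```

Comparing the 45.9% intronic share with the 15.6% of the genome made up of introns gives only a rough sense of over-representation, because it assumes that motif clusters would otherwise be distributed in proportion to sequence length.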


Figure 4.4. Analysis of AtMYB61 promoter-reporter fusion constructs that contain or do not contain AtMYB61 5′ intragenic sequences in response to sucrose. (a) Schematic representation of the constructs used to drive the expression of GUS (uidA) and GFP (GFP). 61P corresponds to the promoter of AtMYB61, and 61PN to the promoter of AtMYB61 plus the portion of the coding sequence that encodes the N-terminus of the protein, which includes the two introns (E: exon; I: intron; NosT: nopaline synthase terminator sequence). (b) Expression of 61P and 61PN constructs within developing xylem of 7 day-old dark-grown seedlings in response to 30mM sucrose supports the sugar derepression model.


Figure 4.5. Phylogenetic footprinting identifies a conserved repeat motif in the second intron of AtMYB61 Brassicaceae homologues. Sequence logo of AtMYB61 second intron (green highlight) flanked by exon 2 and exon 3 (red highlight). An over-represented conserved motif is present within AtMYB61 second intron that repeats itself three times in a sense direction and once in an antisense direction. Brassicaceae AtMYB61 homologues include: Arabidopsis thaliana gene At1g09540; Arabidopsis lyrata gene 919710; Capsella rubella gene Carubv10009497m.g; Brassica rapa gene Bra020016; and Thellungiella halophila gene Thhalv10008000m.g.


To further assess the putative functional role of the conserved over-represented motif

within the AtMYB61 second intron, AtMYB61 was aligned with AtMYB50 (At1g57560), its

most closely related R2R3-MYB family member (Fig. S4.2) (Stracke et al., 2001). Direct

nucleotide sequence comparison between the two genes shows that, while AtMYB50

contains two introns, and while the introns of both genes share significant sequence

similarity, neither AtMYB50 intron contains the AtMYB61 second intron repeat.

Notably, AtMYB50 is not sugar induced (data not shown). Taken together with the data

above, the findings support the hypothesis that the repeat sequences found in the

second AtMYB61 intron might function as a gene regulatory sequence to mediate

sugar-responsive gene regulation. Moreover, if they do function in this manner, they

might serve as binding sites for a repressor that binds to the sequences in the absence

of sugar and is released when sugar is present.

To determine if the repeats in the second intron of AtMYB61 could function as targets

for binding by a hypothetical sugar-mediated repressor, EMSAs were undertaken.

EMSAs used radioactively labelled second-intron repeats, and nuclear extracts from

plants that were grown in either the presence or absence of sucrose in the dark. The

second intron repeat motif was bound to a greater extent by proteins in nuclear

extracts obtained from seedlings grown in the absence of sucrose in the dark, relative to

those from seedlings grown in the presence of sucrose in the dark (Fig. 4.6). This

interaction was specific as determined by a competition assay using either unlabelled

second intron repeat or poly(dI-dC) as a competitor (Fig. S4.3). Taken together, these

findings are consistent with a nuclear-localised repressor protein binding to the second

intron repeat in seedlings grown in the absence of sugar.

Recently, intragenic regulatory elements have been identified that can function as either

repressors, enhancers or promoters of gene transcription (Dooley et al., 1996; Busch et

al., 1999; Deyholos and Sieburth, 2000; Kapranov et al., 2001; Fiume et al., 2004;

Wang et al., 2004; Fu et al., 2005; Osnato et al., 2010). In barley, a tandem duplication

of 305bp in intron IV is responsible for the dominant Hooded phenotype, which leads to

an ectopic overexpression of Knox3 at the distal end of the lemma and the

development of an extra flower in place of an awn present in wild-type spikelets (Muller

et al., 1995). In transgenic Nicotiana tabacum lines, the 305bp element can drive


Figure 4.6. EMSA shows that the AtMYB61 second intron motif is bound differentially by proteins in nuclear extracts from seedlings grown in the absence or presence of sucrose in the dark, consistent with the derepression model. EMSA of nuclear extracts from wild-type Columbia seedlings grown in the dark for 7 days with or without 30mM sucrose, using the second intron repeat as the probe, shows differential binding. Competition with AtMYB61 second intron repeat cold probe shows that this interaction is specific. Arrows indicate gel shifts by the probe.


reporter gene expression within the flower base, in contrast to the Knox3 promoter,

whose activity is restricted to the SAM (Santi et al., 2003). The 305bp intron element

acts as a floral-specific regulatory element. A one-hybrid screen identified three

proteins that bound the 305bp intron element (Osnato et al., 2010). The proteins were

Barley Ethylene Response Factor1 (BERF1), Barley Ethylene Insensitive Like1 (BEIL1),

and Barley Growth Regulating Factor1 (BGRF1). Both BERF1 and BEIL1 are ethylene

signalling proteins that act at the interface between ethylene sensing and gene

regulation. In rice protoplasts, BEIL1 activated a reporter gene driven by the 305-bp

intron element. In contrast, BERF1 counteracted this activation, acting as a repressor

at this site. All in all, BEIL1 and BERF1 mediate fine-tuning of Knox expression by

ethylene through binding to the 305-bp intron element, providing cross-talk between the

KNOX and ethylene pathways.

The gene encoding the Zea mays starch-branching enzyme I (SBEI) also has an intron-

derived transcriptional regulatory sequence. Importantly, this functions in sugar-

mediated gene regulation. In transient gene expression analysis, inclusion of the SBEI

first intron increased transcript abundance 14-fold relative to gene constructs that did

not contain this intron (Kim and Guiltinan, 1999). Two cis elements were found within a

60bp region that bound nuclear proteins prepared from maize kernels in a sucrose-

dependent manner. In another study, the first intron of cotton sucrose synthase 3

(Sus3), a regulator of cotton fiber development, was analysed (Ruan et al., 2009). The

first intron of Sus3 is a negative regulator of gene expression and represses expression

of its transcripts in pollen. A Pyrimidine-box (CCTTTTG) was identified in the first intron

of Sus3. This motif was also present in the promoter of the RICE ALPHA-AMYLASE

(RAmy1a) gene (Morita et al., 1998) and the barley alpha amylase (Amy2/32b) gene

(Mena et al., 2002). In barley, DOF (DNA binding with one Finger) transcription factor

PBF (Pyrimidine-box Binding Factor) protein is induced by gibberellin to recognise this

Pyrimidine-box motif. This suggests that the motifs within the Sus3 first intron are

recognised by hormone inducible transcription factors to then regulate gene expression.

Expression of animal c-MYB has also been shown to be regulated by intragenic

sequences (Dooley et al., 1996). The first intron of human and mouse c-Myb share

74% sequence conservation. These sites contain conserved GC-rich motifs, serving as


binding sites to a 70kDa activator protein and a 20kDa repressor protein (with a c-Jun

domain). Both c-MYB intron binding proteins regulate cell cycle progression (Dooley et

al., 1996). In concert with this, c-MYB has previously been shown to regulate the

proliferation and differentiation of hematopoietic cells (Mucenski et al., 1991), strongly

suggesting that the intragenic regulation of this gene is crucial for proper regulation.

Studies on the intragenic regulation of c-MYB paved the way for plant MYB studies.

In plants, the Arabidopsis thaliana R2R3-MYB gene GLABRA1 (GL1) is a central

regulator of trichome development (Oppenheimer et al., 1991). Trichome formation by

GL1 and GL1-like MYB genes (AtWER and GaMYB2) was regulated by their 5′

intragenic regions (Wang et al., 2004). Both AtWER and GaMYB2 contain a conserved

MYB binding site (CA/CGTTA) within their first intron that is suggested to be a binding

site for R2R3-MYB repressors and activators that act upstream of GL1 (Wang et al.,

2004). The findings presented here suggest that AtMYB61 may also be regulated

through its intron sequences. Determination of the proteins that may bring about such

regulation would enable a more stringent test of this hypothesis.

4.4.4 Affinity Purification Coupled with Mass Spectrometry Uncovers a

Suite of Putative AtMYB61 Repressor Proteins that Bind the

Conserved Second Intron Motif in a Sucrose-Dependent Manner

Affinity purification was used to identify putative repressor proteins that bound to the

second AtMYB61 intron. The second AtMYB61 intron and the second intron repeat

were both end-labelled with biotin. Biotinylation was confirmed using the

Chemiluminescent Biofisher Biotin Detection Kit (Nepean, Ontario, Canada; Fig. S4.4).

Nuclear proteins were affinity purified from 7 day-old dark grown wild-type Columbia

seedlings that had been grown either in the absence or presence of sucrose.

Streptavidin beads were used to immobilise the biotinylated second intron and second

intron repeat. The streptavidin-biotin complexes were then used to affinity purify

proteins that bound to the intron sequences generally, and the second intron repeat

specifically (Fig. 4.7). Affinity purified proteins were then characterised using liquid

chromatography coupled with tandem mass spectrometry (LC-MS/MS), and the proteins

identified based on their MS fingerprints (Table 4.1)(Kislinger et al., 2003; Hewel et al.,


Figure 4.7. Affinity purification coupled with LC-MS/MS determines putative AtMYB61 repressor proteins that bound AtMYB61 second intron repeat. Nuclear proteins were purified from 7 day-old dark grown wild-type Columbia seedlings grown in the absence or presence of sucrose. The nuclear proteins were exposed to AtMYB61 second intron or second intron repeat sequences. The silver stained gel of proteins eluted from the streptavidin-biotin pull-down assay displays certain proteins binding with greater affinity in the no sucrose condition compared to the 30mM sucrose condition consistent with the derepression model.


Table 4.1. List of putative repressors of AtMYB61 expression (RMX) that bound AtMYB61 second intron repeat

List of putative RMX proteins that bound the AtMYB61 second intron repeat, with corresponding Arabidopsis thaliana gene identifications (AGIs), mutant labels, SALK lines and protein annotations. AtMYB61 transcript abundance was misexpressed in a subset of rmx loss-of-function mutants in response to sucrose, as determined by qRT-PCR and semi-quantitative RT-PCR. The confidence of each protein identification was calculated using the StatQuest program (Kislinger et al., 2003), and each protein identified had a confidence level of greater than 50 percent (Hewel et al., 2010).

AGI        Mutant Label  AtMYB61 Misexpression to Sucrose  SALK Line     Protein Annotation
At4g04940  -             No                                SALK_112391C  Putative WD-repeat membrane protein
At1g06840  -             No                                SALK_134409C  Leucine-rich transmembrane kinase
At4g16830  -             No                                SALK_143514C  Putative nuclear antigen homolog protein
At3g45810  rmx1          Yes                               SALK_050658   Respiratory burst oxidase-like protein
At5g35700  rmx2          Yes                               SALK_082219C  Fimbrin FIMBRIN-LIKE PROTEIN 2
At2g43970  rmx3          Yes                               SALK_046986   La and winged repressor domain protein
At1g10170  -             No                                SALK_129409C  Homologue of human repressor NF-X1
At3g52100  -             No                                SALK_047892C  PHD finger family protein
At3g22980  -             No                                SALK_150941C  Elongation factor EF-2
At5g11700  -             No                                SALK_147133   Glycine rich protein on chromosome 5
At2g24650  rmx4          Yes                               SALK_109533C  DNA binding / transcription factor
At1g50680  rmx5          Yes                               SALK_047550C  RAV-like DNA-binding protein
At5g22760  -             No                                SALK_125978   PHD finger family protein
At1g07650  -             No                                SALK_009225C  Leucine-rich transmembrane protein kinase
At4g02430  rmx6          Yes                               SALK_032344C  Putative SR1 Protein
At4g24710  -             No                                SALK_031449C  Putative nucleotide binding protein
At5g55670  -             No                                SALK_036546C  RNA recognition motif-containing protein
At1g34460  -             No                                SALK_100844C  B1 cyclin cyclin-dependent protein kinase


2010). Consistent with a role in binding the second intron repeat of AtMYB61, the

proteins identified were all nuclear proteins and were mainly nucleic acid binding

proteins or proteins involved in DNA-binding complexes. None of the identified AtMYB61

second intron binding proteins has been biochemically characterised to date (Table

4.1).

4.4.5 A Subset of Putative AtMYB61 Repressor Genes Are Sugar

Sensitive

In order to determine whether the putative repressor proteins played a role in the

regulation of AtMYB61 expression, a genetic loss-of-function approach was taken.

Loss-of-function Arabidopsis thaliana mutants with T-DNA insertions in exons

corresponding to affinity purified proteins were ordered from the Arabidopsis Biological

Resource Centre and verified. These were then tested as putative repressors of

AtMYB61 expression (rmx) mutants. Sucrose-dependent AtMYB61 expression was

examined in putative rmx mutants using semi-quantitative PCR and quantitative real-

time PCR (Fig. S4.4; Fig. 4.8). In rmx mutants, it is hypothesised that AtMYB61

transcript abundance should be elevated, specifically in seedlings that had been grown

in the dark in the absence of sucrose. Of the 18 proteins for which putative rmx mutants

could be obtained, six proteins had rmx mutants that showed the predicted transcript

abundance profile, with elevated AtMYB61 transcripts in seedlings that had been grown

in the absence of sucrose (Fig. S4.5; Fig. 4.8; Table 4.1). While these six proteins have

yet to be characterised biochemically, four have been annotated as putative DNA-

binding proteins (Table 4.1). Notably, in certain rmx backgrounds (rmx1, rmx3, rmx4,

and rmx5), AtMYB61 transcript abundance is higher in the absence of sucrose

compared to the presence of sucrose (Fig. 4.8). Moreover, in the presence of sucrose,

AtMYB61 transcript abundance is, in general, lower in rmx background compared to

wild-type (Fig. 4.8).


Figure 4.8. qRT-PCR of loss-of-function mutants of putative repressors of AtMYB61 expression (rmx) that showed AtMYB61 misexpression in seedlings grown in the absence of sucrose in the dark, consistent with the repressor hypothesis. RNA was purified and cDNA synthesised from rmx mutants, and analysis of AtMYB61 gene expression, via AtMYB61-specific primers, validated a subset of rmx mutants that had higher AtMYB61 expression when grown in the absence of sucrose. These 6 positive rmx mutants were identified in a screen of 18 putative rmx mutants recovered from the streptavidin-biotin pull-down assay. Wild-type Columbia, loss-of-function atmyb61 mutants, and 35S::MYB61 overexpressor lines acted as controls for AtMYB61 expression in the quantitative PCR assay. ACTIN-11 was used as the reference gene for the qRT-PCR.


4.4.6 rmx Loss-of-Function Mutant Phenocopies Constitutive AtMYB61

Overexpression

To determine the phenotypic effect of rmx mutants, plants were grown for 8 weeks.

Mutants were grown simultaneously with AtMYB61 overexpressors (35S::MYB61), loss-

of-function mutants (atmyb61) and wild-type plants (Fig. 4.9). If the rmx mutants were

impaired in making protein that repressed AtMYB61 expression, then they should have

features of plants that constitutively overexpress AtMYB61. Constitutive AtMYB61

overexpressors developed quickly, bolted and flowered earlier, and senesced earlier

than wild-type plants (Romano et al., 2012). In contrast, atmyb61 plants developed

more slowly, bolted and flowered later, and senesced later relative to wild-type plants.

Of the rmx mutants, one (rmx3) phenocopied AtMYB61 overexpressors (Fig. 4.9). This

mutant bolted and flowered early, and senesced earlier than wild-type plants. The gene

corresponding to this mutant (At2g43970) is annotated as a La domain-containing

protein that functions in nucleic acid binding. This protein has a winged helix repressor

DNA-binding domain; however, no studies have biochemically characterised this protein

activity to date. Both molecular and phenotypic characterisations reported here suggest

that this protein represses AtMYB61 transcription.

Transcript abundance data across development support the hypothesis that At2g43970

is a negative regulator of AtMYB61 (Fig. S4.6) (http://bar.utoronto.ca/efp/cgi-

bin/efpWeb.cgi)(Schmid et al., 2005). Consistent with being a repressor, At2g43970

transcript abundance is inversely correlated with that of AtMYB61. That is, when the

transcript of At2g43970 is abundant within a tissue at a developmental time point,

AtMYB61 transcript abundance is lower, and vice versa. Combined, the data presented

here support the hypothesis that At2g43970 might be a direct repressor of AtMYB61,

functioning to regulate the expression of this gene in a sugar-dependent manner.


Figure 4.9. Phenotypes of Arabidopsis thaliana wild-type (WT) plants, AtMYB61 loss-of-function mutants (atmyb61), AtMYB61 over-expressor lines (35S::MYB61) and At2g43970 loss-of-function mutants (rmx3). (a) Plants grown on soil for 21 d at WT growth stage 1.12. (b) Plants grown on soil for 28 d at WT growth stage 5.90. (c) Graph displaying leaf senescence of plants grown for 8 weeks at WT growth stage 8.00 (1, fully yellow leaves; 5, fully green leaves). The leaf senescence assay was conducted on the basis of published standards (Romano et al., 2012). *Significantly different from WT (P < 0.05). Experiments were conducted on >10 plants per genotype per experiment. Plants were grown in individual pots, and were randomized in flats to discourage position-dependent effects. All rosette leaves were harvested. Measurement line represents 1 cm. Growth stages were assigned on the basis of published standards (Boyes et al., 2001).


4.5 Conclusion

The data presented herein provide evidence that AtMYB61, an R2R3-MYB

transcription factor, functions at the interface of sugar perception and sugar response.

Although in plants, gene specific transcriptional regulation is generally effected by the

binding of regulatory proteins to 5′ non-coding regions (Schwechheimer and Bevan,

1998; Lee and Young, 2000), the findings presented in this study support the hypothesis

that AtMYB61 makes use of intragenic, non-coding sequences as cis-acting binding

sites for a sugar mediated repressor protein to regulate its gene expression in a sugar

dependent manner. AtMYB61 was regulated by metabolisable sugars, particularly

sucrose, in a sugar-signalling pathway that does not appear to directly involve

hexokinase (Jang et al., 1997; Gibson, 2000; Smeekens, 2000). AtMYB61 expression

was de-repressed by sucrose in a mechanism involving intragenic sequences

determined by promoter-reporter fusion constructs, hinting at a sugar mediated

repression mechanism (Rolland et al., 2006). An over-represented motif was conserved

within the second intron of Brassicaceae AtMYB61 homologues and this motif

functioned as a binding target for a putative sugar-mediated repressor, as determined

by EMSA. Putative repressor proteins (RMX) that bound the AtMYB61 second intron motif

in seedlings grown in the absence of sucrose were affinity purified and characterised

using LC-MS/MS, and the proteins identified based on their MS fingerprints. These

proteins were all nuclear proteins and were mainly DNA-binding proteins or proteins

involved in DNA-binding complexes and have not been characterised to date. In rmx

mutants, it was hypothesised that AtMYB61 transcript abundance should be elevated in

seedlings that have been grown in the dark in the absence of sucrose. Six rmx mutants

showed this predicted transcript profile. Only one rmx mutant, whose gene corresponds

to At2g43970, could phenocopy transgenic plants overexpressing AtMYB61 (Romano et

al., 2012). This result supports the hypothesis that this gene encodes a repressor protein

that modulates AtMYB61 gene expression in vivo. The At2g43970 gene encodes a La

domain-containing protein that contains a winged helix repressor DNA-binding domain

and has not been characterised in Arabidopsis thaliana to date (Schwartz et al., 1999).

Moreover, AtMYB61 and At2g43970 showed inverse transcript abundance profiles across

development, supporting the hypothesis that At2g43970 encodes a protein that

represses AtMYB61. Taken together, these findings uncover a novel protein activity that

binds a conserved repeat motif within the AtMYB61 second intron and is suggested to regulate

sugar-mediated gene expression in AtMYB61 and other genes that contain this repeat,

acting independently of the HXK sugar-signalling pathway.

4.6 Acknowledgements

The authors are grateful to Christine Surman (University of Oxford) for technical

assistance; John Baker (University of Oxford) for assistance with photography;

Ho-Young Koo for assistance with nuclear protein extractions; Hilda Doan for

assistance with plant phenotype analyses; and the Nottingham Arabidopsis Stock Centre

(NASC) and Arabidopsis Biological Resource Center for provision of seeds. This

research was generously supported by a Natural Sciences and Engineering Research

Council of Canada (NSERC) Canada Graduate Scholarship (CGSD) awarded to

M.B.P., and by competitive grant funding from the UK Biotechnology and Biological

Sciences Research Council (BBSRC), the Canada Foundation for Innovation, and the

Natural Sciences and Engineering Research Council of Canada (NSERC) to M.M.C.

4.7 Supplemental Figures and Tables

Figure S4.1. Sequence alignment of the second intron of Brassicaceae AtMYB61 homologues. Sequence comparison of Brassicaceae AtMYB61 homologues (Arabidopsis thaliana gene At1g09540; Arabidopsis lyrata gene 919710; Capsella rubella gene Carubv10009497m.g; Brassica rapa gene Bra020016; and Thellungiella halophila gene Thhalv10008000m.g) uncovers a highly conserved motif in intron 2 (conserved across 16-21 million years of divergence). Yellow boxes indicate the second intron repeat in the sense direction. Purple boxes indicate the second intron repeat in the antisense direction. An asterisk (*) indicates positions that have a single, fully conserved residue. A colon (:) indicates conservation between groups of strongly similar properties. A period (.) indicates conservation between groups of weakly similar properties.

Figure S4.2. Sequence alignment of AtMYB61 and AtMYB50 reveals no second intron repeat within the AtMYB50 second intron. Sequence alignment was conducted on intron 2 of AtMYB61 and AtMYB50. AtMYB50 is the R2R3-MYB family member most closely related to AtMYB61. AtMYB50 is not sugar responsive and did not contain the second intron repeat. Yellow boxes indicate the second intron motif in the sense direction (5′-CTCTGTTTT-3′). Purple boxes indicate the second intron motif in the antisense direction (5′-AAAACAGAG-3′).
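
The relationship between the two motif orientations quoted in this caption can be checked directly: the antisense sequence should be the reverse complement of the sense sequence. The short sketch below does this in plain Python; the function name is illustrative and is not taken from the thesis.

```python
# Check that the antisense motif is the reverse complement of the sense motif.
# Both sequences are copied from the Figure S4.2 caption; the helper name is hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


SENSE_MOTIF = "CTCTGTTTT"
ANTISENSE_MOTIF = "AAAACAGAG"

# True when the two orientations describe the same double-stranded site.
same_site = reverse_complement(SENSE_MOTIF) == ANTISENSE_MOTIF
print(same_site)
```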

Figure S4.3. EMSA shows that the AtMYB61 second intron motif is bound differentially by proteins in nuclear extracts from seedlings grown in the absence or presence of sucrose in the dark, consistent with the derepression model. EMSA of the second intron repeat with nuclear proteins from wild-type Columbia seedlings grown for 7 days in the dark with or without 30 mM sucrose shows differential binding. The nonspecific competitor poly(dIdC) could not outcompete this specific interaction. Arrows indicate gel shifts of the probe.

Figure S4.4. Validation of biotinylation of the AtMYB61 second intron and second intron repeat. The biotinylation of the second intron and the second intron repeat of AtMYB61 was confirmed using the Chemiluminescent Biofisher Detection Biotin Kit. Signal was detected only for the biotinylated AtMYB61 intron 2 and second intron repeat.

Figure S4.5. Semi-quantitative PCR of AtMYB61 expression in repressor of AtMYB61 expression (rmx) loss-of-function mutant seedlings grown in the absence or presence of sucrose in the dark. Semi-quantitative PCR validated a subset of rmx mutants that had higher AtMYB61 expression when grown in the absence of sucrose in the dark. These 6 positive rmx mutants were identified from a screen of 18 putative rmx mutants. Wild-type (WT) plants, loss-of-function atmyb61 mutants, and AtMYB61 overexpressor mutants (35S::MYB61) provided controls for AtMYB61 expression in response to sucrose. ACTIN-11 (ACT-11) was used as a reference gene and loading control for the assay. 25 PCR cycles were used in this experiment.

Figure S4.6. At2g43970 and At1g09540 show inverse transcript abundance profiles across development. The eFP browser shows that (a) At2g43970 and (b) At1g09540 (AtMYB61) have inverse transcript abundance profiles across development in different organs. Together with other data presented in this study, this suggests that At2g43970 is a repressor of AtMYB61.

Table S4.1. AtMYB61 second intron repeat motif identified within all Arabidopsis thaliana genes. Note that a cutoff of at least 3 motifs occurring at least 500 bp apart was set; thus, 83 genes contain this repeat. Highlighted regions indicate a unique gene.

AGI | # of hits | Position (start, end) | Orientation (motif sequence)

AT5G46240.1 8 246 238 AAAACAGAG

AT5G46240.1 8 370 378 CTCTGTTTT

AT5G46240.1 8 384 392 CTCTGTTTT

AT5G46240.1 8 436 444 CTCTGTTTT

AT5G46240.1 8 472 480 CTCTGTTTT

AT5G46240.1 8 1130 1122 AAAACAGAG

AT5G46240.1 8 1845 1837 AAAACAGAG

AT5G46240.1 8 2594 2586 AAAACAGAG

AT1G67070.1 7 564 572 CTCTGTTTT

AT1G67070.1 7 576 584 CTCTGTTTT

AT1G67070.1 7 766 774 CTCTGTTTT

AT1G67070.1 7 779 787 CTCTGTTTT

AT1G67070.1 7 805 813 CTCTGTTTT

AT1G67070.1 7 951 959 CTCTGTTTT

AT1G67070.1 7 964 972 CTCTGTTTT

AT3G60130.1 6 1707 1715 CTCTGTTTT

AT3G60130.1 6 1724 1732 CTCTGTTTT

AT3G60130.1 6 1751 1759 CTCTGTTTT

AT3G60130.1 6 1778 1786 CTCTGTTTT

AT3G60130.1 6 1805 1813 CTCTGTTTT

AT3G60130.1 6 2597 2605 CTCTGTTTT

AT5G38970.1 6 75 67 AAAACAGAG

AT5G38970.1 6 95 87 AAAACAGAG

AT5G38970.1 6 107 99 AAAACAGAG

AT5G38970.1 6 721 729 CTCTGTTTT

AT5G38970.1 6 745 753 CTCTGTTTT

AT5G38970.1 6 849 857 CTCTGTTTT

AT5G45340.1 6 279 271 AAAACAGAG

AT5G45340.1 6 296 288 AAAACAGAG

AT5G45340.1 6 740 732 AAAACAGAG

AT5G45340.1 6 776 768 AAAACAGAG

AT5G45340.1 6 781 789 CTCTGTTTT

AT5G45340.1 6 804 812 CTCTGTTTT

AT5G53660.1 6 680 672 AAAACAGAG

AT5G53660.1 6 815 823 CTCTGTTTT

AT5G53660.1 6 842 850 CTCTGTTTT

AT5G53660.1 6 855 863 CTCTGTTTT

AT5G53660.1 6 900 892 AAAACAGAG

AT5G53660.1 6 1003 995 AAAACAGAG

AT1G30320.1 5 203 211 CTCTGTTTT

AT1G30320.1 5 726 734 CTCTGTTTT

AT1G30320.1 5 737 745 CTCTGTTTT

AT1G30320.1 5 834 842 CTCTGTTTT

AT1G30320.1 5 877 885 CTCTGTTTT

AT2G21560.1 5 56 64 CTCTGTTTT

AT2G21560.1 5 591 583 AAAACAGAG

AT2G21560.1 5 635 627 AAAACAGAG

AT2G21560.1 5 645 637 AAAACAGAG

AT2G21560.1 5 708 700 AAAACAGAG

AT4G02780.1 5 35 27 AAAACAGAG

AT4G02780.1 5 346 354 CTCTGTTTT

AT4G02780.1 5 476 484 CTCTGTTTT

AT4G02780.1 5 566 574 CTCTGTTTT

AT4G02780.1 5 718 726 CTCTGTTTT

AT5G37600.1 5 368 376 CTCTGTTTT

AT5G37600.1 5 556 564 CTCTGTTTT

AT5G37600.1 5 580 588 CTCTGTTTT

AT5G37600.1 5 592 600 CTCTGTTTT

AT5G37600.1 5 1688 1680 AAAACAGAG

AT1G09540.1 4 575 567 AAAACAGAG

AT1G09540.1 4 604 612 CTCTGTTTT

AT1G09540.1 4 621 629 CTCTGTTTT

AT1G09540.1 4 661 669 CTCTGTTTT

AT1G17830.1 4 2338 2346 CTCTGTTTT

AT1G17830.1 4 2351 2359 CTCTGTTTT

AT1G17830.1 4 2381 2389 CTCTGTTTT

AT1G17830.1 4 2394 2402 CTCTGTTTT

AT1G32700.1 4 198 206 CTCTGTTTT

AT1G32700.1 4 225 233 CTCTGTTTT

AT1G32700.1 4 373 381 CTCTGTTTT

AT1G32700.1 4 390 398 CTCTGTTTT

AT1G51950.1 4 451 459 CTCTGTTTT

AT1G51950.1 4 469 461 AAAACAGAG

AT1G51950.1 4 662 670 CTCTGTTTT

AT1G51950.1 4 833 841 CTCTGTTTT

AT1G61800.1 4 668 676 CTCTGTTTT

AT1G61800.1 4 729 737 CTCTGTTTT

AT1G61800.1 4 757 765 CTCTGTTTT

AT1G61800.1 4 787 795 CTCTGTTTT

AT1G69530.1 4 764 772 CTCTGTTTT

AT1G69530.1 4 787 779 AAAACAGAG

AT1G69530.1 4 826 834 CTCTGTTTT

AT1G69530.1 4 849 841 AAAACAGAG

AT1G70550.1 4 639 647 CTCTGTTTT

AT1G70550.1 4 650 658 CTCTGTTTT

AT1G70550.1 4 661 669 CTCTGTTTT

AT1G70550.1 4 710 718 CTCTGTTTT

AT1G72150.1 4 2 10 CTCTGTTTT

AT1G72150.1 4 476 468 AAAACAGAG

AT1G72150.1 4 491 483 AAAACAGAG

AT1G72150.1 4 521 513 AAAACAGAG

AT2G01540.1 4 499 507 CTCTGTTTT

AT2G01540.1 4 540 548 CTCTGTTTT

AT2G01540.1 4 580 588 CTCTGTTTT

AT2G01540.1 4 619 627 CTCTGTTTT

AT2G37440.1 4 244 252 CTCTGTTTT

AT2G37440.1 4 502 510 CTCTGTTTT

AT2G37440.1 4 660 668 CTCTGTTTT

AT2G37440.1 4 703 711 CTCTGTTTT

AT2G38120.1 4 354 362 CTCTGTTTT

AT2G38120.1 4 411 419 CTCTGTTTT

AT2G38120.1 4 425 433 CTCTGTTTT

AT2G38120.1 4 466 474 CTCTGTTTT

AT2G40320.1 4 515 507 AAAACAGAG

AT2G40320.1 4 707 715 CTCTGTTTT

AT2G40320.1 4 733 741 CTCTGTTTT

AT2G40320.1 4 759 767 CTCTGTTTT

AT3G03650.1 4 479 471 AAAACAGAG

AT3G03650.1 4 518 510 AAAACAGAG

AT3G03650.1 4 535 527 AAAACAGAG

AT3G03650.1 4 557 549 AAAACAGAG

AT3G16520.1 4 298 306 CTCTGTTTT

AT3G16520.1 4 691 683 AAAACAGAG

AT3G16520.1 4 715 723 CTCTGTTTT

AT3G16520.1 4 844 852 CTCTGTTTT

AT3G28180.1 4 932 940 CTCTGTTTT

AT3G28180.1 4 943 951 CTCTGTTTT

AT3G28180.1 4 975 983 CTCTGTTTT

AT3G28180.1 4 986 994 CTCTGTTTT

AT4G00430.1 4 463 471 CTCTGTTTT

AT4G00430.1 4 480 488 CTCTGTTTT

AT4G00430.1 4 505 513 CTCTGTTTT

AT4G00430.1 4 539 547 CTCTGTTTT

AT4G13710.1 4 488 496 CTCTGTTTT

AT4G13710.1 4 533 525 AAAACAGAG

AT4G13710.1 4 733 741 CTCTGTTTT

AT4G13710.1 4 778 770 AAAACAGAG

AT4G19230.1 4 746 754 CTCTGTTTT

AT4G19230.1 4 813 805 AAAACAGAG

AT4G19230.1 4 829 837 CTCTGTTTT

AT4G19230.1 4 1300 1292 AAAACAGAG

AT4G34990.1 4 361 369 CTCTGTTTT

AT4G34990.1 4 398 406 CTCTGTTTT

AT4G34990.1 4 412 420 CTCTGTTTT

AT4G34990.1 4 608 600 AAAACAGAG

AT5G02170.1 4 208 216 CTCTGTTTT

AT5G02170.1 4 362 370 CTCTGTTTT

AT5G02170.1 4 600 608 CTCTGTTTT

AT5G02170.1 4 1578 1586 CTCTGTTTT

AT5G40030.1 4 734 742 CTCTGTTTT

AT5G40030.1 4 846 854 CTCTGTTTT

AT5G40030.1 4 934 942 CTCTGTTTT

AT5G40030.1 4 1170 1178 CTCTGTTTT

AT5G61570.1 4 687 695 CTCTGTTTT

AT5G61570.1 4 699 707 CTCTGTTTT

AT5G61570.1 4 932 940 CTCTGTTTT

AT5G61570.1 4 1284 1276 AAAACAGAG

AT5G63850.1 4 839 847 CTCTGTTTT

AT5G63850.1 4 869 877 CTCTGTTTT

AT5G63850.1 4 890 898 CTCTGTTTT

AT5G63850.1 4 910 918 CTCTGTTTT

AT1G01590.1 3 1752 1760 CTCTGTTTT

AT1G01590.1 3 2134 2142 CTCTGTTTT

AT1G01590.1 3 2164 2172 CTCTGTTTT

AT1G04610.1 3 927 919 AAAACAGAG

AT1G04610.1 3 946 954 CTCTGTTTT

AT1G04610.1 3 965 957 AAAACAGAG

AT1G07340.1 3 632 640 CTCTGTTTT

AT1G07340.1 3 653 661 CTCTGTTTT

AT1G07340.1 3 675 683 CTCTGTTTT

AT1G10220.1 3 354 346 AAAACAGAG

AT1G10220.1 3 628 620 AAAACAGAG

AT1G10220.1 3 742 750 CTCTGTTTT

AT1G10750.1 3 676 684 CTCTGTTTT

AT1G10750.1 3 716 724 CTCTGTTTT

AT1G10750.1 3 745 753 CTCTGTTTT

AT1G16380.1 3 2527 2535 CTCTGTTTT

AT1G16380.1 3 2591 2599 CTCTGTTTT

AT1G16380.1 3 2757 2749 AAAACAGAG

AT1G19050.1 3 277 285 CTCTGTTTT

AT1G19050.1 3 303 311 CTCTGTTTT

AT1G19050.1 3 334 342 CTCTGTTTT

AT1G26770.1 3 756 748 AAAACAGAG

AT1G26770.1 3 778 786 CTCTGTTTT

AT1G26770.1 3 801 793 AAAACAGAG

AT1G64355.1 3 458 466 CTCTGTTTT

AT1G64355.1 3 497 505 CTCTGTTTT

AT1G64355.1 3 529 537 CTCTGTTTT

AT1G65150.1 3 1653 1661 CTCTGTTTT

AT1G65150.1 3 1756 1764 CTCTGTTTT

AT1G65150.1 3 1817 1825 CTCTGTTTT

AT1G65920.1 3 178 186 CTCTGTTTT

AT1G65920.1 3 205 213 CTCTGTTTT

AT1G65920.1 3 215 223 CTCTGTTTT

AT1G76360.1 3 180 172 AAAACAGAG

AT1G76360.1 3 191 183 AAAACAGAG

AT1G76360.1 3 515 523 CTCTGTTTT

AT1G77330.1 3 239 247 CTCTGTTTT

AT1G77330.1 3 349 341 AAAACAGAG

AT1G77330.1 3 593 585 AAAACAGAG

AT1G78440.1 3 223 231 CTCTGTTTT

AT1G78440.1 3 444 452 CTCTGTTTT

AT1G78440.1 3 463 471 CTCTGTTTT

AT2G03730.1 3 145 153 CTCTGTTTT

AT2G03730.1 3 397 405 CTCTGTTTT

AT2G03730.1 3 506 514 CTCTGTTTT

AT2G13840.1 3 552 560 CTCTGTTTT

AT2G13840.1 3 619 627 CTCTGTTTT

AT2G13840.1 3 687 695 CTCTGTTTT

AT2G23320.1 3 806 814 CTCTGTTTT

AT2G23320.1 3 834 826 AAAACAGAG

AT2G23320.1 3 1049 1041 AAAACAGAG

AT2G25460.1 3 41 33 AAAACAGAG

AT2G25460.1 3 67 59 AAAACAGAG

AT2G25460.1 3 91 83 AAAACAGAG

AT2G33230.1 3 747 739 AAAACAGAG

AT2G33230.1 3 785 777 AAAACAGAG

AT2G33230.1 3 801 809 CTCTGTTTT

AT2G38090.1 3 216 224 CTCTGTTTT

AT2G38090.1 3 520 528 CTCTGTTTT

AT2G38090.1 3 643 635 AAAACAGAG

AT2G39210.1 3 670 662 AAAACAGAG

AT2G39210.1 3 681 673 AAAACAGAG

AT2G39210.1 3 691 683 AAAACAGAG

AT3G03780.1 3 115 123 CTCTGTTTT

AT3G03780.1 3 146 154 CTCTGTTTT

AT3G03780.1 3 195 203 CTCTGTTTT

AT3G24600.1 3 2056 2064 CTCTGTTTT

AT3G24600.1 3 2068 2076 CTCTGTTTT

AT3G24600.1 3 2311 2319 CTCTGTTTT

AT3G46110.1 3 310 318 CTCTGTTTT

AT3G46110.1 3 325 333 CTCTGTTTT

AT3G46110.1 3 519 527 CTCTGTTTT

AT3G48360.1 3 554 562 CTCTGTTTT

AT3G48360.1 3 627 635 CTCTGTTTT

AT3G48360.1 3 976 984 CTCTGTTTT

AT3G61230.1 3 429 437 CTCTGTTTT

AT3G61230.1 3 443 451 CTCTGTTTT

AT3G61230.1 3 567 575 CTCTGTTTT

AT3G61750.1 3 106 114 CTCTGTTTT

AT3G61750.1 3 272 280 CTCTGTTTT

AT3G61750.1 3 300 292 AAAACAGAG

AT4G03210.1 3 281 289 CTCTGTTTT

AT4G03210.1 3 299 307 CTCTGTTTT

AT4G03210.1 3 313 321 CTCTGTTTT

AT4G09460.1 3 362 354 AAAACAGAG

AT4G09460.1 3 381 373 AAAACAGAG

AT4G09460.1 3 558 566 CTCTGTTTT

AT4G12080.1 3 741 749 CTCTGTTTT

AT4G12080.1 3 753 761 CTCTGTTTT

AT4G12080.1 3 912 920 CTCTGTTTT

AT4G22880.1 3 65 73 CTCTGTTTT

AT4G22880.1 3 105 113 CTCTGTTTT

AT4G22880.1 3 117 125 CTCTGTTTT

AT4G25420.1 3 423 415 AAAACAGAG

AT4G25420.1 3 618 610 AAAACAGAG

AT4G25420.1 3 681 689 CTCTGTTTT

AT4G28025.1 3 106 98 AAAACAGAG

AT4G28025.1 3 270 278 CTCTGTTTT

AT4G28025.1 3 419 427 CTCTGTTTT

AT4G35300.1 3 235 243 CTCTGTTTT

AT4G35300.1 3 332 340 CTCTGTTTT

AT4G35300.1 3 427 435 CTCTGTTTT

AT5G09220.1 3 1108 1116 CTCTGTTTT

AT5G09220.1 3 1122 1130 CTCTGTTTT

AT5G09220.1 3 1163 1171 CTCTGTTTT

AT5G09460.1 3 137 145 CTCTGTTTT

AT5G09460.1 3 198 206 CTCTGTTTT

AT5G09460.1 3 475 483 CTCTGTTTT

AT5G09461.1 3 137 145 CTCTGTTTT

AT5G09461.1 3 198 206 CTCTGTTTT

AT5G09461.1 3 475 483 CTCTGTTTT

AT5G09462.1 3 137 145 CTCTGTTTT

AT5G09462.1 3 198 206 CTCTGTTTT

AT5G09462.1 3 475 483 CTCTGTTTT

AT5G09463.1 3 137 145 CTCTGTTTT

AT5G09463.1 3 198 206 CTCTGTTTT

AT5G09463.1 3 475 483 CTCTGTTTT

AT5G12050.1 3 240 232 AAAACAGAG

AT5G12050.1 3 430 422 AAAACAGAG

AT5G12050.1 3 458 450 AAAACAGAG

AT5G14370.1 3 109 101 AAAACAGAG

AT5G14370.1 3 150 142 AAAACAGAG

AT5G14370.1 3 331 323 AAAACAGAG

AT5G26230.1 3 453 461 CTCTGTTTT

AT5G26230.1 3 632 624 AAAACAGAG

AT5G26230.1 3 651 643 AAAACAGAG

AT5G39785.1 3 387 379 AAAACAGAG

AT5G39785.1 3 399 407 CTCTGTTTT

AT5G39785.1 3 435 427 AAAACAGAG

AT5G39850.1 3 134 142 CTCTGTTTT

AT5G39850.1 3 269 277 CTCTGTTTT

AT5G39850.1 3 309 317 CTCTGTTTT

AT5G40460.1 3 139 147 CTCTGTTTT

AT5G40460.1 3 153 161 CTCTGTTTT

AT5G40460.1 3 175 183 CTCTGTTTT

AT5G41380.1 3 485 477 AAAACAGAG

AT5G41380.1 3 731 739 CTCTGTTTT

AT5G41380.1 3 748 756 CTCTGTTTT

AT5G49340.1 3 512 520 CTCTGTTTT

AT5G49340.1 3 682 674 AAAACAGAG

AT5G49340.1 3 707 699 AAAACAGAG

AT5G51670.1 3 525 517 AAAACAGAG

AT5G51670.1 3 536 528 AAAACAGAG

AT5G51670.1 3 549 541 AAAACAGAG

AT5G57350.1 3 2683 2691 CTCTGTTTT

AT5G57350.1 3 2695 2703 CTCTGTTTT

AT5G57350.1 3 2720 2728 CTCTGTTTT

AT5G58000.1 3 257 249 AAAACAGAG

AT5G58000.1 3 492 484 AAAACAGAG

AT5G58000.1 3 705 713 CTCTGTTTT

AT5G62140.1 3 289 281 AAAACAGAG

AT5G62140.1 3 472 480 CTCTGTTTT

AT5G62140.1 3 514 522 CTCTGTTTT
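
The search procedure that produced the table above is not described in detail here, so the following is only a minimal sketch of how a two-strand scan for the second intron repeat might look. The function names and the toy sequence are hypothetical. Coordinates are 1-based, and antisense hits are listed end-first to mirror the layout of the Position column. Only the hit-count threshold is applied; the spacing criterion in the caption is not encoded because its exact definition is not given.

```python
import re
from typing import List, Tuple

SENSE = "CTCTGTTTT"
ANTISENSE = "AAAACAGAG"  # reverse complement of SENSE

Hit = Tuple[int, int, str]


def find_motif_hits(seq: str) -> List[Hit]:
    """Return (position_1, position_2, motif) for each occurrence, 1-based.

    Sense hits are reported as (start, end); antisense hits as (end, start),
    mirroring the way the table lists antisense positions.
    """
    seq = seq.upper()
    hits: List[Hit] = []
    for motif in (SENSE, ANTISENSE):
        # A lookahead is used so that overlapping occurrences are not skipped.
        for match in re.finditer(f"(?={motif})", seq):
            start = match.start() + 1
            end = match.start() + len(motif)
            hits.append((start, end, motif) if motif == SENSE else (end, start, motif))
    # Order hits by their leftmost coordinate along the sequence.
    return sorted(hits, key=lambda h: min(h[0], h[1]))


def passes_cutoff(hits: List[Hit], min_hits: int = 3) -> bool:
    """Apply the hit-count part of the cutoff (at least `min_hits` motifs)."""
    return len(hits) >= min_hits


# Toy sequence built only for illustration; it is not a real Arabidopsis sequence.
toy_sequence = "GG" + SENSE + "TTT" + ANTISENSE + "A" + SENSE
toy_hits = find_motif_hits(toy_sequence)
for hit in toy_hits:
    print(*hit)
print("passes cutoff:", passes_cutoff(toy_hits))
```

Applied to each gene, intergenic region or intron in turn, a scan of this kind would yield rows in the same shape as Tables S4.1 to S4.3.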

Table S4.2. AtMYB61 second intron repeat motif identified within all Arabidopsis thaliana intergenic regions. Note that a cutoff of at least 3 motifs occurring at least 500 bp apart was set; thus, 15 intergenic regions contain this repeat. Highlighted regions indicate unique intergenic regions.

AGI of intergenic region | # of hits | Position (start, end) | Orientation (motif sequence)

AT5G57480-AT5G57490 6 37 29 AAAACAGAG

AT5G57480-AT5G57490 6 51 43 AAAACAGAG

AT5G57480-AT5G57490 6 70 62 AAAACAGAG

AT5G57480-AT5G57490 6 81 73 AAAACAGAG

AT5G57480-AT5G57490 6 101 93 AAAACAGAG

AT5G57480-AT5G57490 6 121 113 AAAACAGAG

AT4G16880-AT4G16890 4 1840 1848 CTCTGTTTT

AT4G16880-AT4G16890 4 2114 2106 AAAACAGAG

AT4G16880-AT4G16890 4 2284 2276 AAAACAGAG

AT4G16880-AT4G16890 4 3170 3162 AAAACAGAG

AT5G16970-AT5G16980 4 350 342 AAAACAGAG

AT5G16970-AT5G16980 4 369 361 AAAACAGAG

AT5G16970-AT5G16980 4 743 735 AAAACAGAG

AT5G16970-AT5G16980 4 755 747 AAAACAGAG

AT1G71680-AT1G71690 3 60 52 AAAACAGAG

AT1G71680-AT1G71690 3 79 71 AAAACAGAG

AT1G71680-AT1G71690 3 92 84 AAAACAGAG

AT1G71950-AT1G71960 3 674 682 CTCTGTTTT

AT1G71950-AT1G71960 3 705 713 CTCTGTTTT

AT1G71950-AT1G71960 3 717 725 CTCTGTTTT

AT2G05360-AT2G05370 3 63 55 AAAACAGAG

AT2G05360-AT2G05370 3 291 283 AAAACAGAG

AT2G05360-AT2G05370 3 304 296 AAAACAGAG

AT3G16120-AT3G16130 3 583 591 CTCTGTTTT

AT3G16120-AT3G16130 3 666 658 AAAACAGAG

AT3G16120-AT3G16130 3 682 674 AAAACAGAG

AT3G27610-AT3G27620 3 813 805 AAAACAGAG

AT3G27610-AT3G27620 3 835 827 AAAACAGAG

AT3G27610-AT3G27620 3 845 837 AAAACAGAG

AT4G23200-AT4G23210 3 604 596 AAAACAGAG

AT4G23200-AT4G23210 3 620 612 AAAACAGAG

AT4G23200-AT4G23210 3 648 640 AAAACAGAG

AT4G34400-AT4G34410 3 447 439 AAAACAGAG

AT4G34400-AT4G34410 3 454 462 CTCTGTTTT

AT4G34400-AT4G34410 3 713 705 AAAACAGAG

AT4G37030-AT4G37040 3 202 210 CTCTGTTTT

AT4G37030-AT4G37040 3 243 251 CTCTGTTTT

AT4G37030-AT4G37040 3 259 267 CTCTGTTTT

AT5G12950-AT5G12960 3 543 551 CTCTGTTTT

AT5G12950-AT5G12960 3 564 572 CTCTGTTTT

AT5G12950-AT5G12960 3 586 594 CTCTGTTTT

AT5G29015-AT5G29020 3 1008 1016 CTCTGTTTT

AT5G29015-AT5G29020 3 1300 1308 CTCTGTTTT

AT5G29015-AT5G29020 3 1328 1336 CTCTGTTTT

AT5G57520-AT5G57530 3 3396 3404 CTCTGTTTT

AT5G57520-AT5G57530 3 3471 3479 CTCTGTTTT

AT5G57520-AT5G57530 3 3496 3504 CTCTGTTTT

AT5G57535-AT5G57540 3 169 177 CTCTGTTTT

AT5G57535-AT5G57540 3 181 189 CTCTGTTTT

AT5G57535-AT5G57540 3 202 210 CTCTGTTTT

Table S4.3. AtMYB61 second intron repeat motif identified within all Arabidopsis thaliana introns and corresponding transcript response to sugar. Note that a cutoff of at least 3 motifs occurring at least 500 bp apart was set; thus, 45 introns contain this repeat. Of the AtMYB61 second intron repeat motif occurrences, 45.9% are within introns. Of the 45 introns with these repeats, 21 (46.7%) are within sugar-responsive genes. Highlighted regions indicate a unique gene. The suffix -n in the AGI indicates the intron in which the motif is present. Sugar-responsive genes were identified from microarray data reported by Romano et al. (2012).

AGI | # of hits | Position (start, end) | Orientation (motif sequence) | Sugar responsive

AT1G67070.1-1 7 15 23 CTCTGTTTT NO

AT1G67070.1-1 7 27 35 CTCTGTTTT

AT1G67070.1-1 7 217 225 CTCTGTTTT

AT1G67070.1-1 7 230 238 CTCTGTTTT

AT1G67070.1-1 7 256 264 CTCTGTTTT

AT1G67070.1-1 7 402 410 CTCTGTTTT

AT1G67070.1-1 7 415 423 CTCTGTTTT

AT3G60130.1-7 5 168 176 CTCTGTTTT YES

AT3G60130.1-7 5 185 193 CTCTGTTTT

AT3G60130.1-7 5 212 220 CTCTGTTTT

AT3G60130.1-7 5 239 247 CTCTGTTTT

AT3G60130.1-7 5 266 274 CTCTGTTTT

AT1G09540.1-2 4 58 50 AAAACAGAG YES

AT1G09540.1-2 4 87 95 CTCTGTTTT

AT1G09540.1-2 4 104 112 CTCTGTTTT

AT1G09540.1-2 4 144 152 CTCTGTTTT

AT1G30320.1-2 4 76 84 CTCTGTTTT NO

AT1G30320.1-2 4 87 95 CTCTGTTTT

AT1G30320.1-2 4 184 192 CTCTGTTTT

AT1G30320.1-2 4 227 235 CTCTGTTTT

AT1G32700.1-1 4 40 48 CTCTGTTTT YES

AT1G32700.1-1 4 67 75 CTCTGTTTT

AT1G32700.1-1 4 215 223 CTCTGTTTT

AT1G32700.1-1 4 232 240 CTCTGTTTT

AT1G61800.1-1 4 89 97 CTCTGTTTT YES

AT1G61800.1-1 4 150 158 CTCTGTTTT

AT1G61800.1-1 4 178 186 CTCTGTTTT

AT1G61800.1-1 4 208 216 CTCTGTTTT

AT1G69530.1-2 4 25 33 CTCTGTTTT YES

AT1G69530.1-2 4 48 40 AAAACAGAG

AT1G69530.1-2 4 87 95 CTCTGTTTT

AT1G69530.1-2 4 110 102 AAAACAGAG

AT1G70550.1-1 4 85 93 CTCTGTTTT NO

AT1G70550.1-1 4 96 104 CTCTGTTTT

AT1G70550.1-1 4 107 115 CTCTGTTTT

AT1G70550.1-1 4 156 164 CTCTGTTTT

AT2G01540.1-1 4 125 133 CTCTGTTTT YES

AT2G01540.1-1 4 166 174 CTCTGTTTT

AT2G01540.1-1 4 206 214 CTCTGTTTT

AT2G01540.1-1 4 245 253 CTCTGTTTT

AT2G37440.1-1 4 137 145 CTCTGTTTT NO

AT2G37440.1-1 4 395 403 CTCTGTTTT

AT2G37440.1-1 4 553 561 CTCTGTTTT

AT2G37440.1-1 4 596 604 CTCTGTTTT

AT2G38120.1-1 4 82 90 CTCTGTTTT YES

AT2G38120.1-1 4 139 147 CTCTGTTTT

AT2G38120.1-1 4 153 161 CTCTGTTTT

AT2G38120.1-1 4 194 202 CTCTGTTTT

AT3G28180.1-1 4 15 23 CTCTGTTTT YES

AT3G28180.1-1 4 26 34 CTCTGTTTT

AT3G28180.1-1 4 58 66 CTCTGTTTT

AT3G28180.1-1 4 69 77 CTCTGTTTT

AT4G00430.1-1 4 36 44 CTCTGTTTT YES

AT4G00430.1-1 4 53 61 CTCTGTTTT

AT4G00430.1-1 4 78 86 CTCTGTTTT

AT4G00430.1-1 4 112 120 CTCTGTTTT

AT4G02780.1-1 4 202 210 CTCTGTTTT NO

AT4G02780.1-1 4 332 340 CTCTGTTTT

AT4G02780.1-1 4 422 430 CTCTGTTTT

AT4G02780.1-1 4 574 582 CTCTGTTTT

AT5G37600.1-1 4 109 117 CTCTGTTTT YES

AT5G37600.1-1 4 297 305 CTCTGTTTT

AT5G37600.1-1 4 321 329 CTCTGTTTT

AT5G37600.1-1 4 333 341 CTCTGTTTT

AT5G40030.1-1 4 84 92 CTCTGTTTT NO

AT5G40030.1-1 4 196 204 CTCTGTTTT

AT5G40030.1-1 4 284 292 CTCTGTTTT

AT5G40030.1-1 4 520 528 CTCTGTTTT

AT5G45340.1-2 4 33 25 AAAACAGAG YES

AT5G45340.1-2 4 69 61 AAAACAGAG

AT5G45340.1-2 4 74 82 CTCTGTTTT

AT5G45340.1-2 4 97 105 CTCTGTTTT

AT5G46240.1-1 4 22 30 CTCTGTTTT NO

AT5G46240.1-1 4 36 44 CTCTGTTTT

AT5G46240.1-1 4 88 96 CTCTGTTTT

AT5G46240.1-1 4 124 132 CTCTGTTTT

AT5G61570.1-1 4 7 15 CTCTGTTTT NO

AT5G61570.1-1 4 19 27 CTCTGTTTT

AT5G61570.1-1 4 252 260 CTCTGTTTT

AT5G61570.1-1 4 604 596 AAAACAGAG

AT5G63850.1-3 4 17 25 CTCTGTTTT NO

AT5G63850.1-3 4 47 55 CTCTGTTTT

AT5G63850.1-3 4 68 76 CTCTGTTTT

AT5G63850.1-3 4 88 96 CTCTGTTTT

AT1G04610.1-1 3 77 69 AAAACAGAG NO

AT1G04610.1-1 3 96 104 CTCTGTTTT

AT1G04610.1-1 3 115 107 AAAACAGAG

AT1G07340.1-2 3 46 54 CTCTGTTTT NO

AT1G07340.1-2 3 67 75 CTCTGTTTT

AT1G07340.1-2 3 89 97 CTCTGTTTT

AT1G10750.1-1 3 149 157 CTCTGTTTT NO

AT1G10750.1-1 3 189 197 CTCTGTTTT

AT1G10750.1-1 3 218 226 CTCTGTTTT

AT1G19050.1-1 3 29 37 CTCTGTTTT YES

AT1G19050.1-1 3 55 63 CTCTGTTTT

AT1G19050.1-1 3 86 94 CTCTGTTTT

AT1G26770.1-3 3 37 29 AAAACAGAG YES

AT1G26770.1-3 3 59 67 CTCTGTTTT

AT1G26770.1-3 3 82 74 AAAACAGAG

AT1G64355.1-1 3 8 16 CTCTGTTTT NO

AT1G64355.1-1 3 47 55 CTCTGTTTT

AT1G64355.1-1 3 79 87 CTCTGTTTT

AT1G65920.1-1 3 34 42 CTCTGTTTT NO

AT1G65920.1-1 3 61 69 CTCTGTTTT

AT1G65920.1-1 3 71 79 CTCTGTTTT

AT2G13840.1-1 3 198 206 CTCTGTTTT NO

AT2G13840.1-1 3 265 273 CTCTGTTTT

AT2G13840.1-1 3 333 341 CTCTGTTTT

AT2G33230.1-1 3 48 40 AAAACAGAG NO

AT2G33230.1-1 3 86 78 AAAACAGAG

AT2G33230.1-1 3 102 110 CTCTGTTTT

AT2G39210.1-1 3 67 59 AAAACAGAG NO

AT2G39210.1-1 3 78 70 AAAACAGAG

AT2G39210.1-1 3 88 80 AAAACAGAG

AT2G40320.1-2 3 19 27 CTCTGTTTT NO

AT2G40320.1-2 3 45 53 CTCTGTTTT

AT2G40320.1-2 3 71 79 CTCTGTTTT

AT3G03780.1-1 3 60 68 CTCTGTTTT NO

AT3G03780.1-1 3 91 99 CTCTGTTTT

AT3G03780.1-1 3 140 148 CTCTGTTTT

AT4G03210.1-1 3 39 47 CTCTGTTTT YES

AT4G03210.1-1 3 57 65 CTCTGTTTT

AT4G03210.1-1 3 71 79 CTCTGTTTT

AT4G09460.1-1 3 46 38 AAAACAGAG YES

AT4G09460.1-1 3 65 57 AAAACAGAG

AT4G09460.1-1 3 242 250 CTCTGTTTT

AT4G12080.1-1 3 19 27 CTCTGTTTT YES

AT4G12080.1-1 3 31 39 CTCTGTTTT

AT4G12080.1-1 3 190 198 CTCTGTTTT

AT4G19230.1-2 3 9 17 CTCTGTTTT YES

AT4G19230.1-2 3 76 68 AAAACAGAG

AT4G19230.1-2 3 92 100 CTCTGTTTT

AT4G22880.1-1 3 39 47 CTCTGTTTT NO

AT4G22880.1-1 3 79 87 CTCTGTTTT

AT4G22880.1-1 3 91 99 CTCTGTTTT

AT4G34990.1-1 3 20 28 CTCTGTTTT NO

AT4G34990.1-1 3 57 65 CTCTGTTTT

AT4G34990.1-1 3 71 79 CTCTGTTTT

AT4G35300.4-1 3 295 303 CTCTGTTTT NO

AT4G35300.4-1 3 392 400 CTCTGTTTT

AT4G35300.4-1 3 487 495 CTCTGTTTT

AT5G09220.1-3 3 11 19 CTCTGTTTT YES

AT5G09220.1-3 3 25 33 CTCTGTTTT

AT5G09220.1-3 3 66 74 CTCTGTTTT

AT5G38970.1-2 3 28 36 CTCTGTTTT NO

AT5G38970.1-2 3 52 60 CTCTGTTTT

AT5G38970.1-2 3 156 164 CTCTGTTTT

AT5G39850.1-1 3 75 83 CTCTGTTTT YES

AT5G39850.1-1 3 210 218 CTCTGTTTT

AT5G39850.1-1 3 250 258 CTCTGTTTT

AT5G51670.1-1 3 41 33 AAAACAGAG NO

AT5G51670.1-1 3 52 44 AAAACAGAG

AT5G51670.1-1 3 65 57 AAAACAGAG

AT5G53660.1-2 3 32 40 CTCTGTTTT YES

AT5G53660.1-2 3 59 67 CTCTGTTTT

AT5G53660.1-2 3 72 80 CTCTGTTTT

AT5G57350.1-5 3 14 22 CTCTGTTTT YES

AT5G57350.1-5 3 26 34 CTCTGTTTT

AT5G57350.1-5 3 51 59 CTCTGTTTT
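
The two percentages quoted in the Table S4.3 caption can be reproduced from the counts given in the captions of Tables S4.1 to S4.3. The sketch below assumes that the 45.9% figure uses the combined number of genes and intergenic regions containing the repeat (83 + 15) as its denominator; that denominator is an inference from the numbers and is not stated in the text.

```python
# Counts taken from the captions of Tables S4.1, S4.2 and S4.3.
genes_with_repeat = 83
intergenic_regions_with_repeat = 15
introns_with_repeat = 45
sugar_responsive_introns = 21  # introns marked YES in Table S4.3

# Assumed denominator for the intron share: genes plus intergenic regions.
total_regions = genes_with_repeat + intergenic_regions_with_repeat

intron_share = 100 * introns_with_repeat / total_regions
sugar_responsive_share = 100 * sugar_responsive_introns / introns_with_repeat

print(f"Share within introns: {intron_share:.1f}%")
print(f"Share of introns in sugar-responsive genes: {sugar_responsive_share:.1f}%")
```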

Chapter 5

General Conclusions and Future Directions

5 General Conclusions and Future Directions

5.1 General Conclusions

This thesis investigated the upstream and downstream regulation of the Arabidopsis

thaliana R2R3-MYB transcription factor, AtMYB61. It addressed three major aims. The

first aim related to the identification of direct downstream targets of AtMYB61. The

second aim related to the determination of DNA targets preferentially bound by

AtMYB61. The third aim dealt with the examination of upstream regulatory mechanisms

that impact the transcription of AtMYB61. The scientific objectives and the major

findings that arose by addressing these aims are as follows:

(1) To determine the direct downstream targets of AtMYB61

Three putative downstream target genes of AtMYB61 were identified. Putative

AtMYB61 targets were predicted on the basis of comparative transcriptome analysis.

This transcriptome analysis entailed identification and comparison of genes whose

transcript abundance was modulated by differences in AtMYB61 activity, relative to

those genes whose transcript abundance profiles paralleled AtMYB61 across

development and in different organs.
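
The two-stage comparison described above can be pictured as a filter: keep genes that respond to altered AtMYB61 activity, then keep those whose profiles track AtMYB61 across samples. The sketch below is one hedged way to express that idea. The use of Pearson correlation, the 0.8 threshold, the gene labels geneA to geneC and all expression values are assumptions made for illustration and do not come from the thesis datasets.

```python
from math import sqrt
from typing import Dict, List, Set


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def candidate_targets(
    responsive_genes: Set[str],
    profiles: Dict[str, List[float]],
    reference: str,
    min_r: float = 0.8,  # hypothetical threshold
) -> List[str]:
    """Genes that are both activity-responsive and co-expressed with the reference."""
    ref_profile = profiles[reference]
    return sorted(
        gene
        for gene in responsive_genes
        if gene != reference
        and gene in profiles
        and pearson(profiles[gene], ref_profile) >= min_r
    )


# Illustrative values only; these are not measurements.
toy_profiles = {
    "AtMYB61": [1.0, 2.0, 4.0, 8.0],
    "geneA": [2.0, 4.0, 8.0, 16.0],  # tracks AtMYB61
    "geneB": [8.0, 4.0, 2.0, 1.0],   # opposes AtMYB61
    "geneC": [1.1, 2.1, 3.9, 8.2],   # tracks AtMYB61 but is not activity-responsive
}
toy_responsive = {"geneA", "geneB"}

print(candidate_targets(toy_responsive, toy_profiles, "AtMYB61"))
```

Only a gene that passes both stages survives, which is why such a comparison can narrow a transcriptome down to a handful of candidates.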

The three putative AtMYB61 targets identified through this comparison are predicted to

encode the following proteins: a KNOTTED1-like transcription factor (KNAT7,

At1g62990); a caffeoyl-CoA 3-O-methyltransferase (CCoAOMT7, At4g26220), and a

pectin-methylesterase (PME, At2g45220). Statistically over-represented motifs were

identified in the 5′ non-coding regions of the three putative target genes. These motifs

corresponded to previously-characterised AC-element motifs that function as R2R3-

MYB targets in other systems (Grotewold et al., 1994; Sablowski et al., 1994; Sablowski

et al., 1995; Moyano et al., 1996; Sainz et al., 1997; Uimari and Strommer, 1997;

Tamagnone et al., 1998; Jin et al., 2000; Sugimoto et al., 2000; Yang et al., 2001;

Patzlaff et al., 2003a; Patzlaff et al., 2003b; Fukuzawa et al., 2006).

The consensus motif identified in the gene regulatory regions of the three putative

AtMYB61 target genes functions as a bona fide target for AtMYB61 binding, as

determined by EMSA using purified recombinant AtMYB61 protein. Moreover, the 5′

non-coding regulatory regions of each of the putative target genes could also be bound

by AtMYB61, as determined by EMSA. AtMYB61 expression in yeast was sufficient to

drive transcription of a synthetic reporter gene comprising a tandem AC-element fused

to a yeast minimal promoter, upstream of the reporter gene lacZ. Together, these

findings support the hypothesis that AtMYB61 binds to, and regulates, the expression of

a small subset of genes, which in turn shape multiple facets of plant growth and

metabolism.

(2) To identify and characterise the DNA binding motifs to which AtMYB61

preferentially binds

The DNA binding sites to which a gene regulatory protein binds can be affinity purified

using the CASTing system. This system was used to identify DNA recognition sites to

which recombinant AtMYB61 protein preferentially binds in vitro. The binding kinetics of

AtMYB61 to the CASTing-selected DNA target sequences were determined using a

nitrocellulose filter-binding assay. These experiments confirmed that a core ACC

nucleotide motif was essential for binding by AtMYB61. The nature of the interactions

between amino acids in the AtMYB61 DNA-binding site and nucleotides in the

preferential DNA targets was explored using molecular modelling in silico. These

models predict key interactions that likely shape the affinity of protein binding to the cognate

DNA sequence. Notably, while recombinant AtMYB61 was sufficient to drive gene

expression from CASTing-identified target DNA sequences in yeast, it did so in a

manner that was not entirely consistent with predicted affinities. Together, these

findings illustrate the binding specificity of an R2R3-MYB protein, and underscore the

fact that such specificity may play out in a complex manner in a biological system.
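
Filter-binding data of the kind mentioned above are commonly summarised with a single-site equilibrium model, in which the fraction of DNA bound depends on the protein concentration and the dissociation constant (Kd). The sketch below shows that generic model only; the model actually fitted in this thesis is not stated in this section, and the concentrations used are hypothetical.

```python
def fraction_bound(protein_conc: float, kd: float) -> float:
    """Fraction of DNA bound at equilibrium for simple single-site binding.

    Assumes protein is in large excess over DNA, so free protein is
    approximately equal to total protein. Both arguments share the same units.
    """
    if protein_conc < 0 or kd <= 0:
        raise ValueError("protein_conc must be >= 0 and kd must be > 0")
    return protein_conc / (kd + protein_conc)


# Hypothetical Kd and concentrations (arbitrary units), for illustration only.
example_kd = 10.0
for conc in (1.0, 10.0, 100.0):
    print(conc, round(fraction_bound(conc, example_kd), 3))
```

Under this model, half of the DNA is bound when the protein concentration equals Kd, so a lower Kd corresponds to a higher affinity for a given target sequence.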

(3) To determine the molecular components that function upstream to modulate

AtMYB61 expression

AtMYB61 was regulated by photosynthate in a sugar-signalling pathway that appears to

act independently of the hexokinase sugar-signalling pathway. Analysis of AtMYB61

promoter-reporter fusion constructs with or without AtMYB61 5′ intragenic sequences

suggested that AtMYB61 expression is de-repressed by sucrose in a mechanism

involving intragenic sequences. An over-represented conserved motif was identified

within the second intron of Brassicaceae AtMYB61 homologues. The second intron

repeats of AtMYB61 could function as binding targets for a putative sugar-mediated

repressor, as determined by EMSA. Putative repressor proteins that bound this motif in

the absence of sucrose were identified by affinity purification coupled with mass

spectrometry, and characterised using a combination of loss-of-function genetics and

transcriptome analysis. Together, these findings support the hypothesis that a novel

protein activity binds a conserved repeat motif within the AtMYB61 second intron to

regulate sugar-mediated gene expression of AtMYB61.

5.2 Future Directions

Molecular Characterisations of Plant Transcription Factors

Despite the vast knowledge of plant transcription factor function at the gross

morphological level, relatively little is known about the mechanistic basis for transcription factor

activity. In addition to shedding light on a particular transcription factor, AtMYB61, this

thesis has established a pipeline for characterising the functions of any

transcription factor at the molecular level. This pipeline is valuable because it gives

insight into the mechanisms that drive phenotypes. The identification of more

DNA-binding sites of regulatory proteins should lead to more accurate in silico motif

prediction programs for novel DNA-binding proteins. These insights are not only

important from a basic science perspective, but can also be fruitful in terms of

developing schemes for the modification of important transcription factors, like

AtMYB61, for specific end purposes, such as the directed modification of plant

architecture or metabolic engineering.

ChIP-Seq

In addition to the in vitro and in silico characterisation of AtMYB61 and its target

sequences demonstrated in this thesis, it is critical that an in vivo characterisation be

conducted to further determine how AtMYB61 influences phenotype-affecting

mechanisms. Recently, chromatin immunoprecipitation (ChIP) followed by

high-throughput signature sequencing (ChIP-seq) has proven to be a powerful

means by which to identify in vivo DNA-binding sites of sequence-specific transcription

factors (Massie and Mills, 2008). ChIP-seq could be used to identify AtMYB61 in vivo

DNA targets in the Arabidopsis thaliana genome. To this end, a viable antibody

has been generated against the variable region of AtMYB61 (refer to Chapter 3 of this

thesis). DNA sequences can be pulled down, sequenced, and analysed to determine

their location in the Arabidopsis thaliana genome. Targets can be validated by

analysing their transcript abundance in atmyb61 loss-of-function and AtMYB61

overexpressor mutants. The in vivo direct downstream targets of AtMYB61 can then be

compared to the in vitro and in silico targets determined in this thesis to assess the

accuracy of these methods.

Characterisations of Putative AtMYB61 Repressors

The identification in this thesis of putative repressors that bound the second intron of

AtMYB61 revealed molecular components that may function

upstream to modulate AtMYB61 expression; however, the biochemical characterisation

of these repressor proteins remains to be carried out. To determine if the putative AtMYB61 repressor proteins

can repress gene activity, these proteins should be expressed in Arabidopsis thaliana

protoplasts to observe if they can repress a synthetic reporter gene comprising tandem

AtMYB61 second intron repeats fused to a Cauliflower Mosaic Virus 35S promoter,

upstream of the GUS reporter gene uidA. In addition to the biochemical

characterisation of these putative repressor proteins, the in vivo direct downstream

targets of these proteins should be identified. ChIP-seq should be conducted on these

repressor proteins to identify in vivo targets. To determine the expression of AtMYB61

repressors in tissues throughout development, promoter-reporter fusion constructs

should be transformed into plants and analysed.

Binding of the putative AtMYB61 repressors to the AtMYB61 second intron repeat also

remains to be validated. The cDNA of putative AtMYB61 repressors should be

cloned into the pET-15b protein expression vectors and expressed. To determine if the

proteins bind to AtMYB61 second intron repeat in vitro, an electrophoretic mobility shift

assay (EMSA) is to be conducted with recombinant putative AtMYB61 repressor

proteins and labelled AtMYB61 second intron repeat. In addition to this, to determine if

the putative AtMYB61 repressor proteins bind to AtMYB61 second intron repeat in vivo,

an EMSA is to be conducted with labelled AtMYB61 second intron repeat and nuclear

proteins purified from putative AtMYB61 repressor loss-of-function (rmx) mutants. It is

hypothesised that in the rmx loss-of-function background, the binding would be reduced

in the EMSA compared to the same assay conducted with wild-type nuclear proteins.
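The hypothesised reduction in binding could be expressed quantitatively as a fraction of probe shifted. The sketch below assumes that band intensities for the shifted and free probe have been measured by densitometry; the function names and the numbers in the usage note are illustrative assumptions, not data or methods from this thesis.

```python
def fraction_bound(shifted, free):
    """Fraction of labelled probe found in the shifted (protein-bound) band."""
    total = shifted + free
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return shifted / total


def relative_binding(mutant, wild_type):
    """Ratio of fraction bound in the mutant lane to that in the wild-type lane.

    Each argument is a (shifted, free) tuple of band intensities.
    A value below 1 indicates reduced binding in the mutant extract.
    """
    return fraction_bound(*mutant) / fraction_bound(*wild_type)
```

With hypothetical intensities of (10, 90) for an rmx nuclear extract and (40, 60) for wild type, `relative_binding((10, 90), (40, 60))` gives 0.25, that is, a four-fold reduction in the fraction of probe bound.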

Despite the size and importance of the plant R2R3-MYB family of transcription factors,

little is known about the molecular functioning of individual family members. AtMYB61,

a member of the R2R3-MYB family in Arabidopsis thaliana, regulates pleiotropic

modifications of carbon acquisition and allocation throughout the plant body. As is the

case for most R2R3-MYB transcription factors, the precise mechanisms that enable

AtMYB61 to bring about important changes in plant function were unknown before the

onset of this thesis. The work described in this thesis casts light on the downstream

and upstream mechanisms of AtMYB61. The findings presented in this thesis point to

additional complexities in the regulation of plant gene expression, and argue for the

need for greater exploration of the molecular intricacies involved in how a given plant

transcription factor elicits a phenotype.

Appendices

The wound-, pathogen-, and ultraviolet B-responsive MYB134 gene encodes an R2R3 MYB transcription factor that

regulates a suite of genes involved in proanthocyanidin synthesis in Poplar

This chapter is an extract of material originally contained in the following publication:

Mellway, R.D., Tran L.T., Prouse, M.B., Campbell, M.M., and Constabel, C.P. (2009)

The Wound-, Pathogen-, and Ultraviolet B-Responsive MYB134 Gene Encodes an

R2R3 MYB Transcription Factor That Regulates Proanthocyanidin Synthesis in Poplar.

Plant Physiology. 150: 924-941.

Contributions: MBP, RDM, MMC, CPC designed research; MBP, RDM, LTT

performed research; RDM, LTT, MBP, MMC, CPC analysed data; MBP, RDM, MMC,

CPC wrote manuscript with editorial assistance from MBP, RDM, LTT, MMC, CPC.

MBP contributed specifically to each figure and table in this chapter.

Copyright: The material in this chapter is copyrighted by The American Society of

Plant Biologists and is cited as:

A The Wound-, Pathogen-, and Ultraviolet B-Responsive MYB134 Gene Encodes an R2R3 MYB Transcription Factor that Regulates a Suite of Genes Involved in Proanthocyanidin Synthesis in Poplar

A.1 Abstract

In poplar (Populus spp.), the major defense phenolics produced in leaves are flavonoid-

derived proanthocyanidins (PAs). Transcriptional activation of PA biosynthetic genes

leading to PA accumulation in leaves occurs following herbivore damage and

mechanical wounding. A poplar R2R3-MYB transcription factor gene, MYB134, exhibits

close sequence similarity to the Arabidopsis thaliana PA regulator TRANSPARENT

TESTA2 and is coinduced with PA biosynthetic genes following mechanical wounding

and exposure to elevated ultraviolet B light. Overexpression of MYB134 in poplar

results in transcriptional activation of the full PA biosynthetic pathway and a significant

plant-wide increase in PA levels. Here, we demonstrate through electrophoretic mobility

shift assays (EMSA) that recombinant MYB134 protein is able to bind to promoter

regions of early and late PA pathway genes: PHENYLALANINE AMMONIA-LYASE1

(PAL1), DIHYDROFLAVONOL REDUCTASE1 (DFR1) and ANTHOCYANIDIN

REDUCTASE2 (ANR2). Sequences enriched with adenosine and cytosine nucleotides,

termed AC elements, were over-represented in the 5′ non-coding regions of putative

target genes. The consensus motif functions as a bona fide target for MYB134 as

determined by EMSA. Our data provide insight into the regulatory mechanisms

controlling PA metabolism in poplar, and the identification of a regulator of stress-

responsive PA biosynthesis constitutes a valuable tool for manipulating PA metabolism

in poplar and investigating the biological functions of PAs in resistance to biotic and

abiotic stresses.

A.2 Introduction

Plant secondary metabolites play important ecological roles and in many plants

constitute a critical component of defenses against biotic and abiotic stress. Many

secondary metabolic pathways are responsive to environmental conditions and can be

rapidly activated by stresses such as pathogen infection, elevated light, and herbivory.

The phenylpropanoid pathway in particular leads to the synthesis of a large and diverse

class of plant secondary metabolites, many of which are stress induced (Dixon and

Paiva, 1995). Synthesis of phenylpropanoids and other secondary metabolites

following stress is typically mediated by the transcriptional activation of suites of

biosynthetic genes coordinately regulated by transcription factor proteins (Weisshaar

and Jenkins, 1998; Davies and Schwinn, 2003). The possibility of identifying

transcription factors that control entire pathways is motivating many studies in plant

stress biology, since such regulators would be valuable for the metabolic engineering of

plants for both plant and human health (Dixon, 2005; Sharma and Dixon, 2005; Yu and

McGonigle, 2005).

Populus species (cottonwoods, poplars, and aspens, hereafter referred to collectively as

poplar) are often ecological foundation species and include the most widely distributed

trees in the Northern Hemisphere. The phenolic metabolites produced by poplar are

thought to be important determinants of community structure and ecosystem dynamics

(Lindroth and Hwang, 1996; Schweitzer et al., 2004; Bailey et al., 2005; LeRoy et al.,

2006; Whitham et al., 2006). Poplar leaves typically accumulate several classes of

phenolic metabolites, including the salicylate-derived phenolic glycosides (PGs),

flavonoids such as flavonol glycosides, anthocyanins, and proanthocyanidins (PAs; or

condensed tannins), and numerous small phenolic acids and their esters (Pearl and

Darling, 1971; Klimczak et al., 1972; Palo, 1984; Lindroth and Hwang, 1996). PGs and

PAs are generally the most abundant foliar phenolic metabolites in poplar and together

can constitute more than 30% of leaf dry weight (Pearl and Darling, 1971; Klimczak et

al., 1972; Palo, 1984; Lindroth and Hwang, 1996). PAs are also constitutively produced

in poplar leaves, but their biosynthesis is often up-regulated by stresses such as insect

herbivory, mechanical wounding, and pathogen infection (Peters and Constabel, 2002;

Stevens and Lindroth, 2005; Miranda et al., 2007). PA accumulation following

wounding and herbivory occurs both locally at the site of damage and systemically in

distal leaves (Peters and Constabel, 2002). The strong systemic activation of the PA

biosynthetic pathway in poplar following insect herbivory suggests that these

compounds function in herbivore defense. However, experimental evidence indicates

that poplar leaf PAs may not be strong, broad-spectrum antiherbivore compounds

(Hemming and Lindroth, 1995; Ayres et al., 1997). In addition to biotic stresses, nutrient

limitation and high light levels have also been found to result in greater PA

concentrations in poplar (Hemming and Lindroth, 1999; Osier and Lindroth, 2001),

hinting at broader biological roles.

Transcriptional regulation of flavonoid and PA biosynthetic genes involves combinatorial

interactions between several classes of transcription factor proteins (Mol et al., 1998;

Nesi et al., 2001; Winkel-Shirley, 2001). These include members of the R2R3-MYB

domain, basic helix-loop-helix (bHLH) domain, and WD-repeat (WDR) families (Lepiniec

et al., 2006). In Arabidopsis thaliana seed testa, PA biosynthesis is regulated by a

MYB-bHLH-WDR ternary complex composed of the TT2, TT8, and TTG1 proteins (Nesi

et al., 2000; Nesi et al., 2001). The MYB factor (TT2) confers target gene specificity to

the complex, activating the late PA biosynthetic genes, including DFR, BAN, TT12, and

AHA10 (Debeaujon et al., 2003; Baudry et al., 2004; Sharma and

Dixon, 2005). The DNA sequences bound by TT2 have not been elucidated, although

the closely related maize (Zea mays) COLORLESS1 (C1) protein, a regulator of

anthocyanin metabolism, has been shown to bind to both AC-rich motifs known as AC

elements and the animal c-MYB consensus sequence (CNGTTR) present in the

regulatory regions of numerous phenylpropanoid genes (Howe and Watson, 1991;

Weston, 1992; Sainz et al., 1997; Hernandez et al., 2004).
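Consensus sequences such as CNGTTR are written in IUPAC nucleotide codes, where N stands for any base and R for a purine (A or G). As a minimal sketch of how such a consensus can be matched against a DNA sequence, the code below expands each IUPAC letter into a regular-expression character class. The function names and the test sequence in the usage note are assumptions made for illustration.

```python
import re

# Standard IUPAC nucleotide codes expressed as regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus (e.g. 'CNGTTR') into a regex string."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_sites(consensus, sequence):
    """Return (0-based start, matched site) pairs on the given strand.

    A lookahead is used so that overlapping sites are all reported.
    """
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]
```

For the made-up sequence `TTCAGTTGAAC`, `find_sites("CNGTTR", ...)` reports the site `CAGTTG` at position 2. The same hexamer also satisfies the bHLH consensus CANNTG, which illustrates that the two consensus sequences are not mutually exclusive.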

The R2R3-MYBs constitute large gene families in plants, with 126 members in

Arabidopsis thaliana (Stracke et al., 2001) and 192 in poplar (Wilkins et al., 2009).

Although many remain functionally uncharacterised, numerous R2R3-MYB proteins are

implicated in the regulation of plant-specific developmental and physiological processes,

including the regulation of phenylpropanoid metabolism (Stracke et al., 2001). R2R3-

MYB proteins are characterised by two imperfectly repeated N-terminal MYB domains

each forming DNA-binding helix-helix-turn-helix structures. Outside of the R2R3 MYB

domain, the proteins are highly divergent except for short conserved amino acid

sequence motifs. These motifs, together with sequence homology within the MYB

domains, form the basis for their classification into different subgroups (Stracke et al.,

2001; Jiang et al., 2004b).

We previously showed that the stress induction of PAs in poplar leaves follows the

transcriptional activation of PA biosynthetic genes (Peters and Constabel, 2002;

Miranda et al., 2007) and therefore hypothesised that a TT2-like R2R3 MYB protein

regulates this process. MYB134 was also previously identified as a candidate PA

regulator that is consistently coregulated with PA biosynthetic genes (Mellway et al.,

2009). Constitutive expression of MYB134 in transgenic poplar resulted in a specific

activation of PA pathway genes, leading to a dramatic increase in PA concentrations,

suggesting that this gene is indeed a poplar PA regulator. Here, we show that

recombinant MYB134 protein binds to promoter regions of both early and late PA

pathway genes containing predicted MYB binding sites. These findings provide insight

into the regulatory mechanisms mediating stress-induced PA biosynthesis, and the PA-

modified poplar trees produced here represent a valuable tool for investigating the

functions of carbon-based allelochemicals in poplar.

A.3 Materials and Methods

A.3.1 EMSA

Recombinant MYB134 protein was produced in Escherichia coli using the coding

sequence cloned in-frame into the NdeI and BamHI sites of the pET15b vector

(Novagen). Recombinant MYB134 protein was produced, extracted, and affinity purified

as described previously for pine (Pinus spp.) MYB proteins (Patzlaff et al., 2003b).

EMSA conditions were exactly as described previously (Patzlaff et al., 2003b; Gomez-

Maldonado et al., 2004) except that recombinant MYB134 protein was used in place of

pine MYB protein.

A.4 Results and Discussion

A.4.1 MYB134 Binds to Promoter Regions of PA Biosynthetic Genes

PHENYLALANINE AMMONIA-LYASE1 (PAL1), DIHYDROFLAVONOL REDUCTASE1

(DFR1) and ANTHOCYANIDIN REDUCTASE2 (ANR2) were all upregulated by

constitutive expression of MYB134, suggesting that they were all direct targets (Mellway

et al., 2009). These enzymes act in the PA biosynthetic pathway: PAL1 catalyses the

conversion of phenylalanine to cinnamic acid; DFR1 catalyses the reduction of dihydroflavonols to

leucoanthocyanidins; and ANR2 catalyses the conversion of anthocyanidins into epicatechins (Xie and

Dixon, 2005). These target genes represent general phenylpropanoid/early PA

metabolism (PAL1), late flavonoid metabolism (DFR1), and the PA-specific branch of

flavonoid metabolism (ANR2). Candidate MYB134-binding sites in the regulatory

regions of these genes were identified by visual examination of the upstream genomic

sequence and comparison with characterised phenylpropanoid promoters as well as

with a search of the PLACE (plant cis-element database)

(http://www.dna.affrc.go.jp/PLACE/signalscan.html) using SIGNAL SCAN (Prestridge,

1991; Higo et al., 1998). The promoter regions of the target genes were found to

contain motifs similar to the adenosine- and cytosine-rich AC elements found in the

regulatory regions of biosynthetic genes of different branches of phenylpropanoid

metabolism, including both flavonoid and lignin biosynthesis (Fig. A.1a)(Hatton et al.,

1995; Rogers and Campbell, 2004; Hartmann et al., 2005). AC elements are bound by

the maize C1 protein, the most closely related MYB protein to MYB134 for which DNA-

binding sites have been defined, as well as several MYB proteins involved in the

regulation of lignin metabolism (Hatton et al., 1995; Patzlaff et al., 2003b; Rogers and

Campbell, 2004). The 180-bp ANR2 promoter region analysed also contains a motif

matching the CNGTTR consensus sequence bound by the vertebrate c-MYB (Fig.

A.1a)(Howe and Watson, 1991; Weston, 1992). Inspection of these representative

promoter sequences also revealed the presence of bHLH protein consensus-binding

sites (CANNTG) in close proximity to the putative MYB-binding sites (Fig. A.1a). The

upstream region of poplar PAL1 contains two overlapping AC element sequences

identical to the high-affinity P-binding site (ACCTACCAACC) identified in the maize A1

Figure A.1. MYB134 binds to the promoters of putative downstream target genes. (a) Schematic representation of 1,000 bp of 5′ noncoding sequences for three putative MYB134 downstream target genes. + and − indicate the orientations of AC element-like motifs relative to the sense coding strand; numbers indicate the positions of these motifs relative to the putative transcriptional start. Arrows above each line indicate bHLH consensus sites (CANNTG), while arrows below each line indicate c-MYB consensus sites (CNGTTR). Light gray horizontal lines under the sequences correspond to the location of the DNA sequence used as the binding target in the EMSA shown in (b). (b) MYB134 binding to 5′ noncoding sequences of the three putative target genes as determined by EMSA. Recombinant MYB134 bound to all three 5′ noncoding sequences, as determined by a gel shift of the probe (arrows), which could be outcompeted with increasing quantities of unlabeled DNA corresponding to a canonical R2R3 MYB-binding site, known as an AC element motif (AC; 5′-ATTGTTCTTCCTGGGGTGACCGTCCACCTACGCTAAAAGCCGTCGCGGGATAAGCCTGTCTG-3′). (c) MYB134 binding to the AC-rich canonical R2R3 MYB-binding site motif as determined by EMSA. Binding of recombinant MYB134 to radiolabeled AC can be outcompeted by cold competitor AC (left) but not by the nonspecific competitor poly(dIdC).

(encoding dihydroflavonol reductase) promoter sequence that is bound by maize C1

and the maize P protein, an R2R3-MYB protein that regulates the biosynthesis of 3-deoxy

flavonoids and phlobaphenes (Fig. A.1a)(Sainz et al., 1997). Within the 180-bp regions

analysed, poplar DFR1 and ANR2 both contain motifs that are quite similar to the AC

elements defined by Hatton et al. (1995) in the tobacco PAL2 promoter (GCCTACC

and ACCTACA, respectively)(Fig. A.1a). EMSA experiments showed that the

recombinant MYB134 protein specifically bound the 180-bp upstream regulatory

sequences (Fig. A.1b). Two shifted bands were observed for the PAL1 and ANR2 180-

bp probes, while only one was seen with the DFR1 probe (Fig. A.1b). It is possible that

the MYB134 protein binds both of the overlapping AC elements in the PAL1 promoter

and both the AC element-like sequence and the c-MYB-binding site in the ANR2

promoter. A sequence containing a canonical AC element was an effective competitor

and eliminated MYB134 binding (Fig. A.1b), and recombinant MYB134 also bound to

this element in a specific manner (Fig. A.1c). Thus, MYB134 appears to bind to the

gene regulatory regions of putative target genes in an AC motif-dependent fashion. Our

work indicates that high sequence similarity to TT2 can be used to link MYB gene

function to PA pathway regulation.
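The kind of promoter scan summarised in Fig. A.1a, where AC element-like motifs are located on either strand, can be sketched as follows. The pattern ACC[AT]A[AC]C is an assumption used here for illustration: it covers strict AC elements such as ACCTACC and ACCAACC, and would miss the merely AC element-like motifs GCCTACC and ACCTACA noted above, for which a relaxed pattern would be needed. Function names are hypothetical.

```python
import re

# Assumed strict AC element pattern; the lookahead reports overlapping matches
AC_ELEMENT = re.compile(r"(?=(ACC[AT]A[AC]C))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_ac_elements(seq):
    """Return sorted (start, strand, motif) hits on both strands.

    Start positions are 0-based and always refer to the input (+) strand;
    the motif for '-' hits is given as read on the reverse strand.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group(1)) for m in AC_ELEMENT.finditer(seq)]
    for m in AC_ELEMENT.finditer(reverse_complement(seq)):
        motif = m.group(1)
        # Convert the reverse-strand coordinate back to the forward strand
        hits.append((n - (m.start() + len(motif)), "-", motif))
    return sorted(hits)
```

Applied to the P-binding site ACCTACCAACC quoted in the text, the scan reports two overlapping forward-strand hits, ACCTACC at position 0 and ACCAACC at position 4, consistent with the description of two overlapping AC element sequences in the PAL1 upstream region.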

In silico analysis has shown that the promoter regions of the poplar flavonoid and PA

biosynthetic genes contain cis elements matching the consensus sequences recognised

by phenylpropanoid regulatory R2R3-MYB proteins (Tsai et al., 2006). MYB134 was

shown to bind to promoter fragments containing motifs similar to the AC elements found

in a wide variety of phenylpropanoid biosynthetic gene promoters (Fig. A.1b). MYB134

was also shown to bind to a DNA sequence containing a canonical AC element

(ACCTAC; Fig. A.1c). These results suggest that such motifs are bound by MYB134 in

vivo, although these results do not rule out the involvement of other putative MYB

binding sites, such as the animal c-MYB recognition site found in the ANR2 promoter.

AC element-like motifs are present within the 2-kb 5′ noncoding sequence of most

poplar flavonoid genes (Tsai et al., 2006). Given that AC elements are widely

distributed in the regulatory regions not just of PA biosynthetic genes but of genes

involved in other branches of flavonoid and phenylpropanoid metabolism, interactions

with cofactors such as bHLH domain proteins that require the presence of additional

binding sites likely contribute to the specific activation of different branch pathways

(Hartmann et al., 2005). Consistent with specific bHLH cofactor binding sites

contributing to MYB134 target gene specificity, putative bHLH-binding sites are present

in all poplar PA pathway genes (R.D. Mellway and C.P. Constabel, unpublished data).

In activating the full suite of early and late flavonoid as well as PA biosynthetic genes,

MYB134 differs from Arabidopsis thaliana TT2, which regulates a more limited set of

late PA structural genes (Nesi et al., 2001; Sharma and Dixon, 2005). A wider target

gene set for MYB134, in conjunction with the natural constitutive PA production in a

wider range of poplar tissues, may account for the different effects of TT2

overexpression in Arabidopsis thaliana compared with MYB134 overexpression in

poplar. Unlike poplar, Arabidopsis thaliana produces PAs only in the seed testa, and

ectopic expression of TT2 does not result in plant-wide PA accumulation (Nesi et al.,

2001). A more detailed elucidation of how the pathway is regulated will require

functional characterisation of the members of both MYB gene families as well as

identification and analysis of the additional interacting proteins such as the bHLH and

WDR proteins.

A.5 Conclusion

The extensive genomics resources combined with the complexity and biological

importance of phenylpropanoid metabolism in poplar make it a useful system for

investigating this pathway. In this report, we describe work identifying a gene encoding

an R2R3-MYB transcription factor, PtMYB134, which appears to play an important role

in controlling PA biosynthesis. PtMYB134 was shown to bind to the 5′ non-coding

regulatory regions of both early and late PA biosynthetic genes: PAL1, DFR1 and

ANR2. AC elements were identified within the target promoter regions, and this

consensus motif functions as a bona fide target for MYB134 binding as determined by

an electrophoretic mobility shift assay. Identifying transcriptional regulators of

biosynthetic pathway genes is an important goal for metabolic engineering of secondary

metabolism in plants, and the identification of a putative regulator of PA metabolism in

poplar may permit new experimental approaches for evaluating the biological functions

of PAs.

A.6 Acknowledgements

This work was generously supported by a Natural Sciences and Engineering Research

Council of Canada (NSERC) Canada Graduate Scholarship (CGSD) awarded to MP,

and by funding from the University of Toronto and NSERC to MMC.

B Study Labels

Study Label    SALK Line        Mutant Label

A              SALK_112391C
C              SALK_134409C
E              SALK_143514C
G              SALK_050658      rmx1
I              SALK_082219C     rmx2
J              SALK_046986      rmx3
K              SALK_129409C
L              SALK_047892C
M              SALK_150941C
N              SALK_147133
O              SALK_109533C     rmx4
P              SALK_047550C     rmx5
Q              SALK_125978
R              SALK_009225C
S              SALK_032344C     rmx6
T              SALK_031449C
U              SALK_036546C
V              SALK_100844C

References

Ades, S.E., and Sauer, R.T. (1994). Differential DNA-binding specificity of the engrailed homeodomain: the role of residue 50. Biochemistry 33, 9187-9194.

Affolter, M., Percival-Smith, A., Muller, M., Leupin, W., and Gehring, W.J. (1990). DNA binding properties of the purified Antennapedia homeodomain. Proc. Natl. Acad. Sci. U. S. A. 87, 4093-4097.

Alabadi, D., Oyama, T., Yanovsky, M.J., Harmon, F.G., Mas, P., and Kay, S.A. (2001). Reciprocal regulation between TOC1 and LHY/CCA1 within the Arabidopsis circadian clock. Science 293, 880-883.

Alonso, J.M., Stepanova, A.N., Leisse, T.J., Kim, C.J., Chen, H.M., Shinn, P., Stevenson, D.K., Zimmerman, J., Barajas, P., Cheuk, R., Gadrinab, C., Heller, C., Jeske, A., Koesema, E., Meyers, C.C., Parker, H., Prednis, L., Ansari, Y., Choy, N., Deen, H., Geralt, M., Hazari, N., Hom, E., Karnes, M., Mulholland, C., Ndubaku, R., Schmidt, I., Guzman, P., Aguilar-Henonin, L., Schmid, M., Weigel, D., Carter, D.E., Marchand, T., Risseeuw, E., Brogden, D., Zeko, A., Crosby, W.L., Berry, C.C., and Ecker, J.R. (2003). Genome-wide Insertional mutagenesis of Arabidopsis thaliana. Science 301, 653-657.

Anton, I.A., and Frampton, J. (1988). Tryptophans in myb proteins. Nature 336, 719-719.

Arabidopsis Genome Initiative (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796-815.

Arenas-Huertero, F., Arroyo, A., Zhou, L., Sheen, J., and Leon, P. (2000). Analysis of Arabidopsis glucose insensitive mutants, gin5 and gin6, reveals a central role of the plant hormone ABA in the regulation of plant vegetative development by sugar. Genes Dev. 14, 2085-2096.

Avila, J., Nieto, C., Canas, L., Benito, M.J., and Paz-Ares, J. (1993). Petunia hybrida genes related to the maize regulatory C1 gene and to animal myb proto-oncogenes. Plant J. 3, 553-562.

Aya, K., Ueguchi-Tanaka, M., Kondo, M., Hamada, K., Yano, K., Nishimura, M., and Matsuoka, M. (2009). Gibberellin Modulates Anther Development in Rice via the Transcriptional Regulation of GAMYB. Plant Cell 21, 1453-1472.

Ayres, M.P., Clausen, T.P., MacLean, S.F., Redman, A.M., and Reichardt, P.B. (1997). Diversity of structure and antiherbivore activity in condensed tannins. Ecology 78, 1696-1712.

Badis, G., Berger, M.F., Philippakis, A.A., Talukder, S., Gehrke, A.R., Jaeger, S.A., Chan, E.T., Metzler, G., Vedenko, A., Chen, X.Y., Kuznetsov, H., Wang, C.F., Coburn, D., Newburger, D.E., Morris, Q., Hughes, T.R., and Bulyk, M.L. (2009). Diversity and Complexity in DNA Recognition by Transcription Factors. Science 324, 1720-1723.

Bailey, J.K., Deckert, R., Schweitzer, J.A., Rehill, B.J., Lindroth, R.L., Gehring, C., and Whitham, T.G. (2005). Host plant genetics affect hidden ecological players: links among Populus, condensed tannins, and fungal endophyte infection. Canadian Journal of Botany-Revue Canadienne De Botanique 83, 356-361.

Bailey, T.L., Williams, N., Misleh, C., and Li, W.W. (2006). MEME: discovering and analyzing DNA and protein sequence motifs. Nucleic Acids Res. 34, W369-W373.

Baranowskij, N., Frohberg, C., Prat, S., and Willmitzer, L. (1994). A novel DNA binding protein with homology to Myb oncoproteins containing only one repeat can function as a transcriptional activator. Embo J. 13, 5383-5392.

Barbulescu, K., Geserick, C., Schuttke, I., Schleuning, W.D., and Haendler, B. (2001). New androgen response elements in the murine Pem promoter mediate selective transactivation. Mol. Endocrinol. 15, 1803-1816.

Baudry, A., Heim, M.A., Dubreucq, B., Caboche, M., Weisshaar, B., and Lepiniec, L. (2004). TT2, TT8, and TTG1 synergistically specify the expression of BANYULS and proanthocyanidin biosynthesis in Arabidopsis thaliana. Plant J. 39, 366-380.

Beall, E.L., Manak, J.R., Zhou, S., Bell, M., Lipsick, J.S., and Botchan, M.R. (2002). Role for a Drosophila Myb-containing protein complex in site-specific DNA replication. Nature 420, 833-837.

Bell-Lelong, D.A., Cusumano, J.C., Meyer, K., and Chapple, C. (1997). Cinnamate-4-hydroxylase expression in Arabidopsis - Regulation in response to development and the environment. Plant Physiol. 113, 729-738.

Berge, T., Matre, V., Brendeford, E.M., Saether, T., Luscher, B., and Gabrielsen, O.S. (2007). Revisiting a selection of target genes for the hematopoietic transcription factor c-Myb using chromatin immunoprecipitation and c-Myb knockdown. Blood Cells Mol. Dis. 39, 278-286.

Berger, M.F., Philippakis, A.A., Qureshi, A.M., He, F.X.S., Estep, P.W., and Bulyk, M.L. (2006). Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities. Nat. Biotechnol. 24, 1429-1435.

Bergholtz, S., Andersen, T.O., Andersson, K.B., Borrebaek, J., Luscher, B., and Gabrielsen, O.S. (2001). The highly conserved DNA-binding domains of A-, B- and c-Myb differ with respect to DNA-binding, phosphorylation and redox properties. Nucleic Acids Res. 29, 3546-3556.

Bianchi, A., Smith, S., Chong, L., Elias, P., and de Lange, T. (1997). TRF1 is a dimer and bends telomeric DNA. Embo J. 16, 1785-1794.

Biedenkapp, H., Borgmeyer, U., Sippel, A.E., and Klempnauer, K.H. (1988). Viral myb oncogene encodes a sequence-specific DNA-binding activity. Nature 335, 835-837.

Bilaud, T., Koering, C.E., Binet-Brasselet, E., Ancelin, K., Pollice, A., Gasser, S.M., and Gilson, E. (1996). The telobox, a Myb-related telomeric DNA binding motif found in proteins from yeast, plants and human. Nucleic Acids Res. 24, 1294-1303.

Borg, M., Brownfield, L., Khatab, H., Sidorova, A., Lingaya, M., and Twell, D. (2011). The R2R3 MYB Transcription Factor DUO1 Activates a Male Germline-Specific Regulon Essential for Sperm Cell Differentiation in Arabidopsis. Plant Cell 23, 534-549.

Boyes, D.C., Zayed, A.M., Ascenzi, R., McCaskill, A.J., Hoffman, N.E., Davis, K.R., and Gorlach, J. (2001). Growth stage-based phenotypic analysis of arabidopsis: A model for high throughput functional genomics in plants. Plant Cell 13, 1499-1510.

Braun, E.L., and Grotewold, E. (1999). Newly discovered plant c-myb-like genes rewrite the evolution of the plant myb gene family. Plant Physiol. 121, 21-24.

Brown, D.M., Zeef, L.A.H., Ellis, J., Goodacre, R., and Turner, S.R. (2005). Identification of novel genes in Arabidopsis involved in secondary cell wall formation using expression profiling and reverse genetics. Plant Cell 17, 2281-2295.

Bruhat, A., Tourmente, S., Chapel, S., Sobrier, M.L., Couderc, J.L., and Dastugue, B. (1990). Regulatory elements in the 1st-intron contribute to transcriptional regulation of the beta-3 tubulin gene by 20-hydroxyecdysone in Drosophila kc-cells. Nucleic Acids Res. 18, 2861-2867.

Bulow, L., Brill, Y., and Hehl, R. (2010). AthaMap-assisted transcription factor target gene identification in Arabidopsis thaliana. Database-the Journal of Biological Databases and Curation.

Bulow, L., Engelmann, S., Schindler, M., and Hehl, R. (2009). AthaMap, integrating transcriptional and post-transcriptional data. Nucleic Acids Res. 37, D983-D986.

Busch, M.A., Bomblies, K., and Weigel, D. (1999). Activation of a floral homeotic gene in Arabidopsis. Science 285, 585-587.

Carmel, L., Wolf, Y.I., Rogozin, I.B., and Koonin, E.V. (2007). Three distinct modes of intron dynamics in the evolution of eukaryotes. Genome Res. 17, 1034-1044.

Carra, J.H., and Privalov, P.L. (1997). Energetics of folding and DNA binding of the MAT alpha 2 homeodomain. Biochemistry 36, 526-535.

Carre, I.A., and Kay, S.A. (1995). Multiple DNA-protein complexes at a circadian-regulated promoter element. Plant Cell 7, 2039-2051.

Chaffey, N., Cholewa, E., Regan, S., and Sundberg, B. (2002). Secondary xylem development in Arabidopsis: a model for wood formation. Physiologia Plantarum 114, 594-600.

Chen, C.M., Wang, C.T., and Ho, C.H. (2001). A plant gene encoding a Myb-like protein that binds telomeric GGTTTAG repeats in vitro. J. Biol. Chem. 276, 16511-16519.

Chen, P.W., Chiang, C.M., Tseng, T.H., and Yu, S.M. (2006). Interaction between rice MYBGA and the gibberellin response element controls tissue-specific sugar sensitivity of alpha-amylase genes. Plant Cell 18, 2326-2340.

Chiou, T.J., and Bush, D.R. (1998). Sucrose is a signal molecule in assimilate partitioning. Proc. Natl. Acad. Sci. U. S. A. 95, 4784-4788.

Collado-Vides, J., Magasanik, B., and Gralla, J.D. (1991). Control site location and transcriptional regulation in Escherichia coli. Microbiol. Rev. 55, 371-394.

Collins, T.J. (2007). ImageJ for microscopy. Biotechniques 43, 25-30.

Cosma, M.P., Tanaka, T.U., and Nasmyth, K. (1999). Ordered recruitment of transcription and chromatin remodeling factors to a cell cycle- and developmentally regulated promoter. Cell 97, 299-311.

Coupe, S.A., Palmer, B.G., Lake, J.A., Overy, S.A., Oxborough, K., Woodward, F.I., Gray, J.E., and Quick, W.P. (2006). Systemic signalling of environmental cues in Arabidopsis leaves. J. Exp. Bot. 57, 329-341.

Court, R., Chapman, L., Fairall, L., and Rhodes, D. (2005). How the human telomeric proteins TRF1 and TRF2 recognize telomeric DNA: a view from high-resolution crystal structures. EMBO Rep. 6, 39-45.

Davies, K.M., and Schwinn, K.E. (2003). Transcriptional regulation of secondary metabolism. Functional Plant Biology 30, 913-925.

Debeaujon, I., Nesi, N., Perez, P., Devic, M., Grandjean, O., Caboche, M., and Lepiniec, L. (2003). Proanthocyanidin-accumulating cells in Arabidopsis testa: Regulation of differentiation and role in seed development. Plant Cell 15, 2514-2531.

Dekkers, B.J.W., Schuurmans, J., and Smeekens, S.C.M. (2004). Glucose delays seed germination in Arabidopsis thaliana. Planta 218, 579-588.

DeLano, W.L. (2002). The PyMOL Molecular Graphics System DeLano Scientific. http://www.pymol.org.

Deyholos, M.K., and Sieburth, L.E. (2000). Separable whorl-specific expression and negative regulation by enhancer elements within the AGAMOUS second intron. Plant Cell 12, 1799-1810.

Dias, A.P., Braun, E.L., McMullen, M.D., and Grotewold, E. (2003). Recently duplicated maize R2R3 Myb genes provide evidence for distinct mechanisms of evolutionary divergence after duplication. Plant Physiol. 131, 610-620.

Dill, A., and Sun, T.P. (2001). Synergistic derepression of gibberellin signaling by removing RGA and GAI function in Arabidopsis thaliana. Genetics 159, 777-785.

Dixon, R.A. (2005). Engineering of plant natural product pathways. Curr. Opin. Plant Biol. 8, 329-336.

Dixon, R.A., and Paiva, N.L. (1995). Stress-induced phenylpropanoid metabolism. Plant Cell 7, 1085-1097.

Do, C.-T., Pollet, B., Thevenin, J., Sibout, R., Denoue, D., Barriere, Y., Lapierre, C., and Jouanin, L. (2007). Both caffeoyl Coenzyme A 3-O-methyltransferase 1 and caffeic acid O-methyltransferase 1 are involved in redundant functions for lignin, flavonoids and sinapoyl malate biosynthesis in Arabidopsis. Planta 226, 1117-1129.

Dooley, S., Seib, T., Welter, C., and Blin, N. (1996). c-myb Intron I protein binding and association with transcriptional activity in leukemic cells. Leuk. Res. 20, 429-439.

Dubos, C., Willment, J., Huggins, D., Grant, G.H., and Campbell, M.M. (2005). Kanamycin reveals the role played by glutamate receptors in shaping plant resource allocation. Plant J. 43, 348-355.

Dubos, C., Stracke, R., Grotewold, E., Weisshaar, B., Martin, C., and Lepiniec, L. (2010). MYB transcription factors in Arabidopsis. Trends Plant Sci. 15, 573-581.

Ebneth, A., Schweers, O., Thole, H., Fagin, U., Urbanke, C., Maass, G., and Wolfes, H. (1994). Biophysical characterization of the c-Myb DNA-binding domain. Biochemistry 33, 14586-14593.

Ehrenkaufer, G.M., Hackney, J.A., and Singh, U. (2009). A developmentally regulated Myb domain protein regulates expression of a subset of stage-specific genes in Entamoeba histolytica. Cell Microbiol. 11, 898-910.

Fairall, L., Schwabe, J.W.R., Chapman, L., Finch, J.T., and Rhodes, D. (1993). The crystal structure of a two zinc-finger peptide reveals an extension to the rules for zinc-finger/DNA recognition. Nature 366, 483-487.

Feldbrugge, M., Sprenger, M., Hahlbrock, K., and Weisshaar, B. (1997). PcMYB1, a novel plant protein containing a DNA-binding domain with one MYB repeat, interacts in vivo with a light-regulatory promoter unit. Plant J. 11, 1079-1093.

Fiume, E., Christou, P., Giani, S., and Breviario, D. (2004). Introns are key regulatory elements of rice tubulin expression. Planta 218, 693-703.

Florence, B., Handrow, R., and Laughon, A. (1991). DNA-binding specificity of the fushi tarazu homeodomain. Mol. Cell. Biol. 11, 3613-3623.

Fornale, S., Shi, X.H., Chai, C.L., Encina, A., Irar, S., Capellades, M., Fuguet, E., Torres, J.L., Rovira, P., Puigdomenech, P., Rigau, J., Grotewold, E., Gray, J., and Caparros-Ruiz, D. (2010). ZmMYB31 directly represses maize lignin genes and redirects the phenylpropanoid metabolic flux. Plant J. 64, 633-644.

Frampton, J., Gibson, T.J., Ness, S.A., Doderlein, G., and Graf, T. (1991). Proposed structure for the DNA-binding domain of the Myb oncoprotein based on model building and mutational analysis. Protein Eng. 4, 891-901.

Fu, D.L., Szucs, P., Yan, L.L., Helguera, M., Skinner, J.S., von Zitzewitz, J., Hayes, P.M., and Dubcovsky, J. (2005). Large deletions within the first intron in VRN-1 are associated with spring growth habit in barley and wheat. Mol. Genet. Genomics 273, 54-65.

Fukuzawa, M., Zhukovskaya, N.V., Yamada, Y., Araki, T., and Williams, J.G. (2006). Regulation of Dictyostelium prestalk-specific gene expression by a SHAQKY family MYB transcription factor. Development 133, 1715-1724.

Galis, I., Simek, P., Narisawa, T., Sasaki, M., Horiguchi, T., Fukuda, H., and Matsuoka, K. (2006). A novel R2R3 MYB transcription factor NtMYBJS1 is a methyl jasmonate-dependent regulator of phenylpropanoid-conjugate biosynthesis in tobacco. Plant J. 46, 573-592.

Gallagher, S.R. (1992). GUS protocols: using the GUS gene as a reporter of gene expression. (San Diego; London: Academic Press).

Galuschka, C., Schindler, M., Bulow, L., and Hehl, R. (2007). AthaMap web tools for the analysis and identification of co-regulated genes. Nucleic Acids Res. 35, D857-D862.

Gaudet, J., and Mango, S.E. (2002). Regulation of organogenesis by the Caenorhabditis elegans FoxA protein PHA-4. Science 295, 821-825.

Gaudet, J., Muttumu, S., Horner, M., and Mango, S.E. (2004). Whole-genome analysis of temporal gene expression during foregut development. PLoS Biol. 2, 1828-1842.

Georlette, D., Ahn, S., MacAlpine, D.M., Cheung, E., Lewis, P.W., Beall, E.L., Bell, S.P., Speed, T., Manak, J.R., and Botchan, M.R. (2007). Genomic profiling and expression studies reveal both positive and negative activities for the Drosophila Myb-MuvB/dREAM complex in proliferating cells. Genes Dev. 21, 2880-2896.

Gertz, J., Riles, L., Turnbaugh, P., Ho, S.W., and Cohen, B.A. (2005). Discovery, validation, and genetic dissection of transcription factor binding sites by comparative and functional genomics. Genome Res. 15, 1145-1152.

Gewirtz, A.M., and Calabretta, B. (1988). A c-myb antisense oligodeoxynucleotide inhibits normal human hematopoiesis in vitro. Science 242, 1303-1306.

Gibon, Y., Pyl, E.-T., Sulpice, R., Lunn, J.E., Hoehne, M., Guenther, M., and Stitt, M. (2009). Adjustment of growth, starch turnover, protein content and central metabolism to a decrease of the carbon supply when Arabidopsis is grown in very short photoperiods. Plant Cell and Environment 32, 859-874.

Gibson, S.I. (2000). Plant sugar-response pathways. Part of a complex regulatory web. Plant Physiol. 124, 1532-1539.

Gibson, S.I. (2005). Control of plant development and gene expression by sugar signaling. Curr. Opin. Plant Biol. 8, 93-102.

Glover, B.J., Perez-Rodriguez, M., and Martin, C. (1998). Development of several epidermal cell types can be specified by the same MYB-related plant transcription factor. Development 125, 3497-3508.

Godoy, M., Franco-Zorrilla, J.M., Pérez-Pérez, J., Oliveros, J.C., Lorenzo, Ó., and Solano, R. (2011). Improved protein-binding microarrays for the identification of DNA-binding specificities of transcription factors. Plant J. 66, 700-711.

Goicoechea, M., Lacombe, E., Legay, S., Mihaljevic, S., Rech, P., Jauneau, A., Lapierre, C., Pollet, B., Verhaegen, D., Chaubet-Gigot, N., and Grima-Pettenati, J. (2005). EgMYB2, a new transcriptional activator from Eucalyptus xylem, regulates secondary cell wall formation and lignin biosynthesis. Plant J. 43, 553-567.

Golay, J., Capucci, A., Arsura, M., Castellano, M., Rizzo, V., and Introna, M. (1991). Expression of c-myb and B-myb, but not A-myb, correlates with proliferation in human hematopoietic cells. Blood 77, 149-158.

Gomez-Maldonado, J., Avila, C., de la Torre, F., Canas, R., Canovas, F.M., and Campbell, M.M. (2004). Functional interactions between a glutamine synthetase promoter and MYB proteins. Plant J. 39, 513-526.

Gong, W., He, K., Covington, M., Dinesh-Kumar, S.P., Snyder, M., Harmer, S.L., Zhu, Y.X., and Deng, X.W. (2008). The development of protein microarrays and their applications in DNA-protein and protein-protein interaction analyses of Arabidopsis transcription factors. Mol. Plant. 1, 27-41.

Graesser, F.A., Lamontagne, K., Whittaker, L., Stohr, S., and Lipsick, J.S. (1992). A highly conserved cysteine in the v-Myb DNA-binding domain is essential for transformation and transcriptional trans-activation. Oncogene 7, 1005-1009.

Graf, A., Schlereth, A., Stitt, M., and Smith, A.M. (2010). Circadian control of carbohydrate availability for growth in Arabidopsis plants at night. Proc. Natl. Acad. Sci. U. S. A. 107, 9458-9463.

Graham, I.A., Denby, K.J., and Leaver, C.J. (1994). Carbon catabolite repression regulates glyoxylate cycle gene expression in cucumber. Plant Cell 6, 761-772.

Grotewold, E., Drummond, B.J., Bowen, B., and Peterson, T. (1994). The myb-homologous P gene controls phlobaphene pigmentation in maize floral organs by directly activating a flavonoid biosynthetic gene subset. Cell 76, 543-553.

Grove, C.A., De Masi, F., Barrasa, M.I., Newburger, D.E., Alkema, M.J., Bulyk, M.L., and Walhout, A.J.M. (2009). A Multiparameter Network Reveals Extensive Divergence between C. elegans bHLH Transcription Factors. Cell 138, 314-327.

Gubler, F., Kalla, R., Roberts, J.K., and Jacobsen, J.V. (1995). Gibberellin-regulated expression of a myb gene in barley aleurone cells: evidence for Myb transactivation of a high-pI alpha-amylase gene promoter. Plant Cell 7, 1879-1891.

Guehmann, S., Vorbrueggen, G., Kalkbrenner, F., and Moelling, K. (1992). Reduction of a conserved Cys is essential for Myb DNA-binding. Nucleic Acids Res. 20, 2279-2286.

Halford, N.G., and Paul, M.J. (2003). Carbon metabolite sensing and signalling. Plant Biotechnology Journal 1, 381-398.

Hall, K.B., and Kranz, J.K. (2008). Nitrocellulose Filter Binding for Determination of Dissociation Constants. In RNA-Protein Interaction Protocols (Humana Press), pp. 105-114.

Hanna-Rose, W., and Hansen, U. (1996). Active repression mechanisms of eukaryotic transcription repressors. Trends Genet. 12, 229-234.

Hanson, J., and Smeekens, S. (2009). Sugar perception and signaling - an update. Curr. Opin. Plant Biol. 12, 562-567.

Hara, Y., Onishi, Y., Oishi, K., Miyazaki, K., Fukamizu, A., and Ishida, N. (2009). Molecular characterization of Mybbp1a as a co-repressor on the Period2 promoter. Nucleic Acids Res. 37, 1115-1126.

Haritatos, E., Medville, R., and Turgeon, R. (2000). Minor vein structure and sugar transport in Arabidopsis thaliana. Planta 211, 105-111.

Harlow, E., and Lane, D. (1988). Antibodies: A Laboratory Manual. (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press).

Hartmann, U., Sagasser, M., Mehrtens, F., Stracke, R., and Weisshaar, B. (2005). Differential combinatorial interactions of cis-acting elements recognized by R2R3-MYB, BZIP, and BHLH factors control light-responsive and tissue-specific activation of phenylpropanoid biosynthesis genes. Plant Mol. Biol. 57, 155-171.

Hatton, D., Sablowski, R., Yung, M.H., Smith, C., Schuch, W., and Bevan, M. (1995). Two classes of cis sequences contribute to tissue-specific expression of a PAL2 promoter in transgenic tobacco. Plant J. 7, 859-876.

Hauffe, K.D., Lee, S.P., Subramaniam, R., and Douglas, C.J. (1993). Combinatorial interactions between positive and negative cis-acting elements control spatial patterns of 4CL-1 expression in transgenic tobacco. Plant J. 4, 235-253.

Hebsgaard, S.M., Korning, P.G., Tolstrup, N., Engelbrecht, J., Rouze, P., and Brunak, S. (1996). Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information. Nucleic Acids Res. 24, 3439-3452.

Heine, G.F., Hernandez, J.M., and Grotewold, E. (2004). Two cysteines in plant R2R3 MYB domains participate in REDOX-dependent DNA binding. J. Biol. Chem. 279, 37878-37885.

Heine, G.F., Malik, V., Dias, A.P., and Grotewold, E. (2007). Expression and molecular characterization of ZmMYB-IF35 and related R2R3-MYB transcription factors. Mol. Biotechnol. 37, 155-164.

Hemming, J.D.C., and Lindroth, R.L. (1995). Intraspecific variation in aspen phytochemistry - effects on performance of gypsy moths and forest tent caterpillars. Oecologia 103, 79-88.

Hemming, J.D.C., and Lindroth, R.L. (1999). Effects of light and nutrient availability on aspen: Growth, phytochemistry, and insect performance. Journal of Chemical Ecology 25, 1687-1714.

Hernandez, J.M., Heine, G.F., Irani, N.G., Feller, A., Kim, M.G., Matulnik, T., Chandler, V.L., and Grotewold, E. (2004). Different mechanisms participate in the R-dependent activity of the R2R3 MYB transcription factor C1. J. Biol. Chem. 279, 48205-48213.

Hetherington, A.M., and Woodward, F.I. (2003). The role of stomata in sensing and driving environmental change. Nature 424, 901-908.

Hewel, J.A., Liu, J.A., Onishi, K., Fong, V., Chandran, S., Olsen, J.B., Pogoutse, O., Schutkowski, M., Wenschuh, H., Winkler, D.F.H., Eckler, L., Zandstra, P.W., and Emili, A. (2010). Synthetic Peptide Arrays for Pathway-Level Protein Monitoring by Liquid Chromatography-Tandem Mass Spectrometry. Mol. Cell. Proteomics 9, 2460-2473.

Higo, K., Ugawa, Y., Iwamoto, M., and Higo, H. (1998). PLACE: a database of plant cis-acting regulatory DNA elements. Nucleic Acids Res. 26, 358-359.

Hirayama, T., and Shinozaki, K. (1996). A cdc5(+) homolog of a higher plant, Arabidopsis thaliana. Proc. Natl. Acad. Sci. U. S. A. 93, 13371-13376.

Hoeren, F.U., Dolferus, R., Wu, Y.R., Peacock, W.J., and Dennis, E.S. (1998). Evidence for a role for AtMYB2 in the induction of the Arabidopsis alcohol dehydrogenase gene (ADH1) by low oxygen. Genetics 149, 479-490.

Holm, L., and Park, J. (2000). DaliLite workbench for protein structure comparison. Bioinformatics 16, 566-567.

Howe, K.M., and Watson, R.J. (1991). Nucleotide preferences in sequence-specific recognition of DNA by c-myb protein. Nucleic Acids Res. 19, 3913-3919.

Howe, K.M., Reakes, C.F.L., and Watson, R.J. (1990). Characterization of the sequence-specific interaction of mouse c-myb protein with DNA. EMBO J. 9, 161-169.

Huang, R.P. (2003). Protein arrays, an excellent tool in biomedical research. Front. Biosci. 8, D559-D576.

Huang, Y.C., Su, L.H., Lee, G.A., Chiu, P.W., Cho, C.C., Wu, J.Y., and Sun, C.H. (2008). Regulation of Cyst Wall Protein Promoters by Myb2 in Giardia lamblia. J. Biol. Chem. 283, 31021-31029.

Hwang, M.G., Chung, I.K., Kang, B.G., and Cho, M.H. (2001). Sequence-specific binding property of Arabidopsis thaliana telomeric DNA binding protein 1 (AtTBP1). FEBS Lett. 503, 35-40.

Ito, M. (2005). Conservation and diversification of three-repeat Myb transcription factors in plants. J. Plant Res. 118, 61-69.

Ito, M., Iwase, M., Kodama, H., Lavisse, P., Komamine, A., Nishihama, R., Machida, Y., and Watanabe, A. (1998). A novel cis-acting element in promoters of plant B-type cyclin genes activates M phase-specific transcription. Plant Cell 10, 331-341.

Ito, M., Araki, S., Matsunaga, S., Itoh, T., Nishihama, R., Machida, Y., Doonan, J.H., and Watanabe, A. (2001). G2/M-phase-specific transcription during the plant cell cycle is mediated by c-Myb-like transcription factors. Plant Cell 13, 1891-1905.

Jackson, J., Ramsay, G., Sharkov, N.V., Lium, E., and Katzen, A.L. (2001). The role of transcriptional activation in the function of the Drosophila myb gene. Blood Cells Mol. Dis. 27, 446-455.

Jang, J.C., Leon, P., Zhou, L., and Sheen, J. (1997). Hexokinase as a sugar sensor in higher plants. Plant Cell 9, 5-19.

Jia, L., Clegg, M.T., and Jiang, T. (2004). Evolutionary dynamics of the DNA-binding domains in putative R2R3-MYB genes identified from rice subspecies indica and japonica genomes. Plant Physiol. 134, 575-585.

Jiang, C.H., Gu, J.Y., Chopra, S., Gu, X., and Peterson, T. (2004a). Ordered origin of the typical two- and three-repeat Myb genes. Gene 326, 13-22.

Jiang, C.Z., Gu, X., and Peterson, T. (2004b). Identification of conserved gene structures and carboxy-terminal motifs in the Myb gene family of Arabidopsis and Oryza sativa L. ssp indica. Genome Biol. 5, 11.

Jin, H.L., and Martin, C. (1999). Multifunctionality and diversity within the plant MYB-gene family. Plant Mol. Biol. 41, 577-585.

Jin, H.L., Cominelli, E., Bailey, P., Parr, A., Mehrtens, F., Jones, J., Tonelli, C., Weisshaar, B., and Martin, C. (2000). Transcriptional repression by AtMYB4 controls production of UV-protecting sunscreens in Arabidopsis. EMBO J. 19, 6150-6161.

Joos, H.J., and Hahlbrock, K. (1992). Phenylalanine ammonia-lyase in potato (Solanum tuberosum L.) - genomic complexity, structural comparison of two selected genes and modes of expression. Eur. J. Biochem. 204, 621-629.

Kapranov, P., Routt, S.M., Bankaitis, V.A., de Bruijn, F.J., and Szczyglowski, K. (2001). Nodule-specific regulation of phosphatidylinositol transfer protein expression in Lotus japonicus. Plant Cell 13, 1369-1382.

Katoh, K., Kuma, K., Toh, H., and Miyata, T. (2005). MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33, 511-518.

Kaul, S., Koo, H.L., Jenkins, J., Rizzo, M., Rooney, T., Tallon, L.J., Feldblyum, T., Nierman, W., Benito, M.I., Lin, X.Y., Town, C.D., Venter, J.C., Fraser, C.M., Tabata, S., Nakamura, Y., Kaneko, T., Sato, S., Asamizu, E., Kato, T., Kotani, H., Sasamoto, S., Ecker, J.R., Theologis, A., Federspiel, N.A., Palm, C.J., Osborne, B.I., Shinn, P., Conway, A.B., Vysotskaia, V.S., Dewar, K., Conn, L., Lenz, C.A., Kim, C.J., Hansen, N.F., Liu, S.X., Buehler, E., Altafi, H., Sakano, H., Dunn, P., Lam, B., Pham, P.K., Chao, Q., Nguyen, M., Yu, G.X., Chen, H.M., Southwick, A., Lee, J.M., Miranda, M., Toriumi, M.J., Davis, R.W., Wambutt, R., Murphy, G., Dusterhoft, A., Stiekema, W., Pohl, T., Entian, K.D., Terryn, N., Volckaert, G., Salanoubat, M., Choisne, N., Rieger, M., Ansorge, W., Unseld, M., Fartmann, B., Valle, G., Artiguenave, F., Weissenbach, J., Quetier, F., Wilson, R.K., de la Bastide, M., Sekhon, M., Huang, E., Spiegel, L., Gnoj, L., Pepin, K., Murray, J., Johnson, D., Habermann, K., Dedhia, N., Parnell, L., Preston, R., Hillier, L., Chen, E., Marra, M., Martienssen, R., McCombie, W.R., Mayer, K., White, O., Bevan, M., Lemcke, K., Creasy, T.H., Bielke, C., Haas, B., Haase, D., Maiti, R., Rudd, S., Peterson, J., Schoof, H., Frishman, D., Morgenstern, B., Zaccaria, P., Ermolaeva, M., Pertea, M., Quackenbush, J., Volfovsky, N., Wu, D.Y., Lowe, T.M., Salzberg, S.L., Mewes, H.W., Rounsley, S., Bush, D., Subramaniam, S., Levin, I., Norris, S., Schmidt, R., Acarkan, A., Bancroft, I., Brennicke, A., Eisen, J.A., Bureau, T., Legault, B.A., Le, Q.H., Agrawal, N., Yu, Z., Copenhaver, G.P., Luo, S., Pikaard, C.S., Preuss, D., Paulsen, I.T., Sussman, M., Britt, A.B., Selinger, D.A., Pandey, R., Mount, D.W., Chandler, V.L., Jorgensen, R.A., Pikaard, C., Juergens, G., Meyerowitz, E.M., Dangl, J., Jones, J.D.G., Chen, M., Chory, J., and Somerville, M.C. (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796-815.

Kim, K.N., and Guiltinan, M.J. (1999). Identification of cis-acting elements important for expression of the starch-branching enzyme I gene in maize endosperm. Plant Physiol. 121, 225-236.

Kim, M.J., Lee, T.H., Pahk, Y.M., Kim, Y.H., Park, H.M., Choi, Y.D., Nahm, B.H., and Kim, Y.K. (2009). Quadruple 9-mer-based protein binding microarray with DsRed fusion protein. BMC Mol. Biol. 10, 11.

Kirik, V., Simon, M., Huelskamp, M., and Schiefelbein, J. (2004). The ENHANCER OF TRY AND CPC1 gene acts redundantly with TRIPTYCHON and CAPRICE in trichome and root hair cell patterning in Arabidopsis. Dev. Biol. 268, 506-513.

Kislinger, T., Rahman, K., Radulovic, D., Cox, B., Rossant, J., and Emili, A. (2003). PRISM, a generic large scale proteomic investigation strategy for mammals. Mol. Cell. Proteomics 2, 96-106.

Klimczak, M., Kahl, W., and Grodzins, Z. (1972). Studies on phenolic acids, derivatives of cinnamic acid, in plants. 1. Phenolic acids in poplar (Populus). Dissertationes Pharmaceuticae et Pharmacologicae 24, 181-&.

Klug, A., and Schwabe, J.W.R. (1995). Protein motifs 5. Zinc fingers. FASEB J. 9, 597-604.

Ko, S., Yu, E.Y., Shin, J., Yoo, H.H., Tanaka, T., Kim, W.T., Cho, H.S., Lee, W., and Chung, I.K. (2009). Solution Structure of the DNA Binding Domain of Rice Telomere Binding Protein RTBP1. Biochemistry 48, 827-838.

Ko, S., Jun, S.H., Bae, H., Byun, J.S., Han, W., Park, H., Yang, S.W., Park, S.Y., Jeon, Y.H., Cheong, C., Kim, W.T., Lee, W., and Cho, H.S. (2008). Structure of the DNA-binding domain of NgTRF1 reveals unique features of plant telomere-binding proteins. Nucleic Acids Res. 36, 2739-2755.

Koch, K. (2004). Sucrose metabolism: regulatory mechanisms and pivotal roles in sugar sensing and plant development. Curr. Opin. Plant Biol. 7, 235-246.

Koch, K.E. (1996). Carbohydrate-modulated gene expression in plants. Annual Review of Plant Physiology and Plant Molecular Biology 47, 509-540.

Koering, C.E., Fourel, G., Binet-Brasselet, E., Laroche, T., Klein, F., and Gilson, E. (2000). Identification of high affinity Tbf1p-binding sites within the budding yeast genome. Nucleic Acids Res. 28, 2519-2526.

Konig, P., and Rhodes, D. (1997). Recognition of telomeric DNA. Trends Biochem. Sci. 22, 43-47.

Konig, P., Fairall, L., and Rhodes, D. (1998). Sequence-specific DNA recognition by the Myb-like domain of the human telomere binding protein TRF1: a model for the protein-DNA complex. Nucleic Acids Res. 26, 1731-1740.

Koshino-Kimura, Y., Wada, T., Tachibana, T., Tsugeki, R., Ishiguro, S., and Okada, K. (2005). Regulation of CAPRICE transcription by MYB proteins for root epidermis differentiation in Arabidopsis. Plant Cell Physiol. 46, 817-826.

Kranz, H., Scholz, K., and Weisshaar, B. (2000). c-MYB oncogene-like genes encoding three MYB repeats occur in all major plant lineages. Plant J. 21, 231-235.

Lacombe, E., Van Doorsselaere, J., Boerjan, W., Boudet, A.M., and Grima-Pettenati, J. (2000). Characterization of cis-elements required for vascular expression of the Cinnamoyl CoA Reductase gene and for protein-DNA complex formation. Plant J. 23, 663-676.

Lang, M., and Juan, E. (2010). Binding site number variation and high-affinity binding consensus of Myb-SANT-like transcription factor Adf-1 in Drosophilidae. Nucleic Acids Res. 38, 6404-6417.

Lascaris, R.F., Mager, W.H., and Planta, R.J. (1999). DNA-binding requirements of the yeast protein Rap1p as selected in silico from ribosomal protein gene promoter sequences. Bioinformatics 15, 267-277.

Lauvergeat, V., Rech, P., Jauneau, A., Guez, C., Coutos-Thevenot, P., and Grima-Pettenati, J. (2002). The vascular expression pattern directed by the Eucalyptus gunnii cinnamyl alcohol dehydrogenase EgCAD2 promoter is conserved among woody and herbaceous plant species. Plant Mol. Biol. 50, 497-509.

Le Hir, H., Nott, A., and Moore, M.J. (2003). How introns influence and enhance eukaryotic gene expression. Trends Biochem. Sci. 28, 215-220.

Lee, T.I., and Young, R.A. (2000). Transcription of eukaryotic protein-coding genes. Annual Review of Genetics 34, 77-137.

Legay, S., Lacombe, E., Goicoechea, M., Briere, C., Seguin, A., Mackay, J., and Grima-Pettenati, J. (2007). Molecular characterization of EgMYB1, a putative transcriptional repressor of the lignin biosynthetic pathway. Plant Sci. 173, 542-549.

Lepiniec, L., Debeaujon, I., Routaboul, J.-M., Baudry, A., Pourcel, L., Nesi, N., and Caboche, M. (2006). Genetics and biochemistry of seed flavonoids. In Annual Review of Plant Biology, pp. 405-430.

LeRoy, C.J., Whitham, T.G., Keim, P., and Marks, J.C. (2006). Plant genes link forests and streams. Ecology 87, 255-261.

Leyva, A., Liang, X.W., Pintor-Toro, J.A., Dixon, R.A., and Lamb, C.J. (1992). Cis-element combinations determine phenylalanine ammonia-lyase gene tissue-specific expression patterns. Plant Cell 4, 263-271.

Li, B.B., and de Lange, T. (2003). Rap1 affects the length and heterogeneity of human telomeres. Mol. Biol. Cell 14, 5060-5068.

Li, S.F., and Parish, R.W. (1995). Isolation of two novel myb-like genes from Arabidopsis and studies on the DNA-binding properties of their products. Plant J. 8, 963-972.

Liang, Y.-K., Xie, X., Lindsay, S.E., Wang, Y.B., Masle, J., Williamson, L., Leyser, O., and Hetherington, A.M. (2010). Cell wall composition contributes to the control of transpiration efficiency in Arabidopsis thaliana. Plant J. 64, 679-686.

Liang, Y.K., Dubos, C., Dodd, I.C., Holroyd, G.H., Hetherington, A.M., and Campbell, M.M. (2005). AtMYB61, an R2R3-MYB transcription factor controlling stomatal aperture in Arabidopsis thaliana. Curr. Biol. 15, 1201-1206.

Liao, Y., Zou, H.F., Wang, H.W., Zhang, W.K., Ma, B., Zhang, J.S., and Chen, S.Y. (2008). Soybean GmMYB76, GmMYB92, and GmMYB177 genes confer stress tolerance in transgenic Arabidopsis plants. Cell Res. 18, 1047-1060.

Lieb, J.D., Liu, X.L., Botstein, D., and Brown, P.O. (2001). Promoter-specific binding of Rap1 revealed by genome-wide maps of protein-DNA association. Nature Genet. 28, 327-334.

Lindroth, R.L., and Hwang, S.Y. (1996). Diversity, redundancy, and multiplicity in chemical defense systems of aspen. In Phytochemical Diversity and Redundancy in Ecological Interactions, J.T. Romeo, J.A. Saunders, and P. Barbosa, eds, pp. 25-56.

Linger, B.R., and Price, C.M. (2009). Conservation of telomere protein complexes: shuffling through evolution. Crit. Rev. Biochem. Mol. Biol. 44, 434-446.

Linnell, J., Mott, R., Field, S., Kwiatkowski, D.P., Ragoussis, J., and Udalova, I.A. (2004). Quantitative high-throughput analysis of transcription factor binding specificities. Nucleic Acids Res. 32, 7.

Lipsick, J.S. (1996). One billion years of Myb. Oncogene 13, 223-235.

Lira, C.B.B., Neto, J.L.D., Khater, L., Cagliari, T.C., Peroni, L.A., dos Reis, J.R.R., Ramos, C.H.I., and Cano, M.I.N. (2007). LaTBP1: A Leishmania amazonensis DNA-binding protein that associates in vivo with telomeres and GT-rich DNA using a Myb-like domain. Arch. Biochem. Biophys. 465, 399-409.

Liu, G.Y., Ren, G., Guirgis, A., and Thornburg, R.W. (2009). The MYB305 Transcription Factor Regulates Expression of Nectarin Genes in the Ornamental Tobacco Floral Nectary. Plant Cell 21, 2672-2687.

Logemann, E., Parniske, M., and Hahlbrock, K. (1995). Modes of expression and common structural features of the complete phenylalanine ammonia-lyase gene family in parsley. Proc. Natl. Acad. Sci. U. S. A. 92, 5905-5909.

Lois, R., Dietrich, A., Hahlbrock, K., and Schulz, W. (1989). A phenylalanine ammonia-lyase gene from parsley - structure, regulation and identification of elicitor and light responsive cis-acting elements. EMBO J. 8, 1641-1648.

Loreti, E., Alpi, A., and Perata, P. (2000). Glucose and disaccharide-sensing mechanisms modulate the expression of alpha-amylase in barley embryos. Plant Physiol. 123, 939-948.

Lu, C.A., Ho, T.H.D., Ho, S.L., and Yu, S.M. (2002). Three novel MYB proteins with one DNA binding repeat mediate sugar and hormone regulation of alpha-amylase gene expression. Plant Cell 14, 1963-1980.

Luscher, B., and Eisenman, R.N. (1990). New light on Myc and Myb. Part I. Myc. Genes Dev. 4, 2025-2035.

Ma, X.P., and Calabretta, B. (1994). DNA binding and transactivation activity of A-myb, a c-myb-related gene. Cancer Res. 54, 6512-6516.

Maeda, K., Kimura, S., Demura, T., Takeda, J., and Ozeki, Y. (2005). DcMYB1 acts as a transcriptional activator of the carrot phenylalanine ammonia-lyase gene (DcPAL1) in response to elicitor treatment, UV-B irradiation and the dilution effect. Plant Mol. Biol. 59, 739-752.

Maniatis, T., Goodbourn, S., and Fischer, J.A. (1987). Regulation of inducible and tissue-specific gene expression. Science 236, 1237-1245.

Mardis, E.R. (2007). ChIP-seq: welcome to the new frontier. Nat. Methods 4, 613-614.

Marian, C.O., Bordoli, S.J., Goltz, M., Santarella, R.A., Jackson, L.P., Danilevskaya, O., Beckstette, M., Meeley, R., and Bass, H.W. (2003). The maize Single myb histone 1 gene, Smh1, belongs to a novel gene family and encodes a protein that binds telomere DNA repeats in vitro. Plant Physiol. 133, 1336-1350.

Martin, C., and Paz-Ares, J. (1997). MYB transcription factors in plants. Trends Genet. 13, 67-73.

Martin, C., Bhatt, K., Baumann, K., Jin, H., Zachgo, S., Roberts, K., Schwarz-Sommer, Z., Glover, B., and Perez-Rodriguez, M. (2002). The mechanics of cell fate determination in petals. Philos. Trans. R. Soc. Lond. Ser. B-Biol. Sci. 357, 809-813.

Massie, C.E., and Mills, I.G. (2008). ChIPping away at gene regulation. EMBO Rep. 9, 337-343.

Matsumoto, B. (2002). Cell biological applications of confocal microscopy. (San Diego; London: Academic Press).

Maxwell, B.B., Andersson, C.R., Poole, D.S., Kay, S.A., and Chory, J. (2003). HY5, Circadian Clock-Associated 1, and a cis-element, DET1 dark response element, mediate DET1 regulation of chlorophyll a/b-binding protein 2 expression. Plant Physiol. 133, 1565-1577.

McDonnell, A.V., Jiang, T., Keating, A.E., and Berger, B. (2006). Paircoil2: improved prediction of coiled coils from sequence. Bioinformatics 22, 356-358.

Mehrtens, F., Kranz, H., Bednarek, P., and Weisshaar, B. (2005). The Arabidopsis transcription factor MYB12 is a flavonol-specific regulator of phenylpropanoid biosynthesis. Plant Physiol. 138, 1083-1096.

Meijsing, S.H., Pufall, M.A., So, A.Y., Bates, D.L., Chen, L., and Yamamoto, K.R. (2009). DNA Binding Site Sequence Directs Glucocorticoid Receptor Structure and Activity. Science 324, 407-410.

Melcher, K. (2000). A modular set of prokaryotic and eukaryotic expression vectors. Anal. Biochem. 277, 109-120.

Mellway, R.D., Tran, L.T., Prouse, M.B., Campbell, M.M., and Constabel, C.P. (2009). The Wound-, Pathogen-, and Ultraviolet B-Responsive MYB134 Gene Encodes an R2R3 MYB Transcription Factor That Regulates Proanthocyanidin Synthesis in Poplar. Plant Physiol. 150, 924-941.

Mena, M., Cejudo, F.J., Isabel-Lamoneda, I., and Carbonero, P. (2002). A role for the DOF transcription factor BPBF in the regulation of gibberellin-responsive genes in barley aleurone. Plant Physiol. 130, 111-119.

Meneses, E., Cardenas, H., Zarate, S., Brieba, L.G., Orozco, E., Lopez-Camarillo, C., and Azuara-Liceaga, E. (2010). The R2R3 Myb protein family in Entamoeba histolytica. Gene 455, 32-42.

Miranda, M., Ralph, S.G., Mellway, R., White, R., Heath, M.C., Bohlmann, J., and Constabel, C.P. (2007). The transcriptional response of hybrid poplar (Populus trichocarpa x P. deltoides) to infection by Melampsora medusae leaf rust involves induction of flavonoid pathway genes leading to the accumulation of proanthocyanidins. Molecular Plant-Microbe Interactions 20, 816-831.

Mizuguchi, G., Nakagoshi, H., Nagase, T., Nomura, N., Date, T., Ueno, Y., and Ishii, S. (1990). DNA binding activity and transcriptional activator function of the human B-myb protein compared with c-MYB. J. Biol. Chem. 265, 9280-9284.

Mohrmann, L., Kal, A.J., and Verrijzer, C.P. (2002). Characterization of the extended Myb-like DNA-binding domain of trithorax group protein zeste. J. Biol. Chem. 277, 47385-47392.

Mol, J., Grotewold, E., and Koes, R. (1998). How genes paint flowers and seeds. Trends Plant Sci. 3, 212-217.

Moore, B., Zhou, L., Rolland, F., Hall, Q., Cheng, W.H., Liu, Y.X., Hwang, I., Jones, T., and Sheen, J. (2003). Role of the Arabidopsis glucose sensor HXK1 in nutrient, light, and hormonal signaling. Science 300, 332-336.

Morita, A., Umemura, T., Kuroyanagi, M., Futsuhara, Y., Perata, P., and Yamaguchi, J. (1998). Functional dissection of a sugar-repressed alpha-amylase gene (RAmylA) promoter in rice embryos. FEBS Lett. 423, 81-85.

Morohashi, K., and Grotewold, E. (2009). A Systems Approach Reveals Regulatory Circuitry for Arabidopsis Trichome Initiation by the GL3 and GL1 Selectors. PLoS Genet. 5, 17.

Morohashi, K., Casas, M.I., Falcone Ferreyra, L., Mejia-Guerra, M.K., Pourcel, L., Yilmaz, A., Feller, A., Carvalho, B., Emiliani, J., Rodriguez, E., Pellegrinet, S., McMullen, M., Casati, P., and Grotewold, E. (2012). A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24, 2745-2764.

Moyano, E., Martinez-Garcia, J.F., and Martin, C. (1996). Apparent redundancy in Myb gene function provides gearing for the control of flavonoid biosynthesis in Antirrhinum flowers. Plant Cell 8, 1519-1532.

Mucenski, M.L., McLain, K., Kier, A.B., Swerdlow, S.H., Schreiner, C.M., Miller, T.A., Pietryga, D.W., Scott, W.J., and Potter, S.S. (1991). A functional c-myb gene is required for normal murine fetal hepatic hematopoiesis. Cell 65, 677-689.

Mukherjee, S., Berger, M.F., Jona, G., Wang, X.S., Muzzey, D., Snyder, M., Young, R.A., and Bulyk, M.L. (2004). Rapid analysis of the DNA-binding specificities of transcription factors with DNA microarrays. Nature Genet. 36, 1331-1339.

Muller, K.J., Romano, N., Gerstner, O., Garcia-Maroto, F., Pozzi, C., Salamini, F., and Rohde, W. (1995). The barley Hooded mutation caused by a duplication in a homeobox gene intron. Nature 374, 727-730.

Nakagoshi, H., Nagase, T., Kanei-Ishii, C., Ueno, Y., and Ishii, S. (1990). Binding of the c-myb proto-oncogene product to the simian virus 40 enhancer stimulates transcription. J. Biol. Chem. 265, 3479-3483.

Nesi, N., Jond, C., Debeaujon, I., Caboche, M., and Lepiniec, L. (2001). The Arabidopsis TT2 gene encodes an R2R3 MYB domain protein that acts as a key determinant for proanthocyanidin accumulation in developing seed. Plant Cell 13, 2099-2114.

Nesi, N., Debeaujon, I., Jond, C., Pelletier, G., Caboche, M., and Lepiniec, L. (2000). The TT8 gene encodes a basic helix-loop-helix domain protein required for expression of DFR and BAN genes in Arabidopsis siliques. Plant Cell 12, 1863-1878.

Newman, L.J., Perazza, D.E., Juda, L., and Campbell, M.M. (2004). Involvement of the R2R3-MYB, AtMYB61, in the ectopic lignification and dark-photomorphogenic components of the det3 mutant phenotype. Plant J. 37, 239-250.

Nishikawa, T., Okamura, H., Nagadoi, A., Konig, P., Rhodes, D., and Nishimura, Y. (2001). Solution structure of a telomeric DNA complex of human TRF1. Structure 9, 1237-1251.

Nomura, N., Takahashi, M., Matsui, M., Ishii, S., Date, T., Sasamoto, S., and Ishizaki, R. (1988). Isolation of human cDNA clones of myb-related genes, A-myb and B-myb. Nucleic Acids Res. 16, 11075-11089.

Oda, M., Furukawa, K., Sarai, A., and Nakamura, H. (1999). Kinetic analysis of DNA binding by the c-Myb DNA-binding domain using surface plasmon resonance. FEBS Lett. 454, 288-292.

Ogata, K., Kanai, H., Inoue, T., Sekikawa, A., Sasaki, M., Nagadoi, A., Sarai, A., Ishii, S., and Nishimura, Y. (1993). Solution structures of Myb DNA-binding domain and its complex with DNA. Nucleic acids symposium series, 201-202.

Ogata, K., Morikawa, S., Nakamura, H., Sekikawa, A., Inoue, T., Kanai, H., Sarai, A., Ishii, S., and Nishimura, Y. (1994). Solution structure of a specific DNA complex of the Myb DNA-binding domain with cooperative recognition helices. Cell 79, 639-648.

Ogata, K., Morikawa, S., Nakamura, H., Hojo, H., Yoshimura, S., Zhang, R.H., Aimoto, S., Ametani, Y., Hirata, Z., Sarai, A., Ishii, S., and Nishimura, Y. (1995). Comparison of the free and DNA-complexed forms of the DNA-binding domain from c-Myb. Nat. Struct. Biol. 2, 309-320.

Ong, S.J., Hsu, H.M., Liu, H.W., Chu, C.H., and Tai, J.H. (2006). Multifarious transcriptional regulation of adhesion protein gene ap65-1 by a novel Myb1 protein in the protozoan parasite Trichomonas vaginalis. Eukaryot. Cell 5, 391-399.

Ong, S.J., Hsu, H.M., Liu, H.W., Chu, C.H., and Tai, J.H. (2007). Activation of multifarious transcription of an adhesion protein ap65-1 gene by a novel Myb2 protein in the protozoan parasite Trichomonas vaginalis. J. Biol. Chem. 282, 6716-6725.

Oppenheimer, D.G., Herman, P.L., Sivakumaran, S., Esch, J., and Marks, M.D. (1991). A myb gene required for leaf trichome differentiation in Arabidopsis is expressed in stipules. Cell 67, 483-493.

Ording, E., Kvavik, W., Bostad, A., and Gabrielsen, O.S. (1994). Two functionally distinct half sites in the DNA-recognition sequence of the Myb oncoprotein. Eur. J. Biochem. 222, 113-120.

Osier, T.L., and Lindroth, R.L. (2001). Effects of genotype, nutrient availability, and defoliation on aspen phytochemistry and insect performance. Journal of Chemical Ecology 27, 1289-1313.

Osnato, M., Stile, M.R., Wang, Y.M., Meynard, D., Curiale, S., Guiderdoni, E., Liu, Y.X., Horner, D.S., Ouwerkerk, P.B.F., Pozzi, C., Muller, K.J., Salamini, F., and Rossini, L. (2010). Cross Talk between the KNOX and Ethylene Pathways Is Mediated by Intron-Binding Transcription Factors in Barley. Plant Physiol. 154, 1616-1632.

Osuna, D., Usadel, B., Morcuende, R., Gibon, Y., Blaesing, O.E., Hoehne, M., Guenter, M., Kamlage, B., Trethewey, R., Scheible, W.-R., and Stitt, M. (2007). Temporal responses of transcripts, enzyme activities and metabolites after adding sucrose to carbon-deprived Arabidopsis seedlings. Plant J. 49, 463-491.

Pabo, C.O., and Sauer, R.T. (1992). Transcription factors: structural families and principles of DNA recognition. Annu. Rev. Biochem. 61, 1053-1095.

Palo, R.T. (1984). Distribution of birch (Betula spp), willow (Salix spp), and poplar (Populus spp) secondary metabolites and their potential role as chemical defense against herbivores. Journal of Chemical Ecology 10, 499-520.

Patzlaff, A., Newman, L.J., Dubos, C., Whetten, R., Smith, C., McInnis, S., Bevan, M.W., Sederoff, R.R., and Campbell, M.M. (2003a). Characterisation of PtMYB1, an R2R3-MYB from pine xylem. Plant Mol.Biol. 53, 597-608.

Patzlaff, A., McInnis, S., Courtenay, A., Surman, C., Newman, L.J., Smith, C., Bevan, M.W., Mansfield, S., Whetten, R.W., Sederoff, R.R., and Campbell, M.M. (2003b). Characterisation of a pine MYB that regulates lignification. Plant J. 36, 743-754.

Pavletich, N.P., and Pabo, C.O. (1991). Zinc finger-DNA recognition: crystal structure of a Zif268-DNA complex at 2.1 A. Science 252, 809-817.

Pavletich, N.P., and Pabo, C.O. (1993). Crystal structure of a five-finger GLI-DNA complex: new perspectives on zinc fingers. Science 261, 1701-1707.

Pazares, J., Ghosal, D., Wienand, U., Peterson, P.A., and Saedler, H. (1987). The regulatory c1 locus of Zea mays encodes a protein with homology to myb proto-oncogene products and with structural similarities to transcriptional activators. Embo J. 6, 3553-3558.

Pearl, I.A., and Darling, S.F. (1971). Studies on leaves of family Salicaceae. 16. Phenolic extractives of leaves of Populus balsamifera and of P. trichocarpa. Phytochemistry 10, 2844-&.

Pego, J.V., Kortstee, A.J., Huijser, G., and Smeekens, S.G.M. (2000). Photosynthesis, sugars and the regulation of gene expression. J. Exp. Bot. 51, 407-416.

Pelloux, J., Rusterucci, C., and Mellerowicz, E.J. (2007). New insights into pectin methylesterase structure and function. Trends Plant Sci. 12, 267-277.

Penfield, S., Meissner, R.C., Shoue, D.A., Carpita, N.C., and Bevan, M.W. (2001). MYB61 is required for mucilage deposition and extrusion in the Arabidopsis seed coat. Plant Cell 13, 2777-2791.

Peters, C.W.B., Sippel, A.E., Vingron, M., and Klempnauer, K.H. (1987). Drosophila and vertebrate myb proteins share two conserved regions, one of which functions as a DNA-binding domain. Embo J. 6, 3085-3090.

Peters, D.J., and Constabel, C.P. (2002). Molecular analysis of herbivore-induced condensed tannin synthesis: cloning and expression of dihydroflavonol reductase from trembling aspen (Populus tremuloides). Plant J. 32, 701-712.

Phan, H.A., Iacuone, S., Li, S.F., and Parish, R.W. (2011). The MYB80 Transcription Factor Is Required for Pollen Development and the Regulation of Tapetal Programmed Cell Death in Arabidopsis thaliana. Plant Cell 23, 2209-2224.

Pitt, C.W., Valente, L.P., Rhodes, D., and Simonsson, T. (2008). Identification and characterization of an essential telomeric repeat binding factor in fission yeast. J. Biol. Chem. 283, 2693-2701.

Prestridge, D.S. (1991). Signal scan - a computer-program that scans DNA-sequences for eukaryotic transcriptional elements. Computer Applications in the Biosciences 7, 203-206.

Prouse, M.B., and Campbell, M.M. (2012). The interaction between MYB proteins and their target DNA binding sites. Biochimica Et Biophysica Acta-Gene Regulatory Mechanisms 1819, 67-77.

Ptashne, M., and Gann, A. (1997). Transcriptional activation by recruitment. Nature 386, 569-577.

Punwani, J.A., Rabiger, D.S., and Drews, G.N. (2007). MYB98 positively regulates a battery of synergid-expressed genes encoding filiform apparatus-localized proteins. Plant Cell 19, 2557-2568.

Punwani, J.A., Rabiger, D.S., Lloyd, A., and Drews, G.N. (2008). The MYB98 subcircuit of the synergid gene regulatory network includes genes directly and indirectly regulated by MYB98. Plant J. 55, 406-414.

Ramirez, V., Agorio, A., Coego, A., Garcia-Andrade, J., Hernandez, M.J., Balaguer, B., Ouwerkerk, P.B.F., Zarra, I., and Vera, P. (2011). MYB46 Modulates Disease Susceptibility to Botrytis cinerea in Arabidopsis. Plant Physiol. 155, 1920-1935.

Ramsay, R.G., Ishii, S., and Gonda, T.J. (1991). Increase in specific DNA binding by carboxyl truncation suggests a mechanism for activation of Myb. Oncogene 6, 1875-1879.

Rawat, R., Schwartz, J., Jones, M.A., Sairanen, I., Cheng, Y.F., Andersson, C.R., Zhao, Y.D., Ljung, K., and Harmer, S.L. (2009). REVEILLE1, a Myb-like transcription factor, integrates the circadian clock and auxin pathways. Proc. Natl. Acad. Sci. U. S. A. 106, 16883-16888.

Riechmann, J.L., Heard, J., Martin, G., Reuber, L., Jiang, C.Z., Keddie, J., Adam, L., Pineda, O., Ratcliffe, O.J., Samaha, R.R., Creelman, R., Pilgrim, M., Broun, P., Zhang, J.Z., Ghandehari, D., Sherman, B.K., and Yu, C.L. (2000). Arabidopsis transcription factors: Genome-wide comparative analysis among eukaryotes. Science 290, 2105-2110.

Rippe, R.A., Lorenzen, S.I., Brenner, D.A., and Breindl, M. (1989). Regulatory elements in the 5'-flanking region and the 1st intron contribute to transcriptional control of the mouse alpha-1 type-i collagen gene. Mol. Cell. Biol. 9, 2224-2227.

Robertson, G., Hirst, M., Bainbridge, M., Bilenky, M., Zhao, Y.J., Zeng, T., Euskirchen, G., Bernier, B., Varhol, R., Delaney, A., Thiessen, N., Griffith, O.L., He, A., Marra, M., Snyder, M., and Jones, S. (2007). Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat. Methods 4, 651-657.

Roche, P.J., Hoare, S.A., and Parker, M.G. (1992). A consensus DNA-binding site for the androgen receptor. Mol. Endocrinol. 6, 2229-2235.

Rogers, L.A., and Campbell, M.M. (2004). The genetic control of lignin deposition during plant growth and development. New Phytol. 164, 17-30.

Rogers, L.A., Dubos, C., Cullis, I.F., Surman, C., Poole, M., Willment, J., Mansfield, S.D., and Campbell, M.M. (2005). Light, the circadian clock, and sugar perception in the control of lignin biosynthesis. J. Exp. Bot. 56, 1651-1663.

Rogg, L.E., and Bartel, B. (2001). Auxin signaling: Derepression through regulated proteolysis. Developmental Cell 1, 595-604.

Roldan, M., Gomez-Mena, C., Ruiz-Garcia, L., Salinas, J., and Martinez-Zapater, J.M. (1999). Sucrose availability on the aerial part of the plant promotes morphogenesis and flowering of Arabidopsis in the dark. Plant J. 20, 581-590.

Rolland, F., Moore, B., and Sheen, J. (2002). Sugar sensing and signaling in plants. Plant Cell 14, S185-S205.

Rolland, F., Baena-Gonzalez, E., and Sheen, J. (2006). Sugar sensing and signaling in plants: Conserved and novel mechanisms. In Annual Review of Plant Biology, pp. 675-709.

Romano, J.M., Dubos, C., Prouse, M.B., Wilkins, O., Hong, H., Poole, M., Kang, K.Y., Li, E.Y., Douglas, C.J., Western, T.L., Mansfield, S.D., and Campbell, M.M. (2012). AtMYB61, an R2R3-MYB transcription factor, functions as a pleiotropic regulator via a small gene network. New Phytol. 195, 774-786.

Romero, I., Fuertes, A., Benito, M.J., Malpica, J.M., Leyva, A., and Paz-Ares, J. (1998). More than 80R2R3-MYB regulatory genes in the genome of Arabidopsis thaliana. Plant J. 14, 273-284.

Rose, A., Meier, I., and Wienand, U. (1999). The tomato I-box binding factor LeMYBI is a member of a novel class of Myb-like proteins. Plant J. 20, 641-652.

Rose, A.B. (2002). Requirements for intron-mediated enhancement of gene expression in Arabidopsis. RNA-Publ. RNA Soc. 8, 1444-1453.

Rose, A.B. (2008). Intron-Mediated Regulation of Gene Expression. Curr.Top.Microbiol.Immunol. 326, 277-290.

Rosinski, J.A., and Atchley, W.R. (1998). Molecular evolution of the Myb family of transcription factors: Evidence for polyphyletic origin. J. Mol. Evol. 46, 74-83.

Ruan, M.B., Liao, W.B., Zhang, X.C., Yu, X.L., and Peng, M. (2009). Analysis of the cotton sucrose synthase 3 (Sus3) promoter and first intron in transgenic Arabidopsis. Plant Sci. 176, 342-351.

Rushton, D.L., Tripathi, P., Rabara, R.C., Lin, J., Ringler, P., Boken, A.K., Langum, T.J., Smidt, L., Boomsma, D.D., Emme, N.J., Chen, X., Finer, J.J., Shen, Q.J., and Rushton, P.J. (2012). WRKY transcription factors: key components in abscisic acid signalling. Plant Biotechnology Journal 10, 2-11.

Ryu, K.H., Kang, Y.H., Park, Y.H., Hwang, D., Schiefelbein, J., and Lee, M.M. (2005). The WEREWOLF MYB protein directly regulates CAPRICE transcription during cell fate specification in the Arabidopsis root epidermis. Development 132, 4765-4775.

Sablowski, R.W.M., and Meyerowitz, E.M. (1998). A homolog of NO APICAL MERISTEM is an immediate target of the floral homeotic genes APETALA3/PISTILLATA. Cell 92, 93-103.

Sablowski, R.W.M., Baulcombe, D.C., and Bevan, M. (1995). Expression of a flower-specific Myb protein in leaf cells using a viral vector causes ectopic activation of a target promoter. Proc. Natl. Acad. Sci. U. S. A. 92, 6901-6905.

Sablowski, R.W.M., Moyano, E., Culianezmacia, F.A., Schuch, W., Martin, C., and Bevan, M. (1994). A flower-specific Myb protein activates transcription of phenylpropanoid biosynthetic genes. Embo J. 13, 128-137.

Saikumar, P., Murali, R., and Reddy, E.P. (1990). Role of tryptophan repeats and flanking amino acids in Myb-DNA interactions. Proc. Natl. Acad. Sci. U. S. A. 87, 8452-8456.

Sainz, M.B., Grotewold, E., and Chandler, V.L. (1997). Evidence for direct activation of an anthocyanin promoter by the maize C1 protein and comparison of DNA binding by related Myb domain proteins. Plant Cell 9, 611-625.

Sakura, H., Chie, K.I., Nagase, T., Nakagoshi, H., Gonda, T.J., and Ishii, S. (1989). Delineation of three functional domains of the transcriptional activator encoded by the c-myb protooncogene. Proc. Natl. Acad. Sci. U. S. A. 86, 5758-5762.

Sala, A., and Watson, R. (1999). B-Myb protein in cellular proliferation, transcription control, and cancer: Latest developments. J. Cell. Physiol. 179, 245-250.

Saleh, A., Alvarez-Venegas, R., and Avramova, Z. (2008). An efficient chromatin immunoprecipitation (ChIP) protocol for studying histone modifications in Arabidopsis plants. Nat. Protoc. 3, 1018-1025.

Samach, A., Onouchi, H., Gold, S.E., Ditta, G.S., Schwarz-Sommer, Z., Yanofsky, M.F., and Coupland, G. (2000). Distinct roles of CONSTANS target genes in reproductive development of Arabidopsis. Science 288, 1613-1616.

Santi, L., Wang, Y.M., Stile, M.R., Berendzen, K., Wanke, D., Roig, C., Pozzi, C., Muller, K., Muller, J., Rohde, W., and Salamini, F. (2003). The GA octodinucleotide repeat binding factor BBR participates in the transcriptional regulation of the homeobox gene Bkn3. Plant J. 34, 813-826.

Schaffer, R., Ramsay, N., Samach, A., Corden, S., Putterill, J., Carre, I.A., and Coupland, G. (1998). The late elongated hypocotyl mutation of Arabidopsis disrupts circadian rhythms and the photoperiodic control of flowering. Cell 93, 1219-1229.

Schellmann, S., Schnittger, A., Kirik, V., Wada, T., Okada, K., Beermann, A., Thumfahrt, J., Jurgens, G., and Hulskamp, M. (2002). TRIPTYCHON and CAPRICE mediate lateral inhibition during trichome and root hair patterning in Arabidopsis. Embo J. 21, 5036-5046.

Schmid, M., Davison, T.S., Henz, S.R., Pape, U.J., Demar, M., Vingron, M., Scholkopf, B., Weigel, D., and Lohmann, J.U. (2005). A gene expression map of Arabidopsis thaliana development. Nature Genet. 37, 501-506.

Schwartz, T., Rould, M.A., Lowenhaupt, K., Herbert, A., and Rich, A. (1999). Crystal structure of the Z alpha domain of the human editing enzyme ADAR1 bound to left-handed Z-DNA. Science 284, 1841-1845.

Schwechheimer, C., and Bevan, M. (1998). The regulation of transcription factor activity in plants. Trends Plant Sci. 3, 378-383.

Schweitzer, J.A., Bailey, J.K., Rehill, B.J., Martinsen, G.D., Hart, S.C., Lindroth, R.L., Keim, P., and Whitham, T.G. (2004). Genetically based trait in a dominant tree affects ecosystem processes. Ecology Letters 7, 127-134.

Seeliger, D., and de Groot, B.L. (2010). Ligand docking and binding site analysis with PyMOL and Autodock/Vina. J. Comput.-Aided Mol. Des. 24, 417-422.

Seguin, A., Laible, G., Leyva, A., Dixon, R.A., and Lamb, C.J. (1997). Characterization of a gene encoding a DNA-binding protein that interacts in vitro with vascular specific cis elements of the phenylalanine ammonia-lyase promoter. Plant Mol.Biol. 35, 281-291.

Seong, S.Y., and Choi, C.Y. (2003). Current status of protein chip development in terms of fabrication and application. Proteomics 3, 2176-2189.

Serpa, V., Vernal, J., Lamattina, L., Grotewold, E., Cassia, R., and Terenzi, H. (2007). Inhibition of AtMYB2 DNA-binding by nitric oxide involves cysteine S-nitrosylation. Biochem. Biophys. Res. Commun. 361, 1048-1053.

Sharma, S.B., and Dixon, R.A. (2005). Metabolic engineering of proanthocyanidins by ectopic expression of transcription factors in Arabidopsis thaliana. Plant J. 44, 62-75.

Shimazaki, K.-i., Doi, M., Assmann, S.M., and Kinoshita, T. (2007). Light regulation of stomatal movement. In Annual Review of Plant Biology, pp. 219-247.

Sinha, A.K., Hofmann, M.G., Romer, U., Kockenberger, W., Elling, L., and Roitsch, T. (2002). Metabolizable and non-metabolizable sugars activate different signal transduction pathways in tomato. Plant Physiol. 128, 1480-1489.

Smeekens, S. (2000). Sugar-induced signal transduction in plants. Annual Review of Plant Physiology and Plant Molecular Biology 51, 49-81.

Smith, A.M., and Stitt, M. (2007). Coordination of carbon supply and plant growth. Plant Cell and Environment 30, 1126-1149.

Solano, R., Nieto, C., and Pazares, J. (1995). MYB.Ph3 transcription factor from Petunia hybrida induces similar DNA-bending/distortions on its two types of binding site. Plant J. 8, 673-682.

Solano, R., Fuertes, A., SanchezPulido, L., Valencia, A., and PazAres, J. (1997). A single residue substitution causes a switch from the dual DNA binding specificity of plant transcription factor MYB.Ph3 to the animal c-MYB specificity. J. Biol. Chem. 272, 2889-2895.

Solfanelli, C., Poggi, A., Loreti, E., Alpi, A., and Perata, P. (2006). Sucrose-specific induction of the anthocyanin biosynthetic pathway in Arabidopsis. Plant Physiol. 140, 637-646.

Solomon, M.J., Larsen, P.L., and Varshavsky, A. (1988). Mapping protein-DNA interactions in vivo with formaldehyde: evidence that histone H4 is retained on a highly transcribed gene. Cell 53, 937-947.

Steffens, N.O., Galuschka, C., Schindler, M., Bulow, L., and Hehl, R. (2004). AthaMap: an online resource for in silico transcription factor binding sites in the Arabidopsis thaliana genome. Nucleic Acids Res. 32, D368-D372.

Steffens, N.O., Galuschka, C., Schindler, M., Bulow, L., and Hehl, R. (2005). AthaMap web tools for database-assisted identification of combinatorial cis-regulatory elements and the display of highly conserved transcription factor binding sites in Arabidopsis thaliana. Nucleic Acids Res. 33, W397-W402.

Stenman, G., Andersson, M.K., and Andren, Y. (2010). New tricks from an old oncogene: Gene fusion and copy number alterations of MYB in human cancer. Cell Cycle 9, 2986-2995.

Stevens, M.T., and Lindroth, R.L. (2005). Induced resistance in the indeterminate growth of aspen (Populus tremuloides). Oecologia 145, 298-306.

Stitt, M., Gibon, Y., Lunn, J.E., and Piques, M. (2007). Multilevel genomics analysis of carbon signalling during low carbon availability: coordinating the supply and utilisation of carbon in a fluctuating environment. Functional Plant Biology 34, 526-549.

Stobergrasser, U., Brydolf, B., Bin, X., Grasser, F., Firtel, R.A., and Lipsick, J.S. (1992). The Myb DNA-binding domain is highly conserved in Dictyostelium discoideum. Oncogene 7, 589-596.

Stracke, R., Werber, M., and Weisshaar, B. (2001). The R2R3-MYB gene family in Arabidopsis thaliana. Curr. Opin. Plant Biol. 4, 447-456.

Stracke, R., Ishihara, H., Barsch, G.H.A., Mehrtens, F., Niehaus, K., and Weisshaar, B. (2007). Differential regulation of closely related R2R3-MYB transcription factors controls flavonol accumulation in different parts of the Arabidopsis thaliana seedling. Plant J. 50, 660-677.

Sugimoto, K., Takeda, S., and Hirochika, H. (2000). MYB-related transcription factor NtMYB2 induced by wounding and elicitors is a regulator of the tobacco retrotransposon Tto1 and defense-related genes. Plant Cell 12, 2511-2527.

Sulpice, R., Pyl, E.-T., Ishihara, H., Trenkamp, S., Steinfath, M., Witucka-Wall, H., Gibon, Y., Usadel, B., Poree, F., Piques, M.C., Von Korff, M., Steinhauser, M.C., Keurentjes, J.J.B., Guenther, M., Hoehne, M., Selbig, J., Fernie, A.R., Altmann, T., and Stitt, M. (2009). Starch as a major integrator in the regulation of plant growth. Proc. Natl. Acad. Sci. U. S. A. 106, 10348-10353.

Sun, C.H., Palm, D., McArthur, A.G., Svard, S.G., and Gillin, F.D. (2002). A novel Myb-related protein involved in transcriptional activation of encystation genes in Giardia lamblia. Mol. Microbiol. 46, 971-984.

Suzuki, A., Wu, C.Y., Washida, H., and Takaiwa, F. (1998). Rice MYB protein OSMYB5 specifically binds to the AACA motif conserved among promoters of genes for storage protein glutelin. Plant Cell Physiol. 39, 555-559.

Tahirov, T.H., Sasaki, M., Inoue-Bungo, T., Fujikawa, A., Sato, K., Kumasaka, T., Yamamoto, M., and Ogata, K. (2001). Crystals of ternary protein-DNA complexes composed of DNA-binding domains of c-Myb or v-Myb, C/EBP alpha or C/EBP beta and tom-1A promoter fragment. Acta Crystallogr. Sect. D-Biol. Crystallogr. 57, 1655-1658.

Tahirov, T.H., Sato, K., Ichikawa-Iwata, E., Sasaki, M., Inoue-Bungo, T., Shiina, M., Kimura, K., Takata, S., Fujikawa, A., Morii, H., Kumasaka, T., Yamamoto, M., Ishii, S., and Ogata, K. (2002). Mechanism of c-Myb-C/EBP beta cooperation from separated sites on a promoter. Cell 108, 57-70.

Tallman, G. (2004). Are diurnal patterns of stomatal movement the result of alternating metabolism of endogenous guard cell ABA and accumulation of ABA delivered to the apoplast around guard cells by transpiration? J. Exp. Bot. 55, 1963-1976.

Tamagnone, L., Merida, A., Parr, A., Mackay, S., Culianez-Macia, F.A., Roberts, K., and Martin, C. (1998). The AmMYB308 and AmMYB330 transcription factors from antirrhinum regulate phenylpropanoid and lignin biosynthesis in transgenic tobacco. Plant Cell 10, 135-154.

Tamura, K., Dudley, J., Nei, M., and Kumar, S. (2007). MEGA4: Molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24, 1596-1599.

Tanikawa, J., Yasukawa, T., Enari, M., Ogata, K., Nishimura, Y., Ishii, S., and Sarai, A. (1993). Recognition of specific DNA sequences by the c-myb protooncogene product: role of three repeat units in the DNA-binding domain. Proc. Natl. Acad. Sci. U. S. A. 90, 9320-9324.

Telfer, A., Bollman, K.M., and Poethig, R.S. (1997). Phase change and the regulation of trichome distribution in Arabidopsis thaliana. Development 124, 645-654.

Tiessen, A., Prescha, K., Branscheid, A., Palacios, N., McKibbin, R., Halford, N.G., and Geigenberger, P. (2003). Evidence that SNF1-related kinase and hexokinase are involved in separate sugar-signalling pathways modulating post-translational redox activation of ADP-glucose pyrophosphorylase in potato tubers. Plant J. 35, 490-500.

Toufighi, K., Brady, S.M., Austin, R., Ly, E., and Provart, N.J. (2005). The Botany Array Resource: e-Northerns, Expression Angling, and Promoter analyses. Plant J. 43, 153-163.

Treisman, R., Marais, R., and Wynne, J. (1992). Spatial flexibility in ternary complexes between SRF and its accessory proteins. Embo J. 11, 4631-4640.

Tsai, C.-J., Harding, S.A., Tschaplinski, T.J., Lindroth, R.L., and Yuan, Y. (2006). Genome-wide analysis of the structural genes regulating defense phenylpropanoid metabolism in Populus. New Phytol. 172, 47-62.

Uimari, A., and Strommer, J. (1997). Myb26: a MYB-like protein of pea flowers with affinity for promoters of phenylpropanoid genes. Plant J. 12, 1273-1284.

Urao, T., Yamaguchishinozaki, K., Urao, S., and Shinozaki, K. (1993). An Arabidopsis myb homolog is induced by dehydration stress and its gene product binds to the conserved MYB recognition sequence. Plant Cell 5, 1529-1539.

Usadel, B., Blaesing, O.E., Gibon, Y., Retzlaff, K., Hoehne, M., Guenther, M., and Stitt, M. (2008). Global transcript levels respond to small changes of the carbon status during progressive exhaustion of carbohydrates in Arabidopsis rosettes. Plant Physiol. 146, 1834-1861.

Vakoc, C.R., Letting, D.L., Gheldof, N., Sawado, T., Bender, M.A., Groudine, M., Weiss, M.J., Dekker, J., and Blobel, G.A. (2005). Proximity among distant regulatory elements at the beta-globin locus requires GATA-1 and FOG-1. Mol. Cell 17, 453-462.

Vavouri, T., and Elgar, G. (2005). Prediction of cis-regulatory elements using binding site matrices - the successes, the failures and the reasons for both. Curr. Opin. Genet. Dev. 15, 395-402.

Verrijdt, G., Haelens, A., and Claessens, F. (2003). Selective DNA recognition by the androgen receptor as a mechanism for hormone-specific regulation of gene expression. Mol. Genet. Metab. 78, 175-185.

Vicente, C., Conchillo, A., Pauwels, D., Vazquez, I., Garcia-Orti, L., Calasanz, M.J., Lahortiga, I., Cools, J., and Odero, M.D. (2009). MYB Overexpression Is Directly Involved in Acute Myeloid Leukemia Pathogenesis and Could Constitute a New Therapeutic Target for Patients with Aberrant Expression of This Gene. Blood 114, 948-948.

Wagner, D., Sablowski, R.W.M., and Meyerowitz, E.M. (1999). Transcriptional activation of APETALA1 by LEAFY. Science 285, 582-584.

Wang, Q.F., Lauring, J., and Schlissel, M.S. (2000). c-Myb binds to a sequence in the proximal region of the RAG-2 promoter and is essential for promoter activity in T-lineage cells. Mol. Cell. Biol. 20, 9203-9211.

Wang, S., Wang, J.W., Yu, N., Li, C.H., Luo, B., Gou, J.Y., Wang, L.J., and Chen, X.Y. (2004). Control of plant trichome development by a cotton fiber MYB gene. Plant Cell 16, 2323-2334.

Wang, Z.Y., and Tobin, E.M. (1998). Constitutive expression of the CIRCADIAN CLOCK ASSOCIATED 1 (CCA1) gene disrupts circadian rhythms and suppresses its own expression. Cell 93, 1207-1217.

Watson, R.J., Robinson, C., and Lam, E.W.F. (1993). Transcription regulation by murine B-myb is distinct from that by c-myb. Nucleic Acids Res. 21, 267-272.

Weisshaar, B., and Jenkins, G.I. (1998). Phenylpropanoid biosynthesis and its regulation. Curr. Opin. Plant Biol. 1, 251-257.

Weston, K. (1992). Extension of the DNA binding consensus of the chicken c-Myb and v-Myb proteins. Nucleic Acids Res. 20, 3042-3049.

Whitham, T.G., Bailey, J.K., Schweitzer, J.A., Shuster, S.M., Bangert, R.K., LeRoy, C.J., Lonsdorf, E.V., Allan, G.J., DiFazio, S.P., Potts, B.M., Fischer, D.G., Gehring, C.A., Lindroth, R.L., Marks, J.C., Hart, S.C., Wimp, G.M., and Wooley, S.C. (2006). A framework for community and ecosystem genetics: from genes to ecosystems. Nature Reviews Genetics 7, 510-523.

Wilkins, O., Nahal, H., Foong, J., Provart, N.J., and Campbell, M.M. (2009). Expansion and Diversification of the Populus R2R3-MYB Family of Transcription Factors. Plant Physiol. 149, 981-993.

Williams, C.E., and Grotewold, E. (1997). Differences between plant and animal myb domains are fundamental for DNA binding activity, and chimeric Myb domains have novel DNA binding specificities. J. Biol. Chem. 272, 563-571.

Winkel-Shirley, B. (2001). Flavonoid biosynthesis. A colorful model for genetics, biochemistry, cell biology, and biotechnology. Plant Physiol. 126, 485-493.

Wong, M.W., Henry, R.W., Ma, B.C., Kobayashi, R., Klages, N., Matthias, P., Strubin, M., and Hernandez, N. (1998). The large subunit of basal transcription factor SNAP(C) is a Myb domain protein that interacts with Oct-1. Mol. Cell. Biol. 18, 368-377.

Wright, W.E., Binder, M., and Funk, W. (1991). Cyclic amplification and selection of targets (CASTing) for the myogenin consensus binding site. Mol. Cell. Biol. 11, 4104-4110.

Xiang, Q.J., and Judelson, H.S. (2010). Myb transcription factors in the oomycete Phytophthora with novel diversified DNA-binding domains and developmental stage-specific expression. Gene 453, 1-8.

Xiao, W.Y., Sheen, J., and Jang, J.C. (2000). The role of hexokinase in plant sugar signal transduction and growth and development. Plant Mol.Biol. 44, 451-461.

Xie, D.Y., and Dixon, R.A. (2005). Proanthocyanidin biosynthesis - still more questions than answers? Phytochemistry 66, 2127-2144.

Xie, Z.D., Lee, E., Lucas, J.R., Morohashi, K., Li, D.M., Murray, J.A.H., Sack, F.D., and Grotewold, E. (2010). Regulation of Cell Proliferation in the Stomatal Lineage by the Arabidopsis MYB FOUR LIPS via Direct Targeting of Core Cell Cycle Genes. Plant Cell 22, 2306-2321.

Xue, G.P. (2005). A CELD-fusion method for rapid determination of the DNA-binding sequence specificity of novel plant DNA-binding proteins. Plant J. 41, 638-649.

Yang, H., Chung, H.J., Yong, T., Lee, B.H., and Park, S. (2003a). Identification of an encystation-specific transcription factor, Myb protein in Giardia lamblia. Mol. Biochem. Parasitol. 128, 167-174.

Yang, S.C., Sweetman, J.P., Amirsadeghi, S., Barghchi, M., Huttly, A.K., Chung, W.I., and Twell, D. (2001). Novel anther-specific myb genes from tobacco as putative regulators of phenylalanine ammonia-lyase expression. Plant Physiol. 126, 1738-1753.

Yang, T., Perasso, R., and Baroin-Tourancheau, A. (2003b). Myb genes in ciliates: A common origin with the myb protooncogene? Protist 154, 229-238.

Yang, Y.O., and Klessig, D.F. (1996). Isolation and characterization of a tobacco mosaic virus-inducible myb oncogene homolog from tobacco. Proc. Natl. Acad. Sci. U. S. A. 93, 14972-14977.

Yanhui, C., Xiaoyuan, Y., Kun, H., Meihua, L., Jigang, L., Zhaofeng, G., Zhiqiang, L., Yunfei, Z., Xiaoxiao, W., Xiaoming, Q., Yunping, S., Li, Z., Xiaohui, D., Jingchu, L., Xing-Wang, D., Zhangliang, C., Hongya, G., and Li-Jia, Q. (2006). The MYB transcription factor superfamily of Arabidopsis: expression analysis and phylogenetic comparison with the rice MYB family. Plant Mol.Biol. 60, 107-124.

Yi, J.X., Derynck, M.R., Li, X.Y., Telmer, P., Marsolais, F., and Dhaubhadel, S. (2010). A single-repeat MYB transcription factor, GmMYB176, regulates CHS8 gene expression and affects isoflavonoid biosynthesis in soybean. Plant J. 62, 1019-1034.

Yu, E.Y., Yen, W.F., Steinberg-Neifach, O., and Lue, N.F. (2010). Rap1 in Candida albicans: an Unusual Structural Organization and a Critical Function in Suppressing Telomere Recombination. Mol. Cell. Biol. 30, 1254-1268.

Yu, O., and McGonigle, B. (2005). Metabolic engineering of isoflavone biosynthesis. In Advances in Agronomy, Volume 86, D.L. Sparks, ed, pp. 147-190.

Zheng, Y.M., Ren, N., Wang, H., Stromberg, A.J., and Perry, S.E. (2009). Global Identification of Targets of the Arabidopsis MADS Domain Protein AGAMOUS-Like15. Plant Cell 21, 2563-2577.

Zhong, M., Niu, W., Lu, Z.J., Sarov, M., Murray, J.I., Janette, J., Raha, D., Sheaffer, K.L., Lam, H.Y.K., Preston, E., Slightham, C., Hillier, L.W., Brock, T., Agarwal, A., Auerbach, R., Hyman, A.A., Gerstein, M., Mango, S.E., Kim, S.K., Waterston, R.H., Reinke, V., and Snyder, M. (2010). Genome-Wide Identification of Binding Sites Defines Distinct Functions for Caenorhabditis elegans PHA-4/FOXA in Development and Environmental Response. PLoS Genet. 6, 13.

Zhong, R., Richardson, E.A., and Ye, Z.-H. (2007). The MYB46 transcription factor is a direct target of SND1 and regulates secondary wall biosynthesis in Arabidopsis. Plant Cell 19, 2776-2792.

Zhong, R., Lee, C., Zhou, J., McCarthy, R.L., and Ye, Z.-H. (2008). A Battery of Transcription Factors Involved in the Regulation of Secondary Cell Wall Biosynthesis in Arabidopsis. Plant Cell 20, 2763-2782.

Zhou, J.L., Lee, C.H., Zhong, R.Q., and Ye, Z.H. (2009). MYB58 and MYB63 Are Transcriptional Activators of the Lignin Biosynthetic Pathway during Secondary Cell Wall Formation in Arabidopsis. Plant Cell 21, 248-266.

Zhou, L., Jang, J.C., Jones, T.L., and Sheen, J. (1998). Glucose and ethylene signal transduction crosstalk revealed by an Arabidopsis glucose-insensitive mutant. Proc. Natl. Acad. Sci. U. S. A. 95, 10294-10299.

Zhu, Z., An, F., Feng, Y., Li, P., Xue, L., Mu, A., Jiang, Z., Kim, J.-M., To, T.K., Li, W., Zhang, X., Yu, Q., Dong, Z., Chen, W.-Q., Seki, M., Zhou, J.-M., and Guo, H. (2011). Derepression of ethylene-stabilized transcription factors (EIN3/EIL1) mediates jasmonate and ethylene signaling synergy in Arabidopsis. Proc. Natl. Acad. Sci. U. S. A. 108, 12539-12544.

Zimmermann, I.M., Heim, M.A., Weisshaar, B., and Uhrig, J.F. (2004). Comprehensive identification of Arabidopsis thaliana MYB transcription factors interacting with R/B-like BHLH proteins. Plant J. 40, 22-34.

Copyright Acknowledgements

Statement of Publications

The research presented in this thesis has appeared or has been submitted as a series of original publications in refereed journals.

Chapter 1

Prouse, M.B., and Campbell, M.M. (2012) The interaction between MYB proteins and their target DNA binding sites. Biochimica Et Biophysica Acta-Gene Regulatory Mechanisms. 1819: 67-77.

Chapter 2

Romano, J., Dubos, C., Prouse, M.B., Wilkins, O., Hong, H., Poole, M., Kang, K., Li, E., Douglas, C.J., Western, T.L., Mansfield, S.D., and Campbell, M.M. (2012) AtMYB61, an R2R3-MYB transcription factor, is a pleiotropic regulator of plant carbon acquisition and resource allocation. New Phytologist. 195: 774-786.

Chapter 3

Prouse, M.B., and Campbell, M.M. (2013) Interactions between the R2R3-MYB transcription factor, AtMYB61, and target DNA binding sites. PLOS ONE. 8(5): e65132.

Appendix

Mellway, R.D., Tran, L.T., Prouse, M.B., Campbell, M.M., and Constabel, C.P. (2009) The Wound-, Pathogen-, and Ultraviolet B-Responsive MYB134 Gene Encodes an R2R3 MYB Transcription Factor That Regulates Proanthocyanidin Synthesis in Poplar. Plant Physiology. 150: 924-941.