UNIVERSIDADE DE LISBOA

FACULDADE DE CIÊNCIAS

DEPARTMENT OF VEGETAL BIOLOGY

DEVELOPMENT OF TAL EFFECTOR –

RECOMBINASE FUSION PROTEINS FOR GENOME

ENGINEERING

Lígia Marisa Sampaio Pina

Supervisors:

Prof. Dr. Rui Gomes (DBV/FCUL)

Prof. Dr. Frank Buchholz (TUD, MPI-CBG)

DISSERTAÇÃO

MASTER IN MOLECULAR BIOLOGY AND GENETICS

2012/2013


Internal supervisor

Prof. Dr. Rui Artur Paiva Loureiro Gomes

Department of Vegetal Biology, University of Lisbon, Faculty of Science

External supervisor

Prof. Dr. Frank Buchholz

Head of UCC Section Systems Biology, Medical Faculty and University Hospital Carl Gustav Carus

“Where the world ceases to be the scene of our

personal hopes and wishes, where we face it

as free beings admiring, asking and observing,

there we enter the realm of Art and Science”

Albert Einstein

Acknowledgements

I would like to express my gratefulness to the following people for their contribution

and help during my thesis work.

First of all, Prof. Dr. FRANK BUCHHOLZ, for accepting me in his lab and giving me the opportunity to work on and develop this project, for the continuous support, for the discussions and ideas in our weekly meetings, for critically reading this thesis and, above all, my sincere gratitude for the confidence that he placed in me before he even knew me personally.

Prof. Dr. RUI GOMES for accepting to be my internal supervisor, for all the help,

support and motivation during the development of this project, for critically reading this

thesis, for the availability and sympathy he always gave me.

Dr. MADINA KARIMOVA for supervising my thesis work, for the discussions and ideas in our daily dialogues, for all the support and motivation, for making me believe in this project and in my qualities, and for meticulously reading this report. I would like to thank her especially for being such a good friend and wonderful human being.

I am very grateful to Dr. DENNIS KAPPEI, JOVAN MIRCETIC, and Dr. INA

WEIßWANGE for the support in scientific and daily challenges.

I would like to thank VICTORIA SPLITH (Vicky), ARNE JAHN (Arni) and

STANISLAVA POPOVA (Stani) for being more than colleagues, for fun moments, late nights,

special meals and candies but above all for the sincere friendship.

I am also very thankful to all the Buchholz Lab members for the sympathy and

companionship. I am so grateful to you all for the cups of tea shared in the kitchen after

lunch.

My roommates LINDA SCHUBERT and PATRICK JAHN, who became my family during this period in Dresden, a very special THANK YOU. I don’t have words to explain how important you two were for my happiness.

My best friends and my boyfriend, CARLOS, for visiting me in Dresden even when it was approximately -20 °C outside, and for all the Skype conversations and text messages we shared, which helped me feel closer to home.

Last but not least, I would like to thank my family for all their love: my mom, dad and sister, for their perpetual support, motivation, trust, funny moments, concern and love.

Abstract

Genome engineering is currently entering the phase of genetic manipulation in vivo. Several important discoveries have made this progress possible, such as the development of site-specific recombinases (SSRs) and the discovery of TAL effector (TALE) proteins, a novel type of DNA-binding protein.

SSRs have proved to be reliable and widely used engineering tools for manipulating DNA in vitro and in vivo. They possess the unique ability to mediate efficient and precise integration, deletion or inversion of defined DNA segments. Hyperactivated variants of the resolvase/invertase family of serine recombinases do not require accessory factors for recombination. They can therefore be re-targeted to sequences of interest by replacing their native DNA-binding domains (DBDs) with engineered TALE proteins, generating a chimeric TALE recombinase (TALER) with programmable sequence specificity.

Here we describe a chimeric TALE recombinase assembled to mediate site-specific recombination on novel sequences. We engineered a fusion between a hyperactivated catalytic domain from the Tn3 resolvase and a functional DBD assembled using a customized TALE design. We used three different truncated TALE variants to generate diverse TALER constructs, and six different target sites.

The TALE domain was assembled and tested beforehand through its fusion with a nuclease protein. We show that the TALE DNA-binding domain works efficiently and, when fused to a nonspecific cleavage domain, is able to introduce DNA strand breaks at a specific genomic sequence in human cells. We also show that the current TALER designs did not mediate site-specific recombination at the predicted target sites when tested in bacterial cells. We discuss some reasons that might explain the obtained results.

Although the TALER fusions described herein did not produce a functional variant, this work indicates that several technical details can be further optimized. The creation of novel recombinase domains promises to significantly expand the targeting capacity of engineered recombinases in genome engineering and even in gene therapy.

Key words: Genome engineering, chimeric nuclease, DNA double-strand break (DSB), site-specific recombination, transcription activator-like effector (TALE)

Resumo (Summary)

Genome engineering is entering a phase of genetic manipulation in vivo. Several discoveries have made this progress possible, such as the development of site-specific recombinases (SSRs) and the discovery of TAL effector (TALE) proteins, a new type of DNA-binding protein.

Site-specific recombinases have proved viable as tools for manipulating DNA in vitro and in vivo. These recombinases have unique qualities for mediating, efficiently and precisely, the integration, deletion or inversion of DNA segments. Hyperactivated variants of the resolvases/invertases of the serine recombinase family do not require accessory factors for recombination. They can therefore be redirected to sequences of interest by exchanging the DNA-binding domain (DBD) for a TALE protein, yielding a chimeric recombinase (TALER) with specificity programmed for a target sequence.

This work describes a chimeric TALE recombinase built to mediate site-specific recombination on new sequences. We fused a hyperactive catalytic domain of the Tn3 resolvase to a functional DNA-binding domain, using a customized TALE design. Three truncated variants of the TALE protein were designed to generate several TALERs, together with six different target sites.

The TALE domain was built and tested beforehand through its fusion with a nuclease protein. We demonstrated that the TALE DNA-binding domain, when coupled to a nonspecific cleavage domain, works efficiently to introduce double-strand breaks in human cells. We also demonstrate that the TALERs currently in use were not able to mediate site-specific recombination at the predicted target sites when tested in bacterial cells. We discuss some of the reasons that may be relevant to the results obtained.

Although the TALER fusions described in this work did not result in a functional variant, the work demonstrates that several technical details can be optimized. The creation of new chimeric recombinases promises to significantly expand the targeting capacity of recombinases in areas such as genome engineering and gene therapy.

Keywords: Genome engineering, site-specific recombination, transcription activator-like effector (TALE), chimeric nuclease, DNA double-strand break.

Table of Contents

Chapter I – Introduction

1 – Targeted genome engineering

1.1 – Turning genetic manipulations into genome engineering

1.2 – Site-specific recombination as a powerful tool for genome engineering

1.2.1 – Two families of recombinases

1.2.2 – Application of SSR

2 – Targeted DNA-binding proteins - zinc finger proteins

3 – Transcription activator-like effectors (TALEs) - a novel DNA binding domain

3.1 – TALE-nucleases for targeted DNA

3.2 – Chimeric TALE recombinases

Chapter II – Objectives

Chapter III – Materials and Methods

1 – Materials

1.1 – Enzymes, reagents and kits

1.2 – Bacterial reagents

1.3 – Plasmids

1.4 – Synthetic oligonucleotides

1.5 – Sequencing

2 – Methods

2.1 – TALEN construction

2.1.1 – Golden Gate Assembly Protocol

2.1.2 – Electrocompetent cells

2.1.2.1 – Transformation

2.1.3 – Optimized transfection protocol for U2OS cells

2.1.3.1 – PCR amplification and sequence verification

2.1.4 – Optimized conditions for T7 Endonuclease 1 assay

2.1.5 – Agarose Gel electrophoresis

2.2 – Chimeric Recombinase

2.2.1 – Cloning protocol of the Tn3 catalytic domain into pEVO vector

2.2.2 – Cloning protocol of the TALE truncations into pEVO-TN3 plasmid

Chapter IV – Results and Discussion

Part I – Design and assembly of custom-made TALEN to target the HOT1 gene in the human genome

1 – TALEN design

1.2 – HOT1-TALEN binding sites and spacer regions

2 – In vivo assay of TALEN activity

2.1 – Expression of custom HOT1-TALEN in human cells

2.2 – T7 endonuclease I (T7E1) assay for detection of mutagenesis rate

2.3 – Mutation rate detection in human cells treated with HOT1-TALEN

Part II – TAL-effector Recombinase design and assembly

3 – TALER architecture

3.1 – Activated resolvase mutant

3.2 – Designed truncations

4 – Design of the recombination target sites

5 – TALER activity in bacterial cells

Chapter V – Conclusions

Chapter VI – Bibliography

Supplements

Table of Figures

Figure 1. Schematic representation of site-specific recombination

Figure 2. Tn3 resolvase-mediated site-specific recombination

Figure 3. Scheme for the case of ZFN cleavage

Figure 4. TALE-protein structure

Figure 5. TALEN structure and mediated genome editing in human cells

Figure 6. Fusion orientation of a TALER

Figure 7. Schemes of the principal and target sequence of HOT1-TALEN

Figure 8. Transfection efficiency of U2OS cells

Figure 9. In vivo TALEN-induced genome editing

Figure 10. T7E1 plasmid assay

Figure 11. Targeted genome editing in human cells at specific target site

Figure 12. TALEN efficiency at the HOT1 genomic locus

Figure 13. TALER fusion orientation

Figure 15. TALE truncations variant designs

Figure 16. Scheme of the chimeric recombination target sites

Figure 17. Scheme of the recombination assay

Figure 18. Recombination assay results

Supplements Table 1. Primers used for vector’s construction and recombination assay

Supplements Table 2. TALEN construct assembly timeline

Supplements Figure 1. Mammalian expression vector

Supplements Figure 2. APC-TALEN target sequence amplified with PCR from human cells

Supplements Figure 3. HOT1-TALEN target sequence amplified with PCR from human cells

Supplements Figure 4. Recombination sites

Supplements Figure 5. Primers INT1, INT2 and INT4 orientation

Supplements Figure 6. Gel analysis from PCR result using INT2 and INT4 primer

Abbreviations

°C – Degrees Celsius

μl – Microliters

A

ATP – Adenosine triphosphate

B

bp – Base pairs

D

DBD – DNA-binding domain

DSB – Double strand break

G

g – Gravitational acceleration

H

HR – Homologous recombination

h – Hours

K

kb – Kilobase

M

min – Minutes

ml – Millilitres

mm – Millimeters

N

NEB – New England Biolabs

ng – Nanograms

NHEJ – Non-homologous end-joining

O

OD – Optical density

ON – Overnight

R

rpm – Revolutions per minute

S

sec – Seconds

SSRs – Site-specific recombinases

T

T7E1 – T7 endonuclease 1

TALE – Transcription activator-like effector

TALEN – Transcription activator-like effector nuclease

TALER – Transcription activator-like effector recombinase

U

U – Units

V

V – Volts

Z

ZF – Zinc finger

ZFN – Zinc finger nuclease

ZFR – Zinc finger recombinase

Chapter I – Introduction


1 – Targeted genome engineering

1.1 – Turning genetic manipulations into genome engineering

The discovery of restriction enzymes (Roberts, 1976) and DNA ligases (Lehman, 1974) was an important achievement for the manipulation of DNA in vitro, undertaken to obtain a better molecular understanding of biology and medicine. Restriction enzymes recognize and cut short target sequences (4-8 bp in length), and ligases rejoin the resulting fragments. Although restriction enzymes and ligases are widely used for modifying DNA sequences in vitro, they are not suitable for altering the genomic sequences of living organisms (in vivo). Recognition sites this short are expected to be widespread in eukaryotic genomes, which compromises the ability to rearrange DNA segments precisely in vivo.
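The argument about short recognition sites can be made concrete with a rough estimate. The sketch below is a back-of-the-envelope calculation, not part of the original work: it assumes a random sequence with equal base frequencies, counts one strand only, and uses a round 3 × 10^9 bp as an assumed genome size. The 35 bp case is included for comparison with the recombinase target size quoted later in this chapter.

```python
def expected_site_count(genome_length_bp: int, site_length_bp: int) -> float:
    """Expected occurrences of one specific site on one strand of a random
    sequence with equal base frequencies (a simplifying assumption)."""
    # Number of windows of the given length that fit in the sequence.
    positions = max(genome_length_bp - site_length_bp + 1, 0)
    # Each window matches a fixed site with probability (1/4)^n.
    return positions / 4 ** site_length_bp


# Assumed round figure for a mammalian-sized genome; not taken from the thesis.
HUMAN_GENOME_BP = 3_000_000_000

if __name__ == "__main__":
    for n in (4, 6, 8, 35):
        count = expected_site_count(HUMAN_GENOME_BP, n)
        print(f"{n:>2} bp site: about {count:.3g} expected occurrences")
```

Under these assumptions a 6 bp site is expected several hundred thousand times, whereas a site of about 35 bp is not expected to occur by chance at all. Real genomes are not random, so the numbers only indicate orders of magnitude.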

With the advance of new techniques and sophisticated bioinformatic design and modelling, the last decade has witnessed progress in molecular biology leading into a new era of genome engineering (Fehér, Burland, & Pósfai, 2012). Genome engineering is currently entering the phase of genetic manipulation in vivo and represents an important research area for diverse disciplines, particularly in human health. Several important discoveries have made this progress possible, such as the development of site-specific recombinases (SSRs), used extensively in mammals and other eukaryotes for experimental research or projected gene therapy (Gorman & Bullock, 2000), and the generation of zinc-finger (ZF) and TAL effector (TALE) proteins.

1.2 – Site-specific recombination as a powerful tool for genome engineering

Site-specific recombinases (SSRs) are an important class of DNA-binding proteins essential for a variety of biological processes, including regulation of gene expression, bacterial genome replication, differentiation, and the integration and excision of viral DNA into and from a host genome (Gaj, et al., 2011) (Van de Putte & Goosen, 1992). A recombination event results from a physical exchange of chromosomal material; site-specific recombination is a specialized form that occurs only between specific sites.

Recently, SSRs have emerged as powerful tools for genome engineering due to their ability to rearrange DNA segments in complex eukaryotic genomes (Smith & Thorpe, 2002). Recombinases bind to a pair of specific short DNA sequences, termed target sites, and catalyse strand exchange between them via consecutive cleavage and ligation of the target sites, with conservation of the phosphodiester bond energy (Kilby, Snaith, & Murray, 1993). When the two target sites are present on two different DNA molecules, site-specific recombination merges the molecules, which represents an integration event (Fig. 1a). For sites located on the same DNA molecule, two different outcomes are possible, depending on the relative orientation of the target sites. Recombination between sites in a head-to-tail orientation results in excision of the intervening DNA (Fig. 1a; if the substrate is circular, this reaction is sometimes called resolution), whereas exchange between sites in a head-to-head orientation results in inversion (Fig. 1b).
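The three outcomes described above (integration, excision and inversion) can be illustrated with a toy string model. This is a minimal sketch under stated assumptions: DNA is a plain string of the top strand, the target site is an asymmetric sequence so that its orientation is defined, and the site `AAGGTC` and all flanking sequences are made up for illustration. It does not model the enzyme mechanism, only the rearrangement of the sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def recombine_same_molecule(dna: str, site: str):
    """Recombine two copies of `site` on one linear molecule.

    Head-to-tail copies give excision (linear remainder plus excised circle);
    head-to-head copies give inversion of the intervening segment.
    """
    first = dna.find(site)
    if first == -1:
        raise ValueError("no target site found")
    after_first = first + len(site)
    direct = dna.find(site, after_first)
    inverted = dna.find(revcomp(site), after_first)
    if direct != -1:
        # One site stays on the chromosome, the other leaves with the circle.
        remaining = dna[:after_first] + dna[direct + len(site):]
        excised_circle = dna[after_first:direct + len(site)]
        return "excision", (remaining, excised_circle)
    if inverted != -1:
        between = dna[after_first:inverted]
        return "inversion", (dna[:after_first] + revcomp(between) + dna[inverted:],)
    raise ValueError("a second target site is required")


def integrate(target: str, circle: str, site: str) -> str:
    """Insert a circular molecule (written as a string) into `target` at `site`."""
    i, j = target.find(site), circle.find(site)
    if i == -1 or j == -1:
        raise ValueError("both molecules need a target site")
    # Open the circle at its site; the payload ends up flanked by two sites.
    payload = circle[j + len(site):] + circle[:j]
    return target[:i] + site + payload + site + target[i + len(site):]


if __name__ == "__main__":
    S = "AAGGTC"  # hypothetical asymmetric target site
    print(recombine_same_molecule("TTTT" + S + "CACA" + S + "GGGG", S))
    print(recombine_same_molecule("TTTT" + S + "CACA" + revcomp(S) + "GGGG", S))
    print(integrate("TTTT" + S + "GGGG", "AC" + S + "CA", S))
```

In this model, excision of an integrated molecule returns the original target, which mirrors the statement that excision is the reverse of integration.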

Figure 1. Schematic representation of site-specific recombination.

The arrows represent recombination sites and the pointed ends indicate site orientation. (a) Integration of a circular DNA molecule into a specific site on a second molecule. The reverse reaction, excision (resolution), removes the integrated DNA as a circle. (b) Inversion results from recombination between sites in a head-to-head orientation.

Naturally occurring SSRs are highly specific, fast and efficient. All known SSRs fall into one of two distinct families, the tyrosine recombinases and the serine recombinases, named after the protein nucleophile responsible for formation of covalent recombinase-DNA intermediates during the exchange of DNA strands (Grindley, Whiteson, & Rice, 2006). The two families are unrelated in structure and protein sequence, and their recombination mechanisms are distinctly different. Each family appears to have arisen and, probably, evolved separately. Given this lack of similarity, one might expect the two recombinase families to have become specialized for distinct types of DNA rearrangements. However, that is not the case: there is no obvious distinction between their biological functions (Rice & Correll, 2008) (Rowland & Stark, 2005).

1.2.1 – Two families of recombinases

Recombinases are derived from bacteria and fungi. The tyrosine recombinases are most widespread among prokaryotes and are structurally diverse and extremely versatile (Grindley, Whiteson, & Rice, 2006), whereas serine recombinases are widespread in Eubacteria and Archaea but not in Eukarya, where the few examples found so far may be of recent bacterial origin (Rowland & Stark, 2005). There are interesting similarities and differences in the catalytic mechanisms used by these recombinases. Most SSRs from the serine family require additional host factors for efficient catalysis, limiting their use for in vivo applications in heterologous hosts. In the tyrosine family, recombinases such as Cre and Flp are able to recombine their targets efficiently without the aid of any accessory proteins (Laprise, Yoneji, & Gardner, 2010). Hence, they can be used to rearrange DNA in living systems.

The Cre protein, encoded by coliphage P1 (Sauer & Henderson, 1988), and the Flp protein, from the 2 μm plasmid of Saccharomyces cerevisiae (Voziyanov, et al., 2003), are among the most popular and widely used site-specific tyrosine recombinases for genome engineering (Branda & Dymecki, 2004). Both SSRs catalyse reversible site-specific recombination between two identical DNA sequences with high fidelity, and they do so without cofactors both in Escherichia coli and in vitro, which is why they can be used to rearrange DNA in a living system (Kilby, Snaith, & Murray, 1993).

Information regarding the domain characteristics, structure and function of the serine recombinases, on the other hand, comes from studies on four prototypical systems: the Tn3 and γδ resolvases and the Hin and Gin invertases. In all four systems, the combined properties of the recombinases and their cognate DNA sites result in strict control over the outcome of recombination (Grindley, Whiteson, & Rice, 2006). Serine recombinases such as Tn3 resolvase have two spatially and functionally distinct domains: an N-terminal catalytic domain (140 amino acids), which catalyses DNA cleavage and rejoining, followed by a small helix-turn-helix DNA-binding domain (DBD; 45 amino acids), which carries the DNA-binding activity (Grindley N. , 2002).

Tn3 resolvase catalyses site-specific recombination between two identical 114 bp res sites, one present in each copy of the transposon, in a head-to-tail orientation (Fig. 2a). Each res site consists of three subsites, namely subsites I, II and III (28 bp, 34 bp and 25 bp, respectively), unequally spaced, with 22 bp separating sites I and II and only 5 bp between sites II and III (Fig. 2b). The subsites consist of 6 bp inverted repeat motifs flanking a central sequence of variable length and act as binding sites for resolvase. Two subunits of resolvase at binding site I of each res are directly involved in catalysis, and another four subunits are bound at the ‘accessory’ binding sites II and III, forming a synaptic complex (Fig. 2c) that is necessary for activation of the catalytic function and thus for recombination (Grindley, et al., 1982) (Nöllmann, Byron, & Stark, 2005).
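The layout of the res site can be checked with simple arithmetic. The sketch below lays the subsites and spacers end to end, using only the lengths quoted in the paragraph above, and derives 1-based coordinates for each element. The variable and function names are illustrative. The lengths add up to 28 + 22 + 34 + 5 + 25 = 114 bp.

```python
# Subsite and spacer lengths (bp) as quoted in the text, in 5'->3' order.
RES_LAYOUT = [
    ("site I", 28),
    ("spacer I-II", 22),
    ("site II", 34),
    ("spacer II-III", 5),
    ("site III", 25),
]


def res_coordinates(layout):
    """Return (name, start, end) tuples with 1-based inclusive coordinates."""
    coords = []
    start = 1
    for name, length in layout:
        end = start + length - 1
        coords.append((name, start, end))
        start = end + 1
    return coords


# Total length spanned by the three subsites and the two spacers.
RES_TOTAL_BP = sum(length for _, length in RES_LAYOUT)

if __name__ == "__main__":
    for name, start, end in res_coordinates(RES_LAYOUT):
        print(f"{name:<14}{start:>4} - {end:<4}")
    print("total length:", RES_TOTAL_BP, "bp")
```

With two resolvase subunits at site I and four at sites II and III, each res site carries six subunits, so a synapse of two res sites would involve twelve.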

Akopian and colleagues discovered multiple mutants of Tn3 resolvase that catalyse rapid recombination at a truncated version of res comprising just the 28 bp binding site I (Akopian, et al., 2003). It was also shown that certain combinations of activating mutations are more efficient than many single mutations and are able to promote recombination as fast as wild-type resolvase (Olorunniji & Stark, 2009). These findings make it possible to use recombinases with a shorter recognition sequence.


Figure 2. Tn3 resolvase-mediated site-specific recombination

(a) Site-specific recombination reaction mediated by Tn3 resolvase. Synapsis of two res sites in a head-to-tail orientation, followed by strand exchange, yields a 2-noded catenane resolution product. Arrowheads indicate the two res sites. (b) The wild-type Tn3 recombination site res. The boxes represent the three resolvase-binding sites: site I (28 bp), site II (34 bp) and site III (25 bp), separated by 22 bp and 5 bp, respectively. The point within site I at which resolvase breaks and rejoins the DNA is marked by a staggered line. (c) A simplified model for res synapsis, after Arnold et al. (1999). Binding sites II and III of the two res sites allow juxtaposition and subsequent catalysis of strand exchange at the two copies of site I. Double-stranded DNA is represented by thick black lines. Dark grey boxes and light grey boxes represent the res accessory sites and binding site I, respectively. Recombinase subunits are shown as shaded ovals, and dashed arrows indicate possible contacts between resolvase and DNA.

1.2.2 – Application of SSR

Site-specific recombinases have been developed as novel tools for manipulating DNA in living organisms. SSRs are in themselves sufficient to catalyse recombination between specific DNA targets of around 35 bp. Target sites of this size, unlike restriction enzyme target sites, are unlikely to occur at random in the genomes of higher eukaryotes. Therefore, these enzymes can be used both in vitro and in vivo (Kilby, Snaith, & Murray, 1993). Their ability to work like scissors makes it possible to excise segments of a chromosome, which revolutionized the deletion of exons or whole genes. In addition, the ability to control recombinase expression (e.g. via promoter-specific expression and, most importantly, via inducible systems such as fusion to an estrogen-binding domain) made it possible to obtain conditional knockouts in living mice of genes whose deletion was previously lethal (Branda & Dymecki, 2004) and to manipulate the mouse genome (Marie-Christine, Françoise, & Xavier, 2009). As a minor application, the excision of a selection marker from the genome, for conditional knockout, is also achieved through SSRs (Faix, et al., 2013). Recombinases have also been adapted to new target sites through strategies such as directed evolution (Buchholz & Stewart, 2001), which has been used to excise integrated HIV proviral DNA from the genome of infected cells (Sarkar, et al., 2007).

2 – Targeted DNA-binding proteins - zinc finger proteins

To promote the development of new methods for modifying genes in mammalian genomes, researchers have pursued the alteration of genetic information in cells via engineered enzymes. The latest approaches aim to change the specificity of enzymes that act on DNA by fusing novel DNA-binding domains (DBDs) with altered sequence recognition to catalytic proteins or domains (Bogdanove & Voytas, 2011). In theory, the newly assembled enzymes thus possess both the required catalytic activity and a specificity region that recognizes the sequence of interest. In practice, however, engineering enzymes that target specific sequences within complex genomes is a formidable challenge. One of the first programmable DBDs was based on zinc-finger (ZF) technology (Perez-Pinera, Ousterout, & Gersbach, 2012).

A ZF contains ~30 amino acids, including two invariant pairs of cysteine and histidine residues. Each finger typically recognizes three consecutive base pairs of DNA via interactions of a single amino acid side chain per base pair. The binding properties depend on the amino acid sequence of the finger domains and of the linker between the fingers. Several fingers can be linked in tandem, allowing ZF proteins to be engineered to target various effector domains in vitro (Pabo, Peisach, & Grant, 2001).

Nowadays, it is possible to engineer synthetic ZF proteins capable of binding almost any target site in the human genome (Mandell & Barbas, 2008). A synthetic ZF protein can be fused to the nonspecific DNA cleavage domain of the type IIS restriction enzyme FokI, generating a ZF nuclease (ZFN) (Fig. 3). When expressed in mammalian cells, two ZFN monomers bind in inverse orientation with an optimal spacing of 5-7 nucleotides, and the assembled ZFN cleaves the DNA, creating a double-strand break (DSB) in the genome at or near the desired target site (Maeder, Thibodeau-Beganny, & Joung, 2008). DSB repair in eukaryotes is largely accomplished by the non-homologous end joining (NHEJ) pathway, in which the two DNA ends are ligated, frequently with loss or gain of small amounts of DNA sequence, leading to short insertions, deletions or mutations (Burma, P.C, & Chen, 2006) (Weterings & Chen, 2008). In this way, ZFNs can be applied for targeted mutagenesis to obtain gene disruption.
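The geometric requirement for a ZFN pair, two binding sites in inverse orientation separated by 5-7 nucleotides, can be expressed as a simple sequence search. The sketch below is illustrative only: it interprets "inverse orientation" as the second monomer binding the opposite strand, and the 9 bp half-sites (three fingers of three base pairs each) and the test sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_zfn_pair_sites(dna, left_site, right_site, min_spacer=5, max_spacer=7):
    """Find positions where a ZFN pair could assemble on `dna` (top strand).

    `left_site` is read 5'->3' on the top strand and `right_site` 5'->3' on
    the bottom strand. Returns a list of (left_start, spacer_length) tuples.
    """
    right_on_top = revcomp(right_site)  # how the right half-site reads on top
    hits = []
    start = dna.find(left_site)
    while start != -1:
        left_end = start + len(left_site)
        for spacer in range(min_spacer, max_spacer + 1):
            if dna.startswith(right_on_top, left_end + spacer):
                hits.append((start, spacer))
        start = dna.find(left_site, start + 1)
    return hits


if __name__ == "__main__":
    LEFT = "GGGAAACCC"   # hypothetical half-site for the left monomer
    RIGHT = "TTTGGGTTT"  # hypothetical half-site for the right monomer
    locus = "AT" + LEFT + "ACGTAC" + revcomp(RIGHT) + "TT"
    print(find_zfn_pair_sites(locus, LEFT, RIGHT))
```

A search of this kind only tests the spacing rule stated in the text. It says nothing about binding affinity or off-target activity.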

A DSB can also undergo homologous recombination (HR) repair, which transfers the information missing at the break from a homologous DNA molecule (Huang, et al., 2009). When an external template with flanking homology is provided, it can serve as a donor for the new genetic information to be inserted (Fig. 3). ZFN technology, by inducing precise DSBs, has therefore been used successfully to incorporate gene sequences at specific locations in the genomes of cells from a variety of species, including human cells (Urnov, et al., 2010), to generate knock-out and knock-in alleles in mouse zygotes (Cui, et al., 2011), to increase the potential for gene correction therapy of human inherited disease (Urnov, et al., 2005) and for targeted mutagenesis in plants (Townsend, et al., 2009).


Figure 3. Scheme for the case of ZFN cleavage

A pair of ZFNs bound to DNA is shown at the top. ZF domains are depicted as coloured boxes, with short vertical lines indicating the main contacts with the DNA base pairs. FokI cleavage domains are shown as red ovals, with the common cleavage sites, and the linker peptides between the two domains of each subunit are labelled. ZFNs efficiently create DSBs in chromosomal DNA as a result of the cut by FokI. These DSBs can be repaired either by NHEJ, producing an indel (represented by a star) near the spacer region of the ZFN binding site and thus targeted mutagenesis (e.g. gene knock-out), or by HR when a homologous template is available, which restores the wild type or, with a donor template, yields targeted gene replacement. DSB, double-strand break. NHEJ, non-homologous end joining. HR, homologous recombination.

Inspired by the success of ZFN technology, researchers have recently developed ZF recombinase (ZFR) technology, driven by the fact that SSRs mediate efficient and precise integration, deletion or inversion of defined DNA segments. As the name suggests, a ZFR is an engineered fusion, via a short flexible linker peptide, between the DNA-binding domain of a ZF protein and the catalytic domain of a serine recombinase (Gaj, et al., 2011). Recombination catalysed by a ZFR occurs specifically at ‘Z-sites’, which consist of two pairs of sequence motifs recognized by the ZF DBDs, flanking a central sequence, named the spacer, that is bound by the recombinase catalytic domains (Akopian, et al., 2003). Prorocic and colleagues (2011) demonstrated the importance of the linker connecting the two domains, and of the size of the recombinase portion, for efficient recombinase activity. They proposed that when the linker is not long enough, the interaction between the two domains and the Z-site does not occur (Prorocic, et al., 2011).

ZFRs promise valuable applications in biotechnology. Some authors suggest that integrated proviral DNA from retroviruses (e.g. HIV) could be deleted by recombination between identical sequences in the long terminal repeats (LTRs) that flank the provirus (Proudfoot, et al., 2011). Others suggest that ZFRs can be applied to integrate a transgene at a specific site in the genome of an organism (Prorocic, et al., 2011).

ZF-based technologies (ZFN, ZFR), although very elegant and promising, face certain technical and biological obstacles. Among them are the levels of off-target activity and the imperfect modularity of particular domains, with certain combinations of ZF domains being less specific or less efficient in DNA binding. The construction of ZF proteins is also difficult: assembling a sequence-specific ZF array is labor intensive and time consuming (Kim, et al., 2009) (Gupta, et al., 2012), and is associated with high rates of failure (Ramirez, et al., 2008). This has led scientists to search for a new technology that can be designed, assembled and used in most molecular biology laboratories.

3 – Transcription activator-like effectors (TALEs) - a novel DNA binding domain

Another recently developed and promising genetic engineering tool is based on TALE technology. Transcription activator-like effectors, also known as TALEs, constitute a class of naturally occurring DNA-binding proteins found in the plant pathogen Xanthomonas sp. TALEs are delivered to the nucleus of the host cell during infection, bind to effector-specific sequences in host gene promoters and activate transcription of the corresponding genes (Kay, et al., 2007) (Römer, et al., 2007).

A TALE is composed of multiple domains, with its DNA-binding domain built as an array of repeats. This domain consists of a series of repeats of 33 to 35 amino acids each that selectively bind a DNA target sequence (Fig. 4). Two independent labs found that the repeats are identical with the exception of two hypervariable residues at positions 12 and 13, called ‘repeat-variable di-residues’ (RVDs), which dictate the specificity of the corresponding repeat for a single nucleotide (Boch, et al., 2009) (Moscou & Bogdanove, 2009). The four most common RVDs (NI, HD, NG and NN) are preferentially associated with the four DNA bases (A, C, T and G, respectively). Once the TALE binding code was deciphered, the modular organization of the TALE DNA-binding domain was used to assemble “designer” DNA-binding domains, similar to ZF technology. Individual TALE repeats can be joined to produce DBDs capable of recognizing heterologous DNA sequences, including endogenous sequences in mammalian cells. As a result, TALE proteins have recently been employed as DNA targeting tools. In particular, TALEN technology has shown the potential to rival the existing ZF technology (Scholze & Boch, 2011).
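The RVD code described above can be illustrated with a minimal Python sketch. It uses only the four RVD-to-base associations named in the text and decodes the RVD array drawn in Figure 4. The function name `decode_rvds` is an illustrative assumption, and the mapping reflects preferences rather than strict one-to-one rules.

```python
# Preferential RVD-to-base associations stated in the text (Boch et al., 2009;
# Moscou & Bogdanove, 2009). These are preferences, not strict rules.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def decode_rvds(rvds):
    """Return the DNA sequence predicted for an ordered list of RVDs."""
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)


# RVD array shown in Figure 4, split into di-residues.
figure4_rvds = "HD NI NG HD NN HD HD NI HD NG NI HD HD NN NG".split()

print(decode_rvds(figure4_rvds))  # CATCGCCACTACCGT
```

The decoded sequence matches the 15-nucleotide target drawn under the RVD array in Figure 4.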

Figure 4. TALE-protein structure

Structure of a TALE. Repeats are shown as blue boxes. A consensus repeat is shown with the RVD underlined. NLS, nuclear localization signal(s); AD, transcriptional activation domain; N-, N-terminal domain; C-, C-terminal domain. [Adapted from Cermak, et al., Nucleic Acids Research, 2011]


3.1 – TALE-nucleases for targeted DNA

Recently, TALENs have emerged as an alternative to ZFNs for introducing targeted DSBs and have become an important new tool for genome engineering. Furthermore, recent large-scale tests demonstrated the ability of TALENs to introduce different genome alterations in a wide range of organisms and cell types (Joung & Sander, 2012).

TALENs consist of a fusion between an engineered TALE array and the catalytic domain of the FokI endonuclease, similar to the ZFN design (Fig. 5). Given the modular nature of the TALE DBD, repeats can be assembled into arrays that target a desired DNA sequence using methods such as the ‘Golden Gate’ cloning strategy (Cermak, et al., 2011). Golden Gate assembly ligates several modules at once in a single step and is therefore a robust and rapid method; using a two-step protocol, a TALEN can be assembled in 5 days. Recently, a ‘FLASH’ assembly system (Reyon, et al., 2012) was developed as a method for large-scale assembly of TALENs.

Since FokI is only active as a dimer, these artificial nucleases work as pairs in which two monomers bind to opposing targets across a spacer of approximately 17 bp, over which the FokI domains dimerize and cleave the DNA, introducing a DSB. As with ZFNs, the chromosome break triggers either HR or gene targeting (Krogh & Symington, 2004), when a repair template is available to repair the break in the DNA, or NHEJ (Lieber, 2008), which often introduces short DNA insertions or deletions that create a targeted gene knockout (Fig. 3).

A study published at the beginning of 2013 by Kim and co-workers presented the results of a pilot test of a genome-scale collection of TALENs for gene targeting in human cells involving 124 genes (Kim, et al., 2013). The results showed that all tested TALENs were active and disrupted their target genes at high frequencies, although two of these TALENs became active only after their target sites were partially demethylated using an inhibitor of DNA methyltransferase. Furthermore, the authors developed a scalable Golden Gate assembly system consisting of a total of 432 plasmids and used it to construct a TALEN library, available to all researchers at http://www.talenlibrary.net/, which contains a collection of 18,740 TALEN pairs designed to disrupt or modify every protein-coding gene in the human genome. In comparison with ZF technology, TALEN technology shows a higher success rate, around 64–88%.

TALE technology promises to facilitate and enhance genetic manipulation in different cell types and organisms. Moreover, the simplicity and robust success rates of TALENs (Reyon, et al., 2012) have already spurred much broader adoption of genome-editing technology.


Figure 5. TALEN structure and mediated genome editing in human cells

Structure of a TALEN. TALENs bind and cleave as a dimer on a target DNA site. Cleavage by the FokI nuclease domains occurs in the ‘spacer’ sequence that lies between the two TALEN monomers bound to the DNA. RVDs, repeat-variable di-residues; NLS, nuclear localization signal(s); AD, transcriptional activation domain; N-, N-terminal domain; C-, C-terminal domain. [Adapted from Cermak, et al., Nucleic Acids Research, 2011]

Both types of designer nucleases, ZFNs and TALENs, have the invaluable feature of customizable DNA binding. For in vivo applications, such technologies must demonstrate strict specificity toward their intended DNA targets, and both should be applied cautiously in organisms. The foreseeable source of risk is that complex genomes contain multiple copies of sequences that are identical or highly homologous to the desired DNA target, leading to off-target activity and cellular toxicity. Additionally, repair of DSBs relies on cell machinery that varies with cell type. These issues can be approached through optimization of cleavage specificity, for example by generating TALEN heterodimers with reduced toxicity (Gaj, Gersbach, & Barbas, 2013). Alternatively, safer targeted genomic modifications can be approached by merging the success of the modular assembly of TALE technology with highly specific enzymes such as the SSRs.

3.2 – Chimeric TALE recombinases

At present, both ZF- and TALE-nuclease technologies are predominantly used for genome mutagenesis, which inactivates the gene in the cellular context. Designer nucleases also enable the introduction of novel sequences via the HR repair pathway, but the efficiency is significantly lower than for indels. However, nuclease technology does not allow the conditional knockout of embryonic lethal genes in vivo, as classical SSRs do. Therefore, developing a designer recombinase similar to the designer nucleases would be the next step in advancing genome technologies.

Site-specific recombinases have proved to be reliable and widely used engineering tools to manipulate DNA in vitro and in vivo. Because SSRs possess the unique ability to carry out three types of reaction (excision, integration and inversion) and because of their high specificity, expanding their ability to recognize new targets is one of the most intriguing and challenging tasks. There have been several attempts to alter the specificity of Cre-like recombinases (Buchholz & Stewart, 2001) (Santoro & Schultz, 2002). To date, however, altering the specificity of many SSRs has proven difficult (Grindley, Whiteson, & Rice, 2006) (Sarkar, et al., 2007).

[Figure 5 labels: N, C, NLS, AD, DNA binding domain; target site 5’-ACGTCATCGCCACTACCGTATGGTGTGCTCATAGTTGTGGTTTGTCTAGTACC-3’ / 3’-TGCAGTAGCGGTGATGGCATACCACACGAGTATCAACACCAAACAGATCATGG-5’]


In the light of modular design approaches that rely on the fusion of a DBD and a catalytic domain, serine recombinases have a more convenient protein structure. Serine recombinases function as two-domain enzymes with distinctly separated binding and cleavage functions. Moreover, mutants of several serine recombinases have been identified that do not require accessory factors for recombination. Altogether, the serine class can be proposed as the preferred recombinases for modular engineering.

Replacing the DBD of a serine recombinase with a custom-designed DBD allows recombination at a specific locus within the genome in bacteria or mammalian cells. Using ZFR technology, Prorocic and co-workers showed the need for further in vitro optimization due to drawbacks of ZF assembly (Prorocic, et al., 2011). The main issues associated with ZFRs are the lack of ZF domains capable of recognizing every DNA triplet and the level of off-target activity. A potential solution to these problems of ZFR technology might be the use of TALEs instead of ZF proteins, because of their simpler assembly and higher binding efficiencies.

Mercer, Gaj, Fuller, & Barbas carried out the first attempt to create a chimeric TALE recombinase with programmable DNA sequence specificity (Mercer, et al., 2012). They engineered a fusion between a hyperactive catalytic domain from the DNA invertase Gin and a TALE, using a library of TALE variants to identify TALER fusions that modify DNA. This study is an ideal starting point for future work. Developing an active TALER, however, requires further optimization to increase activity in bacteria and mammalian cells. An illustrative example of a TALER is shown in Figure 6.

Figure 6. Fusion orientation of a TALER

TALER structure. The recombinase (pink balloon) is linked to the TALE DNA-binding domain (DBD), forming the TALER. The TALE binds to a specific sequence in the DNA, allowing the recombinase to dimerize and recombine over a ‘spacer’ sequence that lies between the two TALER monomers. Amino acid linkers between the two domains of each subunit are labelled. Two arrows indicate the spacer region. N-, N-terminal domain; C-, C-terminal domain. [Adapted from Mercer et al., Nucleic Acids Research, 2012]

In conclusion, the novel TALE technology has immense potential in existing and emerging techniques for genome engineering, as it allows the specificity of DNA-acting enzymes to be modified. Creating customized recombinases, in turn, would have an immense impact on the genome engineering field. Among all known recombinases, the structural properties of the serine recombinases make them a promising starting point for successful fusion recombinases. More detailed knowledge about TALEs, as well as TALENs, can point the way to functional and reliable designer recombinases.

[Figure 6 labels: TALE DBD, Linker, Spacer; N- and C-termini; 5’ and 3’ strand ends]

Chapter II – Objectives


Targeted genome engineering has become an important research area for diverse disciplines, with site-specific recombinases (SSRs) being amongst the most reliable and efficient tools for genome engineering. With the discovery of a new class of specific DNA-binding proteins, the transcription activator-like effectors (TALEs) from the plant pathogen Xanthomonas spp., it has become possible to retarget DNA-binding proteins in a relatively simple manner. A combination of the two would create a unique genetic tool that enables sophisticated DNA modifications based on excision, inversion and insertion of DNA at any desired position in the genome.

Any customized TALE repeat array can be inserted into the TALER architecture, thus dramatically expanding the targeting capacity of engineered recombinases for applications in biotechnology and medicine.

The aim of this master thesis was to design and experimentally test the architecture of a TALE recombinase (TALER). To accomplish this aim, the project was divided into two goals:

Design and assembly of a custom-made TALE nuclease (TALEN) to target the HOT1 gene in the human genome. A functional TALE domain with a known specificity is necessary for further fusion with the recombinase and therefore needs to be tested beforehand. This can be done in a fusion with a nuclease.

Design and assembly of a TALE recombinase (TALER) that can be used to recombine DNA in bacteria and mammalian cells. Different designs were tested and compared with existing results with respect to their efficiency, recombination capacity and the safety of the TALER technique.


Chapter III – Materials and Methods

1 – Materials

The Buchholz group, from the Biotechnology Center of the TU Dresden (BIOTEC Institute), provided all the material necessary for the development of this master thesis.

1.1 – Enzymes, reagents and kits

All restriction enzymes were purchased from New England BioLabs (NEB) unless otherwise indicated. Taq DNA polymerase, Phusion DNA polymerase, deoxyribonucleotides (dNTPs), 10X PCR buffer, MgCl2 and DMSO were purchased from Bioline.

Agarose was purchased from Invitrogen Life Technologies and all cell culture reagents (DMEM, FCS, Pen-Strep and Trypsin) were bought from Gibco. Antibiotics were acquired from Sigma and arabinose was bought from Sigma-Aldrich. Gel extraction was performed using the QIAquick Gel Extraction Kit (cat#28706); PCR purification was performed with the QIAquick PCR Purification Kit (cat#28106); genomic DNA from HeLa and U2OS cells was isolated with the QIAamp DNA Blood Mini Kit (cat#51106); plasmid DNA extraction was performed using the QIAGEN Plasmid Mini Kit (cat#12125); high-purity plasmid DNA isolation was performed with the QIAGEN Plasmid Maxi Kit (cat#12165) or the PureYield™ Plasmid Midiprep System (Promega), all according to the manufacturers’ protocols.

1.2 – Bacterial reagents

All media for bacterial maintenance (LB liquid medium, LB-agar plates and SOC medium) were provided by the institute kitchen.

1.3 – Plasmids

The Golden Gate TALEN and TAL Effector Kit 2.0 (cat#1000000024), published in its original form in Cermak, et al. (2011), and the commercially available APC-TALEN_left (ID36727) and APC-TALEN right (ID36728) plasmids [Reyon, et al. (2012)] were purchased from the Addgene plasmid repository (www.addgene.org).

A HOT1-TALEN cassette was commercially synthesized (Invitrogen) and cloned into the mammalian expression vector pcDNA3.1(-) (Invitrogen) using two AflII restriction sites.

1.4 – Synthetic oligonucleotides

All oligonucleotides used in this work were synthesized by Biomers (Germany). Primer sequences are provided in Supplements Table 1. All primers used during custom TALEN assembly were supplied with the Golden Gate TALEN assembly kit (sequences in Supplements Table 1).

1.5 – Sequencing

All sequencing was performed on request by the sequencing facility of the Max Planck Institute of Molecular Cell Biology and Genetics (MPI-CBG).

2 – Methods

2.1 – TALEN construction

The RVD-containing repeats were assembled into TALENs using the Golden Gate TALEN assembly protocol from Cermak, et al. (2011). Assembly of a custom TALEN is accomplished in a 5-day protocol (Supplements, Table 2) and is based on the use of Type IIS restriction endonucleases, such as BsaI and Esp3I, which introduce unique cohesive ends on DNA fragments so that they can be assembled in a precise, sequential order.

2.1.1 – Golden Gate Assembly Protocol

Day 1: The first 10 RVD modules are selected, then cut and ligated together in a single 20 μl reaction containing 1 μl BsaI (10 U) and 1 μl T4 DNA ligase (2000 U) in T4 DNA ligase buffer. The reaction is incubated in a thermocycler for 10 cycles of 5 min at 37 ºC and 10 min at 16 ºC, then heated to 50 ºC for 5 min and to 80 ºC for 5 min. This is followed by treatment with 1 μl 25 mM ATP together with 1 μl Plasmid Safe DNase (10 U, Epicentre). The mixture is incubated at 37 ºC for 1 h and then used to transform E. coli. Cells are plated on LB agar containing 50 μg/ml spectinomycin, with X-gal and IPTG for blue and white screening of recombinants. Colonies are blue when the vector contains no insert DNA and white when the vector contains a fragment of cloned DNA.
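As a quick check of the Day 1 thermocycler program, the sketch below adds up the programmed hold times. The step values are those given in the text; the data layout and the name `program_minutes` are illustrative assumptions, and ramp times between temperatures are ignored.

```python
# Day 1 digestion-ligation program as (temperature in degrees C, minutes) holds.
CYCLE = [(37, 5), (16, 10)]       # block repeated in every cycle
N_CYCLES = 10
FINAL_STEPS = [(50, 5), (80, 5)]  # holds after the last cycle


def program_minutes(cycle, n_cycles, final_steps):
    """Total programmed hold time in minutes (ramping time not included)."""
    per_cycle = sum(minutes for _, minutes in cycle)
    return n_cycles * per_cycle + sum(minutes for _, minutes in final_steps)


total = program_minutes(CYCLE, N_CYCLES, FINAL_STEPS)
print(total)  # 160
```

Under these assumptions the program holds for 160 min, before the additional 1 h Plasmid Safe DNase treatment.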

Day 2: After overnight (ON) growth, pick three white colonies from each transformation and check each clone by colony PCR with primers pCR8_F1 and pCR8_R1 under these conditions: anneal at 55 ºC, extend for 1.75 min, 35 cycles. Check by agarose gel electrophoresis for the expected size of 1.2 kb. Start a liquid ON culture at 37 ºC with the correct clones.

Day 3: Isolate plasmid DNA using the QIAGEN Plasmid Mini Kit and identify clones with correct arrays by restriction enzyme digestion and agarose gel electrophoresis. Restriction screening was performed in a 20 μl reaction using AflII and XbaI to cut out the array of fused repeats. Digestion was carried out for 2 h at 37 ºC. After heat inactivation (65 ºC, 20 min), digested products were run on a 0.8 % agarose gel for 1 h at 90 V, the desired band (around 1048 bp) was excised from the gel and DNA was extracted using the QIAquick Gel Extraction Kit (QIAGEN) following the manufacturer’s instructions. DNA quantification was performed using a NanoDrop 8000 (Thermo Scientific).

The next step is to join the intermediary arrays, along with the last repeat. RVDs 11-14 are selected, together with the plasmid for the 15th (last) RVD. Arrays are ligated in the correct order in a 20 μl digestion and ligation reaction prepared as on the first day. Once assembled, the RVDs were cloned into a pTAL3 destination vector with the appropriate TALEN backbone and used to transform E. coli cells. Cells are plated on LB agar containing 100 μg/ml ampicillin for selection of transformants, with X-gal and IPTG for blue and white screening of recombinants.
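The two-step layout described above (repeats 1 to 10 in the first array, repeats 11 to 14 in the second, and a separate last repeat) can be sketched as a simple partition of the RVD list. This is only an illustration for a 15-RVD TALE: the function name `split_for_assembly` is hypothetical, and the example RVDs are those of the array drawn in Figure 4, not those of the HOT1-TALEN.

```python
def split_for_assembly(rvds):
    """Split a 15-RVD list into (repeats 1-10, repeats 11-14, last repeat)."""
    if len(rvds) != 15:
        raise ValueError("this sketch only covers the 15-repeat case in the text")
    return rvds[:10], rvds[10:14], rvds[14]


# Example array taken from Figure 4 (not the HOT1-TALEN).
rvds = "HD NI NG HD NN HD HD NI HD NG NI HD HD NN NG".split()
first_array, second_array, last_repeat = split_for_assembly(rvds)

print(first_array)   # repeats 1-10
print(second_array)  # repeats 11-14
print(last_repeat)   # repeat 15
```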

Day 4: After ON growth, pick three white colonies from each transformation and check each clone by colony PCR with primers TAL_F1 and TAL_R2 under these conditions: 95 ºC for 10 min, then 35 cycles (95 ºC, 30 sec denaturation; 55 ºC, 30 sec annealing; 72 ºC, 3 min extension). Check by agarose gel electrophoresis for the expected size of 1.2 kb. Start a liquid ON culture at 37 ºC with the correct clones.

Day 5: Isolate the plasmid DNA vectors containing the final full-length TALEN monomers. Array length was verified by sequencing at the sequencing facility of the Max Planck Institute of Molecular Cell Biology and Genetics (MPI-CBG) with primers SeqTALEN_5-1 and TAL_R2.

2.1.2 – Electrocompetent cells

All modifications of the strains were performed by electroporation of plasmid DNA into electrocompetent cells. Electrocompetent E. coli DH5α cells were prepared as follows: 1 ml of an ON culture was diluted in 100 ml L-broth with the appropriate antibiotics and grown at 37 °C until an OD600 of 0.6 was reached. All subsequent steps were done at 4 °C in the cold room. After chilling in an ice-water bath for 20 min, the bacterial culture was transferred to pre-chilled 50 ml Falcon tubes and centrifuged (4000 g, Sorvall centrifuge, 15 min). The supernatant was discarded and the pelleted cells were resuspended in 25 ml of ice-cold water by gently pipetting up and down and centrifuged again as before. The cells were then resuspended in 10 ml of 10 % glycerol and centrifuged again. The final pellet was resuspended in 2 ml glycerol, and 125 μl aliquots were frozen in liquid nitrogen and stored at -80 °C.

2.1.2.1 – Transformation

For transformation of competent cells, aliquots were thawed on ice and mixed with ligation mix or plasmid, not exceeding 1/10 of the volume of competent cells. The mix was pipetted into an electroporation cuvette with a 1 mm gap (Gene Pulser and MicroPulser Cuvettes, Bio-Rad), avoiding bubbles and empty spaces between the metal sides, and electroporated at 1350 V (Electroporator 2510, Eppendorf). The electroporation time constant should be around 4-6. Cells were then diluted in 1 ml of LB medium and grown for 1 h at 37 °C with shaking at 950 rpm in a thermoshaker. After 1 h of recovery growth, 50 μl were plated on L-agar plates pre-warmed to 37 °C and containing the appropriate antibiotics. Plates were incubated at 37 °C ON.

2.1.3 – Optimized transfection protocol for U2OS cells

U2OS cells were cultured and transfected in duplicate with a total of 2 μg of HOT1-TALEN plasmids in 6-well dishes using the Amaxa® Cell Line Nucleofector® Kit V, according to the manufacturer’s protocol. 2 μg of pmaxGFP® Vector, supplied with the kit, and 2 μg of Venus-H2B plasmid were transfected as positive controls for transfection efficiency. The cells were maintained in DMEM with 4.5 g/l D-glucose supplemented with 10 % Fetal Bovine Serum, 100 U/ml penicillin and 100 μg/ml streptomycin at 37 °C and 5 % CO2 in a HERAcell incubator (Thermo Electron Corporation). The cells were grown to ~70-80 % confluence. Cells were collected 72 h after transfection and genomic DNA was isolated using the QIAamp DNA Blood Mini Kit (QIAGEN).

2.1.3.1 – PCR amplification and sequence verification

The targeted locus was amplified from isolated genomic DNA with primers DK628 and DK629 (listed in Supplements Table 1). PCR was performed with Phusion high-fidelity DNA polymerase according to the manufacturer’s instructions under optimized conditions of 35 cycles (98 ºC, 30 sec denaturation; 60 ºC, 30 sec annealing; 72 ºC, 20 sec extension). 1.5 mM MgCl2 was added to all reactions. PCR products were analysed for correct size on a 2 % agarose gel. Fragments were purified with the QIAquick PCR Purification Kit (QIAGEN) according to the manufacturer’s instructions.

2.1.4 – Optimized conditions for T7 Endonuclease 1 assay

For treatment with T7E1, 200 ng of purified PCR product in a final volume of 10 μl was denatured and reannealed in NEBuffer 2 (NEB) in a thermocycler with the following protocol: 95 ºC, 5 min; 95-85 ºC at -2 ºC/s; 85-25 ºC at -1 ºC/s; hold at 4 ºC (Reyon, et al., 2012). Hybridized PCR products were treated with 10 U of T7E1 at 37 ºC for 20 min in a reaction volume of 20 μl. 10 μl of the total reaction was loaded together with 6x orange loading dye on a 2 % agarose gel and run for 40 min at 90 V.

2.1.5 – Agarose Gel electrophoresis

PCR products and plasmids were resolved on an agarose gel of appropriate concentration (Invitrogen GmbH) supplemented with 0.5 μg/ml ethidium bromide (Sigma-Aldrich Chemie GmbH). Images of the gels were acquired with a Gel Doc System (Bio-Vision) or Gel Doc 2000 (Bio-Rad). If necessary, the desired bands were excised from the agarose gel and DNA was extracted using the QIAquick Gel Extraction Kit (QIAGEN) following the manufacturer’s protocol.


2.2 – Chimeric Recombinase

2.2.1 – Cloning protocol of the Tn3 catalytic domain into pEVO vector

The pEVO vector and the pMA-T vector (containing Tn3 resolvase) were digested in an 80 μl reaction using BsrGI and XbaI and the respective buffer (NEB) for 3 h at 37 ºC. The digested product was run on a 0.8 % agarose gel at 90 V for 1 h. The desired band was cut from the gel under long-wavelength UV light and gel purified. Ligation was carried out by mixing vector and insert (1:3 proportion) in 20 μl of ligation buffer containing 1 U of T4 DNA ligase (NEB) according to the manufacturer’s protocol. The ligation was incubated for 1 h at 16 ºC. The ligation mix was heat inactivated at 65 ºC, electroporated into electrocompetent E. coli DH5α cells and plated on 25 mg/ml Cm plates. The resulting pEVO-TN3 plasmid DNA was extracted from liquid cultures of selected clones after ON incubation at 37 ºC. The correct sequence was confirmed by sequencing.
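The 1:3 vector-to-insert proportion can be turned into DNA masses as sketched below. The sketch assumes that the proportion is a molar ratio, which the text does not state explicitly; the fragment sizes and the 50 ng vector amount in the example are hypothetical placeholders, not values from this work.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector=3.0):
    """Mass of insert (ng) that gives the requested insert:vector molar ratio.

    For double-stranded DNA, the number of molecules is proportional to
    mass / length, so insert_ng = vector_ng * ratio * insert_bp / vector_bp.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    return vector_ng * insert_to_vector * insert_bp / vector_bp


# Hypothetical example: 50 ng of a 5000 bp vector and a 500 bp insert.
print(insert_mass_ng(50, 5000, 500))  # 15.0
```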

2.2.2 – Cloning protocol of the TALE truncations into pEVO-TN3 plasmid

TALE truncations were constructed via PCR amplification using primers Δ120-TALE_up, Δ152-TALE_up, Δ180-TALE_up and C63-TALE-low. PCR fragments were purified and digested with SpeI and PacI in the respective buffer (NEB) for cloning into the SpeI and PacI sites of the pEVO-TN3 vector, which was digested with the same enzymes. After 3 h at 37 ºC, the vector was run on a 0.8 % agarose gel (90 V, 1 h), the desired 4.7 kb vector backbone was gel purified and the three truncated inserts were column purified. Ligation and transformation were carried out according to the cloning protocol described previously.

Colony PCR was used to screen a large number of transformants for the presence of the insert using primers that recognize the backbone plasmid. Primers P1_up and P2_low were used, and colony PCR was carried out using TaqRed DNA polymerase according to the manufacturer’s instructions: 95 ºC for 10 min followed by 20 cycles (95 ºC, 30 sec denaturation; 60 ºC, 30 sec annealing; 72 ºC, 2.5 min extension). For each design, a clone giving a band of approximately 7 kb was selected and confirmed by sequencing.


Chapter IV – Results and Discussion

Part I – Design and assembly of a custom-made TALEN to target the HOT1 gene in the human genome

1 – TALEN design

According to the literature (Mussolino & Cathomen, 2012) (Li, et al., 2011), TALENs work in pairs, so a left TALEN and a right TALEN are needed, separated by a spacer over which the FokI domains dimerize and cleave the target locus.

Individual TALE repeats can be joined to produce a DNA-binding domain capable of recognizing endogenous sequences in mammalian cells. In the present work, I have built a TALEN architecture that targets the homeobox telomere-binding protein 1 gene, also called HOT1 (NCBI Gene ID: 79618), in the human genome. I have evaluated the utility of the HOT1-TALEN for driving targeted gene modifications in mammalian cells in comparison with an efficient, commercially assembled TALEN.

In a recent study, HOT1 was found to associate with the active telomerase complex and to promote the chromatin association of telomerase. This suggests that HOT1 physically interacts with both entities and may contribute to the association of telomerase with telomeres and to telomere length maintenance. Collectively, HOT1 supports telomerase-dependent telomere elongation. Therefore, interference with the interaction between the telomerase complex and telomeric repeats via HOT1 might provide an interesting target for the development of novel therapeutics (Kappei, et al., 2013).

1.2 – HOT1-TALEN binding sites and spacer regions

Because TALENs function as dimers, a pair of TALENs targeting the HOT1 protein-coding gene in the human genome was assembled using the Golden Gate assembly method for TALEN engineering from Cermak et al. (2011). The TALEN binding orientation is shown in Figure 7a.

The target sequences of the left and right TALENs are located on opposite strands of the DNA and are shown in Figure 7b. A 15-nucleotide sequence immediately after a T was chosen for binding by the left TALEN (5’-TGATCCGCCTCATGT-3’). Another 15-nucleotide sequence that also follows a T in the 5’→3’ direction on the complementary strand was selected for binding by the right TALEN (5’-ACTTACACCACTGGA-3’). A 19 bp DNA spacer separates the left and right TALEN-binding sequences (5’-AAAGTATGCTTAGTTCCTT-3’); over it the FokI domains dimerize and cleave the DNA, creating a DSB. The spacer lies in the exon of the HOT1 gene that contains the start codon (ATG) (Fig. 7b).
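The geometry of the target site can be checked with a few lines of Python. The three sequences are copied from the text; the RVD list printed for the left binding site is only what the simple four-RVD code (NI, HD, NG, NN) would predict and is an assumption, not the RVD array reported for the assembled HOT1-TALEN.

```python
BASE_TO_RVD = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}

left_site = "TGATCCGCCTCATGT"    # left TALEN binding sequence, as written in the text
spacer = "AAAGTATGCTTAGTTCCTT"   # spacer between the two binding sites
right_site = "ACTTACACCACTGGA"   # right TALEN binding sequence, as written in the text


def predicted_rvds(site):
    """RVDs predicted for a binding site under the simple four-RVD code."""
    return [BASE_TO_RVD[base] for base in site]


print(len(left_site), len(spacer), len(right_site))  # 15 19 15
print(spacer.find("ATG"))                            # 5 (0-based position of the ATG)
print(" ".join(predicted_rvds(left_site)))
```

The lengths agree with the 15-nucleotide binding sites and the 19 bp spacer described above, and the spacer contains a single ATG.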

The DNA-cleavage domain of the FokI endonuclease was chosen due to its well-documented non-specific catalytic activity when linked to other DNA-binding domains, such as ZF proteins. Also, the nucleotide ‘T’ that precedes the TALE target sites is essential for target gene activation (Li, et al., 2011).


Figure 7. Scheme of the principle and target sequence of the HOT1-TALEN

(a) Structure of the assembled TALE nuclease. Two monomeric TALENs must bind the target site to allow efficient dimerization of the FokI nuclease domains, represented by red ovals, and successful DNA cleavage in the spacer between the TALE binding sites (TALE left and TALE right). (b) HOT1-TALEN target site sequence. Black bold letters represent the right and left TALEN binding sites. The spacer, the sequence between the two binding sites, contains the ATG start codon of the HOT1 gene, indicated in red.

2 – In vivo assay of TALEN activity

2.1 – Expression of custom HOT1-TALEN in human cells

To validate the activity of the custom TALEN, the HOT1-TALENs were subcloned into the mammalian expression vector pcDNA3.1(-) from Invitrogen (Supplements Figure 1) using two AflII sites, in order to carry out targeted mutagenesis in human osteosarcoma cells (U2OS).

Plasmids encoding the HOT1-TALENs and a commercially assembled APC-TALEN pair (Reyon, et al., 2012), chosen as a positive control for TALEN activity, were transfected into U2OS cells. I optimized the transfection conditions using the Cell Line Nucleofector® Kit V in order to obtain the highest transfection efficiency (the optimized conditions are described in Materials and Methods, section 2.1.3).

Images of the optimized transfection, taken 72 h after transfection, are shown in Figure 8. Cells were transfected with the pmaxGFP® Vector, which encodes a green cytoplasmic fluorescent protein, is supplied with the kit and was used for transfection optimization. After 72 h, this vector showed 100 % transfection efficiency.


Both the self-assembled and the commercial TALENs have nuclear localization signals; therefore the Venus-H2B vector, which localizes the Venus fluorescent protein to the nucleus, was tested and further used as a transfection control. After 72 h, cells transfected with Venus-H2B showed approximately 95 % transfection efficiency.

Figure 8. Transfection efficiency of U2OS cells

U2OS cells were transfected with the Cell Line Nucleofector® Kit V, Program X-001, with (a) 2 μg of pmaxGFP® Vector as kit control and (b) (c) 2 μg of Venus-H2B as transfection efficiency control. Cells were analyzed 72 hours post Nucleofection® by bright-field and fluorescence microscopy at 10x and 20x magnification. A merged image of the two fields was taken for Venus-H2B transfected cells at 20x magnification, in which the nuclei of the cells can be seen marked with green fluorescence.

2.2 – T7 endonuclease I (T7E1) assay for detection of mutagenesis rate

The activity of the HOT1-TALEN was tested at the intended endogenous gene target by measuring whether it creates mutations through the NHEJ pathway. A DSB induced by TALENs and repaired by NHEJ often results in small insertions or deletions (“indel” mutations) near the DSB site. These indel mutations can be detected by treating amplified DNA fragments with the mismatch-sensitive T7 endonuclease I (T7E1).

The DNA segments encompassing the TALEN recognition sites were amplified by PCR. PCR conditions were optimized using Phusion high-fidelity DNA polymerase (conditions are described in Materials and Methods, section 2.1.3.1) in order to obtain one specific band of 300 bp.

The DNA amplicons were melted and reannealed. Because a pool of modified and non-modified cells was used for PCR amplification of the target region after treatment with TALENs, the amplicons contain both WT and mutated DNA sequences. Therefore, hybridization of the PCR products results in the formation of heteroduplexes. When heteroduplex DNA is treated with T7 endonuclease, the mismatch positions are recognized and cleaved (Figure 9).

Figure 9. In vivo TALEN-induced genome editing

Schematic of the T7E1 assay used to determine TALEN activity. gDNA is isolated and used to amplify the TALEN target region from a heterogeneous population; gPCR products are rehybridized slowly to generate Ht. T7E1 cuts only Ht, whereas Hm are left intact. A schematic of an idealized gel result after treatment with T7E1 is shown. The sizes of the bands depend on the position of the indel in the amplicon, and TALEN activity is calculated from the fraction of cleaved DNA. Ht, heteroduplexes; Hm, homoduplexes; g, genomic.
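The caption states that TALEN activity is calculated from the fraction of cleaved DNA but does not give the formula. A commonly used estimate, shown here as an assumption rather than as the calculation used in this work, converts the cleaved fraction f into an indel frequency as 1 - sqrt(1 - f), which accounts for random re-pairing of wild-type and mutant strands during reannealing. The band intensities in the example are hypothetical.

```python
import math


def cleaved_fraction(uncut, cut_bands):
    """Fraction of the total band intensity found in the cleavage products."""
    cut = sum(cut_bands)
    total = uncut + cut
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return cut / total


def indel_frequency(f_cleaved):
    """Estimated mutant allele fraction from the cleaved fraction.

    Assumes mutant and wild-type strands re-pair at random, so that the
    uncleaved fraction is approximately (1 - indel) ** 2.
    """
    if not 0.0 <= f_cleaved <= 1.0:
        raise ValueError("cleaved fraction must lie between 0 and 1")
    return 1.0 - math.sqrt(1.0 - f_cleaved)


# Hypothetical densitometry values (arbitrary units).
f = cleaved_fraction(uncut=64.0, cut_bands=[20.0, 16.0])
print(round(f, 2))                   # 0.36
print(round(indel_frequency(f), 2))  # 0.2
```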

Other functional assays, such as plasmid-based reporter constructs, the loss of restriction sites destroyed by NHEJ, or other enzymes that detect DNA mismatches, may also be used to validate TALE activity (Sanjana, et al., 2012) (Carlson, et al., 2012).

Although the T7E1 assay is standard and has already been described elsewhere (Reyon, et al., 2012) (Kim, et al., 2009), functional characterization is integral to TALE production, and therefore I optimized the assay conditions (optimized conditions are described in Materials and Methods, section 2.1.4).

For that reason, the efficacy of the assay was first tested on plasmid DNA. I tested three plasmids carrying the same sequence, except that two of them each differ from the WT by a single point mutation at a different position; these were named Point Mutant 1 (PM1) and Point Mutant 2 (PM2), respectively (Figure 10a).

[Figure 9 panel labels: gDNA preparation and PCR amplification of target DNA; melting and annealing; heteroduplex digestion with T7E1; agarose gel analysis showing uncut and cut bands (50% example). Ht, heteroduplexes; Hm, homoduplexes.]


PM1 and PM2 were each hybridized independently with the WT DNA and then treated, or not, with T7E1. WT DNA was hybridized alone and treated with T7E1 as a negative control. As shown in Figure 10b, the mismatches in the plasmid-derived heteroduplexes were revealed by the T7E1 assay. Each point mutant, when hybridized with the WT DNA, gave rise to a distinctive cleavage pattern, reflecting the fact that the point mutations are located at different positions. As expected, T7E1-treated WT DNA and untreated hybridized DNA showed only the uncut band.


Figure 10. T7E1 plasmid assay

(a) Scheme of the DNA amplified from three plasmids with identical sequences, except that PM1 and PM2 each differ from the WT DNA by one mutation. Circles represent plasmid DNA and arrows the positions of the forward and reverse primers. The violet and orange ovals represent the point mutations. (b) Gel showing the result of the T7E1 plasmid assay. The resulting DNA bands are indicated with an arrow (uncut) and a bracket (cut) at the left of the gel image, and the band sizes at the right. WT, wild type; PM1, point mutant 1; PM2, point mutant 2.

2.3 – Mutation rate detection in human cells treated with HOT1-TALEN

A commercial TALEN targeting the APC gene in the human genome was shown to be functional, with an estimated NHEJ rate of 58.5% (Reyon, et al., 2012). In my work, I tested the APC gene-specific TALEN in the T7E1 assay (Supplements Fig. 2; APC primers available in Supplements Table 1). U2OS cells were treated for 72 h, and the T7E1 assay was performed on the isolated genomic DNA.

The APC-TALEN showed efficient NHEJ-mediated mutagenesis at the intended target site after treatment with T7E1. As can be observed in Figure 11, only the unmodified 500 bp band was detected in the TALEN-treated DNA without T7E1 endonuclease. Upon T7E1 treatment, however, a shorter 300 bp band was detected, indicating indel mutagenesis in the amplified region. We estimated a success rate of ~50% from the gel shown in Figure 11. As a negative control, I amplified the same region in cells treated with a different TALEN that targets a different locus in the genome. As expected, no mutagenesis activity was observed after hybridization and T7E1 treatment (Fig. 11).

[Figure 10b gel labels: lanes WT, WT-PM1 and WT-PM2; T7E1: + - + - +; uncut and cut bands indicated; size markers 1 kb, 700 bp, 500 bp and 300 bp.]


[Gel labels, HOT1 amplicon: lanes HOT1 and Control; T7E1: - + +; result: +; uncut and cut bands indicated; size markers 300 bp, 200 bp and 100 bp.]

Figure 11. Targeted genome editing in human cells at a specific target site

TALEN-mediated genomic modifications revealed by the T7E1 assay for a commercially assembled APC-TALEN. The expected positions of the resulting DNA bands are indicated at the left of the gel panels: the arrow indicates uncut DNA and the bracket indicates cut DNA bands. The absence of T7E1 treatment is indicated by (-) and its presence by (+).

Establishing the T7E1 assay was a priority. Once we were able to reproduce published results, the activity of the HOT1-TALEN was tested. Custom-assembled HOT1-TALEN constructs encoding the left and right TALENs were transfected into U2OS cells, and genomic DNA was isolated. The relevant target site was amplified by PCR (Supplements Fig. 3) and subjected to treatment with T7E1. As a negative control, I amplified the same region in cells treated with the heterologous APC-TALEN, which targets a different locus in the genome.

As shown in Figure 12, the resulting cleavage pattern corresponds to HOT1-TALEN activity at the cut site. The uncut 300 bp band was observed in the sample not treated with T7E1 and, as expected, in the negative control. When the PCR amplicons were treated with T7E1, two shorter bands (100 bp and 200 bp) were observed, as expected for the TALEN cleavage position within the amplicon (Figure 12). This experiment therefore revealed mutagenesis at the intended target site, characteristic of the insertion or deletion mutations (indels) that arise at the target gene site after cleavage by the FokI domains of the TALENs. Ideally, at an activity rate of 50%, the intensity of the two resulting bands (100 bp and 200 bp) should be equivalent to the intensity of the uncut band. In this case the two shorter bands were weaker, which indicates that HOT1-TALEN activity is below 50%.
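One common way to turn band intensities into a number is sketched below in Python. It is an illustration under stated assumptions rather than the quantification used in this work: the band intensities are hypothetical values, the function names are invented for this example, and the conversion from cleaved fraction to indel fraction, 1 - sqrt(1 - f_cut), is the widely used approximation that assumes random reannealing of wild-type and mutant strands.

```python
import math


def fraction_cleaved(uncut_intensity, cut_intensities):
    """Fraction of the PCR product found in the cleaved bands."""
    cut_total = sum(cut_intensities)
    total = uncut_intensity + cut_total
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return cut_total / total


def estimated_indel_fraction(f_cut):
    """Convert a cleaved fraction into an estimated indel fraction.

    Assumption: strands reanneal at random and every duplex containing at
    least one mutant strand is cleaved, so only wild-type/wild-type
    duplexes, with frequency (1 - p)**2 for an indel fraction p, stay uncut.
    """
    if not 0.0 <= f_cut <= 1.0:
        raise ValueError("cleaved fraction must be between 0 and 1")
    return 1.0 - math.sqrt(1.0 - f_cut)


if __name__ == "__main__":
    # Hypothetical densitometry values (arbitrary units), not measured data.
    f_cut = fraction_cleaved(uncut_intensity=100.0, cut_intensities=[30.0, 20.0])
    print(f"fraction cleaved: {f_cut:.2f}")
    print(f"estimated indel fraction: {estimated_indel_fraction(f_cut):.2f}")
```

Under this model a cleaved fraction of 50% corresponds to an indel fraction below 50%, so the two quantities should not be used interchangeably.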

Figure 12. TALEN efficiency at the HOT1 genomic locus

A custom-made HOT1-TALEN was generated with the Golden Gate assembly protocol for TALEs and TALENs. The TALENs are shown at the top of the agarose gels. The expected positions of the resulting DNA bands are indicated at the left of the gel panels: the arrow indicates uncut DNA and the bracket indicates cut DNA bands. The absence of T7E1 treatment is indicated by (-) and its presence by (+).

[Gel labels, APC amplicon: lanes APC and Control; T7E1: - + +; uncut and cut bands indicated; size markers 500 bp, 300 bp and 200 bp.]


We observed a discrepancy in activity between the HOT1-TALEN and the APC-TALEN in the T7E1 assay. Several factors could explain this. The difference could relate to expression level but, more likely, the data reflect intrinsic differences in the DNA-binding affinity of the arrays. It was previously shown that TALEN efficiencies can vary within the same cell type even when the same assembly approach is used to build the DNA-binding domain (Reyon, et al., 2012). One of the proposed factors affecting DNA-binding strength is methylation of the DNA.

Nevertheless, the self-assembled TALEN targeting the HOT1 gene is active enough to produce a mutagenesis rate that is detectable in the T7E1 assay. It was therefore possible to proceed to the second part of this project and use the functional TALE DNA-binding domain for the protein fusion.

In summary, we have shown that the HOT1-TALEN can be used to efficiently introduce targeted indel mutations in endogenous genes of human cells. In principle, any researcher can rapidly and easily create targeted mutations in U2OS cells in this way. Further experiments are needed to establish clonal cell lines carrying the HOT1 gene mutation.

Part II – TAL-effector Recombinase design and assembly

3 – TALER architecture

The aim of this work was to generate a chimeric recombinase able to recombine novel sequences. For that reason a fusion recombinase, named TALER, was designed. TALER is a chimeric recombinase consisting of a TAL effector and a recombinase catalytic domain. The specificity of such a chimeric recombinase depends predominantly on the DNA-binding domain, while its activity depends on the catalytic domain. It is important to note that the catalytic domains of serine recombinases have pre-defined catalytic specificities; some residual DNA specificity is therefore contributed by the catalytic domain.

The aim of my thesis was to identify a TALER architecture able to recombine DNA in bacterial cells. It was decided to generate TALERs using the TALE part of the HOT1-TALEN_right monomer, which had previously been used to generate an active TALEN able to cleave target DNA in the human genome. The catalytic domain chosen was that of the serine recombinase Tn3 resolvase. A schematic illustration is shown in Figure 13.

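[Figure 13 labels: Tn3 catalytic domain (N- and C-termini marked) fused to the TALE DBD on each DNA strand; strands labelled 5’ and 3’; TALE orientation shown by the letters a to f and a’ to f’.]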


WT, wild-type Tn3 resolvase (YP003829171.1); NM, NM-resolvase; asterisks mark the substituted positions. Position labels in the figure: 1 and 148.

WT    1 MRLFGYARVSTSQQSLDLQVRALKDAGVKANRIFTDKASGSSTDREGLDLLRMKVEEGDVILVKKLDRLGRD
NM    1 MALFGYARVSTSQQSLDLQVRALKDAGVKANRIFTDKASGSSTDREGLDLLRMKVKEGDVILVKKLDRLGRD
         *                                                     *

WT   73 TADMIQLIKEFDAQGVAVRFIDDGISTDGDMGQMVVTILSAVAQAERRRILERTNEGRQEAKLKGIKFGRRR
NM   73 TADMIQLIKEFDAQGVAVRFIDDGISTDSYIGLMVVTILSAVAQAERRRILERTNEGRQEAKLKGIKFGRRR
                                    *** *

WT  145 TVDRNVVLTLHQKGTGATEIAHQLSIARSTVYKILEDERAS
NM  145 TVDRNVVLTLHQKGTGATEIAHQLSIARSTVYKILEDERAS

Figure 13. TALER fusion orientation

Schematic showing the fusion orientation of the TALER. The gradient-colored box represents the TALE DNA-binding domain, which is expected to bind the DNA strand in the 5’ to 3’ orientation. The Tn3 catalytic domain is represented by a pink balloon and is linked to the TALE DBD by a spacer, represented by a thin line. The letters a, b, c, d, e and f indicate the orientation of the TALE DBD.

3.1 – Activated resolvase mutant

The TALER engineering described here begins with the hyperactive Tn3 resolvase (NM-resolvase). We chose the hyperactive Tn3 resolvase because some of its mutants are known to catalyse efficient recombination at synthetic target sites (Akopian, et al. 2003).

A synthetic Tn3 resolvase gene was commercially synthesized by GeneArt (Life Technologies Corporation) and cloned into pMA-T using SfiI cloning sites. The NM-resolvase sequence (residues 1-148) contains the following six substitutions relative to the wild-type (WT) sequence (YP_003829171.1): R2A, E56K, G101S, D102Y, M103I and Q105L (Fig. 14). These substitutions have been described previously and were used to generate zinc-finger recombinases (ZFRs) (Akopian, et al. 2003). The C-terminal domain is required for specific binding and catalytic activity at site I; no binding or recombination is observed if all residues after arginine 148 are deleted (Olorunniji & Stark, 2009).
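The six substitutions can be checked against the alignment in Figure 14 with a short Python sketch. The two sequences are the first 144 residues transcribed from that alignment; the function `apply_substitutions` and the constant names are invented for this example and are not part of any published tool.

```python
import re

# First 144 residues of each sequence, transcribed from Figure 14.
WT_PREFIX = (
    "MRLFGYARVSTSQQSLDLQVRALKDAGVKANRIFTDKASGSSTDREGLDLLRMKVEEGDVILVKKLDRLGRD"
    "TADMIQLIKEFDAQGVAVRFIDDGISTDGDMGQMVVTILSAVAQAERRRILERTNEGRQEAKLKGIKFGRRR"
)
NM_PREFIX = (
    "MALFGYARVSTSQQSLDLQVRALKDAGVKANRIFTDKASGSSTDREGLDLLRMKVKEGDVILVKKLDRLGRD"
    "TADMIQLIKEFDAQGVAVRFIDDGISTDSYIGLMVVTILSAVAQAERRRILERTNEGRQEAKLKGIKFGRRR"
)
SUBSTITUTIONS = ["R2A", "E56K", "G101S", "D102Y", "M103I", "Q105L"]


def apply_substitutions(sequence, substitutions):
    """Apply substitutions such as 'R2A' (1-based positions) to a sequence."""
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        index = position - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        # Refuse to mutate if the stated original residue is not present.
        if residues[index] != old:
            raise ValueError(
                f"{sub}: expected {old} at position {position}, found {residues[index]}"
            )
        residues[index] = new
    return "".join(residues)


if __name__ == "__main__":
    mutant = apply_substitutions(WT_PREFIX, SUBSTITUTIONS)
    print("matches NM-resolvase prefix:", mutant == NM_PREFIX)
```

Checking the original residue before each change guards against off-by-one numbering errors, which are easy to make when a sequence is renumbered after truncation.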

Figure 14. Sequence alignment of wt Tn3-resolvase and NM-resolvase

Alignment of the NM-resolvase sequence with the wild-type Tn3 resolvase. Positions that differ from the wt sequence are highlighted in yellow and marked with asterisks.

Downstream of the NM-resolvase sequence, a 10-amino-acid linker (GSGGSGGSTS, written from N- to C-terminus) was added to allow some flexibility between the DBD and the recombinase. The codons for TS introduce a unique SpeI restriction site (Proudfoot, et al., 2011). A restriction site for the PacI endonuclease was introduced immediately after the SpeI site, to be used for fusion of the TALE part to the recombinase.
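Why the TS codons can carry a SpeI site is easy to verify: the SpeI recognition sequence is ACTAGT, and ACT and AGT encode threonine and serine. The Python sketch below shows this. The glycine and serine codons chosen for the rest of the linker are hypothetical, since the nucleotide sequence of the linker is not given here, and the codon table contains only the codons needed for the example.

```python
SPEI_SITE = "ACTAGT"  # SpeI recognition sequence, written 5'->3'

# Only the codons needed for this example; not a complete codon table.
CODON_TABLE = {"ACT": "T", "AGT": "S", "GGT": "G", "TCT": "S"}

# Hypothetical coding sequence for the linker GSGGSGGSTS.
LINKER_DNA = "".join(
    ["GGT", "TCT", "GGT", "GGT", "TCT", "GGT", "GGT", "TCT", "ACT", "AGT"]
)


def translate(dna):
    """Translate an in-frame DNA string using the small table above."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    print("linker peptide:", translate(LINKER_DNA))
    print("SpeI site encodes:", translate(SPEI_SITE))
    print("SpeI sites in linker DNA:", LINKER_DNA.count(SPEI_SITE))
```

Counting the occurrences of the site is a simple way to confirm that it is unique within the linker; in practice the whole vector would have to be checked.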

The pMA-T plasmid allows convenient subcloning of the recombinase using BsrGI and XbaI sites. First, the Tn3 recombinase domain-coding sequence was subcloned into the pEVO vector using the BsrGI and XbaI restriction sites.


3.2 – Designed truncations

Recent reports have shown that TALEN activity can be enhanced if the C-terminal (Zhang, et al., 2012) and N-terminal portions of the TALE protein are truncated (Miller, et al., 2011) (Mussolino, et al., 2011). Furthermore, Barbas and co-workers used a library of incrementally truncated TALE variants to generate the first published chimeric TALE recombinase constructs (Mercer, et al., 2012). Thus, in order to improve TALER activity, we generated C-terminal and N-terminal TALE truncations.

Truncations were generated by PCR, under optimized conditions, amplifying the TALE DBD part from the previously assembled pTAL3_HOT1-TALEN (Fig. 15a). The reaction was optimized using Pfu Turbo Hotstart DNA polymerase. Three different truncations were generated in the N-terminal domain, beginning at Δ120, Δ152 and Δ180, respectively. We designed one C-terminal truncation at position 63 (C63). The N-terminal truncations Δ120, Δ152 and Δ180 were made with primers Δ120-TALE_up, Δ152-TALE_up and Δ180-TALE_up, and the C-terminal truncation C63 with primer C63-TALE-low (primers are listed in Supplements Table 1). The forward primers introduce a SpeI cloning site and the reverse primer introduces a PacI cloning site for cloning the inserts into the pEVO-TN3 vector (Fig. 15a). The TALE truncations are represented in Figure 15b. TALENs with both Δ152 and C63 truncations have been reported to mediate efficient modification of plant cells (Zhang, et al., 2012).


Figure 15. TALE truncation variant designs

(a) pTAL3_HOT1-TALEN structure. A TALE DBD and a FokI nuclease domain constitute the HOT1-TALEN. The N-terminal and C-terminal truncations are indicated by arrows at the N-terminal and C-terminal parts of the TALE DBD. (b) Schematic illustrating the design of the three TALE truncations. TAL-N’, TALE N-terminal domain; TALE-C’, TALE C-terminal domain; RVDs, repeat variable di-residues; Amp, ampicillin resistance; HIS3, HIS3 gene.

[Figure 15 labels: (a) pTAL3_HOT1-TALEN with primers 5’-SpeI_Δ120/Δ152/Δ180 and 3’-PacI_C60; (b) truncation variants Δ120, Δ152 and Δ180, each labelled C60, with SpeI and PacI sites.]


PCR fragments were cloned downstream of the recombinase-coding sequence using the SpeI and PacI cloning sites. Individual clones were sequenced, and correct clones were selected for further work. At this point we had three different TALER constructs: pEVO-TALER Δ120, C63; pEVO-TALER Δ152, C63; and pEVO-TALER Δ180, C63.

4 – Design of the recombination target sites

We wanted to investigate whether TALERs can be tailored to promote site-specific recombination at novel target sites. To this end, chimeric recombination target sites were designed. Each recombination target site comprised a central core sequence flanked by two TALE-binding motifs. The central 16 bp sequence was the native core target sequence of the Tn3 recombinase, whereas the flanking sequences were identical to the HOT1-TALEN_right target (Figure 16). The TALE-binding sequences were positioned on opposite strands.

Different target sites were designed, containing three length variants of the core sequence (16 bp plus flanking 6, 8 or 10 bp), resulting in target sites named 20, 22 or 24. Two versions of each target site were also designed: in the first, the core sequence is located on the sense strand of the target sequence, and in the second, the core sequence is located on the antisense strand. The sequences of the target sites are given in Supplements Figure 4, and the target sites are illustrated in Figure 16.
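The layout of such a site, with the two TALE-binding sequences on opposite strands and the core on either the sense or the antisense strand, can be sketched in Python as follows. All sequences below are placeholders invented for the illustration (the real sequences are in Supplements Figure 4), and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def build_target_site(tale_site, spacer, core_on_sense=True):
    """Assemble the top strand (5'->3') of a chimeric target site.

    The second TALE-binding sequence is added as a reverse complement, so
    that it reads 5'->3' on the bottom strand (opposite-strand arrangement).
    """
    middle = spacer if core_on_sense else reverse_complement(spacer)
    return tale_site + middle + reverse_complement(tale_site)


if __name__ == "__main__":
    # Placeholder sequences, NOT the HOT1-TALEN target or the Tn3 core.
    tale_site = "TGACCTAGGA"
    spacer = "AA" + "ACGTACGTACGTACGT" + "AA"  # 16 bp core plus padding = 20 bp
    for sense in (True, False):
        label = "SS20" if sense else "AS20"
        print(label, build_target_site(tale_site, spacer, core_on_sense=sense))
```

Generating the antisense variant by reverse-complementing only the spacer keeps the TALE-binding arms identical between the two versions, so that any difference in activity could be attributed to the orientation of the core.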

The recombination sites were introduced by PCR amplification (primers available in Supplements Table 1). Each oligo contained one of the recombination sites; upon PCR, the amplified KmR cassette was therefore flanked on both sides by the recombination sites, and it was cloned into the pEVO plasmid. The final pEVO construct contained the Tn3-TALE chimeric recombinase and a pair of recombination sites.

Figure 16. Scheme of the chimeric recombination target sites

Representative illustration of the TALER target sites. The gradient-colored box represents the TALE recognition site, where the TALE DNA-binding domain is expected to bind. The two TALE recognition sites are separated by a spacer containing an invariable 16 bp sequence, represented by a pink box, with a total length of 20, 22 or 24 bp. The core is located on the 5’ to 3’ strand or on the 3’ to 5’ strand, giving the sense target site (SS) or the antisense target site (AS), respectively.

[Figure 16 labels: sense strand and antisense strand versions of the target site, each showing the 16 bp core within a spacer of 20/22/24 bp; strands labelled 5’ and 3’.]


5 – TALER activity in bacterial cells

Next, to investigate TALER-mediated recombination, I cloned two recombination sites into the pEVO-TALER vectors so that an excision assay could be performed in E. coli. This recombination assay is based on the pEVO plasmid, which carries a gene coding for the recombinase under the control of the arabinose-inducible promoter (regulated by AraC) and two recombination sites (Fig. 17). In all constructs the recombinase is expressed from the L-arabinose promoter, which allows inducible expression of the recombinase protein upon addition of L(+)-arabinose to the growth medium.

Figure 17. Scheme of the recombination assay

pEVO vector containing the TALER, CmR (chloramphenicol resistance gene) and two recombination sites (triangles) flanking KmR (kanamycin resistance gene). The TALER orientation is shown as an arrow. The origin of replication (ori) is indicated as a box. AraC, gene coding for the arabinose promoter regulator protein. The INT1 and INT2 primers are illustrated before and after arabinose induction. Non-recombined and recombined plasmids are denoted by two triangles and one triangle, respectively. [Adapted from Karimova et al., Nucleic Acids Research, 2013]

To compare recombination activity, two random clones were selected from each E. coli transformant carrying a vector with the recombinase-encoding DNA (TALER) and the respective recombination sites (SS20, SS22, SS24, AS20, AS22 and AS24). The clones were grown under arabinose-induced conditions, to induce recombinase expression (3 ml overnight (ON) culture in LB + 25 µg/mL Cm + 100 µg/mL L-arabinose at 37 ºC for 12-16 h, shaking at 180 rpm), and under non-induced conditions. Importantly, sequencing confirmed that the clones contained the correct TALER in combination with the corresponding recombination sites.

Upon site-specific recombination, a 1.2 kb fragment is expected to be excised from the plasmid, signifying recombination activity (Fig. 17). The excision event can be detected by PCR, which yields PCR fragments of different sizes for the recombined and non-recombined outcomes. The PCR assay can detect low quantities of the recombined plasmid and is therefore a sensitive approach for detecting recombination. For this purpose, the INT1 and INT2 primers were used; they amplify the region containing the target site-KmR cassette of the pEVO plasmid (primers are listed in Supplements Table 1). PCR fragment sizes were assessed on a 1% agarose gel.
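The logic of reading the INT1/INT2 PCR can be sketched in Python. The 1.7 kb non-recombined band and the 1.2 kb excised fragment are the values given in this section; the size of the recombined product is derived here by simple subtraction, which assumes that the whole excised fragment lies inside the amplicon, and the tolerance and function names are hypothetical.

```python
NON_RECOMBINED_BP = 1700  # band reported for the non-recombined plasmid
EXCISED_BP = 1200         # fragment expected to be excised upon recombination


def expected_recombined_bp(non_recombined_bp=NON_RECOMBINED_BP, excised_bp=EXCISED_BP):
    """Expected amplicon size after excision (assumption: simple subtraction)."""
    return non_recombined_bp - excised_bp


def classify_band(observed_bp, tolerance_bp=100):
    """Classify an observed band size; tolerance_bp is an arbitrary gel-reading margin."""
    if abs(observed_bp - NON_RECOMBINED_BP) <= tolerance_bp:
        return "non-recombined"
    if abs(observed_bp - expected_recombined_bp()) <= tolerance_bp:
        return "recombined"
    return "unexpected"


if __name__ == "__main__":
    for band in (1700, 520, 1000):
        print(band, "bp:", classify_band(band))
```

A lane can contain both bands at once, as partial recombination leaves a mixture of recombined and non-recombined plasmid, so each band in a lane should be classified separately.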

Site-specific recombination by Cre on the human pseudo-site loxXp22 served as a positive control for arabinose induction, since Cre has been shown to mediate recombination on loxXp22 sites. The gel analysis (Fig. 18) revealed that recombined plasmid was already present even in the non-induced state, owing to the high activity of the Cre recombinase.

Surprisingly, the activity analysis in E. coli revealed that the TALER constructs were not able to recombine the tested TALER recombination sites. All tested clones, which contain the construct expressing the TALE recombinase together with the respective recombination sites, showed a 1.7 kb band in both the induced and the non-induced state, indicating that no excision event took place (Fig. 18).

Figure 18. Recombination assay results

PCR analysis of TALER recombination on the intended recombination sites using 30 ng of plasmid DNA isolated after ON culture with or without L(+)-arabinose. A water control (WC) served as a negative PCR control. Pre-recombined pEVOVikaΔvox plasmid DNA served as a positive control for the PCR. A pEVOCre-loxXp22 clone was cultured in parallel as a control for efficient arabinose induction. The + and – indicate the presence or absence of L(+)-arabinose (100 μg/ml), respectively, in the growth medium. Non-recombined and recombined plasmids are denoted by two triangles and one triangle, respectively. M, marker: 2-log DNA ladder (NEB). SS, sense strand. AS, antisense strand.

Two further PCR assays, using primers INT1 and INT4 and primers INT2 and INT4, were performed to analyse a putative change in vector size after recombinase expression or the possibility of an inversion event at the recombination sites. The expected results are as follows: in the case of no recombination, the PCR with primers INT1 and INT4 yields no product; on the other hand, the PCR with primers INT2 and INT4 yields a band of around 0.7 kb when the product is analysed by agarose gel electrophoresis (Supplementary Results section, Suppl. Fig. 5 and Suppl. Fig. 6). The results of these assays are consistent and clearly indicate that the TALER constructs show no recombination on the designed recombination sites in Escherichia coli.

[Figure 18 gel labels: panels pEVO-TN3-TALEN_Δ120C63, pEVO-TN3-TALEN_Δ152C63 and pEVO-TN3-TALEN_Δ180C63, each with lanes SS20, SS22, SS24, AS20, AS22 and AS24, with (+) or without (-) L(+)-arabinose; controls: H2O, pEVO-VikaΔvox and pEVO-Cre-loXp22; M, marker; size markers 0.6 kb, 1.5 kb and 2 kb.]

Chapter V – Conclusions


The aim of my thesis work was to assemble a chimeric TALE recombinase in order to mediate site-specific recombination on novel sequences. To accomplish this aim, the project was divided into two goals. First, a functional DNA-binding domain needed to be assembled using the customized TALE design. To this end, a TALE nuclease targeting the HOT1 gene was generated and successfully tested in human U2OS cells. Second, a TALE-recombinase fusion (TALER) and a panel of target sites were designed, assembled and tested in E. coli.

I assembled a HOT1-TALEN, which was confirmed by sequencing. Subsequently, I optimized an assay (T7E1) that works efficiently on plasmid DNA and in mammalian cells to validate TALEN activity. The HOT1-TALEN proved to be active and can be used to target and cleave a specific DNA locus in the human genome. In further experiments we plan to use this TALEN to generate gene knockouts in human cells.

To accomplish the second goal of my project, diverse TALER variants and six target sites were designed and cloned into the pEVO vector. Recombination tests were performed on all TALER plasmids. The current TALER designs did not mediate site-specific recombination on the designed recombination sites. One reason for the observed lack of recombination could be the target sites, more precisely the length of the core sequence. Ideally, TALER recombination activity should be insensitive to the central sequences of the target sites, which interact with the recombinase catalytic domains. Nevertheless, it is known that natural serine recombinase catalytic domains contribute significantly to DNA sequence specificity (Gordley, Gersbach, & Barbas, 2009). We chose the Tn3 hyperactive resolvase because some mutants were shown to promote recombination at some specific non-canonical target sites (Akopian, et al., 2003), and the core sequence used is exactly that of Tn3 res site I. However, because of the steric arrangement of the TALER, the core sequence length may not be sufficient for appropriate dimerization of the recombinase. In future experiments, we plan to validate that the activated Tn3 resolvase recombines efficiently in E. coli. An interesting way to verify this could be to use the ZF recombinase technology.

Another important reason may be the length and sequence of the linker between the TALE DBD and the recombinase. The linker length has not been investigated in any detail and may still be far from optimal. Apart from one study on TALERs published in 2012 (Mercer, et al., 2012), in which TALERs recombined in bacteria but not very efficiently in mammalian cells, no further TALER experiments have been described to date. A focused investigation of the optimal protein design is therefore needed, and several technical details can be addressed in further optimisation.

Chapter VI – Bibliography


Akopian, A., He, J., Boocock, M. R., & Stark, W. M. (2003). Chimeric recombinases with designed DNA sequence recognition. PNAS, 100 (15), 8688-8691.

Arnold, P. H., Blake, D. G., Grindley, N. D., Boocock, M. R., & Stark, W. M. (1999). Mutants of Tn3 resolvase which do not require accessory binding sites for recombination activity. The EMBO Journal, 18 (5), 1407-1414.

Boch, J., Scholze, H., Schornack, S., Landgraf, A., Hahn, S., Kay, S., et al. (2009). Breaking the Code of DNA Binding Specificity of TAL-Type III Effectors. Science, 326 (5959), 1509-1512.

Bogdanove, A. J., & Voytas, D. F. (2011). TAL effectors: Customizable Proteins for DNA targeting. Science, 333.

Branda, C. S., & Dymecki, S. M. (2004). Talking about a Revolution: The Impact of Site-Specific Recombinases on Genetic Analyses in Mice. Developmental Cell, 6, 7-28.

Buchholz, F., & Stewart, A. F. (2001). Alteration of Cre recombinase site specificity by substrate-linked protein evolution. Nature Biotechnology, 19, 1047-1052.

Burma, S., Chen, B. P. C., & Chen, D. J. (2006). Role of non-homologous end joining (NHEJ) in maintaining genomic integrity. DNA Repair, 5, 1042-1048.

Carlson, D. F., Tan, W., Lillico, S. G., Stverakova, D., Proudfoot, C., Christian, M., et al. (2012). Efficient TALEN-mediated gene knockout in livestock. (R. M. Roberts, Ed.) PNAS, 109 (43), 17382-17387.

Cermak, T., Doyle, E. L., Christian, M., Wang, L., Zhang, Y., Schmidt, C., Baller, J. A., et al. (2011). Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting. Nucleic Acids Research, 39 (12), e82.

Cui, X., Ji, D., Fisher, D. A., Wu, Y., Briner, D. M., & Weinstein, E. J. (2011). Targeted integration in rat and mouse embryos with zinc-finger nucleases. Nature Biotechnology, 29, 64-67.

Faix, J., Linkner, J., Nordholz, B., Platt, J., Liao, X., & Kimmel, A. (2013). The application of the Cre-loxP system for generating multiple knock-out and knock-in targeted loci. Methods in Molecular Biology, 983, 249-267.

Fehér, T., Burland, V., & Pósfai, G. (2012). In the fast lane: Large-scale bacterial genome engineering. Journal of Biotechnology, 160 (1-2), 72-79.

Gaj, T., Gersbach, C. A., & Barbas, C. F. (2013). ZFN, TALEN, and CRISPR/Cas-based methods for genome engineering. Trends in Biotechnology, 31 (7), 397-405.

Gaj, T., Mercer, A. C., Gersbach, C. A., Gordley, R. M., & Barbas, C. F. (2011). Structure-guided reprogramming of serine recombinase DNA sequence specificity. (J. A. Wells, Ed.) PNAS, 108 (2), 498-503.

Gordley, R. M., Gersbach, C. A., & Barbas, C. F. (2009). Synthesis of programmable integrases. PNAS, 106 (13), 5054-5058.

Gorman, C., & Bullock, C. (2000). Site-specific gene targeting for gene expression in eukaryotes. Current Opinion in Biotechnology, 11 (5), 455-460.

Grindley, N. D., Whiteson, K. L., & Rice, P. A. (2006). Mechanisms of Site-Specific Recombination. Annual Review of Biochemistry, 75, 567-605.

Grindley, N. (2002). The movement of Tn3-like elements: transposition and cointegrate resolution. In N. L. Craig, R. Craigie, M. Gellert, & A. M. Lambowitz (Eds.), Mobile DNA II (pp. 272-302). Washington, DC: American Society for Microbiology Press.

Grindley, N., Lauth, M., Wells, R., Wityk, R., Salvo, J., & Reed, R. (1982). Transposon-mediated site-specific recombination: identification of three binding sites for resolvase at the res sites of γδ and Tn3. Cell, 30, 19-27.

Gupta, A., Christensen, R. G., Rayla, A. L., Lakshmanan, A., Stormo, G. D., & Wolfe, S. A. (2012). An optimized two-finger archive for ZFN-mediated gene targeting. Nature Methods, 9, 588-590.


Huang, J., Huen, M. S., Kim, H., Leung, C. C., Glover, J. N., Yu, X., et al. (2009). RAD18 transmits DNA damage signalling to elicit homologous recombination repair. Nature Cell Biology, 11, 592-603.

Joung, J. K., & Sander, J. D. (2012). TALENs: a widely applicable technology for targeted genome editing. Nature Reviews Molecular Cell Biology, 14 (1), 49-55.

Kappei, D., Butter, F., Benda, C., Scheibe, M., Draškovič, I., Stevense, M., et al. (2013). HOT1 is a mammalian direct telomere repeat-binding protein contributing to telomerase recruitment. The EMBO Journal, 32, 1681-1701.

Kay, S., Hahn, S., Marois, E., Hause, G., & Bonas, U. (2007). A bacterial effector acts as a plant transcription factor and induces a cell size regulator. Science, 318 (5850), 648-651.

Kilby, N. J., Snaith, M. R., & Murray, J. A. (1993). Site-specific recombinases: tools for genome engineering. Trends in Genetics, 9 (12), 413-421.

Kim, H. J., Lee, H. J., Kim, H., Cho, S. W., & Kim, J.-S. (2009). Targeted genome editing in human cells with zinc finger nucleases constructed via modular assembly. Genome Research, 19 (7), 1279-1288.

Kim, Y., Kweon, J., Kim, A., Chon, J. K., Yoo, J. Y., Kim, H. J., et al. (2013). A library of TAL effector nucleases spanning the human genome. Nature Biotechnology, 31 (3), 251-258.

Krogh, B. O., & Symington, L. S. (2004). Recombination proteins in yeast. Annual Review of Genetics, 38, 233-271.

Laprise, J., Yoneji, S., & Gardner, J. F. (2010). Homology-dependent interactions determine the order of strand exchange by IntDOT recombinase. Nucleic Acids Research, 38 (3), 958-969.

Lehman, I. R. (1974). DNA ligase: Structure, Mechanism, and Function. Science, 186 (4166), 790-797.

Li, T., Huang, S., Jiang, W. Z., Wright, D., Spalding, M. H., Weeks, D. P., et al. (2011). TAL nucleases (TALNs): hybrid proteins composed of TAL effectors and FokI DNA-cleavage domain. Nucleic Acids Research, 39 (1), 359-372.

Lieber, M. (2008). The mechanism of human nonhomologous DNA end joining. Journal of Biological Chemistry, 283 (1), 1-5.

Maeder, M. L., Thibodeau-Beganny, S., & Joung, J. K. (2008). Rapid “open-source” engineering of customized zinc-finger nucleases for highly efficient gene modification. Molecular Cell, 31 (2), 294-301.

Mandell, J. G., & Barbas, C. F. (2008). Zinc Finger Tools: custom DNA-binding domains for transcription factors and nucleases. Nucleic Acids Research, 34, W516-W523.

Birling, M.-C., Gofflot, F., & Warot, X. (2009). Site-specific recombinases for manipulation of the mouse genome. Methods in Molecular Biology, 561, 245-263.

Maresca, M., Lin, V. G., Guo, N., & Yang, Y. (2012). Obligate Ligation-Gated Recombination (ObLiGaRe): Custom designed nucleases mediated targeted integration through non-homologous end joining. Genome Research, 23 (3), 539-546.

Mercer, A., Gaj, T., Fuller, R. P., & Barbas, C. F. (2012). Chimeric TALE recombinases with programmable DNA sequence specificity. Nucleic Acids Research, 40 (21), 11163-11172.

Moscou, M., & Bogdanove, A. (2009). A simple cipher governs DNA recognition by TAL effectors. Science, 326 (5959), 1501.

Mussolino, C., Morbitzer, R., Lutge, F., Dannemann, N., Lahaye, T., & Cathomen, T. (2011). A novel TALE nuclease scaffold enables high genome editing activity in combination with low toxicity. Nucleic Acids Research, 39 (21), 9283-9293.

Nöllmann, M., Byron, O., & Stark, W. M. (2005). Behavior of Tn3 Resolvase in Solution and Its Interaction with res. Biophysical Journal, 89 (3), 1920-1931.

Olorunniji, F. J., & Stark, W. M. (2009). The catalytic residues of Tn3 resolvase. Nucleic Acids Research, 37 (22), 7590-7602.

Olorunniji, F. J., He, J., Wenwieser, S. V., Boocock, M. R., & Stark, W. M. (2008). Synapsis and catalysis by activated Tn3 resolvase mutants. Nucleic Acids Research, 36 (22), 7181-7191.

Pabo, C. O., Peisach, E., & Grant, R. A. (2001). Design and selection of novel Cys2His2 zinc finger proteins. Annual Review of Biochemistry, 70, 313-340.

Perez-Pinera, P., Ousterout, D. G., & Gersbach, C. A. (2012). Advances in targeted genome editing. Current Opinion in Chemical Biology, 16 (3-4), 268-277.


Prorocic, M. M., Wenlong, D., Olorunniji, F. J., Akopian, A., Schloetel, J.-G., Hannigan, A., et al. (2011). Zinc-finger recombinase activities in vitro. Nucleic Acids Research, 39 (21), 9316-9328.

Proudfoot, C., McPherson, A. L., Kolb, A. F., & Stark, W. M. (2011). Zinc Finger Recombinases with Adaptable DNA Sequence Specificity. (D. T. Kirkpatrick, Ed.) PLoS ONE, 6 (4), e19537.

Römer, P., Hahn, S., Jordan, T., Strauss, T., Bonas, U., & Lahaye, T. (2007). Plant pathogen recognition mediated by promoter activation of the pepper Bs3 resistance gene. Science, 318 (5850), 645-648.

Ramirez, C., Foley, J., Wright, D., Muller-Lerch, F., Rahman, S., Cornu, T., et al. (2008). Unexpected failure rates for modular assembly of engineered zinc fingers. Nat Methods, 5 (5), 374-375.

Reyon, D., Tsai, S. Q., Khayter, C., Foden, J. A., Sander, J. D., & Joung, J. K. (2012). FLASH assembly of TALENs for high-throughput genome editing. Nature Biotechnology, 30 (5), 460-465.

Rice, P. A., & Correll, C. C. (2008). Protein-Nucleic Acid Interactions: Structural Biology. (P. A. Rice, & C. C. Correll, Eds.) Cambridge, UK: The Royal Society of Chemistry.

Roberts, R. J. (1976, November). Restriction Endonucleases. Critical Reviews in Biochemistry and Molecular Biology, 123-164.

Rowland, S. J., & Stark, W. M. (2005). Site-specific recombination by the serine recombinases. In S. J. Rowland, & W. M. Stark, The Dynamic Bacterial Genome (pp. 83-120). Cambridge University Press.

Sanjana, N. E., Cong, L., Zhou, Y., Cunniff, M. M., Feng, G., & Zhang, F. (2012). A transcription activator-like effector toolbox for genome engineering. Nature Protocols, 7 (1), 171-192.

Santoro, S., & Schultz, P. (2002). Directed evolution of the site specificity of Cre recombinase. PNAS, 99 (7), 4185-4190.

Sarkar, I., Hauber, I., Hauber, J., & Buchholz, F. (2007). HIV-1 Proviral DNA Excision Using an Evolved Recombinase. Science, 316, 1912-1915.

Sauer, B., & Henderson, N. (1988). Site-specific DNA recombination in mammalian cells by the Cre recombinase of bacteriophage P1. Proc. Natl. Acad. Sci., 85, 5166-5170.

Scholze, H., & Boch, J. (2011). TAL effectors are remote controls for gene activation. Current Opinion in Microbiology, 14, 47-53.

Seligman, L. M., Chisholm, K. M., Chevalier, B. S., Chadsey, M. S., Edwards, S. T., Savage, J. H., et al. (2002). Mutations altering the cleavage specificity of a homing endonuclease. Nucleic Acids Research, 30 (17), 3870-3879.

Smith, M. C., & Thorpe, H. M. (2002). Diversity in the serine recombinases. Molecular Microbiology, 44 (2), 299-307.

Townsend, J. A., Wright, D. A., Winfrey, R. J., Fu, F., Maeder, M. L., Joung, J. K., et al. (2009). High-frequency modification of plant genes using engineered zinc-finger nucleases. Nature, 459, 442-446.

Urnov, F. D., Miller, J. C., Lee, Y.-L., Beausejour, C. M., Rock, J. M., Augustus, S., et al. (2005). Highly efficient endogenous human gene correction using designed zinc-finger nucleases. Nature, 435 (7042), 646-651.

Urnov, F. D., Rebar, E. J., Holmes, M. C., Zhang, H. S., & Gregory, P. D. (2010). Genome editing with engineered zinc finger nucleases. Nature Reviews Genetics, 11, 636-646.

Van de Putte, P., & Goosen, N. (1992). DNA inversions in phages and bacteria. Trends in Genetics, 8 (12), 457-462.

Voziyanov, Y., Konieczka, J. H., Stewart, A. F., & Jayaram, M. (2003). Stepwise Manipulation of DNA Specificity in Flp Recombinase: Progressively Adapting Flp to Individual and Combinatorial Mutations in its Target Site. Journal of Molecular Biology, 326 (1), 65-76.

Weterings, E., & Chen, D. (2008). The endless tale of non-homologous end-joining. Cell Research, 18, 114-124.

Zhang, Y., Zhang, F., Li, X., Baller, J. A., Qi, Y., Starker, C. G., et al. (2012). TALENs enable efficient plant genome engineering. Plant Physiology Preview, 161 (1), 20-27.


Supplements

I – Supplementary Tables

Table 1. Primers used for vector construction and recombination assays. The first column gives the names of the primers used in the PCR reactions; the second column lists the corresponding oligonucleotide sequences.

Primer name  Primer sequence (5’→3’)

Golden Gate TALEN assembly Kit

TAL_F1 ttggcgtcggcaaacagtgg

TAL_F2 ggcgacgaggtggtcgttgg

SeqTALEN_5-1 catcgcgcaatgcactgac

pCR8_F1 ttgatgcctggcagttccct

pCR8_R1 cgaaccgaacaggcttatgt

HOT1-TALEN validation

DK628 GTGTCTGAATTGAAAGGGATCA

DK629 AACGTCTGTTGTCAAATCTGGA

APC-TALEN validation

DK689 CTTCCCACCTCCCACAAGAT

DK690 GAGAATGGAGGACCTGCAAA

pEVO-TN3-TALEN truncations

Δ120-TALE_up actACTAGTacagcggctgccccagcag

Δ152-TALE-up

actACTAGTcggccgccgcgcgccaag

Δ180-TALE_up

actACTAGTacgctcggctacagtcag

C63-TALE_low agtTTAATTAAggcaacgcgatgggacgtg

TALER Recombination Sites

SS20-km-up gtAGATCTTCCGTGGTGTAAGTAcgaaatattataaattatcaTACTTACACCACGGAGGTCTGACGCTCAGTGGAAC

SS20-km-low gtCTCGAGTCCGTGGTGTAAGTAtgataatttataatatttcgTACTTACACCACGGACCGATTTCGGCCTATTGG


SS22-km-up gtAGATCTTCCGTGGTGTAAGTAtcgaaatattataaattatcagTACTTACACCACGGAGGTCTGACGCTCAGTGGAAC

SS22-Km-low gtCTCGAGTCCGTGGTGTAAGTActgataatttataatatttcgaTACTTACACCACGGACCGATTTCGGCCTATTGG

SS24-km-up gtAGATCTTCCGTGGTGTAAGTAttcgaaatattataaattatcagaTACTTACACCACGGAGGTCTGACGCTCAGTGGAAC

SS24-Km-low gtCTCGAGTCCGTGGTGTAAGTAtctgataatttataatatttcgaaTACTTACACCACGGACCGATTTCGGCCTATTGG

AS20-Km-up gtAGATCTTCCGTGGTGTAAGTAactattaaatattataaagcTACTTACACCACGGAGGTCTGACGCTCAGTGGAAC

AS20-Km-low gtCTCGAGTCCGTGGTGTAAGTAgctttataatatttaatagtTACTTACACCACGGACCGATTTCGGCCTATTGG

AS22-Km-up gtAGATCTTCCGTGGTGTAAGTAagctttataatatttaatagtcTACTTACACCACGGAGGTCTGACGCTCAGTGGAAC

AS22-Km-low gtCTCGAGTCCGTGGTGTAAGTAgactattaaatattataaagctTACTTACACCACGGACCGATTTCGGCCTATTGG

AS24-Km-up gtAGATCTTCCGTGGTGTAAGTAaagctttataatatttaatagtctTACTTACACCACGGAGGTCTGACGCTCAGTGGAAC

AS24-Km-low gtCTCGAGTCCGTGGTGTAAGTAagactattaaatattataaagcttTACTTACACCACGGACCGATTTCGGCCTATTGG

pEVO vector

P1_up TCTACTGTTTCTCCATA

P2_low GCGGATGAGAGAAGATT

INT1-up TCTGTTGTTTGTCGGTGAACG

INT2-low CTTAAACGCCTGGTGCTACG

INT4-low AAGAGGCATAAATTCCGTCAGC
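The TALER recombination-site primers in Table 1 appear to share a common layout: a two-base 5’ clamp, a restriction site (AGATCT or CTCGAG), the upper-case motif TCCGTGGTGTAAGTA, a lower-case central sequence of 20, 22 or 24 nt, the reverse complement of that motif, and an upper-case 3’ annealing region. The Python sketch below checks this layout for the SS20 primer pair. Reading the upper/lower-case pattern as a design annotation is an assumption made here, and `split_site_primer` and `SitePrimer` are hypothetical helper names, not part of the thesis workflow.

```python
import re
from typing import NamedTuple

# Upper-case motif flanking the central sequence in the Table 1 primers.
TALE_MOTIF = "TCCGTGGTGTAAGTA"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, preserving upper/lower case."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


class SitePrimer(NamedTuple):
    restriction_site: str  # AGATCT or CTCGAG
    spacer: str            # lower-case central sequence
    annealing: str         # upper-case 3' region


# Assumed layout: gt + restriction site + motif + spacer + rc(motif) + annealing region.
PATTERN = re.compile(
    r"^gt(AGATCT|CTCGAG)"
    + TALE_MOTIF
    + r"([acgt]+)"
    + reverse_complement(TALE_MOTIF)
    + r"([ACGT]+)$"
)


def split_site_primer(primer: str) -> SitePrimer:
    """Split a recombination-site primer into its assumed parts."""
    match = PATTERN.match(primer)
    if match is None:
        raise ValueError("primer does not follow the assumed layout")
    return SitePrimer(*match.groups())


# Sequences copied from Table 1 (SS20-km-up and SS20-km-low).
SS20_UP = "gtAGATCTTCCGTGGTGTAAGTAcgaaatattataaattatcaTACTTACACCACGGAGGTCTGACGCTCAGTGGAAC"
SS20_LOW = "gtCTCGAGTCCGTGGTGTAAGTAtgataatttataatatttcgTACTTACACCACGGACCGATTTCGGCCTATTGG"

up = split_site_primer(SS20_UP)
low = split_site_primer(SS20_LOW)
print(up.restriction_site, len(up.spacer), up.annealing)
print(reverse_complement(up.spacer) == low.spacer)
```

For the SS20 pair the central sequence is 20 nt long, and the central sequence of the low primer is the reverse complement of that of the up primer. The same function can be applied to the SS22, SS24 and AS primers to compare spacer lengths.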

Table 2. TALEN construct assembly timeline

Day 1: RVD module selection; Golden Gate Reaction 1

Day 2: Pick and culture 3 white colonies

Day 3: DNA extraction and digestion to confirm clones with the correct size; Golden Gate Reaction 2

Day 4: Pick and culture 3 white colonies

Day 5: DNA extraction and digestion (or sequencing) to confirm clones with the correct size; TALENs are ready to test in yeast or to subclone into the vector of choice


II – Supplementary Figures

Figure 1. Mammalian expression vector

HOT1-TALEN_left and HOT1-TALEN_right were each cloned into the mammalian expression vector pcDNA3.1 using the AflII restriction endonuclease, placing them under the control of the cytomegalovirus (CMV) promoter. The vector carries ampicillin and hygromycin resistance.

Figure 2. APC-TALEN target sequence amplified with PCR from human cells

The APC-TALEN target sequence is highlighted in red and the spacer region is shown in bold. Arrows indicate primer orientation. Primer sequences are underlined. F, forward; R, reverse.

APC-TALEN [Reyon et al. (2012)]

5’CTTCCCACCTCCCACAAGATGGCGGAGGGCAAGTAGCAAGGGGGCGGGGTGTGGCCGCCGGAAGCCTAGCCGCTGCTCGGGGGGGAC

CTGCGGGCTCAGGCCCGGGAGCTGCGGACCGAGGTTGGCTCGATGCTGTTCCCAGGTACTGTTGTTGGCTGTTGGTGAGGAAGGTGAAG

CACTCAGTTGCCTTCTCGGGCCTCGGCGCCCCCTATGTACGCCTCCCTGGGCTCGGGTCCGGTCGCCCCTTTGCCCGCTTCTGTACCAC

CCTCAGTTCTCGGGTCCTGGAGCACCGGCGGCAGCAGGAGCTGCGTCCGGCAGGAGACGAAGAGCCCGGGCGGCGCTCGTACTTCTGGC

CACTGGGCGAGCGTCTGGCAGGTGAGTGAGGCTGCAGGCATTGACGTCTCCTCCCGGCAAAGCTTCCTCGGCTTTGCCCCGCCGCTGCT

CGGGACCCTACGGTGCTCGGCCCGACTCTGTGGCTCTCTTCTCTCCATGTCTCACCCTCTCCCCTCCCCGCACTCCCCATTCAGGCCTC

CAGTTGGCCCCTGGCTTTGCAGGTCCTCCATTCTC-3’

F primer: DK689 (5’ end of the sequence shown). R primer: DK690 (3’ end of the sequence shown).
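The position of the validation primers on the amplified region can be checked in silico. The following Python sketch looks for an exact match of the forward primer and of the reverse complement of the reverse primer, and returns the length of the region they delimit. The function name `amplicon_length` is hypothetical, exact matching is a simplifying assumption (real primers tolerate mismatches), and for brevity the demo template joins only the first and last lines of the APC sequence printed above, so the value it returns is not the size of the real amplicon.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-cased DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template: str, forward: str, reverse: str) -> int:
    """Length of the region primed by forward/reverse, or -1 if a primer has no exact match."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return -1
    # The reverse primer anneals to the top strand as its reverse complement.
    reverse_site = reverse_complement(reverse)
    reverse_start = template.find(reverse_site, start)
    if reverse_start == -1:
        return -1
    return reverse_start + len(reverse_site) - start


# APC-TALEN validation primers from Table 1.
DK689 = "CTTCCCACCTCCCACAAGAT"
DK690 = "GAGAATGGAGGACCTGCAAA"

# Truncated demo template: first and last printed lines of the APC sequence only.
TRUNCATED_TEMPLATE = (
    "CTTCCCACCTCCCACAAGATGGCGGAGGGCAAGTAGCAAGGGGGCGGGGTGTGGCCGCCGGAAGCCTAGCCGCTGCTCGGGGGGGAC"
    "CAGTTGGCCCCTGGCTTTGCAGGTCCTCCATTCTC"
)

print(reverse_complement(DK690))
print(amplicon_length(TRUNCATED_TEMPLATE, DK689, DK690) == len(TRUNCATED_TEMPLATE))
```

Because DK689 matches the first 20 nt and the reverse complement of DK690 matches the last 20 nt of the sequence shown, the primed region spans the whole template passed in. Supplying the complete sequence would give the expected size of the PCR product.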


Figure 3. HOT1-TALEN target sequence amplified with PCR from human cells

The HOT1-TALEN target sequence is highlighted in red and the spacer region is shown in bold. Arrows indicate primer orientation. Primer sequences are underlined. F, forward; R, reverse.

HOT1-TALEN

5’-GTGTCTGAATTGAAAGGGATCAAAGTGTATGGTGAATAAAGCAGTAGAATGCAAATCAGAAGATATAATATTTAACATGATTAGC

TAGTGTTAACTCTTGCTGTCCATCAGTTACATAATTAAAATTTTTATTCAGCACTTGATTAACATAAATGTTTCTCAATTTTCTATC

TTTGTTCTACAGAATGGTAGATAACGCAGATCATCTCTGGAAAGGATATTGATCCGCCTCATGTAAAGTATGCTTAGTTCCTTTCCA

GTGGTGTAAGATCAAGTCCTTTTTGATTTTTATCTTCACAATCATTTTAGTAAAGTTAAGATGTCACTATGATTATGATCCAGATTT

GACAACAGACGTT-3’

F primer: DK628 (5’ end of the sequence shown). R primer: DK629 (3’ end of the sequence shown).

Figure 4. Recombination sites

Alignment with site I of the Tn3 res site. The motifs bound by the TALE DNA-binding domains (DBDs) are in blue. The motifs in site I bound by the resolvase C-terminal domains are in green. The central core sequence of each recombination site is in bold. The nucleotides added to each site are underlined.


III – Supplementary Results

Recombination assay

PCR amplification was performed using the INT1 and INT4 primers (primer orientation is illustrated in Figure 5). For all TALER constructs and their respective target sites, no PCR product was observed by agarose gel electrophoresis.

Figure 5. Orientation of primers INT1, INT2 and INT4. The pEVO vector contains the TALER, the chloramphenicol resistance gene (CmR) and two recombination sites (triangles) flanking the kanamycin resistance gene (KmR). The origin of replication (ori) is indicated as a box. AraC, gene encoding the arabinose promoter regulator protein. The orientation of the INT1, INT2 and INT4 primers is indicated by arrows.

For PCR amplification using the INT2 and INT4 primers, a 0.7 kb band was observed for all TALER constructs and their respective target sites, including the controls (Fig. 6). The controls used in these assays were the same as those used previously for the recombination assay.

Figure 6. Gel analysis of the PCR products obtained with the INT2 and INT4 primers

PCR analysis of recombination by the TALEN constructs at the intended recombination sites, using 30 ng of plasmid DNA isolated after overnight (ON) culture with or without L(+)-arabinose. The water control (WC, labelled H2O) served as a negative PCR control. Pre-recombined pEVOVikaΔvox plasmid DNA served as a positive control for the PCR. A pEVOCre-loXp22 clone was cultured in parallel as a control for efficient arabinose induction. The + and – indicate the presence or absence, respectively, of L(+)-arabinose (100 μg/ml) in the growth medium. Non-recombined and recombined plasmids are denoted by two triangles and one triangle, respectively. M, marker: 2-log DNA ladder (NEB). SS, sense strand. AS, antisense strand.

[Gel panels: controls (H2O, pEVOVikaΔvox, pEVOCre-loXp22) and the constructs pEVO-TN3-TALEN_Δ120C63, pEVO-TN3-TALEN_Δ152C63 and pEVO-TN3-TALEN_Δ180C63, each tested on the sites SS20, SS22, SS24, AS20, AS22 and AS24 with (+) and without (–) L(+)-arabinose; the 0.5 kb and 1 kb marker bands are indicated.]