ORIGINAL INVESTIGATION
Genomic rearrangements at the FRA2H common fragile site frequently involve non-homologous recombination events across LTR and L1(LINE) repeats
Lena M. Brueckner • Evgeny Sagulenko •
Elisa M. Hess • Diana Zheglo • Anne Blumrich •
Manfred Schwab • Larissa Savelyeva
Received: 26 January 2012 / Accepted: 24 March 2012 / Published online: 5 April 2012
© Springer-Verlag 2012
Abstract Common fragile sites (cFSs) are non-random
chromosomal regions that are prone to breakage under
conditions of replication stress. DNA damage and chro-
mosomal alterations at cFSs appear to be critical events in
the development of various human diseases, especially
carcinogenesis. Despite the growing interest in under-
standing the nature of cFS instability, only a few cFSs have
been molecularly characterised. In this study, we fine-
mapped the location of FRA2H using six-colour fluores-
cence in situ hybridisation and showed that it is one of the
most active cFSs in the human genome. FRA2H encom-
passes approximately 530 kb of a gene-poor region con-
taining a novel large intergenic non-coding RNA gene
(AC097500.2). Using custom-designed array comparative
genomic hybridisation, we detected gross and submicro-
scopic chromosomal rearrangements involving FRA2H in a
panel of 54 neuroblastoma, colon and breast cancer cell
lines. The genomic alterations frequently involved different
classes of long terminal repeats and long interspersed
nuclear elements. An analysis of breakpoint junction
sequence motifs predominantly revealed signatures of
microhomology-mediated non-homologous recombination
events. Our data provide insight into the molecular struc-
ture of cFSs and sequence motifs affected by their activa-
tion in cancer. Identifying cFS sequences will accelerate
the search for DNA biomarkers and targets for individua-
lised therapies.
Introduction
In recent years, common fragile sites (cFSs) have become
of increasing interest, as their tendency to breakage has
been associated with genomic instability in different types
of disease, especially in cancer (Glover 2006; Dillon et al.
2010). CFSs are non-random chromosomal regions that
tend to undergo double-strand breakage in response to
DNA replication stress. They are present in all individuals
as a part of the normal chromosome architecture. In vivo,
cFS expression may be triggered by various endogenous
and exogenous factors, including hypoxia, chemothera-
peutics and other drugs, exposure to UV or ionising radi-
ation, pesticides, cigarette smoke, and chronic caffeine or
alcohol abuse (Dillon et al. 2010). In vitro, cFSs may be
induced using agents impairing DNA synthesis, such as
aphidicolin, an inhibitor of DNA polymerases α, δ and ε, which has been shown to activate most fragile sites
(Mrasek et al. 2010; Glover et al. 1984). Active cFSs are
visible microscopically as gaps and breaks in metaphase
chromosomes.
The molecular events underlying the observed fragility
at cFSs are not well understood. A deficiency in proteins
associated with the ATR DNA damage checkpoint pathway
(e.g. ATR, BRCA1, CHK1) appears to result in elevated
cFS breakage, suggesting an essential role of this pathway
in maintaining cFS stability (Casper et al. 2002; Durkin
and Glover 2007). It has been hypothesised that fragility at
cFSs, induced by DNA replication stress, results from
extended single-stranded regions of unreplicated DNA
accumulating at stalled replication forks having escaped
Electronic supplementary material The online version of this article (doi:10.1007/s00439-012-1165-3) contains supplementary material, which is available to authorized users.
L. M. Brueckner • E. Sagulenko • E. M. Hess • D. Zheglo • A. Blumrich • M. Schwab • L. Savelyeva (✉)
Division of Tumor Genetics, German Cancer Research Center
(DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg,
Germany
e-mail: [email protected]
Hum Genet (2012) 131:1345–1359
DOI 10.1007/s00439-012-1165-3
the ATR replication checkpoint. Recently it has been
demonstrated that topoisomerase I activity is required for
cFS breakage and that polymerase–helicase uncoupling is
an initial key event in cFS instability (Arlt and Glover
2010). Although it is unclear why cFSs are particularly
sensitive to perturbations in DNA replication, there is
evidence suggesting that DNA sequences capable of
forming stable secondary structures impair replication of
these genomic regions (Zlotorynski et al. 2003; Burrow
et al. 2010). While currently the cytogenetic locations of 89
cFSs are listed in the Entrez NCBI human genome data-
base, several additional cFSs have been reported (Mrasek
et al. 2010). Despite growing evidence of their importance
in disease development, most cFSs have not yet been
investigated at the molecular level. Up to now, only nine of
them have been characterised at kilobase resolution:
FRA1E (Hormozian et al. 2007), FRA2C (Blumrich et al.
2011), FRA2G (Limongi et al. 2003), FRA3B (Wilke et al.
1994; Zimonjic et al. 1997), FRA7K (Helmrich et al. 2007),
FRA9G (Sawinska et al. 2007), FRA13A (Savelyeva et al.
2006), FRA16D (Mangelsdorf et al. 2000) and FRAXB
(Arlt et al. 2002). All of these cFSs span AT-rich genomic
regions, ranging between 300 kb and 1 Mb, that appear to
be enriched in peaks of enhanced DNA flexibility (Sch-
wartz et al. 2006). Fragility at cFSs has also been linked to
some epigenetic features, including their association with
regions of late replication (Debatisse et al. 2006) and his-
tone hypoacetylation (Jiang et al. 2009). Ectopic expres-
sion of cFS sequences has been shown to increase breakage
at the integration site, supporting the hypothesis that the
DNA sequence itself is a critical factor underlying cFS
instability (Ragland et al. 2008). In contrast to rare fragile
sites, where fragility can be attributed to either AT-rich
minisatellites or CGG repeat expansions (Sutherland
2003), no such extended repeat motifs have been identified
within cFSs. Nevertheless, all molecularly characterised
cFSs appear to be enriched in stretches of interrupted AT-
dinucleotide-rich sequences with the potential to form
secondary structures, which may thus impair replication
fork progression, in turn resulting in elevated chromosomal
breakage (Zlotorynski et al. 2003; Dillon et al. 2010).
Generally, the level of susceptibility to breakage varies
among cFSs, FRA3B at chromosome band 3p14.2 being the
most frequently expressed and best-studied cFS in the
human genome (Denison et al. 2003; Mrasek et al. 2010).
The second and third most fragile cFSs are FRA16D at
16q23.2 and FRAXB at Xp22.3, respectively, followed by a
group of other cFSs frequently observed in the human gen-
ome, including FRA2H (2q32.1), FRA1E (1p21.3), FRA6E
(6q26), FRA7K (7q22-31) and FRA7H (7q32). Normally,
cFSs are stable in somatic cells, but they are frequently
involved in chromosomal rearrangements in many different
cancers. Heterozygous and homozygous deletions appear to
be the most prevalent genomic alterations of cFS regions in
malignancies such as lung, kidney, breast, and digestive tract
cancers (Arlt et al. 2006). To date, all molecularly charac-
terised cFSs span genomic regions containing protein-cod-
ing gene sequences, and most of them encompass large
genes extending over hundreds of kilobases of genomic
DNA (Smith et al. 2007). The two most active and best-
characterised cFS genes, FHIT at FRA3B and WWOX at
FRA16D, have been demonstrated to function as tumour
suppressor genes, whose inactivation provides a selective
growth advantage to cancer cells (Lewandowska et al. 2009;
Saldivar et al. 2010). Recently, direct evidence suggesting
involvement of cFSs in generating cancer-specific rear-
rangements in human cells has been provided by Gandhi
et al. (2010). After exposure to fragile site-inducing chem-
icals, the RET, CCDC6, and NCOA4 genes, located in
FRA10C and FRA10G, respectively, undergo DNA break-
age and form rearrangements known to contribute to papil-
lary thyroid carcinoma development. DNA damage at cFSs
appears to be among the earliest events during tumouri-
genesis, resulting from oncogene-induced replication stress
in many kinds of human tumours (Halazonetis et al. 2008;
Tsantoulis et al. 2008). This assumption stems from the
observation that, in precancerous lesions, genomic alterations
preferentially target cFSs as the loci most sensitive to
replication stress. It has led to the proposal that DNA damage at
cFSs could serve as a biomarker for clinical patient management
and that cFS genes could serve as targets for therapies (Lai et al. 2010).
CFS activation may also trigger gene amplification (e.g.
MET, MYC, MYCN) (Blumrich et al. 2011; Hellman et al.
2002; Cicek et al. 2009), and cFSs serve as preferred integration
sites for several oncogenic viruses, such as hepatitis B and
human papilloma viruses (Ferber et al. 2003).
Besides their involvement in somatic rearrangements
associated with tumourigenesis, cFSs also appear to con-
tribute to germline rearrangements leading to non-malig-
nant human diseases. Recently, evidence supporting this
hypothesis has emerged from sequence analysis of break-
point junctions within the PARK2 (FRA6E) and DMD
(FRAXC) cFS genes in a large number of patients with
autosomal-recessive juvenile Parkinsonism and Duchenne
and Becker muscular dystrophy, respectively (Mitsui et al.
2010). The investigated germline breakpoint sequences
shared some features with somatic breakpoints within
PARK2 and DMD in cancer cell lines, suggesting that
common mechanisms may be involved in generating both
germline and somatic rearrangements over cFS regions. As
a growing body of evidence points to cFS instability as an
important factor contributing to human disease develop-
ment, it is crucial to identify and molecularly characterise
the full repertoire of these regions in the human genome.
In this study, taking advantage of six-colour fluorescence
in situ hybridisation (FISH), we were able to
delineate the boundaries of the entire FRA2H region at
kilobase resolution level and to characterise the genetic
complexity of the fragile DNA sequence. We show that
FRA2H is one of the most active cFSs in the human gen-
ome. Using custom-designed 60 bp-resolution comparative
genomic hybridisation arrays (array CGH), FISH with
specific DNA probes and multicolour FISH (mFISH), we
detected gross chromosomal rearrangements in colon car-
cinoma and neuroblastoma cell lines, as well as two novel
copy number variants (CNVs) at FRA2H. Sequence anal-
ysis of breakpoint junctions revealed that DNA damage
repair at FRA2H predominantly appears to occur via non-
homologous recombination events mediated by short
microhomologies.
Materials and methods
Cell culture and fragile site induction
Colorectal carcinoma cell lines were cultured in DMEM/
Ham’s F12 medium (50:50; Biochrom, Berlin, Germany)
supplemented with 20 % foetal calf serum (FCS; PAA,
Pasching, Austria). HDC cell lines (Bruderlein et al. 1990)
were of low passage number. Epstein–Barr virus (EBV)-
transformed lymphocytes from eight healthy individuals,
neuroblastoma and breast cancer cell lines were cultivated
in RPMI 1640 medium (Lonza, Cologne, Germany) sup-
plemented with 10 % FCS. Cell line authenticity was
verified cytogenetically and/or via DNA typing (DSMZ,
Braunschweig, Germany and DKFZ Genomics and Pro-
teomics Core Facility, Heidelberg, Germany). To induce
fragile site breakage, EBV-transformed lymphocytes were
treated with 0.4 μM aphidicolin (Sigma-Aldrich, Deisenhofen,
Germany) in 0.5 % ethanol 24 h prior to cell
harvest. Metaphase preparations were made following
standard procedures.
Bacterial artificial chromosome (BAC) and fosmid
clones
BAC and fosmid clones were obtained from the Children's
Hospital Oakland Research Institute (CHORI, Oakland,
California, USA). DNA isolation was carried out following
standard phenol–chloroform extraction techniques. DNA
extracts served as probes for FISH-based fragile site
mapping and validation of array CGH data. End sequences
of BAC clones at FRA2H (RP11-334K17, RP11-400O18,
RP11-561J1, RP11-335G13, RP13-513D10, RP13-541C6,
RP11-639N24 and RP11-625P14) were determined using
T7 and SP6 standard primers (Seqlab sequencing services,
Göttingen, Germany).
Fluorescence in situ hybridisation
Six-colour FISH was performed to map the location of
FRA2H and to validate array CGH data. BAC and fosmid
DNA probes were labelled with DEAC, FITC, Cy3, Cy3.5,
Cy5 and Cy5.5-coupled dUTPs by nick-translation. The
succinimidyl-ester derivatives of the fluorescent dyes
DEAC, FITC (Molecular Probes, Eugene, Oregon, USA),
Cy3, Cy3.5, Cy5 and Cy5.5 (GE Healthcare, Freiburg,
Germany) were used for the synthesis of modified nucle-
otides. Fluorescent dye coupling to allylamine-dUTPs
(Sigma-Aldrich, Deisenhofen, Germany) was carried out as
described previously (Henegariu et al. 2000). Hybridisation
to metaphase spreads occurred overnight and followed
standard cytogenetic procedures. Hybridised slides were
counterstained with DAPI (4′,6-diamidino-2-phenylindole;
Sigma-Aldrich, Munich, Germany). Fluorescent signals
were visualised using a Leica DMRA 2 microscope and
analysed with the corresponding Leica CW 4000 FISH
software. Array CGH validation assays included a probe
control experiment on normal lymphocytes to ensure the
correct signal number and position of BAC and fosmid
clones.
Multicolour FISH
Karyotypes of each tumour cell line were obtained using
the commercially available 24XCyte multicolour FISH
probe mix (Metasystems, Altlussheim, Germany). DNA
denaturation and hybridisation to metaphase spreads fol-
lowed the manufacturer’s recommendations. Slides were
viewed with a Zeiss Axio Imager.Z1 microscope and kar-
yotypes constructed with ISIS FISH imaging software
(Metasystems, Altlussheim, Germany).
Array comparative genomic hybridisation
The custom array designs of selected fragile site regions
(Roche NimbleGen, Madison, Wisconsin, USA) used in this
study contained 730,000 60-mer oligonucleotide probes per
array yielding a resolution of approximately 60 bp. Geno-
mic DNA from lymphocytes and tumour cell lines was
isolated by phenol–chloroform extraction following stan-
dard procedures. A lymphocyte genomic DNA pool from
five healthy individuals served as reference DNA. Sample
labelling and hybridisation were carried out following the
manufacturer’s instructions (Roche NimbleGen). DNA
quality and labelling efficiency were determined using a
Nanodrop-1000 spectrophotometer (NanoDrop Technolo-
gies, Wilmington, Delaware, USA). Array slides were
scanned at a 2 μm resolution using a Roche ms-200 scanner.
Array CGH images were processed with NimbleGen
NimbleScan (v. 2.4) and viewed with NimbleGen Signal-
Map (v. 1.9.0.03).
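As an aid to reading the array CGH plots, the following minimal sketch computes the idealised log2 ratio expected for a given copy number against a two-copy reference. It is not part of the NimbleScan workflow used in this study; the function name `expected_log2_ratio` and the two-copy reference default are assumptions made for illustration, and measured values are further shaped by normalisation, tumour ploidy and noise.

```python
import math


def expected_log2_ratio(tumour_copies: float, reference_copies: float = 2.0) -> float:
    """Idealised log2(tumour/reference) signal for a given copy number.

    Assumption: the reference DNA carries two copies of the locus and the
    signal scales linearly with copy number.
    """
    if tumour_copies <= 0:
        # A homozygous loss leaves no tumour signal at all.
        return float("-inf")
    return math.log2(tumour_copies / reference_copies)


if __name__ == "__main__":
    for copies in (1, 2, 3, 4):
        print(copies, round(expected_log2_ratio(copies), 2))
```

Under these assumptions a single-copy loss in a two-copy region gives a ratio of -1 and a single-copy gain gives about 0.58; in aneuploid cell lines the baseline shifts accordingly.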
PCR amplification
Specific primers on either side of each break were designed
using the Primer3 (v. 0.4.0) web interface or the Invitrogen
OligoPerfect™ Designer software. All primers were analysed
using NCBI BLAST and IDT SciTools OligoAnalyzer
3.0 software. Two negative controls were included in each
PCR run, substituting tumour DNA with normal lymphocyte
genomic DNA and water. In case of multiple PCR products,
nested PCR was carried out using a second pair of primers.
The following primers were used:
HDC-133 nm: forward 5′-AAGCCAGAATGCCAGCTTAT-3′, reverse 5′-AGGAAAGCAAATGGGGTTTA-3′;
HDC-54: forward 1 5′-GGCAAATGGAATCAGTGGAT-3′, forward 2 5′-GAATCAATTAAACGAGGCTTGG-3′, reverse 1 5′-TTCTCCCCATCACTTTCAGG-3′, reverse 2 5′-TCCATTGCTGATACCCTTTCTT-3′;
NB-69: forward 5′-TGCTTGACATCCTTAGTCATGG-3′, reverse 5′-CTCTTTGTGGGGATCTCTCATC-3′;
HDC-114: forward 5′-CTGACTGGAACAAGCTAGTGGGT-3′, reverse 5′-GGTTTGAAGAGCTGAAATAGCAA-3′.
Purified PCR products were sequenced using
GATC (Konstanz, Germany) or Seqlab (Göttingen, Germany)
sequencing services.
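For readers who wish to sanity-check the listed primers, the sketch below computes length, GC content and a rough melting temperature for each sequence. The melting temperature uses the Wallace rule of thumb (2 °C per A/T and 4 °C per G/C), which is an assumption of this example and not the method used by Primer3 or OligoAnalyzer; the helper names `gc_percent` and `wallace_tm` are likewise illustrative.

```python
# Primer sequences as listed in the text (5' to 3').
PRIMERS = {
    "HDC-133 nm forward": "AAGCCAGAATGCCAGCTTAT",
    "HDC-133 nm reverse": "AGGAAAGCAAATGGGGTTTA",
    "HDC-54 forward 1": "GGCAAATGGAATCAGTGGAT",
    "HDC-54 forward 2": "GAATCAATTAAACGAGGCTTGG",
    "HDC-54 reverse 1": "TTCTCCCCATCACTTTCAGG",
    "HDC-54 reverse 2": "TCCATTGCTGATACCCTTTCTT",
    "NB-69 forward": "TGCTTGACATCCTTAGTCATGG",
    "NB-69 reverse": "CTCTTTGTGGGGATCTCTCATC",
    "HDC-114 forward": "CTGACTGGAACAAGCTAGTGGGT",
    "HDC-114 reverse": "GGTTTGAAGAGCTGAAATAGCAA",
}


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    gc = sum(seq.count(base) for base in "GC")
    return 100 * gc / len(seq)


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature in degrees C (Wallace rule)."""
    at = sum(seq.count(base) for base in "AT")
    gc = sum(seq.count(base) for base in "GC")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f} %, Tm ~{wallace_tm(seq)} C")
```

The Wallace rule is only a coarse guide for oligonucleotides of this length; nearest-neighbour models, as implemented in dedicated primer design tools, give more reliable estimates.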
Sequence analysis
DNA sequences of FRA2H and 500 kb of the adjacent non-
fragile regions were obtained from the UCSC Genome
Browser (NCBI build 37, hg19). The positions of break-
point junction sequences were identified using the BLAT
tool on the UCSC website. The interspersed repeat com-
position was identified using RepeatMasker (v.3.3.0). To
evaluate DNA flexibility, TwistFlex, a programme mea-
suring the potential variation in the twist angle between
consecutive base pairs, was used. TwistFlex analysis was
performed with default settings; windows with values
>13.7° were considered as flexibility peaks. The EMBOSS
Needle alignment was used to calculate the extent of
sequence homology between the sequences (1,000 bp)
flanking each breakpoint. Palindromic sequences larger
than 20 nt were identified using EMBOSS palindrome and
microsatellites with more than 20 repeat units using Tan-
dem Repeats Finder.
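The analysed interval can be written out explicitly. The sketch below takes the approximate FRA2H borders reported in the Results (186.72 to 187.25 Mb on chromosome 2, hg19) and adds 500 kb on either side; the rounded base-pair borders and the helper name `with_flanks` are assumptions of this example, since the exact coordinates submitted to each tool are not stated.

```python
# Approximate FRA2H borders in bp (hg19), rounded from the 186.72-187.25 Mb mapping.
FRA2H = (186_720_000, 187_250_000)
FLANK_BP = 500_000


def with_flanks(region, flank):
    """Return (centromeric flank, region, telomeric flank) as (start, end) tuples.

    On the long arm of chromosome 2 lower coordinates lie closer to the
    centromere, so the centromeric flank precedes the region.
    """
    start, end = region
    return (start - flank, start), (start, end), (end, end + flank)


centromeric, core, telomeric = with_flanks(FRA2H, FLANK_BP)
total_bp = telomeric[1] - centromeric[0]

if __name__ == "__main__":
    print("centromeric flank:", centromeric)
    print("FRA2H:", core)
    print("telomeric flank:", telomeric)
    print("total analysed:", total_bp / 1e6, "Mb")
```

With these rounded borders the three intervals together cover 1.53 Mb.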
Web resources
Database of Genomic Variants (DGV; last update: 02 Nov 2010), http://projects.tcag.ca/variation/;
1000 Genomes Project (release 9 September 2011), http://browser.1000genomes.org/;
Ensembl human genome database (release 63, June 2011), http://www.ensembl.org/;
Entrez NCBI human genome database, http://www.ncbi.nlm.nih.gov/gene/;
Primer3, http://frodo.wi.mit.edu/primer3/input.htm;
Invitrogen OligoPerfect™ Designer, http://tools.invitrogen.com/content.cfm?pageid=9716;
NCBI BLAST, http://www.ncbi.nlm.nih.gov/tools/primer-blast/;
IDT SciTools OligoAnalyzer, http://eu.idtdna.com/analyzer/applications/oligoanalyzer/;
UCSC Genome Browser, http://genome.ucsc.edu/;
RepeatMasker, http://www.repeatmasker.org/;
TwistFlex, http://margalit.huji.ac.il/TwistFlex/;
EMBOSS Needle, http://www.ebi.ac.uk/Tools/psa/emboss_needle/nucleotide.html;
EMBOSS palindrome, http://emboss.bioinformatics.nl/cgi-bin/emboss/palindrome;
Tandem Repeats Finder, http://tandem.bu.edu/trf/trf.html.
Results
FRA2H spans 533 kb of a gene-poor region at 2q32.1
According to the Entrez NCBI human genome database,
FRA2H has been cytogenetically assigned to chromosome
band 2q32.1, encompassing approximately 6.4 Mb of the
human genome. To determine the molecular location of
FRA2H at 100–150 kb resolution level, six-colour FISH
was carried out using 16 BAC probes. Mapping data are
summarised in Table 1 and illustrated in Fig. 1. Initially,
six differently labelled BAC probes (RP11-357A22, RP11-
262E6, RP11-1112I8, RP11-114I18, RP11-124A13 and
RP11-761I2) spaced at approximately 5 Mb, covering
about 25 Mb of 2q31.1–2q32.3, were hybridised to meta-
phase spreads of aphidicolin-treated lymphocytes from
eight healthy individuals (Fig. 1a). Of 821 investigated
metaphases, FRA2H breaks occurred in 78 cases. All
FRA2H breaks were observed between the signals of RP11-
114I18 and RP11-124A13, within a 4.4 Mb genomic region
extending from 185.5 to 189.9 Mb. To further approximate
the position of FRA2H, RP11-114I18, RP11-124A13 and
four additional BAC clones (RP11-334K17, RP11-16M14,
RP11-60J7, RP11-843D8), evenly spaced within the
defined region, were hybridised to metaphase spreads from
the same set of individuals (Fig. 1b). All FRA2H breaks (i.e.
42 breaks in 460 metaphases) were found to be located
between RP11-334K17 and RP11-16M14, defining FRA2H
to a 900 kb region (186.6–187.5 Mb). Finally, the molec-
ular location of FRA2H was fine-mapped using eight
contiguous BAC probes (RP11-334K17, RP11-400O18,
RP11-561J1, RP11-335G13, RP13-513D10, RP11-639N24,
RP11-625P14, RP11-16M14) spanning this region (Fig. 1c;
Table 1). In all investigated metaphases with FRA2H
breakage (i.e. 71 breaks in 732 metaphases), RP11-334K17
and RP11-400O18 only produced signals centromeric to the
break, while RP11-625P14 and RP11-16M14 signals were
only observed telomeric to the break. BAC clones were
considered to span the fragile region when their fluorescent
signal was detected on either side of a FRA2H break. Sig-
nals spanning FRA2H breaks were produced by RP11-
561J1 (11.3 %), RP11-335G13 (31.0 %), RP13-513D10
(53.5 %) and RP11-639N24 (4.2 %), defining the entire
FRA2H region to a size of 533 kb, encompassing the
genomic region from 186.72 to 187.25 Mb. To construct a
FRA2H physical and genetic map, the positions of the eight
BAC clones covering or bordering FRA2H were verified by
end sequencing. Alignment of the 533 kb sequence with the
UCSC human genome browser confirmed that FRA2H
maps to the 2q32.1 G-band and is located within a gene-
poor region. 84.5 % of all FRA2H breaks occurred within
RP11-335G13 and RP13-513D10, suggesting a core fragile
region of about 344 kb. Overall, our data indicate a high
sensitivity of this genomic region to replication stress, as
FRA2H breakage was detected in 9.5 % (i.e. 191 out of
2,013) of aphidicolin-treated lymphocyte metaphases.
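The percentages quoted in this section follow directly from the signal counts in Table 1 and the break counts of the three mapping rounds. The short sketch below reproduces them as a consistency check; the variable names are illustrative, and the core-region size is taken here as the distance from the start of RP11-335G13 to the end of RP13-513D10, which is an assumption about how the 344 kb figure was derived.

```python
# Breakpoint-spanning signals per BAC in the fine-mapping round (Table 1).
SPANNING_SIGNALS = {
    "RP11-561J1": 8,
    "RP11-335G13": 22,
    "RP13-513D10": 38,
    "RP11-639N24": 3,
}
# (FRA2H breaks, metaphases scored) for the three successive mapping rounds.
MAPPING_ROUNDS = [(78, 821), (42, 460), (71, 732)]


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_breaks = sum(SPANNING_SIGNALS.values())
spanning_pct = {bac: percent(n, total_breaks) for bac, n in SPANNING_SIGNALS.items()}

# Share of breaks within the two BACs that define the core fragile region.
core_pct = percent(
    SPANNING_SIGNALS["RP11-335G13"] + SPANNING_SIGNALS["RP13-513D10"], total_breaks
)
# Start of RP11-335G13 to end of RP13-513D10 (hg19 coordinates from Table 1).
core_size_bp = 187_125_789 - 186_781_768

pooled_breaks = sum(breaks for breaks, _ in MAPPING_ROUNDS)
pooled_metaphases = sum(metaphases for _, metaphases in MAPPING_ROUNDS)
pooled_pct = percent(pooled_breaks, pooled_metaphases)

if __name__ == "__main__":
    print(spanning_pct)
    print(core_pct, core_size_bp, pooled_breaks, pooled_metaphases, pooled_pct)
```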
FRA2H is the most active cFS on the long arm
of chromosome 2
The relative frequency of expression differs among cFSs in
the human genome, breakage at 3p14.2 (FRA3B) being
most commonly observed, followed by breakage at 16q23
(FRA16D), Xp22.3 (FRAXB) and 2q31–32 (FRA2H) (Glo-
ver et al. 1984; Mrasek et al. 2010). We aimed to determine
whether FRA2H, spanning the genomic region from 186.72
to 187.25 Mb, corresponds to the fourth most active cFS in
the human genome, which has previously been assigned to
2q32.1 by G-banding (Glover et al. 1984). Therefore, we
compared FRA2H fragility to five other known cFSs on the
long arm of chromosome 2 using FISH on aphidicolin-
treated lymphocytes from eight healthy individuals
(Table 2). BAC probes located near FRA2F (RP11-
1193F23), FRA2G (RP11-357A22), FRA2H (RP11-121I13),
FRA2I (RP11-388A17) and FRA2J (RP11-225M4) were
used for this purpose. Although expression patterns differed
slightly among individuals, FRA2H expression was
observed most frequently with breakage in 181 out of 2,595
metaphases overall. FRA2F with breaks in 40 metaphases
and a novel cFS at 2q12–14 with breaks in 33 metaphases
appear to be the second and third most fragile sites on
chromosome arm 2q, respectively. In comparison to
FRA2H, FRA2I (23 breaks), FRA2G (17 breaks) and FRA2J
(16 breaks) seem only moderately susceptible to aphidico-
lin-induced breakage. Overall, these data indicate that
FRA2H is the most active cFS on the long arm of chro-
mosome 2 and together with previous estimates (Mrasek
et al. 2010) may thus be considered to be the fourth most
active aphidicolin-induced cFS in the human genome.
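The ranking described above can be reproduced from the column totals of Table 2. The following sketch sorts the six 2q sites by break count and expresses each as a percentage of the 2,595 scored metaphases; the dictionary layout and names are illustrative only.

```python
# Total breaks per site across all eight individuals (Table 2, bottom row).
BREAKS_2Q = {
    "2q12-14": 33,
    "FRA2F": 40,
    "FRA2G": 17,
    "FRA2H": 181,
    "FRA2I": 23,
    "FRA2J": 16,
}
METAPHASES = 2595

# Sites ordered from most to least frequently broken.
ranked = sorted(BREAKS_2Q.items(), key=lambda item: item[1], reverse=True)
total_2q_breaks = sum(BREAKS_2Q.values())
frequency_pct = {site: round(100 * n / METAPHASES, 1) for site, n in BREAKS_2Q.items()}
fra2h_share_pct = round(100 * BREAKS_2Q["FRA2H"] / total_2q_breaks, 1)

if __name__ == "__main__":
    for site, n in ranked:
        print(f"{site}: {n} breaks ({frequency_pct[site]} % of metaphases)")
    print("FRA2H share of all 2q breaks:", fra2h_share_pct, "%")
```

On these totals FRA2H accounts for well over half of all breaks scored on 2q.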
FRA2H is enriched in LTR and L1 long interspersed
nuclear elements (LINE) sequences
To determine whether FRA2H harbours any particular
sequence repeat motifs that may account for its fragility,
Table 1 FISH mapping of FRA2H
                                                                Number of fluorescent signals
BAC clone    Chromosome 2 bp (GRCh37/hg19)  Known genes         C    S    T    %C    %S    %T
RP11-357A22  170,227,020–170,299,104        -                   191  0    0    100   0     0
RP11-262E6   174,894,479–175,057,417        -                   191  0    0    100   0     0
RP11-1112I8  179,875,332–180,030,900        -                   191  0    0    100   0     0
RP11-114I18  185,375,595–185,431,815        -                   191  0    0    100   0     0
RP11-334K17  186,448,899–186,614,447        FSIP2, AC007966.1   120  0    0    100   0     0
RP11-400O18  186,572,178–186,717,985        FSIP2, AC007966.1   71   0    0    100   0     0
RP11-561J1   186,652,393–186,839,345        FSIP2               63   8    0    88.7  11.3  0
RP11-335G13  186,781,768–186,961,978        AC097500.2          41   22   8    57.7  31.0  11.3
RP13-513D10  186,946,516–187,125,789        -                   3    38   30   4.2   53.5  42.3
RP11-639N24  187,108,314–187,264,292        -                   0    3    68   0     4.2   95.8
RP11-625P14  187,251,137–187,446,479        ZC3H15              0    0    71   0     0     100
RP11-16M14   187,447,002–187,639,629        FAM171B, ITGAV      0    0    120  0     0     100
RP11-60J7    188,007,222–188,177,239        -                   0    0    120  0     0     100
RP11-843D8   189,015,216–189,220,306        -                   0    0    120  0     0     100
RP11-124A13  189,942,867–190,096,644        -                   0    0    191  0     0     100
RP11-761I2   195,011,019–195,179,513        -                   0    0    191  0     0     100
C centromeric, S spanning the break, T telomeric; - no gene listed
we analysed the FRA2H sequence as well as 500 kb of the
non-fragile adjacent sequences with the RepeatMasker web
interface (Table 3; Supplementary Figure 1b, c). Sequen-
ces of FRA2H and its flanking regions were obtained from
the UCSC genome browser. With an AT content of 64.6 %,
the FRA2H sequence is considered to be enriched in AT
nucleotides, which are uniformly distributed along the
analysed sequence. The adjacent non-fragile sequences,
also located within the 2q32.1 G-band, have a similar AT
content. FRA2H is composed of 57.6 % interspersed
repeats, including 4.5 % ALU, 1.0 % MIR, 30.0 %
L1(LINE), 3.6 % L2(LINE), 14.5 % long terminal repeat
(LTR) and 3.3 % DNA elements. Percentages of ALU,
MIR, L2(LINE) and DNA elements are similar to those
estimated to represent the genomic fraction of autosomal
DNA with an AT value over 64 % (Smit 1999). In com-
parison to the genomic mean (20 %), L1(LINE) appear to
be overrepresented in FRA2H and the 500 kb region
proximal to FRA2H (30.0 and 28.1 %, respectively).
Moreover, the frequency of LTR elements appears to be
elevated with 14.5 % in FRA2H relative to an estimate of
6.8 % in the mean AT-rich fraction of the genome.
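The degree of over-representation can be expressed as a simple fold change of the FRA2H value over the genomic reference value for each repeat class. The sketch below uses the percentages reported in Table 3; treating the "Genome" column as the common reference for every class is an assumption of this example, and the names are illustrative.

```python
# (FRA2H %, genomic reference %) per repeat class, taken from Table 3.
REPEAT_CONTENT = {
    "ALUs": (4.5, 5.0),
    "MIRs": (1.0, 1.5),
    "L1(LINE)": (30.0, 20.0),
    "L2(LINE)": (3.6, 3.1),
    "LTR elements": (14.5, 6.8),
    "DNA elements": (3.3, 2.9),
}

# Fold change of FRA2H content relative to the genomic reference.
fold_change = {
    repeat: fra2h / genome for repeat, (fra2h, genome) in REPEAT_CONTENT.items()
}

if __name__ == "__main__":
    for repeat, fold in sorted(fold_change.items(), key=lambda item: item[1], reverse=True):
        print(f"{repeat}: {fold:.2f}-fold")
```

On these figures LTR elements show the largest relative enrichment (about 2.1-fold), followed by L1(LINE) at 1.5-fold, whereas the remaining classes stay close to the reference values.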
To examine whether FRA2H contains a higher number of
potentially highly flexible sequences relative to the non-
fragile adjacent regions, the sequences were analysed using
TwistFlex, a tool designed to predict the inherent flexibility
of a given sequence in relation to the DNA helix twist angle.
At a helix twist angle of 13.7°, with a value of 12.9, FRA2H
appears to have more flexibility peaks per 100 kb in com-
parison to the centromeric and telomeric flanking sequences
with values of 8.8 and 6.6, respectively (Table 3). FRA2H is
about 2.5-fold enriched in clusters of flexibility peaks,
defined as at least three flexibility peaks between which the
distance of any two adjacent peaks is ≤5 kb (Supplementary
Figure 1a). As the mean flexibility value for non-fragile
G-band sequences has been defined as 3.3 flexibility islands/
100 kb (Debacker et al. 2007), the number of flexibility
peaks is increased 3.9-fold in FRA2H and 2.7-fold in its
centromeric flanking sequence. At a twist angle of 16°,
however, the values do not appear to differ substantially. To
estimate the content of sequence motifs with the potential to
form unusual secondary structures, we identified the location
of palindromes (>20 nt), microsatellites (>20 repeat units)
and segmental duplications using EMBOSS palindrome and
Tandem Repeats Finder (Supplementary Figure 1a). The
FRA2H sequence contains about 2.8-fold more palindromes
than the adjacent genomic regions, but exhibits a similar
amount of microsatellite repeats. The analysis revealed
neither stretches of perfect AT repeats nor segmental
duplications within FRA2H.
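The cluster definition used above (at least three flexibility peaks with no more than 5 kb between adjacent peaks) and the fold values relative to the non-fragile G-band mean can be sketched as follows. This is not the TwistFlex implementation: the function `find_clusters` and the example peak positions are hypothetical, and only the peak densities (12.9, 8.8 and 3.3 per 100 kb) come from the text.

```python
def find_clusters(peaks, max_gap=5_000, min_peaks=3):
    """Group peak positions (bp) into clusters.

    A cluster is a run of at least `min_peaks` peaks in which every pair of
    adjacent peaks lies no more than `max_gap` bp apart.
    """
    clusters, current = [], []
    for position in sorted(peaks):
        # A gap larger than max_gap closes the current run.
        if current and position - current[-1] > max_gap:
            if len(current) >= min_peaks:
                clusters.append(current)
            current = []
        current.append(position)
    if len(current) >= min_peaks:
        clusters.append(current)
    return clusters


# Hypothetical peak positions, for illustration only.
example_peaks = [1_000, 4_000, 8_500, 20_000, 23_000, 40_000, 44_000, 47_000, 51_000]
example_clusters = find_clusters(example_peaks)

# Peak densities per 100 kb at 13.7 degrees, as reported in the text.
G_BAND_MEAN = 3.3
fold_fra2h = round(12.9 / G_BAND_MEAN, 1)
fold_centromeric = round(8.8 / G_BAND_MEAN, 1)

if __name__ == "__main__":
    print(example_clusters)
    print(fold_fra2h, fold_centromeric)
```

In the hypothetical example the two peaks at 20 and 23 kb are discarded because they do not reach the three-peak minimum, while the runs on either side qualify as clusters.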
Overall, the FRA2H region is AT-rich and appears to be
enriched in flexible sequences, like the majority of
molecularly characterised cFSs. Although computational
analysis showed that the entire 1.53 Mb region is AT-rich
and abundant in peaks of enhanced DNA flexibility,
FRA2H exhibits a much higher density of flexibility peaks
in comparison to the upstream and downstream sequences.
A notable feature of FRA2H is the elevated retrotransposon
LTR content, which is considerably higher than in the AT-
rich genomic mean and in other cFS sequences studied to
date (Savelyeva et al. 2006).
Table 2 Frequency of breakage at cFSs on the long arm of chromosome 2
Individual  2q12-14  FRA2F  FRA2G  FRA2H  FRA2I  FRA2J  Total 2q  No. of metaphases
I           20       19     7      61     12     5      124       888
II          5        6      3      46     4      4      68        577
III         3        5      0      28     1      3      40        319
IV          1        6      2      15     2      2      28        158
V           1        1      0      1      0      1      4         113
VI          2        2      3      18     3      1      29        304
VII         1        1      0      10     1      0      13        204
VIII        0        0      2      2      0      0      4         32
Total       33       40     17     181    23     16     310       2,595
Fig. 1 FISH mapping and genomic localisation of FRA2H. a An
ideogram of chromosome 2 and a detailed presentation of the
2q31–32 region including the genomic location of six differently
labelled BAC clones are shown at the top. DAPI (left) and six-colour
FISH (right) images represent chromosomes 2 with aphidicolin-
induced FRA2H breaks from metaphase spreads of four individuals.
All breaks occurred between BAC clones RP11-114I18 and RP11-
124A13 (dotted red frame). b The genomic positions of six BAC
probes within the 185–190 Mb region selected to further approach
FRA2H are depicted at the top. A panel of chromosome 2 images with
FRA2H breaks is shown. Each chromosome 2 is presented three
times (left: DAPI; middle: FISH with RP11-114I18 and RP11-60J7;
right: FISH with RP11-334K17 and RP11-16M14). All breaks
occurred between RP11-334K17 and RP11-16M14 signals (dotted red frame). c Fine-mapping of FRA2H using contiguous BAC probes
spanning the region between RP11-334K17 and RP11-16M14. In all
investigated metaphases with FRA2H breakage RP11-334K17 and
RP11-400O18 produced signals centromeric to the break, while
RP11-625P14 and RP11-16M14 only produced signals telomeric to
the break. Breakpoint-spanning signals included those from RP11-
561J1, RP11-335G13, RP13-513D10 and RP11-639N24. The fre-
quency (%) of observed breakpoint-spanning signals is indicated for
each BAC below the FISH images. The genomic positions of the
contiguous BAC probes verified by end sequencing are displayed in a
physical map of the FRA2H region (borders indicated by red frame).
The genomic map shows that the AC097500.2 lincRNA gene is the
only known gene located within FRA2H. Known coding genes are
presented in black. FISH signals and bar colours on the physical and
genomic map correspond to fluorescent labels: DEAC dUTP (purple),
FITC dUTP (green), Cy3 dUTP (red), Cy3.5 dUTP (pink), Cy5 dUTP
(yellow), Cy5.5 dUTP (blue)
FRA2H is affected by intrachromosomal
rearrangements in human cancer cell lines
To determine whether FRA2H is unstable in cancer cells, 26
colorectal carcinoma, 17 neuroblastoma and 11 breast can-
cer cell lines (Supplementary Table 1) were subjected to
array CGH analysis (Fig. 2a). Custom-designed oligonu-
cleotide CGH arrays included a 1.1 Mb genomic region
(chr2: 186,290,000–187,400,000) encompassing the entire
FRA2H region and approximately 400 kb of its centromeric
and 150 kb of its telomeric flanking sequence. Array CGH
detected a number of apparently tumour-related rearrange-
ments at FRA2H in 7.4 % of tumour cell lines (i.e. in 4 out of
54). Rearrangements were most frequently observed in
colorectal cancer cell lines (i.e. in 11.5 %; 3 out of 26
samples), comprising a 300.5 kb loss in HDC-133 nm, a
133.5 kb loss in HDC-54 and an 82.5 kb loss in SW-620 sub1
cells. A gain of genomic material, affecting the entire
FRA2H sequence, was found in the NB-69 neuroblastoma
cell line. No rearrangements were detected in any of the
investigated breast cancer cell lines. To validate copy
number changes and to assess the chromosomal architecture
of the observed FRA2H rearrangements, we performed
FISH (Fig. 2b) using BAC and fosmid DNA probes (Sup-
plementary Table 2) located within or in close proximity to
each copy number alteration detected by array CGH and
mFISH (Fig. 2c). In HDC-133 nm cells, mFISH revealed
three apparently normal copies of chromosome 2, on all of
which RP11-561J1 and RP11-639N24 produced hybridisa-
tion signals, while RP11-335G13 and RP13-513D10 signals
were only visible on two out of three chromosomes, sug-
gesting an interstitial deletion. HDC-54 cells contain a
structurally normal chromosome 2 displaying three specific
BAC signals (RP11-334K17, RP11-400O18 and RP11-
335G13) and a derivative chromosome 2 homologue
showing RP11-334K17 and RP11-335G13 signals only. The
missing RP11-400O18 signal on the derivative chromosome
2 implies an interstitial deletion. SW-620 sub1 cells have a
set of four chromosomes 2, which appear to be structurally
normal at the resolution of mFISH. However, FISH using
RP11-334K17, G248P81916F8 and RP11-291P02 probes
revealed that two of the chromosome 2 homologues harbour
an interstitial deletion at FRA2H as G248P81916F8 signals
were not visible on either of them. Thus, heterozygous
interstitial deletions at FRA2H were determined in all three
colorectal carcinoma cell lines with genomic losses. To
Table 3 DNA sequence analysis of FRA2H and flanking regions

                                        Non-fragile flanking
                                        regions (500 kb)
                              FRA2H     Centromeric  Telomeric   Genome
Flexibility peaks/100 kb
  13.7°                       12.9      8.8          6.6
  16°                         1.3       3.0          0.6
DNA repeat composition (%)
  GC content                  35.4      34.6         37.0        <36.0
  ALUs                        4.5       4.4          7.4         5.0
  MIRs                        1.0       0.7          2.1         1.5
  L1(LINE)                    30.0      28.1         21.2        20.0
  L2(LINE)                    3.6       2.1          4.2         3.1
  LTR elements                14.5      12.3         9.7         6.8
  DNA elements                3.3       5.8          4.2         2.9
  Total interspersed repeats  57.6      53.4         49.5        39.4
Fig. 2 Copy number alterations at FRA2H in human cancer cell
lines. a Array CGH revealed losses of genetic material in HDC-
133 nm, HDC-54 and SW-620 sub1 cell lines and a gain in the NB-69
cell line. Dotted lines indicate the borders of FRA2H. The genomic
coordinates of oligomers are shown on the x axis, the log2 (ratio of
cancer cell line to reference DNA) on the y axis. Arrowheads on array
plots indicate the genomic position of BAC clones used to validate
array data. b FISH validation of DNA copy number alterations. In
HDC-133 nm cells, there are two normal chromosomes 2 with four
hybridisation signals and a rearranged chromosome with signals of
RP11-561J1 (purple) and RP11-639N24 (green). HDC-54 cells have
a normal chromosome 2 with three hybridisation signals and an
aberrant chromosome 2 lacking the RP11-400O18 signal (red). SW-
620 sub1 cells have four chromosomes 2, two of which do not display
G248P81916F8 signals (red). In NB-69 cells, the abnormal chromo-
some 2 displays single copy RP11-334K17 and RP11-475D24 signals
flanking the duplication. RP11-625P14 and RP11-707M15 signals are
duplicated and their orientation suggests a direct tandem duplication.
Normal chromosomes 2 are marked by white arrows; aberrant
chromosomes 2 by red arrows. c The intrachromosomal nature of
FRA2H rearrangements is confirmed by mFISH. Chromosome 2
material (purple) is marked by white arrows. Chromosome 2
translocations in HDC-54 and NB-69 cells do not involve the FRA2H
locus. d Alignment of rearrangements to the transcriptional map of
FRA2H. FSIP2 and AC008174.3 at the centromeric border of FRA2H
are deleted in HDC-54 cells. AC097500.2 is heterozygously lost in
HDC-133 nm cells. The deletion in SW-620 sub1 cells does not affect
any known genes. In NB-69 cells, the large duplication covers the
entire FRA2H region. Known coding genes are presented in black,
rearranged regions in either green (gain) or red (loss)
estimate the size of the gain in the NB-69 neuroblastoma cell
line, we monitored its length across the more telomeric
portion of 2q, extending from 186.65 to 206.24 Mb. To
validate NB-69 array CGH data, we performed mFISH and
FISH using four BAC probes (RP11-334K17, RP11-
625P14, RP11-707M15 and RP11-475D24) covering the
entire length of the 19.6 Mb gain. FISH revealed a direct
tandem duplication within a large derivative chromosome
composed only of chromosome 2 material. An additional
portion of chromosome 2 was revealed by mFISH, located
on a derivative chromosome 11 with several translocations
including part of the short arm of chromosome 2 (validated
using fosmid clone G248P89404A5; data not shown). To
test whether FRA2H is involved in balanced rearrangements
undetectable by array-CGH, FISH using BAC clones RP11-
334K17 and RP11-16M14 was performed on all 54 cell
lines. Neither translocations nor inversions appeared to
affect the FRA2H region in any of the tested cancer cell
lines. Hence, all detected alterations represented simple
intrachromosomal deletions or, in one case, a duplication, each
with one or two breaks within or in close proximity to
FRA2H.
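The percentages quoted above follow directly from the cell line counts. As a minimal sketch (the variable and function names are illustrative and not part of the study), the arithmetic can be reproduced as follows:

```python
# Cell lines analysed and cell lines with a copy number alteration at
# FRA2H, per tumour type, as reported in the text above.
analysed = {"colorectal": 26, "neuroblastoma": 17, "breast": 11}
altered = {"colorectal": 3, "neuroblastoma": 1, "breast": 0}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Frequency of FRA2H alterations within each tumour type.
per_type = {tumour: percent(altered[tumour], analysed[tumour]) for tumour in analysed}

# Frequency across the whole panel of 54 cell lines.
overall = percent(sum(altered.values()), sum(analysed.values()))

for tumour, value in per_type.items():
    print(f"{tumour}: {altered[tumour]}/{analysed[tumour]} = {value} %")
print(f"overall: {sum(altered.values())}/{sum(analysed.values())} = {overall} %")
```

With such small counts, the per-type percentages are descriptive only and should not be read as a statistical comparison between tumour types.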
To determine whether genomic rearrangements at
FRA2H are associated with damage of known genetic
elements, we aligned the detected chromosomal rear-
rangements with a transcriptional map of the 1.1 Mb
genomic region analysed by array CGH (Fig. 2d). The
transcriptional map shows that AC097500.2, a novel large
intergenic non-coding RNA (lincRNA) gene, is the only
known gene located within FRA2H. Fibrous sheath inter-
acting protein 2 (FSIP2), situated at the centromeric border
of FRA2H, is the only known protein-coding gene located
in close proximity to FRA2H. Another protein-coding gene
in the analysed region, zinc finger CCCH-type containing
15 (ZC3H15), lies approximately 650 kb telomeric to
FSIP2, at a distance of 100 kb from the telomeric border of
FRA2H. Apart from AC097500.2, FSIP2 and ZC3H15,
another lincRNA gene (AC007966.1) and a novel pro-
cessed transcript (AC008174.3) map within the analysed
region. The 300.5 kb deletion found in HDC-133 nm cells
appears to affect AC097500.2, while the 133.5 kb deletion
in HDC-54 cells leads to heterozygous loss of both FSIP2
and AC008174.3, as well as to partial loss of lincRNA
AC007966.1 sequences. The 82.5 kb deletion within the
centromeric portion of FRA2H in SW-620 sub1 cells does
not involve any known genes, in contrast to the large
19.6 Mb duplication in NB-69 cells, which may affect
expression of more than 80 protein-coding genes. In con-
clusion, of the investigated tumour types, colorectal cancer
appears to be most prone to FRA2H DNA copy number
alterations due to intrachromosomal recombination events.
Genomic rearrangements at FRA2H occur mainly
within L1(LINE) and LTR elements
Despite the involvement of cFSs in cancer-related chro-
mosomal recombination being well documented, there are
limited data regarding the impact of particular DNA
sequences on break formation at these regions. Taking
advantage of the 60 bp probe density on our CGH arrays,
we attempted to resolve the rearrangements found in
tumour cell lines at sequence level. To obtain the break-
point junctions, PCR was carried out on genomic DNA
from four tumour cell lines harbouring FRA2H rearrange-
ments using specific primers located on either side of each
breakpoint. PCR products and sequence data are shown in
Fig. 3. In HDC-133 nm cells, the 300,526 bp deleted
region lies within FRA2H, extending from 186,849,907 to
187,150,432 bp. Alignment with the genomic sequence
revealed a cytosine nucleotide insertion and no microho-
mology at the deletion junction. HDC-54 cells have a
133,465 bp deletion from 186,596,799 to 186,730,263 bp
of chromosome 2. The comparison of the joined DNA
fragment with the reference genomic sequence demon-
strated a 7 bp microhomology at the junction. We failed to
amplify the breakpoint junction in SW-620 sub1 cells,
owing to long stretches of repetitive elements on either side
of the deletion. In NB-69 cells, using outward facing
primers, we successfully sequenced the 19,579,975 bp
duplication extending from 186,656,022 to 206,235,996 bp
of chromosome 2, confirming a direct tandem orientation.
The duplication junction results from simple end-to-end
joining of DNA segments and is flanked by 6 bp of
microhomology. To find common DNA sequence motifs at
sites of chromosomal recombination at FRA2H, we ana-
lysed 500 bp of the upstream and downstream sequences
flanking each breakpoint. Overall, with a mean of 62.7 %,
the analysed breakpoint sequences have an AT-content
similar to that of the entire FRA2H sequence, ranging from 51.9 %
in the sequence centromeric to the breakpoint in HDC-54
to 71 % in the sequence telomeric to the breakpoint in
HDC-133 nm cells. An examination of sequence homology
between centromeric and telomeric breakpoints using
EMBOSS Needle did not reveal a significant sequence
identity (i.e. mean value of 39.6 %) in any of the analysed
sequence pairs. A DNA repeat composition analysis using
RepeatMasker showed that both centromeric breakpoints of
the deletions in HDC-133 nm and HDC-54 cells are loca-
ted within LTRs, belonging to the ERVL–MaLR family,
and that both telomeric breakpoints lie within the L1
family of LINE repeats. Our 60 bp-resolution array CGH
data show that the centromeric and telomeric breakpoints
of the FRA2H deletion in SW-620 sub1 cells lie
within ERV1 (chr2: 187,194,563–187,195,110) and ERVL
(chr2: 187,274,735–187,277,066) LTRs, respectively. In
contrast to the sequences at deletion breakpoints, both
duplication ends in NB-69 cells align to unique genomic
regions. To investigate whether the regions of high DNA
flexibility identified within FRA2H coincide with the
sequences encompassing breakpoints, we analysed them
using TwistFlex. The only flexibility peak was found
within the telomeric breakpoint of HDC-133 nm, partially
overlapping with a (TCTA)19 simple repeat spanning from
187,150,581 to 187,150,653 bp.
According to the DGV (Supplementary Table 3) and the
1000 Genomes Project, several germline CNVs have been
identified within FRA2H in the healthy population. In this
study, we have detected two novel germline CNVs in close
proximity to FRA2H (Supplementary Figure 2). Since
deletions identical at array-resolution level were observed
twice at 186,337,009–186,339,129 bp in CHLA-90 and
SK-N-FI, as well as at 187,296,433–187,302,233 bp in
HDC-90 and GI-ME-N cells among a total of 52 geneti-
cally distinct samples, we assume these variations to be of
germline origin, uncovering novel CNVs. Moreover, we
detected and sequenced a breakpoint junction of a
submicroscopic deletion of 8,067 bp (chr2: 186,740,882–
186,748,948) located within FRA2H in the HDC-114
colorectal carcinoma cell line (Fig. 3). Whether this dele-
tion represents a de novo tumour-associated chromosomal
alteration or is a relatively rare benign CNV listed in the
DGV (chr2: 186,740,927–186,747,368; Variation_63258
and Variation_90177), remains uncertain as further
sequence information is not provided in the database.
Sequence analysis did not show any insertions of novel
nucleotides at the junction but revealed a 4 bp
Fig. 3 Sequence analysis of the FRA2H breakpoint junctions. Gel
electrophoresis of breakpoint junction PCR products is shown on the
left as follows: 100 bp DNA ladder, specific tumour DNA product,
lymphocyte DNA from a healthy male (Ly) and water (H2O). PCR
product sizes range from 500 bp in HDC-114 to >1 kb in HDC-54. A
schematic representation of the rearranged regions is shown on the
right. DNA sequencing confirmed a 300.5 kb deletion in HDC-
133 nm and revealed a cytosine nucleotide insertion at the junction
(i). In HDC-54 cells, a 7 bp microhomology was detected at the
breakpoint junction of a 133.5 kb deletion. In the NB-69 cells, the
duplication junction is flanked by 6 bp of microhomology and results
from simple end-to-end joining of DNA segments. A reference
ideogram shows the original position of the duplicated regions
highlighted in pink (FRA2H sequence) and green. The 8.1 kb deletion
in HDC-114 cells located in a region listed as a CNV locus displays a
4 bp microhomology in immediate vicinity to the centromeric
breakpoint. Capital letters indicate a present sequence; lower case
letters a deleted sequence. Purple boxes highlight microhomologies.
Grey bars indicate the presence of LTR or L1(LINE) at breakpoints
microhomology in immediate vicinity to the centromeric
breakpoint. Both deletion boundaries map to regions of
L1(LINE) repeats. Thus, all eight endpoints of the
observed FRA2H deletions lie within L1(LINE) or LTR
elements, corresponding to the high abundance of these
repeats in the entire FRA2H sequence. The lack of exten-
ded homology and the presence of 4–7 bp microhomolo-
gies at the majority of junctions suggest the involvement of
non-homologous repair pathways mediated by short
microhomologies.
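Microhomology at a deletion junction can be read off the reference sequence: it is the stretch of bases that is identical at both breakpoints, so that the junction could be placed anywhere within it without changing the resulting sequence. The sketch below illustrates this idea on a toy sequence. The function name and the example string are hypothetical, and this is not the procedure used to produce the results above.

```python
def junction_microhomology(ref, del_start, del_end):
    """Return the microhomology at a deletion junction.

    `ref` is the reference sequence and ref[del_start:del_end] is the
    deleted segment (0-based, end-exclusive). The microhomology is the
    run of bases shared by both breakpoints, found by sliding the
    deletion left and right while the junction sequence stays the same.
    """
    n = len(ref)
    if not (0 <= del_start < del_end <= n):
        raise ValueError("invalid deletion coordinates")

    # Bases at the start of the deleted segment that match the bases
    # immediately after its end.
    right = 0
    while del_end + right < n and ref[del_start + right] == ref[del_end + right]:
        right += 1

    # Bases just before the deleted segment that match the bases at
    # the end of the deleted segment.
    left = 0
    while del_start - left - 1 >= 0 and ref[del_start - left - 1] == ref[del_end - left - 1]:
        left += 1

    return ref[del_start - left:del_start + right]


# Toy reference: two copies of ACGT separated by a poly-T run.
toy_ref = "GGATC" + "ACGT" + "TTTTT" + "ACGT" + "CCAGG"
print(junction_microhomology(toy_ref, 5, 14))
```

Because the deletion can be shifted within the microhomology, the same result is obtained whichever equivalent pair of coordinates is supplied.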
Discussion
In this study, we fine-mapped FRA2H delimiting the
boundaries between fragile and non-fragile genomic
regions at a BAC clone resolution of ~100 kb using six-
colour FISH. We show that FRA2H maps to a gene-poor
chromosomal area and is one of the most active cFSs in the
human genome. High-density array CGH designed for
monitoring copy number changes at cFSs enabled us to
define large and submicroscopic DNA alterations at
FRA2H in cancer cells. Sequence analysis of the observed
breakpoint junctions revealed that genomic rearrangements
at FRA2H appear to result predominantly from non-
homologous recombination involving LTR and L1(LINE)
repeats.
Until now, the approximate genomic location of about
30 cFSs has been identified at megabase resolution level
(Durkin and Glover 2007). Recently, FRA2H has been
approximated to a region at chromosome bands 2q32.1 and
2q32.2 spanning about 8 Mb (Pelliccia et al. 2010). How-
ever, based on the analysis of more than 190 breaks, we
show that FRA2H is actually limited to a genomic region of
about 530 kb at 2q32.1.
All molecularly characterised cFSs span genomic
regions containing protein-coding genes, and the majority
of them are associated with extremely large genes extending
over hundreds of kilobases of genomic DNA (Smith et al.
2007). Unlike other cFSs, FRA2H encompasses a gene-poor
region, which according to the latest version of the Ensembl
genome database does not contain any protein-coding
genes. The newly annotated AC097500.2 lincRNA gene is
the only known gene residing within FRA2H. Growing
evidence suggests an important role of lincRNAs in diverse
cellular processes including regulation of gene expression
and epigenetic marks in disease pathways (Gibb et al. 2011;
Guttman et al. 2011).
The relative frequency of breakage differs among indi-
vidual cFSs, as breaks at the 20 most fragile cFSs account
for more than 80 % of all lesions observed in lymphocytes
after aphidicolin treatment (Glover et al. 1984). A recent
genome-wide screen quantifying cFS expression
frequencies revealed a broad spectrum of sensitivity to
induced replication stress. FRA2H at 2q32.1 exhibited
3.9 % of breaks, thus belonging to the four most commonly
expressed cFSs in lymphocytes, preceded by FRA3B at
3p14.2 (14.2 %), FRA16D at 16q23 (7.6 %), and FRAXB at
Xp22.3 (5.5 %) (Mrasek et al. 2010). Comparing expres-
sion patterns of cFSs on the long arm of chromosome 2
using six-colour FISH, we were able to confirm that, fol-
lowing aphidicolin induction, the FRA2H genomic region
determined in this study is by far the most commonly
observed cFS on this chromosome, accounting for about
60 % of 2q breaks. Combined with previous estimates at
G-banding level, FRA2H may be considered to be the
fourth most frequently expressed cFS in human lympho-
cytes overall.
Instability at cFSs is thought to contribute to cancer
development or progression, as chromosomal breakpoint
loci found in different human cancers appear to coincide
with known cFS loci. Recently, a large-scale screen of 746
cancer cell lines demonstrated a significant amount of
homozygous deletion clusters occurring within cFS regions
(Bignell et al. 2010). As has been demonstrated at
FRA3B, replication stress can lead to chromosomal rear-
rangements resembling somatic copy number alterations in
cancer cells (Durkin et al. 2008).
Despite the high frequency of FRA2H breakage in
aphidicolin-treated lymphocytes, we observed copy num-
ber alterations in only 4 out of 54 cancer cell lines (7.4 %).
This is a rather moderate level of recombination compared
to other active cFSs, such as FRA3B, FRA16D and FRA6E,
displaying breakage at much higher frequencies (Mitsui
et al. 2010; Lewandowska et al. 2009; Saldivar et al. 2010).
An explanation may be that FRA2H expression is high in
lymphocytes, but low in the cell types from which the
investigated tumours originated. Tissue-specific cFS
expression patterns have not yet been explored; however,
as differences in breakage susceptibility of some cFSs have
been observed between lymphocytes and fibroblasts, tis-
sue-dependent variation is to be expected (Letessier et al.
2011; Debatisse et al. 2012).
Although few genomic alterations were detected at
FRA2H, our data indicate that the frequency of FRA2H
recombination differs among tumour types. Rearrange-
ments were most frequently detected in colorectal carci-
noma cell lines. All copy number alterations found in colon
carcinoma cells were large intrachromosomal deletions
ranging from 82.5 to 300.5 kb, thus representing a major
type of genomic recombination that has also been observed
at other cFS loci (Arlt et al. 2006). FSIP2, encoding a
protein involved in fibrous sheath formation (Brown et al.
2003), is the only protein-coding gene located within the
deleted sequences. Two other genes affected by rear-
rangements, AC007966.1 and AC097500.2, are lincRNA
genes, whose function is unknown. The heterozygous sta-
tus of the deletions indicates that the affected genes retain
their function; however, the effect of haploinsufficiency
cannot be excluded.
The molecular nature of fragility at cFSs is largely
unknown, but the hypothesis that the DNA sequence itself
is a primary cause of instability at these genomic loci has
been supported by experimental data from Ragland et al.
(2008). Alternatively, epigenetic factors have been pro-
posed to play a prevalent role in FRA3B fragility (Letessier
et al. 2011), suggesting that the instability at cFSs results
from a combination of late replication and paucity of ini-
tiation events. Recently, an analysis of the replication
dynamics along FRA16C suggested that both DNA
sequences perturbing fork progression and inherent origin
paucity underlie an increased susceptibility to breakage at
cFS loci (Ozeri-Galai et al. 2011). In contrast to rare fragile
sites, cFSs are not associated with any expanded repeats
that would account for their fragility. The only DNA fea-
tures they share are a relatively high content of AT-rich
sequences and enrichment in flexibility peaks. These traits
appear to be characteristic of not only FRA2H but also the
entire 1.53 Mb region analysed in this study. However,
there appear to be local variation biases towards a poten-
tially higher flexibility at FRA2H relative to the non-fragile
adjacent sequences. The FRA2H sequence is primarily
abundant in L1(LINE) and LTR elements. A relative
L1(LINE) repeat enrichment of over 30 % has also been
observed in two other cFSs, FRA13A and FRA9G. The
most notable difference to the corresponding AT-rich
fraction of the genome is the high number of different LTR
retrotransposon sequences within FRA2H. Such elevated
LTR content is reminiscent of another cFS, FRAXB, supporting
several reports that cFSs are preferred sites for viral inte-
gration (Ferber et al. 2003). A higher density of L1(LINE)
and LTR elements is also an intrinsic feature of the adja-
cent non-fragile sequences, in particular the centromeric
region, indicating that deviations from the mean are a
feature of a large genomic region including FRA2H rather
than the FRA2H fragile sequence alone. Several sequence
motifs potentially capable of impairing replication fork
progression and increasing chromosome fragility have been
considered to contribute to fragility in cFSs. For instance,
long stretches of perfect AT repeats, which have been suggested
to cause FRA16D fragility (Zhang and Freudenreich 2007),
were not found within FRA2H.
One way to determine the genetic components within
cFSs responsible for their instability is to identify the
breakpoint junctions of chromosomal rearrangements at the
single nucleotide level. Sequences of cFS breakpoint
junctions were first obtained from aphidicolin-induced
FRA3B deletions (Durkin et al. 2008). Recently, a com-
prehensive sequence analysis of approximately 500 breaks
within the PARK2 (FRA6E) and DMD (FRAXC) cFS genes
in cancer cells and germlines showed breakpoint clustering
in specific genomic regions predominantly involving
microhomologies (Mitsui et al. 2010). In line with these
observations, our sequence data did not reveal extended
homology but 4–7 bp microhomologies in three out of four
breakpoint junctions at FRA2H.
Two main mechanisms, homologous and non-homolo-
gous recombination, have been proposed to be involved in
generating structural chromosomal rearrangements (Has-
tings et al. 2009). Involvement of non-allelic homologous
recombination (NAHR) in the formation of the observed
rearrangements is unlikely, as FRA2H breakpoint junctions
lack extended sequence homology. Non-homologous
recombination mechanisms include non-homologous end
joining (NHEJ), microhomology-mediated end joining
(MMEJ), as well as replicative pathways as described by the
fork stalling and template switching (FoSTeS) (Lee et al.
2007; McVey and Lee 2008) or the microhomology-medi-
ated break-induced replication (MMBIR) model (Hastings
et al. 2009). The deletion found in HDC-133 nm cells may
be a consequence of classical NHEJ, as this mechanism does
not require homologies to join DNA double-strand breaks
and often results in addition of single nucleotides. Micro-
homologies at the breakpoint junctions of the FRA2H
deletions in HDC-54 and HDC-114 cells and the duplication
in NB-69 cells indicate involvement of MMEJ or
MMBIR mechanisms, which appear to require 5–25 nucle-
otide microhomologies for effective repair (Gu et al. 2008;
Hastings et al. 2009). Combined with other studies discussed
above, our data suggest that non-homologous recombination
pathways mediated by short microhomologies play a major
role in generating and repairing cFS lesions.
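The reasoning in this paragraph can be summarised as a simple rule of thumb based on junction features. The sketch below is an illustrative simplification only: the function name, the labels and the decision order are assumptions made for this example, and junction sequence alone does not prove which repair pathway was used.

```python
# Microhomology lengths (bp) at the sequenced junctions, as reported in
# the Results. HDC-133 nm showed no microhomology and a single inserted
# cytosine.
junctions = {
    "HDC-133 nm": 0,
    "HDC-54": 7,
    "NB-69": 6,
    "HDC-114": 4,
}


def suggest_mechanism(microhomology_bp, extended_homology=False):
    """Heuristic label for a junction (illustrative, not a diagnostic test)."""
    if extended_homology:
        # Long stretches of sequence identity point to NAHR.
        return "NAHR"
    if microhomology_bp > 0:
        # Short shared bases are compatible with MMEJ or replicative
        # models such as FoSTeS/MMBIR.
        return "MMEJ or MMBIR"
    # Blunt joins, often with small insertions, fit classical NHEJ.
    return "classical NHEJ"


labels = {cell_line: suggest_mechanism(bp) for cell_line, bp in junctions.items()}
for cell_line, label in labels.items():
    print(f"{cell_line}: {label}")
```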
A FRA2H breakpoint sequence analysis showed that all
detected deletions involve recombination between different
LTR families and L1(LINE). Several repetitive elements,
including L1(LINE) and LTR elements, are thought to
promote secondary structure formation, potentially leading
to replication fork stalling or collapse (Labib and Hodgson
2007). Thus, L1(LINE) repeats have been proposed to
constitute the molecular basis of FRA3B instability (Mimori
et al. 1999; Inoue et al. 1997). Moreover, involvement of
LTR retrotransposons in generating chromosomal rear-
rangements has been frequently observed in yeast genetic
systems (Szilard et al. 2010; Admire et al. 2006; Lemoine
et al. 2005). As all the detected FRA2H deletion endpoints
map within L1(LINE) or LTR elements, it is possible that
the high density of L1(LINE) and LTR at FRA2H favours
such recombination events.
Somatic cancer-related alterations and germline CNVs
are thought to have a common molecular basis, as similar
sequence features have been observed and replication
stress plays a role in the formation of both (Arlt et al.
2011). Whether non-homologous recombination events
across LTR and L1 elements are also involved in the for-
mation of germline CNVs at FRA2H remains uncertain as,
to our knowledge, none of the CNVs annotated in this
region have been assessed at the single nucleotide level.
Identifying DNA sequences directly involved in somatic
chromosomal rearrangements and CNVs in cFSs may
provide important clues for understanding the molecular
basis of cFS fragility. Such DNA alterations may serve as
potential biomarkers for early cancer diagnostics and as
targets for individualised therapies.
Acknowledgments The authors would like to thank Dr. Johannes
Gebert and Dr. Diego Arango del Corro for providing the SW-948
and SW-620 colorectal cancer cell lines and Dr. Wilhelm Dirks for
verifying cell line identity by DNA typing. This work was supported
by the Helmholtz-Russia Joint Research Groups (HRJRG-006) and
the Bundesministerium für Bildung und Forschung Neuroblastom Genom Forschungs Netzwerk (BMBF NGFNPlus #01GS0896).
Conflict of interest The authors declare that they have no conflict
of interest.
References
Admire A, Shanks L, Danzl N, Wang M, Weier U, Stevens W, Hunt
E, Weinert T (2006) Cycles of chromosome instability are
associated with a fragile site and are increased by defects in
DNA replication and checkpoint controls in yeast. Genes Dev
20(2):159–173
Arlt MF, Glover TW (2010) Inhibition of topoisomerase I prevents
chromosome breakage at common fragile sites. DNA Repair
(Amst) 9(6):678–689
Arlt MF, Miller DE, Beer DG, Glover TW (2002) Molecular
characterization of FRAXB and comparative common fragile
site instability in cancer cells. Genes Chromosom Cancer
33(1):82–92
Arlt MF, Durkin SG, Ragland RL, Glover TW (2006) Common
fragile sites as targets for chromosome rearrangements. DNA
Repair (Amst) 5(9–10):1126–1135
Arlt MF, Ozdemir AC, Birkeland SR, Lyons RH Jr, Glover TW,
Wilson TE (2011) Comparison of constitutional and replication
stress-induced genome structural variation by SNP array and
mate-pair sequencing. Genetics 187(3):675–683
Bignell GR, Greenman CD, Davies H, Butler AP, Edkins S, Andrews
JM, Buck G, Chen L, Beare D, Latimer C, Widaa S, Hinton J,
Fahey C, Fu B, Swamy S, Dalgliesh GL, Teh BT, Deloukas P,
Yang F, Campbell PJ, Futreal PA, Stratton MR (2010) Signa-
tures of mutation and selection in the cancer genome. Nature
463(7283):893–898
Blumrich A, Zapatka M, Brueckner LM, Zheglo D, Schwab M,
Savelyeva L (2011) The FRA2C common fragile site maps to the
borders of MYCN amplicons in neuroblastoma and is associated
with gross chromosomal rearrangements in different cancers.
Hum Mol Genet 20(8):1488–1501
Brown PR, Miki K, Harper DB, Eddy EM (2003) A-kinase anchoring
protein 4 binding proteins in the fibrous sheath of the sperm
flagellum. Biol Reprod 68(6):2241–2248
Bruderlein S, van der Bosch K, Schlag P, Schwab M (1990)
Cytogenetics and DNA amplification in colorectal cancers.
Genes Chromosom Cancer 2(1):63–70
Burrow AA, Marullo A, Holder LR, Wang YH (2010) Secondary
structure formation and DNA instability at fragile site FRA16B.
Nucleic Acids Res 38(9):2865–2877
Casper AM, Nghiem P, Arlt MF, Glover TW (2002) ATR regulates
fragile site stability. Cell 111(6):779–789
Cicek MS, Slager SL, Achenbach SJ, French AJ, Blair HE, Fink SR,
Foster NR, Kabat BF, Halling KC, Cunningham JM, Cerhan JR,
Jenkins RB, Boardman LA, Petersen GM, Sargent DJ, Alberts
SR, Limburg PJ, Thibodeau SN (2009) Functional and clinical
significance of variants localized to 8q24 in colon cancer. Cancer
Epidemiol Biomarkers Prev 18(9):2492–2500
Debacker K, Winnepenninckx B, Ben-Porat N, FitzPatrick D, Van
Luijk R, Scheers S, Kerem B, Frank Kooy R (2007) FRA18C: a
new aphidicolin-inducible fragile site on chromosome 18q22,
possibly associated with in vivo chromosome breakage. J Med
Genet 44(5):347–352
Debatisse M, El Achkar E, Dutrillaux B (2006) Common fragile sites
nested at the interfaces of early and late-replicating chromosome
bands: cis acting components of the G2/M checkpoint? Cell
Cycle 5(6):578–581
Debatisse M, Le Tallec B, Letessier A, Dutrillaux B, Brison O (2012)
Common fragile sites: mechanisms of instability revisited.
Trends Genet 28(1):22–32
Denison SR, Simper RK, Greenbaum IF (2003) How common are
common fragile sites in humans: interindividual variation in the
distribution of aphidicolin-induced fragile sites. Cytogenet
Genome Res 101(1):8–16
Dillon LW, Burrow AA, Wang YH (2010) DNA instability at
chromosomal fragile sites in cancer. Curr Genomics 11(5):
326–337
Durkin SG, Glover TW (2007) Chromosome fragile sites. Annu Rev
Genet 41:169–192
Durkin SG, Ragland RL, Arlt MF, Mulle JG, Warren ST, Glover TW
(2008) Replication stress induces tumor-like microdeletions in
FHIT/FRA3B. Proc Natl Acad Sci USA 105(1):246–251
Ferber MJ, Thorland EC, Brink AA, Rapp AK, Phillips LA,
McGovern R, Gostout BS, Cheung TH, Chung TK, Fu WY,
Smith DI (2003) Preferential integration of human papillomavi-
rus type 18 near the c-myc locus in cervical carcinoma.
Oncogene 22(46):7233–7242
Gandhi M, Dillon LW, Pramanik S, Nikiforov YE, Wang YH (2010)
DNA breaks at fragile sites generate oncogenic RET/PTC
rearrangements in human thyroid cells. Oncogene 29(15):2272–
2280
Gibb EA, Brown CJ, Lam WL (2011) The functional role of long
non-coding RNA in human carcinomas. Mol Cancer 10:38
Glover TW (2006) Common fragile sites. Cancer Lett 232(1):4–12
Glover TW, Berger C, Coyle J, Echo B (1984) DNA polymerase
alpha inhibition by aphidicolin induces gaps and breaks at
common fragile sites in human chromosomes. Hum Genet
67(2):136–142
Gu W, Zhang F, Lupski JR (2008) Mechanisms for human genomic
rearrangements. Pathogenetics 1(1):4
Guttman M, Donaghey J, Carey BW, Garber M, Grenier JK, Munson
G, Young G, Lucas AB, Ach R, Bruhn L, Yang X, Amit I,
Meissner A, Regev A, Rinn JL, Root DE, Lander ES (2011)
lincRNAs act in the circuitry controlling pluripotency and
differentiation. Nature 477(7364):295–300
Halazonetis TD, Gorgoulis VG, Bartek J (2008) An oncogene-
induced DNA damage model for cancer development. Science
319(5868):1352–1355
Hastings PJ, Ira G, Lupski JR (2009) A microhomology-mediated
break-induced replication model for the origin of human copy
number variation. PLoS Genet 5(1):e1000327
Hellman A, Zlotorynski E, Scherer SW, Cheung J, Vincent JB, Smith
DI, Trakhtenbrot L, Kerem B (2002) A role for common fragile
site induction in amplification of human oncogenes. Cancer Cell
1(1):89–97
Helmrich A, Stout-Weider K, Matthaei A, Hermann K, Heiden T,
Schrock E (2007) Identification of the human/mouse syntenic
common fragile site FRA7K/Fra12C1—relation of FRA7 K and
other human common fragile sites on chromosome 7 to
evolutionary breakpoints. Int J Cancer 120(1):48–54
Henegariu O, Bray-Ward P, Ward DC (2000) Custom fluorescent-
nucleotide synthesis as an alternative method for nucleic acid
labeling. Nat Biotechnol 18(3):345–348
Hormozian F, Schmitt JG, Sagulenko E, Schwab M, Savelyeva L
(2007) FRA1E common fragile site breaks map within a
370 kilobase pair region and disrupt the dihydropyrimidine
dehydrogenase gene (DPYD). Cancer Lett 246(1–2):82–91
Inoue H, Ishii H, Alder H, Snyder E, Druck T, Huebner K, Croce CM
(1997) Sequence of the FRA3B common fragile region:
implications for the mechanism of FHIT deletion. Proc Natl
Acad Sci USA 94(26):14584–14589
Jiang Y, Lucas I, Young DJ, Davis EM, Karrison T, Rest JS, Le Beau
MM (2009) Common fragile sites are characterized by histone
hypoacetylation. Hum Mol Genet 18(23):4501–4512
Labib K, Hodgson B (2007) Replication fork barriers: pausing for a
break or stalling for time? EMBO Rep 8(4):346–353
Lai LA, Kostadinov R, Barrett MT, Peiffer DA, Pokholok D, Odze R,
Sanchez CA, Maley CC, Reid BJ, Gunderson KL, Rabinovitch
PS (2010) Deletion at fragile sites is a common and early event
in Barrett’s esophagus. Mol Cancer Res 8(8):1084–1094
Lee JA, Carvalho CM, Lupski JR (2007) A DNA replication
mechanism for generating nonrecurrent rearrangements associ-
ated with genomic disorders. Cell 131(7):1235–1247
Lemoine FJ, Degtyareva NP, Lobachev K, Petes TD (2005)
Chromosomal translocations in yeast induced by low levels of
DNA polymerase: a model for chromosome fragile sites. Cell
120(5):587–598
Letessier A, Millot GA, Koundrioukoff S, Lachages AM, Vogt N,
Hansen RS, Malfoy B, Brison O, Debatisse M (2011) Cell-type-
specific replication initiation programs set fragility of the
FRA3B fragile site. Nature 470(7332):120–123
Lewandowska U, Zelazowski M, Seta K, Byczewska M, Pluciennik
E, Bednarek AK (2009) WWOX, the tumour suppressor gene
affected in multiple cancers. J Physiol Pharmacol 60(Suppl
1):47–56
Limongi MZ, Pelliccia F, Rocchi A (2003) Characterization of the
human common fragile site FRA2G. Genomics 81(2):93–97
Mangelsdorf M, Ried K, Woollatt E, Dayan S, Eyre H, Finnis M,
Hobson L, Nancarrow J, Venter D, Baker E, Richards RI (2000)
Chromosomal fragile site FRA16D and DNA instability in
cancer. Cancer Res 60(6):1683–1689
McVey M, Lee SE (2008) MMEJ repair of double-strand breaks
(director’s cut): deleted sequences and alternative endings.
Trends Genet 24(11):529–538
Mimori K, Druck T, Inoue H, Alder H, Berk L, Mori M, Huebner K,
Croce CM (1999) Cancer-specific chromosome alterations in the
constitutive fragile region FRA3B. Proc Natl Acad Sci USA
96(13):7456–7461
Mitsui J, Takahashi Y, Goto J, Tomiyama H, Ishikawa S, Yoshino H,
Minami N, Smith DI, Lesage S, Aburatani H, Nishino I, Brice A,
Hattori N, Tsuji S (2010) Mechanisms of genomic instabilities
underlying two common fragile-site-associated loci, PARK2 and
DMD, in germ cell and cancer cell lines. Am J Hum Genet
87(1):75–89
Mrasek K, Schoder C, Teichmann AC, Behr K, Franze B, Wilhelm K,
Blaurock N, Claussen U, Liehr T, Weise A (2010) Global
screening and extended nomenclature for 230 aphidicolin-
inducible fragile sites, including 61 yet unreported ones. Int J
Oncol 36(4):929–940
Ozeri-Galai E, Lebofsky R, Rahat A, Bester AC, Bensimon A, Kerem
B (2011) Failure of origin activation in response to fork stalling
leads to chromosomal instability at fragile sites. Mol Cell
43(1):122–131
Pelliccia F, Bosco N, Rocchi A (2010) Breakages at common fragile
sites set boundaries of amplified regions in two leukemia cell
lines K562—molecular characterization of FRA2H and locali-
zation of a new CFS FRA2S. Cancer Lett 299(1):37–44
Ragland RL, Glynn MW, Arlt MF, Glover TW (2008) Stably
transfected common fragile site sequences exhibit instability at
ectopic sites. Genes Chromosom Cancer 47(10):860–872
Saldivar JC, Shibata H, Huebner K (2010) Pathology and biology
associated with the fragile FHIT gene and gene product. J Cell
Biochem 109(5):858–865
Savelyeva L, Sagulenko E, Schmitt JG, Schwab M (2006) The
neurobeachin gene spans the common fragile site FRA13A. Hum
Genet 118(5):551–558
Sawinska M, Schmitt JG, Sagulenko E, Westermann F, Schwab M,
Savelyeva L (2007) Novel aphidicolin-inducible common fragile
site FRA9G maps to 9p22.2, within the C9orf39 gene. Genes
Chromosom Cancer 46(11):991–999
Schwartz M, Zlotorynski E, Kerem B (2006) The molecular basis of
common and rare fragile sites. Cancer Lett 232(1):13–26
Smit AF (1999) Interspersed repeats and other mementos of
transposable elements in mammalian genomes. Curr Opin Genet
Dev 9(6):657–663
Smith DI, McAvoy S, Zhu Y, Perez DS (2007) Large common fragile
site genes and cancer. Semin Cancer Biol 17(1):31–41
Sutherland GR (2003) Rare fragile sites. Cytogenet Genome Res
100(1–4):77–84
Szilard RK, Jacques PE, Laramee L, Cheng B, Galicia S, Bataille AR,
Yeung M, Mendez M, Bergeron M, Robert F, Durocher D (2010)
Systematic identification of fragile sites via genome-wide
location analysis of gamma-H2AX. Nat Struct Mol Biol
17(3):299–305
Tsantoulis PK, Kotsinas A, Sfikakis PP, Evangelou K, Sideridou M,
Levy B, Mo L, Kittas C, Wu XR, Papavassiliou AG, Gorgoulis
VG (2008) Oncogene-induced replication stress preferentially
targets common fragile sites in preneoplastic lesions. A genome-
wide study. Oncogene 27(23):3256–3264
Wilke CM, Guo SW, Hall BK, Boldog F, Gemmill RM, Chandrase-
kharappa SC, Barcroft CL, Drabkin HA, Glover TW (1994)
Multicolor FISH mapping of YAC clones in 3p14 and identi-
fication of a YAC spanning both FRA3B and the t(3;8)
associated with hereditary renal cell carcinoma. Genomics
22(2):319–326
Zhang H, Freudenreich CH (2007) An AT-rich sequence in human
common fragile site FRA16D causes fork stalling and chromo-
some breakage in S. cerevisiae. Mol Cell 27(3):367–379
Zimonjic DB, Druck T, Ohta M, Kastury K, Croce CM, Popescu NC,
Huebner K (1997) Positions of chromosome 3p14.2 fragile sites
(FRA3B) within the FHIT gene. Cancer Res 57(6):1166–1170
Zlotorynski E, Rahat A, Skaug J, Ben-Porat N, Ozeri E, Hershberg R,
Levi A, Scherer SW, Margalit H, Kerem B (2003) Molecular
basis for expression of common and rare fragile sites. Mol Cell
Biol 23(20):7143–7151