
Structural factors determining DNA length limitations in conformation-sensitive mutation detection methods


Piotr Kozlowski, Wlodzimierz J. Krzyzosiak

Laboratory of Cancer Genetics, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland

Numerous mutations and polymorphisms in human genes remain to be identified using reliable methods. Of the available mutation scanning methods, those dependent on structural change-induced mobility shifts are highly effective. Their efficiency is, however, DNA length-sensitive, and the reasons for this are poorly understood. In this study, we explain why scanning genes for mutations is less effective in longer DNA fragments and reveal the factors behind this effect. We performed a systematic analysis of the same sequence variants of exon 11 of the BRCA1 gene in DNA fragments of three different lengths using combined single-strand conformation polymorphism (SSCP) and heteroduplex analysis (DA) by capillary electrophoresis (CE). Two major structural factors are responsible for the reduced mutation detection rate in long amplicons. The first is the increased contribution from other secondary structure modules and domains in longer fragments, which masks the structural change induced by the mutation. The second is the higher frequency of single-nucleotide polymorphisms (SNPs), including common polymorphisms, in longer fragments. This makes it necessary to distinguish the structural effect of the mutation from that of each polymorphic variant, which is often difficult to achieve. When these factors are taken into account, genes may be scanned efficiently for sequence variants by conformation-sensitive methods.

Keywords: BRCA1 / DNA secondary structure / Heteroduplex analysis / Mutation detection / Single-strand conformation polymorphism

DOI 10.1002/elps.200406125

1 Introduction

Mutations in human genes may cause diseases, predispose to diseases or modify their course. Common polymorphisms may be involved in common diseases and in positive or adverse reactions to drugs. By the end of 2003, about 1500 human disease genes harboring nearly 40 000 different mutations had been identified [1], and the number of known single-nucleotide polymorphisms (SNPs) is at least one order of magnitude higher [2]. Both the list of genes involved in monogenic diseases and the list of mutations are growing constantly, and numerous new additions from genes, SNPs and mutations involved in complex polygenic diseases are expected. The necessity of determining more comprehensive mutation and SNP spectra for many different genes in different populations also appears obvious and is becoming one of the most important challenges for genomic medicine.

Scanning and screening genes for SNPs and mutations require reliable methods which are highly sensitive, rapid, have a high throughput and are cost-effective. In spite of continuing efforts, none of the available methods fulfills all these criteria. The most popular methods used to detect sequence variations are those which do not require many laborious postamplification steps. Such methods usually take advantage of the differences in physical properties between the wild-type sequences and those harboring SNPs or mutations. They include conformation-sensitive methods based on the principle that a mutation or any kind of sequence variation induces changes in the structure of a DNA fragment, which affect its mobility during electrophoresis or chromatography.

The well-known single-strand conformation polymorphism (SSCP) [3, 4], heteroduplex analysis (DA) [5, 6], and double-strand conformation polymorphism (DSCA) [7, 8] methods belong to this category, which also includes conformation-sensitive gel electrophoresis (CSGE) [9, 10], denaturing high-performance liquid chromatography (DHPLC) [11, 12], constant denaturant gel electrophoresis (CDGE) [13, 14], denaturant gradient gel electrophoresis (DGGE) [15], temperature gradient gel electrophoresis (TGGE) [16], temperature gradient capillary electrophoresis (TGCE) [17, 18], and constant denaturant capillary electrophoresis (CDCE) [19] methods. The latter group takes advantage of the structural differences existing between normal and mutant sequences in partially melted duplexes.

Correspondence: Dr. Wlodzimierz J. Krzyzosiak, Laboratory of Cancer Genetics, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, PL-61-704 Poznan, Poland
E-mail: [email protected]
Fax: +48-61-852-05-032

Abbreviations: DA, heteroduplex analysis; SNP, single-nucleotide polymorphism

Electrophoresis 2005, 26, 71–81

© 2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

In theory, scanning genes for mutations at unknown locations could be faster and more economical if longer amplicons were analyzed. In practice, the prevailing view is that a certain optimum size range exists for DNA fragments to be scanned for mutations efficiently, and that going beyond this range usually gives suboptimal results [10, 20, 21]. In spite of that, the possibility of efficient mutation detection in longer fragments was postulated in several reports. For example, there were reports suggesting that the analysis of fragments 3–5 times longer than the 150–200 bp optimal for the SSCP method preserves high mutation detection rates [8, 22, 23]. There were also relevant reports regarding other mutation detection methods [24–26].

The aim of this study is to clarify this issue and shed more light on the factors and effects which determine the limits of efficient scanning of genes for mutations by conformation-dependent methods. We approached this problem systematically, by analyzing a number of the same mutations, polymorphisms and rare sequence variants in DNA fragments of increasing length. We investigated these factors by SSCP and DA, which represent different types of conformation-sensitive methods. We used modern capillary electrophoresis (CE) versions of these methods, as earlier successful applications of SSCP-CE [22, 27, 28], DA-CE [21, 30–32] and combined SSCP-DA-CE [8, 33] confirmed the advantages of the CE platform.

2 Materials and methods

2.1 DNA samples

The genomic DNA samples with different mutations and polymorphic variants in the analyzed part of exon 11 of the BRCA1 (MIM# 113705) gene were either from our laboratory repository or were kindly provided by other researchers. All samples were adjusted to an equal concentration (~20 ng/µL) before amplification.

2.2 PCR amplification

All sequence variants were amplified in three overlapping DNA fragments of different lengths, from 252 to 848 bp (Table 1). For each fragment, a sample containing the wild-type sequence was amplified as a reference; for fragments spanning polymorphic variants, samples representing all polymorphic genotypes were amplified in addition. To reduce the cost of analysis, universal fluorescently labeled primers were used to reamplify amplicons containing uniform priming sites. During the first amplification step, in which genomic DNA served as a template, fragments were amplified with specific, unlabeled primers tagged with 20 nucleotides of universal sequence. The specific primers used to obtain all short, medium and long fragments are listed in Table 1. The products generated in the first PCR step were diluted 1000 times and served as templates in the next step of amplification. All fragments were reamplified with a pair of fluorescently labeled universal primers (Table 1) under the same PCR conditions. All unlabeled and labeled primers were synthesized at MWG Biotech (Ebersberg, Germany). Both steps of amplification were conducted in a total reaction volume of 5 µL, which consisted of 1 µL of the appropriate DNA template, 1× PCR reaction buffer (10 mM Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM MgCl2), 200 µM of each dNTP, 0.12 m Taq polymerase (Promega, Madison, WI, USA), and 1 µM of each primer. Reactions were performed in a GeneAmp PCR System 9700 (Applied Biosystems, Foster City, CA, USA) according to a fast and economical PCR protocol [34]. Briefly, 1 s denaturation at 94 °C, 1 s annealing and 1 s extension at 72 °C were repeated 35 times. For medium and long fragments, the extension time was increased to 30 and 60 s, respectively. In all reactions with specific primers, the annealing temperature was increased to 60 °C after the first five cycles. The annealing temperatures applied are specified in Table 1.

2.3 Sample preparation and condition of CE

Samples for combined SSCP/DA were prepared in a water medium as described earlier [8, 35]. Briefly, the products of PCR amplification were diluted 50 times (short fragments) or 20 times (medium and long fragments) in deionized water and divided into two portions. One portion, used for the generation of single-strand conformers, was denatured at 95 °C for 30 s. Then, 1 µL of the duplex portion and 1 µL of the SSCP portion, as well as 1 µL of 40× diluted ROX-labeled size standard, were added to 10 µL of deionized water. According to the analyzed fragment length, one of the commercially available ROX-500, ROX-1000 or ROX-2500 DNA size standards (Applied Biosystems) was used. CE was performed on an ABI 310 genetic analyzer (Applied Biosystems) equipped with an argon laser, which emits the greatest intensity of light at 488 and 514.5 nm. The instrument has a capacity of 48 samples, which are injected automatically during the run. The samples were electroinjected at 15 kV for 5 s into a 42 cm long (Lt/Ld = 42/31 cm), Ø 50 µm capillary (Applied Biosystems) filled with 5.5% GeneScan polymer (Applied Biosystems) with 1× Tris-borate-EDTA (TBE) buffer and 10% glycerol. Electrophoresis was run at 13 kV at a temperature of 25 °C or 30 °C. Virtual filter A, which allows efficient detection and distinction of the signals from the applied fluorescent dyes (FAM, JOE, ROX), was used in all separations. The results were collected using the Data Collection program Version 3.0.0 for Windows NT and analyzed using GeneScan Analysis software Version 3.7 for Windows NT (both Applied Biosystems). To avoid room-temperature variation, the area in which the CE apparatus was located was equilibrated at 20 °C.

Table 1. Primers used for combined SSCP-DA

| Fragment name | Primer name a) | Primer sequence | Fragment length (bp) | Annealing temperature (°C) |
|---|---|---|---|---|
| | UnivF-FAM | 5'-GTC CTG CCA ATG AGA AGA AA | | 55 |
| | UnivR-JOE | 5'-GAG ACC ATT TTC CCA GCA TC | | |
| Long A | 1061F | 5'-UnivF aag gag cca aca taa cag at | 778 | 48/60 |
| | 1798R | 5'-UnivR tca cct ttt gtt tta ttc tc | | |
| Long B | 1622F | 5'-UnivF att aaa gcg taa aag gag ac | 848 | 55/60 |
| | 2429R | 5'-UnivR tga aat act gct act ctc ta | | |
| Long C | 2212F | 5'-UnivF cag agc tga agt taa caa at | 772 | 53/60 |
| | 2943R | 5'-UnivR cat tat caa ctg gct tat ct | | |
| Medium A | 1151F | 5'-UnivF tga tcc cct gtg tga ga | 600 | 55/60 |
| | 1710R | 5'-UnivR ttt cag gag tct ttt gaa ct | | |
| Medium B1 | 1622F | 5'-UnivF att aaa gcg taa aag gag ac | 577 | 55/60 |
| | 2158R | 5'-UnivR ttc ttg gct cca gtt gca | | |
| Medium B2 | 1959F | 5'-UnivF aag tct tct acc agg cat a | 511 | 55/60 |
| | 2429R | 5'-UnivR tga aat act gct act ctc ta | | |
| Medium C | 2370F | 5'-UnivF atg tta agt gga gaa agg gt | 553 | 53/60 |
| | 2882R | 5'-UnivR ctg tac agg ctt gat att ag | | |
| Short A1 | 1151F | 5'-UnivF tga tcc cct gtg tga ga | 263 | 55/60 |
| | 1373R | 5'-UnivR ctc att tag aac gtc caa ta | | |
| Short A2 | 1339F | 5'-UnivF cca aag tag ctg atg tat tg | 263 | 55/60 |
| | 1561R | 5'-UnivR aga ttt tca gtt aca tgg ct | | |
| Short A3 | 1499F | 5'-UnivF att tgg gaa aac cta tcg ga | 252 | 55/60 |
| | 1710R | 5'-UnivR ttt cag gag tct ttt gaa ct | | |
| Short B1 | 1762F | 5'-UnivF tta cta ata gtg gtc atg ag | 273 | 55/60 |
| | 1994R | 5'-UnivR tag ttc aag cgc atg aat a | | |
| Short B2 | 2147F | 5'-UnivF tgg agc caa gaa gag taa ca | 264 | 55/60 |
| | 2370R | 5'-UnivR tga gat ctt tgg ggt ctt ca | | |
| Short C1 | 2391F | 5'-UnivF ttg caa act gaa aga tct gt | 290 | 55/60 |
| | 2640R | 5'-UnivR gac tgt ggt taa ctt cat gt | | |
| Short C2 | 2671F | 5'-UnivF aac ttg atg ctc agt att tg | 276 | 55/60 |
| | 2906R | 5'-UnivR aaa gcc tgc agt gat att aa | | |

a) Numbers present in the specific primer names correspond to positions of primer 5'-ends in the BRCA1 exon 11 sequence (U14680).
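
The fragment lengths listed in Table 1 can be cross-checked against the primer names. The sketch below is only an illustration, not the authors' procedure: it assumes that each amplicon spans the two primer 5'-end positions inclusively and that each specific primer adds the 20-nt universal tag described in Section 2.2. The function name and the data layout are hypothetical.

```python
# Hypothetical helper: reconstructs the tagged amplicon length from the
# 5'-end positions encoded in the primer names of Table 1.
UNIVERSAL_TAIL_NT = 20  # universal tag carried by each specific primer (Section 2.2)


def tagged_amplicon_length(forward_5p: int, reverse_5p: int,
                           tail_nt: int = UNIVERSAL_TAIL_NT) -> int:
    """Length in bp of the amplicon including both universal tails."""
    genomic_span = reverse_5p - forward_5p + 1  # inclusive span between 5'-ends
    return genomic_span + 2 * tail_nt


# (fragment, forward 5'-end, reverse 5'-end, length listed in Table 1)
TABLE_1 = [
    ("Long A", 1061, 1798, 778), ("Long B", 1622, 2429, 848),
    ("Long C", 2212, 2943, 772), ("Medium A", 1151, 1710, 600),
    ("Medium B1", 1622, 2158, 577), ("Medium B2", 1959, 2429, 511),
    ("Medium C", 2370, 2882, 553), ("Short A1", 1151, 1373, 263),
    ("Short A2", 1339, 1561, 263), ("Short A3", 1499, 1710, 252),
    ("Short B1", 1762, 1994, 273), ("Short B2", 2147, 2370, 264),
    ("Short C1", 2391, 2640, 290), ("Short C2", 2671, 2906, 276),
]

# Collect any fragment whose listed length disagrees with the reconstruction.
mismatches = [name for name, fwd, rev, listed in TABLE_1
              if tagged_amplicon_length(fwd, rev) != listed]
print(mismatches)
```

Under these assumptions every listed length is reproduced, which suggests that the lengths in Table 1 include both 20-nt universal tails.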

2.4 DNA secondary structure modeling

The DNA secondary structure prediction was performed using the DNA-Mfold program, version 3.1 [36], accessed from the web site http://www.bioinfo.rpi.edu/applications/mfold/old/dna/form1.cgi. This program is designed to determine the optimal and suboptimal secondary structures of DNA, calculated for 1 M NaCl solution, and to calculate the free energies of the predicted structures and their individual motifs. The structures were predicted for a temperature of 30 °C. The suboptimality parameter was fixed at 5% in all structure predictions, and the window parameter was set at 5, 10 and 15 for short, medium and long fragments, respectively. The free-energy values used for DNA structure prediction and its stability calculation were those from the laboratory of John SantaLucia Jr. [37, 38].

3 Results and discussion

3.1 Experiment design

Fourteen different sequence variants of the BRCA1 gene were analyzed in this study in DNA fragments of three different length ranges: short (252–290 bp), medium (511–600 bp) and long (772–848 bp). These variants included eight base substitutions, comprising six transitions and two transversions (5-C/T, 1-A/G, 1-A/T, and 1-G/T), and six insertions/deletions (insT, insA, delA, delC, delAA, and delAGAAG). Among them were six cancer-predisposing mutations, four rare sequence variants and four common polymorphisms. Of the latter, three were in complete linkage disequilibrium [39]. With regard to their localization within exon 11, these variants were analyzed in three groups, A, B and C, as shown in Fig. 1.

The amplicons of the specified length were obtained using fluorescently labeled universal primers in a two-step amplification procedure (Section 2.2). This procedure may be considered an alternative to another cost-effective method, the fluorescent postlabeling of PCR products described earlier [22, 40]. The amplicons of varying size were analyzed by the combined SSCP-DA-CE method [8, 33, 41, 42]. The electrophoresis peak patterns recorded for the sequence variants were each time compared with the patterns generated by the corresponding wild-type sequences, as shown for three fragments (short A2, medium A and long A) in Fig. 2. It may be noticed from their analysis that the mutations 1411insT and 1499insA result in changes of the SSCP patterns of both sense and antisense strands in the short A2 fragment, whereas no such changes are observed in fragments medium A and long A. Both mutations, however, generate some mobility shifts in the duplex portions of all these fragments.

Figure 1. Schematic representation of the BRCA1 gene exon 11 fragment with all analyzed sequence variants indicated. The frequent polymorphisms (shown in bold italics) exhibiting total linkage disequilibrium are spanned by the horizontal line. The scale below the bar representing exon 11 indicates the sequence position in mRNA of the BRCA1 gene (U14680). The square brackets above (A, B and C, respectively) span the sequence variants analyzed in the specified groups of fragments shown below. Above the symbol of each fragment the fragment length (bold) and fragment name (in brackets) are indicated.

For variants occurring in fragments harboring common polymorphisms, the patterns were compared with those characteristic of all polymorphic variants (two different homozygotes and the heterozygote). A pattern generated by a mutation or rare sequence variant which could not be distinguished from that generated by each variant of the common polymorphism was scored negative. The mutations and SNPs which resulted in only minor shifts of their peaks with respect to those of the wild-type fragments were classified as “not easily detected”, as they could be overlooked during a routine analysis. Such mutations and sequence variants, which are shown in Fig. 3, are marked with “−+” in Table 2.

3.2 Mutation detection rate decreases in longer DNA fragments

The results of mutation and SNP detection in DNA fragments of different length are summarized in Table 2. These data show that the mutation detection rate is highest for the shortest DNA fragments analyzed. Both SSCP and DA alone detect 79% of variants at 30 °C, and the sensitivity of SSCP increases to 86% at 25 °C. The combined SSCP-DA method detects all variants at both temperatures, which parallels our earlier results [8], in which the mutation detection rate was shown to be 81, 90 and 100% in the duplex, SSCP and both fractions, respectively. The detection rate of the same mutations analyzed in longer fragments is lower, and for the combined method it drops to 76–85% and 62–69% for the medium and long fragments, respectively. For the SSCP and duplex methods considered separately, this decline in mutation detection sensitivity is even greater. Interestingly, in the longest DNA fragments analyzed, all mutations detected in the duplex fraction belong to the insertion/deletion category. On the other hand, all variants detected in the single-strand conformer portion of the longest fragments are base substitutions. An exception is 2312delAGAAG, a mutation easily detected in all fragments, fractions and conditions. These results suggest that DA could predominantly detect mutations hardly detectable by SSCP, and vice versa.

The temperature at which the analysis is performed, although changed only in a narrow range in our experiments, does not seem to have a strong effect on the mutation detection rate. It appears from our study that mutations not detected at 25 °C usually remain undetected at 30 °C as well. A similar correlation between mutations undetected at various temperatures was observed for SSCP [29] and DA [8] in a much wider temperature range. This suggests that the combination of SSCP and DA may be a much better way to increase the mutation detection efficiency than performing each type of analysis alone at different temperatures.

Figure 2. Examples of mutations analyzed by the SSCP-DA-CE method at a temperature of 30 °C in fragments of different length (short A2, medium A and long A). Along with the electrophoregrams representing the wild-type sequence, the electrophoregrams of the 1411insT and 1499insA mutations are shown. For each of the fragments, the DA (left side) and SSCP (right side) portions of electrophoregrams are presented separately. The time of electrophoresis is expressed both in seconds and in scans (in parentheses). Blue, green and red color-labeled peaks represent sense strand (FAM-labeled), antisense strand (JOE-labeled) and DNA size standard (ROX-labeled), respectively.


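Table 2. Mutation detection in short, medium and long fragments

| Analyzed sequence variant | Polymorphism background | Long 30 °C DA | Long 30 °C SS | Long 30 °C S | Long 25 °C DA | Long 25 °C SS | Long 25 °C S | Medium 30 °C DA | Medium 30 °C SS | Medium 30 °C S | Medium 25 °C DA | Medium 25 °C SS | Medium 25 °C S | Short 30 °C DA | Short 30 °C SS | Short 30 °C S | Short 25 °C DA | Short 25 °C SS | Short 25 °C S |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1186A/G | | − | + | + | − | − | − | − | −+ | − | − | −+ | − | + | + | + | + | + | + |
| 1411insT | 1186A/A | + | − | + | + | − | + | + | − | + | + | − | + | + | + | + | + | + | + |
| 1499insA | 1186A/G | + | −+ | + | + | − | + | + | −+ | + | + | −+/− | + | + | + | + | + | + | + |
| 1675delA | 1186A/A | + | − | + | + | − | + | + | + | + | + | − | + | + | + | + | + | + | + |
| 1806C/T | 2201C/T | − | +/−+ | − | − | − | − | − | + | + | − | + | + | − | + | + | − | + | + |
| 1838A/T | 2201C/T | − | +/−+ | − | − | + | + | − | + | + | − | + | + | + | + | + | + | + | + |
| 2201C/T | | − | + | + | − | − | − | − | + | + | − | + | + | − | + | + | − | + | + |
| 2312del5 | NA | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + |
| 2457C/T | 2430T/C; 2731C/T | − | + | + | − | +/− | − | − | − | − | − | + | + | + | − | + | + | + | + |
| 2594delC | 2430T/T; 2731C/C | −+ | −+ | − | − | + | + | + | − | + | + | + | + | + | + | + | + | + | + |
| 2430T/C | | − | + | + | − | + | + | − | + | + | − | + | + | − | + | + | − | + | + |
| 2731C/T | | − | + | + | − | + | + | − | + | + | − | + | + | + | − | + | + | − | + |
| 2804delAA | 2430T/T; 2731C/C | + | +/−+ | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + |
| 2841G/T | 2430T/C; 2731C/T | − | +/−+ | − | − | +/− | − | − | +/− | − | − | +/− | − | + | − | + | + | − | + |
| Mutation detection (%) | | 38.5 | 38.5 | 69.2 | 38.5 | 38.5 | 61.5 | 46.2 | 53.8 | 76.9 | 46.2 | 61.5 | 84.6 | 78.6 | 78.6 | 100 | 78.6 | 85.7 | 100 |

Columns marked DA, SS, and S represent the outcome of duplex, SSCP, and combined analysis, respectively. Symbols (+), (−), and (−+) indicate variants that were detected, undetected, or not easily detected, respectively. In the case of mutations which are distinguished or not from homozygote and heterozygote patterns, the respective results are indicated by (+) or (−) before and after the slash symbol (/).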
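
The percentages in the bottom row of Table 2 are simple ratios of detected to analyzed variants. The sketch below reproduces a few of them. It is an illustration under stated assumptions: the counts and the denominators (14 variants for short fragments, 13 for medium and long fragments) are inferred here from the reported percentages and are not given explicitly in the text, and the function name is hypothetical.

```python
def detection_rate(detected: int, total: int) -> float:
    """Percentage of analyzed variants scored as detected, to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * detected / total, 1)


# Counts below are back-calculated assumptions, not values quoted in the paper.
short_da_30 = detection_rate(11, 14)        # short fragments, DA, 30 °C
short_sscp_25 = detection_rate(12, 14)      # short fragments, SSCP, 25 °C
long_da_30 = detection_rate(5, 13)          # long fragments, DA, 30 °C
long_combined_30 = detection_rate(9, 13)    # long fragments, combined, 30 °C

print(short_da_30, short_sscp_25, long_da_30, long_combined_30)
```

Variants marked as not easily detected are treated as undetected in this count, which is consistent with the scoring rule described in Section 3.1.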

3.3 Conformational signatures of mutations are suppressed in longer fragments

It may also be noticed from the data shown in Table 2 that there is some correlation between the detection of particular sequence variants in fragments of different lengths. For example, the 1806C/T and 2201C/T mutations, which are not detected in the duplex fraction of short fragments, remain undetected in medium and long fragments. On the other hand, the mutations 1411insT and 1675delA in the duplex fraction and 2201C/T in the SSCP fraction were detected in all fragments differing in length. This kind of correlation could not be observed by other authors, as the mutations they investigated were each analyzed in a specific sequence context and fragment length [20]. We propose that the basis for mutation detection in fragments of different length is usually the structural change occurring in the same DNA structure module. However, the impact of this structural change on the mobility shift decreases as the fragment length and the number of other secondary structure elements increase. An example is again the 1806C/T variant, which, when present in the sense strand labeled with the forward primer, shows two peaks (Fig. 4). These peaks are well separated in the shortest fragment, and their separation decreases with increasing fragment length.
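
The peak separation mentioned above is quantified in the legend to Fig. 4 by the resolution R, defined as the distance between the mutant and wild-type peaks divided by their average half-height width. A minimal sketch of that calculation follows. The function name, argument names and the numerical inputs are hypothetical and serve only to illustrate the formula; they are not measurements from this study.

```python
def peak_resolution(t_wild_type: float, t_mutant: float,
                    width_wild_type: float, width_mutant: float) -> float:
    """Resolution R: peak distance divided by the mean half-height width.

    Migration times and widths must share the same unit (seconds or scans).
    """
    mean_width = (width_wild_type + width_mutant) / 2.0
    if mean_width <= 0:
        raise ValueError("peak widths must be positive")
    return abs(t_mutant - t_wild_type) / mean_width


# Illustrative values only: two peaks 12 s apart, each 6 s wide at half height.
r_example = peak_resolution(1000.0, 1012.0, 6.0, 6.0)
print(r_example)
```

With this definition, a shrinking distance between the two peaks at an unchanged peak width translates directly into a lower R, which is the trend described for longer fragments.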

In an attempt to explain these observations we used the Mfold program [36], which was earlier used to correlate SSCP results with the predicted secondary structures [43, 44] and their calculated free energies [45]. When the secondary structures of the sense strands were predicted, four, five and four structures were generated for the short, medium and long fragments, respectively, for both the wild-type and mutant sequence (suboptimality parameter, 5%). The free energies of the first suboptimal structures differed by 0.1–2% from those of the optimal structures. As proposed in Fig. 4, where the lowest free-energy structures are shown, contributions from other secondary structure modules present in longer DNA fragments make the same structural change caused by the mutation less influential on the overall structure.

Figure 3. Electrophoregrams showing SSCP patterns of different variants of the “long C” amplicon. From top to bottom: more frequent and less frequent homozygous, and heterozygous variant of the linked (2430T/C and 2731C/T) polymorphic variant; as well as mutation 2841G/T occurring in the heterozygous polymorphic background and mutations 2804delAA and 2594delC occurring in the homozygous polymorphic background, respectively. Both the 2841G/T and 2804delAA mutations, regardless of polymorphic background, induced an SSCP pattern similar to that of the polymorphic heterozygote. The 2594delC mutation induced a slight broadening of peaks characteristic of the homozygote for the wild-type sequence. Other conditions and designations are the same as described in the legend to Fig. 2.

In this example, the numbers of secondary structure modules with ΔG below −2 kcal/mol are 3, 10 and 12 for the short, medium and long fragments, respectively. Although the ΔG threshold of −2 kcal/mol was arbitrarily chosen to count the number of relatively stable DNA structure modules, the same trend is observed for the lower (−1 kcal/mol) and higher (−5 kcal/mol) thresholds. The comparison of the total ΔG values of the wild-type sequences and sequences containing mutations in fragments of different length led to a similar conclusion. The same difference in ΔG between mutant and wild-type structures makes a smaller contribution to the total free energies of structures formed by longer fragments. In the presented example, this contribution decreases from 0.67% in the short fragment to 0.21% in the long fragment (Table in Fig. 4). However, when we gathered a similar set of data for the antisense strand, in which no mobility shift was generated by the mutation even in the shortest fragment, the outcome turned out to be similar.
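
The dilution of a fixed free-energy difference by a growing total free energy can be illustrated numerically. In the sketch below, the 0.2 kcal/mol difference between mutant and wild-type structures is the value reported in the text, while the total free energies of −30.0 and −95.0 kcal/mol are hypothetical values chosen only so that the calculation reproduces the reported 0.67% and 0.21%; they are not taken from the table in Fig. 4. The function name is also an assumption.

```python
def ddg_contribution_percent(dg_wild_type: float, dg_mutant: float) -> float:
    """Share of the mutation-induced free-energy difference in the
    wild-type total free energy, in percent (%ΔΔG)."""
    if dg_wild_type == 0:
        raise ValueError("wild-type free energy must be non-zero")
    ddg = abs(dg_wild_type - dg_mutant)
    return 100.0 * ddg / abs(dg_wild_type)


# Hypothetical totals in kcal/mol; only the 0.2 kcal/mol difference is from the text.
short_fragment = ddg_contribution_percent(-30.0, -29.8)
long_fragment = ddg_contribution_percent(-95.0, -94.8)

print(round(short_fragment, 2), round(long_fragment, 2))
```

The same 0.2 kcal/mol difference therefore weighs roughly three times less in a structure whose total free energy is about three times larger in magnitude.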

The DNA structure module harboring the mutation was predicted to undergo a similar structural change, and the free-energy difference between the wild-type and mutant lowest-energy structures was also 0.2 kcal/mol (Fig. 4). Thus, the secondary structure prediction and its stability calculation for the sense and antisense strands failed to explain their different responses to a mutation. This may be due to a variable contribution of tertiary interactions within different DNA sequences, which may either further expose or mask the effects generated by a mutation in the secondary structure. The Mfold-modeled structures of other DNA fragments showed that the modules harboring mutations very often remain unchanged with the increase of fragment length, no matter whether they are rearranged by the mutation or not. Thus, typically, the expression of a mutation or SNP as a mobility shift during electrophoresis is stronger the shorter the analyzed fragment and the fewer other structure modules are present. In agreement with the structural explanation of the decreased mobility shifts, these differences in migration rates do not disappear completely in longer fragments but become smaller and more difficult to detect.

It is likely that the different classification of such small mobility shifts contributed to the differences in the mutation detection rates reported by different authors. In one earlier investigation, practically all analyzed homozygous mutations were detected in DNA fragments with lengths of 521–741 bp [22]. That was, however, achieved by highly rigorous calibration of electrophoresis runs at a precisely controlled temperature, and by the use of sequence-specific DNA length standards. Under such conditions even very small mobility shifts induced by the mutations could be taken into consideration and scored as positives. Similar criteria applied to the analysis of heterozygous mutations would likely result in a high incidence of false positives during routine analysis. In our work, such mutations were classified as “not easily detected”. If they were classified as “detected”, the overall mutation detection rate would rise significantly.

3.4 Common polymorphisms may hindermutation detection

Another factor that may seriously complicate mutation detection in long DNA fragments is the presence of a mutation in the background of a common polymorphism. SNPs were found to occur, on average, every 346 bp in the coding sequences of the 106 human genes investigated [46]. In the 5592 bp long coding sequence of the BRCA1 gene, 17 polymorphic sites were identified [47]. A significant portion of these polymorphisms are common variants that occur at high frequency in a population. They are likely to induce specific changes in the SSCP and duplex patterns of gene fragments scanned for mutations. To be detected in any polymorphic background, a mutation has to generate structural and electrophoretic mobility shifts that can be distinguished from those produced by each of the polymorphic variants. In shorter fragments (below 300 bp) the mobility shifts caused by mutations are usually large enough for the mutations to be detected. This, however, is not the case for long fragments. In agreement with the results of our studies, the chances of mutation detection decrease as the fragment length increases. In longer fragments the mobility shifts caused by mutations are usually smaller and sometimes difficult to distinguish from those induced by heterozygous polymorphic variants (Fig. 3).

2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

78 P. Kozlowski and W. J. Krzyzosiak Electrophoresis 2005, 26, 71–81

Figure 4. Influence of the 1806C/T mutation on the SSCP pattern in short, medium and long fragments. The respective electropherograms, along with the wild-type reference patterns, are shown from top to bottom. Blue and green peaks represent conformers formed by the sense (FAM-labeled) and antisense (JOE-labeled) strand, respectively. The right-side panel presents the respective sense strand structures proposed by the DNA-Mfold program. The parts of the structures shown in red indicate changes induced by the 1806C/T substitution. We used Structure Viewer (included in DNA-Mfold) with the algorithms Natural and Flat_Alt as a basis for the graphical presentation of the predicted structures. The structures shown and their calculated energies correspond to the lowest-energy structures. The inset shows the magnified structure motif changed by the mutation. The included table shows the ΔG value of the wild-type (ΔGWT) and mutant (ΔGM) structures; the difference between ΔGWT and ΔGM (ΔΔG); the contribution of ΔΔG to ΔGWT (%ΔΔG); the number of motifs with a ΔG value below −2 (M); and the resolution between the mutant- and wild-type-specific peaks (R), defined as the distance between the peaks divided by their average half-height width [55]. Other conditions and designations are the same as described in the legend to Fig. 2.
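The dependence of the polymorphic background on fragment length can be illustrated with simple arithmetic based on the SNP densities quoted above (one SNP per 346 bp [46]; 17 sites in 5592 bp of BRCA1 coding sequence [47]). The sketch below is not part of the original analysis: it assumes, purely for illustration, that polymorphic sites are spread uniformly and independently along the sequence (a Poisson model), and the fragment lengths used in the loop are hypothetical.

```python
import math


def mean_spacing(sequence_length_bp, n_sites):
    """Average distance (bp) between polymorphic sites."""
    return sequence_length_bp / n_sites


def expected_sites(fragment_length_bp, spacing_bp):
    """Expected number of polymorphic sites in a fragment."""
    return fragment_length_bp / spacing_bp


def prob_at_least_one_site(fragment_length_bp, spacing_bp):
    """P(fragment spans >= 1 polymorphic site), Poisson assumption."""
    return 1.0 - math.exp(-expected_sites(fragment_length_bp, spacing_bp))


if __name__ == "__main__":
    brca1_spacing = mean_spacing(5592, 17)  # from the figures cited in [47]
    print(f"BRCA1 coding sequence: one site per {brca1_spacing:.0f} bp")
    # Hypothetical fragment lengths, chosen only to show the trend.
    for length in (200, 400, 800):
        p = prob_at_least_one_site(length, 346)  # 346 bp average from [46]
        print(f"{length} bp fragment: P(>=1 SNP) = {p:.2f}")
```

Under these assumptions the probability that a fragment spans at least one polymorphic site rises monotonically with its length, which is the trend described in the text. Real SNP distributions are not uniform, so the numbers are indicative only.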

According to our results and their explanations, polymorphisms coexisting with mutations may complicate mutation detection in several different ways, which are shown schematically in Fig. 5. For example, the mutation may not induce a structural change sufficient to cause an electrophoretic mobility shift and thus remains undetected (Figs. 5C and D, lanes 1; Fig. 5E, lanes 1 and 2). Also, mutations which induce more profound structural changes and mobility shifts may be missed if they simulate the effect of a polymorphism (Figs. 5C and D, lanes 2; Fig. 5E, lane 4). Their structural effects may simply overlap with those caused by polymorphic variants, making them indistinguishable from the homozygote (Fig. 5E, lane 4) or heterozygote (Figs. 5C and D, lane 2) for the coexisting polymorphism [48].

Figure 5. Schematic representation of potential combinations of two sequence variants (mutation and frequent polymorphism) in the same fragment, and their putative influence on mutation detection. (A)–(E) represent schematically the electrophoretic mobilities of the same polymorphic fragment, amplified from different genomic DNA samples containing: A, wild-type consensus sequence; B, heterozygous SNP; C, heterozygous mutation; D, both mutation and polymorphism present on the same chromosome; E, both mutation and polymorphism present on different chromosomes. Putative SSCP patterns corresponding to the different DNA samples (schematically illustrated above) are presented in squares. The outcomes of different sequence variants in the SSCP patterns are described in the figure. Symbols (+) and (–) indicate, respectively, the situations in which the mutation is detected or not detected.

The situation becomes more complicated when two unlinked common polymorphisms occur in the same fragment [49, 50]. In such cases, the polymorphic variability itself results in ten possible combinations of genotypes, making mutation detection very difficult. In agreement with the high density of SNPs in genes, the likelihood of mutations coexisting with polymorphic sites in the analyzed fragments is high and increases with the fragment length. In our analysis of the BRCA1 gene exon 11 fragments, all long amplicons spanned either one or two polymorphic sites. Polymorphic sites were also present in 3 of 4 medium and 4 of 7 short amplicons. Efficient analysis of both the medium and the long fragment “C” was possible because the two coexisting polymorphisms were linked and resulted in only three of the ten possible combinations of genotypes. Thus, the presence of a polymorphic background is another important factor limiting successful scanning of long fragments of genes for mutations using conformation-sensitive methods.
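The counts of ten and three genotype combinations can be reproduced by enumeration. The sketch below assumes two biallelic sites and counts unordered pairs of haplotypes, so the double heterozygote is counted separately for the two variants on the same or on different chromosomes (compare Figs. 5D and E). The allele labels are arbitrary placeholders, and complete linkage is modeled simply as only two haplotypes being present.

```python
from itertools import combinations_with_replacement, product


def genotype_combinations(haplotypes):
    """All unordered pairs of haplotypes (one per chromosome)."""
    return list(combinations_with_replacement(sorted(haplotypes), 2))


# Two biallelic polymorphic sites; labels are placeholders.
site1_alleles = ("A", "a")
site2_alleles = ("B", "b")

# Unlinked sites: all four allele combinations occur as haplotypes.
unlinked_haplotypes = ["".join(h) for h in product(site1_alleles, site2_alleles)]

# Completely linked sites: only two haplotypes occur (assumption).
linked_haplotypes = ["AB", "ab"]

if __name__ == "__main__":
    print(len(genotype_combinations(unlinked_haplotypes)), "combinations (unlinked)")
    print(len(genotype_combinations(linked_haplotypes)), "combinations (linked)")
```

With n haplotypes the number of combinations is n(n + 1)/2, which gives 10 for n = 4 and 3 for n = 2.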

4 Concluding remarks

Although the average exon length in human genes is about 140 bp [51], long and even very long exons such as exon 11 of the BRCA1 gene are not uncommon. They justify attempts to scan them for mutations in longer fragments. The optional scanning of cDNAs [52–54] instead of genomic sequences provides a further rationale for such investigations. The systematic analysis of different BRCA1 mutations, polymorphisms and sequence variants in fragments of different length enabled us to demonstrate clearly the decreasing detection rate of all these variants in fragments of increasing length. We have revealed two structural factors limiting mutation detection in longer DNA fragments. These factors are also important for other mutation detection methods in which the structural change caused by the mutation forms the basis for its detection. A practical conclusion from our study is that long-fragment analysis should be avoided when scanning or screening for SNPs or mutations by conformation-sensitive methods.

We thank T. Wagner (Austria), L. Messiaen (Belgium), H. Nevanlinna, R. Winqvist (Finland), J. E. Eyfjord (Iceland), M. Santarosa, P. Radice (Italy), J. G. M. Klijn, P. Devilee (the Netherlands), J. Apold (Norway), A. Vega, O. Diez, J. Benitez, T. Caldes (Spain), A. Lindblom (Sweden), and P. Bond (UK) for providing us with DNA samples containing mutations in the BRCA1 and BRCA2 genes. This work was supported by the State Committee for Scientific Research, grant No. PBZ-KBN-040/P04/13, and the Foundation for Polish Science, grants No. 117/96 and 8/2000. One of us, PK, is highly indebted to the Tomasz Jakub Michalski Foundation for their appreciation.

Received February 2, 2004

5 References

[1] Stenson, P. D., Ball, E. V., Mort, M., Phillips, A. D., Shiel, J. A., Thomas, N. S., Abeysinghe, S., Krawczak, M., Cooper, D. N., Hum. Mutat. 2003, 21, 577–581.

[2] Thorisson, G. A., Stein, L. D., Nucleic Acids Res. 2003, 31, 124–127.

[3] Orita, M., Iwahana, H., Kanazawa, H., Hayashi, K., Sekiya, T., Proc. Natl. Acad. Sci. USA 1989, 86, 2766–2770.

[4] Orita, M., Suzuki, Y., Sekiya, T., Hayashi, K., Genomics 1989, 5, 874–879.

[5] Keen, J., Lester, D., Inglehearn, C., Curtis, A., Bhattacharya, S., Trends Genet. 1991, 7, 5.

[6] Keen, T. J., Inglehearn, C. F., Lester, D. H., Bashir, R., Jay, M., Bird, A. C., Jay, B., Bhattacharya, S. S., Genomics 1991, 11, 199–205.

[7] Kirkpatrick, B. W., Huff, B. M., Casas-Carrillo, E., Anim. Genet. 1993, 24, 155–161.

[8] Kozlowski, P., Krzyzosiak, W. J., Nucleic Acids Res. 2001, 15, E71.

[9] Ganguly, A., Rock, M. J., Prockop, D. J., Proc. Natl. Acad. Sci. USA 1993, 90, 10325–10329.

[10] Ganguly, A., Hum. Mutat. 2002, 19, 334–342.

[11] Oefner, P. J., Underhill, P. A., Am. J. Hum. Genet. 1995, 57, A266.

[12] Underhill, P. A., Jin, L., Lin, A. A., Mehdi, S. Q., Jenkins, T., Vollrath, D., Davis, R. W., Cavalli-Sforza, L. L., Oefner, P. J., Genome Res. 1997, 7, 996–1005.

[13] Borresen, A. L., Hovig, E., Smith-Sorensen, B., Malkin, D., Lystad, S., Andersen, T. I., Nesland, J. M., Isselbacher, K. J., Friend, S. H., Proc. Natl. Acad. Sci. USA 1991, 88, 8405–8409.

[14] Hovig, E., Smith-Sorensen, B., Brogger, A., Borresen, A. L., Mutat. Res. 1991, 262, 63–71.

[15] Fischer, S. G., Lerman, L. S., Proc. Natl. Acad. Sci. USA 1983, 80, 1579–1583.

[16] Wartell, R. M., Hosseini, S. H., Moran Jr., C. P., Nucleic Acids Res. 1990, 18, 2699–2705.

[17] Gelfi, C., Righetti, P. G., Cremonesi, L., Ferrari, M., Electrophoresis 1994, 15, 1506–1511.

[18] Kristensen, A. T., Bjorheim, J., Ekstrom, P. O., BioTechniques 2002, 33, 650–653.

[19] Khrapko, K., Hanekamp, J. S., Thilly, W. G., Belenkii, A., Foret, F., Karger, B. L., Nucleic Acids Res. 1994, 22, 364–369.

[20] Sheffield, V. C., Beck, J. S., Kwitek, A. E., Sandstrom, D. W., Stone, E. M., Genomics 1993, 16, 325–332.

[21] Tian, H., Brody, L. C., Landers, J. P., Genome Res. 2000, 10, 1403–1413.

[22] Inazuka, M., Wenz, H. M., Sakabe, M., Tahira, T., Hayashi, K., Genome Res. 1997, 7, 1094–1103.

[23] Kukita, Y., Tahira, T., Sommer, S. S., Hayashi, K., Hum. Mutat. 1997, 10, 400–407.

[24] Perry, D. J., Carrell, R. W., J. Clin. Pathol. 1992, 45, 158–160.

[25] Boyd, M., Lanyon, W. G., Connor, J. M., Hum. Mutat. 1993, 2, 127–130.

[26] Liu, W., Smith, D. I., Rechtzigel, K. J., Thibodeau, S. N., James, C. D., Nucleic Acids Res. 1998, 26, 1396–1400.

[27] Hebenbrock, K., Williams, P. M., Karger, B. L., Electrophoresis 1995, 16, 1429–1436.

[28] Atha, D. H., Wenz, H. M., Morehead, H., Tian, J., O’Connell, C. D., Electrophoresis 1998, 19, 172–179.

[29] Larsen, L. A., Christiansen, M., Vuust, J., Andersen, P. S., Hum. Mutat. 1999, 13, 318–327.

[30] Cheng, J., Kasuga, T., Mitchelson, K. R., Lightly, E. R., Watson, N. D., Martin, W. J., Atkinson, D., J. Chromatogr. A 1994, 677, 169–177.

[31] Tian, H., Brody, L. C., Mao, D., Landers, J. P., Anal. Chem. 2000, 72, 5483–5492.

[32] Tian, H., Brody, L. C., Fan, S., Huang, Z., Landers, J. P., Clin. Chem. 2001, 47, 173–185.

[33] Kourkine, I. V., Hestekin, C. N., Buchholz, B. A., Barron, A. E., Anal. Chem. 2002, 74, 2565–2572.

[34] Sobczak, K., Kozlowski, P., Krzyzosiak, W. J., Acta Biochim. Pol. 1995, 42, 363–366.

[35] Kozlowski, P., Krzyzosiak, W. J., Electrophoresis 2004, 25, 990–998.

[36] Zuker, M., Nucleic Acids Res. 2003, 31, 3406–3415.

[37] Allawi, H. T., SantaLucia Jr., J., Biochemistry 1997, 36, 10581–10594.

[38] SantaLucia Jr., J., Proc. Natl. Acad. Sci. USA 1998, 95, 1460–1465.

[39] Kozlowski, P., Sobczak, K., Jasinska, A., Krzyzosiak, W. J., Hum. Mutat. 2000, 16, 371.

[40] Inazuka, M., Tahira, T., Hayashi, K., Genome Res. 1996, 6, 551–557.

[41] Ravnik-Glavac, M., Glavac, D., Dean, M., Hum. Mol. Genet. 1994, 3, 801–807.

[42] Kozlowski, P., Sobczak, K., Napierala, M., Wozniak, M., Czarny, J., Krzyzosiak, W. J., Nucleic Acids Res. 1996, 24, 1177–1178.

[43] Nielsen, D. A., Novoradovsky, A., Goldman, D., Nucleic Acids Res. 1995, 23, 2287–2291.

[44] Atha, D. H., Kasprzak, W., O’Connell, C. D., Shapiro, B. A., Nucleic Acids Res. 2001, 29, 4643–4653.

[45] Glavac, D., Potocnik, U., Podpecnik, D., Zizek, T., Smerkolj, S., Ravnik-Glavac, M., Hum. Mutat. 2002, 19, 384–394.

[46] Cargill, M., Altshuler, D., Ireland, J., Sklar, P., Ardlie, K., Patil, N., Shaw, N., Lane, C. R., Lim, E. P., Kalyanaraman, N., Nemesh, J., Ziaugra, L., Friedland, L., Rolfe, A., Warrington, J., Lipshutz, R., Daley, G. Q., Lander, E. S., Nat. Genet. 1999, 22, 231–238.

[47] Durocher, F., Shattuck-Eidens, D., McClure, M., Labrie, F., Skolnick, M. H., Goldgar, D. E., Simard, J., Hum. Mol. Genet. 1996, 5, 835–842.

[48] Orban, T. I., Csokay, B., Olah, E., BioTechniques 2000, 29, 94–98.

[49] Kwiatkowska, J., Slomski, R., Jozwiak, S., Short, M. P., Kwiatkowski, D. J., Genomics 1997, 44, 350–354.

[50] Veenstra-Van der Weele, J., Kim, S. J., Gonen, D., Hanna, G. L., Leventhal, B. L., Cook Jr., E. H., Mol. Psychiatry 2001, 6, 160–167.

[51] International Human Genome Sequencing Consortium, Nature 2001, 409, 860–921.

[52] Mackay, K., Byers, P. H., Dalgleish, R., Hum. Mol. Genet. 1993, 2, 1155–1160.

[53] Andersen, T. I., Eiken, H. G., Couch, F., Kaada, G., Skrede, M., Johnsen, H., Aloysius, T. A., Tveit, K. M., Tranebjaerg, L., Dorum, A., Moller, P., Weber, B. L., Borresen-Dale, A. L., Hum. Mutat. 1998, 11, 166–174.

[54] Jakubowska, A., Gorski, B., Byrski, T., Huzarski, T., Gronwald, J., Menkiszak, J., Cybulski, C., Debniak, T., Hadaczek, P., Scott, R. J., Lubinski, J., Hum. Mutat. 2001, 18, 149–156.

[55] Quesada, M. A., Menchen, S., Methods Mol. Biol. 2001, 162, 139–166.
