Tuberculosis (2006) 86, 314–318

Detecting genetic variability among different Mycobacterium tuberculosis strains using DNA microarrays technology

Raul Diaz a,*, Noman Siddiqi b, Eric J. Rubin b

a National Reference Laboratory on Tuberculosis and Mycobacteria, Pedro Kouri Institute of Tropical Medicine (IPK), Autopista Novia del Mediodía Km 6, Lisa, Havana 11300, Cuba
b Department of Immunology and Infectious Diseases, Harvard School of Public Health, 665 Huntington Ave, Boston, MA 02115, USA

Received 16 November 2005; accepted 20 January 2006

* Corresponding author. Tel.: +537 202 0448; fax: +537 204 6051. E-mail addresses: [email protected], [email protected] (R. Diaz).

http://intl.elsevierhealth.com/journals/tube
1472-9792/$ - see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.tube.2006.01.002

KEYWORDS Mycobacteria; Genetic variability; DNA microarray; Polymorphism

Summary Recent advances in functional and comparative genomics have improved our understanding of genetic diversity among the Mycobacterium tuberculosis complex. In this study, we investigated the genetic polymorphism of M. tuberculosis using whole-genome microarray analysis. Amplified fragments of 15 M. tuberculosis strains (from two different geographical origins) and the reference strain H37Rv were produced by random amplification of polymorphic DNA (RAPD) using three different primers. The RAPD products were labeled with fluorescent dyes (Cy3 and Cy5) and hybridized to a TB DNA microarray representing nearly all open reading frames (ORFs) of H37Rv. The final results were analyzed using bioinformatic tools. Some genetic variability was found among the 16 M. tuberculosis strains. The majority of the highly polymorphic DNA sequences were observed in ORFs representing non-essential genes of the bacterium. The future use of comparative genomics based on DNA microarray technology should prove a powerful tool for understanding phenotypic variability among M. tuberculosis isolates of similar genetic composition. It is also a promising approach to provide important insights into evolution, virulence and pathogenesis of M. tuberculosis. © 2006 Elsevier Ltd. All rights reserved.


Introduction

The availability of the complete genome sequence of Mycobacterium tuberculosis has greatly increased knowledge about this important pathogen. In spite of the large-sequence polymorphisms (LSPs) and single nucleotide polymorphisms (SNPs) that have recently been found in M. tuberculosis isolates, the molecular basis of genotypic variation in virulence and transmissibility of the bacillus is unclear. More recently, new developments in functional and comparative genomics, using DNA microarrays and bioinformatics, have fostered major advances in our understanding of genetic variability among M. tuberculosis strains, providing a whole-genome perspective on the genomic content, gene regulation and metabolism of M. tuberculosis.1,2 The application of DNA microarray technology to natural populations of mycobacteria is a promising approach for understanding their evolution, virulence and pathogenesis. However, the resolution of such methods for detecting chromosomal variation is limited to insertions and deletions large enough to be detected by microarrays. These methods often fail to detect small insertion/deletion events and are unlikely to find SNPs.

Here we describe a different method to detect genetic polymorphisms in M. tuberculosis. We combine random amplified polymorphic DNA (RAPD)3 with DNA microarray hybridization.2 This allows us to detect chromosomal alterations that cannot be easily seen using either method alone.

Materials and methods

Mycobacterial strains

Sixteen M. tuberculosis strains were used in this study: M. tuberculosis H37Rv, two from Russia provided by Megan Murray (Harvard School of Public Health, Massachusetts, USA) and 13 from Massachusetts, USA, provided by Alexander Sloutsky (Massachusetts State Laboratory Institute, USA). DNA was purified by a standard protocol.4 All DNA samples had been genotyped previously by restriction fragment length polymorphism (RFLP)4 and spoligotyping.5

Polymerase chain reaction (PCR) by random amplified polymorphic DNA (RAPD) analysis

A RAPD method using 20 different primers, OPA1–20 (OPA A Kit, Operon Technologies, California, USA), was performed to obtain a large set of amplified fragments. DNA of M. tuberculosis H37Rv was evaluated to select the best combination of primers. The DNA amplification reaction (25 µL) contained 10 mM Tris/HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 10% (vol/vol) DMSO, 25 pmol primer, 0.2 mM dNTPs, 50 ng DNA, and 1.25 units of Taq DNA polymerase (Takara Bio Inc., Shiga, Japan). After selection of adequate primers, the volume, the amount of primer and the quantity of polymerase were increased to 100 µL, 125 pmol and 5 units, respectively. PCR conditions in a PTC-200 Thermocycler (MJ Research, Massachusetts, USA) or a Mastercycler Gradient Thermocycler (Eppendorf, California, USA) were 95 °C for 2 min, 35 cycles of 95 °C for 30 s, 30 °C for 1 min and 72 °C for 2 min, and a final extension at 72 °C for 5 min. Ten microliters of each PCR product was electrophoresed in 1% agarose with Tris-acetate-EDTA buffer at 70 V for 2 h.
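The cycling program and the scale-up described above can be checked with a little arithmetic. The following is only a sketch, assuming the volumes are in microliters and ignoring ramp times between temperatures (the text does not report them); the names `Step`, `total_hold_seconds` and `per_microliter` are illustrative and not part of the original protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One thermocycler hold: target temperature (deg C) and hold time (s)."""
    temp_c: float
    seconds: int


# Program as stated in the text; ramp times are not given and are ignored.
INITIAL_DENATURATION = Step(95, 120)
CYCLE = (Step(95, 30), Step(30, 60), Step(72, 120))
N_CYCLES = 35
FINAL_EXTENSION = Step(72, 300)


def total_hold_seconds() -> int:
    """Sum of all programmed hold times, excluding ramping."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return (INITIAL_DENATURATION.seconds
            + N_CYCLES * per_cycle
            + FINAL_EXTENSION.seconds)


def per_microliter(amount: float, volume_ul: float) -> float:
    """Amount of a component per microliter of reaction."""
    return amount / volume_ul


if __name__ == "__main__":
    print("hold time (min):", total_hold_seconds() / 60)
    # 1 pmol/uL equals 1 uM, so these are primer concentrations in uM.
    print("primer, 25 uL reaction:", per_microliter(25, 25))
    print("primer, 100 uL reaction:", per_microliter(125, 100))
    # Taq polymerase in units per microliter, before and after scale-up.
    print("Taq, 25 uL reaction:", per_microliter(1.25, 25))
    print("Taq, 100 uL reaction:", per_microliter(5, 100))
```

Under these assumptions the programmed holds add up to 129.5 min, the polymerase stays at 0.05 units/µL after scale-up, and the primer rises slightly from 1 µM to 1.25 µM.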

DNA microarray

Fluorescent labeling

PCR amplicons were purified with the QIAquick Nucleotide Removal Kit (Qiagen, California, USA). The Klenow reaction (to incorporate aminoallyl-modified dNTPs for labeling) was performed in 25 µL total volume using 1 µg of template (300 ng each of the OPA2, 4 and 20 PCR products), 2 µg of 9-mer random primer, 0.2 mM aadNTPs and 27500 unit of Klenow DNA polymerase (New England Biolabs, Massachusetts, USA). Around 6 µg of Klenow product was labeled with fluorescent dyes (Cy3 and Cy5, Amersham, New Jersey, USA) at room temperature in the dark for 2 h.

Microarray hybridization

A total of 4–5 µg of Klenow products labeled with Cy3 (and/or Cy5) was hybridized against a tuberculosis microarray slide containing 3855 open reading frames (ORFs) from M. tuberculosis H37Rv (nearly all ORFs of this bacterium),6 using an automated hybridizing workstation (TECAN, Maennedorf, Switzerland) and a hybridization program according to Sassetti et al.6,7 The hybridized slides were scanned using an Axon 4000B scanner (Axon Instruments, California, USA) and analyzed using GenePix software 5.1 (Axon) and GeneSpring 7.0 software (Silicon Genetics, California, USA).
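The text does not state how array features were scored after scanning. One common way to compare a test strain with the H37Rv reference in a two-color experiment is a log2 intensity ratio per feature; the sketch below illustrates that idea only. The two-fold threshold, the intensity floor, the ORF labels and all intensity values are hypothetical and are not taken from the study or from the GenePix or GeneSpring software.

```python
import math


def log2_ratio(test_signal: float, reference_signal: float,
               floor: float = 1.0) -> float:
    """log2(test / reference), with a floor to avoid division by zero."""
    return math.log2(max(test_signal, floor) / max(reference_signal, floor))


def call_polymorphic(features: dict, threshold: float = 1.0) -> list:
    """Return ORF labels whose |log2 ratio| meets the (assumed) threshold.

    `features` maps an ORF label to a (test, reference) intensity pair.
    """
    return sorted(
        orf for orf, (test, ref) in features.items()
        if abs(log2_ratio(test, ref)) >= threshold
    )


# Hypothetical background-corrected intensities, for illustration only.
example = {
    "orf_A": (1200.0, 1150.0),  # similar signal in both channels
    "orf_B": (150.0, 2400.0),   # 16-fold lower in the test strain
    "orf_C": (4000.0, 500.0),   # 8-fold higher in the test strain
    "orf_D": (900.0, 1000.0),   # similar signal in both channels
}

if __name__ == "__main__":
    print(call_polymorphic(example))
```

With these made-up values, only `orf_B` and `orf_C` exceed the two-fold cut-off in either direction.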

Results

RAPD analysis

A set of 20 primers was evaluated in RAPD experiments using M. tuberculosis H37Rv DNA. DNA amplification using three primers, OPA2, 4 and 20, showed the greatest variety of amplified fragments. RAPD with a single primer produced more amplified bands than with a combination of two or three primers per sample. The high diversity of amplicons derived using primers OPA2, 4 and 20 was confirmed with DNA of the 15 strains from Russia and Massachusetts (Fig. 1).

Figure 1 RAPD patterns obtained with OPA2 primer for nine Mycobacterium tuberculosis strains from the Massachusetts State Laboratory Institute. Lanes 1 and 12, molecular weight marker 100 bp ladder; lanes 2–10, M. tuberculosis strains; lane 11, molecular marker 1 kb ladder.

Table 1 Examples of some essential and non-essential polymorphic genes found in this study.

Gene     Region    Name                                       Function

Essential genes
adi      Rv2531c   Ornithine/arginine decarboxylase           Amino acids and amines
alr      Rv3423c   Alanine racemase                           Pyruvate
argF     Rv1656    Ornithine carbamoyltransferase             Glutamate family
argR     Rv1657    Arginine repressor                         Repressors/activators
atpB     Rv1304    ATP synthase a chain
groEL2   Rv0440    60 kD chaperonin 2                         Chaperones/Heat shock

Non-essential genes
ahpC     Rv2428    Alkyl hydroperoxide reductase              Detoxification
atsB     Rv3299c   Probable arylsulfatase                     Sulphur metabolism
ctaB     Rv1451    Cytochrome c oxidase assembly factor       Aerobic
dinX     Rv1537    Probable DNA-damage-inducible protein      Restriction/modification
ephD     Rv2214c   Probable epoxide hydrolase                 Detoxification
fadA4    Rv1323    Acetyl-CoA C-acetyltransferase (aka thiL)  Fatty acids
fadD4    Rv0214    Acyl-CoA synthase                          Fatty acids
fadE10   Rv0873    Acyl-CoA dehydrogenase                     Fatty acids
fdxB     Rv3554    Ferredoxin                                 Electron transport
fic      Rv3641c   Possible cell division protein             Cell division
mmpL1    Rv0402c   Conserved large membrane protein           Conserved membrane proteins
metZ     Rv0391    O-succinylhomoserine sulfhydrylase         Aspartate
nadR     Rv0212c   Similar to E. coli NadR                    Repressors/activators

Genetic polymorphism in DNA microarray experiments

More than 90% of the 3855 ORFs analyzed hybridized with mixed PCR products derived from reactions using primers OPA2, 4 and 20 with H37Rv DNA as template. In contrast, 60% of all ORFs on the microarray slide showed a positive reaction with amplicons from the different M. tuberculosis strains. We found that the hybridization of 293 array features was polymorphic. The polymorphic genes were distributed randomly around the genome. The majority (72%) of these polymorphic genes were predicted to be non-essential.8 Some examples of essential and non-essential polymorphic genes appear in Table 1.

One of the major drawbacks of RAPD analysis is that small changes in experimental conditions might yield markedly different results. To measure the variation between experiments, we repeated our analysis in completely independent replicates and used two different thermocyclers for DNA amplification on successive days. We found excellent reproducibility (data not shown).

Using a clustering algorithm, we could compare the relatedness of strains. Strains originally isolated in Peru (from the Massachusetts collection) were more closely related to each other than those isolated in Russia (Fig. 2). The two strains with identical spoligo and RFLP patterns (M14 and M15) were the most closely related. However, even these strains could be distinguished.

Figure 2 Clustering of strains based on their relatedness as determined by DNA microarray analysis of RAPD products. Amplicons derived from the indicated strains and from H37Rv were differentially labeled, mixed, and hybridized to an M. tuberculosis DNA microarray. Shown are the results of a phylogenetic tree (constructed using GeneSpring software) derived from the 293 most polymorphic loci. Strains marked R are Russian strains while those marked M are from the Massachusetts State Laboratory Institute.
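The clustering step can be illustrated with a small, self-contained sketch. This is not the GeneSpring procedure used in the study, whose settings are not reported; it assumes each strain is reduced to a presence/absence profile over polymorphic loci, uses Hamming distance, and merges clusters by average linkage. The strain labels and profiles are toy values invented for the example.

```python
from itertools import combinations


def hamming(a, b) -> int:
    """Number of loci at which two presence/absence profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def average_linkage(profiles: dict) -> list:
    """Agglomerative clustering; returns merges as (cluster, cluster, distance)."""

    def dist(c1, c2) -> float:
        # Mean pairwise Hamming distance between members of two clusters.
        total = sum(hamming(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters and merge them.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((c1, c2, dist(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(tuple(sorted(c1 + c2)))
    return merges


# Toy profiles over eight hypothetical loci (1 = hybridization signal present).
toy_profiles = {
    "A": (1, 1, 0, 1, 0, 0, 1, 1),
    "B": (1, 1, 0, 1, 0, 0, 1, 0),  # differs from A at a single locus
    "C": (1, 0, 0, 1, 1, 0, 1, 0),
    "D": (0, 0, 1, 0, 1, 1, 0, 0),
}

if __name__ == "__main__":
    for left, right, distance in average_linkage(toy_profiles):
        print(left, right, round(distance, 2))
```

In this toy data set the pair A and B merges first at a distance of 1, which mirrors the situation described above: two very similar strains cluster together yet remain distinguishable because their distance is not zero.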

Discussion

The availability of complete genome sequence information for M. tuberculosis and the recent development of comparative genomic tools have provided a great opportunity to compare DNA sequences among M. tuberculosis isolates. Array-based comparative genomics is a promising approach to investigate the molecular epidemiology, microbial evolution and pathogenesis of M. tuberculosis isolates. DNA microarrays take advantage of what is known about the genome sequence and allow the evaluation of genetic variability from a whole-genome perspective.1,2,9,10

Here we used a combination of RAPD methodology, to produce a large set of amplified fragments, with a DNA microarray to evaluate the extent of genetic variability among M. tuberculosis isolates from two different geographic regions. RAPD analysis utilizes arbitrarily designed short primers to amplify several unknown loci concurrently. It is simple and easy to perform and allows rapid and inexpensive analysis. However, two drawbacks limit the use of RAPD in M. tuberculosis. First, strain diversity in this species is rather limited. Thus, single sets of primers will only infrequently detect differences between strains. Second, because RAPD products are analyzed by gel electrophoresis, the number of products must be limited to avoid overly complex patterns.

To increase the diversity of products that could be analyzed, we used a DNA microarray to detect and map RAPD products. We found that the use of three primers produced diverse PCR amplicons. Under the conditions used in our experiments we were able to obtain thousands of amplicons from a variety of strains. Importantly, the results obtained were reproducible in replicate samples, suggesting that this method could be quite robust. In addition, analysis of different strains produced different results, suggesting that this method can be useful for detecting polymorphisms. Finally, the large number of polymorphic loci, far larger than can be identified using methods such as RFLP analysis, suggests that this method might produce fine discrimination among strains.

What is the nature of these polymorphisms? It is impossible to determine from this preliminary work. Insertions and deletions would clearly be likely to produce RAPD polymorphisms. However, because of the nature of the method, it is possible that SNPs might also produce different amplification patterns, particularly if those alterations were within sequences to which the RAPD primers hybridized. Further characterization of polymorphic genes, using Southern hybridization and sequencing, will be necessary to identify the cause of polymorphic amplifications. Nevertheless, these differences might help define biological differences between geographically diverse strains.11

While useful in a research laboratory, the use of DNA microarrays for typing strains is impractical. The arrays and the equipment required are expensive, and extensive training is required for their use. It is difficult to imagine the wide application of such a methodology. However, knowing which genes are polymorphic could allow the use of very accessible technology. Because most genes do not produce informative polymorphisms, they can be omitted from any detection strategy. There are several simple methods to detect hybridization to macroarrays (such as the methodology used for spoligotyping5) that could be combined with RAPD to produce a simple and informative strain typing method. None of these techniques is more difficult than RFLP analysis, and they might, therefore, be widely used. We are currently working to develop such methods.

Acknowledgments

We thank Megan Murray and Alexander Sloutsky for supplying the DNA of M. tuberculosis isolates, and members of the Rubin lab for helpful advice.

This work was supported by Grant AI051929 from the National Institutes of Health (to E.J.R.).

R. Diaz was the recipient of a fellowship from the David Rockefeller Center for Latin American Studies (DRCLAS), Harvard University.

References

1. Kato-Maeda M, Bifani PJ, Kreiswirth BN, Small PM. The nature and consequence of genetic variability within Mycobacterium tuberculosis. J Clin Invest 2001;107:533–7.

2. Behr MA, Wilson MA, Gill WP, Salamon H, Schoolnik GK, Rane S, et al. Comparative genomics of BCG vaccines by whole-genome DNA microarray. Science 1999;284:1520–3.

3. Abed Y, Davin-Regli A, Bollet C, De Micco P. Efficient discrimination of Mycobacterium tuberculosis strains by 16S-23S spacer region-based random amplified polymorphic DNA analysis. J Clin Microbiol 1995;33(5):1418–20.

4. van Embden JD, Cave MD, Crawford JT, Dale JW, Eisenach KD, Gicquel B, et al. Strain identification of Mycobacterium tuberculosis by DNA fingerprinting: recommendations for a standardized methodology. J Clin Microbiol 1993;31:406–9.

5. Kamerbeek J, Schouls L, Kolk A, van Agterveld M, van Soolingen D, Kuijper S, et al. Simultaneous detection and strain differentiation of Mycobacterium tuberculosis for diagnosis and epidemiology. J Clin Microbiol 1997;35:907–14.

6. Sassetti CM, Boyd DH, Rubin EJ. Comprehensive identification of conditionally essential genes in mycobacteria. Proc Natl Acad Sci USA 2001;98:12712–7.

7. Sassetti CM, Rubin EJ. Genetic requirements for mycobacterial survival during infection. Proc Natl Acad Sci USA 2003;100:12989–94.

8. Sassetti CM, Boyd DH, Rubin EJ. Genes required for mycobacterial growth defined by high density mutagenesis. Mol Microbiol 2003;48:77–84.

9. Tsolaki AG, Hirsh AE, DeRiemer K, Enciso JA, Wong MZ, Hannan M, et al. Functional and evolutionary genomics of Mycobacterium tuberculosis: insights from genomic deletions in 100 strains. Proc Natl Acad Sci USA 2004;101:4865–70.

10. Fleischmann RD, Alland D, Eisen JA, Carpenter L, White O, Peterson J, et al. Whole-genome comparison of Mycobacterium tuberculosis clinical and laboratory strains. J Bacteriol 2002;184:5479–90.

11. Kremer K, van Soolingen D, Frothingham R, Haas WH, Hermans PW, Martin C, et al. Comparison of methods based on different molecular epidemiological markers for typing of Mycobacterium tuberculosis complex strains: interlaboratory study of discriminatory power and reproducibility. J Clin Microbiol 1999;37:2607–18.