

Corrections

MEDICAL SCIENCES. For the article “HLA-B*5801 allele as a genetic marker for severe cutaneous adverse reactions caused by allopurinol,” by Shuen-Iu Hung, Wen-Hung Chung, Lieh-Bang Liou, Chen-Chung Chu, Marie Lin, Hsien-Ping Huang, Yen-Ling Lin, Joung-Liang Lan, Li-Cheng Yang, Hong-Shang Hong, Ming-Jing Chen, Ping-Chin Lai, Mai-Szu Wu, Chia-Yu Chu, Kuo-Hsien Wang, Chien-Hsiun Chen, Cathy S. J. Fann, Jer-Yuarn Wu, and Yuan-Tsong Chen, which appeared in issue 11, March 15, 2005, of Proc. Natl. Acad. Sci. USA (102, 4134–4139; first published March 2, 2005; 10.1073/pnas.0409500102), due to a printer’s error, the symbols in the affiliation line appeared incorrectly. The corrected affiliation line appears below.

ᵃInstitute of Biomedical Sciences, Academia Sinica, Taipei 11529, Taiwan; Departments of ᶜDermatology, ᵉRheumatology, Allergy, and Immunology, and ʰNephrology, Chang Gung Memorial Hospital, Taipei 10507, Taiwan; ᶠDepartment of Medical Research, Mackay Memorial Hospital, Taipei 10449, Taiwan; ᵍDepartment of Immunology and Rheumatology, Taichung Veterans General Hospital, Taichung 40705, Taiwan; ⁱDepartment of Dermatology, National Taiwan University Hospital, Taipei 10002, Taiwan; ʲDepartment of Dermatology, Taipei Medical University Hospital, Taipei 11031, Taiwan; ᵏDepartment of Medical Research, China Medical University Hospital, Taichung 40447, Taiwan; ˡDepartment of Pediatrics, Duke University Medical Center, Durham, NC 27710; and ᵈMolecular Medicine Program, Taiwan International Graduate Program, Academia Sinica and the School of Life Sciences, National Yang Ming University, Taipei 11529, Taiwan

www.pnas.org/cgi/doi/10.1073/pnas.0502360102

BIOGRAPHY, NEUROSCIENCE. For the article “Biography of Cornelia I. Bargmann,” by Melissa Marino, which appeared in issue 9, March 1, 2005, of Proc. Natl. Acad. Sci. USA (102, 3181–3183; first published February 22, 2005; 10.1073/pnas.0500025102), due to an editorial office error, the term “roundworm” was incorrectly replaced with “flatworm” in the first sentence. The correct sentence should read as follows: “In the unhearing, unseeing world of the roundworm Caenorhabditis elegans, its sense of smell is its lifeline.”

www.pnas.org/cgi/doi/10.1073/pnas.0502717102

www.pnas.org PNAS | April 26, 2005 | vol. 102 | no. 17 | 6237–6238


MICROBIOLOGY. For the article “Symmetrical base preferences surrounding HIV-1 and avian sarcoma/leukosis virus but not murine leukemia virus integration sites,” by Alexander G. Holman and John M. Coffin, which appeared in issue 17, April 26, 2005, of Proc. Natl. Acad. Sci. USA (102, 6103–6107; first published March 31, 2005; 10.1073/pnas.0501646102), the authors note the following: “After our report appeared in the PNAS Early Edition, we observed that the simultaneously published paper by Wu et al. (1) reported similar base preferences for all integration sites; however, the placement of the integration site in the analysis of the murine leukemia virus (MLV) data set differed by one base. Further analysis revealed a small initial error that propagated through our analysis, ultimately leading us to misplace the location of the MLV integration site by one base. As a consequence, the MLV integration site preferences incorrectly appeared to be asymmetric. We now conclude that HIV-1, avian sarcoma/leukosis virus (ASLV), and MLV all show symmetrical base preferences surrounding their integration sites. Accordingly, the title of the article should be corrected to read ‘Symmetrical base preferences surrounding HIV-1, avian sarcoma/leukosis virus, and murine leukemia virus integration sites,’ and it has been corrected in the online version. Fig. 2 and Figs. 11 and 12, which are published as supporting information on the PNAS web site, can be corrected by placing the integration site between offset 0 and 1 instead of −1 and 0. A corrected version of Fig. 4 and its legend appear below. The remainder of the analysis was unaffected by this error, and with the described correction, we remain confident in our overall conclusions.”

1. Wu, X., Li, Y., Crise, B., Burgess, S. M. & Munroe, D. J. (2005) J. Virol. 79, 5211–5214.

www.pnas.org/cgi/doi/10.1073/pnas.0502810102

Fig. 4. Comparison of the observed integration preferences to the inferred preferences for the opposite LTR. (A) Schematic of the topology of HIV-1 integration. HIV-1 integration complexes join the viral LTRs to opposite strands of the DNA separated by five bases. MLV joins with an offset of four bases, whereas ASLV uses a six-base offset (not pictured). (B) Symmetry observed in HIV-1 with five-base offset. Black lettering represents the base preference seen from the top LTR (Fig. 1). The integration site is indicated by the black dashed vertical line in the graph and the black arrow in the numbering schematic. The vertical arrow indicates the expected axis of symmetry based on the characteristic five-base spacing between the sites of HIV-1 DNA integration. The red lettering represents the same base preferences; however, they are reversed and shifted five bases to represent the preferences as observed from the bottom LTR. The inferred integration site is indicated by the red vertical line in the graph and the red arrow in the numbering schematic. (C) Symmetry observed in MLV with four-base offset. (D) Symmetry observed in ASLV with six-base offset.


Symmetrical base preferences surrounding HIV-1, avian sarcoma/leukosis virus, and murine leukemia virus integration sites

Alexander G. Holman and John M. Coffin*

Department of Molecular Microbiology, Sackler School of Graduate Biomedical Sciences, Tufts University School of Medicine, Boston, MA 02111

Contributed by John M. Coffin, February 28, 2005

To investigate retroviral integration targeting on a nucleotide scale, we examined the base frequencies directly surrounding cloned in vivo HIV-1, murine leukemia virus, and avian sarcoma/leukosis virus integrations. Base preferences of up to 2-fold the expected frequencies were found for three viruses, representing P values down to <10⁻¹⁰⁰ and defining what appear to be preferred integration sequences. Offset symmetry reflecting the topology of the integration reaction was found for HIV-1 and avian sarcoma/leukosis virus but not murine leukemia virus, suggesting fundamental differences in the way different retroviral integration complexes interact with host-cell DNA.

retrovirus | targeting

Although recent evidence suggests that integration preferences for some retroviruses are based on the in vivo transcriptional activity of the target DNA (1–5), it has been generally thought that there is little sequence specificity for bases surrounding the integration site. In vitro, studies using nonionic detergent-lysed HIV-1 virions have found small base preferences surrounding the integration (6). In vivo, only a weak base preference has been described within the five-base duplication produced during HIV-1 integration (7).

Recently, several cloning projects have produced an unprecedented number of sequences from HIV-1, murine leukemia virus (MLV), and avian sarcoma/leukosis virus (ASLV) integrations into human cellular DNA (1, 3–5). To better understand how these viruses select an integration site, we analyzed the base preferences in the genomic sequence directly surrounding the integration sites. We report here that, with an analysis possessing sufficient statistical power, strong preferences are shown. In addition, this analysis reveals previously undescribed patterns of symmetry reflecting the topology of the integration reaction.

Materials and Methods

Obtaining Cloned Integration Sites. Sequence sets used were obtained by selective amplification and cloning of human cellular DNA flanking integration sites published by Schroder et al. (1) for HIV-1 deposited in GenBank (accession nos. BH609398–BH609878); those reported by Wu et al. (3), for HIV-1 and MLV obtained as a gift from the authors and deposited in GenBank (accession nos. AY515855–AY517469); those reported by Mitchell et al. (4), for HIV-1 and ASLV deposited in GenBank (accession nos. CL528318–CL529767); and those reported by Narezkina et al. (5), for ASLV obtained as a gift and deposited in GenBank (accession nos. AY653309–AY653534) (sequence sets are further described in Table 1).

Genomic Localization of Integration Sites. The BLAT program (8), hosted by the University of California, Santa Cruz, Genome Bioinformatics Group, was used to search each integration clone against the July 2003 freeze of the human genome. BLAT was chosen because it excels at quickly finding high-identity matches when searching with sequences of 30 bases or longer. Results were curated to remove low-quality hits. Matches were grouped by sequence name, then sorted by total number of bases matched. The best match for each sequence was used if it had >95% identity and if the second best match had <90% identity. Of the clones discarded, nearly all were removed because they were too short to uniquely match a single genomic site. Additionally, the effects observed were quite similar among data sets irrespective of the number of clones discarded. If the beginning of the match did not fall at the first base of the clone, the genomic match was adjusted by the same number of bases. For plus-strand matches, bases were subtracted from the lower of the two genomic site boundaries. For minus-strand matches, bases were added to the higher of the two genomic site boundaries. The genomic sequence was then retrieved from the July 2003 freeze of the genome database hosted at the University of California, Santa Cruz (9, 10). For plus-strand matches, the lower boundary of the genomic match was used as the site of viral joining, and the plus-strand sequence was requested. For minus-strand matches, the higher boundary of the genomic match was used as the site of viral joining, and the minus-strand sequence was requested. One thousand bases of flanking sequence both 5′ and 3′ to the site of viral joining were also requested, producing a 2,001-base sequence with the joining site for a retroviral integration event located at the center. A set of 881 random locations throughout the genome was used as a control.
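The curation and coordinate rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the `Hit` record, its field names, and the helper functions are hypothetical, the identity comparisons are assumed to be strict, and the coordinates are treated as simple integers without reference to any particular BLAT output format.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Hit:
    """One genome match for a clone (hypothetical, simplified record)."""
    clone: str       # clone (query) name
    matched: int     # total number of bases matched
    identity: float  # percent identity of the match
    strand: str      # "+" or "-"
    q_start: int     # number of clone bases that precede the match
    g_low: int       # lower genomic boundary of the match
    g_high: int      # higher genomic boundary of the match


def best_unique_hit(hits: Iterable[Hit], min_best: float = 95.0,
                    max_second: float = 90.0) -> Optional[Hit]:
    """Return the best hit for one clone, or None if it is weak or ambiguous.

    Hits are ranked by total bases matched. The strict comparisons are an
    assumption of this sketch.
    """
    ranked = sorted(hits, key=lambda h: h.matched, reverse=True)
    if not ranked or ranked[0].identity <= min_best:
        return None
    if len(ranked) > 1 and ranked[1].identity >= max_second:
        return None
    return ranked[0]


def joining_site(hit: Hit) -> int:
    """Genomic coordinate taken as the site of viral joining.

    Plus strand: lower boundary, moved down by the unmatched leading bases.
    Minus strand: higher boundary, moved up by the unmatched leading bases.
    """
    if hit.strand == "+":
        return hit.g_low - hit.q_start
    return hit.g_high + hit.q_start


def window(site: int, flank: int = 1000) -> Tuple[int, int]:
    """Inclusive boundaries of the (2 * flank + 1)-base window around site."""
    return site - flank, site + flank
```

With the default `flank` of 1,000, `window` spans 2,001 bases with the joining site at the center, matching the sequence length described in the text.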

Analysis of Base Frequency Surrounding Integration Sites. Sequences were aligned to their integration site and numbered relative to distance from integration. The center base was the first genomic base 3′ to the viral integration joint, referred to as offset 0, flanked by 500 bases 5′, referred to as offsets −500 through −1, and flanked by 500 bases 3′, referred to as offsets 1–500. Thus the sequences analyzed represented the genomic sequence before the viral long terminal repeat (LTR) was inserted between offsets −1 and 0. A global base frequency was calculated across all offsets for all sequences present in a set. Base frequencies observed at each offset between −500 and 500 within a set were compared to the global base frequencies and the P value of any differences determined by using χ² analysis. The overall base composition for each set did not differ greatly from that of the genome as a whole (Table 1).
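The per-offset comparison can be illustrated with a short Python sketch. It is not the authors' code: the function names are hypothetical, the sequences are assumed to be equal-length strings already aligned so that the same column holds the same offset, and the test is assumed to be a goodness-of-fit χ² with 3 degrees of freedom (four bases), for which the upper-tail probability has a closed form.

```python
import math
from collections import Counter
from typing import Dict, Sequence

BASES = "ACGT"


def global_frequencies(seqs: Sequence[str]) -> Dict[str, float]:
    """Base frequencies pooled over all offsets of all sequences in a set."""
    counts: Counter = Counter()
    for seq in seqs:
        counts.update(b for b in seq.upper() if b in BASES)
    total = sum(counts.values())
    return {b: counts[b] / total for b in BASES}


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    if x <= 0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


def offset_p_values(seqs: Sequence[str], center: int) -> Dict[int, float]:
    """Map each offset to the P value of its base counts against the global frequencies.

    `center` is the column index that corresponds to offset 0. The sketch
    assumes every base occurs at least once somewhere in the set.
    """
    expected = global_frequencies(seqs)
    p_values: Dict[int, float] = {}
    for col in range(len(seqs[0])):
        observed = Counter(seq[col].upper() for seq in seqs)
        n = sum(observed[b] for b in BASES)
        if n == 0:
            continue  # column holds no unambiguous bases
        stat = sum((observed[b] - n * expected[b]) ** 2 / (n * expected[b])
                   for b in BASES)
        p_values[col - center] = chi2_sf_df3(stat)
    return p_values
```

For the 2,001-base windows described above, `center` would be 1000; restricting the returned offsets to −500 through 500 reproduces the range analyzed in the text.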

Abbreviations: MLV, murine leukemia virus; ASLV, avian sarcoma/leukosis virus; LTR, long terminal repeat.

See Commentary on page 5903.

*To whom correspondence should be addressed. E-mail: [email protected].

© 2005 by The National Academy of Sciences of the USA

www.pnas.org/cgi/doi/10.1073/pnas.0501646102 PNAS | April 26, 2005 | vol. 102 | no. 17 | 6103–6107

Results

To determine whether there was a base preference in the vicinity of the integrations, the sequences from each data set were aligned to the integration site and the frequency of each base at each position tabulated. These frequencies were compared with the overall frequency for that set by using a χ² analysis to calculate P values. The randomly selected genomic sites showed no significant bias at any site, with only two locations in the 1,001 bases analyzed having P values below 0.001 (Fig. 1 D and E and Table 2, which is published as supporting information on the PNAS web site).
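As a rough plausibility check on the control result, one can ask how many of the 1,001 positions would be expected to fall below P = 0.001 by chance alone. The sketch below is our own illustration and assumes that the positions are independent and that P values are uniformly distributed under the null hypothesis; neither assumption is stated in the text, and the function name is hypothetical.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Binomial probability of k or more successes in n independent trials."""
    below = sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


n_positions = 1001  # offsets -500 through 500
alpha = 0.001       # P value threshold quoted for the random control

expected_hits = n_positions * alpha                    # about one position
p_two_or_more = prob_at_least(2, n_positions, alpha)   # roughly one in four

print(f"expected positions below threshold: {expected_hits:.3f}")
print(f"probability of two or more by chance: {p_two_or_more:.3f}")
```

Under these assumptions about one position is expected to pass the threshold by chance, so observing two such positions in the random set is unremarkable.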

In a preliminary analysis, the five HIV-1 sequence sets revealed significant preferences at offsets as far as 19 bases 5′ and 15 bases 3′ from the integration site (Figs. 5–9, which are published as supporting information on the PNAS web site) that correlated across all five data sets (Fig. 10, which is published as supporting information on the PNAS web site). These five sets were then analyzed together as a combined HIV-1 data set, further clarifying the sequence preferred by the virus and expanding the region containing strongly preferred sites. The pattern [−3]TDG(int)GTWACCHA[7] (written by using standard International Union of Biochemistry mixed base codes) was preferred between offsets −3 and 7 with some bases appearing at frequencies up to 2-fold higher than expected yielding P values as low as 10⁻¹⁵⁶ (Fig. 1 A and B and Table 3, which is published as supporting information on the PNAS web site). Preferred bases with P values <10⁻³ were found as far as 20 bases 5′ and 17 bases 3′ of the site of integration with particularly A-T-rich regions centered around positions −10 and 14. The bias for particular bases near the integration site is particularly clear when the P values for the base distribution are plotted for the entire 1-kb region flanking the integration site (Fig. 1C).
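For readers less familiar with the mixed base codes, the pattern notation can be unpacked programmatically. The following Python sketch is illustrative only: `parse_pattern` is a hypothetical helper, and the bracketed labels are assumed to give the offsets of the first and last base, with `(int)` marking the integration site between offsets −1 and 0.

```python
import re
from typing import Dict, FrozenSet

# Standard IUB/IUPAC mixed base codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

_PATTERN = re.compile(r"\[(-?\d+)\]([A-Z]+)\(int\)([A-Z]+)\[(\d+)\]")


def parse_pattern(text: str) -> Dict[int, FrozenSet[str]]:
    """Turn '[-3]TDG(int)GTWACCHA[7]' into {offset: allowed bases}."""
    # Accept typographic minus signs and stray spaces from typeset text.
    cleaned = text.replace("\u2212", "-").replace("\u2013", "-").replace(" ", "")
    match = _PATTERN.fullmatch(cleaned)
    if match is None:
        raise ValueError(f"unrecognized pattern: {text!r}")
    first, left, right, last = int(match[1]), match[2], match[3], int(match[4])
    if first != -len(left) or last != len(right) - 1:
        raise ValueError("offset labels do not agree with the number of bases")
    return {first + i: frozenset(IUPAC[code])
            for i, code in enumerate(left + right)}


hiv = parse_pattern("[-3]TDG(int)GTWACCHA[7]")
for offset in sorted(hiv):
    print(offset, "".join(sorted(hiv[offset])))
```

In this reading, offset 0 of the HIV-1 pattern allows only G, offset −2 (code D) allows any base except C, and offset 2 (code W) allows A or T.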

The MLV and ASLV data sets were analyzed by the same methods, with minor preprocessing of the Narezkina sequences† (Figs. 11–16, which are published as supporting information on the PNAS web site). Both viruses showed significant base preferences proximal to the site of integration and little preference distant from it (Figs. 2 and 3 and Tables 4 and 5, which are published as supporting information on the PNAS web site). The preferences found in the region of integration, [−4]DNST(int)VVTRBSAV[7] for MLV and [−4]STNN(int)SNNNNSNAAS[9] for ASLV, contained individual bases occurring up to 2-fold more or 2-fold less often than expected, representing P values as low as 10⁻⁷⁷. Among the three viruses, the patterns of base preferences showed similarity neither in the pattern of bases preferred nor in the sites at which a preference occurred. This result is consistent with the observation that different viruses produce different patterns of integration hot and cold spots (11). Likely, the base preferences observed alter the bending, twisting, or stacking of the DNA strand to produce a secondary structure optimal for interaction with a specific species of viral integrase.

Further analysis of the preferred sequences revealed a striking symmetry that may provide important clues to the role of each LTR integrase complex in the targeting of the integration reaction. Integration is mediated by a presumably symmetrical complex of four integrase molecules, the two LTR ends, and some associated host factors. During integration, both 3′ ends of the viral DNA are inserted into the host DNA separated by four to six bases, depending on the viral species (6, 12). Therefore, for HIV-1 with its five-base separation, the base pair that represents offset 0 from one LTR integration also represents offset 4 observed from the perspective of the opposite LTR (Fig. 4A). To determine whether this predicted symmetry was represented in the pattern of preferred bases, we plotted the observed base preferences for each virus compared with the same preferences seen from the opposite LTR integrase complex.‡ HIV-1 exhibited a strong axis of symmetry at offset 2 corresponding to its five-base integration spacing (Fig. 4B, black vs. red lettering). This symmetry was retained through both preferred and avoided bases, for example, the preferred G/C pair at offset 0 and the avoided C/G pair at offset −2. Offsets as far as −15 upstream and 19 downstream showed complementarity to the corresponding offset, as seen by the opposing complex. This result strongly suggests that, for HIV-1, the topology of the double-ended integration reaction with its conserved 5-bp duplication is key in target site choice and is thus reflected in the base preferences observed surrounding the integration site. ASLV integration creates a six-base duplication, putting the predicted axis of symmetry between offsets 2 and 3. Complementary preferred and avoided bases were seen at all six sites that exhibit significant preferences (Fig. 4D, black vs. red lettering). MLV integration leads to four-base duplication of genomic DNA, corresponding to an axis of symmetry between offsets 1 and 2. In the MLV data, however, little complementarity was seen around this axis (Fig. 4C, black vs. red lettering). Thus, the strong symmetry seen with HIV-1 and ASLV is not a feature of retroviral DNA integration in general.

†The Schroder, Wu, and Mitchell sequence sets are deposited in GenBank, with the integration site occurring at the 5′ end followed by the cloned genomic sequence. The Narezkina set places the integration at either end of each sequence, depending on the strand to which the integration was mapped. The cloning of the Narezkina sequences used a single four-cutter restriction enzyme whose site was required in the final screening of the clones. Thus it was possible to reorient the sequences based on the occurrence of the restriction site at the 3′ end of the genomic sequence and to place the integration site at the 5′ end. In the few cases where neither or both ends showed a restriction site, the sequence was discarded from the analysis.

‡Base preferences viewed from the opposite LTR were created by reversing the observed preferences and offsetting them four to six bases depending on the spacing seen between the LTR integration sites for each virus species.

Table 1. Sets of integration clones used

Infecting virus  Original number  After curation  Experimental description                                        Base frequencies A/C/G/T, %
HIV-1 (1)        481              342             Pseudotyped HIV-1 infecting SupT1 cells                         29.5/20.2/20.3/30.0
HIV-1 (3)        462              360             Wild-type HIV-1 in H9 cells                                     30.6/19.6/19.6/30.2
HIV-1 (3)        294              164             Pseudotyped HIV-1 infecting HeLa cells                          30.0/19.9/20.0/30.1
HIV-1 (4)        528              503             Pseudotyped HIV-1 infecting peripheral blood mononuclear cells  30.9/19.2/19.2/30.9
HIV-1 (4)        467              426             Pseudotyped HIV-1 infecting IMR-90 lung fibroblasts             30.4/19.8/19.8/30.1
MLV (3)          623              567             Pseudotyped MLV infecting HeLa cells                            28.7/21.4/21.5/28.4
MLV (3)          431              372             Pseudotyped MLV infecting HeLa cells                            28.0/22.1/22.1/27.8
ASLV (5)         226              194             Pseudotyped ASLV infecting HeLa cells                           28.9/20.6/20.6/29.7
ASLV (4)         455              426             Pseudotyped ASLV infecting 293-TVA cells                        29.3/20.2/20.2/30.1
Random control   881              881             Randomly chosen 2,001 bp human sequences                        28.9/21.1/21.1/29.0
Total HIV        2,232            1,795                                                                           30.4/19.7/19.7/30.2
Total MLV        1,054            939                                                                             28.3/21.8/21.8/28.2
Total ASLV       681              620                                                                             29.2/20.3/20.3/30.0
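The opposite-LTR view used in the symmetry comparison (reverse the preferences, complement them, and shift by the duplication length) can be written compactly. The Python sketch below is our own illustration under the numbering used in the text, with the integration site between offsets −1 and 0; the function names are hypothetical, and the preference patterns are the mixed base code strings quoted in the Results.

```python
from typing import Dict

# Complement of each IUB/IUPAC mixed base code.
COMPLEMENT_CODE = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def pattern_to_offsets(first_offset: int, codes: str) -> Dict[int, str]:
    """Assign consecutive offsets to a string of mixed base codes."""
    return {first_offset + i: code for i, code in enumerate(codes)}


def opposite_ltr_view(prefs: Dict[int, str], duplication: int) -> Dict[int, str]:
    """Preferences as seen from the other LTR.

    With a duplication of d bases, offset k on one strand corresponds to
    offset d - 1 - k on the other, and each base is replaced by its
    complement (offset 0 maps to offset 4 for the five-base HIV-1 spacing).
    """
    return {duplication - 1 - k: COMPLEMENT_CODE[code]
            for k, code in prefs.items()}


hiv = pattern_to_offsets(-3, "TDGGTWACCHA")   # [-3]TDG(int)GTWACCHA[7]
mlv = pattern_to_offsets(-4, "DNSTVVTRBSAV")  # [-4]DNST(int)VVTRBSAV[7]

hiv_symmetric = opposite_ltr_view(hiv, duplication=5) == hiv
mlv_symmetric = opposite_ltr_view(mlv, duplication=4) == mlv
print("HIV-1 pattern maps onto itself:", hiv_symmetric)
print("MLV pattern maps onto itself:", mlv_symmetric)
```

In this sketch the HIV-1 consensus pattern maps exactly onto itself under the five-base shift, whereas the MLV pattern as printed does not under a four-base shift. This exact-match check on consensus codes is a much coarser criterion than the comparison of base preferences shown in Fig. 4.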

Fig. 1. Base preferences directly surrounding cloned HIV-1 integration sites. (A) Preferences around the HIV-1 integration site. Base frequencies relative to the integration site of the 5′ LTR end are shown. The vertical arrow indicates the expected axis of symmetry based on the characteristic five-base spacing between the sites of HIV-1 DNA integration. The x axis shows the offset for each base from the integration site. Sequences have been aligned so that all integrations fall between offsets −1 and 0, as indicated by the black dashed vertical line. The y axis represents the percent of the expected frequency observed for each base (58% A/T, 42% G/C). The horizontal line is drawn at 100% of the expected frequency. (B) P values obtained by χ² analysis comparing observed base frequencies with the expected frequencies. The y axis indicates the negative log10 of the P value. Taller bars indicate a more significant P value. Actual P values for each offset are shown at the top of the section. (C) Negative log10 of P values seen within the entire region 500 bases 5′ and 3′ from the integration site. (D) Base preferences directly surrounding mock integration sites in randomly selected genomic sites. Conventions are as in A. (E) P values of the base preferences surrounding mock integration sites in randomly selected genomic sites. Conventions are as in B.

Fig. 2. Base preferences directly surrounding cloned MLV integration sites. Conventions are as in Fig. 1, except that the symmetry is based on a four-base offset between integration sites.

Discussion

Because the expected symmetry exists in the preference patterns for HIV-1 and ASLV, both LTRs likely play a strong role in the targeting of the integration reaction, not necessarily acting together. The symmetry observed could be the result of either the interaction of the complex as a whole with a symmetric site or of either one complex or the other interacting with an asymmetric target DNA using the same site preference. There may be fundamental differences in MLV to preclude such symmetry. Possibly the asymmetric targeting of MLV may be due to the action of a single dominant LTR complex. These differences in symmetry would represent evidence of fundamental differences in the integration mechanisms of different retroviruses.

Our analysis has shown highly significant base preferences surrounding the integration sites of HIV-1, MLV, and ASLV, in some cases representing 2-fold higher- or lower-than-expected base frequencies. The patterns differ strikingly among viruses in both the sites at which preferences exist and the bases preferred at those sites. All bases are allowed at all sites; none is absolutely required or prohibited. The absence of an absolute consensus suggests that targeting mechanisms other than primary sequence recognition are involved. It is likely that structural characteristics optimal for interaction with integrase are shaped by the DNA primary sequence. Specificity of integration of retroviral DNA into chromosomal DNA is defined by a combination of several factors, including proximity to genes and/or transcriptional start sites (1, 3–5); transcriptional activity of the target DNA at the time of integration (1, 2); and, as exposed by our analysis, sequence of the integration target itself. Large regional effects may cause DNA to become accessible to the integration machinery, whereas microscale effects mediated by primary sequence seem to regulate actual target site choice. For HIV-1 and ASLV but not MLV, the symmetry of the integration reaction was reflected in the preferred bases, suggesting underlying differences in the way the virus interacts with DNA. Understanding how these specificities interact with one another is crucial to understanding the nature, mechanism, and importance of the overall specificity of the integration process.

Fig. 3. Base preferences directly surrounding cloned ASLV integration sites. Conventions are as in Fig. 1, except that the symmetry is based on a six-base offset between integration sites.

Fig. 4. Comparison of the observed integration preferences to the inferred preferences for the opposite LTR. (A) Schematic of the topology of HIV-1 integration. HIV-1 integration complexes join the viral LTRs to opposite strands of the DNA separated by five bases. MLV joins with an offset of four bases, whereas ASLV uses a six-base offset (not pictured). (B) Symmetry observed in HIV-1 with five-base offset. Black lettering represents the base preference seen from the top LTR (Fig. 1). The integration site is indicated by the black dashed vertical line in the graph and the black arrow in the numbering schematic. The vertical arrow indicates the expected axis of symmetry based on the characteristic five-base spacing between the sites of HIV-1 DNA integration. The red lettering represents the same base preferences; however, they are reversed and shifted five bases to represent the preferences as observed from the bottom LTR. The inferred integration site is indicated by the red vertical line in the graph and the red arrow in the numbering schematic. (C) Symmetry observed in MLV with four-base offset. (D) Symmetry observed in ASLV with six-base offset.

We thank Shawn Burgess of the National Human Genome Research Institute, National Institutes of Health (Bethesda, MD), for the gift of the Wu data sets; Anna Narezkina of the University of Pennsylvania School of Medicine (Philadelphia) for the gift of the Narezkina data sets; the Center for Gastroenterology Research on Absorptive and Secretory Processes (GRASP) at Tufts University for technical support; and Kurt Wollenberg for statistical consultation. This work was supported by National Institutes of Health Grant R01-CA-92192 (to J.M.C.). J.M.C. was a Research Professor of the American Cancer Society, with support from the F. M. Kirby Foundation. A.G.H. was supported by a National Institutes of Health Interdisciplinary Training Program in Cancer Genetics (NIH T32 CA65441).

1. Schroder, A. R., Shinn, P., Chen, H., Berry, C., Ecker, J. R. & Bushman, F. (2002) Cell 110, 521–529.
2. Maxfield, L. F., Fraize, C. D. & Coffin, J. M. (2005) Proc. Natl. Acad. Sci. USA 102, 1436–1441.
3. Wu, X., Li, Y., Crise, B. & Burgess, S. M. (2003) Science 300, 1749–1751.
4. Mitchell, R. S., Beitzel, B. F., Schroder, A. R., Shinn, P., Chen, H., Berry, C. C., Ecker, J. R. & Bushman, F. D. (2004) PLoS Biol. 2, E234.
5. Narezkina, A., Taganov, K. D., Litwin, S., Stoyanova, R., Hayashi, J., Seeger, C., Skalka, A. M. & Katz, R. A. (2004) J. Virol. 78, 11656–11663.
6. Goodarzi, G., Chiu, R., Brackmann, K., Kohn, K., Pommier, Y. & Grandgenett, D. P. (1997) Virology 231, 210–217.
7. Carteau, S., Hoffmann, C. & Bushman, F. (1998) J. Virol. 72, 4005–4014.
8. Kent, W. J. (2002) Genome Res. 12, 656–664.
9. Karolchik, D., Baertsch, R., Diekhans, M., Furey, T. S., Hinrichs, A., Lu, Y. T., Roskin, K. M., Schwartz, M., Sugnet, C. W., Thomas, D. J., et al. (2003) Nucleic Acids Res. 31, 51–54.
10. Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. (2001) Nature 409, 860–921.
11. Shibagaki, Y. & Chow, S. A. (1997) J. Biol. Chem. 272, 8361–8369.
12. Goodarzi, G., Im, G. J., Brackmann, K. & Grandgenett, D. (1995) J. Virol. 69, 6090–6097.
