
Supporting Information - PNAS › content › suppl › 2010 › 05 › 20 › 1000274107... · 2010-05-20 · Supporting Information Glessner et al. 10.1073/pnas.1000274107 SI Methods


Supporting Information
Glessner et al. 10.1073/pnas.1000274107

SI Methods

PennCNV-Affy. The CNV calling on the Affymetrix 6.0 platform used an algorithm very similar to the one used for the Illumina arrays, but with different signal preprocessing steps. Unlike the Illumina platform, where normalized signal intensities (LRR and BAF) can be exported directly from the BeadStudio software, these signal intensity measures in the Affymetrix platform must be calculated from the collection of genotyped samples. We used APT (https://www.affymetrix.com/support/developer/powertools/changelog/index.html) to perform data normalization and signal extraction from raw CEL files generated in genotyping experiments. The “median smoothing” and “quantile normalization” options were used in APT. In addition, the “expr.genotype = true” option was used to specify allele-specific signal extraction. This step uses a self-normalization algorithm that requires information contained within all of the genotyped samples. The APT software was used for genotype calling as well, with a “confidence score” assigned to each genotype call. For each SNP marker, we then relied on the allele-specific signal intensity for the AA, AB, and BB genotypes on all genotyped samples to construct three canonical genotype clusters, similar to the Illumina clustering generation approach. We did not use genotype calls with a confidence score < 0.1 in constructing canonical genotype clusters. Once the canonical genotype clusters have been constructed, we can then transform the signal intensity values for each SNP to LRR and BAF values.

The Affymetrix arrays contain nonpolymorphic (NP) markers to provide better genome coverage than SNP markers only. These markers can be handled in a fashion similar to SNPs for CN inference, but there are some differences. First, the R value is calculated as the signal intensity of the NP marker, rather than as the sum of two alleles. The expected R value for each NP marker is calculated as the median signal intensity value for all genotyped samples at this marker. In addition, BAF values cannot be derived for NP markers. Consequently, BAF values are not used in the likelihood calculation for these markers. Finally, due to the use of fewer probes, the variance of LRR values for NP markers may be different from that for SNP markers; thus, the likelihood model parameters for LRR are different in NP markers and SNP markers.
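The treatment of NP markers described above can be illustrated with a minimal sketch. It assumes the usual PennCNV-style definition LRR = log2(observed R / expected R), which is not spelled out in this section, and takes the expected R as the median intensity across all genotyped samples, as stated above. The function name and the intensity values are hypothetical.

```python
import math
from statistics import median


def np_marker_lrr(intensities):
    """Return one LRR value per sample for a single NP marker.

    Assumption: LRR = log2(R_observed / R_expected), with R_expected
    taken as the median intensity over all genotyped samples.
    """
    expected_r = median(intensities)
    return [math.log2(r / expected_r) for r in intensities]


# Hypothetical intensities for one NP marker in five samples.
samples = [1000.0, 1100.0, 900.0, 500.0, 1000.0]
lrr = np_marker_lrr(samples)

for r, value in zip(samples, lrr):
    print(f"R = {r:7.1f}  LRR = {value:+.3f}")
```

In this toy example the fourth sample has half the median intensity, so its LRR is −1, the value expected in the absence of noise for a one-copy deletion at a diploid locus.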

CNV Validation by Quantitative PCR. Universal Probe Library (UPL; Roche) probes were selected using ProbeFinder v2.41 (Roche). Quantitative PCR was performed with an ABI 7500 Real-Time PCR instrument or an ABI Prism 7900HT Sequence Detection System (Applied Biosystems). Each sample was analyzed in quadruplicate either in 25 μL of reaction mixture [250 nM probe, 900 nM each primer, Fast Start TaqMan Probe Master (Roche), and 10 ng genomic DNA] or in 10 μL of reaction mixture [100 nM probe, 200 nM each primer, 1× Platinum Quantitative PCR SuperMix-Uracil-DNA-Glycosylase (UDG) with ROX (Invitrogen), and 25 ng genomic DNA]. The values were evaluated using Sequence Detection Software v2.2.1 (Applied Biosystems). Data analysis was also performed using either the ΔΔCT method or qBase. Reference genes, chosen from COBL, GUSB, and SNCA, were included based on the minimal coefficient of variation, and then the data were normalized by setting a normal control to a value of 1.
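As an illustration of the ΔΔCT calculation mentioned above, the following sketch computes a relative gene dosage normalized so that the normal control equals 1. It assumes 100% amplification efficiency (a doubling per cycle) and a single reference gene; the function name and all CT values are hypothetical and are not taken from the study.

```python
from statistics import mean


def relative_dosage(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Relative gene dosage by the delta-delta-CT method.

    Each argument is a list of replicate CT values (e.g., quadruplicates).
    Assumption: perfect amplification efficiency, so dosage = 2 ** -ddCT.
    """
    delta_sample = mean(ct_target) - mean(ct_reference)
    delta_control = mean(ct_target_ctrl) - mean(ct_reference_ctrl)
    return 2 ** -(delta_sample - delta_control)


# Hypothetical quadruplicate CT values.
dosage = relative_dosage(
    ct_target=[26.0, 26.1, 25.9, 26.0],          # test sample, target region
    ct_reference=[24.0, 24.0, 24.1, 23.9],       # test sample, reference gene
    ct_target_ctrl=[25.0, 25.0, 25.1, 24.9],     # normal control, target region
    ct_reference_ctrl=[24.0, 24.1, 23.9, 24.0],  # normal control, reference gene
)
print(f"relative gene dosage = {dosage:.3f}")
```

Under these assumptions a heterozygous deletion gives a dosage near 0.5 and a single-copy duplication a dosage near 1.5, which is the pattern seen in the relative gene dosage column of Table S5.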

Illumina Infinium Assay for CNV Calling. The genotype data content together with the intensity data provided by the genotyping array provides high confidence for CNV calls. The array platform used in this study provides a highly robust and reproducible SNP clustering because of the random placement of SNP-specific beads with ∼18-fold redundancy for each SNP. Using a SNP array provides allele frequency data that can be analyzed and more closely controlled for redundancy and high performance compared with public databases. This establishes a more robust definition for normal diploid states than can be provided by intensity alone. The genotype clustering establishes the probe performance at each locus for the expected heterozygous genotype state. Based on the hybridization efficiency, this may tend more to the DNP-tagged red range or the biotin-tagged green range for any given locus. The normalization performed to calculate BAF from theta adjusts the SNP-specific range to an expected value of 0.5. This creates more continuous data, because the heterozygous state is properly modeled based on extensive genotyping. Another key technical strength of our study is that the same array was typed at the same genotyping facility at the same time, with the same cluster file for cases and controls. The data analysis is also standardized as described in Methods, and CNVs are called with the same version of PennCNV.
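The normalization from theta to BAF described above can be sketched as a piecewise linear interpolation between the canonical cluster centers. This is the commonly used form of the calculation and is given here as an assumption, since this section does not state the formula; the function name and the cluster positions are hypothetical.

```python
def baf_from_theta(theta, theta_aa, theta_ab, theta_bb):
    """Map a sample's theta to a B allele frequency (BAF) for one SNP.

    Assumption: linear interpolation between the canonical AA, AB, and BB
    cluster centers (theta_aa < theta_ab < theta_bb), so that the AB
    cluster center always maps to 0.5.
    """
    if theta <= theta_aa:
        return 0.0
    if theta >= theta_bb:
        return 1.0
    if theta < theta_ab:
        return 0.5 * (theta - theta_aa) / (theta_ab - theta_aa)
    return 0.5 + 0.5 * (theta - theta_ab) / (theta_bb - theta_ab)


# Hypothetical SNP whose heterozygous cluster is skewed toward one dye
# (center at theta = 0.40 rather than 0.50).
clusters = dict(theta_aa=0.05, theta_ab=0.40, theta_bb=0.95)
for theta in (0.05, 0.225, 0.40, 0.675, 0.95):
    print(f"theta = {theta:.3f}  BAF = {baf_from_theta(theta, **clusters):.2f}")
```

The point of the sketch is the one made in the text: a heterozygous cluster that sits off-center because of hybridization efficiency is still mapped to an expected BAF of 0.5.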

CNV Filtering Steps. We performed multiple CNV filtering steps as part of the analysis. First, it is important to note that of the 1,736,438 markers (848,415 SNPs and 888,023 CNs) with chromosome annotation, noncomplete genotyping failure, three genotype states observed, and normal theta patterns on the Affymetrix 6.0 array, 33,797 (10,687 SNPs and 23,110 CNs) (1.95%) showed deletion and 44,023 (16,618 SNPs and 27,405 CNs) (2.54%) showed duplication in at least two unrelated cases in the MGS/CHOP discovery cohort (frequency ≥ 0.205%). The threshold of two cases was selected because it is the minimal case frequency to provide certainty that the calls are reliable in a given region. We find this upfront exclusion to be very similar to the inclusion threshold of 1% minor allele frequency in GWA SNP genotype studies. This drastically decreases the number of tests performed to correct for genome-wide testing.

Second, all CNVs were called simultaneously in both cases and controls and classified into CNVRs as defined in the main text. A total of 70 deletion CNVRs and 50 duplication CNVRs were identified. Third, to search for novel CNVs, we first filtered out all CNVRs that were not nominally significantly overrepresented in the CHOP cases (P < 0.05) and carefully reviewed the raw data (BAF and LRR) for accurate CNV calling and statistical significance as described in Methods. This left us with 20 deletion CNVRs and 5 duplication CNVRs that we subsequently divided into two categories: (i) CNVs present in cases only and absent in controls (n = 5 deletions and 2 duplications) and (ii) CNVs nominally significantly overrepresented in the cases (n = 15 deletions and 3 duplications). For (i), based on the inclusion significance criteria, there were at least two cases per individual CNV. Thus, the dataset comprising (i) and (ii) defines the CNVRs from the discovery cohort that we used to test for novel schizophrenia CNVs.

We next attempted to replicate these CNVRs in the independent case-control dataset (MGS/CHOP). Seven deletion CNVRs and one duplication CNVR survived our replication criteria (P < 0.05 after adjustment for the number of tests performed) or were absent in the independent control set and were subsequently experimentally validated with two independent methods, qPCR and Illumina Human Hap550 Beadchip. These results are shown in Table 2.

In Table 2, CNVRs that survived multiple testing with Bonferroni adjustment in the discovery phase (P < 0.05 after correction for 20 tests in the case of deletions and 5 tests in the case of duplications) and also survived replication and experimental validation are presented in bold type. CNVRs significant in the discovery phase but not in the replication phase are listed in Table S2.
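The two significance levels used above (nominal P < 0.05, and P < 0.05 after Bonferroni correction for 20 deletion or 5 duplication tests) can be made concrete with a short sketch. The one-sided Fisher's exact test shown here is an assumption used for illustration, because this section defers the actual test to Methods; the function names and the cohort sizes in the example are hypothetical.

```python
from math import comb


def fisher_one_sided(case_carriers, control_carriers, n_cases, n_controls):
    """One-sided Fisher's exact P value for overrepresentation in cases.

    Probability, under the hypergeometric null, of seeing at least
    `case_carriers` of the CNV carriers among the cases.
    """
    carriers = case_carriers + control_carriers
    total = n_cases + n_controls
    upper = min(carriers, n_cases)
    numerator = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / comb(total, carriers)


def passes_bonferroni(p_value, n_tests, alpha=0.05):
    """True if p_value is significant after Bonferroni correction."""
    return p_value < alpha / n_tests


# Hypothetical cohort sizes; a CNVR seen in 4 cases and 0 controls.
p = fisher_one_sided(4, 0, n_cases=1000, n_controls=2000)
print(f"P = {p:.4f}, nominal: {p < 0.05}, "
      f"Bonferroni (20 deletion tests): {passes_bonferroni(p, 20)}")

# Thresholds implied by the text: 0.05 / 20 for deletions, 0.05 / 5 for duplications.
print(0.05 / 20, 0.05 / 5)
```

With these thresholds, a deletion CNVR needs a discovery P value below 0.0025 to survive correction, so a nominally significant value such as 7.44 × 10−3 (Table S4) would not.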

Glessner et al. www.pnas.org/cgi/content/short/1000274107 1 of 10


Conversely, only one CNV locus overrepresented in controls reached nominal significance. Thus, the CNVs overrepresented in cases exceeded our null expectations. Given the diploid state of the vast majority of the genome, the existence of CNVs protective against the development of schizophrenia seems unlikely.

It is important to note that CNV calling is not unequivocally attained by any one single platform for multiple reasons, including variations in DNA provided, array type, DNA processing, data processing, quality control, CNV calling algorithm, genomic features, genomic coverage, and statistical presentation of regions. This can lead to a high false-positive rate on initial inspection despite exhaustive efforts to standardize and control each confounding contribution.

UPenn Control Population Typed at CHOP. The control population collected at UPenn and typed at CHOP included HHDL and PennCATH research participants. HHDL is a cross-sectional study of genetic factors contributing to elevated HDL cholesterol levels. Probands with elevated HDL cholesterol (>75th percentile for age and sex) are identified by physician referrals or through the Hospital of the University of Pennsylvania clinical laboratory. Participants complete a lifestyle questionnaire and provide a blood sample for the measurement of HDL and other lipid-related traits (1). PennCATH comprises consecutive participants undergoing coronary angiography at University of Pennsylvania Health System hospitals. Of the subjects in the PennCATH study, 85% were Caucasian, 55% were male, and the average age was 62 years (2). Only Caucasian samples validated by Eigenstrat principal components analysis were used.

Inclusion criteria were as follows:

The subject must give signed, informed consent.

The proband must have a consensus best-estimate DSM-IV diagnosis of SZ (schizophrenia) or of schizoaffective disorder with at least 6 months’ duration of the “A” criteria for schizophrenia.

The subject must be over 18 years of age at interview (male or female).

The informant should have known the subject for at least 2 years, be familiar with the psychiatric history, and have at least 1 h of contact per week with the proband (close family members preferred).

Exclusion criteria were as follows:

The subject is unable to give informed consent to all aspects of the study.

The subject is unable to speak and be interviewed in English (to ensure validity of the interviews).

Psychosis is deemed secondary to substance use by the consensus diagnostic procedure because psychotic symptoms are limited to periods of likely intoxication or withdrawal, or there are persistent symptoms likely related to substance use (e.g., increasing paranoia after years of amphetamine use, symptoms limited to visual hallucinations after extensive hallucinogen use).

The psychotic disorder is deemed secondary to a neurological disorder, such as epilepsy, based on the nature and timing of symptoms. For example, nonspecific, nonfocal EEG abnormalities are common in SZ, but subjects with psychosis that emerged in the context of temporal lobe epilepsy would be excluded.

The subject has severe mental retardation (MR). A subject with mild MR (IQ ≥ 55 or based on clinical and educational history) can be included if SZ symptoms and history can be clearly established.

1. Edmondson AC, et al. (2009) Loss-of-function variants in endothelial lipase are a cause of elevated HDL cholesterol in humans. J Clin Invest 119:1042–1050.

2. Lehrke M, et al. (2007) CXCL16 is a marker of inflammation, atherosclerosis, and acute coronary syndromes in humans. J Am Coll Cardiol 49:442–449.

3. Xu B, et al. (2008) Strong association of de novo CN mutations with sporadic schizophrenia. Nat Genet 40:880–885.


Fig. S1. Attempt to replicate 1q21.1 deletion association with schizophrenia. We observed 1q21.1 deletion in three cases and one control (P = 0.110).


Fig. S2. Attempt to replicate 15q13.3 deletion association with schizophrenia. We observed 15q13.3 deletion in two cases and three controls (P = 0.670).


Fig. S3. Frequency of CNVs observed in study subjects. Red indicates schizophrenia case deletion; blue, schizophrenia case duplication; black, schizophrenia control deletion; purple, schizophrenia control duplication. The maximum value displayed is 0.2, to make low-frequency CNVs (the majority of loci) visible.


Fig. S4. Affymetrix genotyping console browser showing the log 2 ratio of schizophrenia cases. Deleted 3′ of CACNA1B on 9q34.3 and of RET on 10q11.21.


Table S1. Schizophrenia cases: Clinical demographic data

Sex
  Male                      70%
  Female                    30%

Age at interview, years
  Mean                      43.8
  Median                    45
  SD                        11.4

Age at onset, years
  Mean                      21.0
  Median                    20
  SD                        6.58

Location                    Cases
  PA                        387
  AU                        216
  IL                        138
  CO                        123
  NY                        112
  GA                        97
  Israel                    94
  IA                        87
  CA                        77
  MO                        58
  LA                        15
  TX                        5

FH psychosis
  None                      71.7%
  Suspected                 16.1%
  Definite                  12.2%

Months Ill
  Mean                      273.2
  Median                    276
  SD                        138.9

Months mood
  Mean                      24.1
  Median                    1
  SD                        58.5

dx_cs_1   Percentage   Description
  295     100%         Schizophrenic disorders

dx_cs_2   Percentage   Description
  303.9   31%          Alcoholism
  305     17%          Drug abuse ND
  311     16%          Depressive disorder
  296.8   10%          Bipolar
  305.2   9%           Cannabis abuse ND
  304.3   7%           Cannabis dependence

dx_cs_3   Percentage   Description
  305.2   23%          Cannabis abuse ND
  304.3   20%          Cannabis dependence
  305     11%          Abuse of drugs ND
  303.9   10%          Alcoholism
  311     7%           Depressive disorder
  296.8   6%           Bipolar
  305.7   5%           Amphetamine abuse ND

SumCritA*   Percentage
  0         1%
  2         2%
  3         7%
  4         21%
  5         24%
  6         24%
  7         21%

Criterion   0      1
  cs_A1     1%     99%
  cs_A1a    34%    66%
  cs_A2     5%     95%
  cs_A2a    50%    50%
  cs_A3     32%    68%
  cs_A4     32%    68%
  cs_A5     19%    81%

FH, family history; ND, nondependent; Ill, residual schizophrenia; Mood, full mood syndromes (mania or depression).
*All subjects with schizophrenia or schizoaffective disorder diagnoses were judged as meeting DSM-IV criteria B-F for schizophrenia. Consensus rating for DSM-IV schizophrenia criteria: A1, delusions; A1a, implausible, impossible, including Schneiderian delusions; A2, hallucinations; A2a, voices that converse with each other and/or comment on the subject’s thoughts or behavior; A3, disorganized speech, frequent derailment or incoherence; A4, grossly disorganized or catatonic behavior; A5, negative symptoms, affective flattening, alogia, avolition (1 = present; 0 = absent).


Table S2. Schizophrenia controls: Clinical demographic data

Sex
  Male      46%
  Female    54%

Age, years
  Mean      50.8
  Median    51
  SD        16.8

Location, n cases
  PA        74
  FL        73
  TX        67
  OH        66
  IL        51
  CA        45
  NY        43
  WA        42
  WI        41
  NC        40
  VA        39
  AZ        33
  CO        33
  IA        30
  MA        30
  MO        29
  GA        27
  NJ        27
  OK        27
  SC        27
  MN        26
  OR        24
  MI        23
  MD        22
  AL        21
  IN        21
  UT        21
  LA        19
  KY        18
  AR        15
  ID        14
  CT        13
  NE        13
  SD        13
  WV        12

Table S3. Attempts to replicate observations of increased frequency of schizophrenia cases with novel genes affected by CNV

CNV association conditions   100-kb CNV size   DGV genes excluded   10-probe CNV size   10-probe DGV genes excluded   Multiple novel genes impacted
Cases                        9.93%             5.83%                48.21%              38.59%                        16.48%
Controls                     13.49%            6.74%                50%                 36.99%                        17.03%
P value                      0.009             0.399                0.386               0.443                         0.741


Table S4. CNVRs statistically overrepresented in schizophrenia cases and not replicated in an independent cohort

CNVR                      Probes  Cases discovery  Controls discovery  Discovery P value  Cases replication  Controls replication  Combined P value  OR     Gene             Distance from gene  Type  Canary
chr7:32177451–32392975    198     3                0                   3.53×10−2          0                  0                     3.67×10−2         NA     PDE1C            0                   Dup   N
chr3:61803641–61811383    9       4                0                   1.16×10−2          0                  1                     4.47×10−2         8.051  PTPRG            0                   Del   N
chr4:162417655–162424561  12      4                0                   1.16×10−2          6                  6                     1.70×10−2         3.361  FSTL5, RAPGEF2   99.9 kb, 1.92 Mb    Del   Y
chr4:135276704–135408238  21      7                2                   7.44×10−3          6                  7                     2.06×10−2         2.916  PABPC4L*         0                   Del   N
chr5:2097129–2111366      17      9                6                   4.86×10−2          4                  5                     4.76×10−2         2.384  IRX4             161 kb              Del   N
chr6:57268143–57272458    13      4                0                   1.16×10−2          0                  2                     9.91×10−2         4.024  PRIM2A, RAB23    17.9 kb, 73.1 kb    Del   N
chr12:60558836–60563972   10      11               7                   2.00×10−2          0                  4                     1.12×10−1         2.015  FAM19A2          0                   Del   N
chr19:426716–434473       5       3                0                   3.53×10−2          7                  10                    1.51×10−1         2.015  SHC2             14.7 kb             Dup   N
chr5:52702915–52718131    12      7                4                   4.77×10−2          1                  5                     3.01×10−1         1.789  FST              109 kb              Del   N
chr15:99980078–100033288  36      5                2                   4.22×10−2          0                  3                     3.16×10−1         2.012  TM2D3, TARSL2    0                   Dup   N
chr6:16499554–16508717    20      6                2                   1.80×10−2          0                  5                     3.79×10−1         1.724  ATXN1            0                   Dup   N
chr6:162740476–162741040  2       5                2                   4.22×10−2          1                  5                     3.79×10−1         1.724  PARK2            0                   Del   N
chr15:32717247–32765105   50      15               13                  2.51×10−2          8                  24                    4.10×10−1         1.252  GJD2             0                   Del   N
chr7:142941348–142963649  10      8                3                   7.84×10−3          0                  8                     4.66×10−1         1.463  AL833583         10.7 kb             Del   N
chr4:114573691–114581335  27      4                1                   4.26×10−2          0                  5                     7.39×10−1         1.340  CAMK2D           11.7 kb             Del   N
chr1:92014319–92021028    10      5                2                   4.22×10−2          0                  6                     7.70×10−1         1.256  TGFBR3           0                   Del   N
chr12:69158942–69164294   9       8                5                   3.69×10−2          5                  19                    8.61×10−1         1.089  PTPRB, KCNMB4    32.6 kb, 47.7 kb    Del   N

NA, not applicable.
*From Xu et al. (3).


Table S5. Independent validation of CNVRs with qPCR and Illumina human Hap550 BeadChip

CNVR                      CNV type  Sample ID   Relative gene dosage  SE     Illumina chip ID  Tag SNP ID  Illumina log R ratio
chr22:17404806–19941349   Del       1222439226  0.524                 0.035  4290041416_21     rs1934895   −1.052
chr22:17404806–19941349   Del       9626794429  0.521                 0.011  4276098785_11     rs1934895   −0.996
chr22:17404806–19941349   Del       04C28087A*  1.000                 0.173  4562262038_21     rs1934895   −0.018
chr22:17404806–19941349   Del       04C28139A*  1.029                 0.122  4562369091_21     rs1934895   −0.120
chr16:29425212–30134444   Dup       7873015771  1.461                 0.089  4079019681_A      rs4563056   0.498
chr16:29425212–30134444   Dup       8623080628  1.489                 0.007  1582065333_A      rs4563056   0.595
chr16:29425212–30134444   Dup       9163054078  1.508                 0.096  1846673715_A      rs4563056   0.369
chr16:29425212–30134444   Dup       04C28087A*  1.000                 0.023  4562262038_21     rs4563056   −0.063
chr16:29425212–30134444   Dup       04C28139A*  0.975                 0.027  4562369091_21     rs4563056   −0.221
chr16:68743639–68770545   Del       151169809   0.548                 0.034  1587851079_A      rs17028422  −0.135
chr16:68743639–68770545   Del       04C28087A*  1.000                 0.031  4562262038_21     rs2287983   −0.017
chr16:68743639–68770545   Del       04C28139A*  0.954                 0.017  4562369091_21     rs2287983   −0.059
chr9:140145139–140152969  Del       1475148472  0.507                 0.246  4147907270_B      rs11137379  −1.765
chr9:140145139–140152969  Del       3005849912  0.473                 0.008  4068230324_B      rs11137379  −2.270
chr9:140145139–140152969  Del       4311028436  0.475                 0.029  4276098403_12     rs11137379  −2.711
chr9:140145139–140152969  Del       5678778794  0.545                 0.128  1846673296_A      rs11137379  −2.025
chr9:140145139–140152969  Del       6711973667  0.428                 0.154  1796039438_A      rs11137379  −1.951
chr9:140145139–140152969  Del       8934645510  0.432                 0.023  4276098713_22     rs11137379  −2.440
chr9:140145139–140152969  Del       9140263548  0.474                 0.020  4276098270_12     rs11137379  −2.804
chr9:140145139–140152969  Del       04C28087A*  1.000                 0.036  4562262038_21     rs11137379  −0.003
chr9:140145139–140152969  Del       04C28139A*  1.035                 0.091  4562369091_21     rs11137379  −0.136
chr10:42932615–42934354   Del       300030062   0.617                 0.016  4276098188_12     rs715106    −0.175
chr10:42932615–42934354   Del       1207317307  0.527                 0.041  4523255137_11     rs715106    −0.204
chr10:42932615–42934354   Del       1299194495  0.455                 0.126  4506261167_11     rs715106    −0.161
chr10:42932615–42934354   Del       5442260823  0.488                 0.168  4562297116_21     rs715106    −0.174
chr10:42932615–42934354   Del       9508038552  0.375                 0.009  4157398294_A      rs715106    −0.460
chr10:42932615–42934354   Del       04C28087A*  1.000                 0.026  4562262038_21     rs715106    −0.003
chr10:42932615–42934354   Del       04C28139A*  1.057                 0.049  4562369091_21     rs715106    −0.093
chr3:4063809–4074877      Del       325927264   0.480                 0.022  4240108555_11     rs317528    −0.508
chr3:4063809–4074877      Del       2577168153  0.452                 0.006  1890578271_A      rs317528    −0.607
chr3:4063809–4074877      Del       04C28087A*  1.000                 0.068  4562262038_21     rs317528    −0.040
chr3:4063809–4074877      Del       04C28139A*  1.040                 0.041  4562369091_21     rs317528    −0.028
chr4:9881886–9884092      Del       332702531   0.510                 0.020  4290041726_12     rs10939814  −0.640
chr4:9881886–9884092      Del       6483240361  0.440                 0.170  4243114252_11     rs10939814  −0.752
chr4:9881886–9884092      Del       9655625304  0.611                 0.013  1837427556_A      rs10939814  −0.585
chr4:9881886–9884092      Del       9966812554  0.482                 0.024  4276098355_21     rs10939814  −0.502
chr4:9881886–9884092      Del       04C28087A*  1.000                 0.110  4562262038_21     rs10939814  −0.040
chr4:9881886–9884092      Del       04C28139A*  0.823                 0.025  4562369091_21     rs10939814  −0.059
chr18:38310567–38311765   Del       1317180605  0.000                 0.000  4256206108_21     rs10468964  −4.483
chr18:38310567–38311765   Del       3613918399  0.000                 0.000  4276098785_12     rs10468964  −4.855
chr18:38310567–38311765   Del       3673606183  0.000                 0.000  4240108637_11     rs10468964  −4.646
chr18:38310567–38311765   Del       5301838910  0.000                 0.000  4523280020_21     rs10468964  −4.984
chr18:38310567–38311765   Del       8334564658  0.000                 0.000  4079300087_A      rs10468964  −5.693
chr18:38310567–38311765   Del       04C28087A   1.000                 0.057  4562262038_21     rs10468964  −0.009
chr18:38310567–38311765   Del       04C28139A   0.987                 0.071  4562369091_21     rs10468964  0.033

*Negative control samples (normal diploid).
