RAPID PUBLICATION
Convergent Genome Wide Association Results for Bipolar Disorder and Substance Dependence
Catherine Johnson,1 Tomas Drgon,1 Francis J. McMahon,2 and George R. Uhl1*
1Molecular Neurobiology Branch, NIDA-IRP, NIH, Baltimore, Maryland
2Unit on the Genetic Basis of Mood and Anxiety Disorders, Mood and Anxiety Disorders Program, Bethesda, Maryland
Received 14 August 2008; Accepted 28 October 2008
Twin studies document substantial heritability for substance
dependence and bipolar disorder [Shih et al. (2004); Uhl et al.
(2008a)]. Individuals with bipolar disorder display substance use
disorders at rates that are much higher than those in the general
population [Krishnan (2005)]. We would thus predict: 1) substantial
overlap between different genome wide association
(GWA) studies of bipolar disorder and 2) significant overlap between
results from bipolar disorder and substance dependence. Recent
GWA studies [Baum et al. (2007); Sklar et al. (2008); Uhl et al.
(2008a); Wellcome Trust Consortium (2007)] allow us to test
these ideas, although 1) these datasets display difficult features
that include use of differing sets of SNPs, likely polygenic
genetics, likely differences in linkage disequilibrium between
samples, heterogeneity both between and within loci and
2) several, though not all, reports have failed to identify any
allele of any single nucleotide polymorphism (SNP) (‘‘same SNP
same allele’’) that is reproducibly associated with bipolar disor-
der with ‘‘genome wide’’ significance. We now report analyses
that identify clustered, P< 0.05 SNPs within genes that overlap
between the bipolar samples (Monte Carlo P< 0.00001). Over-
lapping data from at least three of these studies identify 69 genes.
23 of these genes also contain overlapping clusters of nominally-
positive SNPs for substance dependence. Variants in these
‘‘addiction/bipolar’’ genes are candidates to influence the brain
in ways that manifest as enhanced vulnerabilities to both substance
dependence and bipolar disorder. © 2009 Wiley-Liss, Inc.
Key words: complex genetics; addiction; depressive disorder;
single nucleotide polymorphism
INTRODUCTION
Data from family, adoption and twin studies support substantial
heritability for vulnerability to dependence on a range of addictive
substances, with most genetic influences common to vulnerability
to dependence on most classes of addictive substances [reviewed in
Uhl et al., 2008a]. Similar classical genetic data support substantial
heritability for vulnerability to bipolar disorder [Shih et al., 2004].
Both addictions and bipolar disorder are ‘‘complex’’ disorders that
are likely to display multiple genetic and multiple environmental
determinants. Linkage-based molecular genetic studies have failed
to identify highly reproducible linkages for either of these disorders
[McQueen et al., 2005; Uhl et al., 2008a]. Such observations,
coupled with twin study observations that support approximately
0.5 heritability for substance dependence and approximately 0.75
heritability for bipolar disorder, fit well with polygenic genetic
architectures for each of these disorders [Shih et al., 2004; Uhl et al.,
2008a].
Polygenic underlying genetic architectures are also consistent
with genome wide association data reported to date for substance
dependence and for bipolar disorder [Uhl et al., 2001; Liu et al.,
2005, 2006; McQueen et al., 2005; Johnson et al., 2006; Baum et al.,
2007; Bierut et al., 2007; WellcomeTrustConsortium, 2007; Sklar
et al., 2008]. There is no large association signal in any of these
reports that would be consistent with an oligogenic effect.
Several lines of evidence suggest that some of the genetic
influences on addiction might overlap with some of the genetic
influences on bipolar disorder. A recent review summarizes
evidence that up to three quarters of individuals with bipolar
disorder manifest a substance use disorder, a rate that is much in
excess of the rate in the general population [Krishnan, 2005]. The
familial patterns of bipolar and substance use disorders provide
ample evidence for enhanced occurrence of addictive disorders in
members of families that were ascertained on the basis of a member
with bipolar disorder [Biederman et al., 2000; Preisig et al., 2001;
Shih et al., 2004].
Additional Supporting Information may be found in the online version of
this article.
Abbreviations used: DNA, deoxyribonucleic acid; SNP, single nucleotide
polymorphism; SEM, standard error of the mean; Mb, million base pairs;
Kb, thousand base pairs.
The authors declare that, except for income received from their primary
employer, no financial support or compensation has been received from
any individual or corporate entity over the past 3 years for research or
professional service and there are no personal financial holdings that could
be perceived as constituting a potential conflict of interest.
*Correspondence to:
George R. Uhl, Molecular Neurobiology, P.O. Box 5180, Baltimore, MD
21224. E-mail: [email protected]
Published online 6 January 2009 in Wiley InterScience
(www.interscience.wiley.com)
DOI 10.1002/ajmg.b.30900
How to Cite this Article: Johnson C, Drgon T, McMahon FJ, Uhl GR.
2009. Convergent Genome Wide Association
Results for Bipolar Disorder and Substance
Dependence.
Am J Med Genet Part B 150B:182–190.
Neuropsychiatric Genetics
In genome wide association studies for addictions, we have
developed and used analytic approaches that aim to identify
modest-sized association signals in genes that are identified repro-
ducibly in multiple samples, in ways that increase confidence in the
results. We have used these approaches to identify dozens of gene
loci that contain association signals from replicated genome
wide association studies of polysubstance abusers, methamphet-
amine abusers and alcohol abusers of European, African and Asian
genetic backgrounds [Uhl et al., 2001, 2008a,b; Liu et al., 2005, 2006;
Johnson et al., 2006].
Recent reports have used several different approaches to analyze
genome wide association data from studies of four bipolar versus
control samples. The individuals studied were of largely European
genetic background and were collected in the United States, the
United Kingdom, and Germany [McQueen et al., 2005; Baum et al.,
2007; WellcomeTrustConsortium, 2007; Sklar et al., 2008]. Most of
these analyses have sought, and failed to identify, the same alleles of
specific single nucleotide polymorphisms (SNP) (‘‘same SNP same
allele’’ analysis) that were strongly associated with bipolar disorder
in each of these multiple samples [Baum et al., 2008; Gershon et al.,
2008; Sklar et al., 2008], although a recent analysis has identified
some SNPs that display some of these properties [Ferreira et al.,
2008].
We now report analyses of data from each of these four bipolar
disorder genome wide association studies. We use preplanned
approaches that seek overlapping clusters of SNPs with nominally
significant associations in the same genes in each of several inde-
pendent samples. Such analyses may provide some robustness in
the face of genetic architectures and technical issues that may
prove to be difficult for ‘‘same SNP same allele’’ analyses, including
(1) polygenic genetic architectures that are likely to provide asso-
ciation signals of modest magnitude at individual SNPs, (2) sample-
to-sample differences in allele frequencies and patterns of linkage
disequilibrium that render different SNPs differentially informative
in different samples, (3) within-locus heterogeneity that could
result in association signals at multiple sites within the same gene,
(4) between-locus heterogeneity that could render association at
one gene locus more prominent in one sample than in other
samples, and (5) genome wide association datasets that use differing
sets of SNPs, such as those found on Affymetrix versus Illumina
platforms.
Using these preplanned analytic approaches, we report substan-
tial overall convergence between the genes identified in these four
bipolar datasets. We also note significant convergence between
these ‘‘bipolar’’ genes and those identified in comparisons of
substance-dependent versus control individuals. These data pro-
vide substantial support for shared molecular genetic bases for
portions of the genetic component of vulnerabilities to addiction
and to bipolar disorder. The ‘‘addiction and bipolar vulnerability’’
genes that we identify in these analyses support the concept that
brains of individuals at greater risk for these disorders differ from
brains of individuals at lower risk for these disorders. The analyses
appear to provide robust overall evidence for overlapping genetic
components, even when observations at single genes do not provide
effects of large magnitude.
MATERIALS AND METHODS
Samples and Genotyping
We use data from autosomal SNPs that were subjected to the quality
control procedures performed by the investigators in each of the
primary studies, to avoid adding inadvertent biases. We note the
numbers of SNPs whose data was available for analyses in each
dataset.
Bipolar disease versus control. WTCCC: 436,604 autosomal
SNP genome wide association for bipolar disorder came from a
study that compared controls with 1,868 United Kingdom indi-
viduals of European descent with bipolar disorders diagnosed using
Research Diagnostic Criteria [WellcomeTrustConsortium, 2007].
Subjects were recruited by teams based in Aberdeen, Birmingham,
Cardiff, London, and Newcastle. Lifetime diagnosis of a bipolar
mood disorder according to Research Diagnostic Criteria included
bipolar subtypes that co-aggregate in family studies: bipolar I
disorder (0.71), schizoaffective disorder bipolar type (0.15), bipolar
II disorder (0.09) and manic disorder (0.05). Uncharacterized
control samples came from two sources. One thousand four
hundred eighty individuals came from a 1958 birth cohort sample.
One thousand four hundred fifty eight individuals came from a
blood service sample, and represent a subset of a United Kingdom
national repository of anonymized DNA samples from 3,622 con-
senting blood donors.
SNP genotyping was performed using Affymetrix 500k arrays.
Genotypes were determined from hybridization intensities
using the CHIAMO algorithm with a ≥0.9 a posteriori
probability threshold for making genotype calls. We used data for
the 436,604 of the 469,557 SNPs assessed by the WTCCC
that could be assigned confident autosomal chromosomal localizations
based on data in NCBI genome assembly, build 36. A P
value for each SNP was determined based on χ² tests for significance
of allele frequency differences in bipolar versus control subjects
[WellcomeTrustConsortium, 2007].
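The allelic χ² test mentioned above can be illustrated with a minimal sketch. It assumes a plain Pearson χ² statistic on a 2x2 table of allele counts (cases versus controls, allele A versus allele B) with one degree of freedom and no continuity correction; the function name and the example counts are hypothetical and are not taken from the primary studies.

```python
import math

def allelic_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df, no continuity correction) for a 2x2
    table of allele counts: rows = cases/controls, columns = alleles."""
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if 0 in (row_case, row_ctrl, col_a, col_b):
        raise ValueError("degenerate table: a row or column total is zero")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b
    )
    # For 1 df, the chi-square upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical allele counts, for illustration only.
chi2, p = allelic_chi2(30, 10, 10, 30)
```

A SNP would be called nominally positive in this scheme when the returned P value falls below 0.05.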
NIMH: 536,288 autosomal SNP genome wide association was
assessed in controls compared to 461 unrelated bipolar I probands.
These probands reported European ancestry and were selected from
families with at least one affected sibling pair who were part of the
NIMH Genetics Initiative (http://nimhgenetics.org) [Baum et al.,
2007]. Probands underwent a semi-structured diagnostic interview
and were assigned a ‘‘confident’’ diagnosis of DSM-IV bipolar I
disorder by each of two trained clinicians. 563 unrelated control
individuals of European-American ancestry who failed to display
evidence for DSM-IV criteria for major depression, history of
bipolar disorder or a history of psychosis were recruited by a
marketing firm.
German: 536,288 autosomal SNP genome-wide association was
assessed in controls compared with 772 bipolar I patients who were
recruited from consecutive hospital admissions [Baum et al., 2007].
DSM-IV bipolar I diagnoses were made by a consensus best-
estimate procedure based on structured interviews, medical records
and family history. Eight hundred seventy six population-based
controls were randomly recruited by the same investigators with the
support of the Bonn (North Rhine-Westphalia, Germany) census
bureau. Individuals with personal histories of affective disorder or
schizophrenia were excluded.
For genotyping, NIMH samples were divided into seven bipolar
and nine control pools of 50–80 subjects/pool. German samples
were divided into 13 bipolar and 10 control pools of 42–60 subjects
per pool. SNP allelic distributions were assessed using duplicate
Illumina HumanHap550 arrays and a BeadStation (Illumina, Inc.,
La Jolla, CA) [Baum et al., 2007]. Normalized allele frequencies
were calculated from raw intensity data averaged across duplicate
pools to obtain a relative allele frequency estimate for each SNP in
each pool. SNPs with >2% variance between replicate pools were
excluded. t tests assessed the null hypothesis that the transformed
relative allele frequencies of cases and controls were equal, com-
paring pool-to-pool variation within phenotypes to phenotype-to-
phenotype differences. We used P values obtained from these t tests.
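The pool-based comparison described above can be sketched as a two-sample t statistic computed over per-pool allele frequency estimates. This is only an illustration: it assumes a pooled-variance (Student) t statistic, it omits the transformation of the relative allele frequencies because the text does not specify it, and the function name and example values are hypothetical.

```python
import math
from statistics import mean, variance

def pool_t_statistic(case_pools, control_pools):
    """Student t statistic comparing per-pool relative allele frequencies.

    Pool-to-pool variation within each phenotype supplies the error term
    for the case-versus-control difference. Returns (t, degrees of freedom).
    """
    n1, n2 = len(case_pools), len(control_pools)
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two pools per phenotype")
    v1, v2 = variance(case_pools), variance(control_pools)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    if se == 0:
        raise ValueError("no pool-to-pool variation; t is undefined")
    return (mean(case_pools) - mean(control_pools)) / se, n1 + n2 - 2

# Hypothetical relative allele frequencies for one SNP in three pools each.
t, df = pool_t_statistic([0.5, 0.6, 0.7], [0.2, 0.3, 0.4])
```

Converting t and its degrees of freedom to a P value requires the t distribution, which a statistics library would supply.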
US and UK bipolar I samples: 364,218 autosomal SNP genome
wide association was assessed in controls compared to bipolar (I)
patients identified as part of the Systematic Treatment Enhancement
Program for Bipolar Disorder (STEP-BD; n = 955) or at
University College London, United Kingdom (UCL; n = 506)
[Sklar et al., 2008]. Control samples were obtained from the United
States National Institute of Mental Health (NIMH) Genetics Repository
(n = 1,498) and from the UK (n = 510, matched to the
UCL cases).
For STEP cases DSM-IV diagnoses came from information in the
Affective Disorders Evaluation and the Mini International Neuro-
psychiatric Interview administered by trained clinical specialists or
psychiatrists. UCL individuals received ICD10 diagnoses of bipolar
disorder from a psychiatrist and DSMIIIR and Research Diagnostic
Criteria diagnoses of bipolar disorder after interviews by research
psychiatrists. SNP genotyping was performed using Affymetrix 500k
arrays. A P value for each SNP was determined based on its χ² test
for significance of allele frequency differences in bipolar versus
control subjects.
Substance dependent versus control. 1 M autosomal SNP
genome wide association data came from European-American
samples selected as previously described [Liu et al., 2006]. For
convenience, we use here the term ‘‘abusers’’ to describe individuals
who displayed heavy lifetime use of illegal substances [Persico et al.,
1996; Uhl et al., 2001] and dependence on at least one illegal
substance based on DSMIIIR or DSMIV criteria, while we use the
term ‘‘controls’’ to describe individuals who displayed neither
abuse nor dependence on any addictive substance and report no
significant lifetime histories of use of any addictive substance. These
studies provide one of the richest datasets of well-characterized
individuals matched for ethnicity that is available for comparison,
though the 400 ‘‘abusers’’ and 280 ‘‘controls’’ provide a smaller
sample than those available for bipolar disorder. A ‘‘t’’ statistic for
the differences between abusers and controls was obtained from
assessments of DNAs analyzed in pools of 20 individuals using
Affymetrix 6.0 array methods that have been extensively validated
[Drgon et al., manuscript in preparation]. We focused on clustered,
reproducibly positive SNPs in chromosomal regions that contained
genes, using a preplanned set of criteria. Such SNPs (a) displayed t
values with P< 0.05 significance in abuser versus control compar-
isons, (b) cluster, so that at least four of these reproducibly positive
SNPs lie within 10 Kb of each other and (c) identify genes. As we
note below, since many of these SNPs are likely to be in linkage
disequilibrium with each other, this analysis provides evidence
against large amounts of genotyping error in these European-
American samples. Convergence of these results with data from
other independent samples that would be expected to reveal over-
lapping genetics (e.g., African-American abuser versus control
comparisons) [Drgon et al., manuscript in preparation] and bipolar
samples (see below) provides independent evidence for the success
of these criteria, though it does not document that these criteria
have been optimized. We do note that we preplanned use of 10 kb
clustering for data from 1 M SNP arrays and 25 kb clusters for
analyses of data from 300 to 500 k SNP arrays based on prior
analyses using these distances for data from SNP datasets of these
differing densities [Liu et al., 2006; Drgon et al., manuscript in
preparation].
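The clustering criteria above can be sketched in a few lines. The sketch assumes one particular reading of "lie within 10 kb of each other": nominally positive SNPs on the same chromosome are chained together while each successive pair is no more than the maximum gap apart, and a chain counts as a cluster when it holds at least four SNPs. That reading, the function name, and the example data are assumptions made for illustration.

```python
from collections import defaultdict

def find_clusters(snps, max_gap=10_000, min_snps=4, alpha=0.05):
    """Group nominally positive SNPs into clusters.

    snps: iterable of (chromosome, position, p_value).
    Returns a list of (chromosome, start, stop, n_snps) tuples.
    """
    by_chrom = defaultdict(list)
    for chrom, pos, p in snps:
        if p < alpha:  # keep nominally positive SNPs only
            by_chrom[chrom].append(pos)

    clusters = []
    for chrom, positions in by_chrom.items():
        positions.sort()
        run = [positions[0]]
        for pos in positions[1:]:
            if pos - run[-1] <= max_gap:
                run.append(pos)  # extend the current chain
            else:
                if len(run) >= min_snps:
                    clusters.append((chrom, run[0], run[-1], len(run)))
                run = [pos]  # start a new chain
        if len(run) >= min_snps:
            clusters.append((chrom, run[0], run[-1], len(run)))
    return clusters

# Hypothetical SNPs: four positives chained on chromosome 1, others isolated.
example = [
    ("1", 1_000, 0.01), ("1", 5_000, 0.02), ("1", 9_000, 0.04),
    ("1", 15_000, 0.03), ("1", 12_000, 0.50), ("1", 100_000, 0.01),
    ("2", 500, 0.01),
]
clusters = find_clusters(example)
```

Passing max_gap=25_000 would correspond to the 25 kb setting preplanned for the lower-density arrays.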
In this work, correlations between pooled and individually
determined genotypes were 0.98, standard errors estimating the
variation among the four replicate studies of each DNA pool were
0.03 and standard errors estimating the variation among the pools
studied for each phenotype group were 0.03 [Liu et al., 2006].
Monte Carlo Simulations I
Data from each study provided a set of SNPs with known chromosomal
locations and nominal P values for each SNP based on the
results of the χ² or t tests, as noted above. A first set of ‘‘Monte Carlo
I’’ simulations addressed, within each sample, the null hypothesis I
that the extent of clustering of nominally positive SNPs was no
greater than would have been anticipated by chance. We tested this
first null hypothesis using 100,000 Monte Carlo I simulation trials
[Uhl et al., 2001]. Each trial began with sampling from the database
that contained the data from the SNPs in the study of interest. For
each of the 100,000 simulation trials, a randomly selected set of
SNPs, of the same size as the authentic set that displayed nominally
significant P values in the primary analysis, was chosen to test the
null hypothesis I: that the nominally significant SNPs from the
study cluster together no more frequently than would be expected
by chance. We tabulated the number of trials in which the results from
randomly selected SNPs matched or exceeded the results observed
for the authentic SNPs identified by the study.
Empirical P values were calculated by
dividing the number of trials in which the observed results were
matched or exceeded by the total number of Monte Carlo simulation
trials performed. Failure to support the null hypothesis I in
relation to clustering of nominally positive SNPs could of course
come from: (a) case versus control allele (and haplotype) frequency
differences that related to clinical phenotype or (b) stochastic
differences between case versus control allele (and haplotype)
frequency differences that do not necessarily relate to the clinical
phenotype.
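The Monte Carlo I procedure can be sketched as follows. The sketch is a simplification under stated assumptions: it treats all SNPs as lying on a single chromosome, it counts clusters by chaining SNPs whose successive gaps do not exceed a maximum distance, and it uses far fewer trials than the 100,000 described here. All function names and example data are hypothetical.

```python
import random

def count_clusters(positions, max_gap, min_snps):
    """Count chains of >= min_snps positions with successive gaps <= max_gap."""
    count, run_len, prev = 0, 0, None
    for pos in sorted(positions):
        if prev is not None and pos - prev <= max_gap:
            run_len += 1
        else:
            if run_len >= min_snps:
                count += 1
            run_len = 1
        prev = pos
    if run_len >= min_snps:
        count += 1
    return count

def monte_carlo_cluster_p(all_positions, positive_positions,
                          n_trials=1000, max_gap=25_000, min_snps=4, seed=0):
    """Empirical P: fraction of trials in which a random SNP set of the same
    size yields at least as many clusters as the authentic positive set."""
    rng = random.Random(seed)
    observed = count_clusters(positive_positions, max_gap, min_snps)
    hits = 0
    for _ in range(n_trials):
        sample = rng.sample(all_positions, len(positive_positions))
        if count_clusters(sample, max_gap, min_snps) >= observed:
            hits += 1
    return observed, hits / n_trials

# Hypothetical array: 1,000 SNPs spaced 10 kb apart; the "authentic" positives
# form ten tight groups of four adjacent SNPs.
all_positions = list(range(0, 10_000_000, 10_000))
positives = [all_positions[b * 100 + i] for b in range(10) for i in range(4)]
observed, p_empirical = monte_carlo_cluster_p(all_positions, positives, n_trials=200)
```

When no trial matches or exceeds the observed value, the empirical estimate is zero and is reported as a bound, P < 1/(number of trials), which is how 100,000 trials give P < 0.00001.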
Monte Carlo Simulations II
When comparing the results from different studies, we test the
null hypothesis II: that the clustered positive results from the
genome wide association data from one study do not converge
with (e.g., overlap with the chromosomal positions of) clustered
nominally positive SNPs in other studies to extents greater than
expected by chance. Note that testing this null hypothesis does not
require that the same sets of SNPs were examined in each study.
Testing this null hypothesis does not require that we impute
genotypes at SNPs that were not assayed. We first seek chromo-
somal overlap between clusters of nominally positive SNPs, defined
as noted above, from different studies. To balance type I and type II
error, we focus on genes that are identified by overlapping, clustered
nominally positive SNPs in at least three of the four independent
bipolar versus control comparisons analyzed here. Genes identified
by all samples are also identified in boldface in Table I. Failure to
support the null hypothesis II is thus based on identification of
many more overlapping clusters of nominally positive SNPs in
multiple independent samples in authentic samples than anticipat-
ed by chance. In comparison to the results from tests of null
hypothesis I, failure to support null hypothesis II is much less
likely to arise from case versus control allele frequency differences
that relate to stochastic differences in allele and haplotype frequen-
cies. Failure to support null hypothesis II is much more likely to
come from case versus control allele (and haplotype) frequency
differences that relate to clinical phenotype.
We can use similar strategies to test the significance of overlaps
between clusters of nominally positive SNPs that identify genes for
bipolar disorder in different samples and overlap between genome
wide results from bipolar disorder and substance dependence. To
test the significance of these overlaps, we use a second set of Monte
Carlo simulations (‘‘II’’) with trials that each sample randomly
from the SNP datasets noted above. We compare sample to sample
overlap of clustered results from the randomly sampled SNPs to
those seen in the authentic datasets. 100,000 trials allow estimates of
significance using this empirical statistic, as noted above.
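The overlap tally that underlies these comparisons can be sketched as below. The sketch assumes that clusters and genes are represented as (chromosome, start, stop) intervals and that a gene is supported by a study when any cluster from that study overlaps the gene interval, which is a looser criterion than requiring the clusters themselves to overlap one another. The names and coordinates are hypothetical. A Monte Carlo II trial would repeat the same tally on clusters rebuilt from randomly sampled SNPs in each dataset and compare the resulting count with the authentic one.

```python
def overlaps(a, b):
    """True if two (chromosome, start, stop) intervals share any position."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]

def genes_supported(genes, cluster_sets, min_studies=3):
    """Return names of genes overlapped by clusters from >= min_studies studies.

    genes: dict mapping gene name -> (chromosome, start, stop).
    cluster_sets: one list of (chromosome, start, stop) clusters per study.
    """
    supported = []
    for name, gene in genes.items():
        n_studies = sum(
            any(overlaps(gene, cluster) for cluster in clusters)
            for clusters in cluster_sets
        )
        if n_studies >= min_studies:
            supported.append(name)
    return supported

# Hypothetical genes and clusters from four studies.
genes = {"GENE_A": ("8", 100, 500), "GENE_B": ("8", 1_000, 2_000)}
cluster_sets = [
    [("8", 50, 150)],
    [("8", 400, 600)],
    [("8", 450, 480), ("8", 1_500, 1_600)],
    [("9", 100, 500)],
]
result = genes_supported(genes, cluster_sets)
```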
Monte Carlo Simulations III
In assessing the extent of support for individual genes identified in
this work, we test the null hypothesis III that the clustered positive
results from the entire set of genome wide association datasets do
not identify gene sequences of the same size found in each of the
genes listed in Table I more than anticipated by chance. This set of
simulation studies thus tests a more stringent null hypothesis than
the simulation studies that seek to assess the significance of overall
convergence/overlap between the chromosomal positions of clustered
nominally positive SNPs. For these simulation studies, we use a list
of ‘‘gene centric’’ genomic segments that lie between the beginning
of the first exon and the end of the last exon of annotated genes or
within 10 kb 5′ and 3′ of these sites, respectively. For each of
10,000 Monte Carlo III trials, we assess randomly ascertained
‘‘within gene’’ segments of the same size as those noted for each
of the ‘‘authentic’’ genes identified in Table I. We obtain a Monte
Carlo P value by comparing the number of trials in which the
number of clustered nominally positive SNPs in the randomly
selected within-gene segment is as large as or larger than the number
actually obtained in the authentic gene, when we consider data from
each of the five genome wide datasets in Table I. To provide a
concrete example of overlap based on the bipolar samples, we
provide detailed data for an area of the CSMD1 gene that is
identified by clustered, nominally positive SNPs from three of the
four bipolar samples in the Supplement, Table S1.
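A per-gene Monte Carlo III trial can be sketched as follows. The sketch assumes a single chromosome, a list of gene-centric segments given as (start, stop) pairs, and an empirical P value equal to the fraction of randomly placed same-sized windows that contain at least as many clustered nominally positive SNPs as the authentic gene. The function names, the way random segments are drawn, and the example data are all assumptions made for illustration.

```python
import bisect
import random

def count_in_window(sorted_positions, start, length):
    """Number of positions falling in [start, start + length]."""
    lo = bisect.bisect_left(sorted_positions, start)
    hi = bisect.bisect_right(sorted_positions, start + length)
    return hi - lo

def segment_p_value(clustered_positions, gene_segments, seg_length,
                    observed, n_trials=10_000, seed=0):
    """Fraction of random within-gene windows of seg_length that hold at
    least `observed` clustered, nominally positive SNPs."""
    rng = random.Random(seed)
    positions = sorted(clustered_positions)
    eligible = [(s, e) for s, e in gene_segments if e - s >= seg_length]
    if not eligible:
        raise ValueError("no gene-centric segment is long enough")
    hits = 0
    for _ in range(n_trials):
        s, e = rng.choice(eligible)
        start = rng.randint(s, e - seg_length)  # window stays inside the segment
        if count_in_window(positions, start, seg_length) >= observed:
            hits += 1
    return hits / n_trials

# Hypothetical data: four clustered positive SNPs near the start of one long segment.
p_gene = segment_p_value([100, 200, 300, 400], [(0, 1_000_000)],
                         seg_length=1_000, observed=4, n_trials=2_000)
```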
RESULTS
Assessment of Chromosomal Clustering for Results From Bipolar Disorder Genome Wide Association Datasets
WTCCC bipolar versus control: 28,192 of the 426,604 SNPs analyzed
in the WTCCC bipolar disorder collection displayed χ² values with
P< 0.05 [WellcomeTrustConsortium, 2007]. Twelve thousand five
hundred sixty of these SNPs fell into 1,775 clusters in which at least
four SNPs that each displayed P< 0.05 and were sampled on at least
two array types lay within 25 kb of each other. One way of
understanding the extent of this clustering comes from the follow-
ing statistic: two thirds of these clustered nominally positive
SNPs lay within 7,500 bp of another nominally positive SNP from
this sample. Monte Carlo I simulation trials that assessed the
probabilities that these results were due to chance found an average
of 2,358 SNPs that lay in 511 clusters and that none of 100,000
simulation trials identified as many clusters of SNPs that displayed
nominally positive differences between bipolar versus control
samples as we have identified from this work (thus P< 0.00001)
[WellcomeTrustConsortium, 2007].
NIMH bipolar versus control: 32,835 of the 536,288 SNPs
analyzed in the NIMH bipolar collection displayed t values with
P< 0.05. Nine thousand nine hundred seventy one of these SNPs
fell into 1,770 clusters in which at least 4 SNPs that each displayed
t values (in this sample) corresponding to P< 0.05 lay within 25 kb
of each other [Baum et al., 2007]. Half of these clustered nominally
positive SNPs lay within 7,500 bp of another nominally positive
SNP from this sample. Monte Carlo I simulation trials that assessed
the probabilities that these results were due to chance found an
average of 4,041 SNPs in 853 clusters and that none of 100,000
simulation trials identified as many clustered SNPs that displayed
nominally positive differences between bipolar versus control
samples as we have identified from this work (thus P< 0.00001).
German bipolar versus control: 27,057 of the 532,835 SNPs
analyzed in the German samples displayed t values that corre-
sponded to P< 0.05. Six thousand one hundred ten of these SNPs
fell into 1,137 clusters in which at least four SNPs that each
displayed t values corresponding to P< 0.05 (within this sample)
lay within 25 kb of each other [Baum et al., 2007]. Half of these
clustered nominally positive SNPs lay within 7,500 bp of another
nominally positive SNP from this sample. Monte Carlo I simulation
trials that assessed the probabilities that these results were due to
chance found an average of 2,297 SNPs in 500 clusters and that none
of 100,000 simulation trials identified as many clustered SNPs that
displayed nominally positive differences between bipolar versus
control samples as observed in this work (thus P< 0.00001).
TABLE I. Genes That Harbor Overlapping Clusters of ≥4 Nominally Positive SNPs Whose Allele Frequencies Distinguish Individuals With Bipolar Disorder From Controls With P < 0.05 Nominal Significance From at Least Three of the Four Samples in This Analysis

| Gene | Description | Chr | Gene start | Cluster start | Cluster stop | MNB | P-value |
|---|---|---|---|---|---|---|---|
| C1orf110 | Chromosome 1 open reading frame 110 | 1 | 161,087,730 | 161,083,207 | 161,192,277 | | 0.0040 |
| CDCA1 | Cell division cycle associated 1 | 1 | 161,558,349 | 161,552,254 | 161,632,145 | | 0.0053 |
| EDARADD | EDAR-associated death domain | 1 | 234,624,303 | 234,639,883 | 234,690,446 | | 0.0032 |
| PLA2R1 | Phospholipase A2 receptor 1, 180 kDa | 2 | 160,506,258 | 160,515,679 | 160,612,005 | Y | 0.0040 |
| B3GALT1 | UDPGal:βGlcNAc β1,3-galactosyltrans’ase pep 1 | 2 | 168,383,428 | 168,261,371 | 168,388,046 | | 0.0030 |
| DNAPTP6 | DNA polymerase-transactivated protein 6 | 2 | 200,879,041 | 200,781,378 | 200,950,180 | | 0.0004 |
| SORCS2 | Sortilin-related VPS10 dom receptor 2 | 4 | 7,245,373 | 7,340,014 | 7,581,351 | Y | 0.0016 |
| LGI2 | Leucine-rich repeat LGI family, member 2 | 4 | 24,612,488 | 24,610,537 | 24,676,475 | Y | 0.0019 |
| PALLD | Paladin | 4 | 169,654,792 | 169,766,857 | 169,812,469 | | 0.0036 |
| SEMA5A | Semaphorin 5A | 5 | 9,091,850 | 8,986,130 | 9,250,607 | Y | <0.0001 |
| SLIT3 | Slit homolog 3 | 5 | 168,025,857 | 168,097,322 | 168,184,154 | | 0.0011 |
| C6orf85 | Chromosome 6 open reading frame 85 | 6 | 3,214,691 | 3,197,134 | 3,266,875 | | 0.0006 |
| BMP6 | Bone morphogenetic protein 6 | 6 | 7,672,009 | 7,625,519 | 7,738,304 | | 0.0059 |
| ATXN1 | Ataxin 1 | 6 | 16,407,322 | 16,747,958 | 16,832,860 | Y | 0.0009 |
| IBRDC2 | IBR domain containing 2 | 6 | 18,495,573 | 18,534,587 | 18,593,152 | Y | 0.0011 |
| SLC17A3 | Solute carrier family 17 3 | 6 | 25,953,307 | 25,821,547 | 25,993,793 | | 0.0015 |
| PSMB9 | Proteasome subunit β 9 | 6 | 32,929,916 | 32,682,038 | 32,943,236 | | <0.0001 |
| COL11A2 | Collagen type XI α 2 | 6 | 33,238,447 | 33,126,355 | 33,232,671 | Y | 0.0006 |
| C6orf137 | Chromosome 6 open reading frame 137 | 6 | 44,346,871 | 44,354,744 | 44,405,750 | Y | 0.0033 |
| MGC33600 | Hypothetical protein MGC33600 | 6 | 44,355,880 | 44,354,744 | 44,405,750 | | 0.0050 |
| ARHGAP18 | Rho GTPase activating protein 18 | 6 | 129,939,933 | 129,960,703 | 130,130,617 | | 0.0010 |
| RGS17 | Regulator of G-protein signaling 17 | 6 | 153,373,719 | 153,255,200 | 153,425,448 | | 0.0011 |
| TULP4 | Tubby like protein 4 | 6 | 158,653,680 | 158,747,773 | 158,850,640 | Y | 0.0003 |
| COL28A1 | Collagen XXVIII α 1 | 7 | 7,364,769 | 7,277,156 | 7,392,189 | | 0.0005 |
| EPDR1 | Ependymin related protein 1 | 7 | 37,926,688 | 37,802,128 | 37,939,539 | Y | 0.0023 |
| GRM3 | Metabotropic glutamate rec 3 | 7 | 86,111,166 | 86,183,431 | 86,295,461 | | 0.0012 |
| CSMD1 | CUB and Sushi multiple domains 1 | 8 | 2,782,789 | 3,495,885 | 3,661,227 | | 0.0011 |
| | | | | 4,010,219 | 4,161,083 | Y | 0.0002 |
| C8orf72 | Chromosome 8 open reading frame 72 | 8 | 59,069,667 | 59,215,846 | 59,318,965 | | 0.0026 |
| OLFM1 | Olfactomedin 1 | 9 | 137,106,992 | 137,128,752 | 137,267,562 | Y | 0.0021 |
| ANKRD16 | Ankyrin repeat domain 16 | 10 | 5,943,695 | 5,970,518 | 6,052,396 | | 0.0042 |
| FBXO18 | F-box protein, helicase, 18 | 10 | 5,972,220 | 5,970,518 | 6,052,396 | | 0.0042 |
| IL15RA | Interleukin 15 receptor α | 10 | 6,034,340 | 5,970,518 | 6,052,396 | | 0.0042 |
| FAM107B | Fam seq sim 107B | 10 | 14,600,565 | 14,703,950 | 14,837,281 | | 0.0059 |
| GFRA1 | GDNF family receptor alpha 1 | 10 | 117,812,943 | 117,849,488 | 117,978,854 | Y | 0.0017 |
| PNLIPRP1 | Pancreatic lipase-related protein 1 | 10 | 118,340,480 | 118,345,520 | 118,409,221 | | 0.0042 |
| C10orf82 | Chromosome 10 open reading frame 82 | 10 | 118,413,197 | 118,345,520 | 118,409,221 | Y | 0.0042 |
| TACC2 | Transforming acid coiled-coil cont 2 | 10 | 123,738,679 | 123,960,520 | 124,046,764 | | 0.0014 |
| BTBD16 | BTB (POZ) domain containing 16 | 10 | 124,020,811 | 123,960,520 | 124,046,764 | Y | 0.0014 |
| ADAM12 | ADAM metallopeptidase domain 12 | 10 | 127,693,415 | 127,864,730 | 127,979,756 | | 0.0081 |
| NAV2 | Neuron navigator 2 | 11 | 19,691,488 | 19,826,740 | 20,019,280 | | 0.0027 |
| TMEM16C | Transmembrane protein 16C | 11 | 26,309,547 | 26,567,094 | 26,673,637 | | 0.0010 |
| SLC5A12 | Solute carrier family 5 12 | 11 | 26,669,781 | 26,567,094 | 26,673,637 | Y | 0.0001 |
| FAM118B | Family with sequence similarity 118 B | 11 | 125,604,330 | 125,542,228 | 125,606,142 | | 0.0036 |
| KRT3 | Keratin 3 | 12 | 51,469,751 | 51,419,584 | 51,468,260 | | 0.0080 |
| GRIP1 | Glutamate receptor interacting protein 1 | 12 | 65,028,656 | 65,476,260 | 65,645,501 | | 0.0009 |
| FARP1 | FERM, RhoGEF pleckstrin dom 1 | 13 | 97,593,435 | 97,828,590 | 97,921,030 | | 0.0011 |
| STK24 | Serine/threonine kinase 24 | 13 | 97,902,414 | 97,828,590 | 97,921,030 | | 0.0011 |
| TPP2 | Tripeptidyl peptidase II | 13 | 102,047,374 | 102,084,815 | 102,182,825 | | 0.0179 |
| FLJ40176 | Hypothetical protein FLJ40176 | 13 | 102,179,718 | 102,084,815 | 102,182,825 | Y | 0.0179 |
| KCNK10 | Potassium channel K 10 | 14 | 87,720,993 | 87,835,586 | 87,887,777 | Y | 0.0018 |
| CHES1 | Checkpoint suppressor 1 | 14 | 88,692,274 | 88,783,336 | 88,996,264 | | 0.0004 |
| SERPINA11 | Serpin peptidase inhibitor A 11 | 14 | 93,978,554 | 93,969,013 | 94,027,467 | Y | 0.0027 |
| SERPINA12 | Serpin peptidase inhibitor A 12 | 14 | 94,023,373 | 93,969,013 | 94,027,467 | | 0.0027 |
(Continued)
US and UK bipolar versus control: 20,266 of the 364,218 SNPs
analyzed in the Sklar et al. [2008] samples displayed χ² values that
corresponded to P< 0.05. Seven thousand five hundred sixty seven
of these SNPs fell into 1,165 clusters in which at least four SNPs
that each displayed P< 0.05 (within this sample) and that represented
each of the two array types
lay within 25 kb of each other. Two thirds of these clustered
nominally positive SNPs lay within 7,500 bp of another nominally
positive SNP from this sample. Monte Carlo I simulation trials that
assessed the probabilities that these results were due to chance
found an average of 782 SNPs in 177 clusters and that none of
100,000 Monte Carlo I simulation trials identified as many clustered
SNPs that displayed nominally positive differences between bipolar
versus control samples as observed in this work (thus P< 0.00001).
Assessment of Chromosomal Clustering for Results From a Substance Dependence Genome Wide Association Dataset
NIDA substance dependence samples [Drgon et al., manuscript in
preparation] were analyzed as previously described [Liu et al.,
2006], selecting ‘‘nominally positive SNPs’’ that displayed P values
<0.05 for comparisons between substance dependent versus con-
trol samples in European-American samples. We assessed the
extent to which nominally positive SNPs clustered together in
small chromosomal regions, such that at least four SNPs that
displayed P< 0.05 lay within 10 kb of each other. Seventy five
thousand three hundred twenty seven of the 870,000 tested auto-
somal SNPs displayed nominally significant abuser versus control
allele frequency differences (e.g., t values with P< 0.05) in these
samples. Fifteen thousand seven hundred seventy nine of these
75,327 reproducibly positive SNPs lie in 2,931 clusters. One thou-
sand three hundred four of these clusters identified 1,123 annotated
genes. None of 100,000 Monte Carlo I simulation trials observed
chromosomal clustering as marked as that observed for the true
reproducibly positive SNPs (P< 0.00001).
TABLE I. (Continued)
| Gene | Description | Chr | Gene start | Cluster start | Cluster stop | MNB | P-value |
|---|---|---|---|---|---|---|---|
| THSD4 | Thrombospondin, type I, domain containing 4 | 15 | 69,220,842 | 69,711,264 | 69,786,121 | | 0.0028 |
| ADAMTSL3 | ADAMTS-like 3 | 15 | 82,113,842 | 82,285,073 | 82,397,013 | | 0.0030 |
| A2BP1 | Ataxin 2-binding protein 1 | 16 | 6,009,133 | 6,011,499 | 6,209,343 | | 0.0013 |
| | | | | 7,089,353 | 7,231,128 | Y | 0.0023 |
| HS3ST4 | Heparan sulfate 3-O-sulfotransferase 4 | 16 | 25,611,240 | 25,762,019 | 25,840,965 | Y | 0.0046 |
| CDH13 | Cadherin 13 | 16 | 81,218,079 | 81,422,281 | 81,515,943 | Y | 0.0017 |
| COX10 | Cytochrome c oxidase assembly protein 10 | 17 | 13,913,444 | 13,786,166 | 13,933,925 | | 0.0025 |
| KCTD1 | Potassium channel tetramerisation dom 1 | 18 | 22,288,872 | 22,301,872 | 22,398,109 | | 0.0029 |
| ACAA2 | AcCoA Ac transferase 2 | 18 | 45,563,873 | 45,598,723 | 45,743,016 | | 0.0003 |
| SCARNA17 | Small Cajal body-specific RNA 17 | 18 | 45,594,391 | 45,598,723 | 45,743,016 | | 0.0003 |
| MYO5B | Myosin VB | 18 | 45,603,099 | 45,598,723 | 45,743,016 | | 0.0003 |
| DCC | Deleted in colorectal carcinoma | 18 | 48,121,156 | 48,150,107 | 48,263,207 | | 0.0176 |
| ZNF236 | Zinc finger protein 236 | 18 | 72,665,104 | 72,723,160 | 72,932,824 | | 0.0010 |
| MBP | Myelin basic protein | 18 | 72,819,777 | 72,723,160 | 72,932,824 | Y | 0.0010 |
| HPS4 | Hermansky–Pudlak syndrome 4 | 22 | 25,177,446 | 25,209,691 | 25,304,741 | | 0.0077 |
| TFIP11 | Tuftelin interacting protein 11 | 22 | 25,217,893 | 25,209,691 | 25,304,741 | | 0.0077 |
| TPST2 | Tyrosylprotein sulfotransferase 2 | 22 | 25,251,714 | 25,209,691 | 25,304,741 | | 0.0077 |
Boldface only: Genes that are tagged by overlapping clusters of nominally positive SNPs from all four samples. Boldface/italics: Genes that meet the Bonferroni-corrected P-value threshold for significance (nominal P < 0.0007). Genes that also display overlapping clusters of nominally positive SNPs in 1M SNP genome wide association data that compare allele frequencies in European-American substance dependent versus control individuals (MNB) are also indicated. All genes included in this table are identified by overlapping clusters of SNPs from at least three studies. Thus, for each gene, there is overlap between the coordinates for the positive cluster in one study and the coordinates for the positive cluster in at least two other studies. Since many of the clusters display more than 4 nominally positive SNPs and since results from different studies overlap only partially, we list the coordinates of the most 3′ and most 5′ SNPs from any of the overlapping clusters that identify each gene. Note that some genes display two discrete regions in which clusters of overlapping nominally positive results are found in at least three bipolar versus control comparisons, e.g., A2BP1, CSMD1. These same approaches identify the pseudogenes: HIST1H2APS2 histone 1 H2a pseudogene 2, PPP1R2P1 protein phosphatase 1 reg subunit 2 pseudogene 1, OR51N1P olfactory receptor, family 51, subfamily N, member 1 pseudogene, OR52Y1P olfactory receptor, family 52, subfamily Y, member 1 pseudogene, and OR51A8P olfactory receptor, family 51, subfamily A, member 8 pseudogene.
Columns list (1) gene symbols, (2) gene descriptions, (3) chromosome for the gene, (4) base pair for the beginning of the gene, (5) base pair for the initial SNP in the region in which overlapping clusters of nominally positive results from at least three samples are found, (6) base pair for the last SNP in the region in which overlapping clusters of nominally positive results from at least three samples are found, (7) whether 1M SNP GWA data from comparisons of European-American substance dependent versus control individuals produce clusters of nominally positive results that overlap with those identified in at least three samples that compare bipolar versus control individuals, and (8) Monte Carlo P values for each gene that arise from 10,000 simulation trials that each begin with random sampling from a database that contains all bipolar versus control datasets. These simulations assess the frequency of trials in which at least the observed numbers of clustered, nominally positive SNPs identified in the samples studied here were recorded, to provide an empirical P value.
Assessment of Extent to Which Chromosomal Clusters for Nominally Positive Results From Bipolar Disorder GWA Converge With Each Other and Identify Genes
There was substantial convergence between the genes identified by
clustered, nominally positive SNPs for bipolar disorder. The
69 genes identified by overlapping clusters of ≥4 nominally positive
SNPs from at least three of four bipolar versus control comparisons
(Table I) were accompanied by identification of five annotated
pseudogenes (Table I legend). Five of the 69 genes were identified by
overlapping clusters of nominally positive SNPs from all four
samples (Table I, boldface). Two of the 69 genes were identified
by two sites that are separated from each other by >25 kb but that
each contain clusters of nominally positive SNPs from three of the
four bipolar versus control comparisons. None of 100,000 Monte
Carlo II simulation trials identified chance overlaps as significant as
those noted in actual data for the clustered positive SNPs for at least
three of the bipolar versus control comparisons (thus, P< 0.00001).
This degree of overall convergence between these samples, assessed
in the ways that we have approached it herein, is thus virtually never
identified by chance.
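The "at least three of four comparisons" criterion can be sketched as follows. This is an illustration only, assuming that each study contributes a list of cluster intervals for a gene and that a gene qualifies when a cluster from one study overlaps clusters from at least two other studies, as the Table I legend describes. The study labels follow the text; the gene names, coordinates and the function `max_supporting_studies` are hypothetical.

```python
def overlaps(a, b):
    """True when closed intervals a=(start, stop) and b=(start, stop) share a base."""
    return a[0] <= b[1] and b[0] <= a[1]

def max_supporting_studies(clusters_by_study):
    """Largest number of studies whose clusters overlap a single anchor cluster
    (the anchor's own study counts as one)."""
    best = 0
    for study, clusters in clusters_by_study.items():
        for anchor in clusters:
            support = 1 + sum(
                any(overlaps(anchor, other) for other in other_clusters)
                for other_study, other_clusters in clusters_by_study.items()
                if other_study != study
            )
            best = max(best, support)
    return best

# Hypothetical cluster coordinates within two hypothetical genes.
genes = {
    "gene_x": {"WTCCC": [(100, 200)], "NIMH": [(150, 260)],
               "German": [(190, 300)], "Sklar": []},
    "gene_y": {"WTCCC": [(100, 200)], "NIMH": [(300, 400)],
               "German": [(500, 600)], "Sklar": [(150, 180)]},
}

kept = [g for g, c in genes.items() if max_supporting_studies(c) >= 3]
print(kept)
```

Here gene_x is retained because one cluster is overlapped by clusters from two other studies, while gene_y has clusters in all four studies but no region shared by three of them.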
In addition to the data estimating the significance of the overall
overlaps between the results noted here, we can use Monte Carlo III
methods to estimate the relative significance of the clustered
nominally positive results within individual genes (Table I). We
obtain substantial levels of nominal significance for many of these
69 genes. However, these levels of significance become more modest
when corrected for the 69 repeated comparisons implicit in
gene-by-gene analyses. Following ‘‘conservative’’ Bonferroni
corrections, data from the overlapping clusters of repeatedly, nominally
positive SNPs that identify semaphorin 5A, C6orf85, proteasome
subunit β 9, collagen type XI α 2, tubby-like protein 4,
collagen XXVIII α 1, CUB and Sushi multiple domains 1, solute carrier
family 5, member 12, checkpoint suppressor 1 and the clustered genes
acetyl-CoA acetyltransferase 2/small Cajal body-specific RNA 17/
myosin VB retain corrected P values <0.05 (nominal P < 0.0007;
boldfaced italics in Table I).
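The nominal threshold quoted above follows from dividing 0.05 by the 69 gene-level comparisons. A short sketch of the arithmetic, applied to three P values copied from the continued portion of Table I (the helper name `bonferroni_threshold` is ours):

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

threshold = bonferroni_threshold(0.05, 69)  # about 0.000725, reported as 0.0007

# Per-gene Monte Carlo P values taken from Table I (continued).
gene_p = {"ACAA2": 0.0003, "A2BP1": 0.0013, "DCC": 0.0176}
survivors = [gene for gene, p in gene_p.items() if p < threshold]

print(round(threshold, 4), survivors)
```

Of these three examples, only ACAA2 remains below the corrected threshold, in line with its inclusion in the list above.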
Assessment of Extent to Which Chromosomal Clusters for Results From Bipolar Disorder GWA Converge With Chromosomal Clusters for Results From Substance Dependence GWA
There was also remarkable convergence between the genes identified
by overlapping clusters of nominally positive SNPs from at least
three studies of bipolar disorder and clusters of nominally positive
SNPs from genome wide association studies of substance depen-
dence in MNB samples of self-reported European ancestry (‘‘MNB’’
in Table I). Twenty three of the 69 ‘‘bipolar’’ genes were identified in
this way. None of the 100,000 Monte Carlo II simulation trials
identified chance overlaps as significant as those noted in actual
data for comparisons between the genes identified by clustered
positive SNPs for the NIDA substance dependence versus control
comparisons and any of the four bipolar disorder versus control
comparison groups (thus, P< 0.00001).
DISCUSSION
The analyses presented in the current report provide strong support
for the extent to which multiple samples identify (1) reproducible
associations for bipolar disorder and (2) significant overlap be-
tween these data and data for substance dependence. It is important
to note that this strong evidence for the sets of genes, taken as a
whole, is accompanied by more modest evidence in support of
many of the individual genes identified here. We discuss the ways in
which these molecular genetic data provide support for observa-
tions from classical genetic and comorbidity studies, as well as the
strengths and limitations of the approaches used here to define and
document convergence.
Availability of multiple datasets that compare bipolar versus
control individuals provides us with opportunities to seek conver-
gence that are not available in any single dataset. Following initial
submission of parts of the analyses presented herein (e.g., based on
three of the four bipolar versus control datasets), several (though
not all) reports appeared that failed to identify any allele of any
single nucleotide polymorphism (SNP) (‘‘same SNP same allele’’
analysis) that was strongly associated with bipolar disorder in each
of the available bipolar versus control comparisons [Baum et al.,
2008; Gershon et al., 2008; Sklar et al., 2008]. The analytic approach
used both here and in the previously submitted manuscript is based
on assumptions that polygenic underlying genetic architectures for
vulnerabilities to bipolar disorder are likely to result in association
signals of modest strength for allelic variants at any single SNP. In
developing this approach, we hypothesized that it was likely that
other features might well confound ‘‘same SNP/same allele’’ anal-
yses. Such confounding features might include subtle differences in
patterns of linkage disequilibrium among the different samples
studied, within-locus heterogeneity of the genetic effects, between-
locus heterogeneity of the genetic effects and use of differing sets of
SNPs in the different studies. By contrast, we have used analyses that
use preplanned criteria based on clusters of nominally positive
SNPs that lie within genes. These analyses use empirical statistics
(i.e., those that do not make assumptions about underlying
distributions) to assess statistical significance at several levels,
including: (a) at the level of the convergence of overall datasets
(Monte Carlo II) and (b) at the level of individual genes (Monte
Carlo III). Within any single sample, these approaches could simply
identify stochastic variations in haplotype frequency or methodo-
logical noise related to clusters of SNPs that were not associated with
disease vulnerability. However, replication of these observations in
multiple independent samples makes stochastic variation a much
less likely explanation for the overall group of results obtained.
‘‘True-positive’’ association signals are thus likely to be found in
genes in which multiple nearby SNPs display nominally significant
association signals in multiple samples, as we observe here.
With the current approaches, we have thus identified what
appears to be strong evidence for significant convergence of the
results of the four bipolar versus control datasets that are currently
available in the literature. It is important to stress that this strong
evidence for overall convergence contrasts with the modest evi-
dence that identifies many of the individual genes in Table I. On
statistical grounds, it is thus likely that association signals in most of
these genes will be identified in future studies of bipolar disorder
genetics, but that many will not be replicated. In this sense, the
findings from the current analyses mirror the ‘‘lack of genome wide
significance’’ conclusions of several previously reported ‘‘same
SNP/same allele’’ analyses. However, interpretation of (a1) modest
confidence in identification of individual genes in the context of
(a2) strong overall evidence for overlapping results differs dramati-
cally from prior, implicit or explicit interpretation of (b1) the failure
of ‘‘same SNP/same allele’’ analyses to identify results with genome-
wide significance in (b2) the absence of the context of strong overall
evidence for overall overlapping genetics.
We can come to many of the same conclusions about the overlap
between genome wide association results for bipolar disorder, taken
as a group, and genome wide association results for an ethnically
matched substance dependence versus control comparison. A
remarkably substantial subset of the genes identified in bipolar
versus control studies are also identified in substance dependence
versus control comparisons, providing overall results that are again
virtually never identified by chance (P< 0.00001, Monte Carlo
simulations). Variants in ‘‘addiction/bipolar’’ genes that we iden-
tify are candidates to influence the brain in ways that manifest as
enhanced vulnerabilities to both substance dependence and bipolar
disorder. However, it is again important to note that even this
cumulative evidence for identification of these genes is modest,
taken individually. On statistical grounds, it is thus likely that
association signals in most of these genes will be identified in future
studies of addiction and bipolar disorder genetics, but that some
will not be replicated.
There are a number of important limitations to the current
analyses. (1) The samples employed here are based on research
volunteers and volunteers for blood donation who might not be
representative of all individuals in the general population, especially
of individuals who are less likely to volunteer. Ethical consider-
ations, however, limit many molecular genetic studies to samples of
volunteering individuals. (2) Modest but significant sample to
sample differences in diagnostic criteria, including Research Diag-
nostic and Diagnostic and Statistical Manual criteria, might well
provide ‘‘noise’’ in the current datasets. It is somewhat reassuring
that previous observations support the likelihood that RDC and
DSM criteria for bipolar disease can produce highly convergent
diagnoses [Maier et al., 1986; Sugiura et al., 1998]. (3) None of these
samples provides very high power to identify polygenic effects.
Thus, requirements that at least three out of four such samples
provide clustered, nominally positive results
are likely to substantially increase the numbers of false-negative
results, while they enhance confidence in the positive results. (4) We
have failed to identify single SNPs that display nominally positive
results in several of the bipolar disorder datasets at rates that are
significantly greater than those expected by chance. We can form
observed/expected ratios for the individual SNPs that display
nominally significant differences between bipolar and control
individuals in pairwise comparisons. The expected number is obtained
by multiplying the number of SNPs studied by (0.05)², reflecting the
requirement that the same SNP must achieve nominal significance in
each of two independent samples. When we compare the observed number of SNPs that
achieve nominal significance in each two-sample comparison to
those that would be expected, we observe: WTCCC versus NIMH
(331/245), WTCCC versus German (242/297), WTCCC versus
Sklar et al. (1387/1309), NIMH versus German (1599/1657), NIMH
versus Sklar et al. (280/199) and German versus Sklar et al. (211/
241). Only 60 SNPs display nominally positive results in three of the
four samples, fewer than would have been expected by chance. Each
of these observations supports the idea that one or more of the
abovementioned ‘‘difficulties’’ for ‘‘same SNP/same allele’’ analyses
is likely to be present in the genetic architecture of bipolar disorder
and/or in the genome wide methods used in these studies. (5) There
were overlaps between the ‘‘NIMH’’ controls used as part of the
comparison group by McMahon and colleagues and those employed by
Sklar and colleagues. While this may have contributed to the fact
that the observed/expected ratio for comparisons between the
single SNPs identified for these two samples was somewhat higher
than the ratios noted for other two sample comparisons (see above),
we do not believe that this overlap degrades the overall comparisons
to a significant extent.
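The observed/expected comparison in limitation (4) can be reproduced from the counts quoted above. The sketch below assumes that each pair in the text is given as observed/expected and that the two samples are independent under the null. The number of shared SNPs passed to `expected_shared` is a hypothetical round figure, since the per-pair SNP counts are not listed here.

```python
def expected_shared(n_snps, alpha=0.05):
    """Expected number of SNPs nominally positive in both of two independent samples."""
    return n_snps * alpha ** 2

# (observed, expected) pairs as quoted in the text.
pairs = {
    "WTCCC vs NIMH": (331, 245),
    "WTCCC vs German": (242, 297),
    "WTCCC vs Sklar": (1387, 1309),
    "NIMH vs German": (1599, 1657),
    "NIMH vs Sklar": (280, 199),
    "German vs Sklar": (211, 241),
}

ratios = {name: obs / exp for name, (obs, exp) in pairs.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")

highest = max(ratios, key=ratios.get)
print(highest)

# Hypothetical example: 100,000 shared SNPs would give about 250 expected by chance.
print(expected_shared(100_000))
```

The ratios scatter around 1, and the largest is the NIMH versus Sklar et al. comparison, the pair in which control samples overlapped.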
There is substantial comorbidity for vulnerabilities to substance
dependence and bipolar disorder. A recent review documents that
about three quarters of patients with bipolar disorder also display a
substance use diagnosis [Krishnan, 2005]. Since both substance
dependence and bipolar disorder display substantial heritabilities
in twin studies, it was likely a priori that there would be substantial
overlap between the genetic bases of bipolar disorder and addiction
vulnerability. However, no twin data of which we are aware
previously documented heritable influences on the addiction/bipolar
comorbidity per se. Indeed, some of the family data that
document coaggregation of bipolar disorder and substance abuse/dependence
phenotypes have been interpreted to support some
separate segregation of these two phenotypes within pedigrees
[Preisig et al., 2001]. The current molecular genetic analyses thus
provide some of the first direct evidence for genetic bases for
portions of the elevated frequency of substance dependence that
is noted in bipolar patients.
The genes that display convergent reproducibly positive obser-
vations in genome wide association studies of addiction vulnera-
bility and of bipolar disorder represent an interesting set that can be
viewed from a number of perspectives. Products of one group
of these genes that are likely to play substantial roles in the initial
and/or plasticity-related ‘‘wiring’’ of the brain include semaphorin
5A, slit homolog 3, CUB and Sushi multiple domains, neuron
navigator 2 and cadherin 13. From one perspective, the represen-
tation of such genes is not surprising, since we have previously
emphasized the relative abundance of these genes from genome-
wide data for substance dependence in multiple samples [Liu et al.,
2005, 2006; Uhl et al., 2008a]. We and others have also identified
substantial numbers of neuronal connectivity genes when compar-
ing individuals who are more versus less successful at quitting
smoking [Uhl et al., 2007, 2008c], individuals with greater versus
smaller volumes of frontal cerebral cortex [Uhl et al., submitted],
individuals with Alzheimer’s disease versus controls, and individ-
uals with better versus poorer performance on tests of cognitive
abilities [Uhl et al., submitted]. Identifying substantial numbers of
connectivity-associated molecules whose variants alter vulnerabili-
ty to bipolar disorder supports the idea that this disorder is a likely
member of a ‘‘connectivity constellation’’ of disorders in which
differences in neuronal connectivity may play substantial roles in
the genetic contributions to individual differences in vulnerability.
Both addiction and bipolar disorder are common illnesses
[Uhl and Grow, 2004]. Identifying SNP markers whose allelic
frequencies distinguish addicts of several different ethnicities from
matched controls thus supports at least some ‘‘common disease/
common allele’’ genetic architecture [Lander and Schork, 1994] for
the genetics of both bipolar disorder and addiction vulnerability.
Clearly, there are also likely contributions to both addiction vul-
nerability and to bipolar disorder from genomic and environmental
features that are specific to each disorder. Nevertheless, the findings
presented here thus promise to enhance understanding of features
that are common to human addictions and bipolar disorder in ways
that could facilitate efforts to personalize prevention and treatment
strategies for these debilitating diseases.
ACKNOWLEDGMENTS
This research was supported financially by the NIH Intramural
Research Programs of NIDA and NIMH, DHHS. We are very
grateful for access to bipolar and control data generated by the
Wellcome Trust Case-Control Consortium and by Dr. Pamela Sklar
and colleagues. A full list of the WTCCC investigators who contributed
to the generation of the WTCCC data is available from
www.wtccc.org.uk. Funding for this project was provided by the
Wellcome Trust under award 076113. For studies of substance
dependence, we are also grateful for dedicated help with clinical
characterization of subjects from Judith Hess, Dan Lipstein, Fely
Carillo and other Johns Hopkins-Bayview support staff.
REFERENCES
Baum AE, Akula N, Cabanero M, Cardona I, Corona W, Klemens B, Schulze TG, Cichon S, Rietschel M, Nothen MM, et al. 2007. A genome-wide association study implicates diacylglycerol kinase eta (DGKH) and several other genes in the etiology of bipolar disorder. Mol Psychiatry 2:197–207.
Baum AE, Hamshere M, Green E, Cichon S, Rietschel M, Noethen MM, Craddock N, McMahon FJ. 2008. Meta-analysis of two genome-wide association studies of bipolar disorder reveals important points of agreement. Mol Psychiatry 13(5):466–467.
Biederman J, Faraone SV, Wozniak J, Monuteaux MC. 2000. Parsing the association between bipolar, conduct, and substance use disorders: A familial risk analysis. Biol Psychiatry 48(11):1037–1044.
Bierut LJ, Madden PA, Breslau N, Johnson EO, Hatsukami D, Pomerleau OF, Swan GE, Rutter J, Bertelsen S, Fox L, et al. 2007. Novel genes identified in a high-density genome wide association study for nicotine dependence. Hum Mol Genet 16(1):24–35.
Ferreira MA, O’Donovan MC, Meng YA, Jones IR, Ruderfer DM, Jones L, Fan J, Kirov G, Perlis RH, Green EK, et al. 2008. Collaborative genome-wide association analysis supports a role for ANK3 and CACNA1C in bipolar disorder. Nat Genet. [Epub ahead of print].
Gershon ES, Liu C, Badner JA. 2008. Genome-wide association in bipolar. Mol Psychiatry 13(1):1–2.
Johnson C, Drgon T, Liu QR, Walther D, Edenberg H, Rice J, Foroud T, Uhl GR. 2006. Pooled association genome scanning for alcohol dependence using 104,268 SNPs: Validation and use to identify alcoholism vulnerability loci in unrelated individuals from the collaborative study on the genetics of alcoholism. Am J Med Genet Part B 141B(8):844–853.
Krishnan KR. 2005. Psychiatric and medical comorbidities of bipolar disorder. Psychosom Med 67(1):1–8.
Lander ES, Schork NJ. 1994. Genetic dissection of complex traits. Science 265(5181):2037–2048.
Liu QR, Drgon T, Walther D, Johnson C, Poleskaya O, Hess J, Uhl GR. 2005. Pooled association genome scanning: Validation and use to identify addiction vulnerability loci in two samples. Proc Natl Acad Sci USA 102(33):11864–11869.
Liu QR, Drgon T, Johnson C, Walther D, Hess J, Uhl GR. 2006. Addiction molecular genetics: 639,401 SNP whole genome association identifies many ‘‘cell adhesion’’ genes. Am J Med Genet Part B 141B(8):918–925.
Maier W, Philipp M, Buller R, Benkert O. 1986. Sources of disagreement between clinical (ICD-9) and operational (RDC, DSM-III) diagnosis of endogenous depression (melancholia). J Affect Disord 11(3):235–243.
McQueen MB, Devlin B, Faraone SV, Nimgaonkar VL, Sklar P, Smoller JW, Abou Jamra R, Albus M, Bacanu SA, Baron M, et al. 2005. Combined analysis from eleven linkage studies of bipolar disorder provides strong evidence of susceptibility loci on chromosomes 6q and 8q. Am J Hum Genet 77(4):582–595.
Persico AM, Bird G, Gabbay FH, Uhl GR. 1996. D2 dopamine receptor gene TaqI A1 and B1 restriction fragment length polymorphisms: Enhanced frequencies in psychostimulant-preferring polysubstance abusers. Biol Psychiatry 40(8):776–784.
Preisig M, Fenton BT, Stevens DE, Merikangas KR. 2001. Familial relationship between mood disorders and alcoholism. Compr Psychiatry 42(2):87–95.
Shih RA, Belmonte PL, Zandi PP. 2004. A review of the evidence from family, twin and adoption studies for a genetic contribution to adult psychiatric disorders. Int Rev Psychiatry 16(4):260–283.
Sklar P, Smoller JW, Fan J, Ferreira MA, Perlis RH, Chambert K, Nimgaonkar VL, McQueen MB, Faraone SV, Kirby A, et al. 2008. Whole-genome association study of bipolar disorder. Mol Psychiatry 13(6):558–569.
Sugiura T, Hasui C, Aoki Y, Sugawara M, Tanaka E, Sakamoto S, Kitamura T. 1998. Japanese psychology students as psychiatric diagnosticians: Application of criteria of mood and anxiety disorders to written case vignettes using the RDC and DSM-IV. Psychol Rep 82(3 Pt 1):771–781.
Uhl GR, Grow RW. 2004. The burden of complex genetics in brain disorders. Arch Gen Psychiatry 61(3):223–229.
Uhl GR, Liu QR, Walther D, Hess J, Naiman D. 2001. Polysubstance abuse-vulnerability genes: Genome scans for association, using 1,004 subjects and 1,494 single-nucleotide polymorphisms. Am J Hum Genet 69(6):1290–1300.
Uhl GR, Liu QR, Drgon T, Johnson C, Walther D, Rose JE. 2007. Molecular genetics of nicotine dependence and abstinence: Whole genome association using 520,000 SNPs. BMC Genet 8:10.
Uhl GR, Drgon T, Johnson C, Fatusin OO, Liu QR, Contoreggi C, Li CY, Buck K, Crabbe J. 2008a. Higher order addiction molecular genetics: Convergent data from genome-wide association in humans and mice. Biochem Pharmacol 75:98–111.
Uhl GR, Drgon T, Liu QR, Johnson C, Walther D, Komiyama T, Harano M, Sekine Y, Inada T, Ozaki N, Iyo M, Iwata N, Yamada M, Sora I, Chen CK, Liu HC, Ujike H, Lin SK. 2008b. Genome-wide association for methamphetamine dependence: Convergent results from two samples. Arch Gen Psychiatry 65:345–355.
Uhl GR, Liu QR, Drgon T, Johnson C, Walther D, Rose JE, David SP, Niaura R, Lerman C. 2008c. Molecular genetics of successful smoking cessation: Convergent genome-wide association results. Arch Gen Psychiatry 65:683–693.
Wellcome Trust Consortium. 2007. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447(7145):661–678.