

Supplementary Materials

An Integrated Prognostic Classifier for Stage I Lung Adenocarcinoma based on mRNA, microRNA and DNA Methylation Biomarkers

Ana I. Robles¹, Eri Arai², Ewy A. Mathé¹, Hirokazu Okayama¹, Aaron Schetter¹, Derek Brown¹, David Petersen³, Elise D. Bowman¹, Rintaro Noro¹, Judith A. Welsh¹, Daniel C. Edelman³, Holly S. Stevenson³, Yonghong Wang³, Naoto Tsuchiya⁴, Takashi Kohno⁴, Vidar Skaug⁵, Steen Mollerup⁵, Aage Haugen⁵, Paul S. Meltzer³, Jun Yokota⁶, Yae Kanai² and Curtis C. Harris¹

Affiliations:
¹Laboratory of Human Carcinogenesis, NCI-CCR, National Institutes of Health, Bethesda, MD 20892, USA.
²Division of Molecular Pathology, National Cancer Center Research Institute, Tokyo 104-0045, Japan.
³Genetics Branch, NCI-CCR, National Institutes of Health, Bethesda, MD 20892, USA.
⁴Division of Genome Biology, National Cancer Center Research Institute, Tokyo 104-0045, Japan.
⁵Department of Chemical and Biological Working Environment, National Institute of Occupational Health, NO-0033 Oslo, Norway.
⁶Genomics and Epigenomics of Cancer Prediction Program, Institute of Predictive and Personalized Medicine of Cancer (IMPPC), 08916 Badalona (Barcelona), Spain.

List of Supplementary Materials

Supplementary Materials and Methods
Fig. S1. Hierarchical clustering based on CpG sites differentially methylated in Stage I ADC compared to non-tumor adjacent tissues.
Fig. S2. Confirmatory pyrosequencing analysis of DNA methylation at the HOXA9 locus in Stage I ADC from a subset of the NCI microarray cohort.
Fig. S3. Methylation Beta-values for HOXA9 probe cg26521404 in Stage I ADC samples from Japan.
Fig. S4. Kaplan-Meier analysis of HOXA9 promoter methylation in a published cohort of Stage I lung ADC (J Clin Oncol 2013;31(32):4140-7).
Fig. S5. Kaplan-Meier analysis of a combined prognostic biomarker in Stage I lung ADC.
Table S1. CpG sites differentially methylated in Stage I ADC compared to non-tumor adjacent tissues.
Table S2. Functional characterization of genes hypermethylated in tumors using Gene Ontology and INTERPRO.
Table S3. Functional characterization of genes hypomethylated in tumors using Gene Ontology and INTERPRO.
Table S4. Summary of Ingenuity Pathway Analysis of genes differentially methylated in tumors.
Table S5. Gene Set Enrichment Analysis of genes differentially methylated in tumors from the NCI microarray cohort.
Table S6. Gene Set Enrichment Analysis of genes differentially methylated in tumors from the Japan microarray cohort.
Table S7. Hypermethylated probe sets corresponding to genes marked by H3K27me3 in ESC.
Table S8. Association between methylation cluster and clinical-demographic variables in the NCI microarray cohort.
Table S9. Gene expression differences between high and low methylation clusters in the NCI microarray cohort.
Table S10. Gene Set Enrichment Analysis of genes differentially expressed between high and low methylation clusters in the NCI microarray cohort.
Table S12. miRNA expression differences between high and low methylation clusters in the NCI microarray cohort.
Table S13. Univariable and Multivariable Cox Regression of HOXA9 promoter methylation in two cohorts.
Table S14. Univariable and Multivariable Cox Regression of 4-protein-coding gene classifier, miR-21 expression and HOXA9 promoter methylation in two cohorts and their overall combination.
Table S15. Univariable and Multivariable Cox Regression of High combined 4-gene classifier, miR-21 expression and HOXA9 methylation in the combined NCI/Norway and Japan cohorts.


Supplementary Materials and Methods

DNA methylation array data preprocessing

Fluorescent signals were collated with Bead Studio software and converted into β-values (β = ratio of the signal from the methylated probe to the sum of the signals from the methylated and unmethylated probes), which range from 0 to 1 and reflect the methylation level at each single CpG site. Illumina Bead Studio output data were normalized using simple scaling normalization (SSN) to correct for dye bias and transformed into M-values (M = log2 ratio of the signal from the methylated probe to the signal from the unmethylated probe) using the R package lumi. Partek Genomics Suite 6.6 was used for visualization and analysis. As an initial quality control, accurate discrimination of males and females based on X chromosome methylation markers was confirmed. Probe sets corresponding to CpG loci on the X and Y chromosomes (1093 probe sets) were excluded from further analysis. Additionally, 616 CpG loci with associated p-values > 0.05, indicative of poor hybridization quality, in > 10% of samples were excluded, as were 4050 probes containing a known single nucleotide polymorphism with minor allele frequency (MAF) > 0.05, leaving 22,008 autosomal probes for analysis of differential methylation. Initial exploratory visualization by Principal Component Analysis (PCA) identified a large effect of experimental batch on data distribution. Therefore, visualizations and statistical tests were performed on batch-adjusted data using the “Batch Remove” function in Partek. After batch adjustment, tumor versus non-tumor status was the most important source of variation in the data. Differential methylation was assessed by paired t-test with FDR adjustment.

mRNA array data preprocessing

Raw data were preprocessed using Bioconductor’s “lumi” package in R. Data from samples with good overall signal intensity were uploaded to BRB-ArrayTools for normalization by robust spline normalization (RSN). BRB-ArrayTools is an Excel package developed by Dr. Richard Simon and the BRB-ArrayTools Development Team.

miRNA array data preprocessing

nCounter RCC files were imported into nSolver (NanoString). Samples with good overall signal were normalized to the geometric mean of the top 100 expressed miRNAs within each sample. Normalized probes were imported into Partek Genomics Suite. miRNAs were deemed absent if their intensity was < 10 and were excluded from the analysis if they were not present in at least 40% of samples, leaving 424 miRNAs for further analysis. Initial exploratory visualization by PCA identified a large effect of experimental batch on data distribution. For analysis of differential miRNA expression and for visualization, the effect of batch was eliminated using the “Batch Remove” function in Partek. After batch adjustment, tumor versus non-tumor tissue was the most important source of variation in the data.
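The preprocessing arithmetic described in the methods above can be sketched in a few lines of Python. This is an illustration only, not the pipeline used in the study, which relied on Bead Studio, lumi, nSolver and Partek. All function names and parameters are hypothetical; the β-value follows the definition given in the text (no offset constant), and scaling counts by the geometric mean of the top 100 values is one assumed reading of the nSolver normalization step.

```python
import math


def beta_value(meth, unmeth):
    """Beta = methylated signal / (methylated + unmethylated signal)."""
    return meth / (meth + unmeth)


def m_value(meth, unmeth):
    """M = log2(methylated signal / unmethylated signal)."""
    return math.log2(meth / unmeth)


def beta_to_m(beta):
    """Convert a beta-value (strictly between 0 and 1) to an M-value."""
    return math.log2(beta / (1.0 - beta))


def probe_passes_qc(pvalues, p_cutoff=0.05, max_fail_fraction=0.10):
    """Drop a probe when its p-value exceeds the cutoff in > 10% of samples."""
    failed = sum(1 for p in pvalues if p > p_cutoff)
    return failed / len(pvalues) <= max_fail_fraction


def normalize_to_top_geomean(counts, top_n=100, reference=1.0):
    """Scale one sample's counts by the geometric mean of its top_n counts.

    Assumption: `reference` is a hypothetical target value for that
    geometric mean; the top_n counts must be strictly positive.
    """
    top = sorted(counts, reverse=True)[:top_n]
    geomean = math.exp(sum(math.log(c) for c in top) / len(top))
    return [c * reference / geomean for c in counts]


def mirna_is_retained(intensities, absent_below=10, min_present_fraction=0.40):
    """Keep a miRNA only if it is present (>= 10) in at least 40% of samples."""
    present = sum(1 for x in intensities if x >= absent_below)
    return present / len(intensities) >= min_present_fraction


# Toy usage: a probe with methylated signal 300 and unmethylated signal 100.
print(beta_value(300, 100))  # 0.75
```

Both filters are applied per probe across samples, so they can be run independently of any batch adjustment, which the text performs later in Partek.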


Fig. S1. Hierarchical clustering based on CpG sites differentially methylated in Stage I ADC compared to non-tumor adjacent tissues. Each row represents an individual patient and each column an individual CpG probe. T: Tumor tissue; NT: non-tumor adjacent tissue; CGI: CpG Island; non-CGI: non-CpG Island.


Fig. S2. Confirmatory pyrosequencing analysis of DNA methylation at the HOXA9 locus in Stage I ADC from a subset of the NCI microarray cohort.

[Figure panels A and B: mean methylation (%) in tumor versus non-tumor tissue for HOXA9 (p = 0.0004) and miR-196b (p = 0.0001).]


Fig. S3. Methylation Beta-values for HOXA9 probe cg26521404 in Stage I ADC samples from Japan.

[Figure: Beta-value (y-axis) in tumor versus non-tumor tissue; p < 0.0001.]


Fig. S4. Kaplan-Meier analysis of HOXA9 promoter methylation in a published cohort of Stage I lung ADC (J Clin Oncol 2013;31(32):4140-7).

[Figure: probe cg16104915. x-axis: Time to Recurrence (years); y-axis: Relapse-Free Survival (proportion). Groups: Methylated (n=10) and Unmethylated (n=107). Log-rank P = 0.0002.]
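As a reminder of what a Kaplan-Meier curve such as the one in Fig. S4 encodes, the following is a minimal sketch of the product-limit estimator in plain Python. The function name `kaplan_meier` and the toy data are hypothetical and are not taken from the study; the published analysis was not necessarily computed this way.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with an event.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. recurrence) occurred, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            # Product-limit step: S(t) = S(t-) * (1 - d / n).
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Toy data (years to recurrence, event indicator); purely illustrative.
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 3))
```

Censored patients reduce the number at risk without lowering the curve, which is why the estimate only steps down at times where an event is observed.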


Fig. S5. Kaplan-Meier analysis of a combined prognostic biomarker in Stage I lung ADC. Patients in the NCI/Norway (top panels) and Japan (bottom panels) cohorts were categorized according to the combined number of high values for HOXA9 methylation, miR-21 and 4-protein-coding gene signature. An increasing combined score conferred greater risk for poor outcome in Stage I (left), and within subgroup analysis of Stage IA (middle) and Stage IB (right). P values calculated by log-rank test for trend.

[Figure panels. x-axis: Time after surgery (months); y-axis: Cancer-specific survival (%) for NCI/Norway and Relapse-free survival (%) for Japan. Curves are grouped by combined score (0 to 3).]
NCI/Norway, Stage I: score 0 (n=24), 1 (n=30), 2 (n=30), 3 (n=7); trend P = 0.003
NCI/Norway, Stage IA: score 0 (n=16), 1 (n=16), 2 (n=18), 3 (n=4); trend P = 0.004
NCI/Norway, Stage IB: score 0 (n=8), 1 (n=12), 2 (n=9), 3 (n=2); trend P = 0.01
Japan, Stage I: score 0 (n=26), 1 (n=36), 2 (n=28), 3 (n=23); trend P < 0.0001
Japan, Stage IA: score 0 (n=17), 1 (n=30), 2 (n=22), 3 (n=12); trend P = 0.003
Japan, Stage IB: score 0 (n=9), 1 (n=6), 2 (n=6), 3 (n=11); trend P = 0.005
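The combined score used to group patients in Fig. S5 is the count of biomarkers classified as high, so it ranges from 0 to 3. The sketch below shows that tally under stated assumptions: each biomarker has already been dichotomized into high or low (the cutoffs are not reproduced here), and the function names and toy patients are hypothetical.

```python
from collections import Counter


def combined_score(hoxa9_high, mir21_high, four_gene_high):
    """Number of 'high' calls among the three biomarkers (0 to 3)."""
    return sum(bool(flag) for flag in (hoxa9_high, mir21_high, four_gene_high))


def score_distribution(patients):
    """Tally patients per combined score.

    patients: iterable of (hoxa9_high, mir21_high, four_gene_high) tuples.
    """
    counts = Counter(combined_score(*p) for p in patients)
    return {score: counts.get(score, 0) for score in range(4)}


# Toy cohort; flags are illustrative only and not taken from the study.
toy_patients = [
    (False, False, False),
    (True, False, False),
    (True, True, False),
    (True, True, True),
    (False, True, True),
]
print(score_distribution(toy_patients))  # {0: 1, 1: 1, 2: 2, 3: 1}
```

Because the score is an ordered category, a test for trend across groups 0 to 3, as reported in the figure, is a natural companion to the per-group curves.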


Table S1. CpG sites differentially methylated in Stage I ADC compared to non-tumor adjacent tissues.

Probeset ID | Gene Symbol | miRNA | CPG_ISLAND | CPG_ISLAND_LOCATIONS | FDR | FoldChange (T/NT)
cg26521404 | HOXA9 | HSA-MIR-196B | TRUE | 7:27170287-27173690 | 6.75E-11 | 5.4141
cg01354473 | HOXA9 | HSA-MIR-196B | TRUE | 7:27170287-27173690 | 5.28E-10 | 4.81311
cg02989940 | ERAF | | FALSE | | 1.09E-09 | -3.52492
cg20959866 | AJAP1 | | TRUE | 1:4613727-4616670 | 1.68E-09 | 4.68147
cg04330449 | NEUROG1 | | TRUE | 5:134898284-134900130 | 1.68E-09 | 4.43323
cg09619146 | CPXM2 | | TRUE | 10:125640686-125641562 | 1.68E-09 | 2.23024
cg10368842 | C10orf81 | | FALSE | | 1.68E-09 | -3.06078
cg25087423 | BLR1 | | TRUE | 11:118259679-118259945 | 1.68E-09 | -4.39059
cg24240626 | REG3A | | FALSE | | 1.70E-09 | -3.42862
cg25720804 | TLX3 | | TRUE | 5:170667594-170672661 | 2.10E-09 | 6.04313
cg27409364 | KCNC1 | | TRUE | 11:17712319-17714962 | 2.24E-09 | 2.53413
cg12003230 | C21orf84 | | FALSE | | 2.60E-09 | 2.10266
cg26609631 | GSH1 | | TRUE | 13:27263400-27265360 | 2.80E-09 | 4.65057
cg16428251 | SOX14 | | TRUE | 3:138964433-138967163 | 2.80E-09 | 2.8602
cg24407065 | BLZF1 | | FALSE | | 3.50E-09 | 2.14295
cg26530341 | TNFRSF10A | | TRUE | 8:23137228-23140012 | 3.87E-09 | -4.12652
cg13406950 | GBP1 | | FALSE | | 3.95E-09 | 2.44664
cg11172423 | CLDN19 | | FALSE | | 4.02E-09 | 2.29425
cg23290344 | NEF3 | | TRUE | 8:24826692-24829288 | 4.51E-09 | 5.27032
cg12111714 | ATP8A2 | | TRUE | 13:24940557-24941659 | 4.51E-09 | 2.8514
cg09936561 | DRD5 | | TRUE | 4:9391873-9394231 | 4.51E-09 | 2.70623
cg24673765 | HSPB6 | | TRUE | 19:40939387-40939960 | 4.51E-09 | 2.0248
cg17399166 | CD1D | | FALSE | | 4.51E-09 | -2.86132
cg08572611 | ACTL6B | | TRUE | 7:100091590-100092362 | 4.80E-09 | 5.19801
cg18555440 | MYOD1 | | TRUE | 11:17696873-17700534 | 5.16E-09 | 2.55779


cg07533148 | TRIM58 | | TRUE | 1:246086350-246088254 | 5.41E-09 | 8.96755
cg09868882 | GRM8 | | FALSE | | 5.41E-09 | -2.04268
cg24423088 | KRTAP8-1 | | FALSE | | 5.41E-09 | -4.51496
cg09099744 | CDKN2A | | TRUE | 9:21958106-21958899 | 5.62E-09 | 5.03505
cg05441133 | GDF2 | | TRUE | 10:48036704-48037082 | 6.85E-09 | 2.99165
cg03534410 | TMEM40 | | FALSE | | 6.85E-09 | -2.39156
cg17525406 | AJAP1 | | TRUE | 1:4613727-4616670 | 7.53E-09 | 5.67884
cg06722633 | GRIK3 | | TRUE | 1:37270866-37273461 | 7.94E-09 | 3.17803
cg08044694 | BRD4 | | TRUE | 19:15252811-15253031 | 1.02E-08 | 2.79397
cg04048259 | EDN3 | | TRUE | 20:57308563-57309693 | 1.07E-08 | 5.39158
cg03975694 | ZNF540 | | TRUE | 19:42733846-42734771 | 1.11E-08 | 2.91691
cg23037403 | ZNF454 | | TRUE | 5:178300061-178301434 | 1.15E-08 | 3.05459
cg23130254 | HOXD12 | | TRUE | 2:176672156-176673886 | 1.17E-08 | 3.51632
cg11946503 | NEUROG1 | | TRUE | 5:134898284-134900130 | 1.31E-08 | 2.44347
cg08411049 | SERPINB5 | | TRUE | 18:59294878-59295284 | 1.31E-08 | -3.38117
cg10409560 | FLJ23657 | | FALSE | | 1.39E-08 | -2.48664
cg26521448 | ZC3H7A | | TRUE | 16:11783829-11784227 | 1.40E-08 | -2.86588
cg15191648 | SALL3 | | TRUE | 18:74837928-74842504 | 1.44E-08 | 5.33531
cg10235817 | ADRA2C | | TRUE | 4:3736799-3739691 | 1.50E-08 | 2.55344
cg22660578 | LHX1 | | TRUE | 17:32365846-32370498 | 1.50E-08 | 4.306
cg03355526 | ZNF454 | | TRUE | 5:178300061-178301434 | 1.50E-08 | 2.33331
cg02994956 | NEFH | | TRUE | 22:28205749-28207420 | 1.50E-08 | 2.0262
cg16280667 | BLR1 | | TRUE | 11:118259679-118259945 | 1.50E-08 | -2.92164
cg14958635 | NEUROG1 | | TRUE | 5:134898284-134900130 | 1.54E-08 | 3.47517
cg02164046 | SST | | TRUE | 3:188870156-188871038 | 1.70E-08 | 3.83278
cg18752880 | C1QTNF3 | | FALSE | | 1.70E-08 | 3.12185
cg10217445 | OLIG2 | | TRUE | 21:33316613-33322726 | 1.70E-08 | 2.30999
cg06092815 | SKIP | | TRUE | 2:228754274-228755173 | 1.70E-08 | 2.08357


cg10549973 | UNQ9438 | | TRUE | 14:57932076-57933064 | 1.70E-08 | -2.6428
cg10238818 | CYYR1 | | TRUE | 21:26866776-26867613 | 1.93E-08 | 3.77282
cg11525285 | CHX10 | | TRUE | 14:73776003-73778973 | 1.93E-08 | 2.94485
cg21410991 | ISL1 | | TRUE | 5:50714114-50715273 | 1.93E-08 | 2.73412
cg02332525 | GRM7 | | TRUE | 3:6877272-6878818 | 1.93E-08 | 2.40919
cg08397758 | C10orf33 | | FALSE | | 1.93E-08 | 2.2363
cg26575445 | GPR160 | | TRUE | 3:171238226-171239656 | 1.93E-08 | -2.19567
cg07536847 | PAX7 | | TRUE | 1:18829151-18831168 | 1.94E-08 | 3.32338
cg23317501 | UGT3A1 | | FALSE | | 1.94E-08 | 2.70098
cg12109455 | DPYSL4 | | TRUE | 10:133848709-133851502 | 2.18E-08 | 3.3647
cg20291049 | POU3F3 | | TRUE | 2:104835039-104840521 | 2.18E-08 | 2.8985
cg06151165 | VSX1 | | TRUE | 20:25009572-25011060 | 2.20E-08 | 5.13569
cg12799895 | NPTX2 | | TRUE | 7:98083575-98085970 | 2.20E-08 | 2.97485
cg17398613 | SLC37A1 | | TRUE | 21:42792735-42793017 | 2.20E-08 | 2.48055
cg02501779 | CBLN4 | | TRUE | 20:54011953-54014199 | 2.20E-08 | 2.47415
cg00918005 | REG3G | | FALSE | | 2.45E-08 | -2.71805
cg26789453 | TMEM116 | | FALSE | | 2.52E-08 | -3.20591
cg01580681 | HAND2 | | TRUE | 4:174686012-174689831 | 2.55E-08 | 2.54454
cg23732024 | LY96 | | FALSE | | 2.72E-08 | -3.09695
cg08118311 | SALL3 | | TRUE | 18:74837928-74842504 | 2.76E-08 | 5.4242
cg14384532 | NTRK3 | | TRUE | 15:86599302-86602208 | 2.76E-08 | 3.72684
cg08109815 | NMBR | | TRUE | 6:142450876-142451811 | 2.76E-08 | 2.00673
cg05615150 | ARPP-21 | | FALSE | | 2.76E-08 | -2.51246
cg06710648 | DAB1 | | FALSE | | 2.97E-08 | 2.45502
cg02806777 | PGLYRP1 | | TRUE | 19:51217863-51218220 | 3.03E-08 | 2.23284
cg03958979 | NR2E1 | | TRUE | 6:108591543-108597244 | 3.08E-08 | 2.42006
cg11323198 | CDH8 | | TRUE | 16:60625845-60628489 | 3.22E-08 | 2.9332
cg19797376 | TAL1 | | TRUE | 1:47467815-47468163 | 3.22E-08 | 2.27375


cg19456540 | SIX6 | | TRUE | 14:60045112-60046647 | 3.26E-08 | 6.85826
cg14008883 | SLC18A3 | | TRUE | 10:50486969-50490812 | 3.32E-08 | 2.60998
cg08575537 | EPO | | TRUE | 7:100155958-100156762 | 3.32E-08 | 3.29951
cg25750259 | C20orf121 | | FALSE | | 3.32E-08 | -2.15675
cg09088576 | CSF3R | | TRUE | 1:36720283-36720726 | 3.50E-08 | 2.30994
cg00948500 | KRTAP20-2 | | FALSE | | 3.52E-08 | -2.01304
cg12374721 | PRAC | | TRUE | 17:44154292-44154687 | 3.53E-08 | 4.20967
cg26388152 | CORO6 | | TRUE | 17:24968557-24969595 | 3.53E-08 | -3.02429
cg02757432 | GPR26 | | TRUE | 10:125415200-125416731 | 3.65E-08 | 2.85014
cg18335068 | ZNF677 | | TRUE | 19:58449450-58450521 | 3.85E-08 | 3.10358
cg10189695 | GPR78 | | TRUE | 4:8632931-8634384 | 3.85E-08 | 2.99049
cg16812893 | KRTAP15-1 | | FALSE | | 3.85E-08 | -2.48084
cg20312687 | DEFB118 | | FALSE | | 3.85E-08 | -2.77036
cg26963271 | PDE4B | | TRUE | 1:66030397-66031906 | 3.86E-08 | 5.64951
cg16042149 | NEFH | | TRUE | 22:28205749-28207420 | 3.86E-08 | 2.60224
cg07621046 | C10orf82 | | TRUE | 10:118419465-118419880 | 3.95E-08 | 2.64588
cg02919422 | SOX17 | | TRUE | 8:55532627-55535230 | 3.99E-08 | 3.02776
cg18267381 | ZNF659 | | TRUE | 3:21767324-21768052 | 4.12E-08 | 2.2579
cg21614638 | DAPP1 | | FALSE | | 4.21E-08 | -2.46926
cg10646402 | PTPRO | | TRUE | 12:15366460-15367398 | 4.26E-08 | 5.61547
cg00548268 | NPTX2 | | TRUE | 7:98083575-98085970 | 4.26E-08 | 2.38077
cg21747271 | AIP | | TRUE | 11:67006002-67007558 | 4.26E-08 | -2.49675
cg10257049 | C5orf4 | | FALSE | | 4.55E-08 | 3.09162
cg01280080 | ATIC | | TRUE | 2:215884284-215885543 | 4.59E-08 | -2.377
cg13302823 | SCRT1 | | TRUE | 8:145530348-145533488 | 4.60E-08 | 2.2281
cg00489401 | FLT4 | | TRUE | 5:180008152-180010059 | 4.66E-08 | 3.63054
cg12128017 | EPHA10 | | TRUE | 1:38002292-38003550 | 4.82E-08 | 3.28722
cg23054883 | FZD10 | | TRUE | 12:129211083-129215265 | 4.87E-08 | 3.06877


cg26069745 | HOXA2 | | TRUE | 7:27108167-27108756 | 4.87E-08 | 2.92004
cg16886259 | CYSLTR2 | | FALSE | | 4.87E-08 | -2.03023
cg19884262 | FLJ46831 | | TRUE | 10:129424072-129426344 | 4.89E-08 | 3.63278
cg26799474 | CASP8 | | FALSE | | 4.89E-08 | -2.80618
cg00891541 | SMPD3 | HS_254 | TRUE | 16:67038237-67039167 | 4.99E-08 | 2.4388
cg05824215 | CCR6 | | FALSE | | 5.05E-08 | -2.09417
cg00916635 | PTPN22 | | FALSE | | 5.05E-08 | -3.47102
cg22285621 | SSH3 | | TRUE | 11:66826951-66828421 | 5.74E-08 | 3.42693
cg09325711 | RALA | | TRUE | 7:39628951-39630478 | 5.78E-08 | -2.51651
cg03734874 | FLJ42486 | | TRUE | 14:104141516-104142673 | 5.82E-08 | 2.20821
cg10660256 | BHMT | | TRUE | 5:78443059-78443592 | 5.96E-08 | 3.57441
cg14859460 | GRM6 | | TRUE | 5:178353697-178355005 | 6.02E-08 | 4.64987
cg23196831 | COL14A1 | | TRUE | 8:121206129-121207658 | 6.12E-08 | 2.80161
cg11171719 | CTDSPL | | TRUE | 3:37876779-37879232 | 6.20E-08 | 3.01185
cg08441806 | NKX6-2 | | TRUE | 10:134447362-134452718 | 6.20E-08 | 2.8105
cg26316946 | GRIK2 | | TRUE | 6:101953329-101954569 | 6.20E-08 | 2.57763
cg27389185 | ZNF540 | | TRUE | 19:42733846-42734771 | 6.20E-08 | 2.18974
cg08569678 | LY6K | | TRUE | 8:143778074-143779717 | 6.20E-08 | -2.13078
cg03874199 | HOXD12 | | TRUE | 2:176672156-176673886 | 6.22E-08 | 3.63158
cg04086012 | FLJ36180 | | TRUE | 4:189297883-189298217 | 6.22E-08 | -2.10453
cg19831575 | FGF4 | | TRUE | 11:69297900-69299813 | 6.45E-08 | 4.64745
cg06491116 | LOC196264 | | TRUE | 11:117627891-117628420 | 6.47E-08 | -2.22093
cg21130374 | MX2 | | TRUE | 21:41656108-41656393 | 6.66E-08 | -2.88333
cg26320696 | PARVA | | TRUE | 11:12355424-12356620 | 6.84E-08 | 2.08672
cg02748539 | SLC9A3 | | TRUE | 5:576269-577990 | 6.87E-08 | 3.15149
cg18182399 | DES | | TRUE | 2:219991189-219992151 | 6.87E-08 | 2.83796
cg09427311 | ANGPTL2 | | FALSE | | 6.87E-08 | 2.27858
cg15227982 | C10orf26 | | FALSE | | 6.88E-08 | 2.4763


cg22471346 | GAS7 | | TRUE | 17:10041659-10043517 | 7.14E-08 | 5.60733
cg22994720 | CHRDL2 | | FALSE | | 7.38E-08 | 3.4647
cg18902090 | PCDHAC1 | | TRUE | 5:140285591-140287803 | 7.38E-08 | 2.03878
cg04534765 | GALR1 | | TRUE | 18:73090255-73093032 | 7.42E-08 | 4.8292
cg08747377 | CDH13 | | TRUE | 16:81217810-81219323 | 7.70E-08 | 2.82846
cg04622802 | LOC387758 | | TRUE | 11:26972301-26972579 | 7.85E-08 | 2.54415
cg12880658 | CDO1 | | TRUE | 5:115179127-115180765 | 7.89E-08 | 3.94951
cg03623878 | MCF2L | | TRUE | 13:112703425-112703629 | 8.06E-08 | -2.41266
cg02868123 | REGL | | FALSE | | 8.08E-08 | -2.03152
cg00221494 | FARP1 | | TRUE | 13:97592322-97594649 | 8.08E-08 | -2.37114
cg07102705 | HTR4 | | TRUE | 5:148013473-148014289 | 8.30E-08 | 3.6451
cg14287742 | BLZF1 | | FALSE | | 8.32E-08 | 2.44897
cg01988129 | ADHFE1 | | TRUE | 8:67506956-67507717 | 8.37E-08 | 3.37786
cg09516965 | PTGDR | | TRUE | 14:51803838-51805763 | 8.58E-08 | 7.5344
cg11450827 | CLDN5 | | FALSE | | 8.58E-08 | 2.354
cg09061733 | SERPING1 | | FALSE | | 8.58E-08 | 2.24911
cg26701826 | MGC26963 | | FALSE | | 8.58E-08 | 2.13633
cg15842276 | MTNR1B | | TRUE | 11:92342113-92343227 | 8.58E-08 | 2.01113
cg15633390 | EIF4E | | TRUE | 4:100068241-100070323 | 8.58E-08 | -2.36758
cg14127336 | TCL1A | | TRUE | 14:95249952-95250699 | 8.58E-08 | -2.70449
cg25044651 | FLJ90650 | | TRUE | 5:115324903-115327577 | 8.61E-08 | 2.7704
cg21513553 | COL6A2 | | TRUE | 21:46342002-46344009 | 9.01E-08 | 4.9201
cg25993718 | CBLN4 | | TRUE | 20:54011953-54014199 | 9.01E-08 | 2.39989
cg17619823 | ADRB3 | | TRUE | 8:37941431-37943204 | 9.15E-08 | 5.03331
cg06384463 | BARHL2 | | TRUE | 1:90954376-90957242 | 9.31E-08 | 2.4654
cg16954341 | SCGN | | TRUE | 6:25760176-25760849 | 9.41E-08 | 2.35034
cg22951794 | OR10A5 | | FALSE | | 9.41E-08 | -2.09975
cg04490714 | SLC6A2 | | TRUE | 16:54246902-54248577 | 9.45E-08 | 4.17568


cg15005385 | CCL3L1 | | FALSE | | 9.45E-08 | -2.25007
cg13553455 | COL17A1 | | FALSE | | 9.45E-08 | -2.63331
cg02624705 | DCC | | TRUE | 18:48122009-48122893 | 9.89E-08 | 3.81996
cg10303487 | DPYS | | TRUE | 8:105547672-105548659 | 9.96E-08 | 5.11633
cg03285457 | LBX1 | | TRUE | 10:102977805-102980367 | 9.96E-08 | 3.03641
cg05246522 | KSR1 | | FALSE | | 1.02E-07 | -3.95067
cg06675478 | SOX1 | | TRUE | 13:111768043-111771727 | 1.03E-07 | 2.42143
cg13634319 | DGKA | | FALSE | | 1.03E-07 | -2.6856
cg24387380 | GABRA5 | | FALSE | | 1.04E-07 | -2.68192
cg19205041 | PHACTR2 | | TRUE | 6:143971046-143971300 | 1.05E-07 | 2.44137
cg00446235 | F11R | | TRUE | 1:159274828-159275626 | 1.05E-07 | -2.47313
cg22346765 | UNC5CL | | FALSE | | 1.06E-07 | -2.34562
cg18952647 | BNC1 | | TRUE | 15:81742719-81745331 | 1.08E-07 | 5.43243
cg01091565 | MESP1 | | FALSE | | 1.08E-07 | -2.32357
cg04958389 | PRSS2 | | FALSE | | 1.11E-07 | -2.05586
cg01316819 | PPP1R1A | | TRUE | 12:53268136-53269365 | 1.15E-07 | 2.35606
cg17178336 | IHH | | TRUE | 2:219632914-219634448 | 1.15E-07 | 2.14457
cg23595927 | MYL5 | | TRUE | 4:661751-662107 | 1.15E-07 | -2.48137
cg22187630 | CACNA1A | | TRUE | 19:13477638-13478603 | 1.15E-07 | 2.60502
cg06277657 | DGKI | | TRUE | 7:137181541-137183087 | 1.17E-07 | 2.71328
cg10556064 | SMPD3 | HS_254 | TRUE | 16:67038237-67039167 | 1.20E-07 | 2.19371
cg08876932 | PHOX2A | | TRUE | 11:71632226-71633530 | 1.20E-07 | 3.77851
cg00792849 | CX36 | | TRUE | 15:32833627-32834868 | 1.21E-07 | 2.55482
cg15520279 | HOXD8 | | TRUE | 2:176701094-176703914 | 1.24E-07 | 5.37946
cg04272086 | DCC | | TRUE | 18:48120150-48121583 | 1.26E-07 | 2.83962
cg05368341 | SYT6 | | TRUE | 1:114496254-114498818 | 1.28E-07 | 4.28225
cg16632715 | HOXD11 | | TRUE | 2:176679513-176681344 | 1.29E-07 | 2.00205
cg24133115 | PDE10A | | TRUE | 6:165993753-165998329 | 1.30E-07 | 2.79513


cg06766367 | CBFB | | TRUE | 16:65619963-65621646 | 1.34E-07 | -2.00105
cg01381846 | HOXA9 | HSA-MIR-196B | TRUE | 7:27170287-27173690 | 1.34E-07 | 3.73728
cg10883303 | HOXA13 | | TRUE | 7:27205123-27206974 | 1.40E-07 | 4.50961
cg19343464 | GRIA4 | | TRUE | 11:104986144-104987083 | 1.43E-07 | 3.36941
cg03914397 | NMUR2 | | TRUE | 5:151764215-151764644 | 1.45E-07 | -2.16026
cg10141715 | SLC5A8 | | TRUE | 12:100127137-100128325 | 1.47E-07 | 2.68354
cg23147597 | CEACAM19 | | FALSE | | 1.47E-07 | -2.96781
cg23067535 | FAM83A | | FALSE | | 1.47E-07 | -3.33495
cg21591742 | HOXD10 | | TRUE | 2:176688866-176689812 | 1.51E-07 | 2.49342
cg07195577 | TLCD1 | | TRUE | 17:24076897-24078478 | 1.51E-07 | -2.53355
cg08089301 | HOXB4 | HSA-MIR-10A | TRUE | 17:44009922-44011032 | 1.52E-07 | 5.96701
cg02055963 | CDX2 | | TRUE | 13:27440079-27441627 | 1.52E-07 | 2.359
cg10486998 | GALR1 | | TRUE | 18:73090255-73093032 | 1.59E-07 | 2.64874
cg21296230 | GREM1 | | TRUE | 15:30796667-30799245 | 1.64E-07 | 2.37998
cg15337897 | FGF3 | | TRUE | 11:69340815-69343887 | 1.64E-07 | 3.08311
cg26767897 | XDH | | FALSE | | 1.65E-07 | -2.86761
cg10236239 | SULT1C2 | | FALSE | | 1.66E-07 | 2.01057
cg11389172 | SLC18A3 | | TRUE | 10:50486969-50490812 | 1.67E-07 | 5.30777
cg16638540 | ZNF135 | | TRUE | 19:63262091-63264376 | 1.67E-07 | 3.1566
cg15729869 | BARHL1 | | TRUE | 9:134444863-134448377 | 1.67E-07 | 2.89348
cg21572897 | CD80 | | FALSE | | 1.68E-07 | -2.39131
cg11206634 | SFT2D3 | | TRUE | 2:128174687-128174933 | 1.68E-07 | -2.39588
cg13398291 | SFRP1 | | TRUE | 8:41284820-41286784 | 1.68E-07 | 2.98346
cg23316360 | EDNRB | | TRUE | 13:77390197-77391989 | 1.72E-07 | 2.48533
cg18702197 | HOXD3 | | TRUE | 2:176735581-176736379 | 1.72E-07 | 2.41613
cg15526708 | TGFBR1 | | TRUE | 9:100906183-100906393 | 1.73E-07 | 2.06542
cg04598121 | PENK | | TRUE | 8:57520521-57522710 | 1.75E-07 | 2.43636
cg15046693 | CEBPG | | TRUE | 19:38554952-38557513 | 1.77E-07 | -2.15529


cg02764897 | KRTAP13-1 | | FALSE | | 1.77E-07 | -2.57968
cg12493906 | MMP26 | | FALSE | | 1.79E-07 | -3.10875
cg08970694 | HBE1 | | FALSE | | 1.81E-07 | -2.10247
cg12128839 | HOXA5 | | TRUE | 7:27147983-27152159 | 1.83E-07 | 2.77169
cg20644981 | RPS3A | | TRUE | 4:152239741-152240869 | 1.83E-07 | -2.07094
cg06738602 | PTGER2 | | TRUE | 14:51850265-51852038 | 1.85E-07 | 2.17516
cg01009664 | TRH | | TRUE | 3:131175550-131177582 | 1.88E-07 | 3.98383
cg12127282 | HOXD4 | HSA-MIR-10B | TRUE | 2:176722497-176722964 | 1.88E-07 | 2.2873
cg22165175 | KCNA2 | | TRUE | 1:110950314-110952114 | 1.88E-07 | 2.26499
cg18536148 | TBX4 | | TRUE | 17:56888442-56889722 | 1.91E-07 | 2.31506
cg24363955 | FLJ14054 | | FALSE | | 1.91E-07 | -2.06293
cg01839464 | DCC | | TRUE | 18:48122009-48122893 | 1.92E-07 | 3.16985
cg02245378 | FLJ32447 | | TRUE | 2:222869688-222870350 | 1.93E-07 | 2.2714
cg08832227 | KCNA1 | | TRUE | 12:4888590-4891600 | 1.96E-07 | 2.4544
cg09229912 | CUTL2 | | TRUE | 12:109955332-109958599 | 2.01E-07 | 5.65727
cg14269477 | TRPV5 | | FALSE | | 2.05E-07 | -2.06139
cg20587394 | HOXC13 | | TRUE | 12:52618141-52618612 | 2.07E-07 | 2.26805
cg15387123 | CLIC3 | | TRUE | 9:139010788-139010998 | 2.10E-07 | -2.08159
cg09829319 | GCM2 | | TRUE | 6:10989365-10990276 | 2.11E-07 | 2.479
cg13929328 | FLJ46831 | | TRUE | 10:129424072-129426344 | 2.12E-07 | 2.61376
cg03567830 | NTSR1 | | TRUE | 20:60810126-60811791 | 2.12E-07 | 2.78629
cg14991487 | HOXD9 | | TRUE | 2:176694360-176697263 | 2.17E-07 | 5.17898
cg10300684 | FOXG1B | | TRUE | 14:28304543-28307751 | 2.17E-07 | 2.98996
cg02676865 | UBTD1 | | TRUE | 10:99247397-99249880 | 2.17E-07 | 2.29301
cg24898753 | FTH1 | | TRUE | 11:61490720-61492672 | 2.17E-07 | -2.27526
cg27081230 | SLIC1 | | FALSE | | 2.17E-07 | -2.71043
cg00718513 | | | FALSE | | 2.17E-07 | -2.67718
cg19352038 | PAX3 | | TRUE | 2:222872611-222873277 | 2.19E-07 | 2.40163


cg06910100 | USHBP1 | | FALSE | | 2.22E-07 | 2.07593
cg07442479 | GDNF | | TRUE | 5:37874803-37876645 | 2.25E-07 | 2.66934
cg22418909 | SFRP1 | | TRUE | 8:41284820-41286784 | 2.27E-07 | 2.95419
cg15107670 | WT1 | | TRUE | 11:32411271-32413831 | 2.28E-07 | 4.0766
cg06005396 | HCN2 | | TRUE | 19:540330-542851 | 2.28E-07 | 2.92653
cg21243096 | POU3F1 | | TRUE | 1:38282510-38286276 | 2.28E-07 | 3.37178
cg19594666 | LEP | | TRUE | 7:127667928-127668724 | 2.29E-07 | 2.47367
cg12457773 | VMP | | TRUE | 6:24234185-24234658 | 2.29E-07 | 2.33086
cg07773116 | GDF10 | | TRUE | 10:48058333-48059403 | 2.30E-07 | 2.46637
cg15427448 | BACE1 | | TRUE | 11:116691805-116692604 | 2.33E-07 | 2.74696
cg13652336 | DEPDC2 | | TRUE | 8:69026529-69027679 | 2.33E-07 | 3.40292
cg24130010 | CHODL | | TRUE | 21:18538786-18539779 | 2.33E-07 | 2.40429
cg07260592 | LPA | | TRUE | 6:161020066-161020613 | 2.34E-07 | 2.22668
cg17412258 | DLK1 | | TRUE | 14:100261944-100263514 | 2.40E-07 | 2.96304
cg19620294 | TNFRSF11B | | TRUE | 8:120033087-120033943 | 2.47E-07 | 2.13157
cg14544583 | HBB | | FALSE | | 2.54E-07 | -2.5483
cg05839235 | NPR3 | | TRUE | 5:32747415-32750383 | 2.59E-07 | 2.79558
cg18818531 | FOSL1 | | TRUE | 11:65423063-65425238 | 2.61E-07 | -2.31271
cg22578204 | TIMP3 | | TRUE | 22:31527292-31528286 | 2.63E-07 | 3.29557
cg18152830 | TNFRSF13B | | FALSE | | 2.63E-07 | -2.26296
cg14144305 | ALX4 | | TRUE | 11:44282161-44283221 | 2.63E-07 | 2.4529
cg08045570 | FOXF2 | | TRUE | 6:1333978-1336464 | 2.65E-07 | 4.69074
cg27549944 | PLEKHA6 | | FALSE | | 2.65E-07 | 2.56532
cg17241310 | BARHL2 | | TRUE | 1:90954376-90957242 | 2.73E-07 | 3.5215
cg13548361 | PSD | | TRUE | 10:104168633-104172155 | 2.73E-07 | 2.47471
cg08539991 | ZBTB32 | | FALSE | | 2.73E-07 | -2.1742
cg05674944 | SLC30A2 | | TRUE | 1:26244727-26245945 | 2.80E-07 | 2.79897
cg19358493 | EMX2 | | TRUE | 10:119290824-119292327 | 2.80E-07 | 2.18958

cg21238818 GAL3ST3 TRUE 11:65572928-65573369 2.82E-07 3.41768
cg05221167 ZNF560 TRUE 19:9469758-9470740 2.82E-07 3.38145
cg11438428 PTF1A TRUE 10:23519872-23522628 2.84E-07 2.91841
cg15873301 SYN2 TRUE 3:12020118-12021768 2.86E-07 3.43127
cg25882366 HOXB2 TRUE 17:43976727-43977325 2.86E-07 2.26329
cg17586860 SSTR4 TRUE 20:22963641-22965145 2.91E-07 2.8127
cg17078393 LCK TRUE 1:32489393-32489639 2.95E-07 -2.23997
cg12508624 THY1 TRUE 11:118798969-118799279 3.01E-07 4.15888
cg22862656 SCGB2A2 FALSE 3.04E-07 -2.2429
cg16254309 CNTNAP2 TRUE 7:145443587-145445242 3.06E-07 2.51635
cg20691580 APOC3 FALSE 3.06E-07 -2.13195
cg03469054 KIAA1944 TRUE 12:128953408-128955096 3.07E-07 2.65622
cg00290506 CNIH3 TRUE 1:222870113-222872558 3.13E-07 4.5894
cg13323752 SLC2A14 TRUE 12:7916455-7917428 3.15E-07 5.63388
cg15852891 OTP TRUE 5:76969992-76971154 3.26E-07 3.69221
cg13699808 PRKCBP1 FALSE 3.29E-07 -2.32208
cg02008154 TBX20 TRUE 7:35259380-35261286 3.41E-07 2.79645
cg21226224 SOX17 TRUE 8:55532627-55535230 3.43E-07 2.54537
cg21604615 SYTL1 FALSE 3.49E-07 -2.44247
cg14289985 ZNF471 TRUE 19:61710502-61711930 3.52E-07 2.46229
cg06038133 CORO6 TRUE 17:24968557-24969595 3.55E-07 -2.42089
cg24719984 PPFIA2 TRUE 12:80676263-80677619 3.55E-07 2.06245
cg03544320 CRMP1 TRUE 4:5944246-5946191 3.61E-07 2.5568
cg09186006 SLC16A12 TRUE 10:91284810-91285992 3.62E-07 2.56426
cg00662556 GALR1 TRUE 18:73090255-73093032 3.74E-07 2.24707
cg13234863 KIAA1944 TRUE 12:128953408-128955096 3.80E-07 2.51958
cg02946850 GRM8 TRUE 7:126670007-126670217 3.87E-07 -2.10437
cg04057858 UNQ9391 FALSE 3.89E-07 -2.14958

cg00027083 EPB41L3 TRUE 18:5532741-5534303 3.89E-07 3.23326
cg13035743 PRRT1 TRUE 6:32227504-32227869 3.97E-07 3.2487
cg09871315 HOXA2 TRUE 7:27108930-27109394 3.97E-07 3.19624
cg25942450 TLX3 TRUE 5:170667594-170672661 3.99E-07 2.50244
cg18722841 PHOX2A TRUE 11:71632226-71633530 4.04E-07 3.40178
cg04809787 CHRNB1 TRUE 17:7288911-7290043 4.06E-07 -2.12239
cg07703401 HBQ1 TRUE 16:169874-171815 4.09E-07 3.64735
cg03963198 IRX4 TRUE 5:1934753-1937496 4.09E-07 3.23833
cg11494699 RAG1 FALSE 4.09E-07 -2.06124
cg25902889 FSD1 FALSE 4.09E-07 3.53592
cg16869108 VHL TRUE 3:10157841-10160111 4.09E-07 -2.63551
cg03775422 MGC33530 TRUE 7:54577055-54577803 4.14E-07 2.65033
cg12448933 RAB37 TRUE 17:70178644-70179471 4.15E-07 2.14233
cg24507762 KCNB1 TRUE 20:47533199-47533573 4.16E-07 2.06828
cg05389335 TACR3 TRUE 4:104860449-104860859 4.17E-07 2.2877
cg16142218 CHMP7 FALSE 4.17E-07 -2.59749
cg02927346 RASL10B TRUE 17:31082590-31083559 4.20E-07 2.12697
cg03289872 ZNF667 TRUE 19:61679899-61681705 4.21E-07 3.27777
cg18592174 CHAT TRUE 10:50486969-50490812 4.26E-07 2.29967
cg06357925 PTPRO TRUE 12:15366460-15367398 4.34E-07 3.41593
cg14385738 PTPN22 FALSE 4.34E-07 -2.5911
cg18888403 HMGCL TRUE 1:24025359-24025790 4.34E-07 -3.18717
cg14153740 TRY1 FALSE 4.37E-07 -2.247
cg01366419 WBSCR17 TRUE 7:70233977-70236507 4.39E-07 2.13233
cg07903918 GABBR2 TRUE 9:100510345-100511938 4.39E-07 2.87924
cg20723355 FBXO39 TRUE 17:6619831-6620556 4.42E-07 3.07751
cg03872376 ZP4 FALSE 4.48E-07 -2.74121
cg11846956 KLK10 TRUE 19:56211831-56213022 4.49E-07 2.30542

cg10895543 CDKN2A TRUE 9:21958106-21958899 4.49E-07 2.1288
cg06760035 HOXB4 HSA-MIR-10A TRUE 17:44009922-44011032 4.54E-07 5.91864
cg04784672 LRFN5 TRUE 14:41146888-41147599 4.54E-07 4.06928
cg12164282 PXDN TRUE 2:1725493-1728007 4.54E-07 3.79473
cg15343119 GALR1 TRUE 18:73090255-73093032 4.54E-07 2.83464
cg24820809 MB FALSE 4.54E-07 -2.43919
cg20904010 SYN3 FALSE 4.54E-07 -3.65603
cg08758850 NR6A1 TRUE 9:126571699-126574781 4.57E-07 2.07521
cg09025324 SART2 TRUE 6:116798192-116799590 4.65E-07 2.01651
cg11500797 DLX5 TRUE 7:96489769-96490524 4.71E-07 2.53781
cg19721889 HAND1 TRUE 5:153836987-153838170 4.71E-07 2.35105
cg23303408 POU4F3 TRUE 5:145698070-145700672 4.71E-07 2.14567
cg23563234 PCDHGB7 TRUE 5:140777221-140777959 4.71E-07 2.11298
cg19584957 TTLL10 TRUE 1:1104363-1104588 4.71E-07 -2.74398
cg19064258 HS3ST2 TRUE 16:22732005-22734135 4.77E-07 2.37165
cg06117855 CLEC3B FALSE 4.78E-07 2.2847
cg05788638 SERPINA10 FALSE 4.78E-07 -2.35349
cg01798589 STAC FALSE 4.82E-07 -2.01197
cg17183546 D4S234E TRUE 4:4438335-4440755 4.85E-07 2.19903
cg04956511 PTPN6 TRUE 12:6926143-6926370 4.86E-07 -2.00814
cg04897683 NEUROG1 TRUE 5:134898284-134900130 4.93E-07 2.53987
cg26757673 IL2RB FALSE 5.11E-07 -2.55632
cg07103493 SLC27A6 TRUE 5:128328505-128329501 5.18E-07 3.19939
cg25875213 FLJ37549 TRUE 19:42874560-42875820 5.23E-07 4.80611
cg07452799 PARD3 TRUE 10:35142690-35145597 5.28E-07 2.48165
cg15543551 FGF12 TRUE 3:193927501-193928340 5.30E-07 2.84336
cg20311730 NALP10 FALSE 5.30E-07 -2.45127
cg17827767 LRRC21 FALSE 5.36E-07 -2.19946

cg23110514 LCE3E FALSE 5.37E-07 -2.16647
cg06793062 CNTNAP4 FALSE 5.49E-07 -2.27457
cg10636246 AIM2 FALSE 5.49E-07 -3.37107
cg16761581 ADCY4 TRUE 14:23873382-23874334 5.50E-07 4.17516
cg08369065 GATA4 TRUE 8:11598711-11600557 5.59E-07 2.72146
cg00902195 SYT10 TRUE 12:33482438-33484467 5.63E-07 3.8495
cg20050826 K6IRS2 TRUE 12:51280885-51281747 5.63E-07 2.16351
cg25266232 DCC TRUE 18:48120150-48121583 5.63E-07 2.16283
cg04123507 KRTHB6 TRUE 12:50981438-50982407 5.63E-07 2.28487
cg05564251 SP140 FALSE 5.74E-07 -2.32754
cg27341860 OR2L13 TRUE 1:246166364-246166628 5.79E-07 -2.51758
cg21816539 GRIK1 TRUE 21:30233131-30234502 5.88E-07 2.61961
cg02194878 EPHA8 TRUE 1:22762037-22763179 5.95E-07 2.60932
cg05472874 SULT4A1 TRUE 22:42589149-42590724 5.98E-07 2.96084
cg26252167 GPR6 TRUE 6:110406816-110407975 6.26E-07 2.9518
cg02595219 KCNE3 TRUE 11:73855641-73856427 6.26E-07 2.176
cg26200585 PRX FALSE 6.26E-07 2.09513
cg27486427 RARB TRUE 3:25444208-25445101 6.29E-07 2.99637
cg27269921 MN1 TRUE 22:26522650-26528990 6.31E-07 5.1564
cg08432727 SOX11 TRUE 2:5748447-5751811 6.31E-07 2.43327
cg08861115 IL1F9 FALSE 6.48E-07 -2.21874
cg10140638 PTPRN TRUE 2:219881750-219882661 6.51E-07 3.68991
cg08477744 MFAP2 FALSE 6.56E-07 -2.34891
cg14757492 DDX49 TRUE 19:18890160-18891963 6.71E-07 -2.29608
cg17410236 FLRT2 TRUE 14:85065422-85066845 6.72E-07 3.21402
cg26721264 GALR1 TRUE 18:73090255-73093032 6.81E-07 2.85653
cg26186727 NETO1 TRUE 18:68684832-68688500 6.89E-07 3.19396
cg09837977 LRRN3 FALSE 6.90E-07 2.21409

cg19812619 ITGB7 FALSE 6.91E-07 -2.18973
cg12614105 NPY TRUE 7:24289504-24291701 6.99E-07 2.14211
cg05595345 ARRDC4 TRUE 15:96304362-96305847 7.02E-07 3.91649
cg15748507 PRLHR TRUE 10:120343551-120346241 7.03E-07 2.1116
cg10362591 SLC6A2 TRUE 16:54246902-54248577 7.20E-07 3.51945
cg02064106 C6orf118 TRUE 6:165642419-165642983 7.29E-07 2.43734
cg09643544 ZNF177 TRUE 19:9334490-9335153 7.33E-07 2.38206
cg15446391 WT1 TRUE 11:32408563-32409903 7.45E-07 2.04815
cg01369413 UBQLN3 FALSE 7.45E-07 -2.16156
cg22123464 SLC8A2 FALSE 7.55E-07 2.06632
cg15774153 FGF19 TRUE 11:69226001-69229286 7.59E-07 2.336
cg05828624 REG1A FALSE 7.68E-07 -2.38367
cg27634151 GPR83 TRUE 11:93773573-93774675 7.85E-07 2.17976
cg20792062 KCNA5 TRUE 12:5022830-5024746 7.86E-07 2.98382
cg19774122 LAMA2 TRUE 6:129245749-129246433 7.90E-07 2.29011
cg23748737 SCARF2 TRUE 22:19120497-19122837 7.94E-07 2.08582
cg03264414 PAEP FALSE 7.95E-07 -2.1736
cg15835825 HTR5A TRUE 7:154492621-154493154 8.12E-07 3.30618
cg04907257 ADCY2 TRUE 5:7447893-7448574 8.14E-07 2.77185
cg12397274 TINAG FALSE 8.14E-07 -2.31661
cg12865837 SIM1 TRUE 6:101018215-101020149 8.22E-07 2.79988
cg09067967 UGDH HS_74 TRUE 4:39204770-39206279 8.24E-07 2.11922
cg21529533 HLA-G TRUE 6:29903354-29904626 8.24E-07 3.65348
cg05508084 ZNF667 TRUE 19:61679899-61681705 8.24E-07 2.17107
cg06948294 STXBP6 TRUE 14:24587873-24589587 8.29E-07 2.48335
cg17026542 OR2A4 FALSE 8.34E-07 -2.13336
cg24396745 HCN4 TRUE 15:71446780-71449188 8.49E-07 2.64179
cg12810837 CLEC2D FALSE 8.49E-07 -2.63386

cg02250594 ONECUT2 TRUE 18:53252264-53255385 8.67E-07 2.26358
cg25741452 KITLG TRUE 12:87497441-87498894 8.67E-07 2.15132
cg13351406 LOC284912 FALSE 8.79E-07 -2.21947
cg25971347 FOXF1 TRUE 16:85098762-85102716 8.86E-07 3.88166
cg25484904 FLJ21511 TRUE 4:48682670-48683738 8.86E-07 2.34339
cg14611174 SIX6 TRUE 14:60045112-60046647 8.91E-07 2.89306
cg23767977 KRT6IRS FALSE 8.93E-07 -2.18762
cg09313705 HOXB2 TRUE 17:43977438-43977701 9.09E-07 2.22351
cg13676215 AHR TRUE 7:17304473-17305753 9.11E-07 -2.04538
cg19466563 SPARCL1 FALSE 9.15E-07 2.17221
cg14785479 SCARF2 TRUE 22:19120497-19122837 9.16E-07 3.96352
cg18794577 GRIN3A TRUE 9:103538958-103541050 9.16E-07 3.24566
cg18750960 HOXD4 TRUE 2:176724528-176725909 9.16E-07 3.00254
cg04435420 SGCD FALSE 9.22E-07 2.06872
cg26728422 C16orf28 TRUE 16:1368934-1370224 9.22E-07 -2.23738
cg06971096 PTPRN TRUE 2:219881750-219882661 9.23E-07 2.08316
cg13462129 DLX5 TRUE 7:96487969-96489640 9.37E-07 2.10339
cg03860768 BLK FALSE 9.46E-07 -2.22117
cg05726109 GP1BB TRUE 22:18089052-18090230 9.56E-07 2.19816
cg03238797 ADAMTS18 TRUE 16:76025670-76027234 9.82E-07 2.94672
cg26055770 PDZRN3 TRUE 3:73755774-73757236 9.82E-07 2.72926
cg25596297 FLJ32447 TRUE 2:222871073-222872248 1.03E-06 2.34712
cg25908985 IHH TRUE 2:219632914-219634448 1.05E-06 4.12723
cg00625653 WNT7A TRUE 3:13894306-13897604 1.05E-06 2.477
cg27444994 CDH8 TRUE 16:60625845-60628489 1.05E-06 2.26614
cg00891278 CCDC37 TRUE 3:127595793-127596808 1.06E-06 2.31286
cg26687173 LOC126248 TRUE 19:38314510-38315181 1.07E-06 2.11946
cg14948822 MPV17 FALSE 1.07E-06 -2.61982

cg23337382 GPR125 TRUE 4:22125798-22127332 1.08E-06 2.27207
cg05890019 WDFY3 TRUE 4:86105707-86107497 1.10E-06 2.22123
cg13749822 HHIP TRUE 4:145785521-145786153 1.11E-06 2.16679
cg13870866 TBX20 TRUE 7:35259380-35261286 1.11E-06 2.56768
cg21414251 OR12D2 FALSE 1.11E-06 -2.53671
cg15613048 KIF17 TRUE 1:20916071-20917733 1.12E-06 4.24682
cg04005707 FLJ21511 TRUE 4:48682670-48683738 1.12E-06 2.95794
cg06498267 HCN1 TRUE 5:45730846-45732499 1.13E-06 2.82176
cg02248486 HOXA5 TRUE 7:27147983-27152159 1.13E-06 2.31547
cg00848461 RAB24 FALSE 1.13E-06 -2.54183
cg21949305 ADORA2A FALSE 1.14E-06 2.45369
cg25806808 CXCL1 TRUE 4:74953702-74954565 1.15E-06 2.26809
cg16787600 SORCS3 TRUE 10:106389391-106393271 1.15E-06 2.8746
cg18674980 CA3 TRUE 8:86537665-86538525 1.16E-06 2.52129
cg14696396 TM6SF1 TRUE 15:81566671-81567867 1.16E-06 4.44764
cg15439862 DSC3 TRUE 18:26874977-26877256 1.20E-06 2.23604
cg20632573 SLC6A5 TRUE 11:20577839-20578244 1.21E-06 2.34605
cg15551881 TRAF1 FALSE 1.21E-06 -2.02204
cg09458237 HSPA12B FALSE 1.21E-06 -2.10063
cg20380069 MSI1 TRUE 12:119290126-119292199 1.23E-06 2.82197
cg13878010 ADCY5 TRUE 3:124648852-124651990 1.23E-06 2.34668
cg25500444 FLJ23514 TRUE 11:85763239-85763775 1.23E-06 2.13902
cg24784109 HIST1H3D TRUE 6:26307702-26308258 1.25E-06 2.40398
cg20616414 WNK2 TRUE 9:94985882-94988045 1.26E-06 2.80121
cg09068492 CALCA TRUE 11:14949913-14950639 1.26E-06 2.07867
cg23242898 DCC TRUE 18:48120150-48121583 1.26E-06 3.08336
cg12120741 EDNRB TRUE 13:77390197-77391989 1.26E-06 2.4118
cg12699371 GALR1 TRUE 18:73090255-73093032 1.26E-06 2.32417

cg00685836 PPFIA2 TRUE 12:80676263-80677619 1.26E-06 2.25755
cg19317715 AOC2 FALSE 1.26E-06 2.19073
cg06051311 TRIM15 FALSE 1.28E-06 -3.4203
cg20339230 ST8SIA2 TRUE 15:90737192-90739561 1.29E-06 2.87623
cg26162582 KCNA6 TRUE 12:4788356-4789953 1.30E-06 3.30936
cg04072323 SKIP TRUE 2:228754274-228755173 1.30E-06 2.52826
cg24199834 POU4F2 TRUE 4:147778503-147781596 1.30E-06 2.43876
cg16924616 DLX5 TRUE 7:96490825-96492388 1.32E-06 3.24594
cg02613386 FBXO39 TRUE 17:6619831-6620556 1.32E-06 2.16294
cg05647859 LIN7A TRUE 12:79854346-79856034 1.35E-06 2.69505
cg20535085 SLAMF1 FALSE 1.36E-06 -2.37723
cg20008332 SOX11 TRUE 2:5748447-5751811 1.36E-06 2.76023
cg13449778 C1orf76 TRUE 1:177978406-177980764 1.40E-06 3.84967
cg03755123 UTF1 TRUE 10:134892848-134895213 1.41E-06 2.24299
cg23695504 FLJ45717 TRUE 1:245340797-245342416 1.41E-06 2.23393
cg18484189 NALP10 FALSE 1.41E-06 -2.23743
cg02168291 CDH13 TRUE 16:81228442-81229032 1.41E-06 -2.56137
cg06713098 IGFBP3 TRUE 7:45926064-45928161 1.44E-06 2.2995
cg13672342 WDR39 FALSE 1.47E-06 2.15999
cg20312228 CCDC37 TRUE 3:127595793-127596808 1.48E-06 2.23686
cg02440177 ZNF702 TRUE 19:58187792-58189007 1.49E-06 3.10691
cg25094569 WT1 TRUE 11:32404590-32406255 1.52E-06 2.64322
cg13168820 PTPRT TRUE 20:41249910-41252787 1.55E-06 3.08803
cg06295856 CALCA TRUE 11:14949913-14950639 1.55E-06 2.48149
cg18573383 KCNC2 TRUE 12:73889130-73889725 1.55E-06 2.07829
cg05521696 SLC2A14 TRUE 12:7916455-7917428 1.56E-06 2.49955
cg22680204 CRMP1 TRUE 4:5944246-5946191 1.58E-06 2.59307
cg07175883 HOXD13 TRUE 2:176664597-176666676 1.60E-06 3.68402

cg25563456 WT1 TRUE 11:32411271-32413831 1.60E-06 3.4131
cg00187686 TCN1 FALSE 1.62E-06 -2.52015
cg06638451 FAM107A FALSE 1.64E-06 3.06781
cg23002761 FBLIM1 TRUE 1:15957249-15958531 1.65E-06 2.64802
cg21269934 FLJ37478 TRUE 4:2029497-2032981 1.66E-06 2.99401
cg11376198 AFAR3 TRUE 1:19472685-19473594 1.66E-06 2.53215
cg27603796 CTTNBP2 TRUE 7:117300033-117301241 1.66E-06 2.23458
cg04456238 WT1 TRUE 11:32406516-32407359 1.66E-06 2.02283
cg25766046 ROR2 TRUE 9:93750567-93753002 1.68E-06 2.22158
cg01643580 KCNK3 TRUE 2:26768692-26770187 1.69E-06 2.20389
cg01683883 CMTM2 TRUE 16:65170077-65171345 1.69E-06 2.09169
cg23760945 ELOF1 FALSE 1.69E-06 -3.25901
cg23182299 LHX5 TRUE 12:112393114-112395188 1.72E-06 2.44826
cg24169822 HOXA4 TRUE 7:27136060-27137536 1.72E-06 2.02008
cg26024843 COL5A1 TRUE 9:136672880-136674375 1.72E-06 2.85393
cg15062535 ZNF610 FALSE 1.72E-06 -2.11557
cg18692273 TNNT2 FALSE 1.78E-06 -2.33362
cg00911351 PCDHGB4 TRUE 5:140747256-140748040 1.80E-06 2.13275
cg11021744 SLC6A1 TRUE 3:11008964-11010557 1.81E-06 2.13286
cg12781568 WT1 TRUE 11:32408563-32409903 1.82E-06 2.43257
cg20903926 C1orf177 TRUE 1:55044094-55044620 1.86E-06 2.0767
cg12847373 EDNRB TRUE 13:77390197-77391989 1.87E-06 2.472
cg21376883 ACTN2 TRUE 1:234915582-234917127 1.88E-06 2.15305
cg08190291 ADAMTS5 TRUE 21:27259555-27262238 1.88E-06 2.73704
cg08331313 SPARC TRUE 5:151046409-151046704 1.88E-06 2.8601
cg04062391 ZNF560 TRUE 19:9469758-9470740 1.88E-06 2.46327
cg07237939 SLC22A3 TRUE 6:160688754-160690468 1.88E-06 3.46703
cg05382123 CSMD2 TRUE 1:34405032-34405247 1.92E-06 3.55791

cg17749443 ZFP37 TRUE 9:114858290-114859224 1.92E-06 2.90737
cg20449692 CLDN11 TRUE 3:171618793-171620758 1.92E-06 2.83107
cg05902852 MAGI2 TRUE 7:78920008-78920436 1.92E-06 2.49396
cg18841634 DCC TRUE 18:48122009-48122893 1.92E-06 2.06346
cg22658979 MMP13 FALSE 1.92E-06 -2.28877
cg06790324 GRB10 TRUE 7:50827590-50829211 1.92E-06 2.65483
cg02028524 ATXN3 TRUE 14:91641896-91643508 1.92E-06 -2.84122
cg21509023 HBA2 TRUE 16:162188-163573 1.94E-06 2.55707
cg22231902 EN1 TRUE 2:119320039-119321504 1.95E-06 3.02218
cg25659818 CCL4 FALSE 1.98E-06 -2.39888
cg08634024 OR2F1 FALSE 2.00E-06 -2.95053
cg07846220 LAMA1 TRUE 18:7106780-7108404 2.02E-06 3.11571
cg12645220 PAK7 TRUE 20:9766686-9767917 2.05E-06 2.44099
cg16092786 WT1 TRUE 11:32411271-32413831 2.06E-06 2.12219
cg13701109 ADAMTS19 TRUE 5:128823186-128825406 2.07E-06 2.0821
cg14289511 FLJ45256 TRUE 16:24589802-24590005 2.08E-06 3.10908
cg07480567 HS3ST3A1 TRUE 17:13444405-13446633 2.08E-06 2.85405
cg06940792 MEGF10 TRUE 5:126653575-126654899 2.11E-06 2.63076
cg01295203 PRDM14 TRUE 8:71144163-71147771 2.11E-06 2.28627
cg24322623 MYOD1 TRUE 11:17696873-17700534 2.11E-06 2.45156
cg25226247 TFAP2B TRUE 6:50893057-50893267 2.11E-06 2.03484
cg07747336 C11orf38 FALSE 2.12E-06 -2.01272
cg13797031 NIPSNAP1 FALSE 2.12E-06 -2.46585
cg01335367 C12orf34 FALSE 2.14E-06 2.43631
cg07823492 HOXB1 TRUE 17:43962669-43963416 2.18E-06 3.10059
cg06119575 TAL2 FALSE 2.19E-06 -2.08507
cg06995715 FOXL1 TRUE 16:85169562-85171358 2.19E-06 2.88341
cg03177025 RBKS TRUE 2:27966400-27967788 2.20E-06 -2.34457

cg24355048 CTSG FALSE 2.22E-06 -2.34649
cg13279585 FALSE 2.26E-06 -2.07642
cg21176048 PEX5L TRUE 3:181236808-181238032 2.29E-06 2.80104
cg00386408 TGFBI TRUE 5:135392351-135393108 2.30E-06 2.27873
cg03909500 RLN3R1 TRUE 5:33971758-33974643 2.32E-06 2.09202
cg05440289 IVL FALSE 2.34E-06 -3.37375
cg12163490 CDH11 TRUE 16:63711989-63713859 2.35E-06 3.11896
cg20387341 FGF10 TRUE 5:44424995-44425196 2.36E-06 2.15025
cg25361106 TLX2 TRUE 2:74593784-74597576 2.36E-06 2.22949
cg07072643 EMR3 FALSE 2.37E-06 -2.75639
cg24645221 PENK TRUE 8:57520521-57522710 2.39E-06 2.64992
cg07447922 EPHA10 TRUE 1:38002292-38003550 2.40E-06 2.92987
cg18349835 VIPR2 TRUE 7:158629176-158631369 2.44E-06 4.74152
cg23984130 FALSE 2.44E-06 -2.60448
cg08475088 NALP9 FALSE 2.45E-06 -2.17732
cg19018097 FLJ30934 TRUE 11:65357151-65358255 2.46E-06 4.41913
cg22946150 SH3GL3 TRUE 15:81906373-81908113 2.51E-06 5.71184
cg02899772 EFCBP2 TRUE 16:82558607-82560544 2.51E-06 2.51872
cg02833180 PLCL1 FALSE 2.52E-06 -2.76857
cg12265829 ADCY4 TRUE 14:23873382-23874334 2.52E-06 3.08916
cg21902327 FGF6 TRUE 12:4425015-4425387 2.52E-06 -2.11678
cg01446393 FAM107A FALSE 2.53E-06 2.25757
cg19210770 ACCN4 TRUE 2:220087160-220087532 2.53E-06 2.95358
cg04549333 ALX4 TRUE 11:44282161-44283221 2.56E-06 2.81387
cg12024292 ASTN2 TRUE 9:119217706-119217908 2.56E-06 2.37354
cg07028533 CNTNAP2 TRUE 7:145443587-145445242 2.58E-06 2.38421
cg01593886 COL1A1 TRUE 17:45632846-45634134 2.61E-06 2.15241
cg17749456 HSPBP1 FALSE 2.61E-06 2.07424

cg12680609 ZFP41 TRUE 8:144399366-144401373 2.65E-06 3.16031
cg18940763 XBP1 TRUE 22:27527316-27527912 2.66E-06 -2.93169
cg26599006 GSCL TRUE 22:17516279-17519015 2.66E-06 3.99266
cg05722918 SLC5A8 TRUE 12:100127137-100128325 2.67E-06 2.37179
cg00848728 DAB1 TRUE 1:58487718-58488900 2.72E-06 4.48978
cg03388193 HPSE2 TRUE 10:100985526-100986619 2.73E-06 2.68428
cg26656452 HABP2 FALSE 2.73E-06 -2.14148
cg13461622 RUNX3 FALSE 2.76E-06 -2.9922
cg22774472 COL5A2 FALSE 2.77E-06 2.05679
cg09053680 UTF1 TRUE 10:134892848-134895213 2.79E-06 4.4996
cg20804821 GPR62 FALSE 2.79E-06 -2.26261
cg19001226 HOXD1 TRUE 2:176761031-176762998 2.80E-06 2.28128
cg01693350 WT1 TRUE 11:32408563-32409903 2.80E-06 2.13479
cg18438777 NPY5R TRUE 4:164484179-164485334 2.80E-06 2.49487
cg15839448 SFRP1 TRUE 8:41284820-41286784 2.82E-06 2.61247
cg18107072 CGI-38 TRUE 16:65984642-65986631 2.83E-06 2.09651
cg14820573 C16orf33 TRUE 16:41660-44528 2.85E-06 -2.34498
cg19332710 RIMS4 TRUE 20:42872025-42873152 2.87E-06 4.50722
cg22268164 TRHR FALSE 2.89E-06 -3.64994
cg17279839 RARRES2 TRUE 7:149668838-149669799 2.92E-06 2.18239
cg24471894 KIAA0020 FALSE 3.00E-06 2.04591
cg15749748 GATA5 TRUE 20:60482564-60485407 3.01E-06 2.85085
cg21790626 ZNF154 TRUE 19:62911404-62912681 3.05E-06 4.82695
cg20950011 CIDEA TRUE 18:12243955-12245559 3.05E-06 2.22077
cg22026853 POU3F2 TRUE 6:99385538-99390983 3.07E-06 2.7043
cg01313514 WNT3A TRUE 1:226260592-226262296 3.08E-06 2.44828
cg21272774 HOXC9 TRUE 12:52679492-52681028 3.13E-06 2.05099
cg13663218 TRHDE TRUE 12:70951587-70954434 3.20E-06 2.56247

cg02309273 INPP5B FALSE 3.21E-06 2.49668
cg23422659 WNT9B TRUE 17:42283210-42284807 3.35E-06 3.09175
cg25509184 CFTR TRUE 7:116906608-116907566 3.37E-06 2.22439
cg16232126 SLC5A7 TRUE 2:107969161-107970056 3.40E-06 2.41714
cg00333226 UNQ739 TRUE 7:49783381-49786431 3.45E-06 2.20382
cg03538436 NOS1 TRUE 12:116282437-116283985 3.45E-06 2.51487
cg23432345 HOXA7 TRUE 7:27162023-27163571 3.46E-06 3.34255
cg10484958 PCDH8 TRUE 13:52321839-52322137 3.51E-06 2.22123
cg23213217 DEGS1 TRUE 1:222436457-222436832 3.51E-06 -2.28044
cg04263186 TACR3 TRUE 4:104859633-104860226 3.55E-06 2.00067
cg10588377 HTRA1 TRUE 10:124210154-124212965 3.59E-06 2.17831
cg20025656 ACTA1 TRUE 1:227636046-227637395 3.60E-06 2.08214
cg18236477 ATP8A2 TRUE 13:24940557-24941659 3.60E-06 2.27454
cg15772361 SERPINB3 FALSE 3.63E-06 -2.2653
cg25711779 EFEMP1 TRUE 2:56003158-56003360 3.64E-06 2.01686
cg19435264 SLC6A7 TRUE 5:149549539-149550189 3.67E-06 2.97727
cg14614211 IRXL1 TRUE 10:28069904-28075974 3.67E-06 2.28889
cg05890484 BHMT TRUE 5:78443059-78443592 3.68E-06 2.98942
cg00263760 VAX1 TRUE 10:118885881-118888114 3.68E-06 2.21281
cg19402885 PTPRO TRUE 12:15366460-15367398 3.68E-06 2.52075
cg11003133 AIM2 FALSE 3.68E-06 -2.72587
cg25920792 HTRA1 TRUE 10:124210154-124212965 3.71E-06 2.52333
cg17190608 CUTL2 TRUE 12:109955332-109958599 3.73E-06 2.77399
cg22375192 IGF1R TRUE 15:97007820-97012261 3.73E-06 3.05694
cg17460386 FAIM3 FALSE 3.74E-06 -2.07673
cg05485060 CTNNAL1 FALSE 3.79E-06 3.47167
cg17457560 NRG1 TRUE 8:32524620-32526592 3.79E-06 2.1851
cg11466837 TRIM29 FALSE 3.79E-06 -2.3908

cg06509940 CD80 FALSE 3.79E-06 -2.75327
cg02161900 IRXL1 TRUE 10:28069904-28075974 3.85E-06 2.05312
cg24298280 IBRDC1 FALSE 3.86E-06 -2.0932
cg20630151 BPESC1 FALSE 3.89E-06 -2.03364
cg25259754 FCRL3 FALSE 3.93E-06 -2.05257
cg25234611 VWA1 TRUE 1:1360547-1361469 3.93E-06 2.03741
cg24879335 TF TRUE 3:134947530-134948249 4.00E-06 2.28692
cg15699524 FGF18 TRUE 5:170778207-170780988 4.09E-06 -2.09439
cg17834752 KCNK9 TRUE 8:140783691-140786611 4.09E-06 2.9726
cg16708981 ZNF677 TRUE 19:58449450-58450521 4.15E-06 2.20435
cg12768605 LYPD5 TRUE 19:49016124-49016916 4.20E-06 2.98091
cg08535373 SLC32A1 TRUE 20:36785163-36788096 4.24E-06 2.19842
cg09227563 CDC42EP5 FALSE 4.25E-06 2.41989
cg10398682 BNC1 TRUE 15:81742719-81745331 4.30E-06 2.07179
cg23338195 SLC30A8 FALSE 4.38E-06 -2.32828
cg08307963 GJA5 FALSE 4.40E-06 2.85241
cg19970051 HBA1 TRUE 16:165988-167377 4.40E-06 2.31638
cg23710218 MSC TRUE 8:72918220-72919616 4.47E-06 2.05944
cg24719601 PHOX2B TRUE 4:41445013-41445297 4.66E-06 3.03539
cg16098981 C20orf39 TRUE 20:24397795-24400183 4.72E-06 2.42638
cg03329165 CD200 TRUE 3:113534483-113535262 4.76E-06 2.70554
cg25250358 PLOD2 TRUE 3:147361008-147362052 4.76E-06 2.45114
cg26256793 COL11A1 TRUE 1:103346874-103347272 4.77E-06 2.10908
cg07104706 SLITRK1 TRUE 13:83354079-83354687 4.81E-06 3.3438
cg03506489 KCNA4 TRUE 11:29994145-29995779 4.82E-06 2.41892
cg06291867 HTR7 TRUE 10:92606622-92608182 4.87E-06 2.32556
cg17371081 NELL1 TRUE 11:20647014-20648558 4.94E-06 3.74494
cg20640433 LAMA2 TRUE 6:129245749-129246433 4.97E-06 2.61882

cg20557202 SLC5A5 TRUE 19:17843659-17846374 5.08E-06 2.44721
cg00347904 SCUBE3 TRUE 6:35288946-35290652 5.16E-06 2.33222
cg10065825 CDH11 TRUE 16:63714025-63715302 5.19E-06 2.41176
cg14182690 RUNX3 TRUE 1:25163450-25163764 5.25E-06 -2.11029
cg04478795 SMO TRUE 7:128615303-128616882 5.33E-06 2.0251
cg12718562 TBC1D21 FALSE 5.33E-06 -2.253
cg13870494 MAMDC2 FALSE 5.34E-06 2.48011
cg27342801 REG3A FALSE 5.36E-06 -2.82789
cg23326197 CYP3A4 FALSE 5.36E-06 -2.31349
cg22341310 ZNF541 TRUE 19:52739817-52741134 5.39E-06 2.54339
cg12348970 SLC24A2 FALSE 5.40E-06 -2.46928
cg04810997 TAS2R60 FALSE 5.42E-06 -3.11472
cg23357981 GRP TRUE 18:55037964-55038946 5.44E-06 2.06416
cg02497758 MAFB TRUE 20:38749293-38753776 5.48E-06 2.4449
cg19589427 TNFSF18 FALSE 5.52E-06 2.02326
cg11695358 MAPK15 TRUE 8:144869897-144871106 5.52E-06 -2.13502
cg04947157 TMC6 TRUE 17:73638939-73640147 5.52E-06 -2.33294
cg02992632 FGF12 TRUE 3:193927501-193928340 5.52E-06 2.14995
cg07307078 TUBB6 TRUE 18:12297135-12299229 5.53E-06 3.22415
cg19167673 PDGFB TRUE 22:37967669-37970999 5.59E-06 2.50658
cg06268694 CELSR1 TRUE 22:45307876-45313678 5.65E-06 2.02536
cg19420968 HCRTR1 FALSE 5.68E-06 -2.18718
cg10807560 SLC8A1 FALSE 5.73E-06 -2.26701
cg24775607 SSTR5 TRUE 16:1068532-1070086 5.74E-06 -3.43788
cg27076139 ADCYAP1R1 TRUE 7:31058165-31059618 5.81E-06 2.6895
cg01805540 CACNB2 TRUE 10:18468996-18470353 5.85E-06 4.74072
cg18328334 TNS1 FALSE 5.86E-06 2.19864
cg04434339 ST6GAL2 TRUE 2:106868671-106870745 5.93E-06 3.2636

cg23671708 TMEM88 TRUE 17:7698629-7699147 5.94E-06 2.1306
cg19674669 LOC112937 TRUE 11:133650695-133652899 6.00E-06 2.80181
cg20792294 IGF2AS TRUE 11:2121548-2122565 6.03E-06 2.69545
cg05194726 NRIP2 FALSE 6.04E-06 2.44156
cg07376535 ADCYAP1 TRUE 18:894388-895760 6.04E-06 2.60292
cg15238224 TRIM31 FALSE 6.09E-06 -2.41331
cg25943276 C11orf39 TRUE 11:131038328-131038553 6.12E-06 -2.37699
cg08668790 ZNF154 TRUE 19:62911404-62912681 6.14E-06 3.0182
cg22674717 SALL1 TRUE 16:49740746-49744758 6.15E-06 2.60966
cg14400118 MMP2 TRUE 16:54070224-54071206 6.17E-06 2.42062
cg13843613 FAM5B TRUE 1:175406600-175407433 6.23E-06 2.39235
cg12864235 CDH9 FALSE 6.30E-06 2.49903
cg24924779 KCNG1 TRUE 20:49072343-49074019 6.31E-06 3.26259
cg21312148 LCE2D FALSE 6.40E-06 -2.42674
cg25465406 GUCY2D TRUE 17:7846663-7848252 6.40E-06 2.87463
cg23693510 ELAVL2 FALSE 6.47E-06 2.02407
cg17910564 VDAC3 TRUE 8:42367847-42369205 6.47E-06 -2.15197
cg01909245 LSP1 FALSE 6.53E-06 -2.37454
cg02723533 CCND1 TRUE 11:69177944-69178494 6.64E-06 -7.47976
cg20740029 SLC5A8 TRUE 12:100127137-100128325 6.66E-06 2.33454
cg17819635 TCTEX1D1 TRUE 1:66990232-66990994 6.66E-06 2.15435
cg01853981 KRT6IRS FALSE 6.75E-06 -2.22439
cg14262937 OPRM1 FALSE 6.82E-06 2.40878
cg19403023 TESSP1 TRUE 16:2788298-2789206 6.92E-06 2.02527
cg04833845 KCNN4 FALSE 6.96E-06 -2.2595
cg27403635 KCNN2 TRUE 5:113724294-113727611 6.97E-06 2.27675
cg21842478 HOXB13 TRUE 17:44160495-44161328 6.98E-06 3.14898
cg12783776 SERPING1 TRUE 11:57122371-57122627 6.98E-06 2.09461

cg23349790 IGSF21 TRUE 1:18306563-18308558 7.03E-06 4.22004
cg23040064 JPH3 TRUE 16:86192657-86194851 7.03E-06 2.09918
cg15078027 MGC10471 TRUE 19:13719267-13720489 7.05E-06 -4.49162
cg17771150 LCP1 FALSE 7.06E-06 -2.06768
cg23833452 RAB32 TRUE 6:146905888-146907189 7.19E-06 2.20592
cg10293925 AMPH TRUE 7:38636784-38637861 7.20E-06 2.36901
cg08047907 C1orf114 TRUE 1:167662982-167663571 7.32E-06 3.18407
cg26922202 OR2S2 FALSE 7.34E-06 -2.50113
cg24440147 PKN3 TRUE 9:130504016-130505778 7.35E-06 -2.28838
cg14603345 BTBD3 TRUE 20:11819237-11820394 7.40E-06 2.17294
cg06856528 TMEFF2 TRUE 2:192767148-192769234 7.42E-06 2.6728
cg14419187 C2orf21 TRUE 2:210344445-210345400 7.47E-06 3.54948
cg21241823 PRDM15 TRUE 21:42171521-42173135 7.47E-06 -2.13313
cg13391235 MGC22001 FALSE 7.64E-06 -2.10886
cg11832722 DSC3 TRUE 18:26874977-26877256 7.71E-06 2.00509
cg18110483 THBS4 TRUE 5:79366520-79367294 7.80E-06 2.12199
cg03310469 SIX2 TRUE 2:45088859-45091382 7.80E-06 2.89242
cg08422599 KCNIP1 TRUE 5:169863374-169864167 7.83E-06 2.39651
cg09952204 RASGRF2 TRUE 5:80291476-80292927 7.85E-06 3.39371
cg27196745 PTPRO TRUE 12:15366460-15367398 7.87E-06 2.48855
cg06954481 GBX2 TRUE 2:236740341-236743648 7.95E-06 4.12383
cg21621248 LRRTM1 TRUE 2:80384714-80385520 7.96E-06 2.07959
cg20937139 PDGFC TRUE 4:158111741-158112821 8.28E-06 2.81039
cg21602520 BCL2 TRUE 18:59136000-59137100 8.30E-06 -2.17188
cg04340502 GSTA3 FALSE 8.30E-06 -2.23943
cg07558455 ANKRD38 TRUE 1:62556581-62557855 8.31E-06 2.89852
cg19461621 COLEC12 TRUE 18:489195-491170 8.32E-06 2.54422
cg06552037 INHA TRUE 2:220142916-220144549 8.37E-06 2.4734

cg12238343 RLN3R1 TRUE 5:33971758-33974643 8.49E-06 2.97551
cg07080358 C2orf32 TRUE 2:68399390-68400827 8.68E-06 3.06997
cg14094960 EGFR TRUE 7:55053462-55055739 8.70E-06 2.66384
cg02579133 KRTAP10-8 FALSE 8.72E-06 -2.3585
cg03876618 IGFBP7 TRUE 4:57670424-57671769 8.83E-06 2.44681
cg14399060 HOXD4 HSA-MIR-10B TRUE 2:176723073-176723586 8.84E-06 2.23736
cg07525077 RNASE3 FALSE 8.85E-06 -2.40293
cg23771603 MYO3A TRUE 10:26262759-26264179 8.98E-06 3.14961
cg07766612 SLC30A2 TRUE 1:26244727-26245945 9.05E-06 2.93471
cg27096144 MSX2 TRUE 5:174083930-174085512 9.05E-06 2.38877
cg21937886 GFPT2 TRUE 5:179711599-179713980 9.05E-06 2.05992
cg10968815 BPIL1 FALSE 9.05E-06 -2.01471
cg10849854 GREB1 FALSE 9.09E-06 -2.26875
cg26656135 EYA4 TRUE 6:133603260-133605886 9.11E-06 3.56387
cg23239396 CALCR TRUE 7:93041484-93042067 9.26E-06 2.05722
cg26850754 CD8B1 TRUE 2:86942147-86942826 9.46E-06 2.73981
cg01468621 BRSK2 TRUE 11:1366566-1368933 9.53E-06 2.67975
cg04995095 CD300E FALSE 9.58E-06 -2.19138
cg24302095 GRB10 TRUE 7:50827590-50829211 9.60E-06 2.03967
cg08458170 ZNF537 TRUE 19:36462141-36462441 9.85E-06 -2.77132
cg07824742 DBH TRUE 9:135491544-135491749 9.87E-06 -4.0546
cg02244695 HCA112 TRUE 7:150127662-150129289 9.97E-06 2.3525
cg20073553 BAPX1 TRUE 4:13154295-13155884 1.00E-05 2.20696
cg26780333 ACOT4 TRUE 14:73126775-73129376 1.02E-05 2.22046
cg20011352 GPR124 TRUE 8:37773009-37775531 1.02E-05 2.36113
cg19162158 NRG1 TRUE 8:32524620-32526592 1.02E-05 2.53723
cg06183267 AFF3 FALSE 1.02E-05 -2.04458
cg08996413 SLA FALSE 1.03E-05 -2.04095

cg16983159 LOC340061 FALSE 1.04E-05 2.20094
cg15760840 HOXA11 TRUE 7:27189699-27192279 1.05E-05 3.45984
cg14893163 PCDH17 TRUE 13:57103762-57107288 1.05E-05 2.15327
cg21970438 TTLL2 FALSE 1.05E-05 -2.15
cg00775197 MGC33600 TRUE 6:44372906-44373799 1.05E-05 2.06823
cg16773899 EDIL3 TRUE 5:83715224-83716675 1.07E-05 2.58262
cg12629325 PCDHAC1 TRUE 5:140285591-140287803 1.08E-05 2.03363
cg11812218 GHSR TRUE 3:173647899-173649789 1.08E-05 2.26001
cg12005098 SLC16A12 TRUE 10:91284810-91285992 1.09E-05 2.85615
cg18972811 SLIT2 TRUE 4:19862145-19866149 1.09E-05 2.2493
cg16158220 REGL FALSE 1.11E-05 -2.27175
cg10318258 RIPK3 FALSE 1.11E-05 -2.04473
cg15843823 ALOX15 TRUE 17:4489288-4492318 1.12E-05 2.00282
cg01322134 WNT3A TRUE 1:226260592-226262296 1.14E-05 3.34572
cg10364513 RXRG TRUE 1:163680824-163681333 1.15E-05 2.78641
cg27214774 C20orf175 FALSE 1.15E-05 2.33238
cg24335895 COX7A1 TRUE 19:41335049-41335665 1.15E-05 2.61414
cg11428724 PAX7 TRUE 1:18829151-18831168 1.17E-05 2.5399
cg08315770 KCNK17 TRUE 6:39388660-39390286 1.17E-05 2.45542
cg20972214 SLC2A3 FALSE 1.18E-05 2.88502
cg12300353 KCTD8 TRUE 4:44143998-44146017 1.19E-05 3.2435
cg21578906 SLC5A4 FALSE 1.19E-05 -2.01611
cg22161476 FSTL5 TRUE 4:163304513-163305102 1.21E-05 2.21267
cg10942056 DISP1 FALSE 1.21E-05 2.07465
cg00463848 KRT2A FALSE 1.21E-05 -2.85786
cg09416313 MATK TRUE 19:3751859-3753013 1.22E-05 3.19361
cg24562819 THBD TRUE 20:22976034-22978669 1.23E-05 2.38859
cg05766474 CCL16 FALSE 1.23E-05 -2.08247

cg17277529 FGF3 TRUE 11:69340815-69343887 1.23E-05 2.31104 cg11732619 SLIT3 TRUE 5:168659641-168661256 1.23E-05 3.42421 cg12294121 GABRB1 TRUE 4:46728135-46728396 1.24E-05 2.28031 cg13948987 HCRTR2 TRUE 6:55147028-55147806 1.25E-05 2.63327 cg15435730 TFAP2D FALSE 1.26E-05 2.92006 cg20916523 VHL TRUE 3:10157841-10160111 1.27E-05 -2.52229 cg14225485 FAM84A TRUE 2:14689650-14690612 1.27E-05 2.93043 cg18106189 C18orf1 TRUE 18:13206965-13209279 1.29E-05 3.32751 cg16933388 BSN TRUE 3:49566406-49567668 1.29E-05 2.24228 cg19965810 KCNH7 TRUE 2:163403754-163404083 1.29E-05 2.4831 cg14294758 LSAMP TRUE 3:117646277-117646814 1.29E-05 2.02467 cg09985279 MT1G TRUE 16:55259326-55259869 1.29E-05 2.07131 cg06771126 HOP FALSE 1.30E-05 -2.05708 cg15433631 IRX2 TRUE 5:2801201-2806084 1.31E-05 2.35367 cg06015218 GRM1 TRUE 6:146391972-146392561 1.31E-05 2.08754 cg15128898 FLJ23577 TRUE 5:35653200-35654237 1.31E-05 -2.05745 cg02442161 PI3 FALSE 1.32E-05 -2.87856 cg08185661 SYT9 TRUE 11:7229230-7230968 1.33E-05 2.89451 cg13928961 K6IRS3 FALSE 1.34E-05 -2.81417 cg21033494 MSMB FALSE 1.34E-05 -2.0936 cg23473904 COL6A2 TRUE 21:46342002-46344009 1.35E-05 2.26209 cg04101379 DZIP1 TRUE 13:95093824-95095248 1.35E-05 2.03047 cg23694248 PTPRR TRUE 12:69600579-69600967 1.36E-05 2.10994 cg19718882 WIT-1 TRUE 11:32414112-32415699 1.37E-05 2.01295 cg20645065 ALPL TRUE 1:21707968-21709033 1.38E-05 3.4197 cg25082710 IVL FALSE 1.38E-05 -2.01819 cg13397379 OR2C3 FALSE 1.39E-05 -2.15449 cg18671950 FBN1 TRUE 15:46723986-46725968 1.39E-05 3.90538

cg25538571 FLJ46365 FALSE 1.41E-05 -2.13395 cg11084611 C2orf27 FALSE 1.46E-05 -2.07184 cg15309006 LOC63928 TRUE 16:23673167-23674440 1.49E-05 2.62238 cg24244000 GABRG3 FALSE 1.50E-05 -2.15262 cg12277666 TDRD5 TRUE 1:177827192-177828306 1.50E-05 2.62053 cg12420104 DMRT3 TRUE 9:966066-967963 1.50E-05 2.48929 cg09126273 PTPRO TRUE 12:15366460-15367398 1.52E-05 2.48358 cg06386517 GRB10 TRUE 7:50827590-50829211 1.52E-05 2.45579 cg02854090 HIST1H2AA TRUE 6:25834320-25834896 1.52E-05 -2.5375 cg11733245 IL2RA TRUE 10:6144187-6144388 1.54E-05 -2.03616 cg15303841 RFPL1 FALSE 1.54E-05 -2.08231 cg01471384 DKK2 TRUE 4:108175935-108177125 1.54E-05 2.13051 cg18815943 FOXE3 TRUE 1:47654351-47655785 1.55E-05 2.37277 cg23413307 LCE1F FALSE 1.55E-05 -2.69364 cg20383064 BFSP2 FALSE 1.55E-05 -2.04691 cg19576304 RAX TRUE 18:55090435-55093173 1.56E-05 2.07668 cg14458834 HOXB4 HSA-MIR-10A TRUE 17:44009922-44011032 1.56E-05 3.945 cg17108819 CD8A TRUE 2:86869360-86871837 1.56E-05 3.35347 cg27120999 HSPA2 TRUE 14:64075878-64079365 1.58E-05 2.12272 cg07947016 KLK2 FALSE 1.58E-05 -2.04963 cg19996355 PBX4 TRUE 19:19589638-19590908 1.60E-05 -2.20698 cg03580247 SLC4A1 FALSE 1.61E-05 -2.18814 cg22396755 RAP1GA1 TRUE 1:21867575-21867913 1.61E-05 2.92681 cg04915566 RUNX1 FALSE 1.63E-05 -2.5814 cg10305797 UNQ467 FALSE 1.63E-05 -2.00411 cg21460081 HOXB4 HSA-MIR-10A TRUE 17:44009922-44011032 1.67E-05 2.70816 cg17162024 UNQ9433 TRUE 8:53639644-53641328 1.68E-05 2.46558 cg00767581 HOXD4 HSA-MIR-10B TRUE 2:176723073-176723586 1.71E-05 2.10733

cg25760229 CACNA1H TRUE 16:1142283-1144714 1.72E-05 2.32142 cg01405107 HOXB5 TRUE 17:44025432-44027432 1.76E-05 2.01939 cg11191210 VGLL2 TRUE 6:117692316-117694016 1.77E-05 2.08009 cg12019109 AZGP1 FALSE 1.81E-05 -2.54405 cg18454685 CACNA1G TRUE 17:45992788-45994919 1.82E-05 2.47863 cg22598028 ZNF660 TRUE 3:44601241-44601817 1.82E-05 2.43331 cg01390445 LIPH TRUE 3:186753961-186754463 1.82E-05 -2.24547 cg24723331 ST8SIA1 TRUE 12:22377615-22380091 1.83E-05 2.04552 cg16878021 FALSE 1.83E-05 -2.01646 cg16446783 MRGPRX4 FALSE 1.85E-05 -2.03176 cg22325703 GPR83 TRUE 11:93773573-93774675 1.86E-05 2.21556 cg26548883 CCL15 FALSE 1.88E-05 -2.44966 cg01424107 CDX2 TRUE 13:27440079-27441627 1.89E-05 2.66855 cg12497564 RBP1 TRUE 3:140740275-140741996 1.92E-05 2.45761 cg27513764 EFCAB3 FALSE 1.92E-05 -2.21138 cg14823162 POU3F2 TRUE 6:99385538-99390983 1.93E-05 2.12028 cg15835232 HLF TRUE 17:50697039-50698891 1.93E-05 2.13587 cg00325491 FN5 FALSE 1.93E-05 -2.43825 cg13550608 SEMA3B TRUE 3:50287560-50290055 1.94E-05 -2.10043 cg01152019 HOXD4 TRUE 2:176723073-176723586 1.96E-05 2.58004 cg10007262 RELN TRUE 7:103416429-103418531 1.96E-05 2.56035 cg08623787 RXRG TRUE 1:163680824-163681333 1.97E-05 2.10445 cg15046675 CD37 FALSE 2.01E-05 -2.04995 cg09179845 C7 FALSE 2.02E-05 2.47627 cg20855565 TRIM58 TRUE 1:246086350-246088254 2.03E-05 2.91737 cg01557297 SLC22A17 TRUE 14:22890874-22892057 2.03E-05 2.57244 cg11868900 EFCAB1 TRUE 8:49810084-49811166 2.05E-05 2.40752 cg10171125 SCARA5 TRUE 8:27906021-27906359 2.05E-05 2.11551

cg15786837 HOXB13 TRUE 17:44161863-44162081 2.06E-05 2.20195 cg01405761 MGC34646 TRUE 8:62362917-62363625 2.06E-05 2.34079 cg03109316 ZNF80 FALSE 2.08E-05 -2.25412 cg25057743 PTHR2 TRUE 2:208979311-208980425 2.08E-05 2.01915 cg01316071 C14orf139 TRUE 14:94946744-94946949 2.13E-05 -2.04829 cg03425110 NEUROG3 TRUE 10:71001367-71003520 2.14E-05 2.86089 cg10801369 RFFL TRUE 17:30441105-30441336 2.14E-05 -2.1745 cg25737664 SIRPD FALSE 2.15E-05 -2.037 cg20498685 TWIST1 TRUE 7:19122442-19124663 2.19E-05 2.66897 cg16192029 ANKRD7 TRUE 7:117651804-117652144 2.20E-05 -2.65543 cg11126134 FLJ14834 TRUE 13:30378164-30379189 2.20E-05 2.52823 cg25228126 FZD2 TRUE 17:39989114-39992456 2.21E-05 2.60854 cg00240312 CDH1 TRUE 16:67328348-67330000 2.22E-05 2.20779 cg16616769 MGC35048 FALSE 2.22E-05 -2.41056 cg06274159 ZFP42 TRUE 4:189153056-189154305 2.24E-05 2.47909 cg10530281 TBX3 TRUE 12:113605002-113608462 2.26E-05 2.34522 cg05158615 NPY TRUE 7:24289504-24291701 2.27E-05 2.05028 cg17703324 OPTC FALSE 2.27E-05 -2.10962 cg26590537 KCNA1 TRUE 12:4888590-4891600 2.30E-05 2.3193 cg24176563 EYA4 TRUE 6:133603260-133605886 2.30E-05 2.5977 cg09220361 GABRG2 FALSE 2.31E-05 2.0669 cg17020834 GRIA1 FALSE 2.32E-05 2.89226 cg24068372 LOC349136 TRUE 7:150736827-150739238 2.34E-05 2.64401 cg03840259 GRAP2 FALSE 2.37E-05 -2.44724 cg27619475 SLC16A5 TRUE 17:70595284-70595964 2.37E-05 2.5118 cg03292388 CPZ TRUE 4:8645191-8645719 2.43E-05 2.70981 cg11011938 SEMA5A TRUE 5:9597569-9600028 2.48E-05 2.13361 cg08383315 RIC3 TRUE 11:8146537-8147616 2.53E-05 3.07074

cg06353345 OR51B4 FALSE 2.56E-05 -2.06214 cg01705587 SV2A TRUE 1:148155815-148156129 2.62E-05 2.75333 cg26898336 TEKT3 TRUE 17:15185000-15186014 2.63E-05 2.56172 cg14795968 ACADL TRUE 2:210797532-210798583 2.64E-05 2.27455 cg04588079 HEBP1 TRUE 12:13044082-13046575 2.66E-05 2.09897 cg02452985 BCL11B TRUE 14:98806507-98807587 2.67E-05 3.17715 cg20082641 PAQR5 TRUE 15:67377209-67377413 2.71E-05 2.45827 cg18482268 POU4F3 TRUE 5:145698070-145700672 2.75E-05 2.30844 cg15742700 BLK FALSE 2.76E-05 -2.42482 cg15584813 SLC38A4 TRUE 12:45505775-45506259 2.81E-05 2.04258 cg23097006 VSX1 TRUE 20:25011730-25013982 2.84E-05 2.45701 cg04713521 PRRX2 TRUE 9:131467236-131468579 2.90E-05 2.14198 cg18508525 CD36 FALSE 2.92E-05 -2.12719 cg23898073 GFRA1 TRUE 10:118020244-118024333 2.93E-05 2.45617 cg07696033 ALX4 TRUE 11:44282161-44283221 2.95E-05 2.07237 cg26673195 G6PC FALSE 2.96E-05 -2.24617 cg18938204 EMILIN3 TRUE 20:39427979-39429312 2.98E-05 2.1106 cg25913233 SPARC FALSE 3.01E-05 2.52079 cg04420907 WDRPUH TRUE 17:9420462-9420957 3.02E-05 -2.34925 cg08434234 DGKI TRUE 7:137181541-137183087 3.05E-05 2.31282 cg21581873 PLEKHA6 FALSE 3.06E-05 2.20699 cg12388309 PAK7 TRUE 20:9766686-9767917 3.16E-05 2.73364 cg21432954 TRPC4 TRUE 13:37341237-37342355 3.18E-05 2.24372 cg00638514 OTOP3 TRUE 17:70443189-70444298 3.19E-05 2.282 cg19461644 COL15A1 TRUE 9:100745603-100747003 3.21E-05 2.04091 cg09551147 SORCS3 TRUE 10:106389391-106393271 3.22E-05 2.46484 cg15279364 TMEM130 TRUE 7:98304821-98306320 3.22E-05 2.09121 cg14900471 GATA4 TRUE 8:11598711-11600557 3.27E-05 3.05856

cg04623837 HCG9 TRUE 6:30051213-30051683 3.29E-05 2.23788 cg04488521 ZNF354C TRUE 5:178419646-178420866 3.30E-05 2.61037 cg25391023 BTNL2 FALSE 3.30E-05 -2.28749 cg04402875 RET TRUE 10:42891849-42893531 3.32E-05 2.3164 cg03665605 ATP5J2 TRUE 7:98901645-98903300 3.33E-05 -2.05749 cg08047457 RASSF1 TRUE 3:50352707-50353696 3.34E-05 4.64022 cg21458907 CADPS TRUE 3:62834482-62837070 3.34E-05 2.87092 cg00933411 DLC1 FALSE 3.37E-05 -2.17609 cg23338993 UGT1A6 FALSE 3.47E-05 -2.02465 cg08097755 VGF TRUE 7:100592520-100596427 3.49E-05 2.02623 cg09914444 DMBX1 FALSE 3.50E-05 -2.17101 cg09873258 DLK1 TRUE 14:100261944-100263514 3.60E-05 2.76454 cg16689634 CYP4X1 TRUE 1:47261718-47262260 3.62E-05 2.08721 cg05546044 MAPK1 TRUE 22:20550998-20553008 3.63E-05 -2.3395 cg09082287 DNAJC6 TRUE 1:65502866-65503230 3.66E-05 2.10512 cg12748258 HR TRUE 8:22043458-22044831 3.69E-05 3.02826 cg26195812 DPYSL5 TRUE 2:26923226-26926873 3.70E-05 2.50516 cg00949442 ABCA3 TRUE 16:2329513-2332346 3.75E-05 2.73269 cg10500909 CYP11B2 FALSE 3.75E-05 -2.23751 cg10057065 C1orf104 TRUE 1:153558675-153562005 3.77E-05 2.03494 cg08918749 LPL TRUE 8:19840992-19842501 3.77E-05 2.02143 cg08080029 CHD5 TRUE 1:6161731-6163979 3.77E-05 3.05071 cg08592761 LARP6 TRUE 15:68932967-68933961 3.79E-05 2.07971 cg26738880 DPP6 FALSE 3.79E-05 -2.37221 cg20113732 NELL1 TRUE 11:20647014-20648558 3.81E-05 2.49708 cg22040627 SLC13A5 TRUE 17:6556978-6558390 3.82E-05 2.10089 cg25549459 POU3F3 TRUE 2:104835039-104840521 3.88E-05 2.26183 cg04278702 HTR1E TRUE 6:87703675-87704446 3.91E-05 2.20313

cg03605761 RNF126 TRUE 19:614958-615478 3.93E-05 -2.05752 cg04599297 GAD2 TRUE 10:26543757-26547625 3.93E-05 2.04354 cg26026726 GPR42 TRUE 19:40554102-40554302 3.94E-05 -2.14295 cg14696870 FCER1A FALSE 3.95E-05 -2.27103 cg09405083 C10orf26 FALSE 4.03E-05 2.40137 cg13191049 DMN TRUE 15:97462288-97464202 4.05E-05 2.5519 cg08505473 PRIMA1 TRUE 14:93323632-93325572 4.07E-05 2.26388 cg07039362 CES7 FALSE 4.08E-05 -2.01299 cg19616230 SLC34A2 TRUE 4:25266077-25266795 4.08E-05 2.39298 cg27622610 OR1G1 FALSE 4.09E-05 -2.06021 cg11668923 ADAM12 TRUE 10:128065540-128067647 4.10E-05 2.30374 cg15452573 SNCA TRUE 4:90976237-90978020 4.20E-05 2.70538 cg22469841 FSTL1 TRUE 3:121651743-121653139 4.23E-05 2.72228 cg09630404 STAR TRUE 8:38127398-38127709 4.24E-05 2.89621 cg23850212 ZFP28 TRUE 19:61741391-61742539 4.29E-05 2.22789 cg03483626 KCNA3 TRUE 1:111017242-111019922 4.30E-05 2.11849 cg05396987 FLJ30834 TRUE 4:122904540-122906207 4.34E-05 2.08176 cg05345286 MDFI TRUE 6:41712413-41714762 4.36E-05 2.39549 cg07168556 WFIKKN2 TRUE 17:46266978-46267743 4.36E-05 2.22949 cg04052038 CLDN8 FALSE 4.36E-05 -2.5478 cg17628717 HECW1 TRUE 7:43317647-43318185 4.36E-05 -2.63561 cg26059153 C20orf133 TRUE 20:13923655-13925157 4.38E-05 2.58938 cg26709720 B3GALT5 FALSE 4.39E-05 -2.5302 cg06806711 MS4A1 FALSE 4.40E-05 -2.09253 cg09083627 SLITRK5 TRUE 13:87121398-87123086 4.42E-05 2.72294 cg19988449 BNC1 TRUE 15:81742719-81745331 4.49E-05 2.17824 cg11241627 FERD3L TRUE 7:19150354-19151721 4.52E-05 2.16197 cg23367478 PDE3A TRUE 12:20412318-20414579 4.57E-05 2.34238

cg01367992 LY9 FALSE 4.60E-05 -2.05855 cg00875272 TAL1 TRUE 1:47467815-47468163 4.63E-05 2.03087 cg07421682 PCSK6 TRUE 15:99846201-99848441 4.65E-05 -2.12039 cg26918645 ZNF285 TRUE 19:49597112-49597803 4.72E-05 2.87335 cg15057581 PTPNS1 TRUE 20:1822372-1824321 4.73E-05 2.08512 cg16581199 TSSK1 FALSE 4.87E-05 -2.13748 cg02780849 GNG4 TRUE 1:233878398-233881230 4.93E-05 2.14975 cg20308679 FRZB TRUE 2:183439392-183439889 4.98E-05 2.03807 cg03848675 FOXF2 TRUE 6:1333978-1336464 5.04E-05 2.15607 cg00243313 IRX4 TRUE 5:1934753-1937496 5.04E-05 2.39858 cg21838334 NES TRUE 1:154912856-154913974 5.07E-05 2.13319 cg04001802 THEDC1 FALSE 5.11E-05 -2.16134 cg05064352 PHF21B TRUE 22:43781383-43785432 5.12E-05 2.04711 cg03026462 FOXA1 TRUE 14:37133338-37135533 5.22E-05 2.08508 cg23211240 BTG4 HSA-MIR-34B/C TRUE 11:110888181-110889264 5.26E-05 2.23887 cg08291098 STMN3 FALSE 5.34E-05 2.06069 cg19777470 CRABP1 TRUE 15:76419520-76421296 5.44E-05 2.58552 cg24496666 GIPC2 TRUE 1:78283822-78285119 5.52E-05 2.02446 cg12153542 REM1 TRUE 20:29526562-29527012 5.74E-05 2.25795 cg24448259 CPVL TRUE 7:29152038-29152803 5.76E-05 2.02277 cg18420965 EPHA5 TRUE 4:66217490-66218382 5.77E-05 2.36638 cg04786857 SPDY1 TRUE 2:28886410-28888374 5.81E-05 2.10766 cg00116234 ADAMTSL1 FALSE 5.81E-05 2.15313 cg04576021 HLA-DOB FALSE 5.88E-05 -2.18236 cg26164184 FCN2 FALSE 5.92E-05 -2.40783 cg06908778 SPAG6 TRUE 10:22673932-22675060 6.05E-05 2.22296 cg21321735 KIF1A TRUE 2:241406746-241409144 6.05E-05 2.0056 cg01248426 ATP6V0D2 FALSE 6.10E-05 -2.19964

cg16009558 GRIK2 TRUE 6:101953329-101954569 6.20E-05 2.09178 cg14383135 NPAS2 TRUE 2:100801289-100804019 6.31E-05 2.50707 cg14533138 KRTHA3B TRUE 17:36779251-36779455 6.34E-05 -2.06311 cg04884908 CYP26B1 TRUE 2:72226820-72228767 6.39E-05 2.00164 cg22477971 C1QB FALSE 6.44E-05 -2.00838 cg00888479 SLC24A3 TRUE 20:19140330-19141909 6.48E-05 2.02951 cg25938646 SLITRK1 TRUE 13:83354079-83354687 6.51E-05 2.49443 cg19306866 KRTAP6-2 TRUE 21:30892865-30893091 6.55E-05 -2.40649 cg13619915 SLITRK3 TRUE 3:166396716-166397896 6.58E-05 2.02844 cg22063989 RSPO1 TRUE 1:37871744-37873510 6.63E-05 2.19672 cg05767404 C1orf150 FALSE 6.66E-05 -2.35577 cg27167601 RORA TRUE 15:59306719-59309462 6.70E-05 2.31161 cg14654926 FGF10 FALSE 6.73E-05 2.03081 cg16319578 HSPA2 TRUE 14:64075878-64079365 6.74E-05 2.23685 cg22886089 SCG3 TRUE 15:49760777-49761411 6.86E-05 2.72455 cg12351433 LHCGR TRUE 2:48835959-48836520 6.91E-05 2.02616 cg13121699 C2orf10 TRUE 2:185170526-185170785 6.93E-05 2.19811 cg25905812 DMRT1 TRUE 9:831240-833434 7.06E-05 2.33363 cg07015629 ERBB4 TRUE 2:213109348-213112340 7.42E-05 2.04942 cg22836229 EFCAB1 TRUE 8:49810084-49811166 7.62E-05 2.21209 cg21172540 TSSK3 TRUE 1:32599283-32600722 7.79E-05 2.17861 cg08528984 PRDM16 TRUE 1:2973688-2977844 7.81E-05 2.32186 cg18755783 SPG20 TRUE 13:35817747-35819118 7.88E-05 2.44639 cg16372520 NRXN3 FALSE 7.96E-05 -2.06789 cg14449051 SLC6A15 TRUE 12:83829166-83831125 8.06E-05 2.02098 cg12815916 PTPN18 TRUE 2:130829416-130831123 8.08E-05 -3.20443 cg08349806 FLJ39599 TRUE 16:15396648-15397719 8.08E-05 2.18963 cg06563300 SLC17A8 FALSE 8.08E-05 2.21905

cg08056146 SOX7 TRUE 8:10624002-10628691 8.12E-05 3.52126 cg04717045 CCND1 TRUE 11:69177944-69178494 8.15E-05 -2.30032 cg15516226 BTNL9 TRUE 5:180400061-180400301 8.18E-05 2.07032 cg02932669 VGCNL1 TRUE 13:100866212-100867382 8.18E-05 2.1191 cg13958614 FOXD1 TRUE 5:72778217-72780764 8.22E-05 2.10286 cg20034100 IBRDC1 FALSE 8.25E-05 -2.01491 cg24826867 IRF8 TRUE 16:84489299-84490591 8.30E-05 2.42997 cg20052718 TWIST1 TRUE 7:19122442-19124663 8.33E-05 2.49169 cg00970325 PAQR9 TRUE 3:144163706-144166078 8.39E-05 2.17141 cg26187237 IGFBP2 TRUE 2:217205664-217207225 8.43E-05 2.40326 cg24891133 FLJ14834 TRUE 13:30378164-30379189 8.46E-05 2.62293 cg25437385 SLC35F3 TRUE 1:232106757-232108354 8.46E-05 2.43928 cg08768421 GDA TRUE 9:73953959-73954858 8.54E-05 2.67721 cg03593419 GABRA4 TRUE 4:46689713-46690724 8.61E-05 2.16315 cg10605520 HRH3 TRUE 20:60227496-60229559 8.73E-05 2.65507 cg19912436 PALM TRUE 19:659474-660815 8.81E-05 2.91775 cg01404615 DKK2 TRUE 4:108177508-108178160 8.88E-05 2.05103 cg18089852 PDE8B TRUE 5:76541665-76543102 9.00E-05 2.76002 cg16584573 FGF8 TRUE 10:103525002-103526643 9.24E-05 2.21225 cg10332700 ZNF415 TRUE 19:58327559-58328183 9.41E-05 2.59076 cg19423014 RNF152 TRUE 18:57710982-57713397 9.42E-05 2.5414 cg08040471 C17orf62 TRUE 17:78000840-78001238 9.57E-05 -2.0339 cg19042062 KCNJ2 TRUE 17:65675902-65677508 9.61E-05 2.01599 cg15992730 GDF3 TRUE 12:7739270-7739504 9.64E-05 2.32932 cg25545210 KRTHA4 FALSE 9.77E-05 -2.02695 cg11248413 NEUROG1 TRUE 5:134898284-134900130 9.79E-05 2.70687 cg15774495 GRB10 TRUE 7:50827590-50829211 9.83E-05 2.06791 cg19392109 FLJ39531 FALSE 0.000100005 -2.09436

cg02090283 CNTNAP3 TRUE 9:39277898-39278810 0.000100212 2.17446 cg24995381 GPR141 FALSE 0.000103015 -2.0053 cg25766774 ZDHHC3 FALSE 0.000107484 2.26273 cg27000831 CCL8 FALSE 0.000107967 -2.42439 cg03294619 NKX2-5 TRUE 5:172593183-172594999 0.000108249 2.31745 cg01857260 SCRL TRUE 19:55357951-55358839 0.000109779 2.05384 cg15883716 ANKRD45 TRUE 1:171905257-171905801 0.000112063 2.36637 cg00777121 RASSF1 TRUE 3:50352707-50353696 0.000114179 3.17844 cg21367957 F10 FALSE 0.000114179 -2.13435 cg20707333 C20orf177 TRUE 20:57947345-57949206 0.000114461 -2.33649 cg15134649 MT1E TRUE 16:55216013-55217311 0.000115357 2.32346 cg20329958 CRH TRUE 8:67251663-67253257 0.000116092 2.14112 cg15461516 CHST1 TRUE 11:45642707-45644150 0.000117747 2.54883 cg06142324 FLJ25530 TRUE 11:124310692-124310920 0.000120328 -2.18799 cg16652259 DLX1 TRUE 2:172657345-172659187 0.000121208 2.01425 cg06088032 ZMYND10 TRUE 3:50357771-50358563 0.000121994 -2.16116 cg20023231 HS3ST2 TRUE 16:22732005-22734135 0.000122172 2.00456 cg09601629 FLJ40365 TRUE 19:14982467-14983303 0.000122971 2.08803 cg05697976 MLSTD1 FALSE 0.000123171 -2.04373 cg20881054 VASH1 TRUE 14:76296726-76298691 0.000123197 2.57168 cg25990647 OPRK1 TRUE 8:54325750-54327151 0.000127972 2.03928 cg02983451 KLF11 TRUE 2:10099704-10102471 0.000132819 2.00519 cg13562542 GPR27 TRUE 3:71885059-71887021 0.000133339 3.12036 cg26516759 BMP7 TRUE 20:55273504-55275954 0.000133513 2.09321 cg10617171 SCN1B TRUE 19:40212928-40214342 0.00013434 2.05248 cg19857541 MORC1 TRUE 3:110319373-110319935 0.00013434 -2.06261 cg01726767 LALBA FALSE 0.000135122 -2.00862 cg21461100 NOVA2 TRUE 19:51168270-51169615 0.000138317 2.66287

cg10743104 PDPN TRUE 1:13782043-13783604 0.000140439 2.26022 cg22843446 ASAM TRUE 11:122570563-122572659 0.000142579 2.09143 cg22190114 NALP8 FALSE 0.000142635 -2.40122 cg22199118 C8orf34 FALSE 0.000143029 -2.09057 cg05241571 UNQ467 FALSE 0.000143207 -2.04696 cg04391111 TP73 TRUE 1:3556965-3559547 0.000143722 2.44251 cg13847070 KLHL3 FALSE 0.000143722 2.07019 cg20357628 PHACTR3 TRUE 20:57612715-57614362 0.000143838 2.31004 cg01200060 SCRT2 TRUE 20:605211-605426 0.000145199 2.24323 cg25764191 INA TRUE 10:105026442-105028852 0.000145261 2.50526 cg15755084 MSX1 TRUE 4:4912015-4913147 0.000145782 2.24553 cg21087137 MGC26856 TRUE 12:74014606-74014870 0.000148169 2.26632 cg25229305 KCNK18 FALSE 0.000149695 -2.01041 cg23244913 HCG9 TRUE 6:30051213-30051683 0.000156182 2.12926 cg17786776 FKBP9 TRUE 7:32962641-32964267 0.000157439 -2.01393 cg07017374 FLT3 TRUE 13:27571646-27573333 0.000158062 2.55686 cg21949781 PSTPIP2 TRUE 18:41905637-41906628 0.000160598 2.00335 cg06912252 C9orf125 TRUE 9:103287963-103289481 0.00016132 2.02631 cg00725635 B3GALT2 FALSE 0.00016242 2.20686 cg23472215 GSTM3 TRUE 1:110083766-110084903 0.00016352 2.00543 cg10298815 GPR88 TRUE 1:100777010-100778568 0.000163876 2.08615 cg21546671 HOXB4 HSA-MIR-10A TRUE 17:44009922-44011032 0.000165484 2.96422 cg27049761 B3GNT4 TRUE 12:121253265-121255264 0.000166069 2.10571 cg24625128 JAM3 TRUE 11:133443878-133445066 0.000168618 2.56496 cg04532952 CA4 TRUE 17:55581878-55583315 0.000168887 2.02595 cg13589108 FAM5B TRUE 1:175406600-175407433 0.000171585 2.67788 cg10054857 C18orf20 FALSE 0.000173495 -2.18597 cg10143146 COL11A2 TRUE 6:33268548-33269470 0.000174734 2.18365

cg05810550 DEFB106A FALSE 0.000178926 -2.13188 cg23187653 PCSK1 TRUE 5:95794255-95794915 0.000180626 2.24139 cg21554552 RASSF1 TRUE 3:50352707-50353696 0.000187571 3.5473 cg25119415 MNDA FALSE 0.000191108 -2.20815 cg26054540 NKD2 TRUE 5:1061673-1063043 0.000191606 2.36082 cg20330472 EYA4 TRUE 6:133603260-133605886 0.000191629 2.13328 cg22879515 BTG4 HSA-MIR-34B/C TRUE 11:110888181-110889264 0.000196037 2.49173 cg04491443 PDILT FALSE 0.000198132 -2.24701 cg08268099 OLFM1 FALSE 0.000199255 -2.10083 cg24512400 KLK10 TRUE 19:56213378-56214712 0.000199605 2.28733 cg06539804 CPXM TRUE 20:2728596-2730091 0.000200098 2.04592 cg13274713 TBX2 TRUE 17:56827708-56833527 0.000200621 2.05277 cg05485062 SERPINA12 FALSE 0.000202005 -2.07942 cg07906724 CHRNA6 FALSE 0.000202386 -2.01519 cg17740399 IPF1 TRUE 13:27391843-27392974 0.000202997 2.66458 cg20625138 UTS2 FALSE 0.000203437 2.0768 cg17778120 RBP2 FALSE 0.000206501 -2.09262 cg08460026 CTLA4 FALSE 0.000207019 -2.5615 cg21303011 THRB TRUE 3:24510366-24512597 0.000209528 -2.09089 cg13921352 FAM19A4 TRUE 3:69063227-69065264 0.000212054 2.28721 cg22036988 SPSB4 TRUE 3:142252136-142254676 0.000212368 2.62658 cg04034767 GRASP TRUE 12:50686496-50688060 0.000212856 3.03805 cg23559331 KCNH4 TRUE 17:37585717-37587171 0.000213104 2.26509 cg00400263 C20orf177 TRUE 20:57947345-57949206 0.000214056 -2.98758 cg09495977 HTRA3 TRUE 4:8322116-8323142 0.000215264 2.00253 cg17788682 ARNT2 TRUE 15:78482994-78484493 0.000217329 2.33927 cg12910797 HOXB3 TRUE 17:44006560-44006958 0.000220897 2.35964 cg27257987 PSG4 FALSE 0.00022338 -2.11401

cg24446548 TWIST1 TRUE 7:19122442-19124663 0.000225325 2.43456 cg22594309 SYT2 TRUE 1:200945489-200946481 0.000226533 2.68512 cg03382304 JAM2 TRUE 21:25933195-25934679 0.000235136 2.14668 cg16377872 SLC1A6 TRUE 19:14945806-14946065 0.000240605 -2.03901 cg14409941 ADAMTS2 TRUE 5:178703214-178705508 0.000242966 2.59447 cg16608652 B3GALT2 FALSE 0.00024753 2.53382 cg15817236 ALX4 TRUE 11:44286979-44288424 0.000251823 2.49784 cg16173109 FLJ38379 TRUE 2:242594067-242595375 0.000252185 -2.10359 cg16539629 C14orf132 TRUE 14:95574710-95576343 0.000254723 2.25484 cg00117172 RUNX3 TRUE 1:25127692-25131906 0.000256331 2.41981 cg03167883 FLJ46365 FALSE 0.000256564 -2.11312 cg13694867 SIM2 TRUE 21:36989945-36994434 0.000257905 2.03712 cg05288803 TIMP3 TRUE 22:31527292-31528286 0.000264246 2.31261 cg17194182 EPO TRUE 7:100155958-100156762 0.000264264 2.06575 cg24765446 WFDC6 FALSE 0.000266356 -2.01466 cg21032583 LMLN TRUE 3:199170124-199170535 0.000266391 -2.06396 cg11225410 SOCS2 TRUE 12:92490435-92491891 0.000268189 2.13651 cg06497752 COL9A3 TRUE 20:60918098-60919487 0.000273442 2.12754 cg03549571 DHH TRUE 12:47774151-47774653 0.000274229 2.23451 cg08009622 COL12A1 TRUE 6:75971040-75974382 0.000278791 2.10174 cg21688264 SNAP91 TRUE 6:84474137-84476277 0.000279041 2.48087 cg26550234 SLC39A12 TRUE 10:18279665-18279982 0.000282165 -2.04801 cg14659547 RETNLB FALSE 0.00028383 -2.09648 cg04172348 SYN2 TRUE 3:12020118-12021768 0.000291978 2.00484 cg23276695 CNR1 FALSE 0.000295254 -2.05478 cg01722994 GRIN2A TRUE 16:10183716-10185105 0.00029543 2.07622 cg18952560 PTPNS1 TRUE 20:1822372-1824321 0.000296612 2.0956 cg05071677 OR2V2 FALSE 0.000302223 -2.05927

cg12686016 HOXA1 TRUE 7:27101711-27103391 0.000308502 2.16945 cg02126753 AEBP1 TRUE 7:44110152-44111361 0.000309384 2.26893 cg25058957 RAXL1 TRUE 19:3722529-3722766 0.00032343 -2.08614 cg18630040 PLA2G7 TRUE 6:46809996-46811568 0.000325916 2.52521 cg26304237 DNAJC6 TRUE 1:65502866-65503230 0.000326626 2.36126 cg18389810 C14orf8 TRUE 14:20568722-20568923 0.000330581 -2.19689 cg08186362 HRH3 TRUE 20:60227496-60229559 0.000331051 2.19392 cg14223995 UCP1 TRUE 4:141709103-141709997 0.000338374 2.11142 cg26390526 FLG FALSE 0.000342588 -2.30803 cg03168582 DMRT1 TRUE 9:831240-833434 0.000344196 3.15062 cg10775273 CRYBA2 TRUE 2:219565851-219567355 0.000365544 3.01432 cg23587449 LRAT TRUE 4:155882485-155885476 0.000369312 2.01727 cg12322132 IGF2AS TRUE 11:2121548-2122565 0.000374094 2.30059 cg23146358 CDKN1C TRUE 11:2861494-2864350 0.000377889 2.32802 cg13877915 ZNF132 TRUE 19:63642983-63644185 0.000389433 2.23357 cg04988423 ALX4 TRUE 11:44286979-44288424 0.000393571 2.4316 cg24844534 C1QTNF1 TRUE 17:74531570-74532529 0.00039872 2.0759 cg21359747 ALDH1A3 TRUE 15:99236455-99238833 0.000401435 2.50875 cg03663215 ADCY2 TRUE 5:7448835-7450137 0.000406317 2.00052 cg01231779 NID2 TRUE 14:51604216-51606602 0.000410995 2.27497 cg14070647 RSPO2 TRUE 8:109162808-109165294 0.000429672 2.0454 cg09874752 SFRP5 TRUE 10:99520826-99522150 0.000431921 2.13622 cg19427610 WIF1 TRUE 12:63801033-63802623 0.000437729 2.00402 cg14009688 CALD1 FALSE 0.000439708 2.01519 cg24921858 BCL2L14 FALSE 0.000439782 -2.0528 cg06722216 NOL4 TRUE 18:30056047-30057926 0.000454793 2.14686 cg23922454 LHX5 TRUE 12:112393114-112395188 0.000462301 2.08055 cg08097882 POU4F1 TRUE 13:78073448-78076158 0.000465821 2.63877

cg26365854 ALX4 TRUE 11:44286979-44288424 0.000469216 2.20925 cg05209917 CNTN6 FALSE 0.000471469 2.0471 cg19118812 ELMO1 TRUE 7:37453368-37455385 0.000480086 2.76835 cg26164310 LPPR4 TRUE 1:99502292-99503207 0.000480251 2.45594 cg04970352 ALX4 TRUE 11:44283505-44284554 0.000484255 2.07665 cg03700462 HOXA1 TRUE 7:27101711-27103391 0.000486655 2.42153 cg06437862 TUBA2 TRUE 13:18653806-18654153 0.0005107 -2.10894 cg14896516 CRHR2 TRUE 7:30687736-30689084 0.000523117 2.53492 cg17298704 CLDN18 TRUE 3:139200225-139200459 0.000536343 -2.48698 cg01086895 DCHS1 TRUE 11:6632917-6634478 0.000554457 2.00883 cg01805282 EYA4 TRUE 6:133603260-133605886 0.000571345 2.01375 cg22202141 FCGR3A FALSE 0.000573457 -2.14789 cg13359415 LGI2 TRUE 4:24640989-24642011 0.000586467 2.22572 cg11591325 F2R TRUE 5:76046156-76048713 0.000586753 2.42832 cg17030820 MSMB FALSE 0.000607625 -2.23919 cg23828595 PRKG1 TRUE 10:52503354-52504907 0.000612021 2.04561 cg18489434 VASH1 TRUE 14:76296726-76298691 0.000612254 3.09332 cg02144933 AOX1 TRUE 2:201158654-201159798 0.000619171 2.03779 cg19404832 TFAP2D FALSE 0.000620238 2.01627 cg05937453 SFRP5 TRUE 10:99520826-99522150 0.000740669 2.24883 cg18450227 MAPK4 TRUE 18:46444299-46444864 0.000745274 -2.55987 cg06572160 KCNC3 TRUE 19:55522929-55526307 0.000839804 2.29715 cg06276653 C8ORFK32 FALSE 0.000849984 -2.01182 cg21303386 RGS7 TRUE 1:239586193-239587613 0.000885705 2.00196 cg20249919 PCSK6 TRUE 15:99846201-99848441 0.000892491 2.32022 cg09440243 PTPRD FALSE 0.000905924 -2.02351 cg26309134 ZNF542 TRUE 19:61570983-61571926 0.000962984 2.80287 cg07663789 NPR3 TRUE 5:32745529-32747303 0.000975565 2.02362

cg08958015 CCDC65 TRUE 12:47583729-47584457 0.0010212 2.48867 cg23129478 ST8SIA5 TRUE 18:42589800-42592268 0.00103859 2.09782 cg05222924 WT1 TRUE 11:32406516-32407359 0.00107946 2.10247 cg02361557 NTNG1 TRUE 1:107483962-107486112 0.00108633 2.00554 cg03421687 ZMYND10 TRUE 3:50357771-50358563 0.00119853 -2.04736 cg18877506 PDPN TRUE 1:13782043-13783604 0.00120225 2.08212 cg04684516 SNCAIP TRUE 5:121675020-121676508 0.00120374 2.51498 cg26646370 SHD TRUE 19:4229703-4231365 0.00121198 2.21104 cg16340268 ITPKB TRUE 1:224991084-224993760 0.00134663 2.07506 cg10784090 CLDN18 TRUE 3:139200225-139200459 0.0013717 -2.0268 cg16778809 ADAM23 TRUE 2:207015632-207017538 0.00145113 2.68697 cg08453021 ELMO1 TRUE 7:37453368-37455385 0.00152709 2.32568 cg11599505 C20orf102 TRUE 20:35964157-35965744 0.00164272 -2.22852 cg23366752 DNAJA4 TRUE 15:76343418-76344791 0.00173765 -2.29608 cg17872757 FLI1 TRUE 11:128067557-128070534 0.00181661 2.23753 cg15479752 FFAR2 TRUE 19:40632541-40633082 0.00184054 -2.09277 cg25886284 ZNF545 TRUE 19:41600507-41601830 0.00198886 2.33297 cg11930592 MSX1 TRUE 4:4912015-4913147 0.00205674 2.12538 cg12741420 IRF4 TRUE 6:336092-339080 0.00227594 2.10377 cg23208152 ZMYND10 TRUE 3:50357771-50358563 0.00233252 -2.2943 cg08853659 CLSTN2 TRUE 3:141135928-141137912 0.0025211 2.07823 cg21530890 SOX8 TRUE 16:970186-973037 0.00252559 2.06505 cg11920519 MAP1LC3A TRUE 20:32598592-32598809 0.00276767 -2.01461 cg27650175 DAB2IP TRUE 9:123500662-123502056 0.0030311 2.29451 cg27637521 SOCS3 TRUE 17:73866027-73868731 0.00309219 -2.21535 cg18913951 TMEM45B FALSE 0.00330854 -2.00733 cg26615830 MSX1 TRUE 4:4912015-4913147 0.00333685 2.24481 cg06377278 RUNX3 TRUE 1:25127692-25131906 0.00337294 3.08687

cg14174099 SLC8A3 TRUE 14:69724727-69725968 0.00339537 2.14036 cg20295671 YPEL1 TRUE 22:20419518-20420996 0.00346265 2.01973 cg04609859 HOXB4 TRUE 17:44009922-44011032 0.00360603 2.70709 cg20530314 AGTR1 TRUE 3:149898032-149898802 0.00394647 2.0098 cg15538820 OBP2B FALSE 0.00418138 -2.10851 cg09864712 RHBDL1 TRUE 16:666628-667223 0.00463539 -2.83705 cg19876444 CA2 TRUE 8:86562612-86564342 0.00498334 2.05276 cg26728886 C1orf118 FALSE 0.00507973 2.47947 cg16175725 TCF1 TRUE 12:119901088-119901290 0.00533986 -2.06085 cg13801416 AKR1B1 TRUE 7:133793468-133794751 0.00543372 2.37784 cg21553524 MGC33926 TRUE 2:39745656-39747607 0.00575663 2.20262 cg22487322 IL20RA TRUE 6:137406829-137408018 0.00606992 -2.04674 cg18342279 ZAR1 TRUE 4:48186609-48188513 0.00696502 2.29725 cg06905514 CAMK2B TRUE 7:44330839-44332219 0.00710189 2.3442 cg06911113 UBXD3 TRUE 1:20384918-20385413 0.00854348 2.08086 cg16557944 GPX7 TRUE 1:52840357-52841350 0.00904729 2.30136 cg18416881 AKR1B1 TRUE 7:133793468-133794751 0.00945958 2.18597 cg16994506 CCND2 HS_185.1 TRUE 12:4252826-4255273 0.0095895 2.0014 cg06444558 PBK FALSE 0.0138487 2.21656 cg12382902 CCND2 HS_185.1 TRUE 12:4252826-4255273 0.0150115 2.28745 cg07981910 DAB2IP TRUE 9:123500662-123502056 0.0153695 2.18136 cg23653187 ADPN TRUE 22:42650559-42652185 0.0305049 2.37956 cg23857226 ZNF671 TRUE 19:62930217-62931307 0.0308336 2.51812 cg03169527 C3orf31 TRUE 3:11862686-11863678 0.0308639 2.20277          


Table S2. Functional characterization of genes hypermethylated in tumors using Gene Ontology and INTERPRO.

Category | Term | Count | Fold Enrichment | FDR
INTERPRO | IPR017970:Homeobox, conserved site | 64 | 6.96 | 3.46E-32
INTERPRO | IPR001356:Homeobox | 63 | 6.77 | 7.66E-31
INTERPRO | IPR012287:Homeodomain-related | 63 | 6.68 | 1.70E-30
GOTERM_MF_FAT | GO:0003700~transcription factor activity | 124 | 2.88 | 8.99E-25
GOTERM_MF_FAT | GO:0043565~sequence-specific DNA binding | 95 | 3.55 | 1.09E-24
GOTERM_BP_FAT | GO:0030182~neuron differentiation | 77 | 3.97 | 2.50E-22
GOTERM_BP_FAT | GO:0001501~skeletal system development | 64 | 4.53 | 3.51E-21
GOTERM_BP_FAT | GO:0048598~embryonic morphogenesis | 60 | 4.41 | 4.68E-19
GOTERM_BP_FAT | GO:0007267~cell-cell signaling | 85 | 3.20 | 1.61E-18
GOTERM_BP_FAT | GO:0007389~pattern specification process | 55 | 4.65 | 2.53E-18
GOTERM_BP_FAT | GO:0003002~regionalization | 46 | 5.27 | 4.12E-17
GOTERM_CC_FAT | GO:0044459~plasma membrane part | 165 | 1.89 | 1.24E-14
GOTERM_MF_FAT | GO:0030528~transcription regulator activity | 138 | 2.07 | 2.35E-14
GOTERM_BP_FAT | GO:0019226~transmission of nerve impulse | 56 | 3.61 | 2.00E-13
GOTERM_BP_FAT | GO:0006355~regulation of transcription, DNA-dependent | 151 | 1.92 | 7.88E-13
GOTERM_CC_FAT | GO:0031226~intrinsic to plasma membrane | 108 | 2.24 | 9.10E-13
GOTERM_BP_FAT | GO:0051252~regulation of RNA metabolic process | 152 | 1.89 | 2.36E-12
GOTERM_CC_FAT | GO:0005887~integral to plasma membrane | 105 | 2.23 | 3.94E-12
GOTERM_CC_FAT | GO:0005886~plasma membrane | 232 | 1.55 | 5.75E-12
GOTERM_BP_FAT | GO:0009952~anterior/posterior pattern formation | 33 | 5.32 | 1.54E-11
GOTERM_BP_FAT | GO:0048706~embryonic skeletal system development | 25 | 7.33 | 2.11E-11
GOTERM_BP_FAT | GO:0048562~embryonic organ morphogenesis | 32 | 5.43 | 2.29E-11
GOTERM_CC_FAT | GO:0044421~extracellular region part | 90 | 2.36 | 2.18E-11
GOTERM_BP_FAT | GO:0007268~synaptic transmission | 48 | 3.64 | 3.52E-11
GOTERM_BP_FAT | GO:0048705~skeletal system morphogenesis | 29 | 5.85 | 6.73E-11


GOTERM_BP_FAT | GO:0048666~neuron development | 51 | 3.40 | 7.53E-11
GOTERM_BP_FAT | GO:0035295~tube development | 40 | 4.11 | 1.35E-10
GOTERM_BP_FAT | GO:0048568~embryonic organ development | 35 | 4.60 | 2.13E-10
GOTERM_BP_FAT | GO:0048732~gland development | 31 | 5.19 | 2.40E-10
GOTERM_BP_FAT | GO:0045165~cell fate commitment | 31 | 5.04 | 5.45E-10
INTERPRO | IPR001827:Homeobox protein, antennapedia type, conserved site | 14 | 14.72 | 1.36E-09
GOTERM_BP_FAT | GO:0043009~chordate embryonic development | 48 | 3.28 | 1.86E-09
GOTERM_BP_FAT | GO:0009792~embryonic development ending in birth or egg hatching | 48 | 3.25 | 2.60E-09
GOTERM_CC_FAT | GO:0005576~extracellular region | 140 | 1.76 | 6.32E-09
GOTERM_BP_FAT | GO:0048736~appendage development | 25 | 5.48 | 2.34E-08
GOTERM_BP_FAT | GO:0060173~limb development | 25 | 5.48 | 2.34E-08
GOTERM_BP_FAT | GO:0030326~embryonic limb morphogenesis | 23 | 5.97 | 2.95E-08
GOTERM_BP_FAT | GO:0035113~embryonic appendage morphogenesis | 23 | 5.97 | 2.95E-08
GOTERM_BP_FAT | GO:0048704~embryonic skeletal system morphogenesis | 19 | 7.53 | 3.55E-08
GOTERM_BP_FAT | GO:0035108~limb morphogenesis | 24 | 5.47 | 6.69E-08
GOTERM_BP_FAT | GO:0035107~appendage morphogenesis | 24 | 5.47 | 6.69E-08
GOTERM_BP_FAT | GO:0031175~neuron projection development | 38 | 3.35 | 3.10E-07
GOTERM_BP_FAT | GO:0035239~tube morphogenesis | 26 | 4.62 | 4.32E-07
GOTERM_MF_FAT | GO:0022836~gated channel activity | 42 | 3.07 | 3.87E-07
GOTERM_CC_FAT | GO:0031012~extracellular matrix | 42 | 3.07 | 3.67E-07
GOTERM_BP_FAT | GO:0048545~response to steroid hormone stimulus | 32 | 3.76 | 5.96E-07
GOTERM_BP_FAT | GO:0001655~urogenital system development | 24 | 4.93 | 6.45E-07
GOTERM_MF_FAT | GO:0031420~alkali metal ion binding | 35 | 3.48 | 5.70E-07
GOTERM_BP_FAT | GO:0006357~regulation of transcription from RNA polymerase II promoter | 71 | 2.21 | 8.46E-07
GOTERM_BP_FAT | GO:0030902~hindbrain development | 18 | 6.78 | 8.76E-07
GOTERM_CC_FAT | GO:0043005~neuron projection | 41 | 3.02 | 9.80E-07
GOTERM_BP_FAT | GO:0045761~regulation of adenylate cyclase activity | 22 | 5.18 | 1.57E-06
GOTERM_BP_FAT | GO:0009719~response to endogenous stimulus | 48 | 2.68 | 2.19E-06


GOTERM_BP_FAT | GO:0031279~regulation of cyclase activity | 22 | 5.02 | 2.86E-06
GOTERM_BP_FAT | GO:0051339~regulation of lyase activity | 22 | 4.92 | 4.21E-06
GOTERM_BP_FAT | GO:0030817~regulation of cAMP biosynthetic process | 22 | 4.92 | 4.21E-06
GOTERM_BP_FAT | GO:0030814~regulation of cAMP metabolic process | 22 | 4.82 | 6.12E-06
GOTERM_CC_FAT | GO:0005578~proteinaceous extracellular matrix | 38 | 2.99 | 6.01E-06
GOTERM_BP_FAT | GO:0048812~neuron projection morphogenesis | 32 | 3.39 | 8.09E-06
GOTERM_BP_FAT | GO:0009725~response to hormone stimulus | 44 | 2.71 | 8.52E-06
GOTERM_BP_FAT | GO:0030001~metal ion transport | 51 | 2.48 | 8.67E-06
GOTERM_BP_FAT | GO:0030030~cell projection organization | 44 | 2.70 | 9.25E-06
GOTERM_MF_FAT | GO:0022843~voltage-gated cation channel activity | 26 | 4.01 | 8.30E-06
GOTERM_BP_FAT | GO:0045449~regulation of transcription | 173 | 1.50 | 9.69E-06
GOTERM_BP_FAT | GO:0051960~regulation of nervous system development | 30 | 3.53 | 1.08E-05
GOTERM_CC_FAT | GO:0045202~synapse | 40 | 2.84 | 9.73E-06
GOTERM_MF_FAT | GO:0022832~voltage-gated channel activity | 30 | 3.49 | 1.22E-05
GOTERM_MF_FAT | GO:0005244~voltage-gated ion channel activity | 30 | 3.49 | 1.22E-05
GOTERM_CC_FAT | GO:0044456~synapse part | 32 | 3.28 | 1.48E-05
GOTERM_BP_FAT | GO:0030808~regulation of nucleotide biosynthetic process | 22 | 4.52 | 2.11E-05
GOTERM_BP_FAT | GO:0030802~regulation of cyclic nucleotide biosynthetic process | 22 | 4.52 | 2.11E-05
GOTERM_BP_FAT | GO:0019932~second-messenger-mediated signaling | 33 | 3.17 | 2.39E-05
GOTERM_MF_FAT | GO:0005267~potassium channel activity | 24 | 4.09 | 2.45E-05
GOTERM_MF_FAT | GO:0022803~passive transmembrane transporter activity | 46 | 2.52 | 2.66E-05
GOTERM_BP_FAT | GO:0016477~cell migration | 36 | 2.95 | 3.23E-05
GOTERM_BP_FAT | GO:0030799~regulation of cyclic nucleotide metabolic process | 22 | 4.40 | 3.48E-05
GOTERM_BP_FAT | GO:0007155~cell adhesion | 65 | 2.10 | 3.85E-05
GOTERM_BP_FAT | GO:0022610~biological adhesion | 65 | 2.09 | 3.98E-05
GOTERM_BP_FAT | GO:0048729~tissue morphogenesis | 28 | 3.51 | 4.22E-05
GOTERM_BP_FAT | GO:0007187~G-protein signaling, coupled to cyclic nucleotide second messenger | 22 | 4.32 | 4.80E-05
GOTERM_BP_FAT | GO:0060284~regulation of cell development | 30 | 3.31 | 4.89E-05


GOTERM_BP_FAT | GO:0006140~regulation of nucleotide metabolic process | 22 | 4.28 | 5.63E-05
GOTERM_BP_FAT | GO:0044057~regulation of system process | 38 | 2.78 | 5.87E-05
GOTERM_BP_FAT | GO:0048858~cell projection morphogenesis | 33 | 3.04 | 6.57E-05
GOTERM_MF_FAT | GO:0030955~potassium ion binding | 23 | 4.07 | 5.65E-05
INTERPRO | IPR003968:Potassium channel, voltage dependent, Kv | 11 | 10.28 | 6.35E-05
GOTERM_MF_FAT | GO:0015267~channel activity | 45 | 2.47 | 7.00E-05
GOTERM_MF_FAT | GO:0022838~substrate specific channel activity | 44 | 2.50 | 7.19E-05
GOTERM_BP_FAT | GO:0019935~cyclic-nucleotide-mediated signaling | 23 | 4.00 | 9.43E-05
GOTERM_BP_FAT | GO:0007166~cell surface receptor linked signal transduction | 130 | 1.58 | 9.52E-05
GOTERM_MF_FAT | GO:0005216~ion channel activity | 43 | 2.52 | 8.42E-05
INTERPRO | IPR005821:Ion transport | 20 | 4.59 | 8.99E-05
GOTERM_BP_FAT | GO:0045935~positive regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolic process | 59 | 2.14 | 1.04E-04
GOTERM_BP_FAT | GO:0015672~monovalent inorganic cation transport | 38 | 2.70 | 1.25E-04
GOTERM_BP_FAT | GO:0006928~cell motion | 49 | 2.33 | 1.27E-04
GOTERM_BP_FAT | GO:0050767~regulation of neurogenesis | 26 | 3.54 | 1.29E-04
GOTERM_BP_FAT | GO:0051173~positive regulation of nitrogen compound metabolic process | 60 | 2.10 | 1.31E-04
GOTERM_BP_FAT | GO:0045893~positive regulation of transcription, DNA-dependent | 49 | 2.32 | 1.44E-04
INTERPRO | IPR001828:Extracellular ligand-binding receptor | 12 | 8.41 | 1.31E-04
GOTERM_BP_FAT | GO:0014033~neural crest cell differentiation | 12 | 8.21 | 1.63E-04
GOTERM_BP_FAT | GO:0014032~neural crest cell development | 12 | 8.21 | 1.63E-04
GOTERM_BP_FAT | GO:0007423~sensory organ development | 31 | 3.06 | 1.65E-04
GOTERM_BP_FAT | GO:0051254~positive regulation of RNA metabolic process | 49 | 2.30 | 1.86E-04
GOTERM_BP_FAT | GO:0032990~cell part morphogenesis | 33 | 2.91 | 1.86E-04
GOTERM_MF_FAT | GO:0003677~DNA binding | 153 | 1.49 | 1.67E-04
GOTERM_BP_FAT | GO:0031328~positive regulation of cellular biosynthetic process | 62 | 2.04 | 2.13E-04
GOTERM_BP_FAT | GO:0006813~potassium ion transport | 25 | 3.53 | 2.54E-04
GOTERM_BP_FAT | GO:0043627~response to estrogen stimulus | 20 | 4.30 | 2.57E-04
GOTERM_BP_FAT | GO:0042127~regulation of cell proliferation | 68 | 1.95 | 2.69E-04


GOTERM_MF_FAT | GO:0005261~cation channel activity | 34 | 2.80 | 2.43E-04
GOTERM_CC_FAT | GO:0005615~extracellular space | 57 | 2.10 | 2.33E-04
GOTERM_BP_FAT | GO:0010628~positive regulation of gene expression | 55 | 2.14 | 3.05E-04
GOTERM_BP_FAT | GO:0001822~kidney development | 19 | 4.47 | 3.12E-04
GOTERM_BP_FAT | GO:0009891~positive regulation of biosynthetic process | 62 | 2.01 | 3.57E-04
GOTERM_BP_FAT | GO:0006812~cation transport | 53 | 2.16 | 3.65E-04
GOTERM_BP_FAT | GO:0031644~regulation of neurological system process | 24 | 3.54 | 4.44E-04
GOTERM_BP_FAT | GO:0048870~cell motility | 36 | 2.65 | 4.66E-04
GOTERM_BP_FAT | GO:0051674~localization of cell | 36 | 2.65 | 4.66E-04
GOTERM_BP_FAT | GO:0060485~mesenchyme development | 14 | 6.08 | 4.98E-04
GOTERM_BP_FAT | GO:0001755~neural crest cell migration | 10 | 9.82 | 5.06E-04
GOTERM_BP_FAT | GO:0006811~ion transport | 66 | 1.94 | 5.20E-04
GOTERM_MF_FAT | GO:0005249~voltage-gated potassium channel activity | 19 | 4.31 | 4.81E-04
INTERPRO | IPR003091:Voltage-dependent potassium channel | 11 | 8.41 | 5.62E-04
GOTERM_BP_FAT | GO:0045941~positive regulation of transcription | 53 | 2.12 | 6.74E-04
GOTERM_BP_FAT | GO:0007409~axonogenesis | 27 | 3.16 | 6.84E-04
GOTERM_MF_FAT | GO:0046873~metal ion transmembrane transporter activity | 37 | 2.56 | 6.22E-04
GOTERM_BP_FAT | GO:0045944~positive regulation of transcription from RNA polymerase II promoter | 40 | 2.43 | 8.13E-04
GOTERM_MF_FAT | GO:0008066~glutamate receptor activity | 11 | 8.04 | 7.45E-04
GOTERM_BP_FAT | GO:0048667~cell morphogenesis involved in neuron differentiation | 28 | 3.03 | 9.60E-04
GOTERM_BP_FAT | GO:0007190~activation of adenylate cyclase activity | 14 | 5.75 | 1.01E-03
GOTERM_BP_FAT | GO:0006836~neurotransmitter transport | 17 | 4.63 | 1.02E-03
GOTERM_BP_FAT | GO:0010557~positive regulation of macromolecule biosynthetic process | 58 | 2.00 | 1.16E-03
GOTERM_BP_FAT | GO:0048483~autonomic nervous system development | 9 | 10.70 | 1.21E-03
GOTERM_BP_FAT | GO:0048754~branching morphogenesis of a tube | 15 | 5.21 | 1.26E-03
GOTERM_BP_FAT | GO:0045762~positive regulation of adenylate cyclase activity | 14 | 5.65 | 1.26E-03
GOTERM_BP_FAT | GO:0010033~response to organic substance | 62 | 1.94 | 1.27E-03
GOTERM_BP_FAT | GO:0031281~positive regulation of cyclase activity | 14 | 5.55 | 1.56E-03


INTERPRO | IPR001791:Laminin G | 12 | 6.73 | 1.60E-03
GOTERM_BP_FAT | GO:0000902~cell morphogenesis | 38 | 2.41 | 2.12E-03
GOTERM_BP_FAT | GO:0051349~positive regulation of lyase activity | 14 | 5.36 | 2.38E-03
GOTERM_BP_FAT | GO:0048663~neuron fate commitment | 12 | 6.45 | 2.50E-03
GOTERM_CC_FAT | GO:0030054~cell junction | 45 | 2.19 | 1.93E-03
GOTERM_BP_FAT | GO:0035270~endocrine system development | 15 | 4.91 | 2.71E-03
GOTERM_BP_FAT | GO:0014031~mesenchymal cell development | 13 | 5.76 | 2.89E-03
GOTERM_BP_FAT | GO:0048762~mesenchymal cell differentiation | 13 | 5.76 | 2.89E-03
GOTERM_BP_FAT | GO:0001708~cell fate specification | 13 | 5.76 | 2.89E-03
GOTERM_BP_FAT | GO:0009953~dorsal/ventral pattern formation | 14 | 5.27 | 2.92E-03
GOTERM_BP_FAT | GO:0016337~cell-cell adhesion | 32 | 2.62 | 3.11E-03
GOTERM_BP_FAT | GO:0022037~metencephalon development | 11 | 7.10 | 3.17E-03
GOTERM_CC_FAT | GO:0030425~dendrite | 22 | 3.40 | 2.48E-03
GOTERM_BP_FAT | GO:0051969~regulation of transmission of nerve impulse | 22 | 3.38 | 3.45E-03
INTERPRO | IPR013032:EGF-like region, conserved site | 31 | 2.67 | 3.09E-03
GOTERM_BP_FAT | GO:0050804~regulation of synaptic transmission | 21 | 3.49 | 3.82E-03
GOTERM_BP_FAT | GO:0006350~transcription | 136 | 1.46 | 4.01E-03
GOTERM_BP_FAT | GO:0000122~negative regulation of transcription from RNA polymerase II promoter | 31 | 2.63 | 4.20E-03
GOTERM_BP_FAT | GO:0007422~peripheral nervous system development | 11 | 6.90 | 4.21E-03
GOTERM_BP_FAT | GO:0035136~forelimb morphogenesis | 9 | 9.24 | 4.56E-03
INTERPRO | IPR000742:EGF-like, type 3 | 24 | 3.12 | 4.20E-03
GOTERM_BP_FAT | GO:0060429~epithelium development | 28 | 2.79 | 4.89E-03
GOTERM_BP_FAT | GO:0043062~extracellular structure organization | 23 | 3.19 | 5.19E-03
GOTERM_BP_FAT | GO:0003007~heart morphogenesis | 15 | 4.64 | 5.50E-03
GOTERM_BP_FAT | GO:0021546~rhombomere development | 6 | 19.36 | 5.76E-03
GOTERM_BP_FAT | GO:0045597~positive regulation of cell differentiation | 28 | 2.76 | 5.79E-03
GOTERM_BP_FAT | GO:0007167~enzyme linked receptor protein signaling pathway | 36 | 2.38 | 5.86E-03
GOTERM_BP_FAT | GO:0030900~forebrain development | 22 | 3.27 | 5.97E-03


GOTERM_BP_FAT | GO:0045596~negative regulation of cell differentiation | 27 | 2.82 | 5.97E-03
GOTERM_CC_FAT | GO:0044420~extracellular matrix part | 18 | 3.88 | 4.59E-03
GOTERM_BP_FAT | GO:0001763~morphogenesis of a branching structure | 15 | 4.58 | 6.51E-03
GOTERM_BP_FAT | GO:0051216~cartilage development | 15 | 4.58 | 6.51E-03
GOTERM_BP_FAT | GO:0009954~proximal/distal pattern formation | 9 | 8.84 | 6.73E-03
GOTERM_BP_FAT | GO:0007194~negative regulation of adenylate cyclase activity | 13 | 5.34 | 6.80E-03
GOTERM_BP_FAT | GO:0051350~negative regulation of lyase activity | 13 | 5.34 | 6.80E-03
GOTERM_BP_FAT | GO:0031280~negative regulation of cyclase activity | 13 | 5.34 | 0.006797788
INTERPRO | IPR012680:Laminin G, subdomain 2 | 11 | 6.61 | 0.006468651
INTERPRO | IPR003131:Potassium channel, voltage dependent, Kv, tetramerisation | 12 | 5.82 | 0.007417909
INTERPRO | IPR003972:Potassium channel, voltage dependent, Kv1 | 6 | 18.93 | 0.007644517
GOTERM_BP_FAT | GO:0021510~spinal cord development | 11 | 6.37 | 0.009347796

Table S3. Functional characterization of genes hypomethylated in tumors using Gene Ontology and INTERPRO.

Category | Term | Count | Fold Enrichment | FDR
GOTERM_BP_FAT | GO:0006955~immune response | 33 | 3.08 | 0.00004
GOTERM_BP_FAT | GO:0006952~defense response | 30 | 3.14 | 0.00013
GOTERM_BP_FAT | GO:0007267~cell-cell signaling | 27 | 2.90 | 0.00321


Table S4. Summary of Ingenuity Pathway Analysis of genes differentially methylated in tumors.

Genes hypermethylated in tumors

Top Diseases and Biological Functions

Category | Diseases or Functions Annotation | p-value¹ | # Molecules²
Embryonic Development | development of body axis | 1.21E-40 | 144
Organismal Development | development of body axis | 1.21E-40 | 144
Organismal Survival | organismal death | 2.47E-36 | 247
Cellular Development | differentiation of cells | 1.60E-31 | 206
Nervous System Development and Function | morphology of nervous system | 1.99E-31 | 121
Connective Tissue Development and Function | abnormal morphology of bone | 7.94E-31 | 86

Upstream Regulators

Upstream Regulator | Molecule Type | Fold Change | Predicted Activation State | Activation z-score | p-value of overlap³
CTNNB1 | transcription regulator | | Inhibited | -5.392 | 2.13E-20
POU4F1 | transcription regulator | -2.639 | | -0.009 | 7.57E-18
SOX2 | transcription regulator | | | 0.016 | 4.32E-16
EZH2 | transcription regulator | | Activated | 2.096 | 6.48E-15
HTT | transcription regulator | | | 0.498 | 1.14E-14
POU5F1 | transcription regulator | | | 0.563 | 4.21E-12
ISL1 | transcription regulator | -2.734 | | -0.464 | 7.31E-12
GLI3 | transcription regulator | | | -0.096 | 1.11E-11
REST | transcription regulator | | Activated | 2.412 | 1.70E-10
RNF2 | transcription regulator | | Activated | 2.236 | 4.46E-10
NANOG | transcription regulator | | | 0.186 | 3.64E-09
EED | transcription regulator | | | 1.982 | 4.08E-09


Genes hypomethylated in tumors

Top Diseases and Biological Functions

Category | Diseases or Functions Annotation | p-value¹ | # Molecules²
Inflammatory Disease | Rheumatic Disease | 5.51E-13 | 56
Cellular Development | proliferation of blood cells | 2.33E-11 | 45
Hematological System Development and Function | proliferation of immune cells | 3.24E-11 | 43
Immunological Disease | systemic autoimmune syndrome | 3.72E-10 | 50
Tissue Morphology | quantity of blood cells | 7.27E-10 | 49
Cell-To-Cell Signaling and Interaction | activation of mononuclear leukocytes | 1.48E-09 | 86

¹ p-values indicate the likelihood that the association between each set of molecules in the experiment and a given process or pathway is due to random chance.
² The number of molecules is the aggregate number of unique molecules in all sets within that category.
³ The overlap p-value calls likely upstream regulators based on significant overlap between dataset genes and known targets regulated by a transcription regulator. It is calculated using Fisher's Exact Test.


Table S5. Gene Set Enrichment Analysis of genes differentially methylated in tumors from the NCI microarray cohort.

GO ID | Enrichment Score | Enrichment p-value
BENPORATH_ES_WITH_H3K27ME3 | 191.113 | 1.00202E-83
BENPORATH_EED_TARGETS | 190.913 | 1.22279E-83
BENPORATH_SUZ12_TARGETS | 182.755 | 4.27212E-80
MIKKELSEN_MEF_HCP_WITH_H3K27ME3 | 147.212 | 1.1653E-64
BENPORATH_PRC2_TARGETS | 141.327 | 4.19208E-62
MIKKELSEN_NPC_HCP_WITH_H3K27ME3 | 115.962 | 4.34992E-51
MIKKELSEN_MCV6_HCP_WITH_H3K27ME3 | 103.673 | 9.45089E-46
MEISSNER_NPC_HCP_WITH_H3K4ME2_AND_H3K27ME3 | 93.685 | 2.05638E-41
MEISSNER_BRAIN_HCP_WITH_H3K27ME3 | 79.9969 | 1.81043E-35
MEISSNER_BRAIN_HCP_WITH_H3K4ME3_AND_H3K27ME3 | 63.305 | 3.21341E-28
HATADA_METHYLATED_IN_LUNG_CANCER_UP | 61.256 | 2.49382E-27
MARTENS_TRETINOIN_RESPONSE_UP | 41.4478 | 9.98777E-19
KEGG_NEUROACTIVE_LIGAND_RECEPTOR_INTERACTION | 38.8019 | 1.40779E-17
SCHLESINGER_H3K27ME3_IN_NORMAL_AND_METHYLATED_IN_CANCER | 38.3011 | 2.32286E-17
MIKKELSEN_IPS_WITH_HCP_H3K27ME3 | 35.4926 | 3.85252E-16
REACTOME_NEURONAL_SYSTEM | 27.8887 | 7.72821E-13
REACTOME_GPCR_LIGAND_BINDING | 26.701 | 2.53462E-12
REACTOME_SIGNALING_BY_GPCR | 24.2608 | 2.90863E-11
MEISSNER_NPC_HCP_WITH_H3_UNMETHYLATED | 24.0438 | 3.61333E-11
MEISSNER_NPC_HCP_WITH_H3K4ME2 | 24.0224 | 3.69138E-11
MEISSNER_NPC_HCP_WITH_H3K27ME3 | 23.1524 | 8.81119E-11
YOSHIMURA_MAPK8_TARGETS_UP | 22.7931 | 1.26204E-10
HOQUE_METHYLATED_IN_CANCER | 21.1025 | 6.84411E-10
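
The Enrichment Score column of Table S5 appears to equal the negative natural logarithm of the enrichment p-value (for example, -ln(1.00202E-83) is about 191.11). This relationship is inferred from the tabulated numbers and is not stated in the source, so the following Python sketch only checks it on three rows copied from the table; `score_from_p` is a hypothetical helper name.

```python
import math

def score_from_p(p_value):
    """Assumed relationship: enrichment score = -ln(enrichment p-value)."""
    return -math.log(p_value)

# (gene set, tabulated enrichment score, tabulated enrichment p-value) from Table S5
rows = [
    ("BENPORATH_ES_WITH_H3K27ME3", 191.113, 1.00202e-83),
    ("BENPORATH_EED_TARGETS", 190.913, 1.22279e-83),
    ("HOQUE_METHYLATED_IN_CANCER", 21.1025, 6.84411e-10),
]

for name, tabulated, p in rows:
    recomputed = score_from_p(p)
    # Difference should be within rounding of the tabulated values
    print(f"{name}: tabulated={tabulated}, -ln(p)={recomputed:.3f}")
```

If the assumption holds, a larger score simply restates a smaller p-value, so the two columns carry the same ranking information.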


Table S6. Gene Set Enrichment Analysis of genes differentially methylated in tumors from the Japan microarray cohort. GO ID Enrichment Score Enrichment p-value BENPORATH_ES_WITH_H3K27ME3 238.494 2.65E-104 BENPORATH_EED_TARGETS 195.235 1.62E-85 BENPORATH_SUZ12_TARGETS 192.211 3.34E-84 YOSHIMURA_MAPK8_TARGETS_UP 184.361 8.57E-81 BENPORATH_PRC2_TARGETS 163.012 1.60E-71 MEISSNER_BRAIN_HCP_WITH_H3K4ME3_AND_H3K27ME3 155.975 1.82E-68 MIKKELSEN_MEF_HCP_WITH_H3K27ME3 141.196 4.78E-62 MARTENS_TRETINOIN_RESPONSE_UP 121.306 2.08E-53 MIKKELSEN_MCV6_HCP_WITH_H3K27ME3 99.5285 5.96E-44 MIKKELSEN_NPC_HCP_WITH_H3K27ME3 96.1319 1.78E-42 MEISSNER_NPC_HCP_WITH_H3K4ME2_AND_H3K27ME3 92.8625 4.68E-41 MEISSNER_BRAIN_HCP_WITH_H3K27ME3 85.43 7.91E-38 KEGG_NEUROACTIVE_LIGAND_RECEPTOR_INTERACTION 85.0589 1.15E-37 SCHUETZ_BREAST_CANCER_DUCTAL_INVASIVE_UP 81.7163 3.24E-36 REACTOME_GPCR_LIGAND_BINDING 81.0677 6.20E-36 SMID_BREAST_CANCER_LUMINAL_B_DN 81.049 6.32E-36 HATADA_METHYLATED_IN_LUNG_CANCER_UP 80.6859 9.09E-36 SMID_BREAST_CANCER_NORMAL_LIKE_UP 78.8772 5.55E-35 REACTOME_NEURONAL_SYSTEM 73.1086 1.78E-32 WONG_ADULT_TISSUE_STEM_MODULE 71.8064 6.53E-32 REACTOME_CLASS_A1_RHODOPSIN_LIKE_RECEPTORS 68.7898 1.33E-30 SWEET_LUNG_CANCER_KRAS_DN 67.9604 3.06E-30 SMID_BREAST_CANCER_BASAL_UP 62.4874 7.28E-28 BOQUEST_STEM_CELL_UP 60.8659 3.68E-27 DELYS_THYROID_CANCER_UP 60.4157 5.78E-27


TARTE_PLASMA_CELL_VS_PLASMABLAST_UP 57.8931 7.20E-26 SATO_SILENCED_BY_METHYLATION_IN_PANCREATIC_CANCER_1 57.5412 1.02E-25 ONDER_CDH1_TARGETS_2_DN 57.0513 1.67E-25 REACTOME_TRANSMEMBRANE_TRANSPORT_OF_SMALL_MOLECULES 56.5553 2.74E-25 MEISSNER_NPC_HCP_WITH_H3_UNMETHYLATED 55.8782 5.40E-25 LIM_MAMMARY_STEM_CELL_UP 55.586 7.23E-25 LIU_PROSTATE_CANCER_DN 54.2976 2.62E-24 KEGG_CYTOKINE_CYTOKINE_RECEPTOR_INTERACTION 53.3728 6.61E-24 CHEN_METABOLIC_SYNDROM_NETWORK 52.8838 1.08E-23 SANSOM_APC_TARGETS_DN 50.4574 1.22E-22 SMID_BREAST_CANCER_BASAL_DN 47.3188 2.82E-21 LINDGREN_BLADDER_CANCER_CLUSTER_2B 46.4058 7.02E-21 MEISSNER_NPC_HCP_WITH_H3K4ME2 45.5744 1.61E-20 REACTOME_POTASSIUM_CHANNELS 45.5221 1.70E-20 POOLA_INVASIVE_BREAST_CANCER_UP 45.3778 1.96E-20 KEGG_CALCIUM_SIGNALING_PATHWAY 45.0855 2.63E-20 PICCALUGA_ANGIOIMMUNOBLASTIC_LYMPHOMA_UP 44.5906 4.31E-20 REACTOME_TRANSMISSION_ACROSS_CHEMICAL_SYNAPSES 43.5589 1.21E-19 BOQUEST_STEM_CELL_CULTURED_VS_FRESH_UP 43.4664 1.33E-19 RIGGI_EWING_SARCOMA_PROGENITOR_UP 42.8625 2.43E-19 WANG_MLL_TARGETS 42.7424 2.74E-19 REACTOME_PEPTIDE_LIGAND_BINDING_RECEPTORS 42.5218 3.41E-19 MCLACHLAN_DENTAL_CARIES_DN 42.3741 3.96E-19 JAEGER_METASTASIS_DN 41.5941 8.63E-19 MCLACHLAN_DENTAL_CARIES_UP 37.9591 3.27E-17 BRUINS_UVC_RESPONSE_VIA_TP53_GROUP_A 37.7883 3.88E-17 VERHAAK_AML_WITH_NPM1_MUTATED_DN 37.3695 5.90E-17 WALLACE_PROSTATE_CANCER_RACE_UP 36.9725 8.77E-17


HOQUE_METHYLATED_IN_CANCER 36.6927 1.16E-16 MOREAUX_MULTIPLE_MYELOMA_BY_TACI_UP 36.5001 1.41E-16 KEGG_PATHWAYS_IN_CANCER 36.1866 1.92E-16 MIKKELSEN_NPC_HCP_WITH_H3K4ME3_AND_H3K27ME3 36.0442 2.22E-16 HELLER_SILENCED_BY_METHYLATION_UP 35.7616 2.94E-16 SMID_BREAST_CANCER_RELAPSE_IN_BONE_DN 35.5712 3.56E-16 REACTOME_HEMOSTASIS 35.3982 4.23E-16 MIKKELSEN_IPS_WITH_HCP_H3K27ME3 35.2688 4.82E-16 SCHLESINGER_METHYLATED_DE_NOVO_IN_CANCER 35.1745 5.30E-16 REACTOME_NEUROTRANSMITTER_RECEPTOR_BINDING_AND_DOWNSTREAM_TRANSMISSION_IN_THE_POSTSYNAPTIC_CELL 33.719 2.27E-15 SABATES_COLORECTAL_ADENOMA_DN 33.7035 2.31E-15 ZWANG_TRANSIENTLY_UP_BY_2ND_EGF_PULSE_ONLY 33.0377 4.49E-15 KEGG_CELL_ADHESION_MOLECULES_CAMS 32.5513 7.30E-15 CHYLA_CBFA2T3_TARGETS_UP 32.2873 9.50E-15 BYSTRYKH_HEMATOPOIESIS_STEM_CELL_QTL_TRANS 32.1922 1.04E-14 MIKKELSEN_MCV6_LCP_WITH_H3K4ME3 31.821 1.51E-14 VART_KSHV_INFECTION_ANGIOGENIC_MARKERS_UP 31.6896 1.73E-14 CHICAS_RB1_TARGETS_CONFLUENT 31.3505 2.42E-14 REACTOME_G_ALPHA_I_SIGNALLING_EVENTS 30.7129 4.59E-14 REACTOME_SLC_MEDIATED_TRANSMEMBRANE_TRANSPORT 30.6961 4.66E-14 SCHAEFFER_PROSTATE_DEVELOPMENT_48HR_DN 30.6325 4.97E-14 SHEN_SMARCA2_TARGETS_DN 30.308 6.88E-14 SERVITJA_ISLET_HNF1A_TARGETS_UP 30.1397 8.14E-14 REACTOME_GASTRIN_CREB_SIGNALLING_PATHWAY_VIA_PKC_AND_MAPK 30.1316 8.20E-14 LEE_BMP2_TARGETS_UP 29.8878 1.05E-13 MIKKELSEN_ES_ICP_WITH_H3K4ME3_AND_H3K27ME3 29.6739 1.30E-13 HSIAO_LIVER_SPECIFIC_GENES 29.6654 1.31E-13 WEST_ADRENOCORTICAL_TUMOR_DN 29.5449 1.48E-13


VART_KSHV_INFECTION_ANGIOGENIC_MARKERS_DN 29.1872 2.11E-13 HOSHIDA_LIVER_CANCER_SUBCLASS_S1 29.0458 2.43E-13 KAAB_HEART_ATRIUM_VS_VENTRICLE_UP 28.9827 2.59E-13 JAATINEN_HEMATOPOIETIC_STEM_CELL_DN 28.7957 3.12E-13 REACTOME_G_ALPHA_S_SIGNALLING_EVENTS 28.431 4.49E-13 REACTOME_G_ALPHA_Q_SIGNALLING_EVENTS 28.3585 4.83E-13 REACTOME_SIGNALING_BY_GPCR 28.3069 5.09E-13 MCBRYAN_PUBERTAL_BREAST_3_4WK_UP 27.8888 7.73E-13 DELYS_THYROID_CANCER_DN 27.7515 8.86E-13 QI_PLASMACYTOMA_UP 27.6912 9.42E-13 ROZANOV_MMP14_TARGETS_UP 27.3909 1.27E-12 OHM_METHYLATED_IN_ADULT_CANCERS 27.1003 1.70E-12 ANASTASSIOU_CANCER_MESENCHYMAL_TRANSITION_SIGNATURE 26.979 1.92E-12 DELACROIX_RARG_BOUND_MEF 26.8877 2.10E-12 NAKAYAMA_SOFT_TISSUE_TUMORS_PCA1_UP 26.7807 2.34E-12 SENESE_HDAC1_AND_HDAC2_TARGETS_DN 26.7734 2.36E-12 ACEVEDO_FGFR1_TARGETS_IN_PROSTATE_CANCER_MODEL_DN 26.7487 2.42E-12 HOSHIDA_LIVER_CANCER_SUBCLASS_S3 26.4778 3.17E-12 SMID_BREAST_CANCER_LUMINAL_A_UP 26.3148 3.73E-12 VERHAAK_GLIOBLASTOMA_NEURAL 26.2311 4.05E-12 CAIRO_LIVER_DEVELOPMENT_DN 26.1614 4.35E-12 TONKS_TARGETS_OF_RUNX1_RUNX1T1_FUSION_HSC_DN 26.1357 4.46E-12 WONG_ENDMETRIUM_CANCER_DN 26.0229 4.99E-12 LEE_NEURAL_CREST_STEM_CELL_UP 25.5682 7.87E-12 VERHAAK_AML_WITH_NPM1_MUTATED_UP 25.4882 8.52E-12 MCGARVEY_SILENCED_BY_METHYLATION_IN_COLON_CANCER 25.0326 1.34E-11 NAKAYAMA_SOFT_TISSUE_TUMORS_PCA1_DN 24.8744 1.57E-11 FEVR_CTNNB1_TARGETS_UP 24.8639 1.59E-11


FOSTER_KDM1A_TARGETS_UP 24.6974 1.88E-11 BROWNE_HCMV_INFECTION_48HR_DN 24.645 1.98E-11 KEGG_DILATED_CARDIOMYOPATHY 24.543 2.19E-11 YAUCH_HEDGEHOG_SIGNALING_PARACRINE_DN 24.3928 2.55E-11 KEGG_FOCAL_ADHESION 24.2966 2.81E-11 NAKAYAMA_SOFT_TISSUE_TUMORS_PCA2_DN 24.1844 3.14E-11 MCBRYAN_PUBERTAL_BREAST_4_5WK_UP 24.1566 3.23E-11 DURAND_STROMA_MAX_UP 24.1174 3.36E-11 WU_CELL_MIGRATION 24.0897 3.45E-11 REACTOME_PHOSPHOLIPASE_C_MEDIATED_CASCADE 24.0237 3.69E-11 ONDER_CDH1_TARGETS_3_DN 23.7067 5.06E-11 KEGG_MELANOMA 23.6473 5.37E-11 PEREZ_TP63_TARGETS 23.6301 5.47E-11 BLALOCK_ALZHEIMERS_DISEASE_UP 23.4018 6.87E-11 TAKEDA_TARGETS_OF_NUP98_HOXA9_FUSION_16D_UP 23.1551 8.79E-11 FLECHNER_BIOPSY_KIDNEY_TRANSPLANT_REJECTED_VS_OK_UP 23.0474 9.79E-11 SMIRNOV_CIRCULATING_ENDOTHELIOCYTES_IN_CANCER_UP 23.0328 9.93E-11 SERVITJA_ISLET_HNF1A_TARGETS_DN 22.5588 1.60E-10 CROMER_TUMORIGENESIS_UP 22.527 1.65E-10 BRIDEAU_IMPRINTED_GENES 22.527 1.65E-10 KIM_BIPOLAR_DISORDER_OLIGODENDROCYTE_DENSITY_CORR_DN 22.4982 1.70E-10 REACTOME_CELL_JUNCTION_ORGANIZATION 22.38 1.91E-10 VECCHI_GASTRIC_CANCER_EARLY_DN 22.3672 1.93E-10 MARTINEZ_RB1_AND_TP53_TARGETS_UP 22.3612 1.94E-10 ZHU_CMV_24_HR_DN 22.2817 2.10E-10 PILON_KLF1_TARGETS_UP 22.1008 2.52E-10 REACTOME_VOLTAGE_GATED_POTASSIUM_CHANNELS 21.9757 2.86E-10 MARTINEZ_RB1_TARGETS_UP 21.8652 3.19E-10


CHIANG_LIVER_CANCER_SUBCLASS_CTNNB1_DN 21.8421 3.27E-10 REACTOME_EXTRACELLULAR_MATRIX_ORGANIZATION 21.6444 3.98E-10 BOQUEST_STEM_CELL_DN 21.4679 4.75E-10 ZHU_CMV_ALL_DN 21.4613 4.78E-10 LOPES_METHYLATED_IN_COLON_CANCER_UP 21.4362 4.90E-10 PLASARI_TGFB1_TARGETS_10HR_DN 21.3677 5.25E-10 BERTUCCI_INVASIVE_CARCINOMA_DUCTAL_VS_LOBULAR_DN 21.2971 5.63E-10 DOANE_BREAST_CANCER_ESR1_UP 21.192 6.26E-10 KEGG_VASCULAR_SMOOTH_MUSCLE_CONTRACTION 21.0828 6.98E-10 LI_WILMS_TUMOR_VS_FETAL_KIDNEY_2_DN 20.9583 7.91E-10 BOYLAN_MULTIPLE_MYELOMA_C_D_DN 20.8894 8.47E-10 REACTOME_CELL_CELL_JUNCTION_ORGANIZATION 20.7436 9.80E-10


Table S7. Hypermethylated probesets corresponding to genes marked by H3K27me3 in ESC.

Probeset ID | Gene Symbol | miRNA | CPG_ISLAND | p-value | Mean(NT) | Mean(T) | MeanDiff(NT-T)
cg26521404 | HOXA9 | HSA-MIR-196B | TRUE | 3.07E-15 | -2.3463 | 0.0904213 | -2.43672
cg01354473 | HOXA9 | HSA-MIR-196B | TRUE | 4.80E-14 | -2.72687 | -0.4599 | -2.26697
cg04330449 | NEUROG1 | | TRUE | 4.04E-13 | -3.09065 | -0.942288 | -2.14836
cg25720804 | TLX3 | | TRUE | 1.05E-12 | -3.72331 | -1.12801 | -2.5953
cg16428251 | SOX14 | | TRUE | 1.86E-12 | -2.89995 | -1.38383 | -1.51611
cg09936561 | DRD5 | | TRUE | 4.86E-12 | -2.34616 | -0.909869 | -1.43629
cg08572611 | ACTL6B | | TRUE | 5.45E-12 | -3.51003 | -1.13207 | -2.37796
cg06722633 | GRIK3 | | TRUE | 1.23E-11 | -2.46637 | -0.798237 | -1.66813
cg23130254 | HOXD12 | | TRUE | 2.12E-11 | -2.91962 | -1.10555 | -1.81407
cg02994956 | NEFH | | TRUE | 3.37E-11 | -1.07458 | -0.0558057 | -1.01877
cg02332525 | GRM7 | | TRUE | 5.50E-11 | -1.83744 | -0.56889 | -1.26855
cg07536847 | PAX7 | | TRUE | 5.79E-11 | -3.23052 | -1.49787 | -1.73265
cg03958979 | NR2E1 | | TRUE | 1.32E-10 | -1.5624 | -0.287356 | -1.27504
cg12374721 | PRAC | | TRUE | 1.67E-10 | -3.38259 | -1.30888 | -2.07371
cg02757432 | GPR26 | | TRUE | 1.74E-10 | -1.56702 | -0.0559919 | -1.51103
cg02919422 | SOX17 | | TRUE | 2.08E-10 | -2.40037 | -0.802118 | -1.59825
cg26069745 | HOXA2 | | TRUE | 2.87E-10 | -1.85808 | -0.312093 | -1.54599
cg00891541 | SMPD3 | HS_254 | TRUE | 3.04E-10 | 0.0560959 | 1.34227 | -1.28617
cg14859460 | GRM6 | | TRUE | 3.91E-10 | -3.69743 | -1.48024 | -2.21719
cg08441806 | NKX6-2 | | TRUE | 4.15E-10 | -1.81566 | -0.324833 | -1.49083
cg09516965 | PTGDR | | TRUE | 7.25E-10 | -5.02301 | -2.10952 | -2.91349
cg17619823 | ADRB3 | | TRUE | 7.97E-10 | -3.87024 | -1.53873 | -2.33151
cg04490714 | SLC6A2 | | TRUE | 8.48E-10 | -3.87848 | -1.81647 | -2.06201
cg06277657 | DGKI | | TRUE | 1.17E-09 | -2.97912 | -1.53908 | -1.44004
cg01381846 | HOXA9 | HSA-MIR-196B | TRUE | 1.45E-09 | -2.43723 | -0.535243 | -1.90199


cg10883303 | HOXA13 | | TRUE | 1.52E-09 | -3.69946 | -1.52646 | -2.173
cg01009664 | TRH | | TRUE | 2.37E-09 | -3.46042 | -1.46626 | -1.99416
cg12127282 | HOXD4 | HSA-MIR-10B | TRUE | 2.41E-09 | -1.82713 | -0.633483 | -1.19364
cg08832227 | KCNA1 | | TRUE | 2.59E-09 | -2.02401 | -0.728642 | -1.29537
cg14991487 | HOXD9 | | TRUE | 3.00E-09 | -4.3529 | -1.98023 | -2.37267
cg19352038 | PAX3 | | TRUE | 3.07E-09 | -1.56496 | -0.300947 | -1.26401
cg17241310 | BARHL2 | | TRUE | 4.39E-09 | -2.34033 | -0.524143 | -1.81619
cg25882366 | HOXB2 | | TRUE | 4.78E-09 | -1.21514 | -0.036715 | -1.17842
cg02008154 | TBX20 | | TRUE | 6.03E-09 | -2.65866 | -1.17506 | -1.4836
cg06038133 | CORO6 | | TRUE | 6.38E-09 | -0.0560772 | -1.33162 | 1.27554
cg03544320 | CRMP1 | | TRUE | 6.57E-09 | -1.04082 | 0.313523 | -1.35434
cg09871315 | HOXA2 | | TRUE | 7.50E-09 | -2.84254 | -1.16617 | -1.67637
cg13035743 | PRRT1 | | TRUE | 7.51E-09 | -3.11223 | -1.41236 | -1.69986
cg03963198 | IRX4 | | TRUE | 7.88E-09 | -2.8071 | -1.11185 | -1.69525
cg04897683 | NEUROG1 | | TRUE | 1.07E-08 | -2.5652 | -1.22045 | -1.34475
cg16761581 | ADCY4 | | TRUE | 1.24E-08 | -3.87428 | -1.81245 | -2.06183
cg15748507 | PRLHR | | TRUE | 1.75E-08 | -0.868192 | 0.210143 | -1.07833
cg10362591 | SLC6A2 | | TRUE | 1.81E-08 | -3.29482 | -1.47947 | -1.81535
cg20792062 | KCNA5 | | TRUE | 2.03E-08 | -2.58586 | -1.0087 | -1.57716
cg24396745 | HCN4 | | TRUE | 2.28E-08 | -2.37673 | -0.975208 | -1.40152
cg13870866 | TBX20 | | TRUE | 3.34E-08 | -2.54534 | -1.18487 | -1.36047
cg01683883 | CMTM2 | | TRUE | 6.10E-08 | -1.57711 | -0.512441 | -1.06467
cg12781568 | WT1 | | TRUE | 6.73E-08 | -0.701206 | 0.581276 | -1.28248
cg07823492 | HOXB1 | | TRUE | 8.70E-08 | 0.801668 | 2.43421 | -1.63254
cg12265829 | ADCY4 | | TRUE | 1.05E-07 | -3.67221 | -2.045 | -1.62721
cg01693350 | WT1 | | TRUE | 1.22E-07 | -0.878453 | 0.215639 | -1.09409
cg23349790 | IGSF21 | | TRUE | 4.10E-07 | -2.50336 | -0.426105 | -2.07726
cg11428724 | PAX7 | | TRUE | 7.90E-07 | -2.3281 | -0.983327 | -1.34477


cg25228126 | FZD2 | | TRUE | 1.79E-06 | -3.12411 | -1.74087 | -1.38324
cg09873258 | DLK1 | | TRUE | 3.32E-06 | -1.5223 | -0.0552616 | -1.46704


Table S8. Association between methylation cluster and clinical-demographic variables in the NCI microarray cohort.

Variable       Categories        Cluster 1 (Low methylation, n=18)   Cluster 2 (High methylation, n=17)   Fisher's exact test p-value
Sex            Male              8                                   8                                    1
               Female            10                                  9
Smoking        < 20 packyears    4                                   5                                    1
               > 20 packyears    7                                   9
Race           EA                14                                  12                                   0.711
               AA                4                                   5
Stage          Stage I           16                                  14                                   1
               Stage II          2                                   2
Stage I        Stage IA          11                                  8                                    0.707
               Stage IB          5                                   6
Vital Status   Dead              1                                   6                                    0.041
               Alive             17                                  11
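The p-values in Table S8 can be reproduced from the 2x2 counts alone. The sketch below is a minimal pure-Python two-sided Fisher's exact test, assuming the usual convention of summing the probabilities of all tables no more likely than the observed one; the function name is illustrative and is not taken from the original analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(x):
        # Unnormalised hypergeometric weight of a table with x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    observed = weight(a)
    # Integer comparison avoids floating-point ties between equally likely tables
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / comb(n, col1)


# Rows: Dead / Alive; columns: cluster 1 / cluster 2 (counts from Table S8)
p_vital = fisher_exact_two_sided(1, 6, 17, 11)
# Rows: EA / AA; columns: cluster 1 / cluster 2
p_race = fisher_exact_two_sided(14, 12, 4, 5)
print(round(p_vital, 3), round(p_race, 3))
```

Under these assumptions the two calls give values that round to the 0.041 and 0.711 reported for Vital Status and Race.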


Table S9. Gene expression differences between high and low methylation clusters in the NCI microarray cohort.

Illumina ID (mRNA)  Symbol  p-value  Mean (cluster 1 - low methylation)  Mean (cluster 2 - high methylation)  FoldChange (cluster2/cluster1)

ILMN_2361603 NDRG2 0.000196178 8.83644 7.99533 -1.79143
ILMN_1793433 RAB10 0.000356981 11.5793 11.9859 1.32556
ILMN_2072603 MRPL14 0.000381947 10.7707 11.2886 1.4319
ILMN_1761996 SFRS5 0.000420546 11.5355 11.0466 -1.40337
ILMN_2094360 NR2F2 0.000453287 9.02144 8.2382 -1.721
ILMN_1797534 RIOK1 0.000464144 8.68444 9.0364 1.27629
ILMN_1774110 CHN2 0.000472588 7.68006 7.2538 -1.34374
ILMN_1751062 SCARA5 0.00048927 7.7295 6.92073 -1.75171
ILMN_1682792 BYSL 0.000496084 8.57211 9.0332 1.37658
ILMN_1723007 ZCCHC9 0.000526032 9.14722 9.45007 1.23357
ILMN_1750790 GSTM5 0.000622751 7.62178 7.22793 -1.31389
ILMN_1697597 KIAA0494 0.000665294 10.1344 9.73267 -1.32108
ILMN_1718565 CDKN1C 0.000684226 7.96217 7.50627 -1.37164
ILMN_1677765 LRP8 0.000688326 7.79128 8.2256 1.35128
ILMN_1787879 ARL2 0.000709533 10.7446 10.4617 -1.21663
ILMN_2123665 SBF2 0.000774468 7.60878 7.3248 -1.21755
ILMN_2197247 POLR3A 0.00112359 8.47256 8.77893 1.2366
ILMN_1689665 NAE1 0.00112541 9.76006 10.0462 1.21938
ILMN_1756220 DDX18 0.00128384 10.2216 10.5417 1.24848
ILMN_2378868 SFRS5 0.00130348 11.5402 11.0093 -1.44476
ILMN_1676998 SCN2B 0.00134394 7.26483 6.98973 -1.21008
ILMN_2320330 MAL 0.00143312 8.04733 7.5786 -1.38389
ILMN_1765636 FLJ22184 0.00151627 7.25339 7.73167 1.39308
ILMN_1807525 CNTD2 0.00153487 7.36872 7.65573 1.22011


ILMN_1780170 APOD 0.00157176 10.9932 9.08893 -3.74324
ILMN_1761450 DHRS4L2 0.00196629 7.89706 7.6004 -1.22829
ILMN_1723481 CHST3 0.00199145 7.94794 8.7218 1.70983
ILMN_1663119 DSC2 0.00208644 8.40922 8.988 1.49358
ILMN_1716089 KANK2 0.0021235 8.33072 7.9236 -1.32604
ILMN_1715661 TFAM 0.00212833 9.45094 9.77027 1.24774
ILMN_1782057 ATP8B2 0.00230325 7.72328 7.37387 -1.27404
ILMN_2129349 TSSC1 0.00246646 8.96789 9.25507 1.22025
ILMN_1807807 SKA2 0.00251024 7.96633 8.349 1.30375
ILMN_2082273 RGS5 0.00272761 8.88406 8.2994 -1.49968
ILMN_1805474 C1orf131 0.00276252 9.23239 9.49713 1.20142
ILMN_1778650 VILL 0.00281194 7.5595 7.28447 -1.21002
ILMN_2334693 NARF 0.00284764 9.38961 9.80167 1.33058
ILMN_1766712 TCF21 0.00293466 8.33217 7.57293 -1.69259
ILMN_1776953 MYL9 0.00299836 7.42856 7.0548 -1.29572
ILMN_1798804 SRPK1 0.00310633 9.95994 10.4172 1.37293
ILMN_2268068 MAPKAP1 0.00321347 8.01917 8.3184 1.23049
ILMN_1660973 GAD1 0.00334542 7.0775 7.50367 1.34366
ILMN_1697448 TXNIP 0.00343633 10.7634 10.0974 -1.58672
ILMN_2403247 CMTM7 0.00350849 8.63206 8.1512 -1.39557
ILMN_1727360 MAOB 0.00350984 7.75294 7.3102 -1.35919
ILMN_2404746 RNF39 0.00354472 7.44683 7.81973 1.29495
ILMN_1810797 WASF3 0.00357736 7.99828 7.58407 -1.33257
ILMN_1674386 PITX1 0.00361314 8.16806 9.4556 2.44112
ILMN_2414399 NME1 0.00365456 7.91517 8.3048 1.31006
ILMN_2375879 VEGFA 0.00366796 8.68767 9.36467 1.59881
ILMN_1664464 PTGDS 0.00367753 10.9408 9.602 -2.52947
ILMN_1683562 SNRPG 0.00371214 12.6854 13.0789 1.31356


ILMN_1716246 FRZB 0.00373661 9.55661 8.7982 -1.69163
ILMN_1812441 C17orf63 0.00375283 8.54856 8.82953 1.21502
ILMN_1676548 BZW2 0.00394571 10.0068 10.4364 1.34683
ILMN_1691570 METTL5 0.00395412 9.38067 9.80067 1.33793
ILMN_2234187 CDO1 0.00407902 7.917 7.48867 -1.34568
ILMN_1670353 RAD51AP1 0.00416364 8.02472 8.64553 1.53774
ILMN_1657361 CBX7 0.00432703 8.82994 8.35707 -1.38788
ILMN_1790136 C20orf20 0.00437456 9.87989 10.4261 1.46028
ILMN_2390299 PSMB8 0.00457235 10.367 10.8563 1.4038
ILMN_1768488 TERF2 0.00460035 8.50728 8.15547 -1.27616
ILMN_1688464 MAP6D1 0.00462831 7.78228 8.1546 1.29443
ILMN_2166524 CCNYL1 0.00462874 7.7265 7.99887 1.20779
ILMN_1762764 SH3BGRL2 0.00466211 9.83806 9.25087 -1.50232
ILMN_1743499 POLDIP2 0.0047433 7.951 8.23393 1.21667
ILMN_1778242 CALM1 0.0047645 12.2796 11.8825 -1.3169
ILMN_2375319 RASGRP2 0.00487933 7.78272 7.28227 -1.41466
ILMN_1809566 ZSCAN16 0.0049546 8.62589 9.01973 1.31389
ILMN_2363634 ADHFE1 0.00496448 7.51311 7.22973 -1.21704
ILMN_1682099 TNFAIP8L3 0.0050404 7.90622 7.4996 -1.32558
ILMN_2308903 WFDC3 0.00512811 7.93333 8.7962 1.81865
ILMN_1781097 UBXN4 0.00514495 11.4124 11.6839 1.20705
ILMN_2107991 HABP4 0.00532733 8.07878 7.76713 -1.24112
ILMN_1744628 FDX1L 0.00537723 8.29956 8.58887 1.22206
ILMN_1782086 AOC3 0.005424 7.21339 6.9378 -1.21049
ILMN_1713807 MAN1C1 0.00546261 8.256 7.7864 -1.38473
ILMN_1655307 FAM136A 0.00549431 9.35161 9.7704 1.3368
ILMN_2363621 RBBP8 0.00558545 9.5465 10.022 1.3904
ILMN_1701551 ABCA6 0.00559997 7.98489 7.55473 -1.34738


ILMN_1664012 CANT1 0.00563662 9.48489 9.90593 1.3389
ILMN_2343010 BOLA3 0.00568535 10.7225 11.1723 1.36588
ILMN_2098446 PMAIP1 0.00572526 7.61561 8.06527 1.36571
ILMN_1737631 PAQR6 0.00587072 7.28411 7.54727 1.2001
ILMN_2331735 AP2B1 0.00598814 7.622 7.95613 1.26062
ILMN_1800308 GTF2H4 0.0060324 8.55889 8.901 1.26761
ILMN_1656285 METTL7A 0.00609219 8.89122 8.3132 -1.4928
ILMN_1795166 PTH1R 0.0061518 8.55122 8.03633 -1.42888
ILMN_1741133 NME1 0.00615404 11.457 12.0576 1.51635
ILMN_2338038 AK3L1 0.00629074 7.90406 8.69187 1.72645
ILMN_1761968 PPP1R14A 0.00633981 9.59656 8.97593 -1.53754
ILMN_1740415 WFDC3 0.00638695 7.54206 8.19627 1.57376
ILMN_1721833 IER5 0.00648886 8.51067 9.05813 1.46152
ILMN_2343563 ANAPC11 0.00650444 9.054 9.42273 1.29122
ILMN_1673962 NUP205 0.00654911 9.70606 9.99307 1.22011
ILMN_1764309 ADH1A 0.00655299 10.2153 8.91787 -2.45787
ILMN_1761101 CCDC112 0.00661118 7.75783 8.02653 1.20472
ILMN_1751776 CKAP2L 0.00663706 7.67556 8.2676 1.50738
ILMN_1670134 FADS1 0.00664192 8.31756 8.84893 1.44531
ILMN_1722713 FBLN1 0.00669342 8.58561 7.93953 -1.56491
ILMN_2373632 IDH3B 0.00672794 9.76544 9.49573 -1.20557
ILMN_1747016 CEP55 0.0067531 7.94756 8.57613 1.54604
ILMN_1766264 PI16 0.00681332 7.58733 7.1714 -1.33416
ILMN_2396672 ABLIM1 0.00688325 8.85328 8.33713 -1.43013
ILMN_1722953 USP47 0.00691606 8.02044 7.68047 -1.26574
ILMN_1756402 TMEM177 0.00692943 8.35517 8.62673 1.20712
ILMN_1699695 TNFRSF21 0.00700965 11.5032 12.1433 1.55838
ILMN_2386530 RPLP1 0.00702356 12.8394 12.4331 -1.32529


ILMN_2415776 WWOX 0.00710853 7.33494 7.06653 -1.20448
ILMN_1699570 TPD52L2 0.00711571 11.0008 11.4105 1.32841
ILMN_1736670 PPP1R3C 0.00725434 8.75761 8.08187 -1.59742
ILMN_1794599 SNRPD3 0.00727056 7.79089 7.48707 -1.23441
ILMN_1724480 AXIN2 0.00729017 9.34028 8.7742 -1.48049
ILMN_2064725 METTL7B 0.00730771 7.70522 8.31547 1.52652
ILMN_1786658 BOLA3 0.00764989 10.5873 11.0038 1.33471
ILMN_2319326 ADARB1 0.00770373 8.32422 7.92953 -1.31466
ILMN_1678669 RRM2 0.00780406 7.13083 7.40987 1.21338
ILMN_1785424 ABLIM1 0.00781995 11.0886 10.4475 -1.55949
ILMN_1777660 RNF144 0.00782099 8.62572 8.92227 1.2282
ILMN_1709486 SRPX 0.00786534 8.9535 8.26007 -1.61713
ILMN_1680424 CTSG 0.00792955 8.74133 7.84567 -1.86047
ILMN_1803882 VEGFA 0.00793605 7.47133 7.80993 1.26453
ILMN_2045729 WDR12 0.00794651 8.54728 8.88007 1.25945
ILMN_1718769 ITSN1 0.00797396 9.09856 8.8078 -1.22328
ILMN_2403237 CHN2 0.00799063 8.50606 8.1478 -1.28188
ILMN_1747395 SLC24A1 0.00803304 7.78472 7.4634 -1.24948
ILMN_1780283 C20orf201 0.0083011 6.89994 7.1884 1.22133
ILMN_1759232 IRS1 0.00862967 7.74311 8.22573 1.39728
ILMN_1779353 PUS7 0.008649 8.95856 9.3286 1.29239
ILMN_1723846 FAM119B 0.00867453 9.60906 10.0339 1.3424
ILMN_1744968 KCNAB1 0.00867707 7.72556 7.3472 -1.29986
ILMN_2214473 ARHGEF5L 0.00869854 9.08994 9.5908 1.41505
ILMN_1807106 LDHA 0.00875587 12.9073 13.3643 1.37267
ILMN_2364376 ILK 0.00892411 11.2122 10.8837 -1.2557
ILMN_1789136 SERF2 0.00899175 12.2698 11.9155 -1.27842
ILMN_1702197 C9orf140 0.00899214 7.14094 7.73793 1.51256


ILMN_2374293 DYRK1A 0.00902388 9.13478 8.85807 -1.21143
ILMN_1724941 CDCP1 0.00906464 8.34961 8.79367 1.36042
ILMN_1736828 CHST10 0.00908557 8.03511 7.5742 -1.37641
ILMN_2052871 TMEM116 0.00915023 8.722 8.4124 -1.23936
ILMN_2364022 SLC16A3 0.00923442 10.1372 10.6791 1.45583
ILMN_2327860 MAL 0.00928336 9.15189 8.4786 -1.5947
ILMN_1691117 DNTTIP1 0.00935461 9.06333 9.55747 1.40847
ILMN_1786197 NR2F1 0.00936857 8.79094 8.31887 -1.38711
ILMN_1800626 SESN1 0.00942654 9.19689 8.80553 -1.31163
ILMN_1790859 PLAC9 0.00943394 8.94533 8.19533 -1.68179
ILMN_2112638 SVEP1 0.00949015 9.66772 8.76293 -1.87227
ILMN_2388466 TIA1 0.00963804 9.76239 10.0842 1.2499
ILMN_1670708 F10 0.00964153 7.35944 7.07513 -1.21783
ILMN_2111187 ELOVL6 0.00964633 8.19628 8.9438 1.67891
ILMN_2193325 MMP23B 0.00971599 9.2495 8.52487 -1.65248
ILMN_1782403 PRR11 0.00986134 7.2805 7.59347 1.24226
ILMN_1684620 SPAG4 0.00990104 7.6295 7.99233 1.28595
ILMN_1724148 ORAI1 0.0100018 7.8675 7.60093 -1.20294
ILMN_2339266 LAMA2 0.0100112 8.1165 7.64833 -1.38335
ILMN_1736184 GSTM3 0.0100269 8.67006 8.24207 -1.34536
ILMN_1748018 GORASP2 0.0101863 9.705 10.0123 1.23736
ILMN_1775759 NRAS 0.0101897 8.60411 8.8862 1.21595
ILMN_1690209 C1orf186 0.0102351 7.825 7.439 -1.30677
ILMN_1768812 FXYD6 0.0103108 8.0655 7.6526 -1.33136
ILMN_1806037 TK1 0.0104983 8.52778 9.14953 1.53875
ILMN_2292646 GAD1 0.010597 7.38539 7.99273 1.52345
ILMN_1653129 CSTF2 0.0106175 9.06306 9.366 1.23366
ILMN_2089656 C1orf107 0.0106457 7.67022 7.9638 1.22568


ILMN_1708101 LMNB2 0.0109036 9.0545 9.62973 1.48992
ILMN_1657145 MEOX1 0.011012 7.51244 7.22213 -1.2229
ILMN_1657631 STAP2 0.0112558 7.6775 7.94967 1.20762
ILMN_2387471 FLJ22184 0.0114456 8.59783 9.28593 1.61116
ILMN_1799516 DNAJC9 0.0115601 9.60933 10.0392 1.34711
ILMN_2341952 MRPL35 0.0116518 9.11044 9.42633 1.24478
ILMN_1805842 FHL1 0.0116744 9.15122 8.36693 -1.72224
ILMN_1777325 STAT1 0.0117319 11.6301 12.2056 1.49018
ILMN_2395204 SLTM 0.011819 9.49 9.22473 -1.20186
ILMN_1710186 CCL17 0.0119379 7.47933 7.10773 -1.29379
ILMN_2381197 RNF19A 0.0121518 9.83106 10.3374 1.42045
ILMN_1728168 C20orf45 0.0122725 8.84517 9.2448 1.31917
ILMN_2184966 ZHX2 0.0123307 8.54778 8.2214 -1.25386
ILMN_1707336 ARPC4 0.0123995 9.84583 9.56247 -1.21703
ILMN_2358980 ILK 0.0124287 7.93711 7.65687 -1.2144
ILMN_1767113 AOX1 0.0124301 7.661 7.29573 -1.28812
ILMN_1728570 TCF21 0.0124842 8.65239 8.0066 -1.56459
ILMN_1689329 SCD 0.0125988 11.3408 12.0653 1.65239
ILMN_1719641 SMOC2 0.0126121 8.34678 7.74147 -1.52131
ILMN_1777190 CFD 0.0126318 10.8468 10.0095 -1.78679
ILMN_1806502 ZNF165 0.0127362 8.25483 8.60813 1.27748
ILMN_1801869 WDR75 0.0127803 10.1061 10.3902 1.21764
ILMN_2330861 SMC4 0.0130252 9.31561 9.74547 1.3471
ILMN_2143795 MGC4677 0.0131291 11.2512 11.7849 1.4477
ILMN_1687848 C7 0.0131604 8.79622 7.80673 -1.98548
ILMN_1813530 AGT 0.0132284 7.28978 7.65833 1.29106
ILMN_1666924 PINK1 0.0133032 7.99833 7.69273 -1.23593
ILMN_2392546 PAICS 0.0133523 10.1904 10.7519 1.47578


ILMN_1718866 C5orf46 0.0133578 7.02756 7.482 1.37025
ILMN_1719811 REM1 0.0134842 7.41194 7.14247 -1.20537
ILMN_2071809 MGP 0.0135813 13.0903 12.1575 -1.90907
ILMN_2330307 SLC43A3 0.013623 8.424 8.88113 1.37281
ILMN_1788019 LAMA2 0.013691 8.10678 7.7516 -1.27914
ILMN_1712452 KIF20B 0.013843 8.55128 8.90147 1.27473
ILMN_2247594 RPLP1 0.0138833 13.9798 13.7031 -1.21148
ILMN_2169801 TPSAB1 0.0141797 9.67678 8.78093 -1.8607
ILMN_1745607 A2M 0.0144182 11.3688 10.5003 -1.82585
ILMN_2129015 AFF1 0.014435 8.17578 7.89333 -1.21625
ILMN_1794594 RASGRP2 0.0146329 7.97178 7.445 -1.44071
ILMN_1660114 MMRN1 0.0147745 8.7025 7.8528 -1.80213
ILMN_1712806 AP1S1 0.0151027 8.48278 8.84667 1.28689
ILMN_1765801 GAA 0.0151062 8.32267 8.59913 1.21122
ILMN_1679797 ADARB1 0.0152259 10.4342 9.75053 -1.60618
ILMN_1748926 TMEM209 0.0152622 9.23778 9.54073 1.23367
ILMN_2198413 MYEOV 0.0152847 7.43306 7.90833 1.39019
ILMN_1721283 HSPB6 0.0154974 8.32239 7.7162 -1.52223
ILMN_1791949 PGBD1 0.0155125 7.49294 7.77967 1.21987
ILMN_1730223 RNF39 0.0157105 7.78111 8.3294 1.46235
ILMN_1665483 KIAA0020 0.0159636 9.54439 9.85147 1.2372
ILMN_2112301 DRAP1 0.0159817 10.3944 10.7641 1.29212
ILMN_2239754 IFIT3 0.0160131 7.54394 7.81187 1.20407
ILMN_1744835 MRPL21 0.0160309 9.95856 10.2909 1.25903
ILMN_2353633 EMR2 0.016184 8.14383 8.50533 1.28476
ILMN_1685661 RRP15 0.0162316 8.85772 9.15073 1.22519
ILMN_2160929 FEN1 0.0163983 9.01628 9.56187 1.45962
ILMN_1681741 C1orf31 0.0163997 7.69722 7.99007 1.22505


ILMN_1759910 SERPINA5 0.0165063 7.58133 7.03707 -1.45828
ILMN_1723123 FGFR3 0.0167252 10.1093 9.10647 -2.00398
ILMN_2088437 CX3CR1 0.016886 8.18517 7.70253 -1.39729
ILMN_2181191 TPI1 0.0169137 12.4067 12.793 1.30702
ILMN_1676822 C2orf40 0.0169408 8.30044 7.509 -1.73081
ILMN_2051972 GPC3 0.0169727 8.35983 7.86133 -1.41274
ILMN_1668247 LTC4S 0.0171524 7.72967 7.45353 -1.21095
ILMN_1791147 YPEL3 0.0172512 9.49439 9.22107 -1.20859
ILMN_1678191 GDF10 0.0174236 7.76528 7.37433 -1.31125
ILMN_1728071 KRAS 0.0175622 8.65711 9.005 1.2727
ILMN_1673111 TSEN34 0.0177413 9.89322 10.1771 1.21743
ILMN_2229649 KCTD12 0.0178955 7.13683 7.5794 1.35902
ILMN_1776157 SEPT4 0.017897 9.18117 8.71093 -1.38533
ILMN_1777322 FAM91A1 0.0179039 7.66739 8.0024 1.26139
ILMN_1664756 KPNA4 0.0181435 10.1913 10.5296 1.26424
ILMN_1673721 EXO1 0.0181663 7.71428 8.20173 1.40197
ILMN_1699603 MRPL12 0.0182077 9.10794 9.43053 1.25057
ILMN_2415926 THOC3 0.0182462 8.32594 8.65307 1.25451
ILMN_2064606 TBC1D2B 0.0182811 9.47211 9.1446 -1.25485
ILMN_1723358 SCARA3 0.0183251 7.82839 7.38453 -1.36023
ILMN_1733164 FBXO11 0.0183517 7.51083 7.7844 1.20879
ILMN_1781942 HMMR 0.0183743 7.65378 8.08327 1.34676
ILMN_1801257 CENPA 0.0185456 7.54356 8.001 1.37311
ILMN_2107613 RHOJ 0.0185904 7.96317 7.58387 -1.30071
ILMN_2106902 CHES1 0.0186279 9.91028 9.6132 -1.22865
ILMN_2202423 HELLS 0.0186493 7.54744 7.82567 1.2127
ILMN_1739154 LSAMP 0.0187562 7.77978 7.50873 -1.20668
ILMN_2068435 ZNF700 0.0187901 9.34533 9.61 1.20136


ILMN_1661264 SHMT2 0.0188737 10.7363 11.2717 1.4494
ILMN_2396272 PDCD4 0.0188894 11.4128 10.9136 -1.41341
ILMN_1745299 FABP7 0.0188916 7.21917 7.5486 1.25652
ILMN_1655561 ARPC3 0.0189357 10.8419 10.1553 -1.60957
ILMN_1652913 EZH2 0.0190383 7.15717 7.4286 1.20701
ILMN_1703955 FBXO32 0.0190685 9.08356 9.73973 1.5759
ILMN_2300664 CACNA1I 0.0191162 7.48083 7.17273 -1.23808
ILMN_2362245 HNRNPH2 0.0191478 7.51056 7.23147 -1.21343
ILMN_2256359 HSZFP36 0.0191648 8.54517 8.86467 1.2479
ILMN_1771084 ACSM3 0.0191755 8.22728 8.68407 1.37248
ILMN_1669502 E2F3 0.0193314 9.87783 10.2853 1.32639
ILMN_2410713 FGFR4 0.019353 7.929 7.61287 -1.24499
ILMN_1673522 MOCOS 0.0193923 8.17222 8.65627 1.39866
ILMN_2212909 MELK 0.0194221 8.15067 8.921 1.70566
ILMN_1670172 WDR33 0.0198278 10.1484 10.4911 1.26817
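The FoldChange column of Table S9 appears consistent with a signed fold change derived from the two cluster means, assuming those means are on a log2 scale (an assumption; the table itself does not state the scale). A minimal sketch of that relationship, with an illustrative function name:

```python
def signed_fold_change(mean_log2_cluster1, mean_log2_cluster2):
    """Signed fold change (cluster 2 / cluster 1) from two log2-scale means.

    Ratios below 1 are reported as the negative reciprocal, so a two-fold
    decrease is -2 rather than 0.5.
    """
    diff = mean_log2_cluster2 - mean_log2_cluster1
    ratio = 2.0 ** abs(diff)
    return ratio if diff >= 0 else -ratio


# Means copied from the NDRG2 and RAB10 rows of Table S9
fc_ndrg2 = signed_fold_change(8.83644, 7.99533)
fc_rab10 = signed_fold_change(11.5793, 11.9859)
print(round(fc_ndrg2, 3), round(fc_rab10, 3))
```

Under this assumption the two results agree with the tabulated -1.79143 and 1.32556 to within the rounding of the reported means.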


Table S10. Gene Set Enrichment Analysis of genes differentially expressed between high and low methylation clusters in the NCI microarray cohort.

GO ID  Enrichment Score  Enrichment p-value

SHEDDEN_LUNG_CANCER_POOR_SURVIVAL_A6 174.781 1.24E-76
ROSTY_CERVICAL_CANCER_PROLIFERATION_CLUSTER 135.578 1.32E-59
DODD_NASOPHARYNGEAL_CARCINOMA_DN 133.817 7.65E-59
GOBERT_OLIGODENDROCYTE_DIFFERENTIATION_UP 133.774 7.99E-59
SOTIRIOU_BREAST_CANCER_GRADE_1_VS_3_UP 130.431 2.26E-57
KOBAYASHI_EGFR_SIGNALING_24HR_DN 128.166 2.18E-56
KINSEY_TARGETS_OF_EWSR1_FLII_FUSION_UP 106.282 6.96E-47
CHANG_CYCLING_GENES 100.739 1.78E-44
DUTERTRE_ESTRADIOL_RESPONSE_24HR_UP 99.372 6.97E-44
VECCHI_GASTRIC_CANCER_EARLY_UP 93.7791 1.87E-41
BERENJENO_TRANSFORMED_BY_RHOA_UP 92.9275 4.39E-41
CHIANG_LIVER_CANCER_SUBCLASS_PROLIFERATION_UP 91.032 2.92E-40
BLUM_RESPONSE_TO_SALIRASIB_DN 90.8166 3.62E-40
NUYTTEN_EZH2_TARGETS_DN 86.936 1.75E-38
CAIRO_HEPATOBLASTOMA_CLASSES_UP 84.5726 1.86E-37
WHITEFORD_PEDIATRIC_CANCER_MARKERS 82.7828 1.12E-36
GRAHAM_CML_DIVIDING_VS_NORMAL_QUIESCENT_UP 81.2603 5.12E-36
CASORELLI_ACUTE_PROMYELOCYTIC_LEUKEMIA_DN 78.4024 8.92E-35
WINNEPENNINCKX_MELANOMA_METASTASIS_UP 77.4974 2.20E-34
PUJANA_CHEK2_PCC_NETWORK 75.3827 1.83E-33
BURTON_ADIPOGENESIS_3 75.1843 2.23E-33
KANG_DOXORUBICIN_RESISTANCE_UP 73.0771 1.83E-32
HORIUCHI_WTAP_TARGETS_DN 71.3091 1.07E-31
LEE_BMP2_TARGETS_DN 70.576 2.23E-31
MANALO_HYPOXIA_DN 69.8287 4.72E-31


CROONQUIST_IL6_DEPRIVATION_DN 69.6719 5.52E-31
ZHOU_CELL_CYCLE_GENES_IN_IR_RESPONSE_24HR 69.1603 9.21E-31
RODRIGUES_THYROID_CARCINOMA_ANAPLASTIC_UP 68.9441 1.14E-30
MARSON_BOUND_BY_E2F4_UNSTIMULATED 68.4227 1.92E-30
RODRIGUES_THYROID_CARCINOMA_POORLY_DIFFERENTIATED_UP 67.378 5.47E-30
ZHANG_TLX_TARGETS_60HR_DN 66.9013 8.81E-30
KONG_E2F3_TARGETS 66.7534 1.02E-29
LEE_EARLY_T_LYMPHOCYTE_UP 66.4752 1.35E-29
BENPORATH_ES_1 66.0369 2.09E-29
BENPORATH_CYCLING_GENES 65.9918 2.19E-29
PUJANA_BRCA2_PCC_NETWORK 64.9239 6.37E-29
GRAHAM_NORMAL_QUIESCENT_VS_NORMAL_DIVIDING_DN 63.9035 1.77E-28
WONG_EMBRYONIC_STEM_CELL_CORE 62.9129 4.76E-28
RUIZ_TNC_TARGETS_DN 62.3114 8.68E-28
FOURNIER_ACINAR_DEVELOPMENT_LATE_2 62.0628 1.11E-27
BENPORATH_PROLIFERATION 61.0611 3.03E-27
WU_APOPTOSIS_BY_CDKN1A_VIA_TP53 60.8313 3.81E-27
HOFFMANN_LARGE_TO_SMALL_PRE_BII_LYMPHOCYTE_UP 60.1057 7.88E-27
TANG_SENESCENCE_TP53_TARGETS_DN 59.7599 1.11E-26
LINDGREN_BLADDER_CANCER_CLUSTER_3_UP 59.2057 1.94E-26
BASAKI_YBX1_TARGETS_UP 57.7731 8.12E-26
LI_WILMS_TUMOR_VS_FETAL_KIDNEY_1_DN 57.3837 1.20E-25
NAKAYAMA_SOFT_TISSUE_TUMORS_PCA2_UP 57.3526 1.24E-25
CROONQUIST_NRAS_SIGNALING_DN 56.4082 3.18E-25
PATIL_LIVER_CANCER 55.9997 4.78E-25
GAVIN_FOXP3_TARGETS_CLUSTER_P6 55.9853 4.85E-25
ZHAN_MULTIPLE_MYELOMA_PR_UP 55.1541 1.11E-24
FUJII_YBX1_TARGETS_DN 54.9502 1.37E-24


ZHOU_CELL_CYCLE_GENES_IN_IR_RESPONSE_6HR 54.8402 1.52E-24
PUJANA_BRCA1_PCC_NETWORK 54.0472 3.37E-24
CHICAS_RB1_TARGETS_GROWING 53.059 9.05E-24
SENGUPTA_NASOPHARYNGEAL_CARCINOMA_UP 52.6852 1.32E-23
WANG_RESPONSE_TO_GSK3_INHIBITOR_SB216763_DN 51.5725 4.00E-23
SMID_BREAST_CANCER_BASAL_UP 51.2458 5.55E-23
ISHIDA_E2F_TARGETS 50.8593 8.17E-23
CHEMNITZ_RESPONSE_TO_PROSTAGLANDIN_E2_UP 50.2648 1.48E-22
WEI_MYCN_TARGETS_WITH_E_BOX 49.2526 4.07E-22
BOYAULT_LIVER_CANCER_SUBCLASS_G3_UP 47.4625 2.44E-21
FARMER_BREAST_CANCER_CLUSTER_2 47.0684 3.62E-21
REACTOME_CELL_CYCLE 46.7222 5.11E-21
GOLDRATH_ANTIGEN_RESPONSE 46.4355 6.81E-21
VANTVEER_BREAST_CANCER_METASTASIS_DN 44.9046 3.15E-20
RHEIN_ALL_GLUCOCORTICOID_THERAPY_DN 44.7416 3.71E-20
POOLA_INVASIVE_BREAST_CANCER_UP 44.7303 3.75E-20
REACTOME_CELL_CYCLE_MITOTIC 44.6465 4.08E-20
WHITFIELD_CELL_CYCLE_LITERATURE 44.4631 4.90E-20
ZHANG_BREAST_CANCER_PROGENITORS_UP 44.4331 5.05E-20
MORI_IMMATURE_B_LYMPHOCYTE_DN 44.0808 7.18E-20
SARRIO_EPITHELIAL_MESENCHYMAL_TRANSITION_UP 43.9296 8.35E-20
MARKEY_RB1_ACUTE_LOF_DN 43.7016 1.05E-19
MITSIADES_RESPONSE_TO_APLIDIN_DN 43.1135 1.89E-19
ZHENG_GLIOBLASTOMA_PLASTICITY_UP 42.9846 2.15E-19
FURUKAWA_DUSP6_TARGETS_PCI35_DN 42.7563 2.70E-19
CHICAS_RB1_TARGETS_SENESCENT 41.3997 1.05E-18
ODONNELL_TFRC_TARGETS_DN 41.2098 1.27E-18
MORI_PRE_BI_LYMPHOCYTE_UP 40.9632 1.62E-18


WEST_ADRENOCORTICAL_TUMOR_UP 39.8816 4.78E-18
FEVR_CTNNB1_TARGETS_DN 39.4005 7.74E-18
JOHNSTONE_PARVB_TARGETS_3_DN 38.9697 1.19E-17
EGUCHI_CELL_CYCLE_RB1_TARGETS 37.9033 3.46E-17
BURTON_ADIPOGENESIS_PEAK_AT_24HR 37.8325 3.71E-17
TIEN_INTESTINE_PROBIOTICS_24HR_UP 37.4569 5.40E-17
LE_EGR2_TARGETS_UP 36.9664 8.82E-17
GRADE_COLON_AND_RECTAL_CANCER_UP 36.8144 1.03E-16
PUJANA_XPRSS_INT_NETWORK 36.3316 1.66E-16
MOLENAAR_TARGETS_OF_CCND1_AND_CDK4_DN 35.4637 3.97E-16
RHODES_UNDIFFERENTIATED_CANCER 35.1727 5.30E-16
KRIEG_HYPOXIA_NOT_VIA_KDM3A 33.8351 2.02E-15
YANG_BCL3_TARGETS_UP 33.1939 3.84E-15
ZHANG_TLX_TARGETS_UP 33.021 4.56E-15
DUTERTRE_ESTRADIOL_RESPONSE_6HR_UP 32.7921 5.74E-15
TARTE_PLASMA_CELL_VS_PLASMABLAST_DN 32.5157 7.56E-15
TOYOTA_TARGETS_OF_MIR34B_AND_MIR34C 32.4819 7.82E-15
AMUNDSON_GAMMA_RADIATION_RESPONSE 32.256 9.80E-15
LINDGREN_BLADDER_CANCER_CLUSTER_1_DN 32.05 1.20E-14
KAMMINGA_EZH2_TARGETS 31.8556 1.46E-14
WILCOX_PRESPONSE_TO_ROGESTERONE_UP 31.7794 1.58E-14
MORI_LARGE_PRE_BII_LYMPHOCYTE_UP 30.9849 3.49E-14
NAKAMURA_TUMOR_ZONE_PERIPHERAL_VS_CENTRAL_UP 30.9835 3.50E-14
CROONQUIST_NRAS_VS_STROMAL_STIMULATION_DN 30.9478 3.63E-14
REACTOME_DNA_REPLICATION 30.8656 3.94E-14
JAEGER_METASTASIS_UP 30.7252 4.53E-14
BIDUS_METASTASIS_UP 30.3811 6.39E-14
CUI_TCF21_TARGETS_2_UP 30.0014 9.34E-14


ZHANG_TLX_TARGETS_36HR_DN 29.5982 1.40E-13
ACEVEDO_LIVER_TUMOR_VS_NORMAL_ADJACENT_TISSUE_UP 29.2883 1.91E-13
REACTOME_MITOTIC_M_M_G1_PHASES 29.0708 2.37E-13
LEE_LIVER_CANCER_SURVIVAL_DN 28.6977 3.44E-13
DELPUECH_FOXO3_TARGETS_DN 28.5718 3.90E-13
DANG_MYC_TARGETS_UP 28.5239 4.09E-13
AFFAR_YY1_TARGETS_DN 28.2654 5.30E-13
MUELLER_PLURINET 27.8975 7.66E-13
RHODES_CANCER_META_SIGNATURE 27.8157 8.31E-13
VANTVEER_BREAST_CANCER_ESR1_DN 27.674 9.58E-13
ODONNELL_TARGETS_OF_MYC_AND_TFRC_DN 27.2193 1.51E-12
REICHERT_MITOSIS_LIN9_TARGETS 27.1112 1.68E-12
COLINA_TARGETS_OF_4EBP1_AND_4EBP2 26.8764 2.13E-12
KRIGE_RESPONSE_TO_TOSEDOSTAT_24HR_DN 26.8519 2.18E-12
SCIAN_CELL_CYCLE_TARGETS_OF_TP53_AND_TP73_DN 26.5418 2.97E-12
OXFORD_RALA_OR_RALB_TARGETS_UP 26.2992 3.79E-12
WHITFIELD_CELL_CYCLE_G2_M 26.1871 4.24E-12
SONG_TARGETS_OF_IE86_CMV_PROTEIN 25.9853 5.18E-12
SHEPARD_BMYB_MORPHOLINO_DN 25.8651 5.85E-12
DANG_BOUND_BY_MYC 25.7543 6.53E-12
KAUFFMANN_MELANOMA_RELAPSE_UP 25.7419 6.61E-12
GRAHAM_CML_QUIESCENT_VS_NORMAL_QUIESCENT_UP 25.5869 7.72E-12
ALCALAY_AML_BY_NPM1_LOCALIZATION_DN 25.5785 7.79E-12
PUJANA_BRCA_CENTERED_NETWORK 25.4478 8.88E-12
FARMER_BREAST_CANCER_BASAL_VS_LULMINAL 25.4393 8.95E-12
BHATTACHARYA_EMBRYONIC_STEM_CELL 25.2227 1.11E-11
BOYAULT_LIVER_CANCER_SUBCLASS_G23_UP 25.1769 1.16E-11
YU_MYC_TARGETS_UP 25.0799 1.28E-11


GREENBAUM_E2A_TARGETS_UP 24.9981 1.39E-11
KIM_WT1_TARGETS_DN 24.8385 1.63E-11
MISSIAGLIA_REGULATED_BY_METHYLATION_DN 24.722 1.83E-11
CHANG_CORE_SERUM_RESPONSE_UP 24.6566 1.96E-11
PUJANA_BREAST_CANCER_WITH_BRCA1_MUTATED_UP 24.1552 3.23E-11
NAKAMURA_CANCER_MICROENVIRONMENT_DN 23.8905 4.21E-11
FERREIRA_EWINGS_SARCOMA_UNSTABLE_VS_STABLE_UP 23.4043 6.85E-11
BILD_MYC_ONCOGENIC_SIGNATURE 23.3321 7.36E-11
SERVITJA_LIVER_HNF1A_TARGETS_UP 22.9924 1.03E-10
PID_AURORA_B_PATHWAY 22.942 1.09E-10
NADERI_BREAST_CANCER_PROGNOSIS_UP 22.8214 1.23E-10
FRASOR_RESPONSE_TO_SERM_OR_FULVESTRANT_DN 22.8214 1.23E-10
CREIGHTON_ENDOCRINE_THERAPY_RESISTANCE_1 22.475 1.73E-10
ACEVEDO_LIVER_CANCER_UP 22.4306 1.81E-10
PID_MYC_ACTIVPATHWAY 22.0408 2.68E-10
GAL_LEUKEMIC_STEM_CELL_DN 21.8128 3.36E-10
SHEPARD_CRUSH_AND_BURN_MUTANT_DN 21.5507 4.37E-10
LY_AGING_OLD_DN 21.3984 5.09E-10
SMIRNOV_RESPONSE_TO_IR_6HR_DN 21.3651 5.26E-10


Table S11. Ingenuity Pathway Analysis of Upstream Regulators among genes differentially expressed between High and Low Methylation clusters in the NCI microarray cohort.

Upstream Regulator   Molecule Type             Predicted Activation State   Activation z-score   p-value of overlap (1)
TP53                 transcription regulator   Inhibited                    -4.40                2.03E-20
E2F1                 transcription regulator   Activated                    2.19                 7.00E-15
MYC                  transcription regulator   Activated                    3.96                 1.60E-12
CCND1                transcription regulator   Activated                    3.18                 1.25E-11
FOXM1                transcription regulator   Activated                    3.90                 7.67E-09
RB1                  transcription regulator   Inhibited                    -3.67                6.17E-08
SMARCB1              transcription regulator   Inhibited                    -2.85                7.29E-07

(1) The overlap p-value calls likely upstream regulators based on significant overlap between dataset genes and known targets regulated by a transcription regulator. It is calculated using Fisher's Exact Test.
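The overlap p-value described in the footnote can be illustrated with a small hypergeometric calculation. The sketch below assumes a right-tailed Fisher's exact test (the footnote does not say whether the test is one- or two-sided), and all counts in the example are hypothetical rather than taken from the study.

```python
from math import comb


def overlap_p_value(universe, targets, dataset, overlap):
    """Right-tailed Fisher's exact (hypergeometric) p-value for a gene overlap.

    universe: number of genes considered overall
    targets:  known targets of the regulator within the universe
    dataset:  differentially expressed genes within the universe
    overlap:  genes that are both targets and in the dataset
    """
    total = comb(universe, dataset)
    upper = min(targets, dataset)
    # Probability of observing an overlap at least as large as the one seen
    tail = sum(
        comb(targets, k) * comb(universe - targets, dataset - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy counts: 10 genes, 4 targets, 3 dataset genes, 2 shared
p_example = overlap_p_value(universe=10, targets=4, dataset=3, overlap=2)
print(round(p_example, 4))
```

With the toy counts, 40 of the 120 possible gene draws contain at least two targets, so the example p-value is one third.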


Table S12. miRNA expression differences between high and low methylation clusters in the NCI microarray cohort.

microRNA ID  Fold Change (Cluster 2/Cluster 1)  p-value

hsa-miR-96 1.95469 0.0183541
hsa-miR-210 1.80922 0.0356865
hsa-miR-200c 1.60149 0.049909
hsa-miR-20a+hsa-miR-20b 1.52342 0.00930698
hsa-miR-21 1.3581 0.0452685
hsa-miR-24 1.19994 0.0336861
hsa-miR-140-5p -1.28846 0.0155375
hsa-miR-328 -1.30798 0.0254302
hsa-miR-33a -1.3202 0.0374737
hsa-miR-29c -1.32608 0.0187135
hsa-miR-520d-3p -1.40123 0.0157468
hsa-miR-337-3p -1.40365 0.0375492
hsa-miR-195 -1.48724 0.0168569
hsa-let-7c -1.51163 0.00145477
hsa-miR-369-3p -1.52434 0.0110974
hsa-miR-497 -1.55277 0.0224112
hsa-miR-10b -1.68698 0.00523345
hsa-miR-99a -1.71531 0.00476387
hsa-miR-1 -1.87246 0.00499649
hsa-miR-218 -1.94809 0.00103967

NOTE: bold indicates that microRNA is differentially-expressed in NCI and Japan cohorts.


Table S13. Univariable and Multivariable Cox Regression of HOXA9 promoter methylation in two cohorts.

Variable                Comparison       N     Univariable HR (95% CI)   P       Multivariable (a) HR (95% CI)   P

NCI/Norway cohort
HOXA9 methylation (b)   ≥ 40% / < 40%    99    2.30 (1.01-4.79)          0.03    3.66 (1.46-9.22)                0.006
TNM 7th Stage (c)       IB/IA            93    1.36 (0.64-2.88)          0.42    0.81 (0.32-2.05)                0.65
Smoking                 Ever/Never       92    2.02 (0.28-14.9)          0.45    1.09 (0.08-14.5)                0.95
Packyear                ≥ 20 / < 20      89    1.57 (0.60-4.10)          0.36    1.60 (0.39-6.49)                0.51
Age, y                  Continuous       99    0.99 (0.96-1.03)          0.59    0.99 (0.95-1.04)                0.83
Sex                     Female/Male      99    0.73 (0.36-1.48)          0.38    1.11 (0.44-2.81)                0.83

Japan cohort
HOXA9 methylation (b)   ≥ 40% / < 40%    113   3.21 (1.38-7.44)          0.007   3.02 (1.28-7.16)                0.01
TNM 7th Stage           IB/IA            113   3.41 (1.55-7.49)          0.002   3.83 (1.70-8.62)                0.001
Smoking                 Ever/Never       113   1.05 (0.48-2.32)          0.90    0.45 (0.10-2.13)                0.32
Packyear                ≥ 20 / < 20      113   1.57 (0.69-3.56)          0.28    4.56 (0.82-25.3)                0.08
Age, y                  Continuous       113   0.99 (0.94-1.06)          0.96    0.99 (0.93-1.05)                0.74

NOTE: Bold, significant values < 0.05.
(a) Adjusted for smoking history, sex and age, as well as stage, race, adjuvant therapy, and cohort membership when appropriate.
(b) HOXA9 methylation values were dichotomized based on ≥ 40%/< 40% mean methylation.
(c) Upon restaging to AJCC 7th edition, there were 6 cases in the Norway cohort for which it could not be distinguished whether they were TNM stage IB or II. These are included in univariable analyses and excluded in multivariable analyses.
N: The number of available data for a particular variable in the univariable analysis.
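A reported hazard ratio can be checked against its confidence interval. If the intervals in Table S13 are Wald-type (an assumption; the table does not say how they were computed), they are symmetric on the log scale, so the geometric midpoint of the bounds should be close to the HR and the interval width gives the standard error of the log hazard ratio. A minimal sketch with an illustrative function name:

```python
from math import exp, log


def wald_summary(lower, upper, z=1.96):
    """Recover the implied HR and SE(log HR) from a 95% Wald interval."""
    # Midpoint of the interval on the log scale, transformed back to an HR
    log_hr = (log(lower) + log(upper)) / 2.0
    # Half-width on the log scale divided by the normal quantile
    se = (log(upper) - log(lower)) / (2.0 * z)
    return exp(log_hr), se


# Japan cohort, univariable HOXA9 methylation: HR 3.21 (1.38-7.44)
hr_mid, se_log_hr = wald_summary(1.38, 7.44)
print(round(hr_mid, 2), round(se_log_hr, 2))
```

For this row the implied midpoint is about 3.20, in line with the reported 3.21; a large gap between the two in any row would suggest a transcription error in the HR or its bounds.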


Table S14. Univariable and Multivariable Cox Regression of 4-protein-coding gene classifier, miR-21 expression and HOXA9 promoter methylation in two cohorts and their overall combination.

Variable                          Comparison       N     Univariable HR (95% CI)   P        Multivariable (a) HR (95% CI)   P

NCI/Norway cohort
HOXA9 (Pyrosequencing) (b)        ≥ 40% / < 40%    99    2.30 (1.01-4.79)          0.03     3.50 (1.26-9.73)                0.02
miR-21 (Nanostring) (c)           High/Low         91    2.43 (1.14-5.21)          0.02     2.70 (0.85-8.52)                0.09
4-Gene classifier (qRT-PCR) (c)   High/Low         91    2.84 (1.29-6.22)          0.009    3.19 (1.21-8.43)                0.02
TNM 7th Stage (d)                 IB/IA            93    1.36 (0.64-2.88)          0.42     0.87 (0.32-2.35)                0.79
Smoking                           Ever/Never       92    2.02 (0.28-14.9)          0.49     1.86 (0.15-22.5)                0.63
Packyear                          ≥ 20 / < 20      89    1.57 (0.60-4.10)          0.36     0.92 (0.79-9.41)                0.11

Japan cohort
HOXA9 (Pyrosequencing)            ≥ 40% / < 40%    113   3.21 (1.38-7.44)          0.007    2.26 (0.91-5.64)                0.08
miR-21 (Nanostring)               High/Low         113   3.74 (1.49-9.38)          0.005    1.39 (0.46-4.23)                0.56
4-Gene classifier (qRT-PCR)       High/Low         113   4.57 (1.71-12.2)          0.002    4.20 (1.37-12.9)                0.01
TNM 7th Stage                     IB/IA            113   3.41 (1.55-7.49)          0.002    3.82 (1.59-9.16)                0.003
Smoking                           Ever/Never       113   1.05 (0.48-2.32)          0.90     0.46 (0.10-2.19)                0.33
Packyear                          ≥ 20 / < 20      113   1.57 (0.69-3.56)          0.28     5.00 (0.96-26.1)                0.06

Combined cohort
HOXA9 (Pyrosequencing)            ≥ 40% / < 40%    212   2.20 (1.31-3.72)          0.003    2.71 (1.40-5.25)                0.003
miR-21 (Nanostring)               High/Low         204   2.90 (1.62-5.19)          3.E-04   2.24 (1.12-4.51)                0.02
4-Gene classifier (qRT-PCR)       High/Low         204   3.43 (1.87-6.30)          7.E-05   3.24 (1.61-6.56)                0.001
TNM 7th Stage                     IB/IA            206   2.22 (1.29-3.81)          0.004    1.75 (0.95-3.24)                0.07
Smoking                           Ever/Never       205   1.56 (0.87-2.83)          0.14     0.79 (0.29-2.20)                0.66
Packyear                          ≥ 20 / < 20      202   1.82 (1.05-3.16)          0.03     1.75 (0.66-4.64)                0.26


NOTE: Bold indicates significant values (P < 0.05).

^a Adjusted for sex, stage and smoking history, as well as race, therapy, and cohort membership when appropriate.

^b HOXA9 methylation values were dichotomized based on ≥ 40%/< 40% mean methylation in pyrosequencing analysis.

^c The 4-coding gene classifier and noncoding miR-21 were each categorized on the basis of the median.

^d Upon restaging to AJCC 7th edition, there were 6 cases in the Norway cohort for which it could not be distinguished whether they were TNM stage IB or II. These are included in univariable analyses and excluded from multivariable analyses.

N: The number of cases with available data for a particular variable in the univariable analysis.
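The two cut-off rules stated in the note (a fixed 40% threshold for HOXA9 methylation and a median split for the expression-based markers) can be illustrated with a minimal Python sketch. The function names and the example values are hypothetical and are not taken from the study data. The source does not say how values exactly equal to the median were assigned; the sketch assumes they fall in the "Low" group.

```python
from statistics import median

def dichotomize_by_cutoff(values, cutoff=40.0):
    """Label a value 'High' if it is >= cutoff (e.g. >= 40% mean methylation), else 'Low'."""
    return ["High" if v >= cutoff else "Low" for v in values]

def dichotomize_by_median(values):
    """Label a value 'High' if it is above the sample median, else 'Low'.

    Assumption: values equal to the median are labelled 'Low';
    the source does not specify how ties were handled.
    """
    m = median(values)
    return ["High" if v > m else "Low" for v in values]

# Hypothetical example values, for illustration only.
methylation_pct = [12.5, 40.0, 55.3, 39.9]
expression = [0.8, 1.2, 2.5, 3.1]

print(dichotomize_by_cutoff(methylation_pct))
print(dichotomize_by_median(expression))
```

With a fixed threshold the group sizes depend on the data, whereas a median split yields two groups of roughly equal size by construction.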


Table S15. Univariable and Multivariable Cox Regression of High combined 4-gene classifier, miR-21 expression and HOXA9 methylation in the combined NCI/Norway and Japan cohorts.

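| Cohort | Stage | N | Univariable HR (95% CI) | Univariable P | Multivariable^a HR (95% CI) | Multivariable^a P |
|---|---|---|---|---|---|---|
| Combined cohort | Stage I | 30 | 7.09 (2.80-17.9) | 3E-05 | 13.5 (4.50-40.5) | 3E-06 |
| Combined cohort | Stage IA | 16 | 6.98 (1.80-27.1) | 0.05 | 7.95 (1.72-36.9) | 0.008 |
| Combined cohort | Stage IB | 13 | 6.56 (1.82-23.7) | 0.004 | 15.6 (2.87-84.9) | 0.001 |
| Combined cohort (therapy naïve) | Stage I | 27 | 7.28 (2.66-20.0) | 1E-04 | 10.2 (3.43-30.3) | 3E-05 |
| Combined cohort (therapy naïve) | Stage IA | 14 | 6.08 (1.51-24.4) | 0.011 | 8.01 (1.72-37.3) | 0.008 |
| Combined cohort (therapy naïve) | Stage IB | 12 | 7.64 (1.66-35.3) | 0.009 | 7.36 (1.40-38.7) | 0.018 |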

NOTE: Bold indicates significant values (P < 0.05).

^a Adjusted for smoking history, sex and age, race, and cohort membership, as well as stage and adjuvant therapy, when appropriate.

The 4-coding gene classifier and noncoding miR-21 were each categorized on the basis of the median. HOXA9 methylation values were dichotomized based on ≥ 40%/< 40% mean methylation in pyrosequencing analysis.

N: The number of cases with a high combined score of 4-gene classifier, miR-21 and HOXA9 promoter methylation.
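As a reading aid for the HR (95% CI) and P columns, the sketch below shows how a hazard ratio and its confidence interval relate to an approximate two-sided p-value under a Wald test on the log scale. This is an assumption made for illustration: the source does not state which test produced the tabulated p-values, and the function name `wald_p_from_ci` is hypothetical. The input is the univariable Stage I row of the combined cohort, 7.09 (2.80-17.9).

```python
import math

def wald_p_from_ci(hr, lower, upper, z_crit=1.96):
    """Approximate SE, z and two-sided p from a hazard ratio and its 95% CI.

    Assumes the CI is symmetric on the log scale, i.e. it was built as
    exp(log(HR) +/- z_crit * SE). Rounding in the table limits precision.
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

se, z, p = wald_p_from_ci(7.09, 2.80, 17.9)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.1e}")
```

For this row the approximation gives a p-value on the order of 10^-5, in line with the tabulated 3E-05. Because the tabulated values are rounded, such a check can only be expected to agree approximately.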