Modulation of RNA Cytosine-5 Methylation by Neuronal Activity

and Methyl-donor Folate

Xiguang Xu

Dissertation submitted to the faculty of the Virginia Polytechnic Institute and State University

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

In

Biological Sciences

Hehuang Xie, Chair

Liwu Li

Kenneth Oestreich

Michael Fox

May 12, 2020

Blacksburg, Virginia

Keywords: RNA cytosine-5 methylation, RNA bisulfite sequencing, neuronal activity, neural

stem cell, folic acid

Copyright© 2020, Xiguang Xu

ABSTRACT

RNA epigenetics or Epitranscriptomics has emerged as a new field for understanding the

post-transcriptional regulation of gene expression by RNA modifications. Among numerous types

of RNA modifications, RNA cytosine-5 methylation (5-mrC) is recognized as an important

epitranscriptomic mark that modulates mRNA transportation, stability and translation.

In chapter 1, we summarize the currently available approaches to detect 5-mrC

modification at global, transcriptome-wide and locus-specific levels, and compare the

corresponding advantages and disadvantages of the techniques. We further focus on the

bioinformatics data analysis of RNA bisulfite sequencing datasets by comparing existing packages

with respect to key parameters for alignment and methylation calling and filtering of potentially

false positive 5-mrC sites.

To investigate the dynamic regulation of 5-mrC modification, as described in chapter 2, we adopt a widely used neuronal activity model and perform RNA sequencing (RNA-seq) and RNA bisulfite sequencing (RNA BS-seq) to profile gene expression and transcriptome-wide 5-mrC modification. We identify distinct gene expression profiles and differentially methylated 5-mrC sites (DMS) in neurons upon activation, and the genes containing DMS are enriched for mitochondrial and synaptic functions. Moreover, the analysis reveals a negative correlation between RNA methylation and mRNA expression in mouse cortical neurons during neuronal activity. Together, these findings show that 5-mrC modification is dynamically regulated during neuronal activity and suggest a potential link between RNA methylation and mRNA expression.

In chapter 3, we investigate the effect of folate, a methyl donor, on RNA cytosine-5 methylation (5-mrC) in adult mouse neural stem cells (NSCs). Compared to the control, NSCs cultured under folate-deficient or folate-supplemented conditions show no changes in mRNA expression but significant changes in mRNA translation efficiency. RNA bisulfite sequencing of both total and polysome poly(A) RNA samples shows distinct 5-mrC profiles in NSCs treated with different concentrations of folic acid. It also shows that polysome mRNAs are consistently hypermethylated relative to total mRNAs. This study presents a comprehensive view of the influence of folate deficiency and supplementation on RNA cytosine-5 methylation and mRNA translation.

GENERAL AUDIENCE ABSTRACT

RNA epigenetics, a collection of RNA modifications, has recently emerged as an exciting,

new field for understanding post-transcriptional regulation of gene expression. RNA cytosine-5

methylation (5-mrC) is one of the most well-known RNA modifications that modulates mRNA

export, stability and translation.

In the first chapter, we summarize the currently available methods for measuring 5-mrC modification. We highlight one of these techniques, RNA bisulfite sequencing (RNA BS-seq), and focus on the bioinformatics analysis of RNA BS-seq datasets, comparing several existing tools with regard to the key parameters in data analysis.

In the second chapter, we adopt a widely used neuronal activity model to study the dynamic

regulation of RNA cytosine-5 methylation (5-mrC). We perform RNA-seq and RNA BS-seq in

neurons in response to stimulation. We have identified numerous differentially expressed genes

and differentially methylated 5-mrC sites (DMS) in activated neurons and find that these DMS-related

genes are associated with mitochondrial and synaptic functions. Furthermore, we identify a

negative correlation between RNA methylation and mRNA expression, indicating a potential role

of 5-mrC modification in the regulation of mRNA expression.

In the third chapter, we investigate the influence of a nutrient supplement, folic acid, on 5-mrC modification in adult mouse neural stem cells (NSCs). Compared to the control, NSCs cultured under folate-deficient or folate-supplemented conditions show no changes in mRNA expression but significant changes in mRNA translation efficiency. We perform RNA bisulfite sequencing of both total poly(A) RNA samples and polysome poly(A) RNA samples and identify distinct 5-mrC profiles in NSCs treated with different concentrations of folic acid. Polysome mRNAs are consistently hypermethylated relative to total mRNAs. This study presents a comprehensive view of the influence of folate deficiency and supplementation on RNA cytosine-5 methylation and mRNA translation.

ACKNOWLEDGMENTS

First and foremost, I thank my advisor, Dr. Hehuang David Xie, for his invaluable guidance

and support throughout my Ph.D. journey. I truly appreciate the opportunity to explore science in

the exciting field of epigenetics (DNA methylation) and the emerging field of epitranscriptomics

(RNA methylation) with the state-of-the-art Next Generation Sequencing (NGS) techniques. During

the training process, I have gained a lot of experience in library construction for high-throughput

sequencing. In addition to the essential technique of library construction, I have learned how to

develop critical thinking on biological questions, to learn leadership in the research group, and to

transform from a dependent Ph.D. student to an independent researcher. I know there is still a long

way to go. Dr. Xie is leading me on that way. Thank you!

My special thanks go to my co-advisor, Dr. Liwu Li, who has been generously offering

help and support, giving me priceless advice on research, sharing his research experiences and

offering help in my defense. I thank my committee members, Dr. Michael Fox and Dr. Kenneth

Oestreich, for their insightful feedback, comments and suggestions on my research and writing.

I’m grateful to have such a dedicated committee that has been guiding me in each stage during my

entire Ph.D. program.

During the research, I have received a lot of help from our collaborators. I thank Dr. James

Smyth and his Ph.D. student Rachel Padget for their generous help in polysome fractionation

preparation. I thank Dr. Michelle Theus and Dr. Xinyu Zhao for their advice in adult mouse neural

stem cell isolation and culture. I thank Dr. Alicia Pickrell for her help in the preparation of

lentivirus for knockdown experiments. I thank Dr. Michael Fox and his former Ph.D. student

Aboozar Monavarfeshani for their collaboration in the paper “Retinal-input-induced epigenetic

dynamics in the developing mouse dorsal lateral geniculate nucleus”.

I thank my current and previous lab members. I thank Xiaoran Wei, Natalie Melville and

Zachary Johnson for their generous time and effort in bioinformatics data analysis. I thank Alex

Murray, Razan Alajoleen, Dr. Jiayi Fan for their help in experiments. I thank our former lab

members, Dr. Ming-an Sun, Dr. Zhixiong Sun, Jianlin He, and Dr. Sharmi Banerjee for their help

in bioinformatics analysis and experiments. I thank the undergraduates that I worked with, Niki

Armstrong, Karen Huang, Megan Harrigan, for their curiosity in science and help in experiments.

I thank Amanda Wang for her help in editing my writing. It’s my pleasure to work with them.

I thank my friends in the Blacksburg Chinese Community: Johnny Yu, Dr. Y.A. Liu, Ziwei

Zuo, Waifong Chan, Qiang Li, Xiaoqi Li, Yu Zhou, Ming Xie, Yuchang Wu, …, a long list. I

received so much help and support from this community. We have had very impressive fellowship

and reunion time during the past years and will continue the friendship in the future. Friendship is

an indispensable part of my Ph.D. life in Blacksburg.

Lastly, I thank my family: my wife, my parents, my elder brother and sister, and my

parents-in-law, who supported my academic pursuits and provided help at every stage of my

personal life. My special thanks go to my wife, Yanan Jiao, and my two lovely kids, Jeremy and Jasper.

You’re my endless source of happiness and inspiration. I love you all!

Table of Contents

ABSTRACT
GENERAL AUDIENCE ABSTRACT
ACKNOWLEDGMENTS
Table of Contents
List of Figures
List of Tables
List of Abbreviations

Chapter 1 - Advances in Methods and Software for RNA Cytosine Methylation Analysis
  1.1 Abstract
  1.2 Background
  1.3 Techniques for the detection of RNA Cytosine-5 methylation
    1.3.1 Global assessment of the 5-mrC level
    1.3.2 Transcriptome-wide approaches to generate 5-mrC profiles
    1.3.3 Locus-specific approaches to determine methylation within a given mRNA
  1.4 Data analysis for RNA cytosine-5 methylation studies
    1.4.1 Shared steps for RNA bisulfite sequencing data analysis
    1.4.2 Comparison of existing tools for RNA bisulfite sequencing data analysis
  1.5 Conclusions and Future Perspectives
  1.6 References

Chapter 2 - Neuronal Activity Modifies RNA Cytosine-5 Methylation Landscape in Mouse Cortical Neuron
  2.1 Abstract
  2.2 Background
  2.3 Methods
  2.4 Results
    2.4.1 Distinct gene expression profile upon neuronal activation
    2.4.2 Distribution profile of 5-mrC in mouse cortical neurons
    2.4.3 Dynamic 5-mrC landscape upon neuronal activation
    2.4.4 RNA methylation negatively correlates with mRNA expression in neurons upon neuronal activation
  2.5 Discussion
  2.6 Supplementary data
    Supplementary Figure 1. Reproducibility between replicates in RNA-seq datasets
    Supplementary Figure 2. GO annotation of differentially expressed genes in neurons upon activation
    Supplementary Figure 3. Expression profile of late response genes
    Supplementary Figure 4. Reproducibility between replicates in RNA BS-seq datasets
    Supplementary Figure 5. GO annotation of mRNAs containing 5-mrC sites
  2.7 References

Chapter 3 - Influence of Folate on RNA Cytosine-5 Methylation in Neural Stem Cells
  3.1 Abstract
  3.2 Background
  3.3 Methods
  3.4 Results
    3.4.1 Distribution profile of 5-mrC in total mRNAs in adult mouse neural stem cells
    3.4.2 Folate induces changes in total mRNA methylation in adult mouse neural stem cells
    3.4.3 Distribution profile of 5-mrC in polysome mRNAs in adult mouse neural stem cells
    3.4.4 Folate induces changes in polysome mRNA methylation in adult mouse neural stem cells
    3.4.5 Distinct 5-mrC profile in total and polysome mRNA in adult mouse neural stem cells
    3.4.6 Folate induces changes in mRNA translation in adult mouse neural stem cells
  3.5 Discussion
  3.6 Supplementary data
    Supplementary Figure 1. Reproducibility of 5-mrC sites between replicates in total poly(A) RNA BS-seq datasets
    Supplementary Figure 2. GO annotation of 5-mrC containing mRNAs in NSCs
    Supplementary Figure 3. Schematic diagram of polysome fractionation
    Supplementary Figure 4. Reproducibility of 5-mrC sites between replicates in polysome poly(A) RNA BS-seq datasets
    Supplementary Figure 5. Reproducibility between replicates in total and polysome poly(A) RNA-seq datasets
  3.7 References

Chapter 4 - Conclusions and Future Directions
  4.1 Conclusions
  4.2 Future directions
  4.3 References

List of Figures

Figure 2-1 Characterization of E16.5 cortical neuronal culture
Figure 2-2 Neuronal activity induces distinct gene expression profiles
Figure 2-3 Distribution profile of 5-mrC modification in mouse cortical neurons during neuronal activity
Figure 2-4 Neuronal activity induces RNA methylation changes in neurons
Figure 2-5 5-mrC hypermethylation negatively correlates with mRNA expression
Figure 3-1 Characterization of adult mouse neural stem cell (NSC) culture
Figure 3-2 Distribution profile of 5-mrC modification in adult mouse NSCs
Figure 3-3 Folate induces RNA methylation changes in total mRNAs in adult mouse NSCs
Figure 3-4 Distribution profile of 5-mrC in polysome mRNAs in adult mouse NSCs
Figure 3-5 Folate induces RNA methylation changes in polysome mRNAs in adult mouse NSCs
Figure 3-6 Distinct methylation profiles of 5-mrC modification in total and polysome mRNAs in NSCs
Figure 3-7 Identification of differentially translated genes in NSCs with different concentrations of folate

List of Tables

Table 1-1 Summary of techniques for the detection of RNA cytosine-5 methylation (5-mrC)
Table 1-2 Comparison of filters in RNA BS-seq data analysis pipeline from different studies
Table 2-1 Mapping statistics of RNA-seq datasets
Table 2-2 Mapping statistics of RNA BS-seq datasets
Table 3-1 Mapping statistics of total and polysome poly(A) RNA-seq data
Table 3-2 Mapping statistics of total and polysome poly(A) RNA BS-seq data

List of Abbreviations

Symbol Description

3'UTR 3' untranslated region

5-hmrC 5-hydroxymethylcytosine

5-mrC RNA cytosine-5 methylation

5'UTR 5' untranslated region

ALYREF ALY/REF export factor

ASD Autism spectrum disorder

Aza-IP 5-azacytidine-mediated RNA immunoprecipitation

bFGF basic fibroblast growth factor

bp Base pair

CDS Coding sequence

CHX cycloheximide

CPM counts per million

DEG differentially expressed gene

DMS differentially methylated site

DNMTs DNA methyltransferases

dsRNA double-stranded RNA

DTG differentially translated gene

EGF epidermal growth factor

ELISA Enzyme-Linked Immunosorbent Assay

FA folic acid

GO Gene ontology

GSC germline stem cell

HF high folate

LC–MS Liquid chromatography coupled with tandem mass spectrometry

LF low folate

m1A 1-Methyladenosine

m6A N6-methyladenosine

meRanTK Methylated RNA analysis ToolKit

MeRIP methylated RNA immunoprecipitation

MF medium folate

miCLIP Methylation-individual nucleotide resolution crosslinking and immunoprecipitation

mRNA messenger RNA

mt-mRNA Mitochondrial messenger RNA

MZT maternal-to-zygotic transition

ncRNA Non-protein-coding RNA

NGS next generation sequencing

NSC neural stem cell

NSUN2 NOP2/Sun RNA methyltransferase family member 2

NTD neural tube defect

ORF Open reading frame

RNA BS-seq RNA bisulfite sequencing

RNA-seq RNA sequencing

ROS reactive oxygen species

rRNA Ribosomal RNA

RT-qPCR quantitative reverse-transcription polymerase chain reaction

SVZ subventricular zone

TET family ten-eleven translocation family

TPM transcripts per million

tRNA Transfer RNA

Chapter 1 - Advances in Methods and Software for RNA Cytosine

Methylation Analysis

Xiguang Xu1,2, Xiaoran Wei1,3, Hehuang Xie1,2,3*

1. Fralin Life Sciences Institute at Virginia Tech, Blacksburg, VA 24061, USA

2. Department of Biological Sciences, Virginia Tech, Blacksburg, VA 24061, USA

3. Department of Biomedical Sciences and Pathobiology, Virginia-Maryland College of

Veterinary Medicine, Blacksburg, VA 24061, USA

*Corresponding author: Email: [email protected]

History: Received 3 August 2019, Revised 2 October 2019, Accepted 29 October 2019, Available

online 31 October 2019.

Citation: Xu, X., et al. (2019). "Advances in methods and software for RNA cytosine methylation

analysis." Genomics.

Author contributions

Conceptualization, X.X. and H.X.; original draft preparation and editing, X.W., X.X. and H.X.;

funding acquisition, H.X.

Highlights

• Epitranscriptomics is an exciting, new field for understanding the fundamental mechanisms underlying RNA modifications and their impact on gene expression.

• Cytosine methylation in mRNA (5-mrC) is an important epitranscriptomic mark that modulates mRNA transportation, translation, and stability at the post-transcriptional level.

• This short review summarizes the experimental techniques that are exploited to determine 5-mrC in mRNA and the computational procedures implemented for RNA bisulfite sequencing data analysis.

1.1 Abstract

Our understanding of RNA modifications has been growing rapidly over the last decade.

Epitranscriptomics has recently emerged as an exciting, new field for understanding the

fundamental mechanisms underlying RNA modifications and their impact on gene expression.

Among the over one hundred different kinds of RNA modifications, cytosine methylation in

mRNA (5-mrC) is now recognized as an important epigenetic mark that modulates mRNA

transportation, translation, and stability at the post-transcriptional level. Across plant and animal

species, recent studies have revealed the roles of mRNA cytosine methylation in several

fundamental biological processes. In mammals, genome-wide profiling has determined thousands

of mRNA transcripts carrying the 5-mrC modification in a tissue specific manner. Here, we

summarize the experimental techniques that were exploited to determine 5-mrC in mRNA and the

computational procedures implemented for RNA bisulfite sequencing data analysis.

Keywords: RNA cytosine methylation; post-transcriptional regulation; RNA bisulfite sequencing;

methylation data analysis

1.2 Background

“RNA epigenetics” or “epitranscriptomics” is an emerging new field in the study of RNA

post-transcriptional modification (1-3). Currently, around 170 distinct types of RNA modifications,

including N6-methyladenosine, N1-methyladenosine, 5-methylcytosine, and 5-

hydroxymethylcytosine, have been identified (4). The N6-methyladenosine modification in

poly(A) RNA has been extensively studied and was found to regulate messenger RNA (mRNA)

splicing, stability, and translation efficiency in diverse biological processes (5-7). RNA cytosine

methylation (5-mrC) is another important form of RNA modification. In the 1960s, studies

identified the presence of 5-mrC in ribosomal RNA (8). Later studies showed that 5-mrC was not

only found in rRNA and tRNA but was also found in mRNA and non-coding RNA from all three

domains of life: Archaea, Bacteria, and Eukarya (9-14).

In recent years, several pivotal findings have been reported regarding the writers, erasers, and readers of 5-mrC in RNA. In mammalian cells, the addition of a methyl group to the fifth carbon of cytosine in RNA is catalyzed by a large protein family called the NOP2/Sun domain RNA methyltransferases (NSUN) and by DNA methyltransferase 2 (12, 15, 16). Both Yang et al. and Huang et al. identified NSUN2 as the major RNA methyltransferase mediating the formation of 5-mrC in mRNAs (14, 17). Previous studies showed that the ten-eleven translocation (TET) family of Fe(II)- and 2-oxoglutarate-dependent dioxygenases functions as DNA demethylases via sequential oxidation of 5-methylcytidine to yield 5-hydroxymethylcytidine, 5-formylcytidine, 5-carboxylcytidine, and eventually unmethylated cytosines (18-21). Interestingly, 5-mrC in RNA can also be oxidized by TET enzymes (TET1, TET2, TET3) to 5-hydroxymethylcytosine (22), and then further oxidized to 5-formylcytosine (23) and 5-carboxylcytosine (24). The molecular mechanism that mediates the conversion of 5-carboxylcytosine to unmethylated cytosine in RNA remains elusive. Very little is known about 5-mrC reader proteins. Aly/REF Export Factor (ALYREF) was recently identified as a 5-mrC-specific binding protein that mediates the export of target mRNAs from the nucleus to the cytoplasm (14), indicating a critical role for 5-mrC reader proteins in RNA metabolism.

Advances in next generation sequencing (NGS) have accelerated the development of high-throughput 5-mrC detection methods, which provide a comprehensive view of 5-mrC distribution across the transcriptome. Transcriptome-wide distribution of 5-mrC has been revealed in poly(A) RNAs from a broad range of mammalian cell lines and tissues (13, 14, 17, 25). Almost all recent transcriptome-wide studies showed that methylated cytosines are preferentially enriched around the translation initiation sites (TIS) of mRNAs (13, 14, 17), indicating an important regulatory role of RNA cytosine methylation in the translation of mRNAs. Moreover, cytosine methylation in mRNA regulates systemic mRNA mobility and promotes mRNA nuclear export (14). In Arabidopsis thaliana, 5-mrC sites are significantly enriched in graft-mobile mRNAs that can be transported over graft junctions to distinct plant parts (26). Together with RNA-binding proteins, methylated RNA and RNA methyltransferases gain the ability to mediate the interactions between transcription factors and genomic DNA to participate in chromatin organization (27). Despite these recent findings, the functional roles of 5-mrC in mRNA during biological processes and their relevance to human disease are just beginning to be understood.

In this review, we focus on the experimental techniques and corresponding data analysis of mRNA

cytosine methylation. We summarize currently available approaches for detecting RNA cytosine-5

methylation at the global, transcriptome-wide and locus-specific levels. Additionally, we

emphasize the bioinformatics data analysis of RNA bisulfite sequencing datasets by comparing

key features of three published packages, meRanTK, BS-RNA, and BisRNA (28-30), and discuss

the major issues in the analysis of RNA bisulfite sequencing data.

1.3 Techniques for the detection of RNA Cytosine-5 methylation

Methylation at position 5 of cytosine in mRNA was discovered over 40 years ago (9, 10). Most of the early studies on 5-mrC relied on radioactive labelling and paper chromatography (10). Due to the lack of reliable and sensitive techniques for 5-mrC detection, the distribution and functional roles of 5-mrC in low-abundance mRNA have remained largely unknown over the past four decades. Recent advances in NGS techniques have enabled a transcriptome-wide view of 5-mrC distribution in diverse biological processes, broadening our understanding of the functional roles of RNA cytosine methylation. Below, we summarize currently available approaches for detecting 5-mrC at the global, transcriptome-wide, and locus-specific levels (Table 1-1). The advantages and limitations of these techniques, including future directions, are discussed.

1.3.1 Global assessment of the 5-mrC level

The global level of 5-mrC modification in mRNA refers to the sum of all 5-mrC that can

be identified in all mRNA transcripts from a given cell or tissue sample. Since tRNA and rRNA

molecules are rich in 5-mrC modifications, one key step of the global approach for detecting 5-

mrC in mRNA is to remove undesired RNA species. RNA dot blot and mass spectrometry are

frequently used global approaches. Dot blot is a traditional technique that has been widely used to

measure the level of protein expression (31). This technique was later applied to detect base

modifications, such as 5-mC, in DNA (19) and RNA (32). RNA dot blot for 5-mrC utilizes the

anti-5-mrC antibody to measure the levels of 5-mrC in RNA samples. The signal density captured

represents the relative 5-mrC level. RNA dot blot results are regarded as qualitative or semi-

quantitative data. Despite the straightforward signal provided, the RNA dot blot may not be able

to detect slight changes in RNA methylation. Anti-5-mrC antibody has also been explored in

Enzyme-Linked Immunosorbent Assay (ELISA)-based approaches (33, 34). The standard curve,

generated with controls at different methylation levels, allows the ELISA-based kit to accurately

quantitate the global level of 5-mrC in RNA. Like dot blot, an ELISA-based kit accepts a wide

range of input RNA samples from vertebrate, plant, and microbial sources.
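
The standard-curve step described above can be illustrated with a minimal sketch. It assumes a linear relationship between absorbance and the percentage of 5-mrC; commercial kits may specify a different curve model. The standards, absorbance values, and function names below are hypothetical and are not taken from any cited kit or study.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def percent_5mrc(absorbance, slope, intercept):
    """Invert the standard curve to estimate % 5-mrC from an absorbance reading."""
    return (absorbance - intercept) / slope

# Hypothetical standards: known % 5-mrC and the absorbance measured for each.
standards_pct = [0.0, 0.5, 1.0, 2.0, 4.0]
standards_od = [0.10, 0.20, 0.30, 0.50, 0.90]

slope, intercept = fit_line(standards_pct, standards_od)

# Hypothetical sample reading, converted to % 5-mrC via the fitted curve.
sample_pct = percent_5mrc(0.40, slope, intercept)
print(f"estimated 5-mrC: {sample_pct:.2f}%")
```

Readings outside the range covered by the standards would require extrapolation and should be treated with caution.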

Liquid chromatography coupled with tandem mass spectrometry (LC–MS) is an accurate,

quantitative approach to assess the 5-mrC level globally (14, 22, 23). Prior to the analysis, a critical

step that should be taken is to completely digest the input RNA molecules into individual

ribonucleotides. With a 5-mrC standard as a positive reference, LC-MS separates individual

ribonucleotides to obtain the absolute 5-mrC level in a given RNA sample. RNA dot blot, ELISA

and mass spectrometry can provide the global methylation level but not locus-specific methylation

information. In other words, even if no change in 5-mrC level can be detected with these global

approaches, some mRNA transcripts could have different levels of methylation modification at

specific cytosines.
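
A small numerical sketch shows why an unchanged global level does not rule out site-specific changes. The site names, read counts, and the 0.1 difference threshold are hypothetical and serve only as an illustration.

```python
# Hypothetical data: site -> (methylated read count, total read count).
control = {"siteA": (10, 100), "siteB": (30, 100), "siteC": (20, 100)}
treated = {"siteA": (30, 100), "siteB": (10, 100), "siteC": (20, 100)}

def global_level(sites):
    """Pooled methylation level across all sites, as a global assay would report."""
    methylated = sum(m for m, _ in sites.values())
    total = sum(t for _, t in sites.values())
    return methylated / total

def changed_sites(a, b, min_delta=0.1):
    """Sites whose methylation level differs by at least min_delta between conditions."""
    return sorted(
        site for site in a
        if abs(a[site][0] / a[site][1] - b[site][0] / b[site][1]) >= min_delta
    )

print(global_level(control), global_level(treated))
print(changed_sites(control, treated))
```

In this toy example the pooled level is identical in both conditions, yet two of the three sites change in opposite directions, which only a site-resolved method could detect.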

1.3.2 Transcriptome-wide approaches to generate 5-mrC profiles

A transcriptome-wide view of the 5-mrC profile may be achieved via antibody-based or bisulfite conversion-based approaches coupled with high-throughput sequencing. RNA immunoprecipitation of 5-mrC followed by deep sequencing (5-mrC-RIP-seq) utilizes 5-mrC-specific antibodies to enrich 5-mrC-modified RNAs (11, 35). The use of antibodies enables the enrichment of mRNA transcripts with low 5-mrC levels, which may go undetected in a large pool of unmethylated RNA molecules. In addition, 5-mrC-RIP-seq allows RNA carrying the 5-mrC modification to be distinguished from RNA carrying other modifications such as 5-hmrC. Not surprisingly, the specificity of such an approach is highly dependent on the antibody used. Non-specifically bound RNA may be introduced in the immunoprecipitation process as well. The sequence reads generated for RNA pulled down by anti-5-mrC antibodies are usually 100-150 nt in length, and the methylated cytosine may lie anywhere within an enriched fragment. Thus, the resolution of 5-mrC-RIP-seq for methylation detection is not at the single-nucleotide level.

5-azacytidine-mediated RNA immunoprecipitation (Aza-IP) utilizes 5-azacytidine, a

cytidine analog that traps its target RNA methyltransferase by forming a stable RNA

methyltransferase-RNA adduct. Covalently bound enzyme-RNA complexes may be

immunoprecipitated with either tag- or enzyme-specific antibodies. The target RNA with 5-Aza-

C is eventually read as a guanine during reverse transcription and sequencing (36). The most

significant advantage is that this technique allows for identification of enzyme-specific cytosine

substrates at single-nucleotide resolution. Due to stable covalent binding between the RNA

methyltransferase and the 5-azacytidine, the enzyme-RNA substrate complexes can be

immunoprecipitated with highly stringent washes, thus largely reducing the non-specific binding

of unmethylated RNA. However, efficient enrichment of the enzyme-RNA complexes depends

highly on the specific antibodies against the target enzymes or the expression of epitope-tagged

enzymes in the target cells. The incorporation efficiency of the cytidine analog 5-Aza is also a

concern. The methylation targets in nascent RNA molecules without 5-Aza incorporations will be

missed. Furthermore, genomic DNA in somatic tissues is heavily methylated and 5-Aza may

incorporate into DNA molecules, particularly in proliferating cells. Such altered DNA methylation

profiles may lead to differential gene expression and, thus, may influence the transcription profile.

Methylation-individual nucleotide resolution crosslinking and immunoprecipitation

(miCLIP) is a customized technique derived from the individual-nucleotide-resolution

crosslinking and immunoprecipitation (iCLIP) method, which allows the detection of RNA

methyltransferase-specific substrate sites at nucleotide resolution (37). This technique has been

used to identify NSUN2 and NSUN3 substrates (38, 39). A point mutation of the conserved

cysteine within the catalytic domain of the RNA methyltransferase, which is required for the release

of methylated RNA from the enzyme, results in the irreversible formation of a covalent RNA-enzyme

complex at the methylation sites. Covalent crosslinking of the RNA-protein complex leaves a short

peptide at the target 5-mrC site, which stalls the reverse transcription during library construction.

As a result, all sequences end at the methylation site (38, 39). Despite its robustness and high

specificity, miCLIP requires the generation of mutant enzymes, which is expensive and time-

consuming.

Bisulfite sequencing was originally developed to detect the 5-mC sites in genomic DNA

(40). In the presence of sodium bisulfite, unmethylated cytosines are converted to uracils, which

are later replaced by thymines during subsequent PCR amplification, while methylated cytosines

remain unchanged. In recent years, bisulfite sequencing has been modified to identify the 5-mrC

profile in RNAs on a transcriptome-wide scale (41). Since its initial development, RNA bisulfite

sequencing has been commercialized, and various RNA bisulfite conversion kits are

available, including the EZ RNA Methylation Kit from Zymo Research and the Methylamp RNA

Bisulfite Conversion Kit from Epigentek (42). The primary advantage of this technique is that it

can provide a transcriptome-wide view of 5-mrC deposition at single-nucleotide resolution.

However, bisulfite sequencing cannot differentiate 5-methylcytosine from

5-hydroxymethylcytosine, as both are resistant to deamination. Nevertheless, the level of 5-hmrC

is very low in human and mouse mRNAs (23, 43). The ratio of 5-hmrC to 5-mrC is estimated to be

around 1:5,000 (22), making RNA bisulfite sequencing an attractive approach to generate the 5-mrC

profile. Bisulfite treatment results in significant degradation of RNA, making it difficult to detect

5-mrC in lowly expressed mRNA molecules (44). To protect RNA integrity, RNA bisulfite

conversion is usually performed at a relatively low temperature compared to DNA bisulfite

conversion. Bisulfite conversion can also be hindered by the secondary structures of RNAs,

such as double-stranded RNA (dsRNA) and stem-loop structures. Thus, incomplete denaturation of

RNA secondary structure may leave cytosines resistant to bisulfite conversion, which end up as

false-positive signals. Despite these disadvantages, RNA bisulfite sequencing has been

increasingly applied to study RNA cytosine methylation in recent years (13, 14, 17, 25, 30, 45).
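The conversion chemistry described above can be illustrated with a minimal sketch. The function names and the example sequence are hypothetical, and the code assumes complete conversion with no RNA degradation or structure-related resistance, which real experiments do not achieve.

```python
def bisulfite_convert(rna, methylated_positions):
    """Simulate ideal bisulfite treatment followed by RT-PCR read-out.

    Unmethylated C is deaminated to U and read as T; methylated C
    (5-mrC, and equally 5-hmrC) resists deamination and is read as C.
    """
    methylated = set(methylated_positions)
    read = []
    for i, base in enumerate(rna.upper()):
        if base == "C" and i not in methylated:
            read.append("T")      # converted cytosine
        elif base == "U":
            read.append("T")      # uridine is read as T in cDNA
        else:
            read.append(base)
    return "".join(read)


def call_retained_cytosines(rna, read):
    """Return 0-based positions where a reference C is still read as C."""
    return [i for i, (ref, obs) in enumerate(zip(rna.upper(), read))
            if ref == "C" and obs == "C"]


rna = "AUGCCGUACGCU"              # hypothetical transcript fragment
read = bisulfite_convert(rna, methylated_positions=[4])
print(read)
print(call_retained_cytosines(rna, read))
```

In this idealized setting every retained cytosine is a true methylation site; the false positives discussed above arise precisely when the complete-conversion assumption fails.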

1.3.3 Locus-specific approaches to determine methylation within a given mRNA

Locus-specific approaches have been developed to measure the methylation level of

specific 5-mrC sites in mRNA. The most common approach is to use 5-mrC RIP, followed by RT-

qPCR (13, 35). In this procedure, RNA molecules are fragmented and pulled down by the 5-mrC

antibody and then reverse transcribed to cDNA. Real-time qPCR is then performed to measure the

relative fold changes for specific transcripts. With appropriate controls, such as normal IgG control,

this approach has been used to validate the 5-mrC sites identified by RNA bisulfite sequencing

(13). RNA bisulfite conversion combined with either cloning-based (41, 45) or PCR amplicon-

based (11, 14) Sanger sequencing provides two further locus-specific methylation assays commonly

used in the validation of 5-mrC sites. The cDNA template derived from bisulfite-converted RNA

is either cloned into vectors or PCR-amplified with primers fused with consensus

sequences, and then subjected to Sanger sequencing. RNA bisulfite pyrosequencing may be

developed as an alternative approach to determine the 5-mrC levels for multiple cytosines in a

short stretch of RNA molecule. Similar to the pyrosequencing of bisulfite-converted DNA (46),

RNA molecules may be subjected to bisulfite conversion prior to cDNA generation. After

reverse transcription, cDNA molecules are used as templates for PCR and pyrosequencing.
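As a worked illustration of the 5-mrC RIP followed by RT-qPCR read-out described above, the sketch below computes the enrichment of a transcript over the IgG control from threshold cycle (Ct) values. It assumes the common convention of raising the amplification efficiency to the Ct difference, with perfect doubling per cycle and equal input amounts, none of which is specified in the text; the function name and Ct values are hypothetical.

```python
def fold_enrichment(ct_ip, ct_igg, efficiency=2.0):
    """Enrichment of a transcript in the 5-mrC IP relative to the IgG control.

    A lower Ct means more template, so the exponent is Ct(IgG) - Ct(IP).
    efficiency=2.0 assumes perfect doubling in every PCR cycle.
    """
    return efficiency ** (ct_igg - ct_ip)


# Hypothetical Ct values for one candidate transcript
print(fold_enrichment(ct_ip=24.0, ct_igg=29.0))
```

A primer pair with measured efficiency below 2.0 can be accommodated through the `efficiency` argument.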

Since each technique for RNA methylation detection has its own features, the combination

of these approaches may provide a more comprehensive understanding at multiple levels. For

example, RNA dot blot and mass spectrometry can be used as the initial steps to explore the

changes of 5-mrC at the global level in a specific biological process (35). Aza-IP and miCLIP can

be used to study the substrates of a specific RNA methyltransferase. As the sequencing cost

continues to decrease, RNA bisulfite sequencing becomes even more attractive for gaining a

transcriptome-wide view of RNA methylation at single nucleotide resolution. Although 5-mrC-

RIP cannot provide methylation information at single nucleotide resolution, it may serve as an

alternative approach to validate bisulfite sequencing results and to eliminate the false-positive

5-mrC sites resulting from an incomplete bisulfite conversion.

Table 1-1 Summary of techniques for the detection of RNA cytosine-5 methylation (5-mrC)

1.4 Data analysis for RNA cytosine-5 methylation studies

The methods used for 5-mrC data analysis depend on the type of data obtained

with different 5-mrC detection approaches. For techniques used to detect 5-mrC at the global level

(e.g., ELISA) or at the locus-specific level (e.g., RNA bisulfite pyrosequencing), each measurement

provides a single numerical value. A typical experiment often includes multiple biological or technical

replicates per group, and the research goal may include the determination of group differences.

A two-tailed paired Student's t-test is frequently used to determine the significance of the

methylation differences between two groups, while ANOVA can be used to compare the

methylation levels among three or more groups.

For transcriptome-wide approaches, the data analysis strategies vary depending on the

principle of each technique. The analysis of datasets generated using antibody-based techniques

follows the same principle as ChIP-seq for the identification of transcription factor binding sites.

One frequently used tool is Model-based Analysis of ChIP-Seq (MACS), which adopts a dynamic

Poisson distribution for peak calling (47). Peaks, ranked by p-value, indicate the local biases of

read coverage in the genome. The primary goal of both Aza-IP-seq and miCLIP-seq techniques is

to identify the direct RNA substrates of cytosine-5 RNA methyltransferases. The data analysis of

Aza-IP-seq includes sequence alignment, enrichment analysis and signature analysis. After

sequence alignment, enrichment analysis is performed using the open-source USeq package to

identify transcripts that are enriched in replicate samples compared to the IgG control sample.

Signature analysis is then performed using the VarScan package (48) to scan the enriched

transcripts for significant C to G transversion sites that are caused by Aza-IP rather than by SNPs or indels.

These transversion sites are then determined as the cytosine targets of a specific methyltransferase

(49). Although differential methylation analysis is not the aim of Aza-IP, the meRanTK toolkit provides

functions for mapping, methylation calling and enrichment comparison of Aza-IP data. Similarly, the

analysis of miCLIP-seq data aims to identify enzyme-specific target sites. After sequence alignment,

the miCLIP read stop positions are determined and read counts are normalized per thousand

reads in the replicates. To perform differential methylation analysis for 5-mrC-RIP-seq, miCLIP-

seq, and Aza-IP-seq results, both the enrichment of peaks/sites and RNA expression level will be

required. Therefore, additional RNA-seq data have to be generated. With the reduced cost of NGS,

RNA bisulfite sequencing is becoming the prevailing approach to study 5-mrC profiles at single

nucleotide resolution. However, the data analysis for RNA bisulfite sequencing is a challenging

task. Below, we summarize the key features for several bioinformatics packages dealing with RNA

bisulfite sequencing data.
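For the antibody-based datasets discussed earlier in this section, the principle behind Poisson-based peak calling can be sketched with a simple enrichment test. This is not the MACS implementation; it only illustrates the idea of a locally estimated background rate, and the function names, window counts and the choice of taking the largest background estimate are illustrative assumptions.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed by direct summation.

    Adequate for a sketch; very small tail probabilities lose precision
    because they are obtained as 1 - CDF.
    """
    if k <= 0:
        return 1.0
    cdf = 0.0
    term = math.exp(-lam)         # P(X = 0)
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)     # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - cdf)


def window_p_value(ip_count, background_rates):
    """Score one window of IP reads against a local background rate.

    background_rates holds the expected read counts for the window
    estimated at several scales; the largest is used so that regions of
    generally high coverage are not mistaken for enrichment.
    """
    lam = max(background_rates)
    return poisson_sf(ip_count, lam)


# Hypothetical counts: 25 IP reads, background estimated at three scales
p = window_p_value(ip_count=25, background_rates=[4.0, 6.5, 5.2])
print(f"p = {p:.3g}")
```

Windows would then be ranked by this P-value, as described for peaks above.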

1.4.1 Shared steps for RNA bisulfite sequencing data analysis

Like regular RNA-seq data processing, RNA bisulfite sequencing data analysis involves

steps for quality control and read alignment to references. Due to bisulfite conversion,

unmethylated cytosines in mRNA will end up as thymines after cDNA conversion. Given that the

level of methylated cytosine in mRNA is much lower than that in genomic DNA (13), the

frequency of C (or G in the cDNA) is extremely low in mRNA bisulfite sequencing data. For

Illumina sequencing, the sequence quality deteriorates along the read, particularly for bisulfite

sequencing reads with low GC content. Prior to sequence alignment, low quality bases should be

trimmed off from the raw RNA bisulfite sequencing reads along with adaptor sequences. Clean

reads may be obtained using software tools such as Cutadapt (50), Trim Galore!

(http://www.bioinformatics.babraham.ac.uk/projects/trim_galore), or Trimmomatic (51) to

eliminate low-quality bases.
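The 3'-end quality trimming performed by such tools can be sketched as follows. This is a simplified illustration and not the algorithm used by Cutadapt, Trim Galore! or Trimmomatic; it assumes Phred+33 encoded quality strings, and the threshold of 20, the function names and the example read are arbitrary choices.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Remove trailing bases whose Phred quality is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def c_fraction(seq):
    """Fraction of C in a read; expected to be low after bisulfite conversion."""
    return seq.count("C") / len(seq) if seq else 0.0


# Hypothetical read: "I" encodes Q40, "5" encodes Q20 and "#" encodes Q2
seq, qual = trim_3prime("TTGATTGTAGTTAGC", "I" * 12 + "5##")
print(seq, qual, c_fraction(seq))
```

The helper `c_fraction` reflects the point made above that cytosines are rare in converted mRNA reads, so an unusually high value may flag a poorly converted read.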

Either an annotated genome or a transcriptome may be used as a reference for the alignment

of bisulfite sequencing reads. A step that should not be skipped is to prepare an in silico bisulfite-

converted reference. If a transcriptome is chosen as the reference, Bowtie 2 is recommended,

which is a memory-efficient, highly sensitive and accurate alignment algorithm (52). Mapping

with the transcriptome as a reference may have the issue that a sequence read may be aligned to

multiple transcripts derived from the same gene. To address this issue, the longest transcript with

the highest mapping score is usually selected as the top candidate (28). Using a large set of

small indexes, HISAT2 is a fast and sensitive splice-aware program with alignment strategies

that manage reads spanning multiple exons (53). Thus, it is well suited to aligning reads to the genome.

Whether the transcriptome or the genome is used as a reference, the mapping efficiency is expected to

be around 70-80%. To achieve a higher mapping rate, genome and transcriptome references may

be used in sequential order. For instance, sequence reads may be mapped to the genome first, and

then aligned against the transcriptome for the reads that cannot be mapped to the genome (14).
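A minimal sketch of the in silico conversion step is given below. It assumes a directional library, so only C-to-T conversion of the sense strand is handled, and it replaces a real aligner such as Bowtie 2 or HISAT2 with naive exact string matching; the sequences and function names are hypothetical.

```python
def convert_c_to_t(seq):
    """Collapse C into T, giving the three-letter (A, G, T) alphabet."""
    return seq.upper().replace("U", "T").replace("C", "T")


def map_read(read, reference):
    """Locate a bisulfite read on a reference via converted sequences.

    Both sequences are fully C-to-T converted for matching only; the
    original read is kept so that retained cytosines can be reported.
    Returns (position, [reference offsets of retained C]) or None.
    """
    ref = reference.upper().replace("U", "T")
    read = read.upper()
    pos = convert_c_to_t(ref).find(convert_c_to_t(read))
    if pos < 0:
        return None
    retained = [pos + i for i, base in enumerate(read)
                if base == "C" and ref[pos + i] == "C"]
    return pos, retained


reference = "GGACCTGACCATGCA"     # hypothetical transcript sequence
read = "TTGACTATG"                # read with one unconverted cytosine
print(map_read(read, reference))
```

Matching in the reduced alphabet is what allows a converted read to find its origin; the price is lower sequence complexity and therefore more ambiguous alignments, which is one reason mapping rates are lower than for regular RNA-seq.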

1.4.2 Comparison of existing tools for RNA bisulfite sequencing data analysis

Several bioinformatics tools have been developed to aid in mapping the clean reads and

subsequent methylation calling processes (17, 28, 29). Methylated RNA analysis ToolKit

(meRanTK) is the first publicly available software specialized for high-throughput RNA cytosine

methylation data analysis (28). Written in the Perl language, it utilizes splice-aware bisulfite

sequencing read mapping to either the genome or the transcriptome. The toolkit allows for

methylation calling and the identification of differentially methylated cytosines with statistical

analysis. In addition, a package is provided by meRanTK to annotate candidate 5-mrC sites with

genomic features such as gene or transcript names and positional metrics. Notably,

meRanTK can be used to handle Aza-IP data as well.

Similar to meRanTK, BS-RNA is another efficient and highly automated mapping and

annotation tool developed in the Perl language (29). BS-RNA only supports RNA bisulfite

sequencing data generated from directional libraries. Yet, the mapping speed of BS-RNA is much

faster than that of meRanTK. By calling the HISAT2 program, BS-RNA can finish the mapping

of 80 M 100 bp paired-end reads to the reference genome within five hours. The same job takes

over 35 hours with meRanGs, which uses STAR (54), or 101 hours with meRanGt, which uses TopHat2

(55); these are the two aligner variants provided by meRanTK. Similar to meRanTK, BS-

RNA can also manage “dovetailing” reads generated with paired-end sequencing, where one or

both reads seem to extend past the start of the mate read. Such “dovetailing” reads often result

from the sequence reads that have their 5’-ends trimmed.

BisRNA is a statistical modeling method for methylation calling (30). This software

integrates tailored filtering to address sequencing and alignment artifacts and data-driven statistical

modeling to eliminate the artifacts associated with bisulfite sequencing. Using BisRNA, Legrand

et al. reported that only sparsely methylated cytosines, or possibly none at all, can be found in mRNAs (30).

This result highlights the need for more rigorous and statistically reliable data analysis

strategies for RNA bisulfite sequencing datasets. The BisRNA software can only be used for

methylation calling, whereas the meRanTK and BS-RNA toolkits both handle the

processes of mapping, methylation calling, and annotation. Liang et al. performed a comparison

between BS-RNA, meRanGs and meRanGt (29). They concluded that BS-RNA has a better

performance than both meRanGs and meRanGt when dealing with simulated reads in the mapping

process. Both BS-RNA and meRanGs performed better than meRanGt when mapping published

single-end bisulfite sequencing reads. In the methylation calling process, although there is no

significant difference in precision among these tools, BS-RNA has a significantly higher recall rate

than meRanGt and meRanGs.
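For readers unfamiliar with the two metrics used in this comparison, the sketch below shows how precision and recall could be computed for a set of called sites against simulated ground truth. The site coordinates and the function name are hypothetical and are not taken from Liang et al. (29).

```python
def precision_recall(called_sites, true_sites):
    """Precision and recall of called methylation sites.

    precision = true positives / all called sites
    recall    = true positives / all truly methylated sites
    """
    called, truth = set(called_sites), set(true_sites)
    tp = len(called & truth)
    precision = tp / len(called) if called else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical (chromosome, position) sites from a simulation
truth = [("chr1", 100), ("chr1", 250), ("chr2", 40), ("chr2", 77)]
called = [("chr1", 100), ("chr2", 40), ("chr3", 5)]
print(precision_recall(called, truth))
```

Two tools can therefore share a similar precision while differing in recall if one of them simply recovers more of the true sites.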

Several strategies have been adopted to eliminate false positive sites. Most studies applied

statistical methods to avoid false positive sites and set strict filters during methylation calling (11,

13, 14, 17). In addition, low quality and unconverted reads were excluded (11, 14, 17) and RNA

secondary structure prediction tools were used to filter bisulfite conversion-resistant sites (13, 56).

Furthermore, databases including dbSNP for single nucleotide polymorphisms (SNPs) and

REDIdb for RNA editing sites may be explored to filter candidate methylated cytosines

overlapping SNPs or RNA editing sites (57). A recently published study integrates several of these

filters to exclude the noise that occurs during the generation of RNA bisulfite sequencing

data (17). First, it sets filters in the methylation calling process for read coverage, methylation

level, and methylated cytosine depth of sites. Then the Gini coefficient is used to determine the C-

cutoff to remove the reads that have too many unconverted cytosines. A signal ratio filter is used

to further remove sites in regions that are resistant to bisulfite conversion. A P-value is calculated for

the gene-specific conversion rate and genes with low conversion rates are discarded. Lastly,

Stouffer’s method is adopted to calculate the combined P-value for biological replicates. A

comparison of mapping procedures and filtering steps used in recent publications is summarized

in Table 1-2.

Table 1-2 Comparison of filters in RNA BS-seq data analysis pipeline from different

studies
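The site-level filters and the combination of replicate P-values described in the preceding paragraph can be sketched as follows. The thresholds, the assumed non-conversion rate of 1% and the use of a one-sided binomial test are illustrative assumptions rather than the settings of any cited study; only the three filter types and Stouffer's method come from the text.

```python
import math
from statistics import NormalDist


def binomial_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


def call_site(c_reads, t_reads, non_conversion_rate=0.01,
              min_cov=20, min_c=3, min_level=0.1):
    """Apply coverage, methylated-C depth and level filters to one site.

    Returns (methylation level, P-value) or None if a filter fails. The
    P-value asks whether the retained Cs exceed what failed conversion
    alone would produce.
    """
    cov = c_reads + t_reads
    if cov < min_cov or c_reads < min_c:
        return None
    level = c_reads / cov
    if level < min_level:
        return None
    return level, binomial_sf(c_reads, cov, non_conversion_rate)


def stouffer(p_values):
    """Combine one-sided P-values from replicates (unweighted Stouffer)."""
    norm = NormalDist()
    clipped = [min(max(p, 1e-300), 1 - 1e-12) for p in p_values]
    z = sum(-norm.inv_cdf(p) for p in clipped) / math.sqrt(len(clipped))
    return 0.5 * math.erfc(z / math.sqrt(2))   # upper tail of N(0, 1)


replicates = [(12, 88), (9, 71), (15, 110)]    # hypothetical (C, T) counts
calls = [call_site(c, t) for c, t in replicates]
if all(call is not None for call in calls):
    combined = stouffer([p for _, p in calls])
    print([round(level, 4) for level, _ in calls], combined)
```

Requiring a site to pass the filters in every replicate, as done here, is a conservative design choice; it lowers false positives at the cost of missing sites with uneven coverage.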

1.5 Conclusions and Future Perspectives

In the past decade, technological advancements in methylation detection have reignited interest

in the dynamics and biological impacts of 5-mrC in mRNA. However, several issues should be

taken into consideration when undertaking RNA methylation studies. mRNA molecules are prone

to heat degradation and are more chemically labile than DNA. The less aggressive conditions

adopted in bisulfite conversion to avoid RNA degradation can lead to a large number of false

positive sites. Thus, it is critical to ensure successful bisulfite conversion, e.g., by monitoring the

bisulfite conversion rate of spike-in RNA controls. On the other hand, over 60% of cytosines in

mRNA have methylation levels of less than 20% in mammals (14, 17). This poses a challenge to

accurately determining all the methylation sites in a given sample. The multiple filtering steps

during analytical procedures may result in a significant number of false negative calls.

Development of novel techniques and associated bioinformatics tools is driven by the need to

address specific biological questions. For instance, determination of co-methylated mRNA

transcripts in a single cell may reveal gene pathways sharing the same regulatory mechanism. Finally,

future techniques and associated analytical procedures are needed to generate and analyze more

sophisticated data to determine the association of mRNA methylation with other important

biological phenomena, such as RNA splicing, RNA editing, and other kinds of RNA modifications.
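The spike-in check recommended earlier in this section can be sketched as a simple conversion-rate calculation. The sketch assumes an unmethylated spike-in whose reads have already been aligned without gaps; the sequences and the function name are hypothetical.

```python
def conversion_rate(reference, reads):
    """Fraction of reference cytosines read as T across aligned reads.

    reads is a list of (start, sequence) pairs aligned without gaps to
    an unmethylated spike-in reference, so every C should convert.
    """
    converted = unconverted = 0
    for start, seq in reads:
        for i, base in enumerate(seq):
            if reference[start + i] != "C":
                continue          # only cytosine positions are informative
            if base == "T":
                converted += 1
            elif base == "C":
                unconverted += 1
    total = converted + unconverted
    return converted / total if total else float("nan")


spike_in = "ACGTCCATCG"           # hypothetical unmethylated control
reads = [(0, "ATGTTTATTG"), (2, "GTTCATTG")]
print(round(conversion_rate(spike_in, reads), 3))
```

A rate well below the value expected for the chosen protocol would indicate that retained cytosines in the sample cannot be trusted as methylation calls.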

ACKNOWLEDGEMENTS

This work was supported by the Center for One Health Research at the Virginia-Maryland College

of Veterinary Medicine and The Edward Via College of Osteopathic Medicine, NIH grant

NS094574, and the Fralin Life Sciences Institute faculty development fund for H.X., and VT’s

Open Access Subvention Fund. We recognize The Center for Engineered Health and the Virginia-

Maryland College of Veterinary Medicine at Virginia Tech. We thank Dr. Janet Webster for

English language editing.

COMPETING INTERESTS

The authors declare no competing interests.

1.6 References

1. He C. Grand challenge commentary: RNA epigenetics? Nature chemical biology. 2010;6(12):863-5.
2. Saletore Y, Meyer K, Korlach J, Vilfan ID, Jaffrey S, Mason CE. The birth of the Epitranscriptome: deciphering the function of RNA modifications. Genome biology. 2012;13(10):175.
3. Song J, Yi C. Chemical Modifications to RNA: A New Layer of Gene Expression Regulation. ACS chemical biology. 2017;12(2):316-25.
4. Boccaletto P, Machnicka MA, Purta E, Piatkowski P, Baginski B, Wirecki TK, et al. MODOMICS: a database of RNA modification pathways. 2017 update. Nucleic acids research. 2018;46(D1):D303-D7.
5. Zhao X, Yang Y, Sun BF, Shi Y, Yang X, Xiao W, et al. FTO-dependent demethylation of N6-methyladenosine regulates mRNA splicing and is required for adipogenesis. Cell research. 2014;24(12):1403-19.
6. Wang X, Lu Z, Gomez A, Hon GC, Yue Y, Han D, et al. N6-methyladenosine-dependent regulation of messenger RNA stability. Nature. 2014;505(7481):117-20.
7. Wang X, Zhao BS, Roundtree IA, Lu Z, Han D, Ma H, et al. N(6)-methyladenosine Modulates Messenger RNA Translation Efficiency. Cell. 2015;161(6):1388-99.
8. Iwanami Y, Brown GM. Methylated bases of ribosomal ribonucleic acid from HeLa cells. Archives of biochemistry and biophysics. 1968;126(1):8-15.
9. Dubin DT, Stollar V. Methylation of Sindbis virus "26S" messenger RNA. Biochemical and biophysical research communications. 1975;66(4):1373-9.
10. Dubin DT, Taylor RH. The methylation state of poly A-containing messenger RNA from cultured hamster cells. Nucleic acids research. 1975;2(10):1653-68.
11. Edelheit S, Schwartz S, Mumbach MR, Wurtzel O, Sorek R. Transcriptome-wide mapping of 5-methylcytidine RNA modifications in bacteria, archaea, and yeast reveals m5C within archaeal mRNAs. PLoS genetics. 2013;9(6):e1003602.
12. Squires JE, Patel HR, Nousch M, Sibbritt T, Humphreys DT, Parker BJ, et al. Widespread occurrence of 5-methylcytosine in human coding and non-coding RNA. Nucleic acids research. 2012;40(11):5023-33.
13. Amort T, Rieder D, Wille A, Khokhlova-Cubberley D, Riml C, Trixl L, et al. Distinct 5-methylcytosine profiles in poly(A) RNA from mouse embryonic stem cells and brain. Genome biology. 2017;18(1):1.
14. Yang X, Yang Y, Sun BF, Chen YS, Xu JW, Lai WY, et al. 5-methylcytosine promotes mRNA export - NSUN2 as the methyltransferase and ALYREF as an m(5)C reader. Cell research. 2017;27(5):606-25.
15. Goll MG, Kirpekar F, Maggert KA, Yoder JA, Hsieh CL, Zhang X, et al. Methylation of tRNAAsp by the DNA methyltransferase homolog Dnmt2. Science (New York, NY). 2006;311(5759):395-8.
16. Tuorto F, Liebers R, Musch T, Schaefer M, Hofmann S, Kellner S, et al. RNA cytosine methylation by Dnmt2 and NSun2 promotes tRNA stability and protein synthesis. Nature structural & molecular biology. 2012;19(9):900-5.
17. Huang T, Chen W, Liu J, Gu N, Zhang R. Genome-wide identification of mRNA 5-methylcytosine in mammals. Nature structural & molecular biology. 2019;26(5):380-8.
18. Tahiliani M, Koh KP, Shen Y, Pastor WA, Bandukwala H, Brudno Y, et al. Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science (New York, NY). 2009;324(5929):930-5.
19. Ito S, D'Alessio AC, Taranova OV, Hong K, Sowers LC, Zhang Y. Role of Tet proteins in 5mC to 5hmC conversion, ES-cell self-renewal and inner cell mass specification. Nature. 2010;466(7310):1129-33.
20. He YF, Li BZ, Li Z, Liu P, Wang Y, Tang Q, et al. Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science (New York, NY). 2011;333(6047):1303-7.
21. Ito S, Shen L, Dai Q, Wu SC, Collins LB, Swenberg JA, et al. Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science (New York, NY). 2011;333(6047):1300-3.
22. Fu L, Guerrero CR, Zhong N, Amato NJ, Liu Y, Liu S, et al. Tet-mediated formation of 5-hydroxymethylcytosine in RNA. Journal of the American Chemical Society. 2014;136(33):11582-5.
23. Huber SM, van Delft P, Mendil L, Bachman M, Smollett K, Werner F, et al. Formation and abundance of 5-hydroxymethylcytosine in RNA. Chembiochem : a European journal of chemical biology. 2015;16(5):752-5.
24. Basanta-Sanchez M, Wang R, Liu Z, Ye X, Li M, Shi X, et al. TET1-Mediated Oxidation of 5-Formylcytosine (5fC) to 5-Carboxycytosine (5caC) in RNA. Chembiochem : a European journal of chemical biology. 2017;18(1):72-6.
25. Shen Q, Zhang Q, Shi Y, Shi Q, Jiang Y, Gu Y, et al. Tet2 promotes pathogen infection-induced myelopoiesis through mRNA oxidation. Nature. 2018;554(7690):123-7.
26. Yang L, Perrera V, Saplaoura E, Apelt F, Bahin M, Kramdi A, et al. m(5)C Methylation Guides Systemic Transport of Messenger RNA over Graft Junctions in Plants. Curr Biol. 2019.
27. Cheng JX, Chen L, Li Y, Cloe A, Yue M, Wei J, et al. RNA cytosine methylation and methyltransferases mediate chromatin organization and 5-azacytidine response and resistance in leukaemia. Nature communications. 2018;9(1):1163.
28. Rieder D, Amort T, Kugler E, Lusser A, Trajanoski Z. meRanTK: methylated RNA analysis ToolKit. Bioinformatics. 2016;32(5):782-5.
29. Liang F, Hao L, Wang J, Shi S, Xiao J, Li R. BS-RNA: An efficient mapping and annotation tool for RNA bisulfite sequencing data. Comput Biol Chem. 2016;65:173-7.
30. Legrand C, Tuorto F, Hartmann M, Liebers R, Jacob D, Helm M, et al. Statistically robust methylation calling for whole-transcriptome bisulfite sequencing reveals distinct methylation patterns for mouse RNAs. Genome research. 2017;27(9):1589-96.
31. Vera-Cabrera L, Rendon A, Diaz-Rodriguez M, Handzel V, Laszlo A. Dot blot assay for detection of antidiacyltrehalose antibodies in tuberculous patients. Clinical and diagnostic laboratory immunology. 1999;6(5):686-9.
32. Miao Z, Xin N, Wei B, Hua X, Zhang G, Leng C, et al. 5-hydroxymethylcytosine is detected in RNA from mouse brain tissues. Brain research. 2016;1642:546-52.
33. Lewinska A, Adamczyk-Grochala J, Kwasniewicz E, Wnuk M. Downregulation of methyltransferase Dnmt2 results in condition-dependent telomere shortening and senescence or apoptosis in mouse fibroblasts. Journal of cellular physiology. 2017;232(12):3714-26.
34. Lewinska A, Adamczyk-Grochala J, Kwasniewicz E, Deregowska A, Semik E, Zabek T, et al. Reduced levels of methyltransferase DNMT2 sensitize human fibroblasts to oxidative stress and DNA damage that is accompanied by changes in proliferation-related miRNA expression. Redox biology. 2018;14:20-34.
35. Cui X, Liang Z, Shen L, Zhang Q, Bao S, Geng Y, et al. 5-Methylcytosine RNA Methylation in Arabidopsis Thaliana. Molecular plant. 2017;10(11):1387-99.
36. Khoddami V, Cairns BR. Identification of direct targets and modified bases of RNA cytosine methyltransferases. Nature biotechnology. 2013;31(5):458-64.
37. George H, Ule J, Hussain S. Illustrating the Epitranscriptome at Nucleotide Resolution Using Methylation-iCLIP (miCLIP). Methods in molecular biology (Clifton, NJ). 2017;1562:91-106.
38. Hussain S, Sajini AA, Blanco S, Dietmann S, Lombard P, Sugimoto Y, et al. NSun2-mediated cytosine-5 methylation of vault noncoding RNA determines its processing into regulatory small RNAs. Cell reports. 2013;4(2):255-61.
39. Van Haute L, Dietmann S, Kremer L, Hussain S, Pearce SF, Powell CA, et al. Deficient methylation and formylation of mt-tRNA(Met) wobble cytosine in a patient carrying mutations in NSUN3. Nature communications. 2016;7:12039.
40. Frommer M, McDonald LE, Millar DS, Collis CM, Watt F, Grigg GW, et al. A genomic sequencing protocol that yields a positive display of 5-methylcytosine residues in individual DNA strands. Proceedings of the National Academy of Sciences of the United States of America. 1992;89(5):1827-31.
41. Schaefer M, Pollex T, Hanna K, Lyko F. RNA cytosine methylation analysis by bisulfite sequencing. Nucleic acids research. 2009;37(2):e12.
42. Chen YS, Ma HL, Yang Y, Lai WY, Sun BF, Yang YG. 5-Methylcytosine Analysis by RNA-BisSeq. Methods in molecular biology (Clifton, NJ). 2019;1870:237-48.
43. Foss-Feig JH, Adkinson BD, Ji JL, Yang G, Srihari VH, McPartland JC, et al. Searching for Cross-Diagnostic Convergence: Neural Mechanisms Governing Excitation and Inhibition Balance in Schizophrenia and Autism Spectrum Disorders. Biol Psychiatry. 2017;81(10):848-61.
44. Hussain S, Aleksic J, Blanco S, Dietmann S, Frye M. Characterizing 5-methylcytosine in the mammalian epitranscriptome. Genome biology. 2013;14(11):215.
45. Amort T, Souliere MF, Wille A, Jia XY, Fiegl H, Worle H, et al. Long non-coding RNAs as targets for cytosine methylation. RNA biology. 2013;10(6):1003-8.
46. Tost J, Gut IG. DNA methylation analysis by pyrosequencing. Nature protocols. 2007;2(9):2265-75.
47. Feng J, Liu T, Qin B, Zhang Y, Liu XS. Identifying ChIP-seq enrichment using MACS. Nature protocols. 2012;7(9):1728-40.
48. Koboldt DC, Chen K, Wylie T, Larson DE, McLellan MD, Mardis ER, et al. VarScan: variant detection in massively parallel sequencing of individual and pooled samples. Bioinformatics (Oxford, England). 2009;25(17):2283-5.
49. Khoddami V, Cairns BR. Transcriptome-wide target profiling of RNA cytosine methyltransferases using the mechanism-based enrichment procedure Aza-IP. Nature protocols. 2014;9(2):337-61.
50. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal. 2011;17:10-2.
51. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114-20.
52. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357-9.
53. Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 2015;12(4):357-60.
54. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29(1):15-21.
55. Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome biology. 2013;14(4):R36.
56. Wei Z, Panneerdoss S, Timilsina S, Zhu J, Mohammad TA, Lu ZL, et al. Topological Characterization of Human and Mouse m(5)C Epitranscriptome Revealed by Bisulfite Sequencing. International journal of genomics. 2018;2018:1351964.
57. Parker BJ. Statistical Methods for Transcriptome-Wide Analysis of RNA Methylation by Bisulfite Sequencing. Methods in molecular biology (Clifton, NJ). 2017;1562:155-67.

Chapter 2 - Neuronal Activity Modifies RNA Cytosine-5 Methylation

Landscape in Mouse Cortical Neuron

Xiguang Xu1,2, Zachary Johnson1,2, Xiaoran Wei1,3, and Hehuang Xie1,2,3*

1. Fralin Life Sciences Institute at Virginia Tech, Blacksburg, VA 24061, USA

2. Department of Biological Sciences, Virginia Tech, Blacksburg, VA 24061, USA

3. Department of Biomedical Sciences and Pathobiology, Virginia-Maryland College of

Veterinary Medicine, Blacksburg, VA 24061, USA

*Corresponding author: Email: [email protected]

Status: Manuscript under preparation.

Highlights

• Neuronal activity induces distinct gene expression changes at the early and late phases.

• RNA bisulfite sequencing reveals dynamic RNA 5-mrC landscape in neurons upon activation.

• mRNA methylation changes negatively correlate with mRNA expression changes in activated neurons.

2.1 Abstract

RNA cytosine-5 methylation (5-mrC) is an important posttranscriptional modification

involved in diverse biological processes. The dynamic regulation of 5-mrC modification in

response to environmental stimuli is still largely unknown. Here we provide a transcriptome-wide

map of 5-mrC modification at single-nucleotide resolution combined with gene expression profiles

in mouse cortical neurons upon activation. We have identified distinct gene expression changes in

activated neurons at both the early and late stages. RNA bisulfite sequencing reveals a dynamic

RNA 5-mrC landscape during neuronal activity and shows that mRNAs harboring differentially

methylated 5-mrC sites (DMS) are associated with mitochondrial and synaptic functions.

Moreover, it shows a negative correlation between RNA methylation changes and mRNA

expression changes in activated neurons. In summary, our study provides the transcriptome-wide

landscape of RNA methylation dynamics in neurons in response to environmental input and

reveals a potential link between RNA methylation and mRNA expression.

Keywords: RNA cytosine-5 methylation, neuronal activity, RNA bisulfite sequencing.

2.2 Background

Post-transcriptional modification of RNA is emerging as a new layer in the regulation of

gene expression (1). With recent advances in chemical and biochemical detection techniques,

researchers have identified more than 170 types of RNA modifications (2), including N6-methyladenosine (m6A) and 5-methylcytosine (5-mrC). Several studies (3-5) have characterized the writer, eraser and reader proteins of m6A in mRNA, indicating that RNA modification is reversible and highly dynamic. By contrast, the study of 5-mrC has only just begun.

RNA cytosine-5 methylation (5-mrC) was first identified in the more abundant and stable

ribosomal RNA (rRNA) and transfer RNA (tRNA) (6, 7). Later on, 5-mrC modification was

identified in the much less abundant messenger RNA (mRNA) and non-coding RNA by applying

transcriptome-wide approaches based on next generation sequencing (NGS) (8, 9). The 5-mrC

modification in mRNA was reported to be introduced mainly by NOP2/Sun RNA

methyltransferase family member 2 (NSUN2) (10, 11). Several reports (12-14) indicate that 5-mrC in RNA can be sequentially oxidized by the ten-eleven translocation (TET) enzymes (TET1, TET2, TET3) to form 5-hydroxymethylcytosine (5-hmrC), 5-formylcytosine (5-fC) and 5-carboxylcytosine (5-caC). However, little is known about the mechanism that further converts 5-carboxylcytosine to unmethylated cytosine in RNA. Moreover, Tet1/Tet2/Tet3 triple knockout mouse embryonic stem cells (ESCs) show reduced but detectable 5-hmrC levels compared to wild-type ESCs (12), indicating that additional unknown enzymes may be involved in the RNA demethylation pathway.

Despite the elusive pathway for 5-mrC demethylation, recent studies have revealed critical

roles of 5-mrC modification in RNA metabolism. Transcriptome-wide mapping of 5-mrC

modification shows a significant enrichment in the vicinity of the translational start sites and 3’-

untranslated regions (3’UTRs) (8, 10, 15). 5-mrC in mRNAs facilitates mRNA export from the

nucleus to the cytoplasm with the aid of the 5-mrC reader protein ALY/REF export factor

(ALYREF) (10). Moreover, the changes of 5-mrC in mRNAs affect the regulation of mouse testis

tissue development (10), the ovarian germline stem cell (GSC) development in Drosophila (16),

the process of maternal-to-zygotic transition (MZT) in Zebrafish (17), and the pathogenesis of


human bladder cancer (18). These findings indicate highly dynamic regulation of 5-mrC

modification in diverse physiological and pathological conditions.

To investigate the dynamic changes of 5-mrC modification, we adopt a widely used neuronal activity model, in which in vitro cultured mouse cortical neurons are depolarized with potassium chloride (19). Membrane depolarization triggers a calcium influx and activates a complex signaling cascade with highly dynamic gene expression (19, 20). This provides an ideal system to investigate the dynamics of 5-mrC modification in neurons in response to environmental stimuli. We perform RNA bisulfite sequencing (RNA BS-seq) and RNA-seq to profile 5-mrC modification at single-nucleotide resolution across the transcriptome, together with the gene expression profile, upon neuronal activation. We identify distinct gene expression profiles at the early and late stages of activated neurons. Differential methylation analysis shows dynamic mRNA methylation changes during neuronal activity, and the DMS-containing genes are linked to mitochondrial and synaptic functions. Furthermore, the changes in mRNA methylation are negatively correlated with the changes in mRNA expression in activated neurons. Thus, our findings illustrate the highly dynamic nature of 5-mrC modification induced by neuronal activity, and indicate a potential link between 5-mrC modification and mRNA expression.

2.3 Methods

Animal

C57BL/6 mice are maintained and bred in a 12-hour light/dark cycle under standard

pathogen-free conditions; adult female and male mice are used for timed pregnancy. Embryos are timed by checking vaginal plugs daily in the morning. Positive plugs are designated as E0.5. The

experiments have been approved prior to the study by the Institutional Animal Care and Use

Committee (IACUC) of Virginia Tech.

Primary mouse cortical neuronal culture

Primary mouse cortical neurons are prepared as previously described (19) with some

modifications. Briefly, C57BL/6 E16.5 mouse embryos are micro-dissected for cortex tissues and

the cortex tissues are dissociated into a single-cell suspension using the Neural Tissue Dissociation Kit (P) (Cat# 130-092-628) according to the manufacturer’s instructions. After dissociation, neuronal cells are filtered through a 70-µm strainer (Falcon) and centrifuged at 300 × g for 10 min. The cell pellet is

resuspended in neuronal culture medium (Neurobasal medium containing 2% B27 supplement


(Invitrogen), 1% Glutamax (ThermoFisher) and 1% penicillin-streptomycin (ThermoFisher)) and

seeded on laminin and poly-ornithine coated 10-cm dishes. Neurons are grown in vitro for 7 days

with fresh medium changed on DIV3 and DIV6.

Membrane depolarization with potassium chloride

At DIV6, neuronal cells are silenced with 1 µM tetrodotoxin (TTX; Fisher) and 100 µM DL-2-amino-5-phosphonopentanoic acid (DL-AP5; Fisher) overnight. The next morning, neuronal cells are depolarized with 55 mM KCl for 0h, 2h, or 6h. At the end of each time point, the neuronal cells are

harvested and lysed with TRIzol reagent for RNA extraction.

RNA sample preparation

Total RNA is extracted using TRIzol reagent combined with the RNeasy Mini Kit (QIAGEN)

with DNase I on-column digestion. To enrich poly(A)-containing mRNAs, two rounds of poly(A)

selection are performed using oligo(dT) beads (ThermoFisher) following the manufacturer’s

instructions.

Generation of spike-in unmethylated mRNA control

The spike-in unmethylated mRNA is transcribed from the pTRI-Xef plasmid supplied with the MEGAscript™ T7 Transcription Kit (Invitrogen). Briefly, the linearized pTRI-Xef plasmid is in vitro transcribed in a reaction with MEGAscript T7 RNA polymerase (Ambion) at 37 °C for 4 h, followed by DNase treatment to remove the DNA template. The RNA sample is purified with the RNeasy Mini Kit (QIAGEN). The in vitro transcribed unmethylated mRNA control is spiked into the RNA samples at a ratio of 0.5% before bisulfite treatment.

RNA BS-seq library construction

RNA bisulfite conversion is performed as previously described (15) with minor

modifications. Briefly, poly(A) RNA is spiked with unmethylated Xef RNA and bisulfite-converted using the EZ RNA Methylation Kit (Zymo Research), with initial denaturation at 95 °C for 1 min followed by three cycles of 70 °C for 10 min and 64 °C for 45 min. Binding, desulphonation, and purification are performed on-column following the manufacturer’s instructions. The eluted RNA is used for stranded RNA-seq library construction using the TruSeq Stranded mRNA Library Preparation Kit (Illumina) with the following modifications: 1) the fragmentation step is omitted; 2) ACT random hexamers are supplemented during first-strand cDNA synthesis.

RNA-seq library construction


Stranded RNA-seq libraries are constructed using the TruSeq Stranded mRNA Library

Preparation Kit (Illumina) following the manufacturer’s instructions. Briefly, after two rounds of

poly(A) selection, the mRNA samples are fragmented and primed to synthesize first strand cDNA,

followed by the synthesis of the second strand cDNA. After AMPure XP bead purification, dA tailing is performed and indexed adapters are ligated to both ends of the double-stranded cDNA. Adapter-ligated DNA fragments are enriched by PCR amplification for 12 cycles. After AMPure XP bead purification, the PCR products are size-selected in the range of 350 bp to 550 bp on a 2% dye-free agarose gel using the Pippin recovery system (Sage Science). The recovered libraries are sequenced on the HiSeq 4000 platform in 150 bp paired-end mode (Illumina).

RNA-seq data analysis

Raw reads are trimmed of adapter sequences and low-quality bases (Q < 30) using Trim Galore (version 0.5.0) (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). Processed reads longer than 30 nt are defined as clean reads. Clean reads are mapped to the mm10 genome and gene expression levels are calculated by RSEM (21). We filter out genes that are not expressed (TPM=0). The union of the genes expressed in the two replicates is used as the expressed gene list. For differential expression analysis, we use the cpm function from the edgeR package (22, 23) to generate CPM (counts per million) values and filter out genes with CPM ≤ 0.5. The raw counts are then used to identify differentially expressed genes with DESeq2 (24). A gene is considered differentially expressed if (1) the adjusted p-value is less than 0.05, and (2) the gene expression fold change is above 1.5.
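The filtering logic described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not the edgeR or DESeq2 implementation: the function names are hypothetical, the CPM here uses the plain library size, the `min_samples` rule for the CPM filter is an assumption (the text does not say in how many libraries a gene must exceed 0.5), and the fold-change cutoff is assumed to apply symmetrically to up- and down-regulated genes.

```python
import math

def counts_per_million(counts):
    """Scale one library's raw counts (gene -> count) to counts per million."""
    total = sum(counts.values())
    return {gene: c * 1e6 / total for gene, c in counts.items()}

def passes_cpm_filter(cpm_values, cutoff=0.5, min_samples=1):
    """Keep a gene if CPM > cutoff in at least min_samples libraries (assumed rule)."""
    return sum(v > cutoff for v in cpm_values) >= min_samples

def is_deg(padj, log2_fold_change, padj_cutoff=0.05, fc_cutoff=1.5):
    """Adjusted p-value < 0.05 and fold change > 1.5 in either direction (assumed symmetric)."""
    return padj < padj_cutoff and abs(log2_fold_change) > math.log2(fc_cutoff)
```

In practice the adjusted p-values and log2 fold changes would come from the DESeq2 results table; the sketch only makes the two decision rules explicit.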

RNA BS-seq data analysis

Mouse transcriptome (GRCm38) and annotation files are downloaded from the Ensembl database. Raw reads are trimmed of the first 6 bases at the 5’ end, adapter sequences, and low-quality bases by Trim Galore (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). Processed reads longer than 30 nt are defined as clean reads and mapped to the mouse transcriptome using “meRanT align” from meRanTK (version 1.2.1) with the parameters: -fmo -mmr 0.01 (25). Analysis of the unmethylated Xef mRNA spike-in controls reveals a global bisulfite conversion rate > 99.8%. Unambiguously aligned reads are used to call candidate 5-mrCs by meRanCall from meRanTK with the parameters: -md 1 -ei 0.1 -fdr 0.01. Only cytosine positions with coverage depth ≥ 20, methylation level ≥ 0.1 and methylated cytosine depth ≥ 3 are considered candidate 5-mrC sites. Candidate 5-mrC sites found on transcripts that are not expressed in the corresponding RNA-seq datasets are filtered out. The 5-mrC sites shared between the two replicates are considered credible 5-mrC sites and used for downstream analysis. The coordinates of these 5-mrC sites are converted to genome coordinates using the R package ensembldb

(26) (Supplementary Table 3).
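The site-level filters described above can be expressed as a short Python sketch. It is an illustration of the stated thresholds rather than the meRanTK code: the `Site` record, the function names, and the representation of expressed transcripts as a set of identifiers are all assumptions made for the example.

```python
from typing import NamedTuple

class Site(NamedTuple):
    transcript: str   # transcript identifier (hypothetical field names)
    position: int     # cytosine position on the transcript
    coverage: int     # total reads covering the cytosine
    methylated: int   # reads retaining C after bisulfite conversion

def is_candidate(site, min_cov=20, min_level=0.1, min_meth=3):
    """Coverage >= 20, methylation level >= 0.1 and methylated depth >= 3."""
    if site.coverage < min_cov or site.methylated < min_meth:
        return False
    return site.methylated / site.coverage >= min_level

def credible_sites(rep1, rep2, expressed):
    """Candidate sites on expressed transcripts that are shared by both replicates."""
    def keys(sites):
        return {(s.transcript, s.position) for s in sites
                if is_candidate(s) and s.transcript in expressed}
    return keys(rep1) & keys(rep2)
```

The intersection step is what makes the final list conservative: a site must pass every threshold independently in each replicate.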

Distribution of 5-mrC sites

The 5-mrC sites are annotated with the GTF file from Ensembl. The 5-mrC sites located within mRNAs are assigned to three segments: 5’ UTR, CDS, and 3’ UTR. Based on the ratio of the average lengths of 5’ UTR, CDS, and 3’ UTR in the transcriptome, we assign 5, 22 and 18 bins to 5’ UTR, CDS, and 3’ UTR, respectively. The number of 5-mrC sites located in each bin is counted, and the percentage of 5-mrC sites in each bin is calculated to plot the density of 5-mrC sites along mRNA transcripts.
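The binning scheme can be illustrated with a minimal Python sketch. The 5/22/18 bin counts come from the text; the 0-based position convention, the function names and the use of per-transcript segment lengths to scale each position are assumptions made for illustration.

```python
from collections import Counter

# Bin counts per mRNA segment, as stated in the text.
SEGMENT_BINS = (("5UTR", 5), ("CDS", 22), ("3UTR", 18))
TOTAL_BINS = sum(n for _, n in SEGMENT_BINS)  # 45 bins in total

def bin_index(pos, utr5_len, cds_len, utr3_len):
    """Map a 0-based transcript position to a bin index in 0..44."""
    if pos < 0:
        raise ValueError("position must be non-negative")
    lengths = {"5UTR": utr5_len, "CDS": cds_len, "3UTR": utr3_len}
    nt_offset = 0
    bin_offset = 0
    for name, n_bins in SEGMENT_BINS:
        seg_len = lengths[name]
        if pos < nt_offset + seg_len:
            # Relative position inside the segment, scaled to that segment's bins.
            return bin_offset + (pos - nt_offset) * n_bins // seg_len
        nt_offset += seg_len
        bin_offset += n_bins
    raise ValueError("position lies beyond the 3' end of the transcript")

def bin_percentages(indices):
    """Percentage of sites falling in each of the 45 bins."""
    if not indices:
        return [0.0] * TOTAL_BINS
    counts = Counter(indices)
    return [100.0 * counts[i] / len(indices) for i in range(TOTAL_BINS)]
```

Scaling each position by the length of its own segment lets transcripts of different lengths contribute to the same metagene profile.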

Differential 5-mrC methylation analysis

Sites included in the differential methylation analysis must meet two criteria: (1) coverage depth ≥ 1 in all replicates, and (2) called as a candidate 5-mrC site in at least one condition. A custom Perl script implementing Fisher’s exact test is used to evaluate the significance of differential methylation, and the false discovery rate (FDR) method is used to correct for multiple comparisons. Sites with adjusted p-value < 0.05 are considered differentially methylated sites (DMS).
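The statistical step can be sketched in Python as follows. This is not the Perl script used in the study: it is a minimal two-sided Fisher's exact test on a 2×2 table of methylated and unmethylated read counts, followed by a Benjamini-Hochberg adjustment. The choice of Benjamini-Hochberg as the FDR method, the pooling of read counts per condition, and all function names are assumptions.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is at least as unlikely as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)

def site_pvalue(meth_a, cov_a, meth_b, cov_b):
    """Compare methylated vs unmethylated reads at one site between two conditions."""
    return fisher_exact_two_sided(meth_a, cov_a - meth_a, meth_b, cov_b - meth_b)

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Sites whose adjusted p-value falls below 0.05 would then be reported as DMS.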

Correlation analysis between RNA methylation and RNA expression

The odds ratio (OR) or methylation fold change is calculated as previously described (27).

Pearson correlation is computed between log2 expression fold change and log2 methylation fold change to assess the relationship between RNA methylation and RNA expression.
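As a rough illustration of this calculation, the Python sketch below computes a log2 odds ratio of methylation between two conditions and a Pearson correlation coefficient. The exact odds-ratio procedure follows reference (27) and is not reproduced in the text, so the formula used here, including the 0.5 pseudocount that guards against zero counts, is an assumption, as are the function names.

```python
import math

def log2_odds_ratio(meth_a, unmeth_a, meth_b, unmeth_b, pseudo=0.5):
    """log2 of (odds of methylation in condition b) / (odds in condition a)."""
    odds_a = (meth_a + pseudo) / (unmeth_a + pseudo)
    odds_b = (meth_b + pseudo) / (unmeth_b + pseudo)
    return math.log2(odds_b / odds_a)

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

Each 5-mrC site contributes one point: its log2 odds ratio on one axis and the log2 expression fold change of its host mRNA on the other.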

Gene Ontology analysis

Gene ontology (GO) analysis is performed using the R package clusterProfiler (28). Default

parameters are used for the enrichment analysis for Biological Process (BP), cellular component

(CC), and molecular function (MF). The ten most significant BP categories are shown.

Immunostaining

Immunostaining is performed as previously described (29). Briefly, E16.5 mouse cortical neurons are dissociated, seeded on an 8-well chamber and cultured in vitro for 7 days (DIV7). The neurons are fixed with 4% paraformaldehyde in PBS for 15 min and permeabilized with 0.2% Triton X-100 in PBS for 10 min. After blocking with 5% Normal Goat Serum (ThermoFisher) at RT for 1 h, the cells are incubated with mouse anti-Tuj1 antibody (Biolegend, 801201) and rabbit anti-GFAP antibody (Sigma, HPA056030) at 4 °C overnight. The cells are then incubated with Cy3-conjugated anti-rabbit IgG (A10520, Invitrogen) and Alexa Fluor 488-conjugated anti-mouse IgG (A10680, Invitrogen) secondary antibodies at RT in darkness for 1 h. After washing 3 × 5 min with 1× PBS, cells are mounted with DAPI-Fluoromount-G™ Clear Mounting Media (SouthernBiotech, 010020). Fluorescent images are acquired using a confocal microscope.

2.4 Results

2.4.1 Distinct gene expression profile upon neuronal activation

Mouse E16.5 cortical neurons are dissociated and cultured as previously described (19). Immunostaining of the neuronal culture indicates a high purity of neurons (Figure 1). To identify the transcriptome-wide gene expression changes upon neuronal activation, we have performed RNA-seq in E16.5 cortical neurons at 0h, 2h, and 6h after membrane depolarization by 55 mM KCl (Table 1). Coverage is comparable among all six RNA-seq libraries (Figure S1b). The correlation between the two biological replicates is 0.98-0.99 (Figure S1c, S1d, S1e), and principal component analysis (PCA) shows that the two replicates of each condition lie close to each other (Figure S1a). With a cutoff of fold change > 1.5 and adjusted p-value < 0.05, we have identified numerous differentially expressed genes (DEGs) in activated neurons. Compared to the control (0h), neurons stimulated with KCl for 2h show 770 up-regulated genes and 1,145 down-regulated genes, while neurons stimulated with KCl for 6h display 2,222 up-regulated genes and 2,146 down-regulated genes (Figure 2a, 2b). GO annotation of the DEGs shows enrichment of a large number of biological functions (Figure S2). Genes up-regulated at both 2h and 6h are enriched in a common set of biological processes, such as the regulation of protein kinase activity, protein phosphorylation, the Wnt signaling pathway, and the ERK1 and ERK2 cascade. Meanwhile, genes up-regulated at 2h are specifically enriched in transcription factor complex, while genes up-regulated at 6h are enriched in components of neuronal cellular structures, such as post-synaptic membrane, axon part, and growth cone. Interestingly, the genes down-regulated at both 2h and 6h are significantly enriched in RNA modifications and RNA methyltransferase activity (Supplementary Figure 2). This suggests an overall inhibition of RNA methylation upon neuronal activation.


To further decipher the distinct gene expression profiles at the early and late stages of neuronal activity, we define early response genes by two criteria: 1) up-regulated at 2h compared to 0h; 2) down-regulated at 6h compared to 2h, and late response genes by two criteria: 1) up-regulated at 6h compared to 0h; 2) up-regulated at 6h compared to 2h. As such, we have identified 111 early response genes, including the transcription factors Egr1, Fos, Gadd45b, Npas4, Nr4a1, and 1,051 late response genes, including the pan-neuronal genes Nptx2, Gpr3, Kcna1. The early and late response genes show distinct expression profiles (Figure 2c, 2d). The expression profile of the full list of late response genes is shown in Supplementary Figure 3. The early and late response genes are submitted for GO annotation and the top 10 BP terms are shown (Figure 2e). The early response genes are highly enriched in biological processes associated with transcription factor complex, such as positive regulation of transcription from RNA polymerase II promoter and sequence-specific DNA binding, while the late response genes are associated with protein kinase activity and axonogenesis. Collectively, these results indicate highly dynamic and distinct gene expression profiles in neurons at the early and late stages of neuronal activity.
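The two definitions above amount to set intersections over the DEG lists of the individual comparisons. A minimal Python sketch, assuming each comparison has already been reduced to a set of gene identifiers (the function name and the toy gene labels in the usage note are hypothetical):

```python
def response_classes(up_2h_vs_0h, up_6h_vs_0h, up_6h_vs_2h, down_6h_vs_2h):
    """Split genes into early and late response sets from per-comparison DEG sets."""
    # Early: induced at 2h, then falling again between 2h and 6h.
    early = up_2h_vs_0h & down_6h_vs_2h
    # Late: induced at 6h relative to both 0h and 2h.
    late = up_6h_vs_0h & up_6h_vs_2h
    return early, late
```

For example, with `up_2h_vs_0h = {"g1", "g2"}`, `down_6h_vs_2h = {"g1"}`, `up_6h_vs_0h = {"g2", "g3"}` and `up_6h_vs_2h = {"g3"}`, the function returns `{"g1"}` as early and `{"g3"}` as late. The two classes cannot overlap, because a gene cannot be both up- and down-regulated in the 6h vs 2h comparison.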

Figure 2-1 Characterization of E16.5 cortical neuronal culture

E16.5 mouse cortical neurons are dissociated and cultured in vitro for 7 days. The neurons are double-immunostained for Tuj1 and GFAP. Scale bar: 50 µm.

[Figure panels: Tuj1, DAPI, GFAP, Merge.]

Table 2-1 Mapping statistics of RNA-seq datasets

RNA-seq datasets (NA)

Sample                 # of raw read pairs   # of clean read pairs   # of mapped read pairs   Mapping rate
KCl 0h rep1 RNA-seq    13,330,587            13,316,191              12,045,284               93.76%
KCl 0h rep2 RNA-seq    13,871,464            13,857,424              12,435,292               93.26%
KCl 2h rep1 RNA-seq    12,423,041            12,408,432              11,128,841               93.37%
KCl 2h rep2 RNA-seq    14,222,511            14,201,809              12,552,491               92.27%
KCl 6h rep1 RNA-seq    13,473,791            13,457,201              11,739,394               90.96%
KCl 6h rep2 RNA-seq    13,787,793            13,772,741              12,310,596               92.95%

Figure 2-2 Neuronal activity induces distinct gene expression profiles

[Figure panels. (a, b) Volcano plots: x-axis, log2 fold change (2h/0h in a, 6h/0h in b); y-axis, -log10(adjusted p-value). (c, d) Heatmaps of early response genes and late response genes at 2h and 6h, shown as log2 fold-change vs 0h. (e) GO terms plotted by -log10(adjusted p-value) and gene count: skeletal muscle cell differentiation; skeletal muscle organ development; cellular response to glucose starvation; skeletal muscle tissue development; muscle tissue development; negative regulation of cellular response to transforming growth factor beta stimulus; negative regulation of transforming growth factor beta receptor signaling pathway; striated muscle tissue development; transmembrane receptor protein serine/threonine kinase signaling pathway; regulation of MAP kinase activity; cell-substrate adhesion; activation of JUN kinase activity; activation of protein kinase activity; activation of MAPK activity; peptidyl-serine phosphorylation; cell-matrix adhesion; axonogenesis; potassium ion transport; protein autophosphorylation.]

(a, b) Volcano plot showing the differentially expressed genes (a: 0h vs 2h, b: 0h vs 6h) with

adjusted p-value < 0.05 and fold change > 1.5. (c, d) Heatmap showing the expression profile of

111 early response genes (c) and 111 late response genes (d). Gene expression level is presented

as log2 Fold-change vs 0h. (e) Gene ontology analysis of early response genes (111 genes) and

late response genes (1,051 genes).

2.4.2 Distribution profile of 5-mrC in mouse cortical neurons

To investigate the global profile of 5-mrC modification in mouse cortical neurons during neuronal activity, we perform RNA bisulfite sequencing (RNA BS-seq) according to the method described previously (15) with minor modifications. We obtain 42-62 million read pairs for each library, of which 27-33 million are unambiguously mapped to the reference transcriptome (mm10) (Table 2). To monitor the global bisulfite conversion efficiency, unmethylated in vitro-transcribed Xef mRNAs are spiked into the poly(A)-enriched RNAs, and the overall conversion rate (C to T conversion) is estimated to be 99.8%-99.9% in all the RNA BS-seq libraries (Table 2). We perform mapping and methylation calling using the meRanTK package with stringent criteria (see details in the Methods). After methylation calling, we consider sites with a coverage depth ≥ 20, methylation level ≥ 0.1 and methylated cytosine depth ≥ 3 as candidate 5-mrC sites. As the reduced sequence complexity of bisulfite-converted RNAs could cause incorrect read alignment (30), the candidate 5-mrC sites located on mRNAs that are not expressed (TPM = 0) are further excluded. The remaining 5-mrC sites are considered credible 5-mrC sites. Only the 5-mrC sites shared between the two biological replicates are used for downstream analysis.
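The conversion rate reported here is a simple ratio over the cytosine positions of the unmethylated spike-in. A minimal Python sketch, assuming that converted (read as T) and unconverted (read as C) base calls at spike-in cytosines have already been tallied; the function name and inputs are hypothetical:

```python
def conversion_rate(converted, unconverted):
    """Fraction of spike-in cytosines read as T after bisulfite treatment."""
    total = converted + unconverted
    if total == 0:
        raise ValueError("no cytosine positions were covered")
    return converted / total
```

Because the spike-in carries no methylation, every unconverted C counts as a conversion failure; for instance, 9,989 converted calls out of 10,000 would give a rate of 0.9989.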

Table 2-2 Mapping statistics of RNA BS-seq datasets

Sample                    # of raw read pairs   # of clean read pairs   # of mapped read pairs   Mapping rate   Bisulfite conversion rate
KCl 0h rep1 RNA BS-seq    49,159,745            46,819,385              30,438,509               65.01%         0.9989
KCl 0h rep2 RNA BS-seq    62,632,428            59,306,908              33,429,983               56.37%         0.9989
KCl 2h rep1 RNA BS-seq    42,128,172            39,653,238              27,638,572               69.70%         0.9989
KCl 2h rep2 RNA BS-seq    58,781,480            54,432,207              32,344,606               59.42%         0.9989
KCl 6h rep1 RNA BS-seq    54,899,136            51,731,054              31,692,653               61.26%         0.9983
KCl 6h rep2 RNA BS-seq    46,938,529            44,291,901              29,832,561               67.35%         0.9989


The reproducibility of the RNA BS-seq datasets is high between the two biological replicates (Supplementary Figure 4). The percentage of overlapping 5-mrC sites between two biological replicates ranges from 49.9% to 78.9% (Supplementary Figure 3a-c), and the Pearson correlation for the methylation level of the overlapping 5-mrC sites between two biological replicates ranges from 0.81 to 0.90 (Supplementary Figure 3d-f). A total of 2,009-3,175 5-mrC sites within 249-334 RNA molecules are identified in neurons stimulated with KCl for 0h, 2h and 6h. Among the 5-mrC sites identified, the majority (95.6%-97.6%) are located within messenger RNAs (mRNAs) (Figure 3f-h). The remaining 5-mrC sites are mapped to diverse types of RNAs, including processed transcripts, pseudogene transcripts, and others (Figure 3f-h). The median methylation level of 5-mrC sites is approximately 20% in the three groups (20.4% in 0h, 20.0% in 2h, and 21.6% in 6h) (Figure 3d). The methylation level of the majority (71.6% in 0h, 68.8% in 2h, and 62.5% in 6h) of 5-mrC sites is below 30%, and only 5.6%-6.0% of 5-mrC sites show a methylation level above 50% (Figure 3a-c). The sequence frequency logo shows that 5-mrC sites are embedded in a C-C/T-rich sequence context (Figure 3e). The density plot shows a mild peak of 5-mrC sites immediately downstream of translation initiation sites and pronounced peaks in the 3’UTR (Figure 3i-k).

To further explore the potential functions of 5-mrC modification in neurons upon neuronal activation, we perform gene ontology (GO) analysis on mRNAs harboring 5-mrC modification. We find that 5-mrC-containing mRNAs in all three groups (0h, 2h, 6h) show consistent enrichment in mitochondrial functions, such as oxidative phosphorylation, ATP synthesis, and the electron transport chain (Supplementary Figure 5). This suggests an important role for 5-mrC modification of mitochondrial mRNAs in the regulation of cellular energy metabolism. This is consistent with previous findings that mitochondrial mRNAs in tissues with high energy demand, such as muscle and heart, are enriched with 5-mrC modification (11). Interestingly, 5-mrC-containing mRNAs in neurons at 2h are specifically enriched in numerous signaling pathways, such as the response to extracellular stimuli and TOR signaling, while 5-mrC-containing mRNAs in neurons at 6h are more enriched in synaptic functions, such as the regulation of long-term neuronal synaptic plasticity, synapse organization, and the positive regulation of neuron projection development (Supplementary Figure 5). The difference in GO enrichment between the stages after stimulation indicates a potential involvement of 5-mrC modification in the regulation of neuronal activity. These results suggest dynamic regulation of 5-mrC modification during neuronal activity.

Figure 2-3 Distribution profile of 5-mrC modification in mouse cortical neurons during

neuronal activity

[Figure panels. (a-c) Histograms: x-axis, methylation level (0.1 to 1.0) at 0h, 2h and 6h; y-axis, percentage (%). (d) Boxplot: x-axis, KCl treatment (0h, 2h, 6h); y-axis, methylation level (%). (e) Sequence logos for 0h, 2h and 6h: x-axis, position -10 to 10; y-axis, probability. (f-h) Pie charts of RNA types; values as extracted: mRNA 96.16%, processed transcript 3.65%, others 0.19%; mRNA 97.56%, processed transcript 2.39%, others 0.05%; mRNA 95.64%, processed transcript 4.33%, others 0.03%. (i-k) Density plots for 0h, 2h and 6h: y-axis, density (×10^-2).]

(a, b, c) Histogram showing the distribution of 5-mrC methylation levels. (d) Boxplot showing the

methylation levels of 5-mrC sites. (e) Sequence frequency logo for the sequence context proximal

to 5-mrC sites. (f, g, h) Pie chart showing the percentage of 5-mrC sites in various RNA types. (i,

j, k) Density plot showing the distribution of 5-mrC sites along mRNA transcripts (5’UTR, CDS,

3’UTR). The moving average of the percentages of mRNA 5-mrC sites is shown.

2.4.3 Dynamic 5-mrC landscape upon neuronal activation

To determine the dynamic features of 5-mrC modification during neuronal activity, we compare the methylation profiles of the three groups. First, we check the overlap of 5-mrC sites among the three groups (Figure 4a). A total of 1,587 sites are conserved in neurons during neuronal activity, while a substantial number of 5-mrC sites are lost or gained upon neuronal activation (Figure 4a). This suggests dynamic regulation of 5-mrC modification in neurons upon activation. To further identify the dynamic changes of 5-mrC modification upon neuronal activation, we perform differential methylation analysis in two comparisons: 0h vs 2h, and 0h vs 6h, representing the early and late stages of neuronal activity, respectively. Fisher’s exact test is performed with an adjusted p-value cutoff of 0.05. For the comparison between quiescent neurons (0h) and activated neurons at the early stage (2h), we include a total of 3,896 C sites for differential methylation analysis and identify 1,166 5-mrC sites within 261 mRNAs as differentially methylated 5-mrC sites (DMS). For the comparison between quiescent neurons (0h) and activated neurons at the late stage (6h), we include a total of 4,980 C sites for differential methylation analysis and identify 641 5-mrC sites within 234 mRNAs as DMS. The global methylation profile of the DMS is shown in Figure 4b. GO annotation is performed to identify potential functions of the DMS-containing mRNAs. DMS-containing mRNAs from both comparisons are enriched for mitochondrial oxidative phosphorylation and synaptic function (Figure 4c,d), suggesting that 5-mrC modification contributes to the fine-tuned regulation of mitochondrial and synaptic homeostasis.


Figure 2-4 Neuronal activity induces RNA methylation changes in neurons

[Figure panels. (a) Venn diagram of 5-mrC sites at 0h, 2h and 6h; 1,587 sites are shared by all three groups. (b) Heatmap with columns 0h, 2h and 6h. (c) GO terms for DMS-genes (0h vs 2h) and DMS-genes (0h vs 6h), plotted by -log10(adjusted p-value) and gene count: cell migration in hindbrain; hindbrain radial glia guided cell migration; ATP synthesis coupled electron transport; TOR signaling; regulation of Ras protein signal transduction; regulation of small GTPase mediated signal transduction; electron transport chain; respiratory electron transport chain; proton transmembrane transport; generation of precursor metabolites and energy; cellular respiration; oxidative phosphorylation.]

(a) Venn diagram showing the overlap of 5-mrC sites among the three groups. (b) Heatmap showing the methylation profile of the union of differentially methylated 5-mrC sites (0h vs 2h and 0h vs 6h) in the three groups. (c) Gene ontology analysis of mRNAs with differentially methylated 5-mrC sites (0h vs 2h and 0h vs 6h).

2.4.4 RNA methylation negatively correlates with mRNA expression in neurons upon neuronal activation

To investigate the link between 5-mrC modification and gene expression, we integrate RNA-seq and RNA BS-seq datasets to compare 5-mrC methylation changes and corresponding mRNA expression changes. This procedure is performed for both comparisons (0h vs 2h, and 0h vs 6h). First, we include the union of 5-mrC sites between the two groups and the corresponding mRNAs for Pearson correlation analysis. The analysis shows a mild but significant negative correlation (0h vs 2h comparison: R = -0.063, p-value = 1.42e-4; 0h vs 6h comparison: R = -0.078, p-value = 7.27e-8) between log2 expression fold change and log2 methylation fold change for both comparisons (Figure 5a-b). We then narrow the analysis to only the DMS identified in the two comparisons (0h vs 2h, 0h vs 6h) and the DMS-containing mRNAs. The correlation between log2 expression fold change and log2 methylation fold change remains negative for both comparisons, although it reaches significance only for 0h vs 6h (0h vs 2h comparison: R = -0.042, p-value = 0.162; 0h vs 6h comparison: R = -0.089, p-value = 0.032) (Figure 5c-d). Furthermore, we include mRNAs that are both differentially expressed (DEG) and differentially methylated (DMS) between two groups (0h vs 2h, 0h vs 6h). The same negative trend holds between RNA methylation changes and mRNA expression changes. For the 0h vs 2h comparison, there are more down-regulated mRNAs with hypermethylated 5-mrC sites. For the 0h vs 6h comparison, there are more down-regulated mRNAs with hypermethylated 5-mrC sites as well as more up-regulated mRNAs with hypomethylated 5-mrC sites (Figure 5e-f). These results suggest that 5-mrC modification in mRNAs could inhibit mRNA expression in activated neurons.


Figure 2-5 5-mrC hypermethylation negatively correlates with mRNA expression

[Figure panels. (a-d) Scatter plots: x-axis, log2(odds ratio) (2h/0h or 6h/0h); y-axis, log2(fold-change) (2h/0h or 6h/0h). Annotated values: R = -0.063, P = 1.42e-4 (2h/0h) and R = -0.078, P = 7.27e-8 (6h/0h) for the union sites; R = -0.042, P = 0.162 (2h/0h) and R = -0.089, P = 0.032 (6h/0h) for the DMS. (e, f) x-axis, difference in methylation level (2h - 0h in e, 6h - 0h in f); y-axis, log2(fold-change); quadrant counts as extracted: 52, 21, 3, 6 (e) and 32, 32, 10, 46 (f).]

(a, b) Scatter plot showing the Pearson correlation between log2 methylation level odds ratio and log2 gene expression fold change in the union 5-mrC sites between 0h and 2h (a) or between 0h and 6h (b). (c, d) Scatter plot showing the Pearson correlation between log2 methylation level odds ratio and log2 gene expression fold change in differentially methylated 5-mrC sites between 0h and 2h (c) or between 0h and 6h (d). (e, f) Distribution of mRNAs with significant changes in both 5-mrC methylation level and gene expression level in quiescent and activated neurons (e: 0h vs 2h, f: 0h vs 6h).

2.5 Discussion

In the nervous system, activity-driven gene expression is an essential part of neuronal response to environmental stimuli, which could lead to long-lasting structural and electrophysiological adaptations in the neural circuit during development, learning and memory

formation (31, 32). Previous studies have reported the dynamic changes in DNA methylation and

chromatin accessibility in neurons in response to stimulation (33, 34). Meanwhile, the changes in

RNA cytosine-5 methylation in activated neurons have not been studied yet.

In this study, we have applied a classical neuronal activity model to investigate the dynamic

5-mrC profile in activated neurons. With both RNA-seq and RNA BS-seq datasets, we are able to

profile gene expression as well as 5-mrC modification at transcriptome-wide level. We identify

distinct gene expression profiles with one set of early response genes (111 genes) for the early

stage and one set of late response genes (1,051 genes) for the late stage of neuronal activity. Late response genes far outnumber early response genes. This is consistent

with the concept that the early response genes serve as the regulatory factors, such as transcription

factors, that regulate the expression of late response genes, which are involved in diverse aspects

of neuronal functions (19). Moreover, genes down-regulated at 2h are highly enriched in the

regulation of RNA methyltransferase activity. More studies are needed to elucidate the biological

functions of RNA methylation-related differentially expressed genes during neuronal activity.

With stringent parameters for alignment, methylation calling and filtering of potential false

positive 5-mrC sites, we identify thousands of 5-mrC sites in neurons during different stages of

neuronal activity. We further perform differential methylation analysis by Fisher’s exact test and

identify a dynamic 5-mrC modification landscape upon neuronal activation. GO annotation shows

that DMS-related genes are significantly enriched in mitochondrial and synaptic functions. This

indicates the potential roles of 5-mrC modification in the regulation of energy metabolism and

synaptic adaptation in neurons in response to environmental stimuli.

Furthermore, we investigate the relationship between RNA methylation and RNA expression.

We compute the Pearson correlation between log2 expression fold changes and log2 methylation fold

changes using mRNAs containing either the union sets of 5-mrC sites or the DMS sites between

two groups (early stage: 0h vs 2h, late stage: 0h vs 6h). The analysis shows a consistently negative correlation

between RNA expression changes and RNA methylation changes. We further confirm this trend

by plotting the distribution of differentially expressed genes containing differentially methylated

5-mrC sites. Thus, these findings illustrate a potential link between RNA methylation and RNA

expression.
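The correlation described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names, the pseudocount and the input layout are assumptions introduced here, and the correlation is computed with plain Python rather than a statistics package.

```python
import math

def log2_fold_change(treated, control, pseudo=0.01):
    """log2 ratio of two non-negative values.

    The pseudocount is an assumption made here to avoid division by zero;
    the source does not state how zero values are handled.
    """
    return math.log2((treated + pseudo) / (control + pseudo))

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

def expression_methylation_correlation(records):
    """records: iterable of (expr_0h, expr_late, meth_0h, meth_late) per 5-mrC site.

    Returns the Pearson r between the log2 expression fold change of the host
    mRNA and the log2 methylation fold change of the site.
    """
    expr_fc = [log2_fold_change(e1, e0) for e0, e1, _, _ in records]
    meth_fc = [log2_fold_change(m1, m0) for _, _, m0, m1 in records]
    return pearson_r(expr_fc, meth_fc)
```

A negative return value from `expression_methylation_correlation` corresponds to the trend reported in the text, where sites gaining methylation tend to sit on mRNAs whose expression decreases.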


In this study, we provide a transcriptome-wide map of 5-mrC modification in neurons in

response to environmental stimuli. To further our understanding of RNA methylation in the

regulation of neuronal activity, we need more functional studies focusing on specific genes,

especially the studies on RNA methyltransferases and functional proteins that facilitate the

regulation of 5-mrC modification.

2.6 Supplementary data

The following figures are the supplementary data to this project:

Supplementary Figure 1. Reproducibility between replicates in RNA-seq datasets

(a) PCA analysis showing the similarities of the six RNA-seq datasets. (b) Boxplot showing the

coverage among the six RNA-seq libraries. (c, d, e) Scatter plot showing the Pearson correlation

between two biological replicates in RNA-seq datasets.

[Supplementary Figure 1 image: (a) PCA plot (PC1 vs PC2) of the 0h, 2h and 6h samples; (b) boxplot of log2(normalized counts + 1) for 0h rep1, 0h rep2, 2h rep1, 2h rep2, 6h rep1 and 6h rep2; (c, d, e) scatter plots of log2(count + 1) between replicates at 0h, 2h and 6h, with reported Pearson’s r = 0.9829, 0.9810 and 0.9857.]

Supplementary Figure 2. GO annotation of differentially expressed genes in neurons upon

activation

Gene ontology analysis of differentially expressed mRNAs (0h vs 2h up-regulated, 0h vs 2h down-

regulated, 0h vs 6h up-regulated, 0h vs 6h down-regulated).

[Supplementary Figure 2 image: GO terms for the groups 0h vs 2h down, 0h vs 2h up, 0h vs 6h up and 0h vs 6h down; legends: Count (0 to 100) and -log10(adjusted p-value).]

Supplementary Figure 3. Expression profile of late response genes

Gene expression profile of the full list of late response genes (1,051 genes).

[Supplementary Figure 3 image: heatmap of late response genes (1,051 genes) at 2h and 6h; color scale: log2 fold-change vs 0h.]

[Supplementary Figure 4 image: (a, b, c) Venn diagrams for the 0h, 2h and 6h replicate pairs (extracted counts: 3,174; 776; 1,494; 1,152; 2,009; 1,031; 784; 2,865; 2,909; extracted percentages: 61.82%, 63.59%, 63.56%, 66.09%, 78.85%, 49.91%); (d, e, f) scatter plots of m5C level between replicates at 0h, 2h and 6h, with reported Pearson’s r = 0.8969, 0.8451 and 0.8145.]

Supplementary Figure 4. Reproducibility between replicates in RNA BS-seq datasets

(a, b, c) Venn diagram showing the overlap of 5-mrC sites between two biological replicates in

neurons. (d, e, f) Scatter plot showing the Pearson correlation of common 5-mrC sites between two

biological replicates in neurons.

Supplementary Figure 5. GO annotation of mRNAs containing 5-mrC sites

Gene ontology analysis of mRNAs with 5-mrC sites (0h, 2h, 6h).

[Supplementary Figure 5 image: GO terms for mRNAs with 5-mrC sites at 0h, 2h and 6h; legends: Count (0 to 15) and -log10(adjusted p-value).]

2.7 References

1. Song J, Yi C. Chemical Modifications to RNA: A New Layer of Gene Expression Regulation. ACS chemical biology. 2017;12(2):316-25.
2. Boccaletto P, Machnicka MA, Purta E, Piatkowski P, Baginski B, Wirecki TK, et al. MODOMICS: a database of RNA modification pathways. 2017 update. Nucleic acids research. 2018;46(D1):D303-D307.
3. Liu J, Yue Y, Han D, Wang X, Fu Y, Zhang L, et al. A METTL3-METTL14 complex mediates mammalian nuclear RNA N6-adenosine methylation. Nature chemical biology. 2014;10(2):93-5.
4. Zheng G, Dahl JA, Niu Y, Fedorcsak P, Huang CM, Li CJ, et al. ALKBH5 is a mammalian RNA demethylase that impacts RNA metabolism and mouse fertility. Molecular cell. 2013;49(1):18-29.
5. Li M, Zhao X, Wang W, Shi H, Pan Q, Lu Z, et al. Ythdf2-mediated m(6)A mRNA clearance modulates neural development in mice. Genome biology. 2018;19(1):69.
6. Agris PF. Bringing order to translation: the contributions of transfer RNA anticodon-domain modifications. EMBO reports. 2008;9(7):629-35.
7. Schaefer M, Pollex T, Hanna K, Lyko F. RNA cytosine methylation analysis by bisulfite sequencing. Nucleic acids research. 2009;37(2):e12.
8. Squires JE, Patel HR, Nousch M, Sibbritt T, Humphreys DT, Parker BJ, et al. Widespread occurrence of 5-methylcytosine in human coding and non-coding RNA. Nucleic acids research. 2012;40(11):5023-33.
9. Edelheit S, Schwartz S, Mumbach MR, Wurtzel O, Sorek R. Transcriptome-wide mapping of 5-methylcytidine RNA modifications in bacteria, archaea, and yeast reveals m5C within archaeal mRNAs. PLoS genetics. 2013;9(6):e1003602.
10. Yang X, Yang Y, Sun BF, Chen YS, Xu JW, Lai WY, et al. 5-methylcytosine promotes mRNA export - NSUN2 as the methyltransferase and ALYREF as an m(5)C reader. Cell research. 2017;27(5):606-25.
11. Huang T, Chen W, Liu J, Gu N, Zhang R. Genome-wide identification of mRNA 5-methylcytosine in mammals. Nature structural & molecular biology. 2019;26(5):380-8.
12. Fu L, Guerrero CR, Zhong N, Amato NJ, Liu Y, Liu S, et al. Tet-mediated formation of 5-hydroxymethylcytosine in RNA. Journal of the American Chemical Society. 2014;136(33):11582-5.
13. Huber SM, van Delft P, Mendil L, Bachman M, Smollett K, Werner F, et al. Formation and abundance of 5-hydroxymethylcytosine in RNA. Chembiochem : a European journal of chemical biology. 2015;16(5):752-5.
14. Basanta-Sanchez M, Wang R, Liu Z, Ye X, Li M, Shi X, et al. TET1-Mediated Oxidation of 5-Formylcytosine (5fC) to 5-Carboxycytosine (5caC) in RNA. Chembiochem : a European journal of chemical biology. 2017;18(1):72-6.
15. Amort T, Rieder D, Wille A, Khokhlova-Cubberley D, Riml C, Trixl L, et al. Distinct 5-methylcytosine profiles in poly(A) RNA from mouse embryonic stem cells and brain. Genome biology. 2017;18(1):1.
16. Zou F, Tu R, Duan B, Yang Z, Ping Z, Song X, et al. Drosophila YBX1 homolog YPS promotes ovarian germ line stem cell development by preferentially recognizing 5-methylcytosine RNAs. Proceedings of the National Academy of Sciences of the United States of America. 2020;117(7):3603-9.
17. Yang Y, Wang L, Han X, Yang WL, Zhang M, Ma HL, et al. RNA 5-Methylcytosine Facilitates the Maternal-to-Zygotic Transition by Preventing Maternal mRNA Decay. Molecular cell. 2019.
18. Chen X, Li A, Sun BF, Yang Y, Han YN, Yuan X, et al. 5-methylcytosine promotes pathogenesis of bladder cancer through stabilizing mRNAs. Nature cell biology. 2019;21(8):978-90.
19. Malik AN, Vierbuchen T, Hemberg M, Rubin AA, Ling E, Couch CH, et al. Genome-wide identification and characterization of functional neuronal activity-dependent enhancers. Nature neuroscience. 2014;17(10):1330-9.
20. Kim TK, Hemberg M, Gray JM, Costa AM, Bear DM, Wu J, et al. Widespread transcription at neuronal activity-regulated enhancers. Nature. 2010;465(7295):182-7.
21. Li B, Dewey CN. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC bioinformatics. 2011;12:323.
22. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139-40.
23. McCarthy DJ, Chen Y, Smyth GK. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic acids research. 2012;40(10):4288-97.
24. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome biology. 2014;15(12):550.
25. Rieder D, Amort T, Kugler E, Lusser A, Trajanoski Z. meRanTK: methylated RNA analysis ToolKit. Bioinformatics. 2016;32(5):782-5.
26. Rainer J, Gatto L, Weichenberger CX. ensembldb: an R package to create and use Ensembl-based annotation resources. Bioinformatics. 2019;35(17):3151-3.
27. Wei Z, Panneerdoss S, Timilsina S, Zhu J, Mohammad TA, Lu ZL, et al. Topological Characterization of Human and Mouse m(5)C Epitranscriptome Revealed by Bisulfite Sequencing. International journal of genomics. 2018;2018:1351964.
28. Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16(5):284-7.
29. Sun Z, Xu X, He J, Murray A, Sun MA, Wei X, et al. EGR1 recruits TET1 to shape the brain methylome during development and upon neuronal activity. Nature communications. 2019;10(1):3892.
30. Khoddami V, Yerra A, Cairns BR. Experimental Approaches for Target Profiling of RNA Cytosine Methyltransferases. Methods in enzymology. 2015;560:273-96.
31. Leslie JH, Nedivi E. Activity-regulated genes as mediators of neural circuit plasticity. Progress in neurobiology. 2011;94(3):223-37.
32. West AE, Greenberg ME. Neuronal activity-regulated gene transcription in synapse development and cognitive function. Cold Spring Harbor perspectives in biology. 2011;3(6).
33. Guo JU, Ma DK, Mo H, Ball MP, Jang MH, Bonaguidi MA, et al. Neuronal activity modifies the DNA methylation landscape in the adult brain. Nature neuroscience. 2011;14(10):1345-51.
34. Su Y, Shin J, Zhong C, Wang S, Roychowdhury P, Lim J, et al. Neuronal activity modifies the chromatin accessibility landscape in the adult brain. Nature neuroscience. 2017;20(3):476-83.

Chapter 3 - Influence of Folate on RNA Cytosine-5 Methylation in Neural

Stem Cells

Xiguang Xu1,2, Xiaoran Wei1,3, Natalie Melville, Razan Alajoleen1,3, Rachel Padget2,4, James

Smyth2,4, Hrubec Terry5, Hehuang Xie1,2,3*

1. Epigenomics and Computational Biology Lab, Fralin Life Sciences Institute of Virginia

Tech, VA, 24060 USA

2. Department of Biological Sciences, College of Science, Virginia Tech, Blacksburg, VA

24061, USA

3. Department of Biomedical Sciences and Pathobiology, Virginia-Maryland College of

Veterinary Medicine; Virginia Tech, VA, 24060 USA

4. Fralin Biomedical Research Institute at VTC, Virginia Tech, VA, 24060 USA

*Corresponding author: Email: [email protected]

Status: Manuscript under preparation.


Highlights

• Folate deficiency and supplementation induce changes in mRNA translation efficiency in

adult mouse neural stem cells.

• Folate deficiency and supplementation induce 5-mrC modification changes in adult

mouse neural stem cells.

• Polysome mRNAs show 5-mrC hypermethylation relative to total mRNAs.

3.1 Abstract

RNA cytosine-5 methylation (5-mrC) is a post-transcriptional modification involved in diverse

physiological and pathological conditions. The formation of 5-mrC modification is mediated by

the transfer of a methyl group to the RNA cytosine-5 position. However, the influence of the

methyl donor, folate, on 5-mrC modification is largely unknown. Here, we provide a

transcriptome-wide landscape of 5-mrC modification in both total mRNAs and polysome-

associated mRNAs at single-nucleotide resolution in adult mouse neural stem cells (NSCs)

cultured in low folate (LF), medium folate (MF) and high folate (HF) conditions. Polysome

profiling reveals a panel of differentially translated mRNAs in NSCs with folate deficiency or

supplementation. We identify distinct 5-mrC modification profiles in both total mRNAs and

polysome mRNAs in NSCs treated with different concentrations of folate. Moreover, polysome mRNAs

show 5-mrC hypermethylation relative to total mRNAs. This study presents the

comprehensive influence of folate deficiency and supplementation on RNA cytosine-5

methylation and mRNA translation.

Keywords: RNA cytosine-5 methylation, folic acid, adult mouse neural stem cell, polysome

profiling


3.2 Background

Folate is an essential B vitamin and a major methyl donor with many important biological

functions including DNA methylation and synthesis (1-6). The demand for folate increases during

pregnancy because of the growth of fetus, placenta and uterus (7). The benefits of sufficient folate

on reproductive and cardiovascular health have been well established (8-11). However, recent

studies have raised concerns about the adverse effects of maternal folate excess. Beard et

al. found that excessive intake of folic acid may lead to autism-associated nervous tissue damage

(12). A recent study reveals a ‘U shaped’ relationship between maternal multivitamin

supplementation frequency and autism spectrum disorder (ASD) risk; this association is further

supported by findings based on the measurement of maternal plasma folate levels (13). Similarly,

the methyl donor supplementation used in the yellow agouti mouse model prevents

transgenerational amplification of obesity (14). However, high folate intake has been shown to

have adverse effects on rodent development (15) with a higher incidence of ventricular septal

defects, embryonic growth retardation and short-term memory impairment in offspring (16-18).

Additionally, at the molecular level, aberrant expression of imprinted and autism-related genes

including Aust1 and Fmr1 is observed in the cerebral cortex of postnatal day 1 (P1) pups (19).

Maternal folate supplementation prior to conception rescues the proliferation potential of

neural stem cells in Sp-/- embryos via epigenetic mechanisms (20). Splotch (Sp-/-) mice carry a

homozygous mutation of the Pax3 gene and are a widely used neural tube defect (NTD)-prone mouse

model with impaired ability to synthesize thymidylate and spontaneous occurrence of neural tube

defects in Sp-/- embryos (21, 22). Furthermore, recent studies showed that folic acid promotes the

proliferation of neural stem cells (NSCs) (23-25). It increases the phosphorylation of ERK1/2 (26),

and activates the ERK signaling that is implicated in proliferation (27). It activates Notch signaling

with elevated expression of Notch1 and Hes5 at both mRNA and protein levels (23). Moreover,

folic acid supplementation increases the protein expression and enzymatic activities of DNMT

family (24), resulting in altered DNA methylation profile in the PI3K/Akt/CREB pathway (25).

Folate deficiency leads to elevated levels of homocysteine, which may induce DNA damage via

increased reactive oxygen species (ROS) production, leading to apoptosis in NSCs (28). In

addition, homocysteine inhibits the phosphorylation of ERK1/2, thus suppressing ERK signaling

(29), which affects the regulation of cell growth (27). The protein expression levels and enzymatic

activities of aconitase and respiratory complex III, two critical components of the mitochondrial

respiratory chain, are decreased because of the neurotoxicity induced by homocysteine in NSCs

(30). Moreover, high level of homocysteine reduces the protein expression and the enzymatic

activity of DNA methyltransferases including DNMT1, DNMT3a and DNMT3b (31). This

indicates dysregulation of methylation events is an essential molecular mechanism underlying the

pathogenesis of folate deficiency.

Post-transcriptional modification of RNA is emerging as a new layer of gene expression regulation

(32). Among the numerous modifications, 5-methylcytosine (5-mrC) is one of the best-known

RNA modifications detected in transfer RNAs (tRNAs), ribosomal RNAs (rRNAs) and

most recently in messenger RNAs (mRNAs) (33, 34). RNA cytosine-5 methylation plays an

essential role in the regulation of diverse biological processes. 5-mrC in tRNAs is involved in the

regulation of tRNA stability and protein synthesis (35). 5-mrC in rRNAs affects the regulation of

translational fidelity and ribosome biogenesis (36, 37). 5-mrC in mRNAs regulates the stability,

export and translation efficiency of mRNAs (38, 39). Folate affects the regulation of RNA methylation

as well. The one-carbon unit bound by folate is shown to be essential for the methylation of tRNA

in mammalian mitochondria, which is required for mitochondrial mRNA (mt-mRNA) translation

and subsequent oxidative phosphorylation (40). Despite the critical role of folate as a methyl

donor, no previous study has investigated the influence of folate intake on mRNA

cytosine-5 methylation.

The goal of this study is to explore the folate dose-response relationships and underlying

molecular mechanisms in terms of mRNA methylation, transcription and translation. We

hypothesize that the intake of the methyl donor folate may influence RNA metabolism in neural

stem cells. Here, we systematically assess the transcriptome-wide influence of folic acid deficiency

and supplementation on RNA cytosine-5 methylation, transcription as well as translation profiles

in adult mouse neural stem cells (NSCs). To our surprise, we detect no differentially

expressed genes but a number of differentially translated genes in NSCs with folate deficiency or

supplementation. We identify distinct 5-mrC modification profiles in NSCs cultured with different

concentrations of folic acid. Moreover, we find consistent hypermethylation in polysome mRNAs

compared to total mRNAs. Our findings illustrate the transcriptome-wide influence of folate on

mRNA methylation and translation in NSCs and indicate a potential link between mRNA

methylation and mRNA translation efficiency.


3.3 Methods

Adult mouse NSC culture and treatments

Adult mouse neural stem cells (NSCs) from the subventricular zone (SVZ) of the lateral

ventricles are isolated and cultured as previously described (41). The mouse adult NSCs within 10

passages are used for experiments.

To test the effect of folic acid (FA), we prepare low FA medium (1.5 µmol/L folic acid) by

mixing folic acid-free DMEM (Sigma) and Ham’s F12 medium (containing 3 µmol/L folic acid)

at a 1:1 volume ratio, supplemented with 2% B27 supplement, 2 mmol/L L-glutamine, 1x penicillin-

streptomycin, 20 ng/ml epidermal growth factor (EGF, PeproTech) and 20 ng/ml basic fibroblast

growth factor (bFGF, PeproTech). A 10 mM FA stock is prepared from folic acid powder (Sigma)

and filtered through a 0.22 µm membrane. NSCs are incubated with the indicated concentration of

FA for 4 days, with a medium change at day 2. The three treatment groups are 1.5 µmol/L folic acid

(LF group), 10 µmol/L folic acid (MF group) and 80 µmol/L folic acid (HF group).
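The concentrations above follow from simple mixing arithmetic. The sketch below is illustrative only: it assumes that the MF and HF media are made by adding the 10 mM stock to the low FA medium, which the text implies but does not state, and the function names and the 50 ml working volume are hypothetical.

```python
def mixed_concentration(parts):
    """Concentration after combining solutions.

    parts: list of (concentration, volume) tuples in consistent units.
    """
    total_volume = sum(volume for _, volume in parts)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(conc * volume for conc, volume in parts) / total_volume

def stock_volume_needed(medium_volume, medium_conc, stock_conc, target_conc):
    """Volume of stock to add to a medium to reach target_conc.

    Derived from the mass balance
    medium_conc * V_m + stock_conc * V_s = target_conc * (V_m + V_s).
    """
    if not medium_conc <= target_conc < stock_conc:
        raise ValueError("target must lie between medium and stock concentrations")
    return medium_volume * (target_conc - medium_conc) / (stock_conc - target_conc)

if __name__ == "__main__":
    # 1:1 mix of folic acid-free DMEM (0 µmol/L) and Ham's F12 (3 µmol/L).
    low_fa = mixed_concentration([(0.0, 1.0), (3.0, 1.0)])
    print(f"low FA medium: {low_fa} µmol/L")
    # Hypothetical 50 ml batches; the 10 mM stock is 10,000 µmol/L.
    for label, target in (("MF", 10.0), ("HF", 80.0)):
        volume_ml = stock_volume_needed(50.0, low_fa, 10000.0, target)
        print(f"{label}: add {volume_ml * 1000:.1f} µl stock per 50 ml")
```

The first call reproduces the 1.5 µmol/L figure given for the low FA medium; the stock volumes are small enough that the dilution of the other medium components is negligible.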

Polysome fractionation

Polysome fractionation is performed as previously described (42). After treatment with

different concentrations of folic acid for 4 days, monolayer cultures of NSCs are incubated with

cycloheximide (CHX, Sigma Aldrich, 100 µg/ml) at 37 °C for 10 min to stabilize ribosomes. After

washing with ice-cold PBS containing 100 µg/ml CHX, the NSCs are detached from the plate with a

cell scraper and spun down at 300 g at 4 °C for 5 min. The cell pellet is immediately stored in a

-80 °C freezer for later analysis. Frozen cell pellets are thawed on ice, lysed in hypotonic lysis buffer and

centrifuged at 15,000 rpm for 5 min at 4 °C. The supernatant is collected and subjected to the

measurement of OD260nm using a Nanodrop 2000. Based on the OD260nm values, equal amounts

of lysate are loaded on 10%-50% sucrose gradients and ultracentrifuged in a SW41Ti rotor

(Beckman Coulter) at 35,000 rpm at 4 °C for 3 hours. The gradients are fractionated into 15

fractions using a Gradient Station (BioComp). Polysome fractions (fractions 9-15) are identified,

pooled, and extracted with TRIzol LS reagent. The purified polysome RNA samples are used for

RNA-seq and RNA BS-seq library construction.

Generation of unmethylated spike-in mRNA control

The spiked-in unmethylated mRNA is transcribed, according to the manufacturer’s manual, from the

pTRI-Xef plasmid supplied with the MEGAscript™ T7 Transcription Kit (Invitrogen), which encodes the

1.85 kb Xenopus elongation factor 1α mRNA. Briefly, the linearized pTRI-Xef

plasmid is in vitro transcribed in a reaction with MEGAscript T7 RNA polymerase (Ambion) at

37 °C for 4 h, followed by DNase treatment to remove DNA template. The RNA sample is purified

by RNeasy Mini Kit (QIAGEN). The in vitro transcribed unmethylated mRNA control is spiked

at a ratio of 0.5% in the RNA samples before bisulfite treatment.

RNA BS-seq library construction

RNA bisulfite conversion is performed as previously described (43) with minor

modifications. Briefly, poly(A) RNA is spiked with unmethylated Xef1 RNA and bisulfite

converted using the EZ RNA Methylation Kit (Zymo Research) with initial denaturation at 94 °C

for 1 min, followed by three cycles of 70 °C for 10 min and 64 °C for 45 min. Binding,

desulphonation, and purification are performed on-column following the manufacturer’s

instructions. The eluted RNA is used for stranded RNA-seq library construction using the TruSeq

Stranded mRNA Library Preparation Kit (Illumina) with the following modifications: (1) the

fragmentation step is omitted; (2) ACT random hexamers are supplemented during first strand cDNA synthesis.

RNA-seq library construction

Stranded RNA-seq libraries are constructed using the TruSeq Stranded mRNA Library

Preparation Kit (Illumina) following the manufacturer’s instructions. Briefly, after two rounds of

poly(A) selection, the mRNA samples are fragmented and primed to synthesize first strand cDNA,

followed by the synthesis of the second strand cDNA. After Ampure XP beads purification, dA

tailing is performed and indexed adapters are ligated to both ends of the ds cDNA. Adapter-ligated

DNA fragments are enriched by PCR amplification for 12 cycles. After Ampure XP beads

purification, the PCR products are size-selected in the range from 350 bp to 550 bp on a 2% dye-

free agarose gel using the Pippin recovery system (Sage Science). The recovered libraries are

sequenced on the HiSeq 4000 platform (Illumina) in 150 bp paired-end mode.

RNA-seq data analysis

Raw reads are trimmed of adapter sequences and low quality bases (Q < 30) using Trim

Galore (version 0.5.0) (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). The

processed reads with lengths greater than 30 nt are defined as clean reads. Clean reads are mapped

to the mm10 genome and gene expression levels are quantified by RSEM (44). We filter out genes that

are not expressed (TPM = 0), and the union of genes expressed in the two replicates is compiled as the expressed

gene list. For differential expression analysis, we use the cpm function from the edgeR

package (45, 46) to generate CPM (counts per million) values and then filter out the genes

with CPM ≤ 0.5. The raw counts are used to identify differentially expressed genes with

DESeq2 (47). The criteria for differentially expressed genes are: (1) the adjusted p-value is less

than 0.05, and (2) the fold change is above 1.5.
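The filtering logic above can be sketched in a few lines. This is a simplified stand-in, not the edgeR or DESeq2 implementation: CPM is computed here from raw column sums, the `min_samples` rule is an assumption (the text does not say in how many libraries a gene must exceed the threshold), and the fold-change cutoff is assumed to apply symmetrically to up- and down-regulation.

```python
import math

def counts_per_million(counts_by_sample):
    """counts_by_sample: {sample: {gene: raw_count}} -> same shape, in CPM."""
    cpm = {}
    for sample, counts in counts_by_sample.items():
        library_size = sum(counts.values())
        if library_size <= 0:
            raise ValueError(f"library {sample!r} has no reads")
        cpm[sample] = {gene: count * 1e6 / library_size
                       for gene, count in counts.items()}
    return cpm

def genes_passing_cpm(cpm, threshold=0.5, min_samples=1):
    """Genes with CPM > threshold in at least min_samples libraries.

    min_samples is a hypothetical parameter introduced for this sketch.
    """
    genes = set()
    for values in cpm.values():
        genes.update(values)
    return {
        gene for gene in genes
        if sum(values.get(gene, 0.0) > threshold for values in cpm.values())
        >= min_samples
    }

def is_differentially_expressed(log2_fold_change, adjusted_p,
                                fold_cutoff=1.5, alpha=0.05):
    """Apply the two stated criteria to one gene's test result."""
    return adjusted_p < alpha and abs(log2_fold_change) > math.log2(fold_cutoff)
```

In practice the log2 fold change and adjusted p-value would come from the differential expression tool's result table; the last function only encodes the thresholds.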

Differential translation analysis

Translation efficiency (TE) is estimated as the ratio between polysome mRNA counts and

total mRNA counts (TE=polysome/total). Fold changes in TE between two conditions are

calculated as TE(treatment)/TE(control). We perform differential translation efficiency analysis

using the package Xtail (48) with the following parameter: minMeanCount = 1. The criteria for

differentially translated genes (DTG) are: (1) the adjusted p-value is less than 0.05, and (2)

the fold change is at least 1.5.
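The two ratios defined above can be written out directly. The sketch below only illustrates the arithmetic of TE and its fold change for a single gene; it does not reproduce Xtail's statistical model, and the function names are hypothetical.

```python
def translation_efficiency(polysome_count, total_count):
    """TE = polysome mRNA count / total mRNA count for one gene."""
    if total_count <= 0:
        raise ValueError("total mRNA count must be positive")
    return polysome_count / total_count

def te_fold_change(polysome_treatment, total_treatment,
                   polysome_control, total_control):
    """Fold change in TE: TE(treatment) / TE(control)."""
    te_control = translation_efficiency(polysome_control, total_control)
    if te_control == 0:
        raise ValueError("control TE is zero; fold change is undefined")
    te_treatment = translation_efficiency(polysome_treatment, total_treatment)
    return te_treatment / te_control
```

For example, a gene with 300 polysome and 100 total counts under treatment, against 150 polysome and 100 total counts in the control, has a TE fold change of 2 and would meet the fold-change criterion if its adjusted p-value were also below 0.05.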

RNA BS-seq data analysis

Mouse transcriptome and annotation files are downloaded from the Ensembl database. Raw reads

are trimmed of adapter sequences, the first 6 bases at the 5’ end, and low quality bases (Q < 30) using

Trim Galore (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). The processed

reads with lengths greater than 30 nt are defined as clean reads and mapped to the mouse

transcriptome (GRCm38) using “meRanT align” from meRanTK (version 1.2.1) with stringent

parameters: -fmo -mmr 0.01 (49). Analysis of the Xef spike-in controls reveals bisulfite

conversion rates of > 99%. Unambiguously aligned reads are used to call candidate m5Cs by

meRanCall from meRanTK with the parameters: -md 1 -ei 0.1 -fdr 0.01. Only cytosine positions

with coverage depth ≥ 20, methylation level ≥ 0.1 and methylated cytosine depth ≥ 3 are considered

as candidate 5-mrC sites. As the reduced complexity of bisulfite-converted RNA could cause

incorrect read alignment (50), we exclude 5-mrC sites located on transcripts that are not expressed

in the corresponding RNA-seq datasets (TPM = 0). Each group contains two biological replicates

and the m5C sites shared by the two replicates are used for downstream analysis. The

coordinates of these sites are converted to genome coordinates using the R package ensembldb (51).
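The site-level filters described above can be sketched as follows. This is an illustration of the thresholds only, applied to already-called cytosines; the `CytosineCall` record and the function names are hypothetical and do not correspond to meRanTK output fields.

```python
from typing import NamedTuple

class CytosineCall(NamedTuple):
    transcript: str   # transcript identifier
    position: int     # position of the cytosine on the transcript
    coverage: int     # reads covering the cytosine
    methylated: int   # reads retaining C after bisulfite conversion

def is_candidate(call, min_coverage=20, min_level=0.1, min_methylated=3):
    """Coverage >= 20, methylation level >= 0.1, methylated depth >= 3."""
    if call.coverage < min_coverage or call.methylated < min_methylated:
        return False
    return call.methylated / call.coverage >= min_level

def credible_sites(calls, tpm_by_transcript):
    """Candidate sites on transcripts with TPM > 0 in the matched RNA-seq data."""
    return {
        (call.transcript, call.position)
        for call in calls
        if is_candidate(call) and tpm_by_transcript.get(call.transcript, 0.0) > 0
    }

def shared_between_replicates(rep1_sites, rep2_sites):
    """Sites retained for downstream analysis: present in both replicates."""
    return rep1_sites & rep2_sites
```

Keeping only the intersection of the two replicates is the most conservative of the steps here, since a site missing in either replicate for lack of coverage is dropped as well.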

Distribution of 5-mrC sites

The 5-mrC sites are annotated with the GTF file downloaded from Ensembl. The 5-mrC

sites are assigned to four regions: 5’ UTR, 3’ UTR, CDS and noncoding RNA. According to the

average lengths of 5’ UTR, 3’ UTR and CDS in the whole transcriptome, we divide the three

segments into 5, 18 and 22 bins, respectively. The number of 5-mrC sites in each bin is counted

and the percentage is calculated to plot the distribution of 5-mrC sites along the mRNA transcripts.
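The binning step can be sketched as below. The bin counts follow the text read literally (5’ UTR: 5, 3’ UTR: 18, CDS: 22); the segment labels, the 0-based offsets and the 5’ UTR, CDS, 3’ UTR plotting order are assumptions made for this illustration.

```python
# Bins per segment, as stated in the text; keys are hypothetical labels.
BINS = {"5UTR": 5, "CDS": 22, "3UTR": 18}
# Order of segments along the transcript (5' to 3').
ORDER = ["5UTR", "CDS", "3UTR"]

def bin_index(segment, offset, segment_length):
    """Global bin (0-based) of a site at a 0-based offset within its segment."""
    if not 0 <= offset < segment_length:
        raise ValueError("offset must fall inside the segment")
    local_bin = offset * BINS[segment] // segment_length
    bins_before = sum(BINS[name] for name in ORDER[:ORDER.index(segment)])
    return bins_before + local_bin

def site_distribution(sites):
    """Percentage of sites per bin.

    sites: iterable of (segment, offset, segment_length) tuples.
    """
    counts = [0] * sum(BINS.values())
    for segment, offset, segment_length in sites:
        counts[bin_index(segment, offset, segment_length)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [100.0 * count / total for count in counts]
```

Scaling each site by the length of its own segment places transcripts of different lengths on a common axis of 45 bins.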

Differential 5-mrC methylation analysis

Sites used for differential methylation analysis must meet the following two criteria:

(1) coverage depth ≥ 20 in all four libraries used for the comparison, and (2) identification as a candidate 5-mrC site

in at least one condition. Fisher’s exact test is used to evaluate the significance of differential

methylation and the false discovery rate (FDR) method is applied to correct for multiple comparisons.

Sites with an adjusted p-value < 0.05 are considered differentially methylated sites (DMS).
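The test and correction described above can be sketched with the standard library alone. Two points are assumptions rather than statements from the text: read counts from the two replicates of each condition are taken to be pooled into one 2x2 table per site, and the FDR method is taken to be the Benjamini-Hochberg procedure.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Rows: the two conditions; columns: methylated and unmethylated reads.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def probability(x):
        # Hypergeometric probability of x methylated reads in condition 1.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at least as extreme as the observed one.
    p_value = sum(
        probability(x) for x in range(lowest, highest + 1)
        if probability(x) <= observed * (1 + 1e-7)
    )
    return min(1.0, p_value)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

def differentially_methylated(tables, alpha=0.05):
    """tables: list of (meth_1, unmeth_1, meth_2, unmeth_2); returns booleans."""
    p_values = [fisher_exact_two_sided(*table) for table in tables]
    return [q < alpha for q in benjamini_hochberg(p_values)]
```

Exact integer binomials keep the test accurate at the read depths typical of a single site, at the cost of speed; a vectorized statistics library would be the usual choice for thousands of sites.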

GO analysis

GO analysis is performed with the R package clusterProfiler (52). Default parameters are

used for the enrichment analysis for biological process (BP), cellular component (CC), and

molecular function (MF). The ten most significant BP terms are shown.

Immunostaining

Immunostaining is performed as previously described (53). Briefly, adult mouse neural stem

cells are seeded on an 8-well chamber overnight. The NSCs are fixed with 4% paraformaldehyde in

PBS at RT for 15 min. After three washes with PBS, NSCs are permeabilized with 0.2%

Triton X-100 in PBS at RT for 10 min. The cells are then blocked with 5% Normal Goat Serum

(ThermoFisher) at RT for 1 h, and incubated with mouse anti-Nestin antibody (Millipore,

MAB353) and rabbit anti-Sox2 antibody (Abcam, ab97959) at 4 °C overnight. After three

washes with 1×PBS, the cells are incubated with Cy3 conjugated anti-rabbit IgG (A10520,

Invitrogen) and Alexa Fluor 488 conjugated anti-mouse IgG (A10680, Invitrogen) secondary

antibodies at RT in darkness for 1 h. After three washes with 1×PBS, cells are mounted with

DAPI-Fluoromount-G™ Clear Mounting Media (SouthernBiotech, 010020) and the fluorescent

images are captured using a confocal microscope.


3.4 Results

3.4.1 Distribution profile of 5-mrC in total mRNAs in adult mouse neural stem cells

Adult mouse neural stem cells (NSCs) are isolated from the subventricular zone (SVZ) of

adult mice and maintained as previously described (41). Immunostaining shows that the cultured cells are positive

for the two NSC markers, Sox2 and Nestin (Figure 1), indicating a homogeneous

NSC population. We follow a procedure described in previous reports (23-25) and culture NSCs

in three different folate concentrations: 1.5 µM folic acid as low folate (LF), 10 µM folic acid as

medium folate (MF, folate level commonly supplied in cell culture media), and 80 µM folic acid

as high folate (HF). After four days in culture, with fresh medium changed at day 2, NSCs are

harvested for total RNA extraction with DNase digestion to remove any residual DNA

contamination. Total mRNA molecules are enriched by two rounds of oligo(dT) beads selection.

We perform both RNA-seq for gene expression and RNA bisulfite sequencing (RNA BS-seq) for

transcriptome-wide mapping of 5-mrC modification in total mRNA samples that are derived from

NSCs treated with low, medium and high concentrations of folic acid. With two biological

replicates for each condition, a total of 6 RNA-seq libraries and 6 RNA BS-seq libraries are

constructed and sequenced on the Illumina HiSeq platform in 150 bp paired-end mode. We obtain an

average of 26 million raw read pairs with around 21 million read pairs uniquely mapped to the

reference for total poly(A) RNA-seq datasets (Table 1). We also obtain an average of 100 million

raw read pairs with around 55 million read pairs uniquely mapped to the reference transcriptome for

total poly(A) RNA BS-seq datasets (Table 2).

To assess the overall bisulfite conversion efficiency, we include in vitro transcribed

Xenopus elongation factor 1α mRNA as a control, which shares approximately 85% sequence

identity to its mouse homologue. Based on this spiked-in unmethylated mRNA control, the

bisulfite conversion rates for all six BS libraries are determined to be above 99.9% (Table 2). To

ensure reliable methylation calling, we consider sites with a coverage depth ≥ 20, methylation

level ≥ 0.1 and number of methylated cytosines ≥ 3 as candidate 5-mrC sites. Previous research has

shown that the reduced sequence complexity of bisulfite-converted RNAs could cause incorrect

read alignment (50). As such, we further filter out the candidate 5-mrC sites located on mRNAs that

are not expressed (TPM = 0). The remaining 5-mrC sites are considered credible 5-mrC sites.

Only the 5-mrC sites shared between two biological replicates are used for downstream

analysis.


Figure 3-1 Characterization of adult mouse neural stem cell (NSC) culture

Adult mouse NSCs cultured under proliferating conditions are double stained with the neural

progenitor markers Nestin (cytoplasmic, green) and Sox2 (nuclear, red; DAPI in blue). Scale bar:

50 µm.

[Figure 3-1 image panels: Nestin, DAPI, Sox2, Merge]

Table 3-1 Mapping statistics of total and polysome poly(A) RNA-seq data

RNA-seq datasets (FA)
Sample          # of raw read pairs   # of clean read pairs   # of mapped read pairs   Mapping rate
LF mRNA rep1    26,781,701            25,757,779              21,718,689               84.32%
LF mRNA rep2    28,234,115            27,091,134              23,593,252               87.09%
MF mRNA rep1    24,903,472            23,922,138              20,506,797               85.72%
MF mRNA rep2    26,306,216            25,148,954              21,339,365               84.85%
HF mRNA rep1    28,714,364            27,544,469              23,089,647               83.83%
HF mRNA rep2    25,824,529            24,595,403              20,701,742               84.17%
LF pRNA rep1    17,321,485            16,564,034              15,152,827               91.48%
LF pRNA rep2    15,831,347            15,230,889              14,017,758               92.04%
MF pRNA rep1    18,755,538            18,005,002              16,197,355               89.96%
MF pRNA rep2    20,706,654            19,865,578              17,815,546               89.68%
HF pRNA rep1    18,238,833            17,504,146              16,187,385               92.48%
HF pRNA rep2    18,014,123            17,263,948              16,160,007               93.61%

Table 3-2 Mapping statistics of total and polysome poly(A) RNA BS-seq data

RNA BS-seq datasets (FA)
Sample             # of raw read pairs   # of clean read pairs   # of mapped read pairs   Mapping rate   Bisulfite conversion rate
LF mRNA_BS rep1    77,642,938            71,099,322              50,784,214               71.43%         0.9994
LF mRNA_BS rep2    91,976,960            86,729,807              57,488,191               66.28%         0.9994
MF mRNA_BS rep1    72,018,791            66,361,290              49,604,196               74.75%         0.9994
MF mRNA_BS rep2    77,644,048            72,915,734              51,864,274               71.13%         0.9994
HF mRNA_BS rep1    153,352,803           120,820,824             67,063,229               55.51%         0.9994
HF mRNA_BS rep2    124,057,769           90,325,039              55,291,666               61.21%         0.9995
LF pRNA_BS rep1    135,957,235           101,147,250             34,690,802               34.30%         0.9994
LF pRNA_BS rep2    190,267,641           137,991,422             52,362,410               37.95%         0.9994
MF pRNA_BS rep1    121,340,008           104,067,601             35,324,997               33.94%         0.9994
MF pRNA_BS rep2    134,987,984           108,062,216             32,469,104               30.05%         0.9994
HF pRNA_BS rep1    138,865,399           115,410,482             34,319,240               29.74%         0.9994
HF pRNA_BS rep2    134,315,655           127,897,752             68,248,452               53.36%         0.9994

Our results show good reproducibility between the two biological replicates. About 34.8%

to 66.1% of 5-mrC sites identified in one biological replicate are found to be methylated in the

other biological replicate as well (Supplementary Figure 1a-c). In addition, the Pearson’s

correlation for the methylation level of 5-mrC sites shared between two replicates is in the

range from 0.90 to 0.97 (Supplementary Figure 1d-f). A total of 1,706-1,777 5-mrC sites within

128-159 mRNA molecules are identified in NSCs cultured with different concentrations of folic

acid. The majority (98.7%-99.5%) of the 5-mrC sites are found to occur within mRNAs (Figure

2f-h). The remaining 5-mrC sites are mapped to noncoding RNAs (Figure 2f-h). Similar to previous

reports (38, 54), the median methylation level of 5-mrC sites is ~25% among the three groups

(27.6% in LF, 22.5% in MF, and 24.6% in HF) (Figure 2d). Roughly half or more of the 5-mrC sites (48.3% in LF, 60.0% in

MF, and 51.9% in HF) have methylation levels below 30%, and only 11.0%-27.8% of 5-mrC sites show

methylation levels above 50% (Figure 2a-c). In addition, the sequence frequency logo shows that

5-mrC sites are embedded in a C-C/U-rich sequence context (Figure 2e). This is similar to the

sequence context reported in zebrafish early embryos (55), but slightly different from a previous

finding in HeLa cells that 5-mrC sites are embedded in a CG-rich environment (38). The distribution

profile of 5-mrC sites in mRNAs shows an enrichment of 5-mrC modification in the coding

sequences (CDS) and 3’UTR (Figure 2f-h). The density plot shows a mild peak of 5-mrC sites

immediately downstream of translation initiation sites and significant peaks at 3’UTR (Figure 2i-

k), indicating a potential role of 5-mrC in the posttranscriptional regulation of RNA metabolisms.
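The two reproducibility measures reported above (the percentage of one replicate's sites recovered in the other, and the Pearson correlation of methylation levels at shared sites) can be computed as in the following sketch. The function names are hypothetical; the site counts are those printed in Supplementary Figure 1a, and the methylation levels at the end are made-up values for illustration.

```python
import math
from typing import Sequence


def overlap_percent(shared: int, replicate_specific: int) -> float:
    """Percentage of one replicate's sites that are also called in the other replicate."""
    return 100.0 * shared / (shared + replicate_specific)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Counts from Supplementary Figure 1a: 1,777 shared sites, with 912 and
# 1,053 sites specific to the individual replicates.
print(round(overlap_percent(1777, 912), 2))   # 66.08
print(round(overlap_percent(1777, 1053), 2))  # 62.79

# Hypothetical methylation levels at shared sites in two replicates.
rep1_levels = [0.12, 0.25, 0.40, 0.81]
rep2_levels = [0.10, 0.28, 0.37, 0.85]
print(round(pearson_r(rep1_levels, rep2_levels), 3))
```

The two percentages reproduce the 66.08% and 62.79% values shown in the Venn diagram, which confirms that each percentage is the shared count divided by the total number of sites called in one replicate.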

To investigate the potential role of 5-mrC modification in NSCs in response to folate deficiency and supplementation, we perform GO annotation for 5-mrC-containing mRNAs in the three conditions. The 5-mrC-containing mRNAs in NSCs cultured with all three concentrations of folic acid (LF, MF, HF) are consistently enriched in mitochondrial functions, such as ATP synthesis coupled electron transport and respiratory electron transport chain. This indicates a potential role of 5-mrC modification in mitochondrial RNAs. Interestingly, 5-mrC-containing mRNAs in NSCs treated with LF and HF are both enriched in purine metabolic process, indicating the importance of the methyl-donor folate in nucleic acid metabolism. Moreover, 5-mrC-containing mRNAs in NSCs are involved in important neural functions, such as regulation of cell growth, axonogenesis, dendrite development, and neuron projection extension, suggesting a critical role of 5-mrC modification in neurons.
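GO enrichment of a gene list is commonly assessed with a one-sided hypergeometric (over-representation) test. The text does not state which tool or test was used for the GO annotation here, so the following is only a sketch of that general technique, with a hypothetical function name and made-up gene counts.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, selected: int, hits: int) -> float:
    """Upper-tail hypergeometric p-value, P(X >= hits).

    population: number of background genes
    annotated:  background genes annotated to the GO term
    selected:   genes in the list of interest (e.g. 5-mrC-containing mRNAs)
    hits:       genes in the list that are annotated to the GO term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 background genes, 150 annotated to a term,
# 159 genes in the list, 12 of which carry the annotation.
p = enrichment_p_value(population=20000, annotated=150, selected=159, hits=12)
print(f"p = {p:.3e}")
```

In practice the p-values of all tested GO terms would also be adjusted for multiple testing, which this sketch does not cover.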


Figure 3-2 Distribution profile of 5-mrC modification in adult mouse NSCs

(a, b, c) Histogram showing the distribution of 5-mrC methylation levels in total mRNAs in NSCs. (d) Boxplot showing the methylation levels of 5-mrC sites in total mRNAs in NSCs. (e) Sequence frequency logo for the sequence context proximal to 5-mrC sites in total mRNAs in NSCs. (f, g, h) Pie chart showing the percentage of 5-mrC sites in mRNA (5’UTR, CDS, 3’UTR) and noncoding RNA. (i, j, k) Density plot showing the distribution of 5-mrC sites along mRNA transcripts (5’UTR, CDS, 3’UTR). The moving average of the percentages of mRNA 5-mrC sites is shown.

[Figure panels not reproduced. Axes: (a-c) methylation level (0.1-1.0) at 1.5 µM, 10 µM, and 80 µM FA versus percentage of sites (%); (d) folic acid concentration (1.5 µM, 10 µM, 80 µM) versus methylation level (%); (e) probability at positions -10 to 10 for 1.5 µM, 10 µM, and 80 µM. Pie chart values: (f) noncoding RNA 0.68%, 5’UTR 2.81%, CDS 75.91%, 3’UTR 20.60%; (g) noncoding RNA 0.47%, 5’UTR 4.46%, CDS 75.15%, 3’UTR 19.92%; (h) noncoding RNA 1.31%, 5’UTR 1.2%, CDS 75.17%, 3’UTR 22.32%.]

3.4.2 Folate induces changes in total mRNA methylation in adult mouse neural stem cells

To investigate the influence of folate deficiency and supplementation on 5-mrC modification in NSCs, we perform differential methylation analysis with two comparisons: LF vs MF and HF vs MF. The medium level of folate represents the physiological level and thus serves as the control, while the low and high levels of folate represent folate deficiency and supplementation, respectively. We first check the overlap of the 5-mrC sites identified in the three groups with a Venn diagram (Figure 3a). Collectively, we identify a total of 3,019 methylated cytosine sites in NSCs cultured in the three levels of folate, with only 721 sites shared by all three conditions (Figure 3a). This suggests an effect of folate on mRNA methylation in NSCs. To further identify the transcriptome-wide influence of folic acid on 5-mrC modification in NSCs, we implement customized Perl code with Fisher’s exact test for differential methylation analysis. We identify 168 differentially methylated sites (DMS) within 17 mRNAs between the LF and MF conditions and 1,770 DMS within 129 mRNAs between the MF and HF conditions. Figure 3b shows the methylation profile of these DMS in the three groups. GO annotation of the mRNAs containing DMS in both comparisons shows significant enrichment in mitochondrial functions and purine metabolic process. In addition, the mRNAs harboring DMS in the comparison of MF and HF are enriched in a number of biological processes, such as the regulation of cell growth, the regulation of cell size, and positive regulation of neuron differentiation, indicating a critical role of folate supplementation in neural stem cell self-renewal and differentiation.
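The per-site comparison with Fisher’s exact test can be sketched as below. This is not the customized Perl code mentioned above; it is a minimal Python illustration that assumes each site is summarized as methylated (C) and unmethylated (T) read counts in the two conditions. The function name and the example counts are hypothetical, and the significance thresholds and any multiple-testing correction are not specified here.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are conditions; columns are methylated and unmethylated read counts.
    The p-value sums the probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Hypothetical site: 30 of 100 reads methylated in LF, 10 of 100 in MF.
p_value = fisher_exact_two_sided(30, 70, 10, 90)
print(f"p = {p_value:.2e}")
```

Applied transcriptome-wide, the function would be called once per site for each comparison (LF vs MF and HF vs MF).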


Figure 3-3 Folate induces RNA methylation changes in total mRNAs in adult mouse NSCs

(a) Venn diagram showing the overlap of 5-mrC sites within total mRNAs among the three groups. (b) Heatmap showing the methylation profile of the union of differentially methylated 5-mrC sites (1.5 µM vs 10 µM and 80 µM vs 10 µM) in the three groups. (c, d) Gene ontology analysis of mRNAs with differentially methylated 5-mrC sites (c: 1.5 µM vs 10 µM, d: 80 µM vs 10 µM).

[Figure panels not reproduced. (a) Venn diagram region counts for 1.5 µM, 10 µM, and 80 µM: 666, 276, 43, 65, 325, 923, and 721 shared by all three groups. (b) Heatmap columns: 1.5 µM, 10 µM, 80 µM. (c, d) x axis: -Log10(adjusted p-value); legend: Count.
GO terms in (c), 1.5 µM vs 10 µM: ATP synthesis coupled electron transport; respiratory electron transport chain; electron transport chain; oxidative phosphorylation; cellular respiration; purine ribonucleoside monophosphate metabolic process; purine ribonucleoside triphosphate metabolic process; purine nucleoside monophosphate metabolic process; energy derivation by oxidation of organic compounds; ATP metabolic process.
GO terms in (d), 80 µM vs 10 µM: ATP synthesis coupled electron transport; ribonucleoside triphosphate metabolic process; respiratory electron transport chain; electron transport chain; nucleoside triphosphate metabolic process; mitochondrial ATP synthesis coupled electron transport; ATP metabolic process; oxidative phosphorylation; energy derivation by oxidation of organic compounds; purine ribonucleoside monophosphate metabolic process.]

3.4.3 Distribution profile of 5-mrC in polysome mRNAs in adult mouse neural stem cells

To investigate the influence of folate on translation and provide direct evidence of the methylation status of actively translating mRNAs, we perform polysome profiling as well as RNA bisulfite sequencing of polysome-associated mRNAs. Briefly, we culture NSCs in three concentrations of folic acid, 1.5 µM (LF), 10 µM (MF) and 80 µM (HF), for 4 days. The NSC cultures are treated with cycloheximide (CHX) to stabilize ribosomes on poly(A) RNAs, and then lysed with hypotonic lysis buffer. The cell lysates are separated by sucrose gradient ultracentrifugation and fractionated with a Gradient Station (BioComp). The polysome fractions with more than 3 ribosomes are pooled for RNA extraction, representing mRNAs with medium to high translational activity. The polysome RNA samples are digested with DNase and then subjected to two rounds of oligo(dT) bead selection to enrich poly(A)-containing mRNAs. The polysome mRNA samples are used for RNA-seq and RNA BS-seq library construction. With two biological replicates for each condition, we construct 6 polysome poly(A) RNA-seq libraries and 6 polysome poly(A) RNA BS-seq libraries for high-throughput sequencing. We obtain an average of 18 million raw read pairs for the polysome poly(A) RNA-seq libraries, with around 16 million read pairs uniquely mapped to the reference (Table 1). We also obtain an average of 140 million raw read pairs for the polysome poly(A) RNA BS-seq libraries, with around 43 million read pairs uniquely mapped to the reference transcriptome (Table 2).

Similar to the total poly(A) RNA BS-seq datasets, the spiked-in unmethylated Xef mRNA control shows a very high bisulfite conversion rate (0.9994) in all six polysome poly(A) RNA BS-seq libraries (Table 2). To obtain consistent and comparable results for 5-mrC calling on polysome-associated mRNAs, we use the same parameters for trimming, alignment and methylation calling for the polysome poly(A) RNA BS-seq datasets, and apply the same criteria to filter out potentially false-positive 5-mrC sites. Only the 5-mrC sites shared between the two biological replicates are used for downstream analysis.
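Because the spike-in control is unmethylated, every cytosine in it should be read as thymine after bisulfite treatment, so the conversion rate is the fraction of spike-in cytosine positions read as T. The sketch below shows this calculation; the function name and the read counts are hypothetical and are chosen only so that the result matches the 0.9994 rate reported above.

```python
def bisulfite_conversion_rate(converted: int, unconverted: int) -> float:
    """Fraction of cytosines in the unmethylated spike-in that were converted.

    converted:   reads showing T at reference C positions of the spike-in
    unconverted: reads still showing C at those positions
    """
    total = converted + unconverted
    if total == 0:
        raise ValueError("no coverage on spike-in cytosines")
    return converted / total


# Hypothetical counts: 99,940 converted and 60 unconverted cytosine calls.
rate = bisulfite_conversion_rate(99_940, 60)
print(round(rate, 4))  # 0.9994
```

The complement of this rate (here 0.0006) is the expected background of unconverted cytosines that a methylation caller has to distinguish from true 5-mrC sites.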

The reproducibility is high between the two biological replicates (Supplementary Figure 4). We obtain 2,253-4,207 credible 5-mrC sites within 236-283 RNA molecules in the polysome poly(A) RNA BS-seq datasets. Similar to total poly(A) RNAs, most of the 5-mrC sites identified from polysome-associated poly(A) RNAs are located on mRNA molecules (LF: 95.3%; MF: 96.7%; HF: 96.6%) (Figure 4f-h). The median methylation level is around 25% (24.4% in LF, 23.6% in MF, and 25.9% in HF) (Figure 4d). The sequence frequency logo shows a C-C/U-rich sequence context (Figure 4e). The distribution of 5-mrC modification shows enrichment at the 5’UTR and 3’UTR, with a small peak downstream of translation initiation sites (Figure 4i-k), indicating a potential regulatory role of 5-mrC modification in mRNA translation.


Figure 3-4 Distribution profile of 5-mrC in polysome mRNAs in adult mouse NSCs

(a, b, c) Histogram showing the distribution of 5-mrC methylation levels in polysome mRNAs in NSCs. (d) Boxplot showing the methylation levels of 5-mrC sites in polysome mRNAs in NSCs. (e) Sequence frequency logo for the sequence context proximal to 5-mrC sites in polysome mRNAs in NSCs. (f, g, h) Pie chart showing the percentage of 5-mrC sites in mRNA (5’UTR, CDS, 3’UTR) and noncoding RNA. (i, j, k) Density plot showing the distribution of 5-mrC sites along mRNA transcripts (5’UTR, CDS, 3’UTR). The moving average of the percentages of mRNA 5-mrC sites is shown.

[Figure panels not reproduced. Axes: (a-c) methylation level (0.1-1.0) at p1.5 µM, p10 µM, and p80 µM FA versus percentage of sites (%); (d) folic acid concentration (p1.5 µM, p10 µM, p80 µM) versus methylation level (%); (e) probability at positions -10 to 10 for p1.5 µM, p10 µM, and p80 µM. Pie chart values: p1.5 µM: noncoding RNA 4.75%, 5’UTR 7.42%, CDS 23.96%, 3’UTR 63.87%; p10 µM: noncoding RNA 3.33%, 5’UTR 8.92%, CDS 28.09%, 3’UTR 59.66%; p80 µM: noncoding RNA 3.37%, 5’UTR 11.19%, CDS 26.10%, 3’UTR 59.34%.]

3.4.4 Folate induces changes in polysome mRNA methylation in adult mouse neural stem cells

To identify the influence of folate on 5-mrC modification of polysome-associated mRNAs, we apply the same procedures for differential methylation analysis. The Venn diagram shows a total of 5,342 5-mrC sites, of which 1,303 are shared by the three conditions (Figure 5a). We next perform Fisher’s exact test to identify differentially methylated 5-mrC sites. As a result, we identify 465 DMS within 43 mRNAs between LF and MF, and 905 DMS within 86 mRNAs between HF and MF. Figure 5b shows the methylation profile of these DMS in the three groups. GO annotation is performed to identify the potential biological functions associated with DMS-containing mRNAs. We are not able to identify any enrichment for DMS-containing mRNAs in the comparison of MF and HF. However, DMS-containing mRNAs in the comparison of LF and MF show significant enrichment in several biological processes, such as cellular response to starvation and cellular response to nutrient levels (Figure 5c), indicating a critical role of the methyl donor folate as a nutrient supplement.


Figure 3-5 Folate induces RNA methylation changes in polysome mRNAs in adult mouse NSCs

(a) Venn diagram showing the overlap of 5-mrC sites within polysome mRNAs among the three groups. (b) Heatmap showing the methylation profile of the union of differentially methylated 5-mrC sites (p1.5 µM vs p10 µM and p80 µM vs p10 µM) in the three groups. (c) Gene ontology analysis of mRNAs with differentially methylated 5-mrC sites (p1.5 µM vs p10 µM).

[Figure panels not reproduced. (a) Venn diagram region counts: 649, 643, 164, 458, 1,797, 328, and 1,303 shared by all three groups. (b) Heatmap columns: p1.5 µM, p10 µM, p80 µM. (c) x axis: -Log10(adjusted p-value); legend: Count.
GO terms in (c), p1.5 µM vs p10 µM: cellular response to starvation; response to starvation; cellular response to nutrient levels; cellular response to extracellular stimulus; cellular response to external stimulus; mitochondrial electron transport, cytochrome c to oxygen; aerobic electron transport chain; response to nutrient levels; mitotic G2 DNA damage checkpoint; growth plate cartilage chondrocyte differentiation.]

3.4.5 Distinct 5-mrC profile in total and polysome mRNA in adult mouse neural stem cells

We further compare the RNA methylomes between total mRNAs and polysome mRNAs. To ensure comparable coverage between the two sets of methylomes, sites included in each comparison must meet two requirements: 1) called as a 5-mrC site in either the total mRNA methylome or the polysome mRNA methylome; 2) coverage depth ≥ 20 in all four replicates in the comparison. To our surprise, our results show consistent hypermethylation in polysome mRNAs relative to total mRNAs (1.5 µM vs p1.5 µM, 10 µM vs p10 µM, 80 µM vs p80 µM) (Figure 6a-c). Among the sites used for DMS analysis, there are many more 5-mrC sites in polysome mRNAs (Figure 6d-f). Fisher’s exact test is performed to identify differentially methylated sites between total mRNAs and polysome mRNAs. We identify 3,238 DMS within 192 mRNAs between the 1.5 µM and p1.5 µM groups, 2,173 DMS within 154 mRNAs between the 10 µM and p10 µM groups, and 2,304 DMS within 197 mRNAs between the 80 µM and p80 µM groups. Unexpectedly, further GO annotation shows almost no enrichment, except for response to starvation in the mRNAs harboring DMS identified between the 10 µM and p10 µM groups. This suggests that hypermethylation in polysome-associated mRNAs could be a general feature of NSCs.
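The two inclusion requirements for the total versus polysome comparison can be expressed as a small filter. The sketch below is an illustration under assumed data structures (sets of called sites and one coverage dictionary per replicate); the names and toy values are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Site = Tuple[str, int]  # hypothetical representation: (transcript ID, position)


def comparable_sites(
    called_total: Set[Site],
    called_polysome: Set[Site],
    depth_per_replicate: List[Dict[Site, int]],
    min_depth: int = 20,
) -> Set[Site]:
    """Sites called in either methylome and covered >= min_depth in every replicate."""
    candidates = called_total | called_polysome
    return {
        site
        for site in candidates
        if all(depth.get(site, 0) >= min_depth for depth in depth_per_replicate)
    }


# Toy example with four replicates (two total, two polysome).
total_calls = {("tx1", 10), ("tx2", 40)}
polysome_calls = {("tx2", 40), ("tx3", 7)}
depths = [
    {("tx1", 10): 35, ("tx2", 40): 50, ("tx3", 7): 22},  # total rep1
    {("tx1", 10): 28, ("tx2", 40): 61, ("tx3", 7): 19},  # total rep2
    {("tx1", 10): 40, ("tx2", 40): 33, ("tx3", 7): 45},  # polysome rep1
    {("tx1", 10): 21, ("tx2", 40): 47, ("tx3", 7): 38},  # polysome rep2
]
selected = comparable_sites(total_calls, polysome_calls, depths)
print(sorted(selected))
```

In this toy example the site on `tx3` is excluded because one replicate covers it with only 19 reads, even though it is called as a 5-mrC site in the polysome methylome.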


Figure 3-6 Distinct methylation profiles of 5-mrC modification in total and polysome mRNAs in NSCs

(a, b, c) Boxplot showing the methylation level between total mRNAs and polysome mRNAs (1.5 µM vs p1.5 µM, 10 µM vs p10 µM and 80 µM vs p80 µM). (d, e, f) Venn diagram showing the overlap of 5-mrC sites between total mRNAs and polysome mRNAs (1.5 µM vs p1.5 µM, 10 µM vs p10 µM and 80 µM vs p80 µM). (g, h, i) Heatmap showing the methylation profile of differentially methylated 5-mrC sites between total mRNAs and polysome mRNAs (1.5 µM vs p1.5 µM, 10 µM vs p10 µM and 80 µM vs p80 µM).

[Figure panels not reproduced. (a-c) y axis: methylation level (0.0-0.8); x axis: 1.5 µM and p1.5 µM, 10 µM and p10 µM, 80 µM and p80 µM. (g-i) Heatmap columns: 1.5 µM and p1.5 µM, 10 µM and p10 µM, 80 µM and p80 µM.]

3.4.6 Folate induces changes in mRNA translation in adult mouse neural stem cells

We further perform polysome profiling analysis to investigate the transcriptome-wide influence of folate on mRNA abundance and translation. Differential expression analysis shows that no genes exhibit significant changes in mRNA level under folic acid deficiency or supplementation conditions. However, we identify 10 genes with increased translation efficiency and 93 genes with decreased translation efficiency in NSCs with folate deficiency (Figure 7a), and 250 genes with increased translation efficiency and 143 genes with decreased translation efficiency in NSCs with folate supplementation (Figure 7b). We further conduct GO annotation to determine the potential roles of these differentially translated genes. The differentially translated genes between LF and MF are enriched in cytoplasmic translation, cellular response to zinc ion, and protein localization to mitochondrion, whereas the differentially translated genes between HF and MF are enriched in the regulation of cell-substrate adhesion, extracellular structure organization, and glial and neuronal differentiation. The difference in functional enrichment suggests distinct influences of folate deficiency and supplementation on neural cell metabolism.
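Translation efficiency (TE) is commonly defined as the ratio of polysome-associated to total mRNA abundance for a gene, and a change in TE between conditions is reported as a log2 fold change. The text does not state how TE or its significance was computed in this study, so the sketch below only illustrates that common definition. The function names, the pseudocount, and the TPM values are assumptions.

```python
import math


def translation_efficiency(polysome_tpm: float, total_tpm: float, pseudocount: float = 1.0) -> float:
    """TE as the ratio of polysome-associated to total mRNA abundance.

    The pseudocount (an assumption of this sketch) avoids division by zero
    and dampens ratios for lowly expressed genes.
    """
    return (polysome_tpm + pseudocount) / (total_tpm + pseudocount)


def log2_fold_change_te(te_treated: float, te_control: float) -> float:
    """log2 fold change of TE between a treatment (LF or HF) and the control (MF)."""
    return math.log2(te_treated / te_control)


# Hypothetical gene: total mRNA level is unchanged, polysome association doubles.
te_hf = translation_efficiency(polysome_tpm=31.0, total_tpm=15.0)  # 32 / 16 = 2.0
te_mf = translation_efficiency(polysome_tpm=15.0, total_tpm=15.0)  # 16 / 16 = 1.0
print(log2_fold_change_te(te_hf, te_mf))  # 1.0
```

This toy case mirrors the pattern reported above: the mRNA level does not change between conditions, yet the share of the transcript found on polysomes does.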


Figure 3-7 Identification of differentially translated genes in NSCs with different concentration of folate

(a, b) Volcano plot showing fold change of translation efficiency (TE) (x axis) and associated adjusted p values (y axis) in NSCs with folate deficiency (a) and supplementation (b) compared to the control. Genes with statistically significant changes in TE are labelled according to the direction of the change: up (red) or down (green) regulation. (c, d) Gene ontology analysis of mRNAs with differential translation efficiency (c: 1.5 µM vs 10 µM, d: 80 µM vs 10 µM).

[Figure panels not reproduced. (a, b) x axis: log2FC (TE), -10 to 10, for 1.5 µM vs 10 µM and 80 µM vs 10 µM; y axis: -log10(adjusted p-value). (c, d) x axis: -Log10(adjusted p-value); legend: Count.
GO terms in (c), 1.5 µM vs 10 µM: cytoplasmic translation; response to zinc ion; protein targeting to mitochondrion; cellular response to zinc ion; establishment of protein localization to mitochondrial membrane; establishment of protein localization to mitochondrion; cellular response to copper ion; protein localization to mitochondrion; mitochondrial transport; cellular response to cadmium ion.
GO terms in (d), 80 µM vs 10 µM: cell-substrate adhesion; gliogenesis; positive regulation of neuron differentiation; positive regulation of neuron projection development; extracellular structure organization; positive regulation of cell projection organization; extracellular matrix organization; cell-matrix adhesion; regulation of cell-substrate adhesion; anatomical structure arrangement.]

3.5 Discussion

The beneficial effect of folate supplementation before and during pregnancy has been recognized for nearly 30 years (56). As such, folic acid fortification of enriched grain foods has been regular practice in the US since 1998 to prevent certain birth defects, including NTDs, at the population level (57). As a methyl donor, folate influences DNA methylation and gene expression (6, 58, 59). However, the influence of folate on RNA cytosine-5 methylation (5-mrC) remains unknown.

In this study, we aim to systematically investigate the transcriptome-wide impact of folate deficiency and supplementation on mRNA expression, methylation and translation. To achieve this aim, we first perform RNA-seq and RNA BS-seq for total poly(A) RNA samples from NSCs treated with three different concentrations of folic acid (LF, MF, and HF). Our study has not detected differentially expressed genes, but we have observed numerous differentially methylated 5-mrC sites. The DMS-containing genes are associated with mitochondrial functions and purine metabolic process.

We also profile polysome fractions by sucrose gradient ultracentrifugation and perform RNA-seq and RNA BS-seq for polysome-associated mRNAs. We identify a panel of genes with changed translation efficiency as well as a number of differentially methylated 5-mrC sites in NSCs treated with different concentrations of folic acid. The difference in the GO enrichment of differentially translated genes between LF vs MF and HF vs MF suggests distinct influences of folate deficiency and supplementation on neural cell metabolism.

RNA bisulfite sequencing of polysome-associated mRNAs has provided direct evidence of the methylation status of actively translating mRNAs. Surprisingly, our study shows consistently higher methylation in polysome mRNAs than in total mRNAs, indicating a critical role of 5-mrC modification in the regulation of mRNA translation.

In this study, we present transcriptome-wide profiles of 5-mrC modification in both total mRNAs and polysome mRNAs from NSCs cultured with different concentrations of folate. Our study indicates a potential link between mRNA methylation and mRNA translation. We need to further elucidate the molecular mechanism underlying the regulation of mRNA translation by 5-mrC modification. More studies are needed to investigate the effect of folate deficiency and supplementation in vivo, in mouse models and in human populations.


3.6 Supplementary data

The following are the supplementary figures for this project:

Supplementary Figure 1. Reproducibility of 5-mrC sites between replicates in total poly(A) RNA BS-seq datasets

(a, b, c) Venn diagram showing the overlap of 5-mrC sites between two biological replicates in total mRNAs in NSCs. (d, e, f) Scatter plot showing the Pearson correlation of common 5-mrC sites between two biological replicates in total mRNAs in NSCs.

[Figure panels not reproduced. (a-c) Venn diagram values for 1.5 µM, 10 µM, and 80 µM, respectively: 1,777 shared sites (912 and 1,053 replicate-specific sites; 66.08% and 62.79% overlap); 1,706 shared sites (874 and 976 replicate-specific sites; 66.12% and 63.61% overlap); 1,752 shared sites (3,278 and 1,943 replicate-specific sites; 34.83% and 47.42% overlap). (d-f) Axes: m5C level in rep1 versus rep2 (0.25-1.00); Pearson’s r = 0.9050 (1.5 µM), 0.9012 (10 µM), and 0.9699 (80 µM).]

Supplementary Figure 2. GO annotation of 5-mrC containing mRNAs in NSCs

(a, b, c) Gene ontology analysis of mRNAs with 5-mrC sites (1.5 µM, 10 µM, and 80 µM).

[Figure panels not reproduced. x axis: -Log10(adjusted p-value); legend: Count.
GO terms at 1.5 µM: positive regulation of neuron projection development; positive regulation of neuron differentiation; positive regulation of catabolic process; negative regulation of protein catabolic process; semaphorin-plexin signaling pathway; positive regulation of cell projection organization; regulation of protein catabolic process; dendrite development; axonogenesis; response to starvation.
GO terms at 10 µM: response to transforming growth factor beta; regulation of cell growth; positive regulation of neuron projection development; negative regulation of protein catabolic process; cellular response to transforming growth factor beta stimulus; regulation of protein catabolic process; transforming growth factor beta receptor signaling pathway; positive regulation of catabolic process; neuron projection extension; developmental growth involved in morphogenesis.
GO terms at 80 µM: negative regulation of protein catabolic process; regulation of protein catabolic process; negative regulation of cellular protein catabolic process; cellular response to peptide; signal transduction in response to DNA damage; response to angiotensin; positive regulation of catabolic process; negative regulation of catabolic process; negative regulation of proteasomal protein catabolic process; cellular response to angiotensin.]

Supplementary Figure 3. Schematic diagram of polysome fractionation

(a, b) Schematic representation of the experimental procedures of polysome fractionation: 1) A 10%-50% sucrose gradient is used to separate ribosome-free and ribosome-bound mRNAs by ultracentrifugation. 2) A representative polysome profile is recorded at 254 nm. The polysome fraction is indicated.

[Figure panels not reproduced. (a) Sucrose gradient, 10% to 50%. (b) Polysome profiling recording: x axis: fraction (1-15); y axis: OD 254 nm (0-0.3); labelled regions: free RNP, 40S, 60S, 80S (monosome), and polysome peaks with 2 to 6 ribosomes.]

Supplementary Figure 4. Reproducibility of 5-mrC sites between replicates in polysome poly(A) RNA BS-seq datasets

(a, b, c) Venn diagram showing the overlap of 5-mrC sites between two biological replicates in polysome mRNAs in NSCs. (d, e, f) Scatter plot showing the Pearson correlation of common 5-mrC sites between two biological replicates in polysome mRNAs in NSCs.

[Figure panels not reproduced. (a-c) Venn diagram values for p1.5 µM, p10 µM, and p80 µM, respectively: 4,207 shared sites (2,414 and 8,807 replicate-specific sites; 63.54% and 32.33% overlap); 2,759 shared sites (2,191 and 3,405 replicate-specific sites; 55.74% and 44.76% overlap); 2,253 shared sites (4,150 and 1,592 replicate-specific sites; 35.19% and 58.60% overlap). (d-f) Axes: m5C level in rep1 versus rep2 (0.25-1.00); Pearson’s r = 0.9488 (p1.5 µM), 0.9394 (p10 µM), and 0.8557 (p80 µM).]

Supplementary Figure 5. Reproducibility between replicates in total and polysome poly(A) RNA-seq datasets

(a, b, c) Scatter plot showing the Pearson correlation between two biological replicates in total poly(A) RNA-seq datasets in NSCs. (d, e, f) Scatter plot showing the Pearson correlation between two biological replicates in polysome poly(A) RNA-seq datasets in NSCs.

[Figure panels not reproduced. Axes: log2 TPM in rep1 versus rep2 (-5 to 15). Pearson’s r for total RNA-seq: 0.9983 (1.5 µM), 0.9990 (10 µM), 0.9988 (80 µM). Pearson’s r for polysome RNA-seq: 0.9573 (p1.5 µM), 0.9943 (p10 µM), 0.9880 (p80 µM).]

3.7 References

1. Williams PJ, Bulmer JN, Innes BA, Broughton Pipkin F. Possible roles for folic acid in the regulation of trophoblast invasion and placental development in normal early human pregnancy. Biol Reprod. 2011;84(6):1148-53.
2. Outinen PA, Sood SK, Pfeifer SI, Pamidi S, Podor TJ, Li J, et al. Homocysteine-induced endoplasmic reticulum stress and growth arrest leads to specific changes in gene expression in human vascular endothelial cells. Blood. 1999;94(3):959-67.

3. Doshi SN, McDowell IF, Moat SJ, Lang D, Newcombe RG, Kredan MB, et al. Folate improves endothelial function in coronary artery disease: an effect mediated by reduction of intracellular superoxide? Arterioscler Thromb Vasc Biol. 2001;21(7):1196-202.
4. Di Simone N, Riccardi P, Maggiano N, Piacentani A, D'Asta M, Capelli A, et al. Effect of folic acid on homocysteine-induced trophoblast apoptosis. Mol Hum Reprod. 2004;10(9):665-9.
5. Steegers-Theunissen RP, Smith SC, Steegers EA, Guilbert LJ, Baker PN. Folate affects apoptosis in human trophoblastic cells. BJOG. 2000;107(12):1513-5.
6. Crider KS, Yang TP, Berry RJ, Bailey LB. Folate and DNA methylation: a review of molecular mechanisms and the evidence for folate's role. Advances in nutrition (Bethesda, Md). 2012;3(1):21-38.
7. Greenberg JA, Bell SJ, Guan Y, Yu YH. Folic Acid supplementation and pregnancy: more than just neural tube defect prevention. Reviews in obstetrics & gynecology. 2011;4(2):52-9.
8. Ouyang F, Longnecker MP, Venners SA, Johnson S, Korrick S, Zhang J, et al. Preconception serum 1,1,1-trichloro-2,2,bis(p-chlorophenyl)ethane and B-vitamin status: independent and joint effects on women's reproductive outcomes. The American journal of clinical nutrition. 2014;100(6):1470-8.
9. De Wals P, Tairou F, Van Allen MI, Uh SH, Lowry RB, Sibbald B, et al. Reduction in neural-tube defects after folic acid fortification in Canada. N Engl J Med. 2007;357(2):135-42.
10. Suren P, Roth C, Bresnahan M, Haugen M, Hornig M, Hirtz D, et al. Association between maternal use of folic acid supplements and risk of autism spectrum disorders in children. Jama. 2013;309(6):570-7.
11. Huo Y, Li J, Qin X, Huang Y, Wang X, Gottesman RF, et al. Efficacy of folic acid therapy in primary prevention of stroke among adults with hypertension in China: the CSPPT randomized clinical trial. Jama. 2015;313(13):1325-35.
12. Beard CM, Panser LA, Katusic SK. Is excess folic acid supplementation a risk factor for autism? Med Hypotheses. 2011;77(1):15-7.
13. Raghavan R, Riley AW, Volk H, Caruso D, Hironaka L, Sices L, et al. Maternal Multivitamin Intake, Plasma Folate and Vitamin B12 Levels and Autism Spectrum Disorder Risk in Offspring. Paediatric and perinatal epidemiology. 2018;32(1):100-11.
14. Waterland RA, Travisano M, Tahiliani KG, Rached MT, Mirza S. Methyl donor supplementation prevents transgenerational amplification of obesity. International journal of obesity (2005). 2008;32(9):1373-9.
15. Achon M, Reyes L, Alonso-Aperte E, Ubeda N, Varela-Moreiras G. High dietary folate supplementation affects gestational development and dietary protein utilization in rats. J Nutr. 1999;129(6):1204-8.
16. Pickell L, Brown K, Li D, Wang XL, Deng L, Wu Q, et al. High intake of folic acid disrupts embryonic development in mice. Birth defects research Part A, Clinical and molecular teratology. 2011;91(1):8-19.
17. Mikael LG, Deng L, Paul L, Selhub J, Rozen R. Moderately high intake of folic acid has a negative impact on mouse embryonic development. Birth defects research Part A, Clinical and molecular teratology. 2013;97(1):47-52.
18. Bahous RH, Jadavji NM, Deng L, Cosin-Tomas M, Lu J, Malysheva O, et al. High dietary folate in pregnant mice leads to pseudo-MTHFR deficiency and altered methyl metabolism, with embryonic growth delay and short-term memory impairment in offspring. Hum Mol Genet. 2017;26(5):888-900.


19. Barua S, Kuizon S, Brown WT, Junaid MA. High Gestational Folic Acid Supplementation Alters Expression of Imprinted and Candidate Autism Susceptibility Genes in a sex-Specific Manner in Mouse Offspring. J Mol Neurosci. 2016;58(2):277-86.
20. Ichi S, Costa FF, Bischof JM, Nakazaki H, Shen YW, Boshnjaku V, et al. Folic acid remodels chromatin on Hes1 and Neurog2 promoters during caudal neural tube development. The Journal of biological chemistry. 2010;285(47):36922-32.
21. Fleming A, Copp AJ. Embryonic folate metabolism and mouse neural tube defects. Science (New York, NY). 1998;280(5372):2107-9.
22. Wlodarczyk BJ, Tang LS, Triplett A, Aleman F, Finnell RH. Spontaneous neural tube defects in splotch mice supplemented with selected micronutrients. Toxicology and applied pharmacology. 2006;213(1):55-63.
23. Liu H, Huang GW, Zhang XM, Ren DL, J XW. Folic Acid supplementation stimulates notch signaling and cell proliferation in embryonic neural stem cells. Journal of clinical biochemistry and nutrition. 2010;47(2):174-80.
24. Li W, Yu M, Luo S, Liu H, Gao Y, Wilson JX, et al. DNA methyltransferase mediates dose-dependent stimulation of neural stem cell proliferation by folate. The Journal of nutritional biochemistry. 2013;24(7):1295-301.
25. Yu M, Li W, Luo S, Zhang Y, Liu H, Gao Y, et al. Folic acid stimulation of neural stem cell proliferation is associated with altered methylation profile of PI3K/Akt/CREB. The Journal of nutritional biochemistry. 2014;25(4):496-502.
26. Zhang XM, Huang GW, Tian ZH, Ren DL, Wilson JX. Folate stimulates ERK1/2 phosphorylation and cell proliferation in fetal neural stem cells. Nutritional neuroscience. 2009;12(5):226-32.
27. Junttila MR, Li SP, Westermarck J. Phosphatase-mediated crosstalk between MAPK signaling pathways in the regulation of cell survival. FASEB journal : official publication of the Federation of American Societies for Experimental Biology. 2008;22(4):954-65.
28. Wang D, Chen YM, Ruan MH, Zhou AH, Qian Y, Chen C. Homocysteine inhibits neural stem cells survival by inducing DNA interstrand cross-links via oxidative stress. Neuroscience letters. 2016;635:24-32.
29. Yan H, Zhang X, Luo S, Liu H, Wang X, Gao Y, et al. Effects of homocysteine on ERK signaling and cell proliferation in fetal neural stem cells in vitro. Cell biochemistry and biophysics. 2013;66(1):131-7.
30. Cui X, Liang Z, Shen L, Zhang Q, Bao S, Geng Y, et al. 5-Methylcytosine RNA Methylation in Arabidopsis Thaliana. Molecular plant. 2017;10(11):1387-99.
31. Lin N, Qin S, Luo S, Cui S, Huang G, Zhang X. Homocysteine induces cytotoxicity and proliferation inhibition in neural stem cells via DNA methylation in vitro. The FEBS journal. 2014;281(8):2088-96.
32. Song J, Yi C. Chemical Modifications to RNA: A New Layer of Gene Expression Regulation. ACS chemical biology. 2017;12(2):316-25.
33. Schaefer M, Pollex T, Hanna K, Lyko F. RNA cytosine methylation analysis by bisulfite sequencing. Nucleic acids research. 2009;37(2):e12.
34. Squires JE, Patel HR, Nousch M, Sibbritt T, Humphreys DT, Parker BJ, et al. Widespread occurrence of 5-methylcytosine in human coding and non-coding RNA. Nucleic acids research. 2012;40(11):5023-33.


35. Tuorto F, Liebers R, Musch T, Schaefer M, Hofmann S, Kellner S, et al. RNA cytosine methylation by Dnmt2 and NSun2 promotes tRNA stability and protein synthesis. Nature structural & molecular biology. 2012;19(9):900-5.
36. Schosserer M, Minois N, Angerer TB, Amring M, Dellago H, Harreither E, et al. Methylation of ribosomal RNA by NSUN5 is a conserved mechanism modulating organismal lifespan. Nature communications. 2015;6:6158.
37. Metodiev MD, Spahr H, Loguercio Polosa P, Meharg C, Becker C, Altmueller J, et al. NSUN4 is a dual function mitochondrial protein required for both methylation of 12S rRNA and coordination of mitoribosomal assembly. PLoS genetics. 2014;10(2):e1004110.
38. Yang X, Yang Y, Sun BF, Chen YS, Xu JW, Lai WY, et al. 5-methylcytosine promotes mRNA export - NSUN2 as the methyltransferase and ALYREF as an m(5)C reader. Cell research. 2017;27(5):606-25.
39. Shen Q, Zhang Q, Shi Y, Shi Q, Jiang Y, Gu Y, et al. Tet2 promotes pathogen infection-induced myelopoiesis through mRNA oxidation. Nature. 2018;554(7690):123-7.
40. Morscher RJ, Ducker GS, Li SH, Mayer JA, Gitai Z, Sperl W, et al. Mitochondrial translation requires folate-dependent tRNA methylation. Nature. 2018;554(7690):128-32.
41. Theus MH, Ricard J, Liebl DJ. Reproducible expansion and characterization of mouse neural stem/progenitor cells in adherent cultures derived from the adult subventricular zone. Current protocols in stem cell biology. 2012;Chapter 2:Unit 2D.8.
42. Morita M, Alain T, Topisirovic I, Sonenberg N. Polysome Profiling Analysis. Bio-protocol. 2013;3(14):e833.
43. Amort T, Rieder D, Wille A, Khokhlova-Cubberley D, Riml C, Trixl L, et al. Distinct 5-methylcytosine profiles in poly(A) RNA from mouse embryonic stem cells and brain. Genome biology. 2017;18(1):1.
44. Li B, Dewey CN. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC bioinformatics. 2011;12:323.
45. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139-40.
46. McCarthy DJ, Chen Y, Smyth GK. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 2012;40(10):4288-97.
47. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550.
48. Xiao Z, Zou Q, Liu Y, Yang X. Genome-wide assessment of differential translations with ribosome profiling data. Nature communications. 2016;7:11194.
49. Rieder D, Amort T, Kugler E, Lusser A, Trajanoski Z. meRanTK: methylated RNA analysis ToolKit. Bioinformatics (Oxford, England). 2016;32(5):782-5.
50. Khoddami V, Yerra A, Cairns BR. Experimental Approaches for Target Profiling of RNA Cytosine Methyltransferases. Methods in enzymology. 2015;560:273-96.
51. Rainer J, Gatto L, Weichenberger CX. ensembldb: an R package to create and use Ensembl-based annotation resources. Bioinformatics. 2019;35(17):3151-3.
52. Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16(5):284-7.
53. Sun Z, Xu X, He J, Murray A, Sun MA, Wei X, et al. EGR1 recruits TET1 to shape the brain methylome during development and upon neuronal activity. Nature communications. 2019;10(1):3892.


54. Huang T, Chen W, Liu J, Gu N, Zhang R. Genome-wide identification of mRNA 5-methylcytosine in mammals. Nature structural & molecular biology. 2019.
55. Yang Y, Wang L, Han X, Yang WL, Zhang M, Ma HL, et al. RNA 5-Methylcytosine Facilitates the Maternal-to-Zygotic Transition by Preventing Maternal mRNA Decay. Molecular cell. 2019.
56. Prevention of neural tube defects: results of the Medical Research Council Vitamin Study. MRC Vitamin Study Research Group. Lancet (London, England). 1991;338(8760):131-7.
57. Food and Drug Administration. Food standards: Amendment of standards of identity for enriched grain products to require addition of folic acid; final rule (21 CFR Parts 136, 137, and 139). Federal Register. 1996;61:8781-97.
58. Barua S, Kuizon S, Chadman KK, Flory MJ, Brown WT, Junaid MA. Single-base resolution of mouse offspring brain methylome reveals epigenome modifications caused by gestational folic acid. Epigenetics & chromatin. 2014;7(1):3.
59. Barua S, Kuizon S, Brown WT, Junaid MA. DNA Methylation Profiling at Single-Base Resolution Reveals Gestational Folic Acid Supplementation Influences the Epigenome of Mouse Offspring Cerebellum. Frontiers in neuroscience. 2016;10:168.


Chapter 4 – Conclusions and Future Directions

4.1 Conclusions

Post-transcriptional modification of RNA molecules, now termed “RNA epigenetics” or “epitranscriptomics”, is a rapidly emerging field that studies the regulation of gene expression at the post-transcriptional level (1, 2). As one of the best-known RNA modifications, RNA cytosine-5 methylation (5-mrC) is formed by adding a methyl (-CH3) group to the fifth position of cytosine. Previous research has shown that 5-mrC modification is involved in diverse aspects of RNA metabolism. In particular, it facilitates the export of mRNAs from the nucleus to the cytoplasm with the help of the 5-mrC reader protein ALY/REF export factor (ALYREF) (3), and it maintains RNA stability through recognition by the 5-mrC reader protein YBX1 via its cold-shock domain (4-6). Because of the low abundance of 5-mrC in mRNAs, our understanding of its distribution, dynamic regulation and biological functions in different physiological and pathological processes is still very limited.

In this dissertation, we aim to investigate the dynamic regulation of RNA cytosine-5 methylation (5-mrC) in response to environmental cues and to explore the potential links between 5-mrC modification and mRNA abundance or mRNA translation. To this end, we adopt two in vitro cell models combined with state-of-the-art high-throughput sequencing techniques.

In chapter 1, we summarize the currently available approaches to measure 5-mrC modification at the global, transcriptome-wide and locus-specific levels. We particularly highlight the bioinformatics data analysis for RNA bisulfite sequencing datasets, which provides a transcriptome-wide profile of 5-mrC modification at single-nucleotide resolution. The RNA bisulfite sequencing technique serves as a powerful tool in the study of 5-mrC modification.
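As a minimal illustration of how single-nucleotide methylation levels are derived from bisulfite data, the sketch below assumes the standard bisulfite readout, in which unmethylated cytosines are converted and read as T while methylated cytosines remain C. The function names, the coverage and level cutoffs, and the read counts are hypothetical and are not the parameters used in this dissertation.

```python
def methylation_level(c_reads: int, t_reads: int) -> float:
    """Fraction of reads that retain C at a cytosine site after bisulfite treatment."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("site has no coverage")
    return c_reads / total


def call_sites(counts, min_coverage=10, min_level=0.1):
    """Return {site: level} for sites passing the (hypothetical) cutoffs."""
    called = {}
    for site, (c_reads, t_reads) in counts.items():
        if c_reads + t_reads < min_coverage:
            continue  # too few reads to estimate a level reliably
        level = methylation_level(c_reads, t_reads)
        if level >= min_level:
            called[site] = level
    return called


# Hypothetical per-site read counts: (reads with C, reads with T)
example_counts = {
    ("chr1", 1000): (8, 12),  # covered and partially methylated
    ("chr1", 2000): (0, 50),  # covered but fully converted
    ("chr2", 300): (3, 2),    # below the coverage cutoff
}
example_sites = call_sites(example_counts)
```

With these illustrative counts, only the first site passes both cutoffs.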

In chapter 2, we adopt a widely used neuronal activity model to study the dynamic regulation of 5-mrC modification in neurons in response to environmental stimuli. In vitro cultured mouse cortical neurons are depolarized with KCl for 0 h, 2 h, and 6 h. RNA sequencing (RNA-seq) and RNA bisulfite sequencing (RNA BS-seq) are performed simultaneously to profile gene expression and 5-mrC modification at the transcriptome-wide level. We have identified distinct gene expression profiles upon neuronal activation, with one group of early response genes for the early stage and another group of late response genes for the late stage. RNA BS-seq reveals a dynamic 5-mrC modification landscape in activated neurons. We have also found two sets of differentially methylated 5-mrC sites (DMS) for the early and late stages of neuronal activity, and the mRNAs carrying DMS are associated with mitochondrial and synaptic functions. Furthermore, we have observed a negative correlation between RNA methylation and mRNA expression in mouse cortical neurons during neuronal activity. Thus, these findings show the dynamic regulation of 5-mrC modification during neuronal activity and reveal a potential link between RNA methylation and mRNA expression.
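A negative correlation of this kind can be quantified per transcript as sketched below. The sketch assumes a Pearson correlation between the change in methylation level and the log2 fold change in expression; the choice of statistic, the function name, and the numbers are illustrative assumptions and not the analysis reported in chapter 2.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for five transcripts (not data from this study):
# change in methylation level and log2 fold change in expression.
methylation_change = [0.20, 0.10, 0.00, -0.10, -0.20]
log2_expression_change = [-0.8, -0.6, 0.1, 0.4, 1.0]

r = pearson_r(methylation_change, log2_expression_change)
```

A value of `r` below zero corresponds to the pattern described above, in which transcripts that gain methylation tend to decrease in abundance.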

In chapter 3, we investigate the influence of folate, a common nutrient supplement that serves as the methyl donor in the methylation events of cellular metabolism, on RNA cytosine-5 methylation (5-mrC) in adult mouse neural stem cells (NSCs). Compared to the control (medium level of folate, MF), NSCs cultured under folate deficiency (low level of folate, LF) or supplementation (high level of folate, HF) show no changes in mRNA abundance but do show changes in mRNA translation efficiency. RNA bisulfite sequencing of both total poly(A) RNA samples and polysome poly(A) RNA samples has revealed distinct 5-mrC profiles in NSCs treated with different concentrations of folic acid. Intriguingly, polysome mRNAs show consistently higher methylation than total mRNAs, indicating a critical role of 5-mrC modification in the regulation of mRNA translation.
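The comparison between polysome and total mRNA can be expressed site by site, as in the following sketch. It assumes that methylation levels are available for the same sites in both fractions; the function names, site labels, and values are hypothetical and serve only to illustrate the comparison.

```python
def methylation_shift(total_levels, polysome_levels):
    """Polysome minus total methylation level for sites measured in both fractions."""
    shared = total_levels.keys() & polysome_levels.keys()
    return {site: polysome_levels[site] - total_levels[site] for site in shared}


def fraction_hypermethylated(shifts):
    """Share of sites whose methylation level is higher in the polysome fraction."""
    if not shifts:
        raise ValueError("no shared sites to compare")
    return sum(1 for delta in shifts.values() if delta > 0) / len(shifts)


# Hypothetical methylation levels per site (not data from this study)
total_levels = {"siteA": 0.10, "siteB": 0.30, "siteC": 0.25, "siteD": 0.50}
polysome_levels = {"siteA": 0.25, "siteB": 0.45, "siteC": 0.20, "siteE": 0.60}

shifts = methylation_shift(total_levels, polysome_levels)
hyper_fraction = fraction_hypermethylated(shifts)
```

Sites covered in only one fraction (`siteD` and `siteE` here) are excluded, so the comparison is restricted to sites measured under both conditions.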

In summary, we have identified the transcriptome-wide distribution of 5-mrC modification within mRNAs in mouse cortical neurons and adult mouse neural stem cells (NSCs), as well as the dynamic regulation of 5-mrC modification in response to environmental factors such as neuronal activity and deficiency or supplementation of the methyl donor folate. Furthermore, we have shown a potential link between mRNA methylation and mRNA expression or mRNA translation, highlighting the critical role of 5-mrC modification in the post-transcriptional regulation of RNA metabolism.

4.2 Future directions

In this dissertation, we have identified transcriptome-wide profiles of 5-mrC modification in different biological settings by using high-throughput sequencing techniques. To further elucidate the function of 5-mrC modification in physiological and pathological conditions, more functional studies are needed that narrow down to specific mRNAs and specific 5-mrC loci linked to important cellular functions.

Recent studies show that TET family enzymes are involved in the sequential oxidation of RNA cytosine-5 methylation (5-mrC) to form 5-hmrC, 5-fC and 5-CaC (7-9). However, the underlying molecular mechanism that mediates the conversion from 5-CaC to unmethylated cytosine is still elusive. Moreover, Tet1/Tet2/Tet3 triple knockout mouse embryonic stem cells (ESCs) still show detectable levels of 5-hmrC (7). More studies are needed to elucidate the complete RNA demethylation pathway.

RNA cytosine-5 methylation has been shown to influence the binding affinity of specific RNA binding proteins, such as ALYREF and YBX1 (3-5). These proteins show preferential binding to methylated mRNAs and thus are termed 5-mrC reader proteins. More efforts are needed to identify novel 5-mrC reader proteins and their involvement in specific aspects of RNA metabolism, such as mRNA export from the nucleus to the cytoplasm, mRNA transport to specific cellular organelles, the regulation of mRNA stabilization or degradation, and the regulation of mRNA translation on polyribosome complexes. Identification of novel 5-mrC binding proteins and functional characterization of these proteins in diverse biological settings are essential to further our understanding of the biology of RNA cytosine-5 methylation.

4.3 References

1. He C. Grand challenge commentary: RNA epigenetics? Nature chemical biology. 2010;6(12):863-5.
2. Saletore Y, Meyer K, Korlach J, Vilfan ID, Jaffrey S, Mason CE. The birth of the Epitranscriptome: deciphering the function of RNA modifications. Genome biology. 2012;13(10):175.
3. Yang X, Yang Y, Sun BF, Chen YS, Xu JW, Lai WY, et al. 5-methylcytosine promotes mRNA export - NSUN2 as the methyltransferase and ALYREF as an m(5)C reader. Cell research. 2017;27(5):606-25.
4. Chen X, Li A, Sun BF, Yang Y, Han YN, Yuan X, et al. 5-methylcytosine promotes pathogenesis of bladder cancer through stabilizing mRNAs. Nature cell biology. 2019;21(8):978-90.
5. Yang Y, Wang L, Han X, Yang WL, Zhang M, Ma HL, et al. RNA 5-Methylcytosine Facilitates the Maternal-to-Zygotic Transition by Preventing Maternal mRNA Decay. Molecular cell. 2019.
6. Zou F, Tu R, Duan B, Yang Z, Ping Z, Song X, et al. Drosophila YBX1 homolog YPS promotes ovarian germ line stem cell development by preferentially recognizing 5-methylcytosine RNAs. Proceedings of the National Academy of Sciences of the United States of America. 2020;117(7):3603-9.
7. Fu L, Guerrero CR, Zhong N, Amato NJ, Liu Y, Liu S, et al. Tet-mediated formation of 5-hydroxymethylcytosine in RNA. Journal of the American Chemical Society. 2014;136(33):11582-5.
8. Huber SM, van Delft P, Mendil L, Bachman M, Smollett K, Werner F, et al. Formation and abundance of 5-hydroxymethylcytosine in RNA. Chembiochem : a European journal of chemical biology. 2015;16(5):752-5.
9. Basanta-Sanchez M, Wang R, Liu Z, Ye X, Li M, Shi X, et al. TET1-Mediated Oxidation of 5-Formylcytosine (5fC) to 5-Carboxycytosine (5caC) in RNA. Chembiochem : a European journal of chemical biology. 2017;18(1):72-6.