Genomics Informed Medicine - histology.ro · 2018. 5. 10.

Genomics Informed Medicine

NGS approaches (Massively Parallel Sequencing): a very dynamic field

Choice of test:

Whole Exome Sequencing (WES): only coding regions explored, 700 €/sample.

Whole Genome Sequencing (WGS): entire genome explored, approx. 10,000 €/sample; current efforts aim at 1,000 €/sample.

RNA-seq: determines the sequence and expression levels of transcripts, 800 €/sample (can be performed for large RNAs, i.e. mRNA/lncRNA, and for small RNAs such as miRNA).

ChIP-seq: determines DNA sequences bound by proteins, 700 €/sample.

Array CGH: uses arrays to detect deletions and copy number variations.

FIRST GENOME SEQUENCING: SANGER METHOD, 1.5 BILLION US $

DNA structure

[Figure: DNA strands with 5’ and 3’ ends labeled]

Chromosomes

In all living organisms DNA has the same structure, varying only in size.

- viruses: DNA = 10^3-10^4 base pairs

- bacteria: DNA = 10^6 bp, circular DNA, a single chromosome

- eukaryotes: DNA = 3 x 10^9 bp (in humans), divided among several chromosomes; the DNA is associated with proteins.

The human genome:

- 3.2 billion base pairs across 23 chromosomes;

- approximately 21,000-22,000 protein-coding genes (under 2% of the genome)

(gene = region of DNA that contains the information needed to synthesize messenger RNA (mRNA))

DNA replication

Flow of genetic information:

REPLICATION: DNA → DNA

eukaryotes: DNA → (transcription) → primary RNA → messenger RNA → (translation) → proteins

1. Transcription of DNA into RNA: general principles

Transcription

[Figure labels: transcription, maturation, translation]

A G C T . . .   DNA, sense or coding strand

T C G A   DNA, antisense or non-coding strand

A G C U 3’   RNA
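The base-pairing rule shown on the slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lecture: the function names `transcribe_template` and `sense_to_rna` are hypothetical, and the template (antisense) strand is assumed to be written 3’→5’, aligned under the sense strand as in the diagram.

```python
# Minimal sketch of transcription by base pairing (names are illustrative).
# Assumption: the template strand is written 3'->5', aligned under the
# sense strand exactly as on the slide.

RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_3to5: str) -> str:
    """Return the RNA (5'->3') complementary to a template strand (3'->5')."""
    return "".join(RNA_COMPLEMENT[base] for base in template_3to5.upper())


def sense_to_rna(sense_5to3: str) -> str:
    """The RNA has the same sequence as the sense strand, with U instead of T."""
    return sense_5to3.upper().replace("T", "U")


# The slide's own example: sense AGCT, antisense TCGA, RNA AGCU.
rna_from_template = transcribe_template("TCGA")
rna_from_sense = sense_to_rna("AGCT")
print(rna_from_template, rna_from_sense)
```

Both routes give the same RNA, which is why the sense strand is also called the coding strand.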

eukaryotes:

3 polymerases:

RNA polymerase I: ribosomal RNA (5.8S, 18S, 28S)

RNA polymerase II: mRNA, microRNA

RNA polymerase III: tRNA + 5S rRNA + small nuclear RNAs

3. mRNA synthesis: eukaryotes

- Begins with the synthesis of a primary RNA

The primary RNA undergoes maturation in 3 steps, ending with the mRNA:

- addition of a 5’ cap

- splicing: removal of introns and retention of exons

- polyadenylation of the 3’ end of the RNA

primary RNA: exon intron exon intron exon

mRNA: 5’ cap, exon exon exon, AAAAAA (poly(A) tail)
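The three maturation steps can be sketched on a toy sequence. This is a hedged illustration only: the segment sequences, the `m7G-` text label standing in for the 5’ cap, the poly(A) length of 6, and the helper names are all assumptions made for the example, not values from the lecture.

```python
# Toy sketch of primary RNA maturation: capping, splicing, polyadenylation.
# All sequences and names are hypothetical; "m7G-" is only a text label
# for the 5' cap, and real poly(A) tails are much longer than 6 nt.

segments = [
    ("exon", "AUGGCC"),
    ("intron", "GUAAGUUUAG"),
    ("exon", "GAAUUC"),
    ("intron", "GUCCCCCCAG"),
    ("exon", "UGAUAA"),
]


def build_primary(segs):
    """Concatenate exons and introns into the primary RNA."""
    return "".join(seq for _, seq in segs)


def exon_coordinates(segs):
    """Return (start, end) coordinates of each exon in the primary RNA."""
    coords, pos = [], 0
    for kind, seq in segs:
        if kind == "exon":
            coords.append((pos, pos + len(seq)))
        pos += len(seq)
    return coords


def mature_mrna(primary_rna, exons, poly_a_length=6):
    """Cap the 5' end, keep exons only, and add a poly(A) tail at the 3' end."""
    spliced = "".join(primary_rna[start:end] for start, end in exons)
    return "m7G-" + spliced + "A" * poly_a_length


primary = build_primary(segments)
mrna = mature_mrna(primary, exon_coordinates(segments))
print(primary)
print(mrna)
```

The introns are present in the primary RNA and absent from the mature mRNA, while the order of the exons is preserved.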

The primary RNA is synthesized by RNA polymerase II, which is recruited to the transcription initiation site by the general transcription factors.

The general transcription factors form a protein complex anchored to the DNA by the TATA-box binding protein (TBP):

[Figure: DNA, general transcription factors, RNA pol II, primary RNA]

- The set of cis sequences located near the transcription initiation site, together with the TATA or TATA-like box, constitutes the promoter of the gene.

- Other groups of cis sequences can be found at a distance from the transcription initiation site; they form enhancers or silencers, depending on the type of trans factors (activators or repressors) they recruit.

Structure of a eukaryotic gene:

[Figure: distal cis elements (enhancer/silencer) // promoter (300-500 bp) with cis elements and the TATA box // exons + introns // distal cis elements (enhancer/silencer)]

The 3D Genome in

Transcriptional Regulation

Adapted from B. Ren, Cell Stem Cell 2014 June 5; 14(6): 762–775.

How can one genome sequence give rise to so many different cell types? The answer lies, at least in part, in the ability of distinct cell types to express genes at different levels and in different combinations. “Lineage-specific” regulation of gene expression occurs at the level of transcription.

Features of the genome beyond its primary nucleotide sequence must contribute to the lineage-specific gene regulation that underlies cellular identity.

However, no linear representation of the human

genome – no matter how well annotated with

functional elements – can fully capture the

molecular mechanisms responsible for lineage-

specific transcriptional regulation.

The role of non-linear interactions in transcriptional

regulation is exemplified by two fundamental

properties of metazoan enhancer function:

1) Enhancers can direct the expression of target genes located far away in linear distance (i.e. number of intervening base pairs).

2) The gene most heavily influenced by an enhancer is not always the gene that is closest by linear distance.

“Long-range” regulation is possible because

enhancers are in close physical proximity to the

promoters of their target genes in vivo, despite long

stretches of intervening nucleotides.

This physical proximity allows protein complexes

bound at enhancers to interact with those bound at

promoters, thereby influencing transcription of

target genes.

To detect physical interactions like those between an enhancer and a promoter, a series of molecular techniques based on the concept of Chromatin Conformation Capture (3C) is used.

Chemical crosslinking secures 3D contacts between genomic loci occurring in live cells.

This cross-linked chromatin is then isolated and digested with a restriction enzyme.

Re-ligation is performed in extremely dilute solutions, so that ligation between fragments held together in the same cross-linked complex is favored.

Only loci that were contacting each other in vivo (and thus fixed together by crosslinking) will be ligated together.

Higher-order genome organization

B. Ren Cell Stem Cell 2014 June 5; 14(6): 762–775.

The genome is organized at many levels, starting with higher-order structures that are visible under the microscope. The most fundamental unit of higher-order genome organization is the chromosome. Each chromosome occupies its own sub-volume of the interphase nucleus, known as a Chromosome Territory (CT).

CTs can be visualized by Fluorescent in Situ Hybridization (FISH) using probe sets designed to paint entire chromosomes. CTs are also evident in C-data (data from 3C-based methods), which demonstrate a consistent preference for intra-chromosomal over inter-chromosomal interactions.

Gene-rich regions tend to localize to the periphery of CTs, which facilitates access to transcriptional machinery and the sharing of this machinery between active genes on different chromosomes.

Specific regions can shift position from the CT interior to the

CT periphery as genes in those regions become active during

development.

The position of a given CT within the nucleus is

highly stable through interphase.

Level 1: Chromosomes occupy distinct sub-

regions of the nucleus known as chromosome

territories (CTs). Individual chromosomes are

indicated by different colors.

Genomic regions at the nuclear periphery have been

studied in further detail using a method that can identify

regions that come into contact with proteins of the nuclear

lamina, a filamentous network of proteins abutting the

inner nuclear membrane.

Genomic regions that contact the nuclear lamina, which

are known as Lamin Associated Domains (LADs), are

characterized by low levels of transcriptional activity, low

gene density, and repressive histone modifications

including H3K27me3 and H3K9me.

These observations suggest a link between transcriptional

silencing and the nuclear lamina.

Chromatin organization

• Nuclear intermediate filament (IF) proteins.

• Cytoskeleton component of nucleus

• Associated with inner nuclear membrane.

• Contribute to chromatin regulation, regulate gene expression, DNA replication, DNA repair,

cell proliferation & differentiation etc.

• Developmental processes, including tissue formation and homeostasis and organogenesis

[Figure: Gene regulation during development, showing transcription factors, DNA methylation and histone modifications, transcription, the nuclear lamina, and nuclear lamins. Courtesy of J. Staerk]

The family of Lamin proteins (courtesy of J. Staerk)

Lamin A, C, Δ10 and C2:

• Confer viscosity and stiffness to the nucleus.

• Involved in postnatal development.

• Lamin A/C KO mice are born apparently normal but develop growth retardation, have a muscular disease phenotype, and die.

• Mutations in LMNA are associated with ∼14 distinct human diseases.

Lamin B1; Lamin B2 and B3:

• Confer elasticity to the nucleus.

• Involved in cellular processes during embryogenesis.

• Lamin B1 KO mice have major defects in the lungs and brain and die shortly after birth.

• Few disease-causing mutations have been identified in LMNB1 or LMNB2; such mutations are mostly embryonic-lethal.

• LADs are large chromosomal domains that associate with the lamina.

• Most genes in LADs are transcriptionally inactive and enriched in repressive histone marks

such as H3K27me3 and H3K9me2 , suggesting a repressive role for LADs.

• Lamina-genome interactions are widely involved in the control of gene expression programs

during lineage commitment and terminal differentiation.

[Figure: Lamin associated domains (LADs) in embryonic stem cells, neural precursor cells, and astrocytes. Molecular Cell, Volume 38, Issue 4, 2010, 603-613. Courtesy of J. Staerk]

The association of specific genes with the nuclear

lamina often coincides with their transcriptional

silencing during differentiation (Peric-Hupkes et al.,

2010).

Examples include the key pluripotency genes Oct4,

Nanog, and Klf4. Conversely, loss of association

with the lamina and re-positioning away from the

nuclear periphery often coincides with

transcriptional activation.

Level 2: Transcriptionally inactive regions are enriched at the

nuclear periphery where they contact the nuclear lamina (red).

Actively transcribed genes often co-localize at RNA polymerase

II transcription factories (yellow). These and other instances of

colocalization between regions with similar transcriptional

activity may provide the physical basis for the observations of A

and B compartments in C-data.

Topological domains

Chromosomes are comprised of structural units called

Topological Domains, also known as Topologically-

Associating Domains (TADs).

TADs are regions of high local contact frequency, which are

separated by sharp boundaries across which contacts are

relatively infrequent.
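The definition above (dense local contacts, sharp boundaries with few contacts across them) can be made concrete with a small sketch. The approach shown is an insulation-style score, one common way of calling boundaries from contact maps; it is offered as an illustration and is not necessarily the method used in the studies cited on these slides. The toy matrix, window size, and function names are assumptions.

```python
# Sketch: locating a TAD boundary in a toy contact matrix with an
# insulation-style score. Matrix values, window size, and names are
# hypothetical and chosen only for illustration.


def toy_matrix(tad_sizes, within=10.0, between=1.0):
    """Build a symmetric contact matrix with high contact frequency inside TADs."""
    labels = [tad for tad, size in enumerate(tad_sizes) for _ in range(size)]
    return [[within if a == b else between for b in labels] for a in labels]


def insulation_scores(matrix, window):
    """Mean contact frequency between the `window` bins on each side of position b."""
    n = len(matrix)
    scores = {}
    for b in range(window, n - window + 1):
        total = 0.0
        for i in range(b - window, b):       # bins upstream of b
            for j in range(b, b + window):   # bins downstream of b
                total += matrix[i][j]
        scores[b] = total / (window * window)
    return scores


def call_boundaries(scores):
    """A boundary is a strict local minimum of the insulation score."""
    positions = sorted(scores)
    boundaries = []
    for prev, cur, nxt in zip(positions, positions[1:], positions[2:]):
        if scores[cur] < scores[prev] and scores[cur] < scores[nxt]:
            boundaries.append(cur)
    return boundaries


matrix = toy_matrix([5, 5])              # two TADs of 5 bins each
scores = insulation_scores(matrix, window=2)
boundaries = call_boundaries(scores)
print(scores)
print(boundaries)
```

In this toy case the score drops to its minimum exactly where the two blocks meet, because few contacts cross that position.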

Mammalian genomes contain roughly 2000 TADs covering

more than 90% of the mappable genome. They vary in size from

a few hundred kilobases (kb) to several megabases with an

average size of approximately 1 Mb. TADs are too small for

current microscopy-based methods but FISH is generally

consistent with C-data.

TADs are a fundamental unit of genome

organization. TADs have now been described in

every mouse and human cell type in which they

have been scrutinized as well as in Drosophila

(TAD size is considerably smaller in Drosophila

at ~100 kb on average).

The boundaries between TADs are strikingly consistent

across cell types.

Roughly 50-90% of TAD boundaries overlap in pairwise

comparisons between cell types. The locations of TAD

boundaries are also highly conserved between mouse and

human, indicating that both the existence and location of

TADs have functional significance that is under selective

pressure.

TADs are not detectable during mitosis.

TADs frequently overlap with regions demarcated by other

functional annotations related to transcriptional activity including

histone modifications, replication timing, and association with the

nuclear lamina.

Transitions between compartment A and compartment B also

frequently occur at TAD boundaries. A given TAD tends to be all in

the active compartment A, or all in the inactive compartment B.

TADs in the active compartment A tend to contain a higher density of

internal interactions, as might be expected given the role of

interactions between cis-regulatory elements in transcriptional

activity.

The same TAD can be found in different compartments (i.e. A or B)

in different cell types.

cis interactions across TAD boundaries are

infrequent. These boundaries may limit the

potential target genes of a given enhancer, or vice

versa limit the potential enhancers of a given

target gene.

Promoters and enhancers within the same TAD

often show coordinated activity.

The insertion of a reporter construct designed to act as a

regulatory sensor into different locations within the same

TAD yields highly similar patterns of reporter gene

expression in transgenic mouse embryos. Well-described

cases of long-range regulation involve a promoter and

distal enhancer that lie within the same TAD.

The HOXD gene cluster straddles the border between two

TADs, and is influenced by distal regulatory elements from

those different TADs at different stages in development.

CCCTC-binding factor (CTCF) binds three regularly spaced repeats of the core sequence CCCTC in the Myc promoter, hence its name (Lobanenkov et al., Oncogene 5 (12): 1743–53, 1990). It binds the sequence CCGCGNGGNGGCAG. CTCF binds to 15,000-40,000 sites in the human genome.

CTCF is an 11-zinc-finger protein that uses different combinations of its zinc fingers for DNA engagement.
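As a rough illustration, a sequence can be scanned for the CCGCGNGGNGGCAG sequence quoted above with a regular expression in which N matches any base. This is only a sketch: real CTCF sites are degenerate and are usually scored with position weight matrices rather than exact consensus matches, and the function names and test sequence here are hypothetical.

```python
import re

# Sketch: exact-match scan for the consensus quoted on the slide,
# on both strands. N is treated as "any base". Names are illustrative.

CTCF_CONSENSUS = "CCGCGNGGNGGCAG"
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, motif=CTCF_CONSENSUS):
    """Return sorted (start, strand, matched_sequence) tuples; start is 0-based
    on the forward strand."""
    body = "".join(IUPAC[base] for base in motif)
    pattern = re.compile("(?=(" + body + "))")  # lookahead finds overlapping hits
    seq = seq.upper()
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            if strand == "+":
                start = match.start()
            else:
                start = len(seq) - match.start() - len(motif)
            hits.append((start, strand, match.group(1)))
    return sorted(hits)


example = "TTTCCGCGAGGTGGCAGAAA"
print(find_sites(example))
```

Scanning both strands matters because a binding site can lie in either orientation relative to the reference sequence.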

Binding sites for the protein CTCF are highly

enriched at TAD boundaries.

CTCF can function as a transcriptional insulator in

certain contexts by blocking enhancer-promoter

interactions and/or preventing the spread of epigenetic

marks.

Deletion of a specific TAD boundary containing

CTCF binding sites led to an increase in interactions

between adjacent TADs.

Level 3: Topological domains, or Topologically-Associating Domains

(TADs) are regions of frequent local interactions separated by

boundaries across which interactions are less frequent. CTCF binding

sites and other sequence features (TSS, SINEs; not depicted here) are

enriched at TAD boundaries. Note that CTCF also binds within TADs.

Cohesin is often present at TAD boundaries.

Level 4: Transcriptional regulation depends on long-range

interactions between cis-regulatory elements such as enhancers (light

red) and promoters (light yellow). These cis-regulatory interactions

are facilitated by proteins including Transcription Factors (“TFs”;

blue), co-factors such as Mediator (“Med”; red) and Cohesin (purple

ring), and RNA Polymerase II (“Pol II”; yellow).

Knockdown of CTCF leads to an increase in interactions

between adjacent domains (so-called “inter-domain

interactions”), though not complete abrogation of TAD

boundaries.

Loss of Cohesin (recruited by CTCF and present at many

TAD boundaries) also leads to an increase in inter-domain

interactions.

TAD boundaries are also enriched for SINE elements and

Transcriptional Start Sites (TSSs, particularly those of so-

called “housekeeping” genes), but the requirement of these

elements for boundary activity has not been explored in as

much detail.

TAD boundaries range in size from tens of kb to more

than 100 kb. The lack of precise boundary locations

may be due in part to limited resolution of the C-

technologies used to identify TAD boundaries

(currently between ~10-40 kb).

The formation of a TAD boundary requires more than

one sequence element – for example, the combination

of several CTCF binding sites, and perhaps

housekeeping TSSs and SINEs, spread over several kb.

TOPOLOGICALLY ASSOCIATING DOMAINS

TADs and A/B compartments

A) Diagrammatic representation of

two neighboring TADs.

3C: Capturing Chromosome Conformation

Mediator, CTCF, and cohesin are ARCHITECTURAL PROTEINS

involved in connecting cis-acting elements (promoter-enhancer,

promoter-promoter, enhancer-enhancer)

Interactions between cis-regulatory

elements direct lineage-specific

transcription

Certain contacts occur far more often than expected by

chance based on the linear distance between the loci

involved.

Interaction describes the relationship between loci that are

in contact more frequently than would be expected based

on linear distance. The term “looping” is sometimes used to

describe such interactions.
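The idea of contacts that are more frequent than expected from linear distance can be sketched as an observed/expected ratio: the expected value for a pair of bins is taken as the average contact frequency of all pairs separated by the same distance. This is a simplified illustration with a hypothetical toy matrix and function names, not the statistical model of any particular study.

```python
# Sketch: observed/expected contact ratio, with the expectation defined as
# the mean contact frequency at each linear distance (matrix diagonal).
# The toy matrix and names are assumptions for illustration.


def expected_by_distance(matrix):
    """Mean contact frequency for each bin separation d (symmetric matrix)."""
    n = len(matrix)
    expected = []
    for d in range(n):
        values = [matrix[i][i + d] for i in range(n - d)]
        expected.append(sum(values) / len(values))
    return expected


def observed_over_expected(matrix):
    """Divide every contact by the expected value for its linear distance."""
    expected = expected_by_distance(matrix)
    n = len(matrix)
    result = []
    for i in range(n):
        row = []
        for j in range(n):
            exp = expected[abs(i - j)]
            row.append(matrix[i][j] / exp if exp > 0 else 0.0)
        result.append(row)
    return result


# Background: contact frequency halves with each additional bin of distance.
n_bins = 6
contacts = [[16.0 / 2 ** abs(i - j) for j in range(n_bins)] for i in range(n_bins)]
# One pair of distant loci (bins 0 and 4) contacts far more often than background.
contacts[0][4] = contacts[4][0] = 9.0

ratios = observed_over_expected(contacts)
print(round(ratios[0][4], 2), round(ratios[0][1], 2))
```

A ratio well above 1 at bins 0 and 4 marks that pair as an interaction in the sense used above, while background pairs stay at a ratio of 1.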

Using C-technologies and other molecular techniques it

was revealed that the promoters of active β-globin genes

interact with an upstream regulatory sequence known as

the Locus Control Region (LCR), despite more than 40 kb

of intervening sequence. These interactions were not

observed in cell types where β-globin genes are silent.

Reproducible interactions are common in mammalian genomes, and interacting loci are highly enriched for characteristics of cis-regulatory elements.

One recent C-study detected more than a million

interactions genome-wide between loci that are on average

separated by roughly 100 kb, including approximately

30,000 interactions between active promoters and putative

enhancers.

The vast majority of these interactions did not cross a TAD

boundary, consistent with the role of TAD boundaries in

constraining 3D interactions.

Interacting partners are not readily predicted by linear

distance. Fewer than 10% of all interactions between TSSs

and distal regions involved the closest TSS by linear

distance.

Enhancers and promoters do not interact in a simple 1:1

relationship:

-One promoter often interacts with multiple enhancers

-One enhancer often interacts with multiple promoters

-Promoters often interact with other promoters

-Enhancers often interact with other enhancers.

Cis-regulatory interactions often vary between cell types,

which is particularly true for interactions between

promoters and putative enhancers.

The presence of putative enhancer-promoter interactions is

highly correlated with a gene’s transcriptional activity.

Housekeeping genes tend to be highly expressed but not

involved in interactions with putative enhancers.

Lineage-specific genes are particularly dependent on long-

range regulatory interactions. Some broadly-expressed

genes (e.g. Myc) interact with distinct sets of enhancers in

different cell types.

Interactions between the LCR and β-globin genes are not simply a consequence of transcription, because inhibition of transcription by treatment with RNA polymerase II inhibitors does not disrupt these interactions, despite a drastic reduction in β-globin transcription.

Forced ectopic interactions between the LCR and β-globin promoter (i.e. the creation of LCR-promoter interactions in cells where such an interaction is not naturally present) stimulate β-globin transcription.

Deng and colleagues (2012) created an ectopic interaction

between the β-globin promoter and LCR in the pro-

erythroblast cell line GE1, which does not normally express

β-globin nor display an interaction between promoter and

LCR. Creation of this ectopic interaction caused a dramatic

increase in β-globin expression.

Genome-wide evidence on the action of Tumor Necrosis

Factor on target cells demonstrates that enhancer-promoter

interactions often exist prior to the onset of transcription

for a particular response.

As differentiation proceeds, cells gain priming interactions

for stimuli that are important at later stages of

differentiation, while losing priming interactions required

at earlier stages.

cis-regulatory interactions are

secured by TFs and architectural

proteins

Central to any discussion of cis-regulatory interactions is a

consideration of how, at the molecular level, these

interactions are established and maintained.

At the sequence level both promoters and enhancers are

composed of binding sites for TFs. Promoters bind a core

set of General Transcription Factors (GTFs), and these

GTFs in turn recruit RNA polymerase II and additional

cofactors.

The repertoire of TFs that bind at enhancers is more

contingent on the cell type in question.

The sequence-specific DNA binding factor CTCF stands

apart from other TFs with respect to genome

organization.

Regions bound by CTCF are frequently engaged in

physical interactions with themselves as well as with

other regions.

CTCF is described as a “master weaver of genome” and

as an “architectural protein”. CTCF is ubiquitously

expressed, and binds to tens of thousands of sites

throughout the genome.

Part of CTCF’s function is to establish a structural framework that is similar between cell types. While the involvement of CTCF in 3D interactions is integral to its function, the impact of CTCF binding on transcription depends on the locus and cell type in question.

CTCF and other TFs share the ability to recruit cofactors that are

also involved in the formation of cis-regulatory interactions.

One such cofactor is the Cohesin complex. Cohesin is well known

for its role in holding sister chromatids together until anaphase

when they are separated and migrate to opposite spindle poles.

Cohesin is commonly found at enhancers, where it acts together

with the Mediator complex to maintain physical interaction

between promoters and enhancers.

Mediator can directly interface with factors bound at enhancers

and those bound at promoters, facilitating communication

between them. Cohesin is also present at CTCF binding sites,

many of which are outside of traditional enhancers and lack

Mediator binding.

In a given cell type (including ESCs) the majority of

Cohesin binding falls into one of two categories:

1)Sites that are co-occupied by Mediator and multiple

2) Sites that are co-occupied by CTCF (Kagey et al., 2010; Yan et al., 2013; Faure et al., 2012; Hnisz et al., 2013).

80% of cis interactions involved loci bound by some

combination of Cohesin, Mediator, and/or CTCF, leading

the authors to label these factors as “architectural proteins”.

Cohesin-Mediator interactions also occurred over shorter distances (mean <100 kb) than did Cohesin-CTCF interactions (mean >1 Mb). Cohesin may function as a general stabilizer of these interactions.

A complex picture emerges in which a number of trans factors, including lineage-specific TFs, CTCF, Mediator, and Cohesin, are involved in anchoring different types of cis-regulatory interactions (including, but not limited to, interactions between promoters and enhancers).

Interactions are anchored by factors that recognize DNA

in a sequence-specific manner, thereby determining

which specific loci are most likely to participate in

stable interactions.

These DNA binding factors in turn recruit cofactors

such as Cohesin and Mediator, which further promote

and stabilize the interactions.

A newly-described class of non-coding RNA (ncRNA-a) can direct the transcriptional upregulation of other genes in cis, thus functioning analogously to classically-defined enhancer elements.

As ncRNA-a are transcribed, they engage in physical

interactions with their target promoters, and these

interactions are dependent on the recruitment of Mediator

by the nascent ncRNA-a. Like TFs, ncRNA-a can anchor

cis-regulatory interactions and recruit cofactors to further

stabilize these interactions.

Genome organization and pluripotency

Pluripotent cells have the same features of genome

organization as differentiated cells, including A/B

compartments, LADs, TADs, and cis-regulatory interactions.

One unique feature is that chromatin is generally less

condensed and more loosely organized in pluripotent cells

than in lineage committed cells.

Correspondingly, histone modifications that mark

heterochromatin expand during lineage commitment to cover

a substantially larger portion of the genome in differentiated

cells than in ESCs.

C-data revealed that transcriptionally inactive regions tend

to participate in fewer specific long-range interactions in

ESCs than in non-ESCs. These results are all consistent

with a chromatin conformation that is particularly

malleable in pluripotent cells, and which may function to

maintain a state of permissiveness for the different

transcriptional programs required for lineage specification.

Although condensed heterochromatin is less prevalent in

ESCs than in other cell types, transcriptional repression is

still important to the pluripotent state.

The repression of many genes associated with lineage

commitment requires Polycomb group (PcG) proteins.

Genomic regions enriched for PcG binding and/or its

associated repressive histone modification H3K27me3

contact each other at high frequency in C-data generated

from ESCs.

Another unique feature of higher-order genome

organization in pluripotent cells is that regions with a high

density of binding sites for the key pluripotency TFs

Oct4, Sox2, and Nanog (together abbreviated as OSN)

tend to co-localize in nuclear space.

OSN are directly involved in higher-order genome

organization in ESCs, which is further supported by the

demonstration that loss of either Oct4 or Nanog

diminishes long-range contacts between OSN-bound

regions.

Surprisingly, binding of CTCF and Cohesin is not

enriched at long-range contact sites in ESCs, suggesting

that the role of OSN in shaping higher-order structure

of the pluripotent genome is independent of

architectural proteins.

OSN also anchor short-range cis-regulatory interactions

that do require Cohesin.

OSN and other key pluripotency genes are in contact with the silencing environment of the nuclear lamina less frequently in ESCs than in differentiated derivatives.

Interactions between the promoters of different pluripotency genes can be detected in ESCs both in cis and in trans, indicating that they colocalize in the pluripotent nucleus, perhaps at shared RNA Polymerase II transcription factories.

A more comprehensive view of the genome as a 3D entity is required. Many of the functional modules in the genome are arranged in linear fashion.

Exons are always transcribed in linear order, and promoters are always located immediately upstream of the transcription unit. The machinery of transcription is processive (that is, it moves along a stretch of DNA in a line), and thus the functional modules on which the machinery acts are arranged in linear fashion in the genome.

Unlike the exons of a gene, the enhancers that regulate a

gene’s transcription are often not arranged in a linear

fashion with respect to the gene in question. Enhancers

can be found upstream or downstream of the genes they

regulate, can act over large linear distances, and can skip

over intervening genes.

The machinery of transcriptional regulation is structural –

that is, it relies on 3D interactions between modules that

may be separated by considerable linear distance.

Genome organization plays a role in myriad other

processes including DNA repair, DNA replication, and X

chromosome inactivation.

Mutations in genes that encode genome and nuclear

architectural components (including subunits of the

Cohesin complex, Mediator complex, and Nuclear Lamins)

can result in severe developmental phenotypes.

Genes encoding Mediator subunits, Cohesin subunits, and

CTCF are also mutated at significant frequency in cancer,

raising questions about the potential contribution of defects

in genome organization to malignancy.

SNPs that are linked to human disease by Genome-Wide

Association Study (GWAS) are commonly found within

enhancers, suggesting that perturbation of long-range

regulation is the mechanism behind a sizable portion of

pathogenic sequence variation.

Genome Sequencing: from The

Human Genome Project to

Clinical Applications

Lander et al., Nature 2001, 409, 877

Plasmids: 4 kb

Cosmids: 40 kb

BAC, YAC: 100-500 kb

Bacterial genome: 2 Mb

BAC (based on the E. coli F plasmid) allows stable cloning of up to 1 million bp

Long repetitive sequences make full-resolution difficult

Technology Evolution to Massive Parallel Sequencing

An emulsion method for DNA amplification and a special instrument

Bentley et al. 2008

-Human genome = 30,000 genes (2001 estimate)

-Hundreds of genes acquired by horizontal transfer from bacteria

-Dozens of genes acquired from transposons

-50% of the genome is derived from transposable elements, of which DNA and LTR transposons are inactive

-Segmental duplication is frequent in the human genome and involves pericentromeric and subtelomeric regions

-Recombination rates are higher in distal regions of chromosomes, in a pattern that promotes the occurrence of 1 crossover per chromosome arm in each meiosis

-Alu transposable elements predominate in GC-rich regions, while GC-poor regions are associated with dark G-bands in karyotypes

Lander et al., Nature 2001, 409, 877

-SNP = Single Nucleotide Polymorphism

-Two human genomes compared will differ at 2.5 million places, corresponding to a frequency of 1 per 1,300 nucleotide pairs

-Rate of nucleotide change per genome: 5 nt per 1,000 are changed in 1 million years, owing to the accuracy of replication

-Human and chimpanzee chromosomes are separated by 5 million years of evolution and are very similar; human and mouse chromosomes, separated by 100 million years of evolution, are much more different

Lander et al., Nature 2001, 409, 877
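The two figures quoted for SNPs are consistent with each other, which a short calculation makes explicit. The sketch below assumes the genome size of 3.2 billion base pairs given earlier in these slides; the function and variable names are illustrative.

```python
# Worked arithmetic for the SNP figures above.
# Assumption: genome size of 3.2 billion bp, as stated earlier in the slides.

GENOME_SIZE_BP = 3_200_000_000


def expected_differences(genome_size_bp, bp_per_difference):
    """Number of differing positions if one position in `bp_per_difference` differs."""
    return genome_size_bp / bp_per_difference


def changed_nt_per_million_years(genome_size_bp, changes_per_1000=5):
    """Nucleotides changed genome-wide in 1 million years at 5 per 1,000."""
    return genome_size_bp * changes_per_1000 / 1000


snp_count = expected_differences(GENOME_SIZE_BP, 1300)
changed = changed_nt_per_million_years(GENOME_SIZE_BP)
print(f"{snp_count:,.0f} differences between two genomes")
print(f"{changed:,.0f} nucleotides changed per million years")
```

One difference per 1,300 bp over 3.2 billion bp gives about 2.46 million differences, in line with the rounded figure of 2.5 million.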

-Number of genes: 6,000 for the yeast Saccharomyces cerevisiae; 18,000 for the nematode C. elegans; 13,000 for Drosophila melanogaster; 30,000 for humans

-A total of 3 billion years of evolution

Chimpanzee and human chromosomes are almost identical, except for human chromosome 2

99% of Alu repeats (a type of transposon repeat) are in the same place in the human and chimpanzee genomes

The 1% of repeats that differ include human-specific Alu elements, which are still active and can cause genetic diseases.

Human Alu sequences (1 million) and mouse B1 sequences (400,000) evolved from the 7SL RNA gene, which encodes the SRP RNA

Alu restriction site (AluI) = AG/CT

-3 million transposable element remnants in the human genome

-At present these elements are still responsible for new mutations (about 2 in 1,000 mutations)

-Hypothesis: 170 million years ago, the critical speciation events leading to the mammalian radiation from a common ancestor may have involved a burst in transposition activity

Lander et al., Nature 2001, 409, 877

Transposons in the Human Genome

Transposons:

-LINEs (LINE1 is still active!) are the most ancient. They are 6 kb in length, encode 2 ORFs, and have a polymerase II promoter. A complex of the encoded proteins and the RNA moves to the nucleus; an endonuclease makes a single-strand nick, and the RT uses the nicked DNA to prime reverse transcription from the 3’ end of the RNA. Reverse transcription is imperfect and often leaves unfinished 5’ ends. New insertions are flanked by 7-20 bp target site duplications. LINEs target AT-rich, gene-poor regions because TTTT/A is the preferred cleavage site of the endonuclease.

-Long interspersed nuclear elements can insert into the gene for Factor VIII and produce hemophilia

-SINEs are short (100-400 bp) and use LINEs to function

-Their promoter regions are shared with tRNA sequences or with the 7SL RNA of the signal recognition particle; this latter subfamily of SINEs is the Alu repeat

-LTR transposons are flanked by LTRs and contain gag and pol genes, the latter coding for RT and RNase H; transposition occurs via the RT in a cytoplasmic virus-like structure, primed by a tRNA, as opposed to chromosomal priming for SINEs. They generated extracellular retroviruses by acquiring an envelope protein

-DNA transposons resemble bacterial transposons, having terminal inverted repeats and using cut-and-paste mechanisms; they are short-lived elements

-Rapid transcription of SINE elements into RNA can only occur near genes in open chromatin; SINE RNA can appear in massive amounts and inhibit PKR, thereby stimulating translation.

-Stress induces SINE transcription, which leads to massive increases in protein translation: a mechanism of evolution?

-The Y chromosome has lower levels of transcribed somatic genes and shows lower than expected numbers of Alu elements; the reverse is true for chromosome 19

-LINE1 and Alu constitute 60% of all interspersed repeat sequence

-LINE1 and Alu are vertically transmitted

-The genomes of the worm, fly, and mustard weed have many more types of recently active transposons, of which LINE and SINE elements account for 5-6%

-The rate of housecleaning through small genomic deletions is 75-fold higher in flies than in mammals

-New spontaneous mutations due to LINEs are 30 times more likely to occur in mice than in humans

Lander et al., Nature 2001, 409, 877

Gene Expression measures mRNA

Microarrays 3,000-6,000 genes at a time

Northern blot- one gene at a time

RNA-sequencing all genome

Promoter activation- gene regulation

Chromatin immunoprecipitation- PCR for specific DNA sequences (ChIP-PCR)

Chromatin immunoprecipitation- chip hybridization (ChIP-chip)

Chromatin immunoprecipitation- massive parallel sequencing (ChIP-seq)

[Figure: JAK2/STAT signaling diagram; labels: cytoplasm, JAK2, STAT 1, 3, 5, STAT 5, nucleus]

Chromatin Immunoprecipitation (ChIP)

1. Epo treatment (100 U/ml), 15 minutes

2. Cross-link DNA-proteins in vivo (formaldehyde)

3. Chromatin extraction

4. Sonication

5. Incubation with Ab-STAT

6. Precipitation of Ab-STAT/protein complexes (protein A sepharose)

7. Cross-link reversion, PK and RNase A treatments, purification of DNA

8. With known targets: PCR, then agarose gel

ChIP-seq allows identification of DNA sequences physically bound by a transcription factor. It does not give information about transcriptional activation.

Microarrays give information about mRNA expression levels, not the sequence of mRNAs.

Golub et al., Science 1999, 286, 531.

[Figure: microarray expression profiles, ALL vs. AML]

RNA-seq gives information about RNA levels and sequence
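To make the "RNA levels" part concrete, read counts per gene are usually rescaled by sequencing depth before samples are compared. The sketch below computes counts per million (CPM), one common depth normalization; the gene names and counts are hypothetical, and the slides do not say which normalization was used.

```python
def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million.

    counts: dict mapping gene name -> raw read count for one sample.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}


# Hypothetical counts for a single sample.
raw_counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}
cpm = counts_per_million(raw_counts)
print(cpm)
```

CPM corrects only for library size. Comparing different genes within a sample also requires correcting for transcript length, which this sketch does not attempt.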

Methodology

Small RNA sequencing: RNA extraction -> Illumina library preparation -> small RNA sequencing -> identification of sncRNAs

OxBS sequencing: DNA extraction -> Illumina library preparation -> OxBS sequencing -> identification of 5hmC

Analysis: MOABS (MOdel based Analysis of Bisulfite Sequencing data) / novel pipeline
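The logic behind identifying 5hmC from OxBS data can be sketched as a subtraction: conventional bisulfite sequencing reports 5mC and 5hmC together, whereas oxidative bisulfite sequencing reports 5mC alone, so the difference at a given cytosine estimates 5hmC. The function below is a minimal illustration of that idea only; it is not how MOABS works internally, and the function name and input values are hypothetical.

```python
def estimate_5hmc(bs_level, oxbs_level):
    """Estimate the 5hmC fraction at one cytosine by subtraction.

    bs_level:   methylated fraction from bisulfite sequencing (5mC + 5hmC)
    oxbs_level: methylated fraction from oxidative bisulfite sequencing (5mC only)
    Negative differences, which arise from sampling noise, are clipped to 0.
    """
    for value in (bs_level, oxbs_level):
        if not 0.0 <= value <= 1.0:
            raise ValueError("methylation levels must lie between 0 and 1")
    return max(0.0, bs_level - oxbs_level)


# Hypothetical levels at a single CpG.
print(estimate_5hmc(0.80, 0.65))
```

Simple subtraction ignores read depth, which is why model-based tools are preferred when coverage is low or uneven.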

miRNA/small RNA-seq

A type of RNA-seq that differs from other RNA-seq in that the input material is enriched for small RNAs.

It allows examination of tissue-specific expression and disease associations, and the discovery of previously uncharacterized small RNAs.

A combination of ChIP-seq, 3C (chromosome conformation capture), and RNA-seq gives information about chromatin shape, promoter and enhancer usage, the consequences for gene expression, and the sequence of transcripts.

Exome sequencing gives information on the sequence of the protein-coding regions (exons) of the genome in a sample, but not about possible deletions and insertions or copy number variations. It also misses regulatory sequences.

Array CGH approaches (molecular karyotype) give information about copy number variations.
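Array CGH results are typically read as log2 ratios of test signal to reference signal per probe: in a diploid region a single-copy loss gives log2(1/2) = -1 and a single-copy gain gives log2(3/2), about 0.58. The sketch below works through that arithmetic; the signal values and the +/-0.3 calling thresholds are assumptions chosen for illustration, not values from the slides.

```python
import math


def log2_ratio(test_signal, reference_signal):
    """log2 of test over reference hybridization signal for one probe."""
    return math.log2(test_signal / reference_signal)


def expected_log2_ratio(copies, ploidy=2):
    """Theoretical log2 ratio for a given copy number in a pure sample."""
    return math.log2(copies / ploidy)


def call_copy_state(ratio, gain_threshold=0.3, loss_threshold=-0.3):
    """Classify a probe; the thresholds are illustrative assumptions."""
    if ratio >= gain_threshold:
        return "gain"
    if ratio <= loss_threshold:
        return "loss"
    return "normal"


# Hypothetical probe intensities (test, reference).
for test, ref in [(150, 100), (100, 100), (50, 100)]:
    ratio = log2_ratio(test, ref)
    print(test, ref, round(ratio, 2), call_copy_state(ratio))
```

In real samples, mosaicism or contamination with normal cells pulls the observed ratios toward zero, so thresholds are set below the theoretical values.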

Whole genome sequencing gives information about coding and non-coding sequences and copy number variations, and will replace exome+
