
FINAL PROJECT – Key dates

31.12 – Last day to decide on a project *

11-10/1 – Presenting a proposed project in small groups
A very short presentation (max 5 minutes):
• Title
• Background
• Main question
• Major tools you are planning to use to answer the questions

1.3 – Final submission


Gene Expression Analysis


Studying Gene Expression 1987-2010

• Spotted microarray
• One channel microarray
• RNA profiling – Next Generation Sequencing


Applications

• Identify gene function
  – Similar expression can suggest similar function

• Find tissue/developmental specific genes
  – Different expression in different cells/tissues

• Find genes affected by different conditions
  – Different expression under different conditions

• Diagnostics
  – Different gene expression can indicate a disease state


Different types of microarray technologies

1. Spotted Microarray
   Two channel cDNA microarrays

2. DNA Chips
   One channel microarrays (Affymetrix, Agilent)


Microarray Experiment

http://www.bio.davidson.edu/Courses/genomics/chip/chip.html


Experimental Protocol: Two channel cDNA arrays

1. Design an experiment (probe design)
2. Extract RNA molecules from cell
3. Label molecules with fluorescent dye
4. Pour solution onto microarray
   – Then wash off excess molecules
5. Shine laser light onto array
   – Scan for presence of fluorescent dye
6. Analyze the microarray image


Analyzing Microarray Images

[Figure: original microarray image. Labels: one gene or mRNA; one tissue or condition]

Transforming raw data to ratio of expression

Channels: Cy3 (control sample), Cy5 (experiment sample)

Expression ratio: Cy5 / Cy3
Log ratio: $\log_2(\mathrm{Cy5}/\mathrm{Cy3})$

The ratio of expression is indicated by the intensity of the color:
Red = high mRNA abundance in the experiment sample
Green = high mRNA abundance in the control sample
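The log ratio on this slide can be computed per spot as in the minimal sketch below. The spot names and intensity values are hypothetical, and the sketch assumes the intensities are already background-corrected and positive.

```python
import math


def log2_ratio(cy5, cy3):
    """Return log2(Cy5 / Cy3) for one spot; both intensities must be positive."""
    if cy5 <= 0 or cy3 <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(cy5 / cy3)


# Hypothetical intensities per spot: (Cy5 = experiment, Cy3 = control).
spots = {
    "geneA": (800.0, 200.0),  # more abundant in the experiment sample (red)
    "geneB": (150.0, 600.0),  # more abundant in the control sample (green)
    "geneC": (500.0, 500.0),  # equal abundance
}

ratios = {name: log2_ratio(cy5, cy3) for name, (cy5, cy3) in spots.items()}
for name, value in ratios.items():
    print(name, round(value, 3))
```

Taking log2 makes up- and down-regulation symmetric: a four-fold increase gives +2 and a four-fold decrease gives -2.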



Expression Data Format

Rows: genes / mRNAs. Columns: conditions.

         cold     normal   hot
uch1    -2.0      0.0      0.924
gut2     0.398    0.402   -1.329
fip1     0.225    0.225   -2.151
msh1     0.676    0.685   -0.564
vma2     0.41     0.414   -1.285
meu26    0.353    0.286   -1.503
git8     0.47     0.47    -1.088
sec7b    0.39     0.395   -1.358
apn1     0.681    0.636   -0.555
wos2     0.902    0.904   -0.149
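The expression matrix on this slide (genes as rows, conditions as columns) could be read into a simple structure as sketched below. The values are the ones shown on the slide; the `gene` header word and the function name `parse_expression_table` are assumptions added for illustration.

```python
# Expression values copied from the slide; the "gene" column header is added here.
TABLE = """\
gene cold normal hot
uch1 -2.0 0.0 0.924
gut2 0.398 0.402 -1.329
fip1 0.225 0.225 -2.151
msh1 0.676 0.685 -0.564
vma2 0.41 0.414 -1.285
meu26 0.353 0.286 -1.503
git8 0.47 0.47 -1.088
sec7b 0.39 0.395 -1.358
apn1 0.681 0.636 -0.555
wos2 0.902 0.904 -0.149
"""


def parse_expression_table(text):
    """Return (conditions, profiles), where profiles maps gene name to a list of floats."""
    rows = [line.split() for line in text.strip().splitlines()]
    conditions = rows[0][1:]  # skip the gene-name column header
    profiles = {row[0]: [float(v) for v in row[1:]] for row in rows[1:]}
    return conditions, profiles


conditions, profiles = parse_expression_table(TABLE)
print(conditions)
print(profiles["uch1"])
```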


One channel DNA chips

• Each sequence is represented by a probe set
• 1 probe set = N probes (Affymetrix: 16 probes of length 25 mer)
• Unknown sequence or mixture (target) labeled with one fluorescent dye
• Target hybridizes to complementary probes only
• The fluorescence intensity is indicative of the expression of the target sequence


Affymetrix Chip


Pros and cons

• Spotted arrays
  o Longer probes (~70), more stable reactions
  o Easy to make in the lab (by reverse transcription)
  o Highly specific

• DNA chips
  o More sensitive (higher density)
  o More coverage
  o Enable more flexible designs (e.g. differentially measuring splice variants)


Designing probes for microarray experiments

• Probe on DNA chip is shorter than target
  – Choice of which section to hybridize

• Select a region which is unstructured
  – RNA folding, DNA stem-and-loop

• Choose region which is target-specific
  – Avoid cross-hybridization with other DNA

• Avoid regions containing variation
  – Minimize presence of mutation sites


Probe Design
Two main factors to optimize

• Sensitivity
  – Strength of interaction with target sequence
  – Requires knowledge of target only

• Specificity
  – Weakness of interaction with other sequences
  – Requires knowledge of ‘background’
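As a rough illustration of why specificity needs the background, the sketch below keeps only those candidate probes from a target that differ from every background window in at least a chosen number of positions. This is a crude proxy, not a real probe-design method: it ignores strand complementarity and hybridization thermodynamics, and the sequences, the function name `specific_probes` and the mismatch threshold are all hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def windows(seq, k):
    """All length-k substrings of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def specific_probes(target, background, k, min_mismatches):
    """Return k-mers of target that differ from every background k-mer
    in at least min_mismatches positions (a crude specificity filter)."""
    bg_windows = {w for seq in background for w in windows(seq, k)}
    return [
        w for w in windows(target, k)
        if all(hamming(w, b) >= min_mismatches for b in bg_windows)
    ]


# Hypothetical target and background sequences.
target = "AAAACCCCGGGG"
background = ["AAAACC", "TTTTTT"]
probes = specific_probes(target, background, k=6, min_mismatches=3)
print(probes)
```

Candidates taken from the start of the target are rejected because they are too close to the background sequence `AAAACC`; the filter needs the background, while picking windows from the target alone does not.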


Sources of Inaccuracy

• Some sequences bind better than others
  – A–T versus G–C

• Low complexity sequences
  – Cross-hybridization

• Effects of experimental conditions
  – Temperature
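The A–T versus G–C effect can be made concrete with the Wallace rule, a rough melting-temperature estimate for short oligonucleotides: Tm ≈ 2·(A+T) + 4·(G+C) in °C. The sketch below is only an approximation for illustration; the two probe sequences are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Wallace rule: Tm ~ 2*(A+T) + 4*(G+C) degrees C (rough, short oligos only)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Two hypothetical 16-mer probes of equal length but different composition.
at_rich = "ATATTAATATTAATAT"
gc_rich = "GCGGCCGCGGCCGCGC"

for probe in (at_rich, gc_rich):
    print(probe, gc_fraction(probe), wallace_tm(probe))
```

Probes of the same length can therefore bind with quite different strength at a single hybridization temperature, which is one reason raw intensities are not directly comparable between probes.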


Splicing Specific Microarrays

[Figure: pre-mRNA and mRNA; total transcript level]


Microarray Analysis

• Unsupervised
  – Partition methods
      K-means
      SOM (Self Organizing Maps)
  – Hierarchical clustering

• Supervised methods
  – Analysis of variance
  – Discriminant analysis
  – Support Vector Machine (SVM)
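As an example of a partition method, here is a minimal K-means sketch on a few expression profiles (cold, normal, hot) taken from the expression data format slide. The deterministic initialization from the first k points and the fixed iteration count are simplifying assumptions; practical implementations use random restarts.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain K-means. Centroids start at the first k points (an assumption
    made here to keep the example deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


genes = ["uch1", "gut2", "fip1", "msh1", "wos2"]
profiles = [
    [-2.0, 0.0, 0.924],
    [0.398, 0.402, -1.329],
    [0.225, 0.225, -2.151],
    [0.676, 0.685, -0.564],
    [0.902, 0.904, -0.149],
]

labels, centroids = kmeans(profiles, k=2)
print(dict(zip(genes, labels)))
```

With k = 2, uch1 ends up in its own cluster because it is the only gene that is down in cold and up in hot.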


Clustering

• Grouping genes together according to their expression profiles.
• Hierarchical clustering (Michael Eisen, 1998):
  Generate a tree based on similarity (similar to a phylogenetic tree)
  – Each gene is a leaf on the tree
  – Distances reflect similarity of expression
  – Internal nodes represent functional groups
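The tree-building idea can be sketched as agglomerative clustering: start with each gene as a leaf and repeatedly merge the two closest clusters. The sketch below assumes distance = 1 − Pearson correlation with average linkage, a common choice for expression data; the four gene profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two profiles (undefined for constant profiles)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def hierarchical_cluster(profiles):
    """Average-linkage agglomerative clustering; returns the tree as nested tuples."""
    # Each cluster is (subtree, list of member gene names).
    clusters = [(name, [name]) for name in profiles]

    def dist(a, b):
        pairs = [(g, h) for g in a[1] for h in b[1]]
        return sum(1 - pearson(profiles[g], profiles[h]) for g, h in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them into one internal node.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical profiles: g1/g2 rise across conditions, g3/g4 fall.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [8.0, 6.0, 4.5, 2.0],
}

tree = hierarchical_cluster(profiles)
print(tree)
```

Genes with the same trend are joined first, even when their absolute levels differ, because correlation compares the shape of the profile rather than its magnitude.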


Results of Clustering Gene Expression

Limitations:
– Hierarchical clustering in general is not robust
– Genes may belong to more than one cluster

Clustering

Genes are clustered according to similar expression patterns

Self Organizing Maps


What can we learn from clusters with similar gene expression?

• Similar expression between genes
  – One gene controls the other in a pathway
  – Both genes are controlled by another
  – Both genes are required at the same time in the cell cycle
  – Both genes have similar function

• Clusters can help identify regulatory motifs
  – Search for motifs in upstream promoter regions of all the genes in a cluster


Finding Regulatory Motifs Within Expression Clusters

[Figure: normalized expression data from microarrays (Experiment 1, Experiment 2, Experiment 3)]

Search promoter regions for shared sequence motifs.


EXAMPLE

hnRNPA1 and SRp40 have a similar gene expression pattern in different tissues

Are they regulated by the same transcription factor?

[Figure: hnRNPA1 promoter and SRp40 promoter, with a common motif marked]

1. Extract their promoter regions

2. Find a common motif in both sequences (MEME)

3. Identify the transcription factor related to the motif
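Step 2 can be illustrated with a much simpler stand-in for MEME: listing the exact k-mers that two promoter sequences share. MEME itself fits a probabilistic motif model and tolerates mismatches, so this is only a sketch of the idea; the two promoter sequences and the function name `shared_kmers` are hypothetical.

```python
def shared_kmers(seq_a, seq_b, k):
    """Return the sorted list of length-k substrings present in both sequences."""
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return sorted(kmers_a & kmers_b)


# Hypothetical promoter fragments (not the real hnRNPA1 / SRp40 promoters).
promoter_a = "TTGACGTCATCCAGT"
promoter_b = "GGCATGACGTCAAAC"

candidates = shared_kmers(promoter_a, promoter_b, k=8)
print(candidates)
```

A shared k-mer found this way would then be compared against known transcription factor binding sites (step 3).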


How can we use microarrays for diagnostics?


How can microarrays be used as a basis for diagnostics?

        patient 1   patient 2   patient 3   patient 4   patient 5
Gen1        +           -           -           +           +
Gen2        +           +           -           +           -
Gen3        -           +           +           +           -
Gen4        +           +           +           -           -
Gen5        -           -           +           -           +

Informative Genes

Differentially expressed in the two classes.

Goal: identifying (statistically significant) informative genes
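One common way to score how differently a gene is expressed in two classes is a two-sample t-statistic. The sketch below uses Welch's t on hypothetical log ratios; the gene names and values are made up, and a real analysis would also compute p-values and correct for testing thousands of genes at once.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples (unequal variances allowed)."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


# Hypothetical log ratios per gene: (class 1 patients, class 2 patients).
expression = {
    "geneX": ([2.1, 1.9, 2.3], [-1.8, -2.2, -2.0]),  # clearly different between classes
    "geneY": ([0.1, -0.2, 0.3], [0.0, 0.2, -0.1]),   # about the same in both classes
}

scores = {gene: welch_t(a, b) for gene, (a, b) in expression.items()}
# Rank genes by absolute t: the larger, the more informative.
ranked = sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
print(ranked)
```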


How can microarrays be used as a basis for diagnostics?

        patient 1   patient 2   patient 4   patient 3   patient 5
Gen1        +           -           +           -           +
Gen3        -           +           +           +           -
Gen4        +           +           -           +           -
Gen2        +           +           +           -           -
Gen5        -           -           -           +           +

Informative Genes


Specific Examples

Cancer Research

Ramaswamy et al., 2003, Nat Genet 33:49-54

Hundreds of genes that differentiate between cancer tissues in different stages of the tumor were found. The arrow shows an example of tumor cells which were not detected correctly by histological or other clinical parameters.


Supervised approaches for predicting gene function based on microarray data

• An SVM would begin with a set of genes that have a common function (red dots). In addition, a separate set of genes that are known not to be members of the functional class (blue dots) is specified.


• Using this training set, an SVM would learn to discriminate between the members and non-members of a given functional class based on expression data.

• Having learned the expression features of the class, the SVM could recognize new genes as members or as non-members of the class based on their expression data.


Using SVMs to diagnose tumors based on expression data

Each dot represents a vector of the expression pattern taken from a microarray experiment, for example the expression pattern of all genes from a cancer patient.


How do SVMs work with expression data?

In this example red dots can be primary tumors and blue dots are from the metastasis stage. The SVM is trained on data which was classified based on histology.

After training the SVM we can use it to diagnose the unknown tumor.
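The train-then-diagnose flow can be sketched with a small linear SVM written from scratch. The sketch minimizes the regularized hinge loss by batch subgradient descent; the two-gene expression vectors, the labels (+1 = primary tumor, -1 = metastasis) and the hyperparameters are all hypothetical, and a real analysis would use an established SVM library on thousands of genes.

```python
def train_linear_svm(samples, labels, lam=0.01, epochs=200, lr=0.1):
    """Minimize lam/2 * |w|^2 + mean hinge loss by batch subgradient descent.
    Labels must be +1 or -1. Returns the weight vector w and bias b."""
    n = len(samples)
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regularization term
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # sample is inside the margin or misclassified
                for i, xi in enumerate(x):
                    grad_w[i] -= y * xi / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical training set: log ratios of two genes per tumor, labeled by histology.
tumors = [[2.0, -1.0], [1.5, -0.5], [2.5, -1.5],   # primary tumors (+1)
          [-2.0, 1.0], [-1.5, 1.5], [-2.5, 0.5]]   # metastases (-1)
histology = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(tumors, histology)
unknown = [1.8, -0.8]  # expression vector of the unknown tumor
diagnosis = predict(w, b, unknown)
print("primary" if diagnosis == 1 else "metastasis")
```

The classifier never sees the histology of the unknown tumor; it only places its expression vector on one side of the learned boundary.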


Gene Expression Databases and Resources on the Web

• GEO (Gene Expression Omnibus)
  – http://www.ncbi.nlm.nih.gov/geo/

• List of gene expression web resources
  – http://industry.ebi.ac.uk/~alan/MicroArray/

• Another list with literature references
  – http://www.gene-chips.com/

• Cancer Genome Anatomy Project
  – http://cgap.nci.nih.gov/

• Stanford Microarray Database
  – http://genome-www.stanford.edu/microarray/