Patrick X. Zhao, Ph. D. The Zhao Bioinformatics Lab [email protected] Mapping Affymetrix Medicago GeneChip Probe sets to IMGAG 3.0 Genes


Page 1: Mapping Affymetrix Medicago GeneChip Probe sets to IMGAG 3.0 Genes

Patrick X. Zhao, Ph. D.

The Zhao Bioinformatics Lab

[email protected]

Page 2: Agenda

• About Affymetrix Medicago GeneChip

• Mapping Approach

• Bioinformatics & Data Resources for Medicago IMGAG Release V3

Page 3: Affymetrix GeneChip Probes

[Figure: an mRNA (5’ UTR, EXON-I, EXON-II, EXON-III, 3’ UTR) with a target sequence covered by a probeset of 11 probes. Each probe is a 25-mer (positions 1, 5, 10, 15, 20, 25 marked) shown as a pair: Perfect match (PM) and Mismatch (MM).]

Page 4: Probeset Types

• id_at: Designates probe sets that uniquely recognize target transcripts.

• id_a_at: Designates probe sets that recognize alternative transcripts from the same gene.

• id_s_at: Designates probe sets with common probes among multiple transcripts from different genes.

• id_x_at: Designates probe sets where it was not possible to select either a unique probe set or a probe set with identical probes among multiple transcripts. Rules for cross-hybridization were dropped in order to design the _x probe sets. These probe sets share some probes identically with two or more sequences and, therefore, may cross-hybridize in an unpredictable manner.

Source: GeneChip® Expression Analysis Data Analysis Fundamentals.
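The suffix conventions on this slide can be expressed as a small lookup. This is a minimal sketch, assuming probe set IDs end in `_at`, `_a_at`, `_s_at`, or `_x_at` as listed; the function name `probeset_type` and the returned labels are illustrative choices, not Affymetrix terminology.

```python
# Labels are illustrative assumptions; only the suffixes come from the slide.
SUFFIX_TYPES = {
    "_a_at": "alternative",        # alternative transcripts from the same gene
    "_s_at": "shared",             # common probes among transcripts of different genes
    "_x_at": "cross-hybridizing",  # cross-hybridization rules were dropped
}


def probeset_type(probeset_id):
    """Classify a probe set ID by its suffix."""
    # Check the longer suffixes first, since every ID also ends in "_at".
    for suffix, label in SUFFIX_TYPES.items():
        if probeset_id.endswith(suffix):
            return label
    if probeset_id.endswith("_at"):
        return "unique"
    return "unknown"


if __name__ == "__main__":
    for pid in ("Mtr.10097.1.S1_at", "Mtr.10267.1.S1_a_at",
                "Mtr.10146.1.S1_s_at", "Mtr.10093.1.S1_x_at"):
        print(pid, probeset_type(pid))
```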

Page 5: About Medicago GeneChip

Type                                          Num of probe sets   Percent in the Mtr. set   Notes
Unique, e.g. Mtr.10097.1.S1_at                44182               86.80                     Unique to one gene
Alternative (_a_), e.g. Mtr.10267.1.S1_a_at   116                 0.23                      Alternative probe sets to one gene
Shared (_s_), e.g. Mtr.10146.1.S1_s_at        4793                9.42                      Common to multiple genes
Others (_x_), e.g. Mtr.10093.1.S1_x_at        1809                3.55                      Other probe sets with complicated mapping
Total                                         50900               100
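The percentage column follows directly from the counts. This is a minimal sketch that recomputes each type's share of the 50900 Mtr. probe sets; the dictionary keys and the function name `percent_shares` are illustrative. With these counts the alternative (`_a_at`) type works out to about 0.23% of the set.

```python
# Counts are taken from the table; the key names are illustrative.
COUNTS = {
    "unique (_at)": 44182,
    "alternative (_a_at)": 116,
    "shared (_s_at)": 4793,
    "other (_x_at)": 1809,
}


def percent_shares(counts):
    """Return each category's share of the total, in percent, rounded to 2 places."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 2) for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in percent_shares(COUNTS).items():
        print(f"{name:22s} {pct:6.2f}")
```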

Page 6: Statistics on Original Medicago GeneChip Probe-sets vs. Gene Index V8 Mapping

Num of ESTs   Matching Probeset   Percent
6315          0                   17.12
29038         1                   78.74
1525          >=2                 4.14
36878         Total               100

Page 7: Mapping Approach

• IMGAG V3 gene sequences were matched to the corresponding Affymetrix probe sets using a position-weighted scoring index in which mismatches near the middle of a probe are penalized most heavily. The weights for positions 1 to 25 of a probe are:

[1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,2,2,2,2,2,1,1,1,1,1]

Originated from Affymetrix, Inc.

• A perfect match for a probe yields a score of 45 (the sum of the 25 weights).

• A match was declared when at least 8 of the 11 probes in a probe set had scores of 43 or higher.
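The scoring rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the lab's actual pipeline: a probe's score is taken to be the sum of the weights at positions where probe and target agree, probes are assumed to be given in the same orientation as the gene sequence, and the gene is scanned by brute force rather than with an alignment tool. The names `probe_score` and `probeset_matches` are hypothetical.

```python
# Position weights from the slide: the middle of the 25-mer counts most.
WEIGHTS = [1] * 5 + [2] * 5 + [3] * 5 + [2] * 5 + [1] * 5  # sums to 45


def probe_score(probe, target):
    """Sum the weights at positions where probe and target agree (assumed rule)."""
    if len(probe) != len(WEIGHTS) or len(target) != len(WEIGHTS):
        raise ValueError("probe and target must both be 25 bases long")
    return sum(w for w, p, t in zip(WEIGHTS, probe, target) if p == t)


def best_probe_score(probe, gene_seq):
    """Best score of a probe over every 25-base window of the gene sequence."""
    n = len(WEIGHTS)
    windows = (gene_seq[i:i + n] for i in range(len(gene_seq) - n + 1))
    return max((probe_score(probe, w) for w in windows), default=0)


def probeset_matches(probes, gene_seq, min_score=43, min_probes=8):
    """Declare a match when enough probes reach the score threshold."""
    hits = sum(1 for probe in probes
               if best_probe_score(probe, gene_seq) >= min_score)
    return hits >= min_probes
```

Under this assumed rule, a threshold of 43 tolerates a loss of at most 2 points per probe: one mismatch in either of the two outer weight zones, or two mismatches in the outermost zones, but no mismatch in the central five positions.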

Page 8: Statistics on Our Probesets vs. Gene Index V8 Mapping Results

Num of ESTs   Matching probe-set   Percent
3304          0                    8.96
29535         1                    80.09
4039          >=2                  10.95
36878         Total                100

[Figure: bar chart comparing the original mapping ("Origin") with ours ("Ours") for the categories 0 probesets, 1 probeset, and 2 probesets; axis ticks from 0 to 90.]

Overlap between our probeset vs. Unigene mapping and the original Affymetrix probeset vs. Unigene mapping: 37872 ∩ 32108 = 32106. Our method covered 32106/32108 = 99.994% of the original Affymetrix mapping.
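The overlap figure is a set intersection over (probe set, EST) pairs. This is a minimal sketch of that calculation, assuming each mapping is held as a set of pairs; the function name `coverage` and the toy IDs are hypothetical. Applied to the counts on this slide, 32106 shared pairs out of 32108 reference pairs is about 99.994%.

```python
def coverage(ours, reference):
    """Return (number of shared pairs, percent of the reference mapping recovered)."""
    shared = ours & reference
    return len(shared), 100.0 * len(shared) / len(reference)


# Toy data with hypothetical IDs, only to show the shape of the inputs.
reference = {("probeset_1", "est_A"), ("probeset_2", "est_B"), ("probeset_3", "est_C")}
ours = {("probeset_1", "est_A"), ("probeset_2", "est_B"), ("probeset_4", "est_D")}
n_shared, pct = coverage(ours, reference)

# The same ratio using the pair counts reported on the slide.
slide_pct = 100.0 * 32106 / 32108

if __name__ == "__main__":
    print(n_shared, round(pct, 2))
    print(round(slide_pct, 4))
```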

Page 9: Statistics on Our IMGAG V3 vs. Probesets Mapping Results

Num of cDNA   Matching probe_set   Percent
29755         0                    55.70
16384         1                    30.67
7284          >=2                  13.63
53423         Total                100
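A table of this shape (sequences with 0, 1, or >=2 matching probe sets) can be tallied from a list of mapping pairs. This is a minimal sketch, assuming the mapping is available as (gene, probe set) pairs and that the full list of gene IDs is known; `tally_by_match_count` and the toy IDs are hypothetical.

```python
from collections import Counter


def tally_by_match_count(gene_ids, pairs):
    """Bin genes by how many distinct probe sets map to them: 0, 1, or >=2."""
    if not gene_ids:
        raise ValueError("gene_ids must not be empty")
    # set(pairs) drops duplicate (gene, probe set) pairs before counting.
    per_gene = Counter(gene for gene, _probeset in set(pairs))
    bins = {"0": 0, "1": 0, ">=2": 0}
    for gene in gene_ids:
        n = per_gene.get(gene, 0)
        bins["0" if n == 0 else "1" if n == 1 else ">=2"] += 1
    total = len(gene_ids)
    return {label: (count, round(100.0 * count / total, 2))
            for label, count in bins.items()}


if __name__ == "__main__":
    genes = ["g1", "g2", "g3", "g4"]
    pairs = [("g1", "p1"), ("g2", "p2"), ("g2", "p3"), ("g1", "p1")]
    print(tally_by_match_count(genes, pairs))
```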

Page 10: Probesets Map to IMGAG V3 and/or Gene Index V9

Item    Num of probesets   Matched To                     Percent
1       4860               None                           9.55
2       19709              Unigene only                   38.72
3       19949              Unigene and unique IMGAGv3     39.19
        4399               Unigene and multiple IMGAGv3   8.64+
        24348              (item 3 subtotal)              47.83
4       1698               Unique IMGAGv3 only            3.34
        285                Multiple IMGAGv3 only          0.56++
        1983               (item 4 subtotal)              3.90
Total   50900                                             100

[Figure: share of probe sets by mapping target: EST only 38.72, both EST and IMGAG (47.83), IMGAG only 3.90, neither 9.55.]
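The row labels in this table amount to a decision on two counts per probe set. This is a minimal sketch, assuming that for each probe set we know how many Unigene entries and how many IMGAGv3 genes it matched; the function name `mapping_category` and its parameters are hypothetical.

```python
def mapping_category(n_unigene, n_imgag):
    """Assign a probe set to one of the table's 'Matched To' categories."""
    if n_unigene == 0 and n_imgag == 0:
        return "None"
    if n_imgag == 0:
        return "Unigene only"
    multiplicity = "unique" if n_imgag == 1 else "multiple"
    if n_unigene > 0:
        return f"Unigene and {multiplicity} IMGAGv3"
    return f"{multiplicity.capitalize()} IMGAGv3 only"


if __name__ == "__main__":
    for counts in [(0, 0), (1, 0), (1, 1), (2, 3), (0, 1), (0, 2)]:
        print(counts, mapping_category(*counts))
```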

Page 11: Medicago Data and Bioinformatics Resources

• http://bioinfo3.noble.org/medicago

Page 12: Acknowledgement

Zhao Lab: Xinbin Dai, Rakesh Kaundal, Haiquan Li, Jun Li, Zhaohong Zhuang, Joshua Smith

Collaborators: Michael K. Udvardi, Rick A. Dixon, Kiran K. Mysore, Rujin Chen, Chris Town (JCVI), Greg D. May (NCGR), … …