Using Exons to Define Isoforms in PRO
Timothy Danford
Novartis Institutes for Biomedical Research
PRO / AlzForum Kickoff Meeting
Oct. 4, 2011
Genes vs. Proteins
• Gene
• Transcript
• Exon
• Locus
• Allele
• Variant
  – SNP
  – Indel
  – Rearrangement
• Motif

• Protein
• Isoform
• Variant
• Domain
• Site
• Complex
• Motif
• Fragment
Can we join the worlds of PRO and of genes at a finer-grained level than that of “full sequence”?
Isoforms in PRO Today
PRO v23 (10/2/2011)
[Term]
id: PR:000010173
name: microtubule-associated protein tau
def: "A protein that is a translation product of the MAPT gene or a 1:1 ortholog thereof." [PRO:DNx]
comment: Category=gene. Flag=automatic.
synonym: "MAPT" EXACT PRO-short-label []
synonym: "neurofibrillary tangle protein" EXACT []
synonym: "paired helical filament-tau" EXACT []
synonym: "PHF-tau" EXACT []
synonym: "MAPTL" RELATED []
synonym: "Mtapt" RELATED []
synonym: "MTBT1" RELATED []
synonym: "TAU" RELATED []
is_a: PR:000000001 ! protein
Isoforms in PRO Today
PRO v23 (10/2/2011)
[Term]
id: PR:000026993
name: microtubule-associated protein tau isoform Fetal-tau
def: "A microtubule-associated protein tau that is a translation product of some mRNA giving rise to a protein with the amino acid sequence represented by UniProtKB:P10636-2 or a 1:1 ortholog thereof." [PRO:DAN]
comment: Category=sequence.
synonym: "Fetal-tau" EXACT []
is_a: PR:000010173 ! microtubule-associated protein tau
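The isoform stanza above links to its gene-level parent only through the is_a line. As a minimal sketch of how that link can be read programmatically, the Python below parses one [Term] stanza into a tag-to-values mapping. The function name parse_term is hypothetical, and the parser handles only the simple "tag: value" lines shown on these slides, not the full OBO flat-file syntax.

```python
def parse_term(stanza: str) -> dict:
    """Parse one OBO [Term] stanza into {tag: [values]} (simplified sketch)."""
    term = {}
    for line in stanza.strip().splitlines():
        line = line.strip()
        if not line or line == "[Term]":
            continue
        # Split on the first ": " only, so values such as "1:1" stay intact.
        tag, _, value = line.partition(": ")
        if tag == "is_a":
            # Drop the trailing "! human-readable name" comment.
            value = value.split(" ! ")[0].strip()
        term.setdefault(tag, []).append(value)
    return term


# The Fetal-tau stanza from PRO v23, as shown on the slide.
FETAL_TAU = """
[Term]
id: PR:000026993
name: microtubule-associated protein tau isoform Fetal-tau
def: "A microtubule-associated protein tau that is a translation product of some mRNA giving rise to a protein with the amino acid sequence represented by UniProtKB:P10636-2 or a 1:1 ortholog thereof." [PRO:DAN]
comment: Category=sequence.
synonym: "Fetal-tau" EXACT []
is_a: PR:000010173 ! microtubule-associated protein tau
"""

term = parse_term(FETAL_TAU)
parent = term["is_a"][0]  # identifier of the gene-level tau term
```

Note that nothing in the stanza says which exons the isoform contains; the only sequence-level information is the UniProtKB accession embedded in the free-text definition.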
Digression: Visual Notation
Tau Isoforms Share Functionally-relevant Exons
Fetal Tau
Adult Tau
Slide: Gwen Wong (AlzForum), Image: http://www.med.upenn.edu/cndr/TauSynuclein.shtml
“Conserved Protein Domains in Tau Suggest Functional Differences between Protein Isoforms”
What Questions Could We Ask of PRO + Genomic Data?
• Which isoform corresponds to which transcript(s)?
• Which isoforms share a common feature?
  – common exons?
  – common domains? (pfam, interpro, etc.)
• Which “normal” protein isoforms overlap with SNPs or other genetic variants?
  – How do protein sites line up to sites on the gene?
• How do mouse and human proteins correspond?
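To make the question of lining up protein sites with gene sites concrete, here is a rough Python sketch that maps a protein residue number to the three genomic positions of its codon. Everything in it is an assumption for illustration: the function name residue_to_genomic, the plus-strand orientation, and the exon coordinates are hypothetical and are not taken from MAPT.

```python
def residue_to_genomic(residue, coding_exons):
    """Return the genomic positions of the codon for a 1-based residue number.

    coding_exons: (start, end) genomic intervals, 1-based inclusive, on the
    plus strand, listed in transcript order and covering only coding sequence.
    """
    # 0-based offsets of the codon's three nucleotides within the CDS.
    wanted = [3 * (residue - 1) + i for i in range(3)]
    positions = []
    offset = 0  # CDS offset of the first nucleotide of the current exon
    for start, end in coding_exons:
        length = end - start + 1
        for w in wanted:
            if offset <= w < offset + length:
                positions.append(start + (w - offset))
        offset += length
    if len(positions) != 3:
        raise ValueError("residue lies outside the coding sequence")
    return positions


# Hypothetical coordinates: a 10-nt coding exon followed by an 11-nt one.
EXONS = [(100, 109), (200, 210)]

codon_4 = residue_to_genomic(4, EXONS)   # codon split across the two exons
snp_position = 200                       # hypothetical SNP coordinate
snp_hits_residue_4 = snp_position in codon_4
```

With exon coordinates attached to an isoform, asking whether a SNP overlaps a protein site reduces to a membership test like the last line.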
“Which isoform corresponds with which transcript(s)?”
Transcript Variant: This variant (4) lacks six internal coding exons, as compared to variant 6. The reading frame is not affected, and the resulting isoform (4) has identical N- and C-termini but lacks five segments, as compared to isoform 6.
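The annotation above makes two checkable claims: skipping the exons leaves the reading frame intact, and six missing exons show up as five missing protein segments. The Python sketch below illustrates both under stated assumptions. The exon numbers and lengths are hypothetical, not the real MAPT values, and grouping adjacent skipped exons into one segment is only one plausible reading of "five segments".

```python
def skipped_runs(skipped_exon_numbers):
    """Group skipped exon numbers into runs of adjacent exons."""
    runs = []
    for n in sorted(skipped_exon_numbers):
        if runs and n == runs[-1][-1] + 1:
            runs[-1].append(n)   # adjacent to the previous skipped exon
        else:
            runs.append([n])     # starts a new missing segment
    return runs


def preserves_frame(skipped_exon_numbers, exon_lengths):
    """True if every run of skipped coding exons removes a multiple of 3 nt.

    Checking each run (not just the grand total) ensures the retained exons
    between two skipped runs are still read in the original frame.
    """
    return all(
        sum(exon_lengths[n] for n in run) % 3 == 0
        for run in skipped_runs(skipped_exon_numbers)
    )


# Hypothetical skipped exons and their coding lengths in nucleotides.
SKIPPED = [2, 3, 5, 7, 9, 11]
LENGTHS = {2: 31, 3: 44, 5: 60, 7: 90, 9: 120, 11: 150}

segments = skipped_runs(SKIPPED)            # exons 2 and 3 merge into one run
in_frame = preserves_frame(SKIPPED, LENGTHS)
```

In this toy data exons 2 and 3 are individually not multiples of three, but together they remove 75 nucleotides, so the frame survives and the protein simply lacks one contiguous segment there.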
Define Exons as Parts-of-Proteins
Define classes of isoforms by has_part and lacks_part relations to particular exons
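A small Python sketch of this idea follows: an isoform class is a pair of exon sets, and an isoform belongs to the class if it contains every has_part exon and none of the lacks_part exons. The IsoformClass type, the isoform names, and the exon assignments are all hypothetical and are not drawn from PRO or from the real tau isoforms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IsoformClass:
    """A class of isoforms defined by exons they must have and must lack."""
    name: str
    has_part: frozenset = frozenset()
    lacks_part: frozenset = frozenset()

    def matches(self, exons_present):
        exons = set(exons_present)
        # All required exons are present, and no excluded exon is present.
        return self.has_part <= exons and not (self.lacks_part & exons)


# Hypothetical isoforms and the exon-derived parts each one contains.
ISOFORMS = {
    "isoform_A": {"exon1", "exon2", "exon3", "exon4"},
    "isoform_B": {"exon1", "exon3", "exon4"},
    "isoform_C": {"exon1", "exon4"},
}

lacks_exon2 = IsoformClass(
    name="isoform lacking exon 2",
    has_part=frozenset({"exon1"}),
    lacks_part=frozenset({"exon2"}),
)

members = sorted(
    name for name, exons in ISOFORMS.items() if lacks_exon2.matches(exons)
)
shared_by_all = set.intersection(*ISOFORMS.values())
```

The same exon sets answer the "common exons" question directly: shared_by_all holds the parts that every listed isoform has.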
Integrate Existing Isoforms
How is “MAPT Exon 2” defined?
• Take the “exon” definition from SO:0000147
  – “A region of the transcript sequence within a gene which is not removed from the primary RNA transcript by RNA splicing.”
• Exon number defined relative to the full-length or “canonical” transcript
  – “An exon that corresponds (aligns) to the second of 13 exons in the full-length MAPT transcript...”
• Define the part of the protein derived from this portion of the transcript…
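The last step, going from a portion of the transcript to a part of the protein, can be sketched as simple coordinate arithmetic. The Python below assumes each exon's coding length is known and the exons are listed in transcript order; the function name exon_residue_ranges and the lengths used are hypothetical. A codon split across an exon boundary is attributed to both exons, which is one possible convention rather than a settled definition.

```python
def exon_residue_ranges(coding_lengths):
    """Return the 1-based (first, last) residue range touched by each exon.

    coding_lengths: positive coding lengths in nucleotides, one per exon,
    in transcript order, starting at the first base of the start codon.
    """
    ranges = []
    start_nt = 0  # 0-based CDS offset where the current exon begins
    for length in coding_lengths:
        end_nt = start_nt + length           # exclusive end offset
        first_res = start_nt // 3 + 1        # residue holding the first base
        last_res = (end_nt - 1) // 3 + 1     # residue holding the last base
        ranges.append((first_res, last_res))
        start_nt = end_nt
    return ranges


# Hypothetical coding lengths for three consecutive exons.
ranges = exon_residue_ranges([10, 11, 9])
second_exon_part = ranges[1]  # residues derived (at least partly) from exon 2
```

Here residue 4 appears in both the first and second ranges because its codon straddles the boundary; a stricter definition of "part derived from exon 2" would have to say how such residues are assigned.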