Tackling Analytical challenges in Cancer proteogenomics using Galaxy framework. December 11, 2018. Pratik Jagtap, Galaxy-P Team, University of Minnesota. Slides for the Talk: z.umn.edu/mumbaislides

Tackling Analytical challenges in Cancer proteogenomics using …galaxyp.org/wp-content/uploads/2019/01/CancerProteogenomics_IIT… · Heydarian et al J Proteomics Bioinform. (2014)


Tackling Analytical challenges in Cancer proteogenomics using Galaxy framework

December 11, 2018

Pratik Jagtap

Galaxy-P Team

University of Minnesota

Slides for the Talk: z.umn.edu/mumbaislides

WORKSHOP STRUCTURE

• Introduction to proteogenomics and multi-omic studies

• RNASeq Data Processing: Data Analysis using Galaxy platform

• Proteomics data analysis using Galaxy

• Identification of novel proteoforms and visualization

Raw RNA-seq data → RNASeq data processing. Generation of protein sequence database.

Raw MS/MS proteomics data → Sequence database searching and peptide / protein identification

Results visualization and interpretation


MULTI-OMICS


MULTI-OMICS TECHNOLOGIES

Ruggles et al. Mol Cell Proteomics 2017;16:959-981 © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

• Next-Gen Sequencing

• RNASeq

• Mass Spectrometry

• Proteogenomics

• Proteo-transcriptomics

• Metaproteomics

• Meta-transcriptomics

• Metabolomics

LOOKING BEYOND THE KNOWN PROTEOME

Mass spectrum

Reference Protein Database from genomic annotation

Cancer / Disease related Databases such as COSMIC, IARC p53, OMIM…

Deep genome sequencing data from ICGC, TCGA and CPTAC

RNASeq data

(Customized OR Combined)

6-frame DNA sequences.

3-frame cDNA sequences.

Identification of peptides corresponding to novel proteoforms.
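The 6-frame DNA and 3-frame cDNA databases named on this slide can be illustrated with a small translation sketch. It is a minimal example under stated assumptions, not the tool used in the workshop: it uses the standard genetic code, marks stop codons with `*`, and the function names (`translate`, `three_frame`, `six_frame`) are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str, frame: int = 0) -> str:
    """Translate one reading frame; unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame(dna: str) -> list:
    """Forward frames only, as one would use for stranded cDNA."""
    return [translate(dna, frame) for frame in range(3)]


def six_frame(dna: str) -> list:
    """Three forward frames followed by three reverse-complement frames."""
    return three_frame(dna) + three_frame(reverse_complement(dna))


if __name__ == "__main__":
    for index, protein in enumerate(six_frame("ATGGCC"), start=1):
        print(index, protein)
```

A real database would additionally split each frame at stop codons and discard very short open reading frames; that filtering is left out here.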


https://doi.org/10.1007/978-1-4939-7717-8_7

Multiomics / trans-omics

GALAXY

Galaxy Instance for proteogenomics workshop: z.umn.edu/galaxypinmumbai

Users will need to register and log in to the site using a password. Step-by-step instructions for the workshop are provided in the document below (registration instructions start on page 5).

Documentation for Galaxy instance usage: z.umn.edu/mumbaidocs

GALAXY

Galaxy Instance for proteogenomics workshop: z.umn.edu/proteogenomicsgateway

Users will need to register and log in to the site using a password. Step-by-step instructions for the workshop are provided in the document below (registration instructions start on page 5).

Documentation for Galaxy instance usage: z.umn.edu/pginnov18

REGISTER

IMPORT HISTORY

INPUT DATA

DATASET FOR MULTI-OMICS ANALYSIS

Heydarian et al J Proteomics Bioinform. (2014) 17:7. pii: 1000302.

• Mouse cell culture.

• RNA-seq analysis: RNA-seq libraries were sequenced on a HiSeq 2000 (Illumina SY-401–1001) to a read depth of ~90,000,000 single end 97 bp reads per sample.

• iTRAQ-labeling and Mass Spectrometry: Samples were separated by reversed phase liquid chromatography using an Easy-nLC system (Thermo Scientific) and analyzed on an LTQ-Orbitrap Elite mass spectrometer (Thermo Scientific).


Select History 1

Import history

Start using this history

Select Workflow 1

Import workflow

Using the workflow

Run Workflow 1

INPUT

WORKFLOW

GALAXY

OUTPUT


GALAXY INTERFACE

Left (Tool) Pane

Main Viewing Pane

History Pane

WORKSHOP WORKFLOWS

Workflow #1: RNA-Seq to Variant FASTA database

Workflow #2: Database Searching Using MS/MS data

Workflow #3: Identifying Novel Variants And Visualization

OBJECTIVE OF WORKFLOW 1

Create custom variant database

Genomic coordinate information

WORKSHOP WORKFLOWS

Workflow #1: RNA-Seq to Variant FASTA database

Workflow #2: Database Searching Using MS/MS data

Workflow #3: Identifying Novel Variants And Visualization

FASTA Sequences

Genome Mapping Information

INPUT DATA

• RNA-Seq FASTQ file: Reads in FASTQ format

• GTF file: Gene Transfer Format. Tabular file to describe genes and related features

• Known protein and contaminant protein sequence FASTA file

• Mass-spectrometry (MGF) file
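As an illustration of the first input type on this slide, here is a minimal sketch of reading FASTQ records. It assumes the common four-line layout (header, sequence, `+` separator, quality string) and Phred+33 quality encoding; the names `FastqRecord`, `parse_fastq` and `mean_phred` are hypothetical, and the example reads are made up.

```python
from typing import Iterable, Iterator, NamedTuple


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    quality: str


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from four-line FASTQ text (multi-line records are not handled)."""
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    if len(rows) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")
    for i in range(0, len(rows), 4):
        header, sequence, separator, quality = rows[i:i + 4]
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        # The read identifier is the first token after '@'.
        yield FastqRecord(header[1:].split()[0], sequence, quality)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Average Phred score of a quality string (Phred+33 assumed)."""
    return sum(ord(char) - offset for char in quality) / len(quality)


if __name__ == "__main__":
    example = "@read1 sample\nACGT\n+\nIIII\n@read2\nGG\n+\n!!\n"
    for record in parse_fastq(example.splitlines()):
        print(record.read_id, record.sequence, mean_phred(record.quality))
```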

Select ‘MousePG_Input_History’

Import history

Start using this history

Select ‘MousePG_Workflow1_RNAseq_Dbcreation’

Import workflow

Using the workflow

Run Workflow 1

INPUT

WORKFLOW

GALAXY

OUTPUT

IMPORT WORKFLOW

RUNNING A WORKFLOW

SELECTING INPUT FILES TO RUN A WORKFLOW

JOB STATUS (HISTORY PANE)

• Job in queue

• Job running

• Job successful

• Job failed


WORKFLOW #1: RNA-SEQ TO VARIANT PROTEIN

SAV / In-Del Variants

Assembly Workflow

POTENTIAL NOVEL PEPTIDE IDENTIFICATIONS

[Figure: gene model drawn 5’ to 3’ with Exons 1 to 8, UTR and CDS regions, start and stop codons, and reading frames +1, +2 and +3. Alongside known peptides, it marks where novel peptides can map: expressed 5’ UTR, alternate start, alternate frame, novel exon, novel spliceform, exon extension, expressed 3’ UTR / alternate stop, intergenic / novel gene, single amino acid variant, and InDels.]

RNA-SEQ TO FASTA DATABASE CREATION

Inputs: RNA-Seq FASTQ, Genome, GTF

HISAT: Alignment tool (Mapping)

SAV / In-Del Variants

• FREEBAYES: Variant Calling

• CustomPro DB: Variant annotation, Genome mapping

Assembly Workflow

• STRINGTIE: RNA-Seq to transcripts

• GFF COMPARE: Evaluates the assembly with annotated transcripts

• Translate transcripts

Outputs: Sequence FASTA, Mapping Files


ALIGNMENT

RNASeq reads

Mapping to gene/genome

Reference gene/genome

HISAT2: Outputs BAM file (Dataset #9)

Kim D., Langmead B. and Salzberg S.L. HISAT: a fast spliced aligner with low memory requirements. Nature Methods (2015)

VARIANT CALLING

RNASeq reads

Mapping to gene/genome

Reference gene/genome

FreeBayes: Outputs VCF file (Dataset #14)

Garrison E., Marth G. Haplotype-based variant detection from short-read sequencing. (arXiv:1207.3907)
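To make the VCF output of this step more concrete, the sketch below reads the first five tab-separated VCF columns (CHROM, POS, ID, REF, ALT) and labels each alternate allele as an SNV, insertion or deletion by comparing allele lengths. It is an illustration only: the records are invented, the names `Variant`, `classify` and `parse_vcf` are hypothetical, and symbolic alleles such as `<DEL>` are not handled.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int      # 1-based position, as in the VCF POS column
    ref: str
    alt: str
    kind: str


def classify(ref: str, alt: str) -> str:
    """Label a REF/ALT pair by comparing allele lengths."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) < len(alt):
        return "insertion"
    if len(ref) > len(alt):
        return "deletion"
    return "MNV"  # same length, more than one base


def parse_vcf(text: str) -> List[Variant]:
    """Parse VCF body lines, expanding comma-separated ALT alleles."""
    variants = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines and the column header
        chrom, pos, _id, ref, alts = line.split("\t")[:5]
        for alt in alts.split(","):
            variants.append(Variant(chrom, int(pos), ref, alt, classify(ref, alt)))
    return variants


if __name__ == "__main__":
    # Invented records, for illustration only.
    demo = (
        "##fileformat=VCFv4.2\n"
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
        "chr5\t100\t.\tA\tG\t50\tPASS\t.\n"
        "chr5\t200\t.\tA\tAT,C\t40\tPASS\t.\n"
        "chr17\t300\t.\tGCA\tG\t30\tPASS\t.\n"
    )
    for variant in parse_vcf(demo):
        print(variant)
```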


VIEWING SNP VARIANT IN IGV


CustomProDB

Wang X., Zhang B. customProDB: an R package to generate customized protein databases from RNA-Seq data for proteomics search. Bioinformatics (2013)

Reference gene/genome

Translate: Original Protein (FASTA Sequence)

Translate: Variant Protein (Variant FASTA Sequence)
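The idea on this slide, translating both the reference sequence and a variant-carrying sequence to obtain an original and a variant protein, can be sketched as follows. This is not how customProDB (an R package) is implemented; it is a minimal Python illustration that assumes the standard genetic code, a made-up coding sequence, and hypothetical helper names (`apply_snv`, `translate_cds`, `amino_acid_changes`).

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in T, C, A, G order.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate a coding sequence from its first base up to the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3].upper(), "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def apply_snv(cds: str, position: int, ref: str, alt: str) -> str:
    """Substitute one base at a 0-based CDS position, checking the reference base."""
    if cds[position] != ref:
        raise ValueError(f"reference base mismatch at {position}: {cds[position]} != {ref}")
    return cds[:position] + alt + cds[position + 1:]


def amino_acid_changes(original: str, variant: str) -> list:
    """List substitutions such as 'A2V' (1-based residue numbering)."""
    return [
        f"{a}{index}{b}"
        for index, (a, b) in enumerate(zip(original, variant), start=1)
        if a != b
    ]


if __name__ == "__main__":
    reference_cds = "ATGGCCAAGTAA"          # invented example: M A K stop
    variant_cds = apply_snv(reference_cds, 4, "C", "T")
    original = translate_cds(reference_cds)
    variant = translate_cds(variant_cds)
    print(f">example_protein\n{original}")
    print(f">example_protein_{'_'.join(amino_acid_changes(original, variant))}\n{variant}")
```

Insertions and deletions need extra care because they can shift the reading frame; the sketch covers single-base substitutions only.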


TRANSCRIPT ASSEMBLY

Mapping to gene/genome

Reference gene/genome

Assembled Transcript

Splicing

3-Frame Translation

FASTA Sequence


OUTPUTS

FASTA Sequence File

>generic|ENSMUSP00000107433|Erp29|ER protein 29
MAAAAGVSGAASLSPLLSVLLGLLLLFAPHGGSGLHTKGALPLDTVTFYKSRLLLGP
>generic|ENSMUSP00000120715|Rps2|ribosomal protein S2
MADDAGAAGGPGGPGGPGLGGRGGFRGGFGSGLRGRGRGRGRGRGRGRGARGGKAEDKEWIPVTKLGRLVKDMKIKSLEEIY
LFSLPIKESEIIDFFLGASLKDEVLKIMPVQKQTRAGQR

Genomic Mapping File

ENSMUSP00000107433 chr5 121452190 121452340 - 0 150
ENSMUSP00000107433 chr5 121449139 121449163 - 150 174
ENSMUSP00000120715 chr17 24720275 24720452 + 0 177
ENSMUSP00000120715 chr17 24720533 24720731 + 177 375
ENSMUSP00000120715 chr17 24720968 24721302 + 375 709
ENSMUSP00000120715 chr17 24721622 24721727 + 709 814
ENSMUSP00000120715 chr17 24721802 24721897 + 814 909
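The genomic mapping file is what later allows a peptide to be placed on the genome. The sketch below shows one way such rows could be used, under assumptions that the slide does not state: the columns are read as accession, chromosome, genomic start, genomic end, strand, CDS start and CDS end, with 0-based half-open intervals and CDS offsets counted in nucleotides. The names `Block`, `parse_mapping` and `protein_span_to_genome` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple

# Rows copied from the slide; the column meaning is an assumption (see above).
MAPPING = """\
ENSMUSP00000107433 chr5 121452190 121452340 - 0 150
ENSMUSP00000107433 chr5 121449139 121449163 - 150 174
ENSMUSP00000120715 chr17 24720275 24720452 + 0 177
ENSMUSP00000120715 chr17 24720533 24720731 + 177 375
"""


class Block(NamedTuple):
    chrom: str
    g_start: int
    g_end: int
    strand: str
    cds_start: int
    cds_end: int


def parse_mapping(text: str) -> Dict[str, List[Block]]:
    """Group mapping rows by protein accession."""
    blocks = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        acc, chrom, g_start, g_end, strand, c_start, c_end = line.split()
        strand = strand.replace("\u2013", "-")  # tolerate a typographic dash
        blocks[acc].append(
            Block(chrom, int(g_start), int(g_end), strand, int(c_start), int(c_end))
        )
    return dict(blocks)


def protein_span_to_genome(
    blocks: List[Block], aa_start: int, aa_end: int
) -> List[Tuple[str, int, int, str]]:
    """Map a 0-based half-open residue span to genomic intervals, one per block."""
    nt_start, nt_end = aa_start * 3, aa_end * 3
    intervals = []
    for b in blocks:
        lo, hi = max(nt_start, b.cds_start), min(nt_end, b.cds_end)
        if lo >= hi:
            continue  # the span does not overlap this block
        if b.strand == "+":
            start = b.g_start + (lo - b.cds_start)
            end = b.g_start + (hi - b.cds_start)
        else:
            # On the minus strand the CDS runs from the block's genomic end.
            start = b.g_end - (hi - b.cds_start)
            end = b.g_end - (lo - b.cds_start)
        intervals.append((b.chrom, start, end, b.strand))
    return intervals


if __name__ == "__main__":
    by_protein = parse_mapping(MAPPING)
    # Residues 55..64 of the Rps2 entry span the first exon boundary.
    print(protein_span_to_genome(by_protein["ENSMUSP00000120715"], 55, 65))
```

A peptide that crosses an exon boundary yields two intervals, which is how a spliced peptide would appear in a genome browser.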


SNAPSHOT OF WHAT HISTORY LOOKS LIKE AT THIS STAGE

PROTEOMICS DATA ANALYSIS USING GALAXY

Protein FASTA: reference proteins + potential variants

Peaklist of MS/MS data

Multiple algorithms for matching MS/MS to peptides

Organization and scoring of peptide spectral matches (PSMs)

Generation of an SQLite database for downstream data visualization and filtering

Putative variant peptide sequences for further verification and analysis

Proteomics. 11:996-9

Nat Biotechnol. 33:22-4


Mass Spectrometry and Proteomics

Vaudel et al. Nature Biotechnol. 2015, 33:22–24.

Vaudel et al. J Proteome Res. 2018, doi: 10.1021/acs.jproteome.8b00175.

• Bundles multiple freely available algorithms for matching MS/MS to peptide sequences

• Infers proteins from peptide sequence matches

• Assigns confidence scores to peptide sequence matches and inferred proteins

• Provides outputs in standard formats (e.g. mzIdentML) for further processing


YOUR CURRENT HISTORY

IF NOT…

To access the input for this part of the workshop, click on “Shared Data” → “Histories” → “MousePG_History2”, and then click on Import History.


Select ‘MousePG_Workflow3_Novel_peptide_analysis’

Import workflow

Start using this workflow

Run Workflow

ACTIVE HISTORY

FROM EARLIER

WORKFLOW

WORKFLOW


WORKFLOW FOR THIS SECTION

Workflow 2

Workflow 2

Workflow 1

Workshop Documentation: z.umn.edu/galaxypinmumbai

5.2 BlastP analysis (page 32)

5.3 Novel proteoform analysis (page 33)

5.4 Using Multi-omics Visualization Platform for visualizing novel proteoforms (page 35)

BLASTP ANALYSIS

SELECT DISTINCT PSM.*
FROM PSM JOIN BLAST ON PSM.SEQUENCE = BLAST.QSEQID
WHERE BLAST.PIDENT < 100 OR BLAST.GAPOPEN >= 1 OR BLAST.LENGTH < BLAST.QLEN
ORDER BY PSM.SEQUENCE, PSM.ID
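The query on this slide keeps only PSMs whose peptide does not match a known protein perfectly: the BLAST hit has less than 100% identity, or contains a gap, or aligns over less than the full peptide length. The sketch below runs the same query with Python's built-in `sqlite3` module against a tiny in-memory database. The table layout is reduced to the columns the query uses, and all rows are invented for illustration.

```python
import sqlite3

QUERY = """
SELECT DISTINCT psm.*
FROM psm JOIN blast ON psm.sequence = blast.qseqid
WHERE blast.pident < 100 OR blast.gapopen >= 1 OR blast.length < blast.qlen
ORDER BY psm.sequence, psm.id
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with invented PSM and BLAST rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE psm (id INTEGER, sequence TEXT)")
    conn.execute(
        "CREATE TABLE blast (qseqid TEXT, pident REAL, gapopen INTEGER, "
        "length INTEGER, qlen INTEGER)"
    )
    conn.executemany(
        "INSERT INTO psm VALUES (?, ?)",
        [(1, "PEPTIDEA"), (2, "NOVELPEP"), (3, "GAPPEDPEP"), (4, "SHORTALN")],
    )
    conn.executemany(
        "INSERT INTO blast VALUES (?, ?, ?, ?, ?)",
        [
            ("PEPTIDEA", 100.0, 0, 8, 8),   # perfect full-length match: excluded
            ("NOVELPEP", 87.5, 0, 8, 8),    # mismatch: kept
            ("GAPPEDPEP", 100.0, 1, 9, 9),  # gap opened: kept
            ("SHORTALN", 100.0, 0, 6, 8),   # partial alignment: kept
        ],
    )
    return conn


def find_candidate_psms(conn: sqlite3.Connection) -> list:
    """Return PSMs that lack a perfect, ungapped, full-length BLAST hit."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in find_candidate_psms(build_demo_db()):
        print(row)
```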

MULTI-OMICS VISUALIZATION PLATFORM FOR VISUALIZING NOVEL PROTEOFORMS

SPECTRAL QUALITY VISUALIZATION (Lorikeet Viewer)

GENOMIC LOCALIZATION (Integrative Genomics Viewer)

ESSREALVEPTSESPRPALAR

NOVEL PROTEOFORM ANALYSIS

UCSC GENOME BROWSER

CDART BLAST SEARCH

PROJECT OVERVIEW


GO AND TRY IT OUT!

GALAXY INSTANCE ONE

• Galaxy Instance for proteogenomics workshop: z.umn.edu/galaxypinmumbai

Users will need to register and log in to the site using a password. Step-by-step instructions for the workshop are provided in the document below (registration instructions start on page 5).

• Documentation for Galaxy instance usage: z.umn.edu/mumbaidocs

GALAXY INSTANCE TWO (backup if GALAXY INSTANCE ONE gets busy)

• Proteogenomics Gateway: z.umn.edu/proteogenomicsgateway

Users will need to register and log in to the site using a password. Step-by-step instructions for the workshop are provided in the document below (registration instructions start on page 5).

• Documentation for Galaxy instance usage: z.umn.edu/pginnov18

WORKSHOP INSTRUCTORS AND ACKNOWLEDGEMENTS

• Instructors
    • Pratik Jagtap

• Support
    • Praveen Kumar
    • Prof. Timothy Griffin Galaxy-P team (University of Minnesota)
    • Subina Mehta
    • James Johnson and Thomas McGowan (University of Minnesota)
    • Matthew Chambers
    • Jetstream Cloud at Indiana University

• Funding

ACKNOWLEDGMENTS

Minnesota Supercomputing Institute: James Johnson, Thomas McGowan, Lee Parsons, Michael Milligan

University of Minnesota: Timothy Griffin, Pratik Jagtap, Praveen Kumar, Candace Guerrero, Subina Mehta, Adrian Hegeman (Co-I), Art Eschenlauer, Shane Hubler, Ray Sajulga, Caleb Easterly, Andrew Rajczewski

Biologists / collaborators: Laurie Parker, Joel Rudney, Maneesh Bhargava, Amy Skubitz, Chris Wendt, Brian Crooker, Steven Friedenberg, Kevin Viken, Kristin Boylan, Marnie Peterson, Somiah Afiuni, Brian Sandri, Alexa Pragman, Wanda Weber, Amy Treeful

Ira Cooke (Melbourne, Australia)

Harald Barsnes, Marc Vaudel (University of Bergen, Norway)

Bjoern Gruening, Bérénice Batut (University of Freiburg, Freiburg, Germany)

Lennart Martens (Co-I), Bart Mesuere, Robbert G Singh (VIB, UGhent, Belgium)

Judson Hervey (Naval Research Institute, Washington, D.C.)

Matt Chambers (Nashville, TN)

Alessandro Tanca (Porto Conte Ricerche, Italy)

Carolin Kolmeder (University of Helsinki, Finland)

Thilo Muth, Bernhard Renard (Robert Koch Institut)

Thomas Doak, Jeremy Fisher (Indiana University)

Josh Elias (Stanford University)

Brook Nunn (U of Washington)

Lloyd Smith (Co-I), Michael Shortreed (UW-Madison)

Karen Reddy, Mo Heydarian (Johns Hopkins University)

Anamika Krishanpal, Priyabrata Panigrahi (Persistent Systems Limited)

Stephan Kang (Intero Life Sciences)

Funding

galaxyp.org


QUESTIONS?

Follow us on twitter.com/usegalaxyp

Workshop Documentation: z.umn.edu/galaxypinmumbai

Slides for the Talk: z.umn.edu/mumbaislides

Visit: http://galaxyp.org

Feedback: https://z.umn.edu/fbindia