49
Proteomics Informatics – Protein characterization: post- translational modifications and protein- protein interactions (Week 10)

Proteomics Informatics –

  • Upload
    lilike

  • View
    54

  • Download
    1

Embed Size (px)

DESCRIPTION

Proteomics Informatics – Protein characterization: post-translational modifications and protein-protein interactions  (Week 10). Top down / bottom up. Top down. Bottom up. intensity. mass/charge. Charge distribution. Top down Bottom up. 2+. 27+. 31+. 3+. - PowerPoint PPT Presentation

Citation preview

Page 1: Proteomics Informatics –

Proteomics Informatics – Protein characterization: post-translational

modifications and protein-protein interactions (Week 10)

Page 2: Proteomics Informatics –

Top down / bottom up

Top down

Bottom up

mass/charge

inte

nsity

Page 3: Proteomics Informatics –

Top down Bottom up

Charge distribution

mass/chargein

tens

itymass/charge

inte

nsity

1+

2+

3+

4+

27+

31+

Page 4: Proteomics Informatics –

Top down Bottom upm = 1035 Da m = 1878 Da m = 2234 Da

Isotope distribution

mass/chargein

tens

itymass/charge

inte

nsity

Page 5: Proteomics Informatics –

Fragmentation

Top down Bottom up

Fragmentation

Page 6: Proteomics Informatics –

Correlations between modifications

Top down

Bottom up

Page 7: Proteomics Informatics –

Alternative Splicing

Top down

Bottom up

Exon 1 2 3

Page 8: Proteomics Informatics –

Top down

Kellie et al., Molecular BioSystems 2010

Proteinmass

spectraFragment

mass spectra

Page 9: Proteomics Informatics –

Protein Complexes

AB

AC

D

Digestion

Mass spectrometry

Page 10: Proteomics Informatics –

Sowa et al., Cell 2009

Protein Complexes – specific/non-specific binding

Page 11: Proteomics Informatics –

Protein Complexes – specific/non-specific binding

Choi et al., Nature Methods 2010

Page 12: Proteomics Informatics –

Tackett et al. JPR 2005

Protein Complexes – specific/non-specific binding

Page 13: Proteomics Informatics –

Analysis of Non-Covalent Protein Complexes

Taverner et al., Acc Chem Res 2008

Page 14: Proteomics Informatics –

Non-Covalent Protein Complexes

Schreiber et al., Nature 2011

Page 15: Proteomics Informatics –

More / better quality interactions

Affinity Capture Optimization Screen

+

Cell extraction

Lysate clearance/Batch Binding

Binding/Washing/Eluting

SDS-PAGE

Filtration

LaCava, Hakhverdyan, Domanski, Rout

Page 16: Proteomics Informatics –

Over 20 different extraction and washing conditions ~ 10 years or art.(41 pullouts are shown)

Molecular Architecture of the NPC

Actual model Alber F. et al. Nature (450) 683-694. 2007 Alber F. et al. Nature (450) 695-700. 2007

Page 17: Proteomics Informatics –

Cloning nanobodies for GFP pullouts

• Atypical heavy chain-only IgG antibody produced in camelid family – retain high affinity for antigen without light chain

• Aimed to clone individual single-domain VHH antibodies against GFP – only ~15 kDa, can be recombinantly expressed, used as bait for pullouts, etc.

• To identify full repertoire, will identify GFP binders through combination of high-throughput DNA sequencing and mass spectrometry

VHH clone for recombinant expression

Page 18: Proteomics Informatics –

Cloning llamabodies for GFP pullouts

Llama GFP immunization

Lymphocytetotal RNA

Crude serum

VHH amplicon

454 DNA sequencing

RT / Nested PCRIgG fractionation &

GFP affinity purification

VHH DNA sequence library

GFP-specific VHH fraction

LC-MS/MS

GFP-specific VHH clones

Bone marrow aspiration Serum bleed

500400300

1000 bp

0

100,000

200,000

300,000

400,000

500,000

No.

of R

eads

Read length (bp)

VH

VHH

Fridy, Li, Keegan, Chait, Rout

Page 19: Proteomics Informatics –

CDR3: 100.0% (14/14); combined CDR: 100.0% (33/33); DNA count: 10MAQVQLVESGGGLVQAGGSLRLSCVASGRTFSGYAMGWFRQTPGREREAVAAITWSAHSTYYSDSVKDRFTISIDNTRNTGYLQMNSLKPEDTAVYYCTVRHGTWFTTSRYWTDWGQGTQVTVS

CDR3: 100.0% (14/14); combined CDR: 72.7% (24/33; DNA count: 1MADVQLVESGGGLVQSGGSRTLSCAASGRVLATYHLGWFRQSPGREREAVAAITWSAHSTYYSDSVKGRFTISIDNARNTGYLQMNSLKPEDTAVYYCTVRHGTWFTVSRYWTDWGQGTQVTVS

CDR3: 100.0% (14/14); combined CDR: 72.7% (24/33); DNA count: 1MAQVQLVESGGALVQAGASLSVSCAASGGTISKYNMAWFRRAPGREREAVAAITWSAHSTYYSDSVKDRFTISIDNTRNTGYLQMNSLKPEDTAVYYCTVRHGTWFTTSRYWTDWGQGTQVTVS

CDR3: 100.0% (14/14); combined CDR: 42.4% (14/33); DNA count: 1MAQVQLEESGGGLVQAGDSLTLSCSASGRTFTNYAMAWSRQAPGKERELLAAIDAAGGATYYSDSVKGRFTISIDNTRNTGYLQMNSLKPEDTAVYYCTVRHGTWFTTSRYWTDWGQGTQVTVS

CDR3: 100.0% (14/14); combined CDR: 42.4% (14/33); DNA count: 1MAQVQLVESGGGRVQAGGSLTLSCVGSEGIFWNHVMGWFRQSPGKDREFVARISKIGGTTNYADSVKGRFTISIDNTRNTGYLQMNSLKPEDTAVYYCTVRHGTWFTTSRYWTDWGQGTQVTVS

CDR1 CDR2 CDR3

Underlined regions are covered by MS

Rank sequences according to:CDR3 coverage; Overall coverage;Combined CDR coverage; DNA counts;

Identifying full-length sequences from peptides

Page 20: Proteomics Informatics –

Sequence diversity of 26 verified anti-GFP nanobodies

• Of ~200 positive sequence hits, 44 high confidence clones were synthesized and tested for expression and GFP binding: 26 were confirmed GFP binders.

• Sequences have characteristic conserved VHH residues, but significant diversity in CDR regions.

FR1 CDR1 FR2 CDR2 CDR3FR3 FR4

Page 21: Proteomics Informatics –

HIV-1

gp120Lipid Bilayer gp41

MA

CA

NC

PRIN

RT

RNA

Particle

Genome

env

rev

vpu

tat

nef

3’ LTR5’ LTR

vif gagpol

vpr

CAMA NC p6

PR RT IN

gp41gp120

9,200 nucleotides

Page 22: Proteomics Informatics –

Genetic-Proteomic Approach

Tagged Viral Protein

Tag

Protein ComplexSDS-PAGE

*

Mass Spectrometry

Page 23: Proteomics Informatics –

I-Dirt for Specific Interaction

3xFLAG Tagged HIV-1 WT HIV-1

Infection

Light Heavy (13C labeled Lys, Arg)

1:1 Mix

Immunoisolation

MS

I-DIRT = Isotopic Differentiation of Interactions as Random or Targeted

Lys Arg(+6 daltons)(+6 daltons)

Modified from Tackett AJ et al., J Proteome Res. (2005) 4, 1752-6.

Page 24: Proteomics Informatics –

IDIRT and Reverse IDIRT

0.40

0.50

0.60

0.70

0.80

0.90

1.00

0.40 0.50 0.60 0.70 0.80 0.90 1.00

Spec

ifici

ty, R

erve

rse

Specificity, Forward

gp160 IDIRT: Forward-Reverse Ratio Comparison

Env-3xFLAG Vif-3xFLAG

Luo, Jacobs, Greco, Cristae, Muesing, Chait, Rout

Page 25: Proteomics Informatics –

Protein Exchange

Vif-3F

Heavy labeled Vif-3F lysate

IP in heavy labeled Vif-3F lysate

Vif-3F

Light labeled wt lysate

Incubation with light labeled wt lysate

Vif-3F

15min

Vif-3F

5min

Stable Interactor

Vif-3F

Interactor with fast exchange

60min

Page 26: Proteomics Informatics –

Env Time Course SILAC

• Differentially labeled infection harvested at early or late stage of infection

• Distinguish proteins that interact with Env at early or late stage during infection

Early during infection Late during infection

LightHeavy (13C labeled Lys, Arg)

1:1 Mix

Immunoisolation

MS

Early interactor Late interactor

Page 27: Proteomics Informatics –

M/Z

PeptidesFragments

Fragmentation

ProteolyticPeptides

Enzymatic Digestion

ProteinComplex

Chemical Cross-Linking

MS

MS/MS

Isolation

Cross-LinkedProtein Complex

Interaction Partners by Chemical Cross-Linking

Page 28: Proteomics Informatics –

M/Z

PeptidesFragments

Fragmentation

ProteolyticPeptides

Enzymatic Digestion

ProteinComplex

Chemical Cross-Linking

MS

MS/MS

Isolation

Cross-LinkedProtein Complex

Interaction Sites by Chemical Cross-Linking

Page 29: Proteomics Informatics –

Cross-linking

protein

n peptides with reactive groups

(n-1)n/2 potential ways to cross-link peptides pairwise+ many additional uninformative forms

Protein A + IgG heavy chain 990 possible peptide pairs

Yeast NPC ˜106 possible peptide pairs

Page 30: Proteomics Informatics –

Protein Crosslinking by Formaldehyde

~1% w/v Fal20 – 60 min

~0.3% w/v Fal5 – 20 min1/100 the volume

LaCava

Page 31: Proteomics Informatics –

Protein Crosslinking by Formaldehyde

RED: triplicate experiments, FAl treated grindateBLACK: duplicated experiments, FAl treated cells (then ground)

SCORE: Log Ion Current / Log protein abundance Akgöl, LaCava, Rout

Page 32: Proteomics Informatics –

Cross-linkingMass spectrometers have a limited dynamic range and it therefore important to limit the number of possible reactions not to dilute the cross-linked peptides.

For identification of a cross-linked peptide pair, both peptides have to be sufficiently long and required to give informative fragmentation.

High mass accuracy MS/MS is recommended because the spectrum will be a mixture of fragment ions from two peptides.

Because the cross-linked peptides are often large, CAD is not ideal, but instead ETD is recommended.

Page 33: Proteomics Informatics –

0

0.2

0.4

0.6

0.8

1

1.2

0 5 10 15 20 25Number of fragment ions

Pro

babi

lity

of L

ocal

izat

ion

Phosphopeptide identification

mprecursor = 2000 DaDmprecursor = 1 DaDmfragment = 0.5 DaPhosphorylation

Localization of modifications

Page 34: Proteomics Informatics –

0

0.2

0.4

0.6

0.8

1

1.2

0 5 10 15 20 25

Prob

abili

ty o

f Loc

aliz

atio

n

Number of fragment ions

ID3

Localization (dmin=3)

mprecursor = 2000 DaDmprecursor = 1 DaDmfragment = 0.5 DaPhosphorylation

dmin>=3 for 47% of human tryptic peptides

Localization of modifications

Page 35: Proteomics Informatics –

0

0.2

0.4

0.6

0.8

1

1.2

0 5 10 15 20 25

Prob

abili

ty o

f Loc

aliz

atio

n

Number of fragment ions

ID32

Localization (dmin=2)

mprecursor = 2000 DaDmprecursor = 1 DaDmfragment = 0.5 DaPhosphorylation

dmin=2 for 33% of human tryptic peptides

Localization of modifications

Page 36: Proteomics Informatics –

0

0.2

0.4

0.6

0.8

1

1.2

0 5 10 15 20 25

Prob

abili

ty o

f Loc

aliz

atio

n

Number of fragment ions

ID321

Localization (dmin=1)

mprecursor = 2000 DaDmprecursor = 1 DaDmfragment = 0.5 DaPhosphorylation

dmin=1 for 20% of human tryptic peptides

Localization of modifications

Page 37: Proteomics Informatics –

0

0.2

0.4

0.6

0.8

1

1.2

0 5 10 15 20 25

Prob

abili

ty o

f Loc

aliz

atio

n

Number of fragment ions

ID3211*

Localization(d=1*)

mprecursor = 2000 DaDmprecursor = 1 DaDmfragment = 0.5 DaPhosphorylation

Localization of modifications

Page 38: Proteomics Informatics –

Peptide with two possible modification sites

Localization of modifications

Page 39: Proteomics Informatics –

Peptide with two possible modification sites

MS/MS spectrum

m/z

Inte

nsity

Localization of modifications

Page 40: Proteomics Informatics –

Peptide with two possible modification sites

MS/MS spectrum

m/z

Inte

nsity

Matching

Localization of modifications

Page 41: Proteomics Informatics –

Peptide with two possible modification sites

MS/MS spectrum

m/z

Inte

nsity

Matching

Which assignment doesthe data support?

1, 1 or 2, or 1 and 2?

Localization of modifications

Page 42: Proteomics Informatics –

AAYYQK

Visualization of evidence for localization

AAYYQK

Page 43: Proteomics Informatics –

Visualization of evidence for localization

AAYYQK

AAYYQK

Page 44: Proteomics Informatics –

Visualization of evidence for localization

3

2

1

3

2

1

Page 45: Proteomics Informatics –

Estimation of global false localization rate using decoy sites

By counting how many times the phosphorylation is localized to amino acids that can not be phosphorylated we can estimate the false localization rate as a function of amino acid frequency.

0

0.005

0.01

0.015

0.02

0 0.05 0.1 0.15

0

0.005

0.01

0.015

0.02

0 0.05 0.1 0.15

Amino acid frequency

Fals

e lo

caliz

atio

n fr

eque

ncy

Y

Page 46: Proteomics Informatics –

S21

Sm1

How much can we trust a single localization assignment?

If we can generate the distribution of scores for assignment 1 when 2 is the correct assignment, it is possible to estimate the probability of obtaining a certain score by chance for a given peptide sequence and MS/MS spectrum assignment.

SS mm21

0

2

1

21

2

0

2

1

21

2

2

1

1

dSSFdSSFp

S m

)(

)(

1.

2.

Page 47: Proteomics Informatics –

Is it a mixture or not?If we can generate the distribution of scores for assignment 2 when 1 is the correct assignment, it is possible to estimate the probability of obtaining a certain score by chance for a given peptide sequence and MS/MS spectrum assignment.

S12

Sm2

SS mm21

0

12

12

1

0

12

12

11

2)(

)(2

dSSF

dSSFp

Sm

1.

2.

Page 48: Proteomics Informatics –

ppppthth

and1

2

2

1 1 and 2 pppp

ththand

1

2

2

1 1 pppp

ththand

1

2

2

1

ppppthth

and1

2

2

1 1 or 2Ø )( ppSS mm

1

2

2

121

Peptide with two possible modification sites

MS/MS spectrum

m/zIn

tens

ity

Matching

Which assignment doesthe data support?

1, 1 or 2, or 1 and 2?

Localization of modifications

Page 49: Proteomics Informatics –

Proteomics Informatics – Protein characterization: post-translational

modifications and protein-protein interactions (Week 10)