1
Byologic ® Features Supernovo Automatic End-to-End De Novo Sequencing (including I/L) of Antibodies with EThcD Fragmentation Wilfred H. Tang 1 Yong J. Kil 1 K. Ilker Sen 1 Marshall Bern 1 Shruti Nayak 2 Beatrix Ueberheide 2 David Fenyő 2 Gregg Silverman 2 1 Protein Metrics, San Carlos, CA 2 New York University Langone Medical Center, New York, NY Contact: [email protected] Abstract EThcD applies both electron-transfer and beam-type collisional dissociation to give especially complete peptide fragmentation spectra containing both b / y and c / z ions. EThcD has long been proposed as highly advantageous for automated de novo sequencing due to its more complete fragmentation and recognizable relationships between peaks, but lack of software has prevented this potential from being fully realized. Another advantage (Zhokhov et al., 2017) is that EThcD generates w ions that can distinguish the isobaric amino acids leucine and isoleucine. Here we use multiple digests, EThcD fragmentation, and Supernovo TM software to do automatic end-to-end de novo sequencing of antibodies. Experimental Methods Byologic ® Features Byologic ® Features Leucine / Isoleucine Determination (Arnold et al., 2005) Human Serum IgM Glycosylation, J. Biol. Chem., 280, 29080-29087 (2005). (Zhokhov et al., 2017) An EThcD-Based Method for Discrimination of Leucine and Isoleucine Residues in Tryptic Peptides, J. Am. Soc. Mass Spectrom., (2017 epub). (Riley et al., 2017) Age-associated B cells (ABC) inhibit B lymphopoiesis and alter antibody repertoires in old age, Cell Immunol. (2017 epub) (Elkon and Silverman, 2012) Naturally occurring autoantibodies to apoptotic cells, Adv. Exp. Med. Biol. 750,14-26 (2012). We gratefully acknowledge NIH grant GM103362 “Protein Sequencing Tools for Biological Therapeutics” and shared instrumentation grant NIH/ORIP S10OD010582 for purchase of an Orbitrap Fusion Lumos. Anti-phosphatidylcholine IgM References and Acknowledgments EThcD generates ions that distinguish leucine from isoleucine z ion with I on its N-terminus loses ethyl (-29) characteristic w ion z ion with L on its N-terminus loses isopropyl (-43) characteristic w ion Supernovo accumulates z-29 / z-43 evidence from multiple, overlapping peptides to increase confidence in the I / L inference. This is generally the best method for discriminating I vs L, but Supernovo also makes use of: (1) chymotrypsin selectively favors cleavage C-terminal to L over I, and (2) compared to germline, IL changes are unlikely in the constant region, somewhat more likely in the framework region, and quite likely in the CDR’s. To test Supernovo’s performance, we analyzed NISTmAb, a mAb standard with known sequence. Supernovo’s answer is completely correct (including I / L) Total of 64 I / L inferences: 55 high confidence 4 medium confidence 5 low confidence Upper sequence: Germline sequence Lower sequence: Supernovo answer Magenta: Supernovo-determined sequence divergent from germline sequence CDR’s highlighted in yellow Confidence of I / L inference: Green = high Yellow = medium Red = low Workflow Byologic ® Features Discussion and Conclusions Supernovo is a program that can de novo sequence a purified protein using MS/MS spectra from multiple digests. For human, mouse, or rat mAbs, no other input is needed. For other proteins, the user should supply a starting sequence with at least 80% identity to the unknown sequence. We ran Supernovo using default parameters, except: Cysteine fixed modification = carboxymethyl / +58.004579 Fragmentation type = EThcD Variable glycan modifications appropriate for each mAb (see “IgG / IgM Glycosylation” section below for further discussion) Supernovo output offers a variety of views for validating and reporting evidence for the antibody sequence. We analyzed 2 antibodies: - NISTmAb (reference material 8671), a well-characterized humanized IgG1 - 15 mg of an anti-phosphatidylcholine IgM antibody isolated 20 years ago from a patient with Waldenström’s macroglobulinemia Aliquots of each antibody were digested using trypsin, chymotrypsin, pepsin and LysC and loaded onto a trap column (Acclaim® PepMap 100 pre-column, 75 μm × 2 cm, C18, 3 μm, 100 Å) and separated on an analytical column (EASY-Spray column, 50 m × 75 μm ID, PepMap RSLC C18, 2 μm, 100 Å) maintained at 45C. Peptides were separated with a 60-minute gradient at a flow rate of 200 nl/min. The Orbitrap Fusion Lumos acquired high resolution MS 1 (120K) and MS 2 (15K) scans. MS 2 scans used EThcD fragmentation with a 40 ms ETD reaction time and 32% supplemental activation (HCD). This IgM was isolated 20 years ago from a patient with Waldenström’s macroglobulinemia and found to bind to phosphatidylcholine (PtC), a lipid component of human cell membranes. We were interested in further investigating this antibody, and for this we need to determine the primary sequence. Even though we had only 15 mg of IgM available, we were able to obtain well-supported end-to-end sequence with clearly identifiable germline origins (VH3-23 and Vk320) and a high level of somatic hypermutation. We speculate that this antibody may have arisen in a clone of the recently discovered age-associated B cells (ABC) subset, which bear the phenotype of clonal and antigenic selection in a germinal center, and become dysregulated later in life (Riley et al., 2017) Supernovo enables routine sequencing of mAbs by providing: “Hands free” operation Complete end-to-end sequence; including I / L determination based on EThcD and chymotrypsin digestion specificity Metrics and visualization to allow for rapid validation of the sequence by the scientist Analysis of the NISTmAb standard demonstrates Supernovo’s capabilities. Analysis of the anti-PtC IgM demonstrates the feasibility of Supernovo for applications with limited starting material. Determination of the primary sequence of the anti-PtC IgM may enable the development of antibody- and immunotherapy-based approaches to lower the burden of damaged cells during inflammatory and infectious diseases (Elkon and Silverman, 2012). Protein View Peptide Details XIC MS2 Zoomed MS1 IgG / IgM Glycosylation N394 and N401 Glycosylation makes de novo sequencing more challenging, but this can be handled by informing Supernovo of the glycosylation species. The most abundant N-glycan species (which differ for IgG vs. IgM) as well as GlcNAc should be specified as variable modifications. IgG HexNAc(4)Hex(3)Fuc(1) = G0F HexNAc(4)Hex(4)Fuc(1) = G1F HexNAc(4)Hex(5)Fuc(1) = G2F HexNAc(1) IgM (see Arnold et al., 2005) HexNAc(4)Hex(5)Fuc(1) HexNAc(4)Hex(5)Fuc(1)NeuAc(1) HexNAc(4)Hex(5)Fuc(1)NeuAc(2) HexNAc(5)Hex(5)Fuc(1) HexNAc(5)Hex(5)Fuc(1)NeuAc(1) HexNAc(5)Hex(5)Fuc(1)NeuAc(2) HexNAc(2)Hex(5) HexNAc(2)Hex(6) HexNAc(2)Hex(7) HexNAc(2)Hex(8) HexNAc(2)Hex(9) HexHAc(1) Bi-antennary glycan with bisecting GlcNAc on IgM NISTmAb = Digestion = Fragmentation

Automatic End-to-End De Novo Sequencing (including I/L) of … · 2018. 11. 16. · ByologicSupernovo ® Features Automatic End-to-End De Novo Sequencing (including I/L) of Antibodies

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Automatic End-to-End De Novo Sequencing (including I/L) of … · 2018. 11. 16. · ByologicSupernovo ® Features Automatic End-to-End De Novo Sequencing (including I/L) of Antibodies

Byologic® FeaturesSupernovo

Automatic End-to-End De Novo Sequencing (including I/L) of Antibodies with EThcD FragmentationWilfred H. Tang1 Yong J. Kil1 K. Ilker Sen1 Marshall Bern1 Shruti Nayak2 Beatrix Ueberheide2 David Fenyő2 Gregg Silverman2

1Protein Metrics, San Carlos, CA 2New York University Langone Medical Center, New York, NY Contact: [email protected]

Abstract

EThcD applies both electron-transfer and beam-type collisional

dissociation to give especially complete peptide fragmentation spectra

containing both b / y and c / z ions. EThcD has long been proposed as

highly advantageous for automated de novo sequencing due to its more

complete fragmentation and recognizable relationships between peaks,

but lack of software has prevented this potential from being fully realized.

Another advantage (Zhokhov et al., 2017) is that EThcD generates w ions

that can distinguish the isobaric amino acids leucine and isoleucine.

Here we use multiple digests, EThcD fragmentation, and SupernovoTM

software to do automatic end-to-end de novo sequencing of antibodies.

Experimental Methods

Byologic® FeaturesByologic® FeaturesLeucine / Isoleucine Determination

(Arnold et al., 2005) Human Serum IgM Glycosylation, J. Biol. Chem., 280, 29080-29087 (2005).

(Zhokhov et al., 2017) An EThcD-Based Method for Discrimination of Leucine and Isoleucine Residues

in Tryptic Peptides, J. Am. Soc. Mass Spectrom., (2017 epub).

(Riley et al., 2017) Age-associated B cells (ABC) inhibit B lymphopoiesis and alter antibody repertoires

in old age, Cell Immunol. (2017 epub)

(Elkon and Silverman, 2012) Naturally occurring autoantibodies to apoptotic cells, Adv. Exp. Med. Biol.

750,14-26 (2012).

We gratefully acknowledge NIH grant GM103362 “Protein Sequencing Tools for Biological

Therapeutics” and shared instrumentation grant NIH/ORIP S10OD010582 for purchase of an

Orbitrap Fusion Lumos.

Anti-phosphatidylcholine IgM

References and Acknowledgments

EThcD generates ions that distinguish leucine from isoleucine

• z ion with I on its N-terminus loses ethyl (-29) characteristic w ion

• z ion with L on its N-terminus loses isopropyl (-43) characteristic w ion

Supernovo accumulates z-29 / z-43 evidence from multiple, overlapping peptides

to increase confidence in the I / L inference. This is generally the best method for

discriminating I vs L, but Supernovo also makes use of: (1) chymotrypsin

selectively favors cleavage C-terminal to L over I, and (2) compared to germline,

I⇔L changes are unlikely in the constant region, somewhat more likely in the

framework region, and quite likely in the CDR’s.

To test Supernovo’s performance, we analyzed NISTmAb, a mAb standard with

known sequence.

• Supernovo’s answer is completely correct (including I / L)

• Total of 64 I / L inferences:

• 55 high confidence

• 4 medium confidence

• 5 low confidence

Upper sequence: Germline sequence

Lower sequence: Supernovo answer

Magenta: Supernovo-determined

sequence divergent from germline

sequence

CDR’s highlighted in yellow

Confidence of I / L inference:

• Green = high

• Yellow = medium

• Red = low

Workflow

Byologic® FeaturesDiscussion and Conclusions

Supernovo is a program that can de novo sequence a purified protein using

MS/MS spectra from multiple digests. For human, mouse, or rat mAbs, no other

input is needed. For other proteins, the user should supply a starting sequence

with at least 80% identity to the unknown sequence.

We ran Supernovo using default parameters, except:

• Cysteine fixed modification = carboxymethyl / +58.004579

• Fragmentation type = EThcD

• Variable glycan modifications appropriate for each mAb (see “IgG / IgM

Glycosylation” section below for further discussion)

Supernovo output offers a variety of views for validating and reporting evidence

for the antibody sequence.

• We analyzed 2 antibodies:

-NISTmAb (reference material 8671), a well-characterized humanized IgG1

- 15 mg of an anti-phosphatidylcholine IgM antibody isolated 20 years ago from a

patient with Waldenström’s macroglobulinemia

• Aliquots of each antibody were digested using trypsin, chymotrypsin, pepsin and

LysC and loaded onto a trap column (Acclaim® PepMap 100 pre-column, 75 μm

× 2 cm, C18, 3 μm, 100 Å) and separated on an analytical column (EASY-Spray

column, 50 m × 75 μm ID, PepMap RSLC C18, 2 μm, 100 Å) maintained at 45C.

Peptides were separated with a 60-minute gradient at a flow rate of 200 nl/min.

• The Orbitrap Fusion Lumos acquired high resolution MS1 (120K) and MS2 (15K)

scans. MS2 scans used EThcD fragmentation with a 40 ms ETD reaction time and

32% supplemental activation (HCD).

This IgM was isolated 20 years ago from a patient with Waldenström’s

macroglobulinemia and found to bind to phosphatidylcholine (PtC), a lipid

component of human cell membranes. We were interested in further investigating

this antibody, and for this we need to determine the primary sequence. Even

though we had only 15 mg of IgM available, we were able to obtain well-supported

end-to-end sequence with clearly identifiable germline origins (VH3-23 and

Vk320) and a high level of somatic hypermutation. We speculate that this antibody

may have arisen in a clone of the recently discovered age-associated B cells

(ABC) subset, which bear the phenotype of clonal and antigenic selection in a

germinal center, and become dysregulated later in life (Riley et al., 2017)

• Supernovo enables routine sequencing of mAbs by providing:

− “Hands free” operation

− Complete end-to-end sequence; including I / L determination based on

EThcD and chymotrypsin digestion specificity

− Metrics and visualization to allow for rapid validation of the sequence by

the scientist

• Analysis of the NISTmAb standard demonstrates Supernovo’s capabilities.

• Analysis of the anti-PtC IgM demonstrates the feasibility of Supernovo for

applications with limited starting material.

• Determination of the primary sequence of the anti-PtC IgM may enable the

development of antibody- and immunotherapy-based approaches to lower

the burden of damaged cells during inflammatory and infectious diseases

(Elkon and Silverman, 2012).

Protein

View

Peptide Details

XIC

MS2

Zoomed

MS1

IgG / IgM Glycosylation

N394 and N401

Glycosylation makes de novo sequencing more

challenging, but this can be handled by informing

Supernovo of the glycosylation species. The most

abundant N-glycan species (which differ for IgG

vs. IgM) as well as GlcNAc should be specified as

variable modifications.

IgG

HexNAc(4)Hex(3)Fuc(1) = G0F

HexNAc(4)Hex(4)Fuc(1) = G1F

HexNAc(4)Hex(5)Fuc(1) = G2F

HexNAc(1)

IgM (see Arnold et al., 2005)

HexNAc(4)Hex(5)Fuc(1)

HexNAc(4)Hex(5)Fuc(1)NeuAc(1)

HexNAc(4)Hex(5)Fuc(1)NeuAc(2)

HexNAc(5)Hex(5)Fuc(1)

HexNAc(5)Hex(5)Fuc(1)NeuAc(1)

HexNAc(5)Hex(5)Fuc(1)NeuAc(2)

HexNAc(2)Hex(5)

HexNAc(2)Hex(6)

HexNAc(2)Hex(7)

HexNAc(2)Hex(8)

HexNAc(2)Hex(9)

HexHAc(1)

Bi-antennary glycan with

bisecting GlcNAc on IgM

NISTmAb

= Digestion = Fragmentation