Advances and Challenges in Liquid Chromatography-Mass Spectrometry-based Proteomics Profiling for Clinical Applications*

Wei-Jun Qian, Jon M. Jacobs, Tao Liu, David G. Camp II, and Richard D. Smith‡

Recent advances in proteomics technologies provide tremendous opportunities for biomarker-related clinical applications; however, the distinctive characteristics of human biofluids such as the high dynamic range in protein abundances and extreme complexity of the proteomes present tremendous challenges. In this review we summarize recent advances in LC-MS-based proteomics profiling and its applications in clinical proteomics and discuss the major challenges associated with implementing these technologies for more effective candidate biomarker discovery. Developments in immunoaffinity depletion and various fractionation approaches in combination with substantial improvements in LC-MS platforms have enabled the plasma proteome to be profiled with considerably greater dynamic range of coverage, allowing many proteins at low ng/ml levels to be confidently identified. Despite these significant advances and efforts, major challenges associated with the dynamic range of measurements and extent of proteome coverage, confidence of peptide/protein identifications, quantitation accuracy, analysis throughput, and the robustness of present instrumentation must be addressed before a proteomics profiling platform suitable for efficient clinical applications can be routinely implemented. Molecular & Cellular Proteomics 5:1727–1744, 2006.

Advances in MS technologies, high resolution liquid phase separations, and informatics/bioinformatics for large scale data analysis have made MS-based proteomics an indispensable research tool with the potential to broadly impact biology and laboratory medicine (1). In particular, proteomics technologies have been increasingly applied to the study of disease-related clinical samples (e.g. human blood serum/plasma, proximal fluids, and disease tissues) for the purposes of identifying novel disease-specific protein biomarkers, gaining a better understanding of disease processes, and discovering novel protein targets for therapeutic interventions and drug development (2).

Proteomics-based candidate biomarker discovery efforts have recently gained significant attention due to the power of these technologies for analyzing complex protein mixtures and their potential for identifying novel markers indicative of disease. It is widely believed that many complex human diseases, including cancers, might be more effectively cured if specific disease biomarkers were available to enable detection and treatment at very early stages of disease (3). Despite noteworthy efforts, only a handful of cancer biomarkers have been approved by the United States Food and Drug Administration (FDA)¹ for clinical use, with the majority of these being protein biomarkers (4). Although existing markers play a significant role in screening, monitoring, and staging, effective biomarkers are not currently available for most cancers and are generally nonexistent for early detection (3). Therefore, there is a clear need for applying advanced technologies such as those based on proteomics in the quest for novel candidate clinical biomarkers.

Although it was widely speculated that advances in genomics and proteomics would alter the landscape of clinical biomarker discovery and validation, the declining trend of new FDA-approved biomarkers reported over the last decade (5) highlights the magnitude of the challenges associated with human clinical samples and validation of candidate biomarkers. Contributing to these challenges are the substantial complexity of the human proteome and the heterogeneity of the human population, both of which make the search for biomarkers from either biofluids or disease tissues a daunting task. As a result of the heterogeneous nature of humans and the complexity of diseases, e.g. cancers, a panel of biomarkers rather than a single marker may be required to achieve the high sensitivity and specificity required for clinical applications (3). Proteomics technologies offer significant potential for discovering such marker panels.

Many different technologies have been applied for biomarker discovery and other clinical applications, including two-dimensional (2D) gel electrophoresis (6), LC-MS, and protein- and antibody-based microarrays (7–9). LC-MS- or tandem MS

From the Biological Sciences Division and Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352

Received, May 2, 2006, and in revised form, July 25, 2006
Published, MCP Papers in Press, August 3, 2006, DOI 10.1074/mcp.M600162-MCP200

¹ The abbreviations used are: FDA, Food and Drug Administration; SCX, strong cation exchange chromatography; NET, normalized elution time; AMT, accurate mass and time; IMS, ion mobility spectrometry; 2D, two-dimensional; RPLC, reversed phase LC; MARS, multiple affinity removal system; HUPO, Human Proteome Organization; LPS, lipopolysaccharide; MRM, multiple reaction monitoring.

Review

Molecular & Cellular Proteomics 5.10 1727
This paper is available on line at http://www.mcponline.org

(MS/MS)-based proteomics technologies offer highly sensitive analytical capabilities and a relatively large dynamic range of detection and have increasingly become the method of choice for in depth profiling of complex protein mixtures (1). In addition, the relatively high throughput of LC-MS technologies is amenable to clinical applications that involve human biofluids and disease tissues. The application of LC-MS/MS for human biofluid protein profiling was initiated by the first global shotgun proteomics study of human plasma/serum published in 2002 by Adkins et al. (10). An explosion of LC-MS-based applications in human plasma/serum and various biofluids soon followed due to the tremendous interest in identifying disease-related proteins (11, 12). Various depletion/fractionation/enrichment techniques have been developed along the way and coupled to LC-MS to increase coverage of the biofluid proteomes (13).

Human blood serum/plasma remains the most commonly used clinical sample to date for proteomics applications because it may include specific biomarkers for virtually all human diseases due to its direct or indirect interaction with the entire cell complement of the body, i.e. tissue-specific proteins may be released into the blood stream upon cell damage or cell death. Additionally, serum/plasma can be readily obtained by clinical sampling. However, the magnitude of the previously mentioned challenges associated with human clinical samples, coupled with the anticipation that potential biomarkers of interest could be present at extremely low concentrations in plasma, has raised doubts as to whether disease biomarkers can be accurately detected or identified from plasma using a proteomics approach. As a result, analysis of various other biofluids/tissues has gained increasing attention. Due to their proximity to the source of disease or perturbation in the body, tissues (14) and various biofluids such as cerebrospinal fluid (15), bronchoalveolar lavage fluid (16), synovial fluid (17), nipple aspirate fluid (18), saliva (19), and urine (20) are believed to provide a more focused pool of potential biomarkers of interest. In addition, tumor interstitial fluids have also been reported as a novel source for proteomics biomarker and therapeutic target discovery (21), offering a promising alternative to direct tissue analysis. In the following review, we highlight LC-MS-based proteomics profiling for clinical applications by summarizing recent advances as well as the major challenges facing this technology for more effective candidate biomarker discovery.

CHALLENGES AND REQUIREMENTS FOR DESIGNING A ROBUST LC-MS DISCOVERY PLATFORM

The distinctive nature of human biofluid proteomes, in particular the serum/plasma proteome, presents significant challenges for current analytical technologies aimed at quantitative protein profiling and biomarker discovery. First, the serum/plasma protein content is dominated by several very abundant proteins (i.e. the 22 most abundant proteins represent ~99% of the total protein mass in plasma) yet at the same time presents an extraordinary dynamic range (>10 orders of magnitude) in protein concentrations that begins with serum albumin at ~45 mg/ml and extends to cytokines (and potentially many disease-related proteins) at around 1–10 pg/ml or lower (5). Second, the serum/plasma proteome presents tremendous biological complexity as a result of tissue “leakage” proteins from the entire body, complex post-translational protein modifications such as glycosylation, and the existence of various forms (i.e. splice variants, proteolytic products, and the tremendous variability in the immunoglobulin class) for each expressed gene. Finally, the substantial genetic and non-genetic biological variability of human clinical samples contributes significantly to the overall analytical challenge.
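The concentration span quoted above can be checked with a short calculation. The sketch below is only an illustration: the function name is hypothetical, and the inputs are the approximate figures from the text (albumin at about 45 mg/ml, cytokines at about 1–10 pg/ml).

```python
import math


def orders_of_magnitude(high_g_per_ml, low_g_per_ml):
    """Return the base-10 span between two concentrations given in g/ml."""
    return math.log10(high_g_per_ml / low_g_per_ml)


# Approximate values taken from the text, converted to g/ml.
ALBUMIN = 45e-3          # about 45 mg/ml
CYTOKINE_UPPER = 10e-12  # about 10 pg/ml
CYTOKINE_LOWER = 1e-12   # about 1 pg/ml

span_to_10_pg = orders_of_magnitude(ALBUMIN, CYTOKINE_UPPER)
span_to_1_pg = orders_of_magnitude(ALBUMIN, CYTOKINE_LOWER)

print(f"albumin to 10 pg/ml: {span_to_10_pg:.2f} orders of magnitude")
print(f"albumin to 1 pg/ml:  {span_to_1_pg:.2f} orders of magnitude")
```

Under these inputs the span is between roughly 9.7 and 10.7 orders of magnitude, which is consistent with the statement that the range exceeds 10 orders once proteins at or below 1 pg/ml are considered.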

Despite significant recent advances, major challenges still prevent routine implementation of an LC-MS protein profiling platform suitable for efficient biomarker discovery (Table I). To effectively address these challenges, a protein profiling platform suitable for biomarker discovery and clinical applications must provide at the very minimum 1) overall high dynamic range of measurements and extensive coverage of the proteome for effective detection of low abundance proteins, 2) highly confident and specific protein identifications,

TABLE I
Challenges and limitations of current LC-MS-based proteomics technologies applied to biomarker discovery

Challenge | Current techniques for addressing the challenge | Limitations
Dynamic range of measurements | Immunoaffinity depletion and multidimensional fractionation coupled with high resolution LC-MS or MS/MS instrumentation | Low throughput, requires relatively large sample sizes
Sensitivity | Small inner diameter LC column (50 µm or less) coupled with nanoflow electrospray ionization and advanced MS instrumentation (i.e. FTICR, LTQ-FT) | Issues in robustness and expense
Reproducibility and quantitation | Platform automation (including sample processing), label-free direct quantitation, and isotope labeling-based quantitation | Variations from multistep sample processing, ionization suppression and instrument variations, labeling efficiencies
Throughput | Automated fast LC and gas phase ion mobility separations | Limited dynamic range or coverage
False positive identifications | Improved database searching algorithms and statistical models | Lack of consensus

LC-MS-based Clinical Proteomics


3) accurate quantitation of relative protein abundances across many clinical samples, and 4) high throughput capable of analyzing large numbers of clinical samples to provide sufficient statistical power needed to address biological variability. In addition, the platform, including both sample processing and LC-MS instrumentation, must be robust and include efficient informatics software capabilities for data mining and statistical analyses. Currently there is a broad consensus that no existing platform meets all of these requirements for effective biomarker discovery.

Fig. 1 shows a component-based diagram of an LC-MS protein profiling platform. Note that such a platform is not based on a single instrument but rather on a compilation of current technologies to achieve high dynamic range quantitative proteome profiling for clinical samples. A key performance factor of any such platform is the overall dynamic range of detection and extent of proteome coverage, which in turn dictates its ability to detect low abundance proteins. Many disease-specific proteins in plasma/serum are anticipated to be present at very low levels (ng/ml or even lower), e.g. within the same range as current FDA-approved markers such as prostate-specific antigen (0.01–100 ng/ml) and Troponin-T (0.02–100 ng/ml). This is particularly obvious for cancer markers of early detection where tumor size is very small (millimeter size), and cancer-specific proteins in plasma may be present at pg/ml or lower levels. This overall dynamic range presents a tremendous challenge for any MS-based technology. The achievable dynamic range or proteome coverage for a platform depends on the peak capacity (the number of chromatographic peaks that can be fit into the length of separation) of the on-line LC separations prior to MS measurements, the dynamic range of the MS instrumentation, and the efficiency of sample enrichment or fractionation steps at both protein and peptide levels prior to LC-MS analyses. Analysis throughput inevitably determines the size of any clinical study sample set and largely depends on factors such as automation of each platform component, LC-MS analysis duty cycle, and the extent of prefractionation prior to LC-MS analysis. Although the application of more extensive fractionation can lead to a higher dynamic range of detection, the overall throughput can be severely reduced. Other key performance factors are the confidence of protein identifications and the quantitative accuracy, which determine the ability of the platform to confidently identify a potential biomarker based on the abundance differences between healthy and diseased conditions. Both the reproducibility of sample processing/fractionation prior to LC-MS and the LC-MS instrumentation will contribute to the accuracy of quantitation.
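Peak capacity, defined above as the number of chromatographic peaks that fit into the length of a separation, is often approximated for gradient separations as one plus the separation window divided by the average peak width. The sketch below uses that common approximation; it is not a formula given in this review, and the function name and the example numbers are illustrative assumptions only.

```python
def peak_capacity(separation_window_s, avg_peak_width_s):
    """Approximate peak capacity as 1 + window / average peak width.

    Both arguments are in seconds. This is a widely used textbook
    approximation, assumed here for illustration.
    """
    if avg_peak_width_s <= 0:
        raise ValueError("average peak width must be positive")
    return 1 + separation_window_s / avg_peak_width_s


# Hypothetical example: a 200-min separation window with 12-s wide peaks.
long_gradient = peak_capacity(200 * 60, 12)

# Hypothetical example: a 10-min window with the same 12-s peak width.
fast_gradient = peak_capacity(10 * 60, 12)

print(long_gradient, fast_gradient)
```

With the same peak width, shortening the separation window reduces the estimated peak capacity proportionally, which illustrates the trade-off between throughput and separation power discussed in the text.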

ADVANCES IN LC-MS TECHNOLOGIES

A high resolution LC (or LC/LC) separation coupled on line with MS is the central component of many proteomics platforms. Over the past decade, there have been significant advances in LC separations as well as in MS instrumentation and ESI. To date, the “bottom-up” proteomics strategy that combines high efficiency separations with MS to characterize highly complex peptide mixtures still accounts for the majority of proteomics measurements. This strategy relies on the identification of peptides sufficiently unique for protein identification. Protein mixtures from cellular lysates or biofluids are typically digested by trypsin (or other proteases) into polypeptides, which are then separated by capillary LC and analyzed by MS on line via an ESI interface. Peptide sequences are identified by using automated database searching algorithms such as SEQUEST (22), MASCOT (23), or X!Tandem (24) to correlate experimental MS/MS spectra to theoretical mass spectra based on sequences in a given protein database for a specific organism. With the recent development of high speed 2D linear ion trap instruments, i.e. LTQ, the protein profiling coverage has been greatly enhanced compared with traditional three-dimensional ion trap systems (25). When coupled with SCX fractionation either on line or off line (26, 27), LC-MS/MS technologies now routinely allow for identification of thousands of proteins from complex mammalian tissues and cells. Although routinely used for peptide/protein identifications, data-dependent LC-MS/MS still has an inherent “undersampling” limitation whereby only a portion of the species observed in the survey MS scan is selected for fragmentation (28).

FIG. 1. A component diagram of an LC-MS protein profiling platform. FFE, free flow electrophoresis; 1D, one-dimensional; iTRAQ, isobaric tags for relative and absolute quantitation.

To overcome the undersampling issue, our laboratory developed an accurate mass and time (AMT) tag approach that utilizes highly accurate mass measurements from a high resolution mass spectrometer (e.g. FTICR or TOF mass spectrometer) in conjunction with accurate elution time measurements from high resolution capillary LC separations to achieve high throughput proteome profiling without routine MS/MS measurements (29, 30). The concept of this AMT tag approach is based on the principle that the accurate mass and time measurements will allow reliable peptide identifications by correlating the mass and time of detected peaks to a pre-established peptide AMT tag reference library for a particular biological system (e.g. plasma). With this approach, LC-MS/MS proteome analyses coupled with extensive fractionation only need to be performed once to create an effective reference database of peptide markers defined by accurate masses and elution times, i.e. AMT tags. The AMT tag database then serves as a comprehensive “look-up table” for subsequent higher throughput LC-MS analyses, allowing many peptides in each spectrum to be identified without MS/MS. Fig. 2 exemplifies an LC chromatogram and 2D display of ~2,800 peptides identified using the AMT tag strategy resulting from a single LC-FTICR analysis of a ProteomeLab™ IgY-12 depleted human plasma sample.
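The “look-up table” idea can be sketched as a two-dimensional tolerance match between a detected LC-MS feature and a reference library. The code below is a minimal illustration under stated assumptions, not the authors' software: the class and function names, the peptide labels, the masses, the NET values, and the tolerances (5 ppm in mass, 0.02 in normalized elution time) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmtTag:
    peptide: str   # placeholder label, not a real plasma peptide
    mass: float    # monoisotopic mass in Da
    net: float     # normalized elution time, scaled 0..1


def match_feature(mass, net, library, ppm_tol=5.0, net_tol=0.02):
    """Return library tags that agree with a feature in both mass and NET."""
    hits = []
    for tag in library:
        ppm_error = (mass - tag.mass) / tag.mass * 1e6
        if abs(ppm_error) <= ppm_tol and abs(net - tag.net) <= net_tol:
            hits.append(tag)
    return hits


# Hypothetical reference library built once from LC-MS/MS analyses.
LIBRARY = [
    AmtTag("PEPTIDE_A", 1000.5000, 0.30),
    AmtTag("PEPTIDE_B", 1000.5040, 0.70),  # close in mass, far in NET
    AmtTag("PEPTIDE_C", 1500.7500, 0.31),
]

# A feature whose mass alone fits both PEPTIDE_A and PEPTIDE_B within 5 ppm.
matches = match_feature(1000.5020, 0.305, LIBRARY)
print([tag.peptide for tag in matches])
```

In this toy case the mass measurement alone cannot separate the first two entries, and the elution time constraint resolves the ambiguity, which is the rationale for combining both measurements.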

The fact that application of the AMT tag approach obviates the need for routine MS/MS is particularly attractive in high throughput repeated analyses of similar samples (e.g. serum/plasma) in clinical proteomics studies. We have recently demonstrated the application of the AMT tag approach coupled with 18O labeling for quantitative profiling of the human plasma proteome in response to lipopolysaccharide administration (31). The availability of commercial high performance mass spectrometers (e.g. ThermoElectron Finnigan LTQ-FT and LTQ-Orbitrap) will likely lead to an even broader range of applications based on this LC-MS-only approach for higher throughput peptide identifications.

As mentioned previously, the achievable dynamic range for the LC-MS platform depends significantly on the peak capacity of the on-line gradient reversed phase separations, the dynamic range of the MS system, and the efficiency and stability of the ESI interface. A single MS spectrum can provide a dynamic range of up to 10^3 for a high resolution instrument (e.g. FTICR), and one would expect to achieve a dynamic range of at least 10^5 by coupling this instrument to an on-line high resolution LC separation that provides a peak capacity of ~1,000. However, the observed dynamic range of measurements can be significantly reduced for complex biological samples such as human plasma due to the charge competition of co-eluting high abundance species, leading to ion suppression of the relatively low abundance species. Ion suppression is a particular issue when analyzing human biofluid samples as these samples are dominated by a handful of highly abundant proteins. Significant ion suppression will occur when peptides originating from low abundance proteins of interest co-elute with peptides originating from high abundance proteins, leading to the inability to detect the co-eluting low abundance peptides.

FIG. 2. A typical LC-FTICR analysis of an IgY-12 depleted human plasma sample. A, the base peak chromatogram. B, a 2D display of ~2,800 identified species at the mass and NET space. The analysis was performed using a Bruker 9.4-tesla FTICR instrument coupled with an LC system equipped with a 150-µm-inner diameter and 65-cm-long capillary column operated at 5,000 p.s.i.

Table II provides a summary of the relative proteome coverage and estimated dynamic ranges achieved by coupling high resolution reversed phase capillary LC separations with either MS/MS using an LTQ instrument or MS using a 9.4-tesla FTICR instrument. The enhanced coverage and dynamic ranges obtained by the removal of high abundance proteins and SCX fractionation are illustrated. All results shown in Table II are based on triplicate experiments that involved a pooled plasma sample from healthy subjects. The numbers of peptide identifications are reported with >95% confidence based on either a reversed database evaluation for MS/MS data (32) or a shifted database evaluation for the LC-FTICR data² with all proteins identified using a minimum of two different peptides. As shown, the single LC-MS/MS analysis only identifies ~100 proteins with high confidence and provides a dynamic range of ~10^3. With the removal of either the top six (MARS) or top 12 (IgY-12) abundant proteins, the overall dynamic range is enhanced to ~10^5. LC-FTICR shows greater coverage for both peptide and protein identifications compared with LC-MS/MS, and the dynamic range is estimated to be similar to that observed for LC-MS/MS. (It should be noted that presently unassigned peptides probably include many more proteins.) When IgY-12 depletion and SCX fractionation are combined with LC-MS/MS, a dynamic range of 10^6–10^7 can be achieved, allowing identification of nearly 500 proteins in plasma with high confidence including many at the low ng/ml level, and 2D LC-FTICR analyses would be expected to increase this by approximately another order of magnitude. Note, however, that this dynamic range still falls 3 orders of magnitude short for detecting pg/ml protein concentrations. In addition, it should be noted that not all the proteins within the estimated dynamic range will be detected due to the differences in digestion efficiency and ion suppression effects for different proteins/peptides within the complex sample.
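A reversed database evaluation of the kind mentioned above is commonly implemented by searching spectra against a concatenated forward and reversed database and using the reversed (decoy) hits to estimate the error rate among accepted identifications. The sketch below assumes the widely used estimate of twice the decoy count divided by the total accepted count; this is an assumption for illustration and not necessarily the exact procedure of the cited evaluation (32). The function names and the example scores are hypothetical.

```python
def estimated_fdr(n_target, n_decoy):
    """Estimate the false discovery rate as 2 * decoys / all accepted hits."""
    total = n_target + n_decoy
    return 0.0 if total == 0 else 2.0 * n_decoy / total


def score_threshold(hits, max_fdr=0.05):
    """Find the lowest score cutoff whose accepted set meets max_fdr.

    hits is an iterable of (score, is_decoy) pairs. Returns None when no
    cutoff qualifies. Tied scores are not grouped in this simple sketch.
    """
    best = None
    n_target = n_decoy = 0
    for score, is_decoy in sorted(hits, key=lambda h: h[0], reverse=True):
        if is_decoy:
            n_decoy += 1
        else:
            n_target += 1
        if estimated_fdr(n_target, n_decoy) <= max_fdr:
            best = score
    return best


# Hypothetical search results: (score, matched a reversed sequence?)
example_hits = [(5.0, False), (4.5, False), (4.0, False), (3.5, True), (3.0, False)]
cutoff = score_threshold(example_hits)
print(cutoff)
```

Accepting only identifications that score at or above the returned cutoff corresponds to filtering at an estimated confidence of at least 95% under the stated assumption.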

One key area of recent advances in LC-MS technologies is the improvement associated with capillary LC instrumentation that provides enhanced peak capacities and dynamic range of detection needed to analyze clinical samples. These improvements have been achieved primarily through the use of very high pressure (10–20 kp.s.i.), very small porous particles (3 µm or less), smaller inner diameter columns (50-µm inner diameter or less), nanoelectrospray interfaces, and relatively long columns and long gradients for separations (33–35). For example, high efficiency separations with peak capacities of ~1,000 have been achieved by using 15–75-µm-inner diameter and 85-cm-long capillary columns packed with 3-µm C18-bonded silica particles operated at 10 kp.s.i. By using smaller inner diameter columns (e.g. 15 µm) (34), the sensitivity of the system continues to increase inversely as the mobile phase flow rates drop to as low as 20 nl/min, demonstrating the advantages of ESI-MS analyses at very low liquid flow rates (36, 37). More recently, the use of 20 kp.s.i. capillary LC columns packed with 1.4–3-µm porous C18-bonded silica particles has been demonstrated to provide chromatographic peak capacities of 1,000–1,500 for complex peptide and metabolite mixtures (35). Although these very high pressure systems present technical challenges for robust automated operations, the recently commercialized Waters nanoACQUITY UPLC System that takes advantage of 1.7-µm sized particles and operates at ~10 kp.s.i. demonstrates the feasibility of such high performance systems for routine applications. With further improvements in robustness, these “ultraperformance” systems may become a powerful component for separating complex mixtures such as human biofluids while concurrently providing the high dynamic range needed for candidate biomarker discovery applications.

² V. A. Petyuk, W. J. Qian, M. H. Chin, H. Wang, E. A. Livesay, M. E. Monroe, J. N. Adkins, N. Jaitly, D. J. Anderson, D. G. Camp, D. J. Smith, and R. D. Smith, manuscript submitted.

TABLE II
The proteome coverage and estimated dynamic range offered by current LC-MS technologies

A pooled reference plasma sample from healthy individuals was used for this evaluation. A prepacked 4.6 × 50-mm (loading capacity, 15 µl of plasma) MARS affinity column (Agilent, Palo Alto, CA) and a 7 × 52-mm (loading capacity, 25 µl of plasma) ProteomeLab IgY-12 affinity column (Beckman Coulter, Fullerton, CA) were used for the depletion of high abundance proteins. For each method, the samples were processed in triplicate and individually analyzed using a 150-µm-inner diameter and 65-cm-long column coupled with either a Finnigan LTQ system (MS/MS) or a Bruker 9.4-tesla FTICR instrument. 10 and 5 µg of peptide samples were loaded for each LC-MS/MS and LC-FTICR analysis, respectively. 300 µg of peptides were used for each SCX fractionation. The LC and SCX operations were the same as described previously (31). Peptides were filtered with a confidence level >95% based on reversed database evaluation (32), and proteins were identified with at least two different peptides. ALS, acid-labile subunit; vWF, von Willebrand factor; SAA, serum amyloid A; CRP, C-reactive protein; HGFA, hepatocyte growth factor activator; MSF, megakaryocyte-stimulating factor; EGFR, epidermal growth factor receptor; APOC2, apolipoprotein C-II; B2M, β2-microglobulin; NAP1L1, nucleosome assembly protein 1-like 1; MMP2, matrix metallopeptidase 2; 1D, one-dimensional. We note that more relaxed identification criteria would considerably expand the numbers of peptides and proteins identified by all approaches.

Method | Measure | Replicate 1 | Replicate 2 | Replicate 3 | Overlap | Identified low abundance proteins | Estimated dynamic range of coverage
Non-depleted plasma and 1D LC-MS/MS | Peptides | 1,398 | 1,213 | 1,466 | 972 | ALS, 25 µg/ml; Factor XII, 30 µg/ml; APOC2, 35 µg/ml | ~10^3
 | Proteins | 99 | 97 | 102 | 96 | |
MARS depletion and 1D LC-MS/MS | Peptides | 1,723 | 1,732 | 1,692 | 1,250 | B2M, 1.1 µg/ml; vWF, 1.3 µg/ml; SAA, 10 µg/ml | ~10^4
 | Proteins | 119 | 118 | 115 | 111 | |
IgY-12 depletion and 1D LC-MS/MS | Peptides | 1,869 | 1,912 | 1,999 | 1,309 | Myoglobin, 90 ng/ml; CRP, 500 ng/ml; HGFA, 500 ng/ml; CD14, 1.4 µg/ml | ~10^5
 | Proteins | 130 | 141 | 130 | 122 | |
IgY-12 depletion and 1D LC-FTICR | Peptides | 2,800 | 2,840 | 2,630 | 2,070 | Myoglobin, 90 ng/ml; CRP, 500 ng/ml; HGFA, 500 ng/ml; CD14, 1.4 µg/ml | ~10^5
 | Proteins | 174 | 172 | 167 | 162 | |
IgY-12 depletion and SCX-LC-MS/MS | Peptides | 5,196 | 6,148 | 5,687 | 3,391 | MSF, 1 ng/ml; Leptin, 5 ng/ml; NAP1L1, 7 ng/ml; MMP2, 9 ng/ml; Cathepsin D, 9 ng/ml; EGFR, 11 ng/ml | ~10^6–10^7
 | Proteins | 498 | 474 | 476 | 369 | |

MULTIDIMENSIONAL FRACTIONATION STRATEGIES COUPLED WITH LC-MS FOR IMPROVED PROTEOME COVERAGE

Given the tremendous dynamic range of protein abundances and the extraordinary complexity of human biofluid proteomes, many different fractionation techniques have been developed and applied in a multidimensional fashion to enhance dynamic range of detection and improve proteome coverage (13). Multicomponent immunoaffinity removal of highly abundant proteins in human plasma/serum (38, 39) has increasingly become the method of choice for prefractionating human plasma samples due to the high specificity, efficacy, and ease of coupling to other fractionation techniques. As shown in Table II, coupling the immunoaffinity depletion step to LC-MS provides an additional 1–2 orders of magnitude increase in dynamic range, allowing for detection of more low abundance proteins by effectively increasing the sample loading; similar improvements were reported in other studies (40, 41). Good reproducibility was demonstrated by performing immunoaffinity depletion with an automated LC system; however, some of the nontarget low abundance proteins have also been observed to bind to the columns, albeit in a reproducible fashion (42). A possible approach to counter this effect is to analyze both the flow-through and bound fractions in more of a “partitioning” method instead of a pure “depletion” approach (39) with the accompanying trade-off of an increased number of required analyses. A further enhancement to the platform dynamic range will stem from the continuous improvement of antibody-based microbead technologies that will allow for removal of more highly to moderately abundant proteins.

Several different techniques for protein-level fractionation have been applied to human plasma/serum proteome profiling, including common gel-based techniques (43, 44), PF2D automated chromatofocusing/reversed phase LC (RPLC) (45) and other liquid chromatography-based separations (46), free-flow electrophoresis (41, 47), and IEF (46, 48–51). IEF is a common fractionation technique that has been applied to plasma profiling at both peptide and protein levels. Various forms of liquid phase IEF techniques have been developed, including off-gel electrophoresis (48), Rotofor (49) or Mini-Rotofor (46), microscale solution IEF (ZOOM) (50), and a preparative multichannel electrolyte system (51). A common feature of these systems is the multiple tandem electrode chambers used to partition complex protein samples. IPG IEF followed by in-gel digestion has also been used for plasma protein fractionation prior to LC-MS/MS (52). A number of recent large scale proteome profiling studies have combined different protein- and peptide-level fractionation techniques (e.g. PF2D (45), SCX/RPLC (54), free flow electrophoresis-IEF/RPLC (47), ZOOM/SDS-PAGE (50), and Rotofor/RPLC/SDS-PAGE (49) protein fractionation) with peptide-level LC-MS/MS analyses to achieve more comprehensive coverage of the plasma proteome.

An alternative to plasma protein fractionation is to specifically enrich functional “subproteomes” such as the glycoproteome or the cysteinyl subproteome by using chemical tagging or capture agents; this significantly reduces overall sample complexity and enhances detection of low abundance proteins. For example, we have recently demonstrated a simple procedure for effectively enriching cysteinyl peptides from complex proteomes (including human biofluids (55)) that provides significantly improved proteome coverage when used as a peptide-level fractionation technique (27). Additionally, hydrazine chemistry can be applied to specifically enrich N-linked glycopeptides (56, 57), and multilectin affinity chromatography can be used to isolate and characterize glycoproteins from human plasma and serum samples (58). Our laboratory has recently developed a strategy that combines immunoaffinity depletion and subsequent chemical fractionation based on cysteinyl peptide and N-glycoprotein captures with 2D LC-MS/MS for in depth plasma profiling (Fig. 3) (59). Application of this “divide-and-conquer” strategy to trauma patient plasma samples resulted in confident identification of ~1,500 different proteins (with a minimum of two peptides per protein; >99.5% confidence level based on reversed database evaluation) and illustrated an overall dynamic range of detection of ~10^7 (low ng/ml concentrations for six identified low abundance proteins were verified by ELISA).

ANALYSIS THROUGHPUT

Although integration of extensive multidimensional fractionation/separations with MS greatly increases the overall proteomics analysis dynamic range and the extent of proteome coverage, this general approach suffers from the limitation of very low throughput. To date, most reports involving extensive fractionation have been limited to small scale studies of one or two pooled clinical samples rather than larger scale quantitative studies. The development of more effective depletion/fractionation strategies and improved LC-MS platforms will most likely reduce the total number of fractions necessary for the detection of low abundance and clinically relevant proteins and thus provide higher throughput.

Several recent technology developments hold potential for greatly enhancing the overall analysis throughput of clinical samples. The first is the development of very fast LC separations for proteomics analyses. Current automated LC-MS proteomics platforms typically involve LC separations with gradients of 100 min or longer, which limits throughput to ~10 sample analyses per day per MS instrument. Several reports have explored the use of smaller particle-packed columns or monolithic columns for fast LC separations (10 min or less) as well as multiplex column systems to significantly improve the throughput (60, 61). However, it is unclear whether sufficient separation power can be achieved with these fast liquid phase separations because the increase in the solvent gradient speed can degrade the separation peak capacity (60), which in turn reduces the overall dynamic range of detection. Other strategies for achieving robust fast separations include liquid phase chromatographic and electrophoretic separations on a microfluidic chip platform (62–64). Such chip-based separation devices also have the advantage of providing better robustness, reliability, and ease of operation.
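The throughput figure quoted above follows from simple time budgeting. The sketch below is illustrative only: the function name is hypothetical, and the per-run overhead values (sample loading, column washing, and re-equilibration) are assumptions chosen for the example rather than values reported in this review.

```python
MINUTES_PER_DAY = 24 * 60


def samples_per_day(gradient_min, overhead_min):
    """Count whole LC-MS runs that fit in 24 h on a single instrument."""
    cycle = gradient_min + overhead_min
    if cycle <= 0:
        raise ValueError("cycle time must be positive")
    return MINUTES_PER_DAY // cycle


# 100-min gradient with an assumed 40 min of per-run overhead.
conventional = samples_per_day(100, 40)

# 10-min fast separation with an assumed 5 min of per-run overhead.
fast = samples_per_day(10, 5)

print(conventional, fast)
```

Under these assumptions a 100-min gradient allows about 10 analyses per day, whereas a 10-min separation allows on the order of 100, provided the shorter gradient still delivers adequate peak capacity.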

Very fast (millisecond scale) gas phase separations based on ion mobility spectrometry (IMS; a separation method that is somewhat analogous to electrophoresis in the gas phase) are another powerful alternative to liquid phase separations for significant improvement in throughput. At its simplest, an IMS stage consists of a drift tube filled with a non-reactive gas (commonly helium or N2) and a uniform electric field established along the axis of separation. Mixtures of peptides, proteins, or small molecules are separated by their gas phase cross-sections (size) in addition to charge, and knowledge of their mobility provides another separation dimension to aid in identification.

FIG. 3. Schematic representation of a chemical fractionation strategy applied to the plasma proteome characterization. High abundance proteins were first removed using immunoaffinity subtraction. The resulting less abundant proteins were split and subjected to solid phase cysteinyl peptide and N-glycoprotein captures independently. Non-cysteinyl peptides and non-glycopeptides generated at the same time were also collected. All four different peptide populations were then fractionated by SCX, and each fraction was analyzed by capillary LC-MS/MS. PNGase F, peptide-N-glycosidase F (59).

The power of IMS has been advanced by several recent technical developments. IMS coupled with a TOF MS platform and combinatorial libraries (65) has been recently demonstrated for analysis of proteolytic digests (66). Because an IMS separation typically requires 1–100 ms and has a resolving power of 50–200, a single species IMS peak exits the drift tube over a ~0.1–1-ms period. Generation of a typical TOF MS spectrum requires ~30–100 μs, which allows multiple mass spectra to be obtained during the “elution” of an IMS peak. More recently, LC has been coupled to IMS-TOF MS via an ESI interface, providing 2D separations prior to MS analysis (67). Despite enormous potential for high throughput analyses of complex samples, the application of IMS-TOF MS has been limited by low sensitivity due to ion losses at the IMS-MS interface; however, the recent implementation of electrodynamic ion funnels at both the ESI-IMS and IMS-TOF MS interfaces has significantly improved the sensitivity of the overall LC-ESI-IMS-TOF MS platform (Fig. 4) (68) such that the sensitivity is now comparable to that of a commercial ESI-MS instrument. Although still in the development stage, the very fast separation speed and potential high dynamic range of measurements offered by the 2D liquid phase-gas phase separations make LC-ESI-IMS-TOF MS an attractive and practical platform for high throughput clinical applications.
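The timing argument for nesting TOF acquisition inside an IMS separation can be sketched numerically. The example assumes the usual definition of IMS resolving power as drift time divided by peak width; the specific drift time, resolving power, and TOF acquisition time are illustrative values chosen from within the ranges quoted above, not measurements.

```python
def ims_peak_width_us(drift_time_us: float, resolving_power: float) -> float:
    """Temporal width of a single-species IMS peak in microseconds.

    Assumes resolving power R = drift time / peak width.
    """
    return drift_time_us / resolving_power


def spectra_per_peak(drift_time_us: float, resolving_power: float,
                     tof_spectrum_us: float) -> float:
    """Number of TOF spectra that fit within one IMS peak."""
    return ims_peak_width_us(drift_time_us, resolving_power) / tof_spectrum_us


if __name__ == "__main__":
    # Illustrative values: 20-ms drift time, R = 100, 50-us TOF spectrum.
    width = ims_peak_width_us(20_000, 100)
    count = spectra_per_peak(20_000, 100, 50)
    print(f"IMS peak width: {width:.0f} us, TOF spectra per peak: {count:.0f}")
```

For these assumed values the peak is 200 μs wide, so about four TOF spectra can be acquired across it.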

CONFIDENCE OF PEPTIDE/PROTEIN IDENTIFICATIONS

One of the challenges associated with MS/MS-based proteome profiling is how to assess the confidence levels of peptide and protein identifications that result from automated database searching. It is recognized that a significant portion of the protein identifications in previously published proteomics datasets of human plasma are likely false positives (32, 69–71). For example, four different plasma proteomics datasets that originated from different methodologies were combined into a list that included 1,175 non-redundant proteins; however, only 46 of these non-redundant proteins (~4%) were observed across all four studies (70). This surprisingly low overlap suggests the potential for a very large number of false protein identifications. In a plasma profiling study using nanoscale LC-MS/MS, Shen et al. (69) reported a nearly 2-fold difference in the number of identified proteins (ranging from 800 to 1,600) depending on which set of previously published criteria were used to filter the data. This criteria-dependent difference illustrates the need for more detailed statistical evaluations to ensure high confidence protein identifications.

FIG. 4. Schematic diagram of a prototype ESI-IMS-Q-TOF instrumentation platform that uses electrodynamic ion funnel interfaces at both ends of the IMS drift tube and, as a result, provides very high sensitivity from high speed analyses. Reproduced with permission from Ref. 68, copyright 2005 Am. Chem. Soc.

FIG. 5. Relative frequency of different peptides identified from the normal human protein database (solid line) and the reversed human protein database (dashed line) at different Xcorr values. Data shown are for the 2+ charge state fully tryptic peptides identified from human plasma and filtered with ΔCn ≥ 0.1. Reproduced with permission from Ref. 32, copyright 2005 Am. Chem. Soc.

To address the issue of false peptide identifications, we recently performed a probability-based evaluation of peptide identifications derived from LC-MS/MS and SEQUEST analysis in which selected human proteomes, including human plasma, were searched against a sequence-reversed human protein database (32) similar to a previous report applying the reversed database strategy to the yeast proteome (72). The reversed protein database was created by reversing the order of amino acid sequences for each protein (the carboxyl terminus becomes the amino terminus and vice versa) in the original human protein database. This approach assumes that the numbers of false positives that arise from “random” hits should be the same for both the normal database and the reversed database because the reversed database is identical in number of protein entries, protein size, and distribution of amino acids to the normal database. Fig. 5 shows a histogram of Xcorr distribution for unique peptides (charge state 2+; fully tryptic) from a human plasma sample identified by searching the normal (solid line) and reversed (dashed line) databases. The Xcorr distribution allows an estimated confidence level for any given Xcorr bin as well as the overall false positive rate for a given Xcorr cutoff to be calculated by dividing the area beneath the dashed line (reversed database hits) by the area beneath the solid line (normal database hits) for a given Xcorr range. This study also revealed the high false positive rates for plasma/serum peptide/protein identifications in several previously published studies (10, 69, 70, 73, 74). For example, ~30% false positives were observed when the often cited Washburn et al. (75) filtering criteria were applied to human plasma. Thus, filtering criteria that provided overall >95% confidence at the unique peptide level for both human cell lines and human plasma were proposed. When identical filtering criteria were used, the observed false positive rates of peptide identifications for human plasma were significantly higher than those for the human cell lines, suggesting that the false positive rates are significantly dependent upon sample characteristics, particularly the number of proteins found within the detectable dynamic range for different samples. Additionally, Xie and Griffin (76) reported the increased potential for false positive identifications for the 2D linear ion trap (LTQ) when compared with a traditional three-dimensional ion trap (LCQ) instrument, and more stringent filtering criteria are required for LTQ compared with LCQ to minimize false positive identifications. These results suggest that peptide/protein identification confidence levels depend not only on sample characteristics but also on components of the LC-MS platform.
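The reversed database evaluation can be sketched in a few lines. This is a minimal illustration, not the published implementation: the function names, the `REV_` prefix, and the toy Xcorr scores are assumptions made for the example, and counting hits above a cutoff is used as a discrete stand-in for the ratio of histogram areas described above.

```python
from typing import Dict, Sequence


def reverse_database(proteins: Dict[str, str]) -> Dict[str, str]:
    """Reverse each protein sequence so the C terminus becomes the N terminus.

    The "REV_" name prefix is a hypothetical convention for this sketch.
    """
    return {f"REV_{name}": seq[::-1] for name, seq in proteins.items()}


def false_positive_rate(normal_scores: Sequence[float],
                        reversed_scores: Sequence[float],
                        cutoff: float) -> float:
    """Estimate the false positive rate at or above a score cutoff.

    Ratio of reversed-database hits to normal-database hits passing the cutoff.
    """
    n_normal = sum(score >= cutoff for score in normal_scores)
    n_reversed = sum(score >= cutoff for score in reversed_scores)
    if n_normal == 0:
        return 0.0  # no accepted identifications, so nothing to be false
    return n_reversed / n_normal


if __name__ == "__main__":
    print(reverse_database({"P1": "MKTAY"}))
    # Toy Xcorr values, invented for illustration only.
    normal = [1.5, 2.1, 2.4, 2.8, 3.0, 3.3, 3.6, 4.0, 4.2, 4.5]
    reverse = [1.2, 1.4, 1.6, 1.9, 2.2]
    for cutoff in (1.5, 2.0, 2.5):
        rate = false_positive_rate(normal, reverse, cutoff)
        print(f"Xcorr >= {cutoff}: estimated false positive rate {rate:.2f}")
```

Raising the cutoff lowers the estimated false positive rate at the cost of fewer accepted identifications, which is the trade-off that any set of filtering criteria has to balance.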

Table III illustrates differences in filtering criteria stringency by comparing peptide/protein identification results from the same plasma MS/MS dataset (obtained from a recent profiling study using trauma patient plasma samples (59)) that was filtered using three different sets of criteria (77, 78). As shown, the reversed database filtering criteria generated the smallest number of peptide and protein identifications, consistent with the significantly lower percentage of false positive identifications (~4%), whereas the Human Proteome Organization (HUPO) plasma proteome project-recommended criteria (77) and the criteria recently reported by Hood et al. (78) generated approximately 25 and 66% false positives at the peptide level, respectively. The comparison shows that the number of peptide/protein identifications from an individual protein profiling study could be easily inflated if a statistical evaluation of false positives was not performed.

A similar observation was recently reported for proteins identified from data acquired on different instruments from 18 laboratories as part of the large scale HUPO plasma proteome collaborative study (77). Application of a rigorous statistical approach that used multiple hypothesis-testing techniques and took into account the length of coding regions in genes reduced the initial list of 9,504 proteins (of which 3,020 were identified with two or more peptides) to 889 proteins (containing both multipeptide and single peptide protein identifications) identified with a confidence level of at least 95% (71). Interestingly, this length-dependent statistical approach was applied to reanalyze one of our previously published datasets (69) and resulted in 1,073 proteins using the HUPO criteria and 433 proteins using the ≥95% confidence length-dependent statistics (71). Similarly, a ~2-fold difference in protein identifications between the reversed database filtering results and the HUPO criteria (Table III) was observed, suggesting similar performance between the length-dependent statistical approach and reversed database filtering with >95% confidence.

TABLE III. Comparison of peptide and protein identifications from a plasma proteome profiling dataset analyzed using different criteria (59)

| Filtering criteria | Difference in stringency | Peptides identified | Proteins identified (a) | Multipeptide proteins | Average peptides per protein | Estimated false positive rate (b) (%) |
| Reversed database (32) | >95% confidence at the unique peptide level based on statistical evaluation. Only fully and partially tryptic peptides are considered. | 22,267 | 3,654 | 1,494 (40.9%) | 6.1 | ~4 |
| HUPO Plasma Proteome Project (77) | Inclusion of partially tryptic peptides with relatively low cutoffs. | 30,524 | 7,928 | 2,850 (35.9%) | 3.9 | ~25 |
| Hood et al. (78) | Inclusion of partially tryptic and other enzymatically cleaved peptides as well as peptides without protease constraints with relatively low cutoffs. | 66,839 | 18,958 | 11,653 (61.5%) | 3.5 | ~66 |

(a) Non-redundant protein identifications generated by ProteinProphet (80).
(b) False positive rate for each filtering criteria was calculated at unique peptide level based on reversed database evaluation (32). The reversed protein database was created by reversing the order of amino acid sequences for each protein (the carboxyl terminus becomes the amino terminus and vice versa) in the original protein database.

PeptideProphet provides another independent statistical model for evaluating potential false positive peptide identifications. The model utilizes the expectation maximization algorithm to derive a mixture of correct and incorrect peptide assignments from the data (79). This approach has been directly compared with the reversed database approach for analyzing the same dataset derived from human plasma (59). Following filtering with reversed database criteria, 6,279 unique peptides were identified from this dataset with >95% confidence, whereas 6,341 unique peptides were identified by PeptideProphet using a minimum computed probability of 0.95. Approximately 95% of peptides were common between the two datasets, suggesting comparable results from these two statistical approaches. The use of ProteinProphet, another statistical model that computes the probability of the presence of proteins, addresses the issue of whether peptides are present in more than one entry in the protein database (protein redundancy problem) (80). The list of identified peptides from both the PeptideProphet and the reversed database filtering approaches can serve as input for ProteinProphet to generate a list of non-redundant protein identifications. Several other statistical methods have been recently described for evaluating peptide assignments from MS/MS spectra (81–83). Ideally, universal acceptance of a statistical model that optimizes both sensitivity and specificity for confident peptide identifications from MS/MS spectra will allow cross-comparison of protein profiling results from different laboratories, which currently remains an unresolved challenge.

FIG. 6. A, mass error histograms of features detected from a single LC-FTICR dataset of a human plasma sample that matched to a human plasma AMT tag database using different levels of NET constraints. The LC separation time is normalized to a 0–1 scale in NET. B, mass error histograms for features from the same dataset matching to a normal AMT tag database (gray circles) and to a shifted AMT tag database (black squares). Note, the black squares represent random matches to the 11-Da shifted AMT tag database.

Similar challenges exist for evaluating false positive identifications from MS-only approaches that utilize accurate mass measurements for peptide/protein identifications. The utility of accurate mass measurements initially was demonstrated in the “peptide mass fingerprinting” approach for protein identification in which a set of peptide fragments unique to each protein is created by digestion, and the mass of these peptide fragments is used as a “fingerprint” to identify the original protein (84–86). Thus far, this approach has been limited to simple protein mixtures or single proteins. The more recently reported AMT tag approach utilizes accurate LC retention time measurements in addition to accurate mass measurements to identify peptides and has been successfully applied to global proteome profiling, including the human plasma proteome (31, 87). With the AMT tag approach, peptides are identified by matching LC-MS observed mass and normalized elution time (NET) features to AMT tags in the pre-established reference database (look-up table of peptides) with given mass error and NET error tolerances (typically 1–5 ppm for mass and 1–3% for NET). The potential false positive identifications resulting from random matching of features to the reference database are indicated on histograms of mass error (the difference between observed mass and calculated mass for the matched peptide in the database) exemplified in Fig. 6A for a human plasma dataset analyzed by LC-FTICR. Note that the use of the NET constraint significantly reduces the level of random matches as indicated by the background level for each histogram. Similar to the reversed database approach for MS/MS, we have recently applied a shifted database approach for evaluating the false positive rate in the AMT tag process.² As shown in Fig. 6B, an ~3% false positive rate for this human plasma dataset was estimated as the ratio of the area beneath the curve that represents matches to the shifted database (black squares) and the area beneath the curve that represents matches to the normal database within a ±2 ppm window (gray circles). In addition to being used for direct identification in the MS-only approach, the accurate mass information also has been utilized for improving the confidence of peptide identifications by MS/MS through application of the new generation of LTQ-FT and LTQ-Orbitrap mass spectrometers (88, 89).
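The AMT tag matching step and the shifted database check can be illustrated with a small sketch. Everything named here (`AmtTag`, `match_feature`, `shift_database`, the default tolerances, and the toy tags and features) is hypothetical and chosen only to mirror the description above: a feature matches a tag when both its mass error in ppm and its NET error fall within tolerance, and matches against a database shifted by 11 Da are treated as random.

```python
from typing import List, NamedTuple, Sequence, Tuple


class AmtTag(NamedTuple):
    peptide: str
    mass: float  # calculated monoisotopic mass in Da
    net: float   # normalized elution time on a 0-1 scale


def match_feature(mass: float, net: float, tags: Sequence[AmtTag],
                  ppm_tol: float = 2.0, net_tol: float = 0.02) -> List[AmtTag]:
    """Return the tags whose mass (ppm) and NET both lie within tolerance."""
    hits = []
    for tag in tags:
        ppm_error = (mass - tag.mass) / tag.mass * 1e6
        if abs(ppm_error) <= ppm_tol and abs(net - tag.net) <= net_tol:
            hits.append(tag)
    return hits


def shift_database(tags: Sequence[AmtTag], shift_da: float = 11.0) -> List[AmtTag]:
    """Shift every tag mass by a fixed offset so that any match is random."""
    return [AmtTag(t.peptide, t.mass + shift_da, t.net) for t in tags]


def estimated_false_positive_rate(features: Sequence[Tuple[float, float]],
                                  tags: Sequence[AmtTag]) -> float:
    """Ratio of features matching the shifted database to those matching the normal one."""
    shifted_tags = shift_database(tags)
    normal = sum(bool(match_feature(m, n, tags)) for m, n in features)
    shifted = sum(bool(match_feature(m, n, shifted_tags)) for m, n in features)
    return shifted / normal if normal else 0.0


if __name__ == "__main__":
    # Toy database and features, invented for illustration only.
    tags = [AmtTag("PEPTIDEA", 1000.0000, 0.30),
            AmtTag("PEPTIDEB", 1011.0005, 0.31),
            AmtTag("PEPTIDEC", 1500.0000, 0.60)]
    features = [(1000.001, 0.305), (1011.001, 0.31), (1500.002, 0.60)]
    print(f"Estimated false positive rate: "
          f"{estimated_false_positive_rate(features, tags):.2f}")
```

The toy data are deliberately contrived so that one feature also matches the shifted database; with a realistic database and tight mass and NET tolerances, the shifted matches should form only the low background seen in the mass error histograms.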

QUANTITATION STRATEGIES

The ability to quantitatively measure relative protein abundance differences between different clinical samples is essential for identifying candidate protein biomarkers; however, the vast majority of proteomics work related to biomarker discovery published to date has been qualitative, highlighting the need for more robust quantitative approaches for such applications. Our initial application for comparative proteome analysis of human plasma following lipopolysaccharide (LPS) administration involved a semiquantitative strategy based on the total number of peptide identifications per protein (peptide hits or spectrum count) (74). In this study, standard SCX-LC-MS/MS analysis was performed at the 0-h time point (control) and a 9-h time point following LPS administration, and peptide hits were used to obtain a relative quantitative measure between the control and 9-h time point. Several known inflammatory response and acute phase proteins were observed to be up-regulated upon LPS administration. Several other studies have shown that this peptide hits approach can be used as a semiquantitative approach for initial screening when applied with proper controls and with adequate thresholds (90–93).

FIG. 7. A, a partial 2D display of the detected 18O/16O-labeled peptide pairs from an LC-FTICR analysis. The elution time is shown as a normalized scale between 0 and 1. Observed peaks (represented by spots) correspond to various eluting peptides. The heavy and light isotope-labeled pairs are easily visualized with a 4-Da mass difference. B, normalized -fold changes for the 429 quantified proteins following LPS administration. The abundance ratio for each protein shown was normalized to zero (R − 1) (53). For ratios smaller than 1, normalized inverted ratios were calculated as 1 − (1/R). The error bar for each protein indicates the S.D. for the abundance ratios from multiple peptides. Proteins without error bars were identified with single peptides.

More recently, we have demonstrated 16O/18O labeling combined with the AMT tag strategy as an effective global quantitative approach for quantifying relative protein abundance differences in human plasma (31). By incubating tryptic peptides in 18O water (55, 94) in the presence of trypsin, the 18O atoms are incorporated into the carboxyl terminus of tryptically cleaved peptides via a postdigestion trypsin-catalyzed oxygen exchange reaction. The 16O/18O-labeled peptide pairs provide a 4-Da mass difference (Fig. 7A), which allows a high resolution mass spectrometer such as FTICR or TOF to effectively resolve the 16O- and 18O-labeled peptide pairs and accurately measure the relative abundances. The advantage is that all types of samples (e.g. tissues, cells, and biological fluids) can be effectively labeled using this simple and specific enzyme-catalyzed reaction. Fig. 7A shows a partial 2D display of detected peptide pairs in mass versus time dimensions. The 18O/16O-labeled peptides are readily visualized as co-eluting pairs (4 Da apart), and the abundance ratio can be precisely calculated for each 18O/16O pair. In this initial comparative analysis demonstration of two human plasma samples obtained from a healthy individual prior to (control) and following LPS administration, relative abundance differences between the two plasma samples were quantified for a total of 429 plasma proteins. Fig. 7B shows the normalized -fold changes in 429 quantified proteins and demonstrates the significant changes in abundance for a set of proteins following LPS administration. The combined 16O/18O labeling-AMT tag strategy can also be easily coupled with subsequent peptide-level fractionation approaches such as cysteinyl peptide enrichment (55) and SCX fractionation.
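The 4-Da spacing and the normalized -fold change used in Fig. 7B can be written out explicitly. The sketch assumes that two 18O atoms replace two 16O atoms at the peptide carboxyl terminus and reads the Fig. 7 normalization as R − 1 for ratios of at least 1 and 1 − (1/R) for ratios below 1; the function names are illustrative.

```python
# Monoisotopic masses of the two oxygen isotopes in Da.
MASS_O16 = 15.9949146
MASS_O18 = 17.9991610

# Two carboxyl-terminal oxygens are exchanged, giving a shift of about 4.0085 Da.
PAIR_MASS_SHIFT = 2 * (MASS_O18 - MASS_O16)


def pair_mz_spacing(charge: int) -> float:
    """m/z spacing between the light and heavy members of a pair at a given charge."""
    return PAIR_MASS_SHIFT / charge


def normalized_fold_change(ratio: float) -> float:
    """Map an abundance ratio R onto a scale centered at zero.

    R >= 1 gives R - 1 (increase); R < 1 gives 1 - 1/R (decrease, negative).
    """
    if ratio <= 0:
        raise ValueError("abundance ratio must be positive")
    return ratio - 1.0 if ratio >= 1.0 else 1.0 - 1.0 / ratio


if __name__ == "__main__":
    print(f"Mass shift: {PAIR_MASS_SHIFT:.4f} Da; spacing at 2+: {pair_mz_spacing(2):.4f}")
    for r in (4.0, 2.0, 1.0, 0.5, 0.25):
        print(f"R = {r}: normalized -fold change = {normalized_fold_change(r)}")
```

This normalization makes increases and decreases symmetric about zero: a 2-fold increase maps to +1 and a 2-fold decrease maps to −1.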

Other stable isotope labeling methods for relative peptide/protein abundance measurements, including metabolic labeling (95–97) and chemical labeling of specific functional groups using reagents such as ICAT (98) and iTRAQ (isobaric tags for relative and absolute quantitation) (99, 100), have been routinely used for quantitative proteomics analysis. In clinical proteomics applications, these stable isotope labeling techniques are well suited for accurately detecting changes in pairwise comparisons provided the samples can be effectively labeled; however, it is often challenging to compare across a large number of clinical samples. One alternative is the use of a labeled reference sample (often a pooled composite) that is spiked into each normally processed individual clinical sample; this allows relative quantitation between each clinical sample and the reference sample and cross-comparison among the entire set of clinical samples. The 18O labeling strategy is well suited for generating such a labeled reference sample as all other clinical samples can be processed with natural 16O on the carboxyl termini without labeling; 16O/18O peptide pairs are formed after spiking the samples with the 18O-labeled reference.

Alternatively, “label-free” direct quantitation approaches hold interest because of greater flexibility for comparative analyses and simpler sample processing procedures compared with labeling approaches. The isotope labeling and label-free approaches are complementary, and each approach has different sources of variations. Several initial studies suggest that the use of normalized LC-MS peak intensities for detected peptides can be used to compare relative abundances between similar complex samples (101–103). It has been demonstrated that abundance ratios of separate model proteins may be predicted to within ~20% in complex proteome digests by using measured peptide ion intensities obtained in LC-MS analyses (101). Among the main challenges for label-free quantitation are the multiple issues that affect the usefulness of peptide peak intensities for relative quantitation, such as differences in electrospray ionization efficiencies among different peptides and different samples (37), differences in the amount of sample injected in each analysis, and sample preparation reproducibility. These issues are often peptide-dependent, leading to observed disparity among relative abundances of different peptides originating from the same protein. The significant bias and ion suppression effects caused by charge competition (ionization bias) during ESI (104) are often considered a major limitation for accurate label-free quantitation. Recent studies have demonstrated substantial advantages for ESI-MS analyses at nanoflow regimes (<100 nl/min) afforded by narrower inner diameter capillary columns for separations (36, 37). It is well demonstrated that smaller inner diameter columns with lower flow rates provide significantly higher sensitivity than larger inner diameter columns with higher flow rates (34) because of the significant improvements in both ionization and MS sampling efficiencies. Reversed phase packed nanoscale LC and monolithic nanoscale LC separations have been developed and coupled to ESI for improved ionization and quantitation (34, 105). As ionization efficiencies are increased for nanoelectrospray, detection biases are decreased because undesired matrix effects and/or ion suppression effects are either reduced or eliminated (104–106), providing the basis for improved quantitation. With further improvements to the robustness of these nano-LC-ESI-MS systems, label-free quantitation may be widely applied in clinical applications.
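One simple way to compensate for run-to-run differences in injected amount before comparing peak intensities is median scaling. The sketch below is an assumption about how such a normalization could be done, not the procedure used in the cited studies; `median_normalize` and the toy intensities are hypothetical.

```python
from statistics import median
from typing import Dict


def median_normalize(runs: Dict[str, Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Scale each run so that its median peptide intensity equals a common target.

    runs maps a run name to {peptide: peak intensity}. The target is the
    median of the per-run medians.
    """
    run_medians = {name: median(peaks.values()) for name, peaks in runs.items()}
    target = median(run_medians.values())
    return {
        name: {pep: intensity * target / run_medians[name]
               for pep, intensity in peaks.items()}
        for name, peaks in runs.items()
    }


if __name__ == "__main__":
    # Toy data: run B is the same sample as run A injected at twice the amount.
    runs = {
        "A": {"pep1": 100.0, "pep2": 200.0, "pep3": 400.0},
        "B": {"pep1": 200.0, "pep2": 400.0, "pep3": 800.0},
    }
    for name, peaks in median_normalize(runs).items():
        print(name, peaks)
```

A global scaling of this kind corrects only for effects shared by all peptides in a run; peptide-dependent effects such as ion suppression are not removed by it.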

Another challenge for quantitative clinical proteomics applications is the variability introduced during multiple steps of sample processing. With continued development of cleanup products for more consistent performance and automated sample processing, such reproducibility issues may be minimized, leading to further improvements in quantitation when applying either the stable isotope labeling or label-free approaches.

IMPLICATIONS OF HUMAN HETEROGENEITY IN CLINICAL PROTEOMICS STUDIES

The ability to identify disease-specific differences by using a proteomics approach relies on multiple factors integral to the overall analysis pipeline. For example, when performing peptide-level measurements, achieving high peptide identification quality is a prerequisite for assuring confidence in all other downstream parameters (i.e. confidence in both protein identification and quantitation), whereas the ability to quantify differences between any two samples largely depends on the reproducibility of the overall platform. Due to inherent variations that stem from sample preparation and instrument analysis, technical replicates are often performed to evaluate and minimize technical variability arising from the overall analysis pipeline. Technical variability will be minimized as technologies continue to mature, and platforms will likely become more robust and reproducible; however, biological variability within the same comparative groups remains a challenge for identifying real differences between different conditions. Although ideally one would like to either control or minimize such biological variability by utilizing more controlled model systems such as cell cultures, an in vitro model system, or even inbred mouse strains, this is not always possible. Most clinical studies are based on “real world” human clinical samples where inherent human individual heterogeneity makes discovery efforts more difficult. The human heterogeneity challenge in proteomics studies stems from the high probability that two equally “healthy” individuals will have overall significantly different individual protein abundance levels when sampled at any given time. This heterogeneity can be due to individual genetic variability (e.g. gender, race, etc.) and/or to contributing environmental factors such as diet, overall health, detrimental environmental exposures, etc. The complexity of human diseases presents another degree of challenge. For example, in human cancer, each tumor type typically consists of a number of subtypes that differ with regard to their spectrum of genetic alterations (107). Therefore, a potential candidate biomarker of disease may be elevated only in a certain percentage of the pool of disease patients.

The implications of human heterogeneity in the context of LC-MS-based proteomics experiments center mostly on the measured quantitative values for peptide/protein identifications. Fig. 8 shows an initial evaluation of the technical variation and biological variations of human and mouse plasma samples based on the Pearson correlation of the identified peptide intensities between any two individual samples. The technical replicate results (Fig. 8A; nine individually processed samples from one pooled reference plasma) show overall good correlation (0.94 ± 0.02), which suggests relatively good reproducibility of the overall analytical platform. The increased variation among human subjects (Fig. 8B) is evident from the significantly reduced average correlation coefficients (0.85 ± 0.06) compared with the technical replicate results, whereas mouse plasma samples (Fig. 8C) show only slightly reduced correlation (0.92 ± 0.05), which suggests relatively small biological variation in these inbred mouse models. Such large variations observed among different healthy control subjects present a challenge for identifying disease-specific differences. To address these challenges and increase the confidence of discovery results, it is essential for the discovery platform to be able to analyze a relatively large number of clinical samples in a high throughput manner to obtain sufficient statistical power.
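The pairwise comparison underlying Fig. 8 can be sketched as follows. This is a minimal illustration under stated assumptions: each sample is represented as a mapping from peptide to intensity, only peptides observed in both samples of a pair are compared, and raw (untransformed) intensities are used. The function names and toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(
        samples: Dict[str, Dict[str, float]]) -> Dict[Tuple[str, str], float]:
    """Correlate peptide intensities for every pair of samples (shared peptides only)."""
    result = {}
    for (name_a, a), (name_b, b) in combinations(samples.items(), 2):
        shared = sorted(set(a) & set(b))
        result[(name_a, name_b)] = pearson([a[p] for p in shared],
                                           [b[p] for p in shared])
    return result


if __name__ == "__main__":
    # Toy intensities, invented for illustration only.
    samples = {
        "s1": {"pepA": 10.0, "pepB": 20.0, "pepC": 35.0, "pepD": 50.0},
        "s2": {"pepA": 12.0, "pepB": 18.0, "pepC": 40.0, "pepD": 47.0},
        "s3": {"pepA": 30.0, "pepB": 15.0, "pepC": 20.0, "pepD": 45.0},
    }
    for pair, r in pairwise_correlations(samples).items():
        print(pair, round(r, 3))
```

Summarizing the resulting coefficients as a mean and standard deviation over all pairs gives values of the form reported above for technical replicates, human subjects, and mice.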

Other proteomics studies have also described the effects of human heterogeneity in specific model systems. Hu et al. (15) performed a limited study that compared both intra- and interindividual variability of human cerebrospinal fluid samples obtained from six individuals. Specific proteins were observed to fluctuate over time within the same individual, but overall there was a higher concordance of intraindividual results than across individuals. Interestingly, results from measuring intraindividual protein levels suggested that certain proteins tended to fluctuate more than others, calling into question the effectiveness of using these proteins as potential disease markers. Other studies include a report by Zhan and Desiderio (108) that showed the heterogeneity in 2D gel electrophoresis human pituitary proteome analysis and an interesting review by Mann et al. (109) that overviewed the effects of genotypic and phenotypic variations in evaluations of the hemostatic proteome. They reported that “normal” pro- and anticoagulant concentrations were observed to vary significantly and influence downstream responses, demonstrating how heterogeneity in individual phenotypes should influence diagnosis and therapy for hemorrhagic and thrombotic diseases.

FIG. 8. Pearson correlation plot comparing peptide intensities of LC-FTICR analyses of plasma samples. A, nine technical replicates for a pooled reference human plasma sample from multiple healthy subjects. B, nine human plasma samples from individual healthy subjects with ages ranging from 18 to 26. C, nine mouse plasma samples isolated from individual C57BL6 mice. Each sample including the technical replicate was separately processed by ProteomeLab IgY-12 (for human) or IgY-R7 (for mouse) depletion, and the flow-through portions were digested with trypsin prior to LC-MS analyses.



Designing experiments to minimize biological variability is imperative for clinical studies. One example is to analyze a serial sample set, i.e. plasma or biopsy tissue samples, from the same individual over a time course or disease progression; this in theory will alleviate a majority of heterogeneity effects, but such samples are traditionally more difficult to obtain, and most patients do not have a “control” blood or tissue sample in storage for comparison against a possible disease diagnosis. For most studies that use cross-sectional approaches, it is desirable to match the patients and controls in terms of age, sex, race, weight, and even diet if possible. A recent study reported the potential utility of pooling for reducing the effects of biological variation in microarray studies: when biological replicates are retained in the study design, pooling retains the accuracy of identifying differentially expressed genes and provides the additional benefit of a great reduction in the total number of samples to be analyzed (110). Such a strategy might be explored and extended to clinical proteomics studies.

A further implication in heterogeneity is the presence of protein isoforms, splice variants, specific amino acid mutations, proteolytic products, and other post-translational modifications that are likely present in individual samples but are most often not explicitly included as sequences in the searchable protein database. This exclusion makes it challenging for traditional LC-MS/MS-based bottom-up approaches to identify such modified proteins and is possibly one of the main reasons that a large percentage of MS/MS spectra in clinical analyses remain unidentified. The identification of amino acid-specific post-translational modifications (e.g. phosphorylation, glycosylation, glycation, nitration, oxidation, and deamination) challenges MS/MS-based approaches due to the vast variety of possible modifications and the potential high false positive rates that originate from database searching. Because it is recognized that many protein biomarkers may be specific protein isoforms or modified proteins, further technical developments for more effective identification and quantitation of protein isoforms and modifications would be greatly desirable.

As an alternative to identifying protein isoforms and modifications, intact protein-level separations can be used to separate different protein isoforms on the basis of their different masses or other properties. The ability to use 2D gel electrophoresis for resolving different isoforms and monitoring their abundance changes has been well documented (111). The recently developed multidimensional intact protein analysis system (IPAS) separates intact proteins on the basis of charge, hydrophobicity, and molecular mass; quantitation is achieved by protein tagging with fluorophores (43). The potential for revealing different protein isoforms and specific protein cleavage products in human plasma/serum also has been demonstrated (49). The advantages offered by intact protein analysis complement the bottom-up proteomics approaches, and better integration of these two approaches may lead to more effective biomarker discovery.

TARGETED PROTEOMICS APPROACHES

The majority of proteomics applications in the search for candidate biomarkers to date have focused on global proteome characterization aimed at identifying multiple protein differences (candidate biomarkers) that correlate with specific human diseases; however, as discussed previously, there are many challenges associated with applying such a strategy to the discovery of low abundance candidate marker proteins. An alternative strategy for biomarker discovery that complements global profiling is the targeted proteomics approach that involves quantitative MS to measure a hypothesis-generated list of candidates (112). The targeted proteomics strategy often provides greater sensitivity and allows for detection of low abundance candidate proteins. Anderson and Hunter (113) recently demonstrated the use of peptide multiple reaction monitoring (MRM) for quantitative assaying of major plasma proteins. Such MRM assays provide great specificity for peptide/protein identifications and relatively good precision for quantitation. Additionally, MRM can provide a rapid and specific platform for biomarker validation, particularly when coupled with specific enrichment techniques such as the recently published SISCAPA (Stable Isotope Standards and Capture by Anti-Peptide Antibodies) method for enriching target peptides using anti-peptide antibodies (114). Activity-based protein profiling is another strategy that uses chemical probes for tagging, enriching, and isolating a specific subset of physiologically important proteins on the basis of enzymatic activity (115, 116). Coupling such strategies with LC-MS holds potential for eliminating many issues related to the dynamic range of protein abundance.

A continuing issue for current LC-MS-based profiling approaches is that many of the detected species or features from LC-MS and LC-MS/MS analyses remain unidentified. Based on our experience, ~80% of MS/MS spectra on average are not confidently identified via database searching, and more than 50% of LC-FTICR-detected features remain unidentified by the AMT tag approach. Present informatics tools and statistical algorithms have been able to utilize intensity information of these unidentified features to identify “interesting” features as potential biomarkers for specific diseases; effectively targeting these interesting features using data-directed or targeted MS/MS approaches is of current interest. One of the informatics challenges associated with identifying these features concerns different post-translational modifications. Current commercial mass spectrometers such as the LTQ offer a targeted MS/MS capability based on the selection of a list of m/z values. Developing an advanced targeted MS/MS approach (117) that incorporates “smart selection” of the targets and different but complementary fragmentation techniques will be an integral component for an effective LC-MS profiling platform suitable for clinical applications.
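As an illustration of how a list of m/z values for targeted MS/MS might be assembled from unidentified features, the following Python sketch converts neutral monoisotopic masses into m/z values for a few assumed charge states and ranks the targets by a user-supplied score. The feature tuples, the scoring, the charge states, and the m/z window are assumptions made for illustration; they do not describe the target-list format of any particular instrument or the approach of Ref. 117.

```python
PROTON_MASS = 1.007276466  # mass of a proton in Da


def mz_from_mass(neutral_mass, charge):
    """m/z of the protonated ion [M + zH]z+ for a neutral monoisotopic mass."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def build_inclusion_list(features, charges=(2, 3), mz_window=(400.0, 2000.0), top_n=None):
    """Build (feature_id, charge, m/z) targets from unidentified features.

    features: iterable of (feature_id, neutral_mass, score), where a higher
    score marks a more "interesting" feature (hypothetical ranking).
    """
    ranked = sorted(features, key=lambda f: f[2], reverse=True)
    if top_n is not None:
        ranked = ranked[:top_n]
    targets = []
    for feature_id, mass, _score in ranked:
        for z in charges:
            mz = mz_from_mass(mass, z)
            # Keep only ions that fall inside the assumed scan range.
            if mz_window[0] <= mz <= mz_window[1]:
                targets.append((feature_id, z, round(mz, 4)))
    return targets


# Hypothetical features: (id, neutral monoisotopic mass in Da, score)
for target in build_inclusion_list([("f1", 1000.0, 5.0), ("f2", 3000.0, 9.0)]):
    print(target)
```

Ranking by a single score is only a placeholder for “smart selection”; in practice the ranking could draw on statistical significance, abundance change, or charge-state evidence.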

CONCLUSIONS AND PERSPECTIVES

The amount of effort placed into the development and application of effective proteomics profiling of serum/plasma and other clinical samples has increased tremendously over the last several years. With the emergence of more effective LC-MS technologies and the variety of fractionation approaches, the number of proteins detectable in human plasma by global profiling has been greatly expanded (e.g. 889 proteins with ≥95% confidence reported in the recent HUPO study and 1,494 proteins with ≥99% confidence, including confident identification of many low ng/ml level plasma proteins, in our recent study (59)). Although this level of detection still falls short of the 10 orders of magnitude in dynamic range that encompasses plasma protein abundances, it offers significant potential for the discovery of novel candidate biomarkers from clinical plasma/serum samples.

Currently there is no single platform that represents the “best” technology for such discovery applications, and integration of multiple technologies is often required for detection and quantitation of low abundance proteins. The need for improved reproducibility, throughput, dynamic range, and quantitation will continue to drive technology development and improvement efforts. Importantly, several new technological developments such as fast LC separations, gas phase IMS separations, and high efficiency nano-ESI interfaces presently appear promising for future discovery platforms and applications. With improvements in quantitation accuracy, throughput, and robustness, the LC-MS protein profiling platform may eventually become a powerful tool for clinical diagnostic testing that provides simultaneous measurements of a large number of clinically relevant analytes.

An important component of any integrated profiling platform not previously discussed is the informatics and statistical analysis. The development of more effective software packages will be essential for processing the large number of LC-MS datasets; such processing may include peak (or feature) detection, run-to-run feature alignment, intensity normalization, feature matching to the database, and statistical analysis to generate a list of high confidence potential candidates.
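One of the steps listed above, feature matching to the database, can be sketched as follows. This minimal Python example assigns an LC-MS feature to database peptides when the monoisotopic mass agrees within a ppm tolerance and the normalized elution time (NET) agrees within a fixed window, in the spirit of the AMT tag approach. The tolerances, data layout, and peptide names are hypothetical and do not reproduce any published software.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def match_features(features, database, ppm_tol=5.0, net_tol=0.02):
    """Match features to database entries by mass and elution time.

    features: iterable of (feature_id, monoisotopic_mass, net)
    database: iterable of (peptide, monoisotopic_mass, net)
    Returns a dict mapping each feature_id to a list of matching peptides;
    an empty list means the feature remains unidentified.
    """
    database = list(database)
    matches = {}
    for feature_id, mass, net in features:
        matches[feature_id] = [
            peptide
            for peptide, db_mass, db_net in database
            if abs(ppm_error(mass, db_mass)) <= ppm_tol
            and abs(net - db_net) <= net_tol
        ]
    return matches


# Hypothetical database entries and observed features.
database = [("PEPTIDEA", 1000.0000, 0.30), ("PEPTIDEB", 1000.0040, 0.50)]
features = [("f1", 1000.0020, 0.31), ("f2", 1500.0000, 0.70)]
print(match_features(features, database))
```

Here both database peptides lie within the mass tolerance of feature f1, and only the elution time distinguishes them, which is the reason for combining the two dimensions. The linear scan is adequate for a sketch; a large database would call for a sorted or indexed lookup.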

Finally, due to the complexity of large scale clinical proteomics studies, collaborative efforts from multiple laboratories with different platforms may be required for benchmarking and better cross-validation of the discovery results and for eliminating potential biases introduced by any given platform. This implies that a common set of standards is needed so that platform performance in different laboratories may be readily compared and large scale proteomics datasets can be effectively exchanged and shared.

Acknowledgments—The contributions of Marina Gritsenko, Hongliang Jiang, Matt Monroe, Ron Moore, Tom Metz, Angela Norbeck, Sam Purvine, and Yufeng Shen to the work reviewed here are gratefully acknowledged.

* Portions of the reviewed research were supported by the United States Department of Energy (DOE) Office of Biological and Environmental Research; the National Institutes of Health through the National Center for Research Resources Grant RR018522, NIGMS Large Scale Collaborative Research Grant U54 GM-62119-02, NIDDK Grant R21 DK070146, and NIDA Grant 1P30DA01562501; the Entertainment Industry Foundation (EIF) and the EIF Women’s Cancer Research Fund; and the Laboratory Directed Research Development program at Pacific Northwest National Laboratory. Our laboratories are located in the Environmental Molecular Sciences Laboratory, a national scientific user facility sponsored by the DOE and located at Pacific Northwest National Laboratory, which is operated by Battelle Memorial Institute for the DOE under Contract DE-AC05-76RL0 1830. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

‡ To whom correspondence should be addressed: Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, P. O. Box 999, MSIN: K8-98, Richland, WA 99352. E-mail: [email protected].

REFERENCES

1. Aebersold, R., and Mann, M. (2003) Mass spectrometry-based proteomics. Nature 422, 198–207

2. Hanash, S. (2003) Disease proteomics. Nature 422, 226–232

3. Etzioni, R., Urban, N., Ramsey, S., McIntosh, M., Schwartz, S., Reid, B., Radich, J., Anderson, G., and Hartwell, L. (2003) The case for early detection. Nat. Rev. Cancer 3, 243–252

4. Ludwig, J. A., and Weinstein, J. N. (2005) Biomarkers in cancer staging, prognosis and treatment selection. Nat. Rev. Cancer 5, 845–856

5. Anderson, N. L., and Anderson, N. G. (2002) The human plasma proteome: history, character, and diagnostic prospects. Mol. Cell. Proteomics 1, 845–867

6. Zhou, G., Li, H., DeCamp, D., Chen, S., Shu, H., Gong, Y., Flaig, M., Gillespie, J. W., Hu, N., Taylor, P. R., Emmert-Buck, M. R., Liotta, L. A., Petricoin, E. F., III, and Zhao, Y. (2002) 2D differential in-gel electrophoresis for the identification of esophageal scans cell cancer-specific protein markers. Mol. Cell. Proteomics 1, 117–124

7. Zangar, R. C., Varnum, S. M., and Bollinger, N. (2005) Studying cellular processes and detecting disease with protein microarrays. Drug Metab. Rev. 37, 473–487

8. Janzi, M., Odling, J., Pan-Hammarstrom, Q., Sundberg, M., Lundeberg, J., Uhlen, M., Hammarstrom, L., and Nilsson, P. (2005) Serum microarrays for large scale screening of protein levels. Mol. Cell. Proteomics 4, 1942–1947

9. Uhlen, M., Bjorling, E., Agaton, C., Szigyarto, C. A., Amini, B., Andersen, E., Andersson, A. C., Angelidou, P., Asplund, A., Asplund, C., Berglund, L., Bergstrom, K., Brumer, H., Cerjan, D., Ekstrom, M., Elobeid, A., Eriksson, C., Fagerberg, L., Falk, R., Fall, J., Forsberg, M., Bjorklund, M. G., Gumbel, K., Halimi, A., Hallin, I., Hamsten, C., Hansson, M., Hedhammar, M., Hercules, G., Kampf, C., Larsson, K., Lindskog, M., Lodewyckx, W., Lund, J., Lundeberg, J., Magnusson, K., Malm, E., Nilsson, P., Odling, J., Oksvold, P., Olsson, I., Oster, E., Ottosson, J., Paavilainen, L., Persson, A., Rimini, R., Rockberg, J., Runeson, M., Sivertsson, A., Skollermo, A., Steen, J., Stenvall, M., Sterky, F., Stromberg, S., Sundberg, M., Tegel, H., Tourle, S., Wahlund, E., Walden, A., Wan, J., Wernerus, H., Westberg, J., Wester, K., Wrethagen, U., Xu, L. L., Hober, S., and Ponten, F. (2005) A human protein atlas for normal and cancer tissues based on antibody proteomics. Mol. Cell. Proteomics 4, 1920–1932

10. Adkins, J. N., Varnum, S. M., Auberry, K. J., Moore, R. J., Angell, N. H., Smith, R. D., Springer, D. L., and Pounds, J. G. (2002) Toward a human blood serum proteome: analysis by multidimensional separation coupled with mass spectrometry. Mol. Cell. Proteomics 1, 947–955

11. Jacobs, J. M., Adkins, J. N., Qian, W. J., Liu, T., Shen, Y., Camp, D. G., II, and Smith, R. D. (2005) Utilizing human blood plasma for proteomic biomarker discovery. J. Proteome Res. 4, 1073–1085

12. Veenstra, T. D., Conrads, T. P., Hood, B. L., Avellino, A. M., Ellenbogen, R. G., and Morrison, R. S. (2005) Biomarkers: mining the biofluid proteome. Mol. Cell. Proteomics 4, 409–418

13. Lee, H. J., Lee, E. Y., Kwon, M. S., and Paik, Y. K. (2006) Biomarker discovery from the plasma proteome using multidimensional fractionation proteomics. Curr. Opin. Chem. Biol. 10, 42–49

14. Wright, M. E., Han, D. K., and Aebersold, R. (2005) Mass spectrometry-based expression profiling of clinical prostate cancer. Mol. Cell. Proteomics 4, 545–554

15. Hu, Y., Malone, J. P., Fagan, A. M., Townsend, R. R., and Holtzman, D. M. (2005) Comparative proteomic analysis of intra- and interindividual variation in human cerebrospinal fluid. Mol. Cell. Proteomics 4, 2000–2009

16. Wattiez, R., and Falmagne, P. (2005) Proteomics of bronchoalveolar lavage fluid. J. Chromatogr. B Anal. Technol. Biomed. Life Sci. 815, 169–178

17. Liao, H., Wu, J., Kuhn, E., Chin, W., Chang, B., Jones, M. D., O’Neil, S., Clauser, K. R., Karl, J., Hasler, F., Roubenoff, R., Zolg, W., and Guild, B. C. (2004) Use of mass spectrometry to identify protein biomarkers of disease severity in the synovial fluid and serum of patients with rheumatoid arthritis. Arthritis Rheum. 50, 3792–3803

18. Varnum, S. M., Covington, C. C., Woodbury, R. L., Petritis, K., Kangas, L. J., Abdullah, M. S., Pounds, J. G., Smith, R. D., and Zangar, R. C. (2003) Proteomic characterization of nipple aspirate fluid: identification of potential biomarkers of breast cancer. Breast Cancer Res. Treat. 80, 87–97

19. Xie, H., Rhodus, N. L., Griffin, R. J., Carlis, J. V., and Griffin, T. J. (2005) A catalogue of human saliva proteins identified by free flow electrophoresis-based peptide separation and tandem mass spectrometry. Mol. Cell. Proteomics 4, 1826–1830

20. Theodorescu, D., Wittke, S., Ross, M. M., Walden, M., Conaway, M., Just, I., Mischak, H., and Frierson, H. F. (2006) Discovery and validation of new protein biomarkers for urothelial cancer: a prospective analysis. Lancet Oncol. 7, 230–240

21. Celis, J. E., Gromov, P., Cabezon, T., Moreira, J. M., Ambartsumian, N., Sandelin, K., Rank, F., and Gromova, I. (2004) Proteomic characterization of the interstitial fluid perfusing the breast tumor microenvironment: a novel resource for biomarker and therapeutic target discovery. Mol. Cell. Proteomics 3, 327–344

22. Yates, J. R., III, Eng, J. K., and McCormack, A. L. (1995) Mining genomes: correlating tandem mass spectra of modified and unmodified peptides to sequences in nucleotide databases. Anal. Chem. 67, 3202–3210

23. Perkins, D. N., Pappin, D. J., Creasy, D. M., and Cottrell, J. S. (1999) Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551–3567

24. Craig, R., and Beavis, R. C. (2004) TANDEM: matching proteins with tandem mass spectra. Bioinformatics 20, 1466–1467

25. Mayya, V., Rezaul, K., Cong, Y. S., and Han, D. (2005) Systematic comparison of a two-dimensional ion trap and a three-dimensional ion trap mass spectrometer in proteomics. Mol. Cell. Proteomics 4, 214–223

26. Wolters, D. A., Washburn, M. P., and Yates, J. R. (2001) An automated multidimensional protein identification technology for shotgun proteomics. Anal. Chem. 73, 5683–5690

27. Wang, H., Qian, W. J., Chin, M. H., Petyuk, V. A., Barry, R. C., Liu, T., Gritsenko, M. A., Mottaz, H. M., Moore, R. J., Camp, D. G., II, Khan, A. H., Smith, D. J., and Smith, R. D. (2006) Characterization of the mouse brain proteome using global proteomic analysis complemented with cysteinyl-peptide enrichment. J. Proteome Res. 5, 361–369

28. Tabb, D. L., MacCoss, M. J., Wu, C. C., Anderson, S. D., and Yates, J. R. (2003) Similarity among tandem mass spectra from proteomic experiments: detection, significance, and utility. Anal. Chem. 75, 2470–2477

29. Smith, R. D., Anderson, G. A., Lipton, M. S., Pasa-Tolic, L., Shen, Y., Conrads, T. P., Veenstra, T. D., and Udseth, H. R. (2002) An accurate mass tag strategy for quantitative and high throughput proteome measurements. Proteomics 2, 513–523

30. Qian, W. J., Camp, D. G., and Smith, R. D. (2004) High throughput proteomics using Fourier transform ion cyclotron resonance (FTICR) mass spectrometry. Expert Rev. Proteomics 1, 89–97

31. Qian, W. J., Monroe, M. E., Liu, T., Jacobs, J. M., Anderson, G. A., Shen, Y., Moore, R. J., Anderson, D. J., Zhang, R., Calvano, S. E., Lowry, S. F., Xiao, W., Moldawer, L. L., Davis, R. W., Tompkins, R. G., Camp, D. G., and Smith, R. D. (2005) Quantitative proteome analysis of human plasma following in vivo lipopolysaccharide administration using 16O/18O labeling and the accurate mass and time tag approach. Mol. Cell. Proteomics 4, 700–709

32. Qian, W. J., Liu, T., Monroe, M. E., Strittmatter, E. F., Jacobs, J. M., Kangas, L. J., Petritis, K., Camp, D. G., and Smith, R. D. (2005) Probability-based evaluation of peptide and protein identifications from tandem mass spectrometry and SEQUEST analysis: the human proteome. J. Proteome Res. 4, 53–62

33. Tolley, L., Jorgenson, J. W., and Moseley, M. A. (2001) Very high pressure gradient LC/MS/MS. Anal. Chem. 73, 2985–2991

34. Shen, Y., Zhao, R., Berger, S. J., Anderson, G. A., Rodriguez, N., and Smith, R. D. (2002) High-efficiency nanoscale liquid chromatography coupled on-line with mass spectrometry using nanoelectrospray ionization for proteomics. Anal. Chem. 74, 4235–4249

35. Shen, Y., Zhang, R., Moore, R. J., Kim, J., Metz, T. O., Hixson, K. K., Zhao, R., Livesay, E. A., Udseth, H. R., and Smith, R. D. (2005) Automated 20 kpsi RPLC-MS and MS/MS with chromatographic peak capacities of 1000–1500 and capabilities in proteomics and metabolomics. Anal. Chem. 77, 3090–3100

36. Wilm, M. S., and Mann, M. (1994) Electrospray and Taylor-Cone theory, Dole’s beam of macromolecules at last? Int. J. Mass Spectrom. Ion Process. 136, 167–180

37. Smith, R. D., Shen, Y., and Tang, K. (2004) Ultrasensitive and quantitative analyses from combined separations-mass spectrometry for the characterization of proteomes. Acc. Chem. Res. 37, 269–278

38. Zolotarjova, N., Martosella, J., Nicol, G., Bailey, J., Boyes, B. E., and Barrett, W. C. (2005) Differences among techniques for high-abundant protein depletion. Proteomics 5, 3304–3313

39. Huang, L., Harvie, G., Feitelson, J. S., Gramatikoff, K., Herold, D. A., Allen, D. L., Amunugama, R., Hagler, R. A., Pisano, M. R., Zhang, W. W., and Fang, X. (2005) Immunoaffinity separation of plasma proteins by IgY microbeads: meeting the needs of proteomic sample preparation and analysis. Proteomics 5, 3314–3328

40. Echan, L. A., Tang, H. Y., Ali-Khan, N., Lee, K., and Speicher, D. W. (2005) Depletion of multiple high-abundance proteins improves protein profiling capacities of human serum and plasma. Proteomics 5, 3292–3303

41. Cho, S. Y., Lee, E. Y., Lee, J. S., Kim, H. Y., Park, J. M., Kwon, M. S., Park, Y. K., Lee, H. J., Kang, M. J., Kim, J. Y., Yoo, J. S., Park, S. J., Cho, J. W., Kim, H. S., and Paik, Y. K. (2005) Efficient prefractionation of low-abundance proteins in human plasma and construction of a two-dimensional map. Proteomics 5, 3386–3396

42. Liu, T., Qian, W. J., Mottaz, H. M., Gritsenko, M. A., Norbeck, A. D., Moore, R. J., Purvine, S. O., Camp, D. G., II, and Smith, R. D. (July 19, 2006) Evaluation of multiprotein immunoaffinity subtraction for plasma proteomics and candidate biomarker discovery using mass spectrometry. Mol. Cell. Proteomics 10.1074/mcp.T600039-MCP200

43. Wang, H., Clouthier, S. G., Galchev, V., Misek, D. E., Duffner, U., Min, C. K., Zhao, R., Tra, J., Omenn, G. S., Ferrara, J. L., and Hanash, S. M. (2005) Intact-protein-based high-resolution three-dimensional quantitative analysis system for proteome profiling of biological fluids. Mol. Cell. Proteomics 4, 618–625

44. Wang, H., and Hanash, S. (2005) Intact-protein based sample preparation strategies for proteome analysis in combination with mass spectrometry. Mass Spectrom. Rev. 24, 413–426

45. Sheng, S., Chen, D., and Van Eyk, J. E. (2006) Multidimensional liquid chromatography separation of intact proteins by chromatographic focusing and reversed phase of the human serum proteome: optimization and protein database. Mol. Cell. Proteomics 5, 26–34

46. Barnea, E., Sorkin, R., Ziv, T., Beer, I., and Admon, A. (2005) Evaluation of prefractionation methods as a preparatory step for multidimensional based chromatography of serum proteins. Proteomics 5, 3367–3375

47. Moritz, R. L., Clippingdale, A. B., Kapp, E. A., Eddes, J. S., Ji, H., Gilbert, S., Connolly, L. M., and Simpson, R. J. (2005) Application of 2-D free-flow electrophoresis/RP-HPLC for proteomic analysis of human plasma depleted of multi high-abundance proteins. Proteomics 5, 3402–3413

48. Heller, M., Michel, P. E., Morier, P., Crettaz, D., Wenz, C., Tissot, J. D., Reymond, F., and Rossier, J. S. (2005) Two-stage Off-Gel isoelectric focusing: protein followed by peptide fractionation and application to proteome analysis of human plasma. Electrophoresis 26, 1174–1188

49. Misek, D. E., Kuick, R., Wang, H., Galchev, V., Deng, B., Zhao, R., Tra, J., Pisano, M. R., Amunugama, R., Allen, D., Walker, A. K., Strahler, J. R., Andrews, P., Omenn, G. S., and Hanash, S. M. (2005) A wide range of protein isoforms in serum and plasma uncovered by a quantitative intact protein analysis system. Proteomics 5, 3343–3352

50. Tang, H. Y., Ali-Khan, N., Echan, L. A., Levenkova, N., Rux, J. J., and Speicher, D. W. (2005) A novel four-dimensional strategy combining protein and peptide separation methods enables detection of low-abundance proteins in human plasma and serum proteomes. Proteomics 5, 3329–3342

51. Herbert, B., and Righetti, P. G. (2000) A turning point in proteome analysis: sample prefractionation via multicompartment electrolyzers with isoelectric membranes. Electrophoresis 21, 3639–3648

52. Tu, C. J., Dai, J., Li, S. J., Sheng, Q. H., Deng, W. J., Xia, Q. C., and Zeng, R. (2005) High-sensitivity analysis of human plasma proteome by immobilized isoelectric focusing fractionation coupled to mass spectrometry identification. J. Proteome Res. 4, 1265–1273

53. Andersen, J. S., Lam, Y. W., Leung, A. K., Ong, S. E., Lyon, C. E., Lamond, A. I., and Mann, M. (2005) Nucleolar proteome dynamics. Nature 433, 77–83

54. Jin, W. H., Dai, J., Li, S. J., Xia, Q. C., Zou, H. F., and Zeng, R. (2005) Human plasma proteome analysis by multidimensional chromatography prefractionation and linear ion trap mass spectrometry identification. J. Proteome Res. 4, 613–619

55. Liu, T., Qian, W. J., Strittmatter, E. F., Camp, D. G., Anderson, G. A., Thrall, B. D., and Smith, R. D. (2004) High throughput comparative proteome analysis using a quantitative cysteinyl-peptide enrichment technology. Anal. Chem. 76, 5345–5353

56. Zhang, H., Li, X.-j., Martin, D. B., and Aebersold, R. (2003) Identification and quantification of N-linked glycoproteins using hydrazide chemistry, stable isotope labeling and mass spectrometry. Nat. Biotechnol. 21, 660–665

57. Liu, T., Qian, W. J., Gritsenko, M. A., Camp, D. G., II, Monroe, M. E., Moore, R. J., and Smith, R. D. (2005) Human plasma N-glycoproteome analysis by immunoaffinity subtraction, hydrazide chemistry, and mass spectrometry. J. Proteome Res. 4, 2070–2080

58. Yang, Z. P., Hancock, W. S., Chew, T. R., and Bonilla, L. (2005) A study of glycoproteins in human serum and plasma reference standards (HUPO) using multilectin affinity chromatography coupled with RPLC-MS/MS. Proteomics 5, 3353–3366

59. Liu, T., Qian, W. J., Gritsenko, M. A., Xiao, W., Moldawer, L. L., Kaushal, A., Monroe, M. E., Varnum, S. M., Moore, R. J., Purvine, S. O., Maier, R. V., Davis, R. W., Tompkins, R. G., Camp, D. G., II, and Smith, R. D. (June 8, 2006) High dynamic range characterization of the trauma patient plasma proteome. Mol. Cell. Proteomics 10.1074/mcp.M600068-MCP200

60. Shen, Y., Smith, R. D., Unger, K. K., Kumar, D., and Lubda, D. (2005) Ultrahigh-throughput proteomics using fast RPLC separations with ESI-MS/MS. Anal. Chem. 77, 6692–6701

61. Chen, H. S., Rejtar, T., Andreev, V., Moskovets, E., and Karger, B. L. (2005) High-speed, high-resolution monolithic capillary LC-MALDI MS using an off-line continuous deposition interface for proteomic analysis. Anal. Chem. 77, 2323–2331

62. Xie, J., Miao, Y., Shih, J., Tai, Y. C., and Lee, T. D. (2005) Microfluidic platform for liquid chromatography-tandem mass spectrometry analyses of complex peptide mixtures. Anal. Chem. 77, 6947–6953

63. He, B., and Regnier, F. (1998) Microfabricated liquid chromatography columns based on collocated monolith support structures. J. Pharm. Biomed. Anal. 17, 925–932

64. Li, J., LeRiche, T., Tremblay, T. L., Wang, C., Bonneil, E., Harrison, D. J., and Thibault, P. (2002) Application of microfluidic devices to proteomics research: identification of trace-level protein digests and affinity capture of target peptides. Mol. Cell. Proteomics 1, 157–168

65. Srebalus, C. A., Li, J., Marshall, W. S., and Clemmer, D. E. (2000) Determining synthetic failures in combinatorial libraries by hybrid gas-phase separation methods. J. Am. Soc. Mass Spectrom. 11, 352–355

66. Henderson, S. C., Valentine, S. J., Counterman, A. E., and Clemmer, D. E. (1999) ESI/ion trap/ion mobility/time-of-flight mass spectrometry for rapid and sensitive analysis of biomolecular mixtures. Anal. Chem. 71, 291–301

67. Valentine, S. J., Kulchania, M., Srebalus Barnes, C. A., and Clemmer, D. E. (2001) Multidimensional separations of complex peptide mixtures: a combined high-performance liquid chromatography/ion mobility/time-of-flight mass spectrometry approach. Int. J. Mass Spectrom. 212, 97–109

68. Tang, K., Shvartsburg, A. A., Lee, H. N., Prior, D. C., Buschbach, M. A., Li, F., Tolmachev, A. V., Anderson, G. A., and Smith, R. D. (2005) High-sensitivity ion mobility spectrometry/mass spectrometry using electrodynamic ion funnel interfaces. Anal. Chem. 77, 3330–3339

69. Shen, Y., Jacobs, J. M., Camp, D. G., Fang, R., Moore, R. J., Smith, R. D., Xiao, W., Davis, R. W., and Tompkins, R. G. (2004) High efficiency SCXLC/RPLC/MS/MS for high dynamic range characterization of the human plasma proteome. Anal. Chem. 76, 1134–1144

70. Anderson, N. L., Polanski, M., Pieper, R., Gatlin, T., Tirumalai, R. S., Conrads, T. P., Veenstra, T. D., Adkins, J. N., Pounds, J. G., Fagan, R., and Lobley, A. (2004) The human plasma proteome: a nonredundant list developed by combination of four separate sources. Mol. Cell. Proteomics 3, 311–316

71. States, D. J., Omenn, G. S., Blackwell, T. W., Fermin, D., Eng, J., Speicher, D. W., and Hanash, S. M. (2006) Challenges in deriving high-confidence protein identifications from data gathered by a HUPO plasma proteome collaborative study. Nat. Biotechnol. 24, 333–338

72. Peng, J., Elias, J. E., Thoreen, C. C., Licklider, L. J., and Gygi, S. P. (2003) Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) for large-scale protein analysis: the yeast proteome. J. Proteome Res. 2, 43–50

73. Tirumalai, R. S., Chan, K. C., Prieto, D. A., Issaq, H. J., Conrads, T. P., and Veenstra, T. D. (2003) Characterization of the low molecular weight human serum proteome. Mol. Cell. Proteomics 2, 1096–1103

74. Qian, W. J., Jacobs, J. M., Camp, D. G., II, Monroe, M. E., Moore, R. J., Gritsenko, M. A., Calvano, S. E., Lowry, S. F., Xiao, W., Moldawer, L. L., Davis, R. W., Tompkins, R. G., and Smith, R. D. (2005) Comparative proteome analyses of human plasma following in vivo lipopolysaccharide administration using multidimensional separations coupled with tandem mass spectrometry. Proteomics 5, 572–584

75. Washburn, M. P., Wolters, D., and Yates, J. R. (2001) Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat. Biotechnol. 19, 242–247

76. Xie, H., and Griffin, T. J. (2006) Trade-off between high sensitivity and increased potential for false positive peptide sequence matches using a two-dimensional linear ion trap for tandem mass spectrometry-based proteomics. J. Proteome Res. 5, 1003–1009

77. Omenn, G. S., States, D. J., Adamski, M., Blackwell, T. W., Menon, R., Hermjakob, H., Apweiler, R., Haab, B. B., Simpson, R. J., Eddes, J. S., Kapp, E. A., Moritz, R. L., Chan, D. W., Rai, A. J., Admon, A., Aebersold, R., Eng, J., Hancock, W. S., Hefta, S. A., Meyer, H., Paik, Y. K., Yoo, J. S., Ping, P., Pounds, J., Adkins, J., Qian, X., Wang, R., Wasinger, V., Wu, C. Y., Zhao, X., Zeng, R., Archakov, A., Tsugita, A., Beer, I., Pandey, A., Pisano, M., Andrews, P., Tammen, H., Speicher, D. W., and Hanash, S. M. (2005) Overview of the HUPO Plasma Proteome Project: results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly-available database. Proteomics 5, 3226–3245

78. Hood, B. L., Zhou, M., Chan, K. C., Lucas, D. A., Kim, G. J., Issaq, H. J., Veenstra, T. D., and Conrads, T. P. (2005) Investigation of the mouse serum proteome. J. Proteome Res. 4, 1561–1568

79. Keller, A., Nesvizhskii, A. I., Kolker, E., and Aebersold, R. (2002) Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 74, 5383–5392

80. Nesvizhskii, A. I., Keller, A., Kolker, E., and Aebersold, R. (2003) A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 75, 4646–4658

81. MacCoss, M. J., Wu, C. C., and Yates, J. R. (2002) Probability-based validation of protein identifications using a modified SEQUEST algorithm. Anal. Chem. 74, 5593–5599

82. Anderson, D. C., Li, W., Payan, D. G., and Noble, W. S. (2003) A new algorithm for the evaluation of shotgun peptide sequencing in proteomics: support vector machine classification of peptide MS/MS spectra and SEQUEST scores. J. Proteome Res. 2, 137–146

83. Fenyo, D., and Beavis, R. C. (2003) A method for assessing the statistical significance of mass spectrometry-based protein identifications using general scoring schemes. Anal. Chem. 75, 768–774

84. Henzel, W. J., Billeci, T. M., Stults, J. T., Wong, S. C., Grimley, C., and Watanabe, C. (1993) Identifying proteins from two-dimensional gels by molecular mass searching of peptide fragments in protein sequence databases. Proc. Natl. Acad. Sci. U. S. A. 90, 5011–5015

85. Pappin, D. J., Hojrup, P., and Bleasby, A. J. (1993) Rapid identification of proteins by peptide-mass fingerprinting. Curr. Biol. 3, 327–332

86. Yates, J. R., Speicher, S., Griffin, P. R., and Hunkapiller, T. (1993) Peptide mass maps: a highly informative approach to protein identification. Anal. Biochem. 214, 397–408

87. Zimmer, J. S., Monroe, M. E., Qian, W. J., and Smith, R. D. (2006) Advances in proteomics data analysis and display using an accurate mass and time tag approach. Mass Spectrom. Rev. 25, 450–482

88. Olsen, J. V., and Mann, M. (2004) Improved peptide identification in proteomics by two consecutive stages of mass spectrometric fragmentation. Proc. Natl. Acad. Sci. U. S. A. 101, 13417–13422

89. Dieguez-Acuna, F. J., Gerber, S. A., Kodama, S., Elias, J. E., Beausoleil, S. A., Faustman, D., and Gygi, S. P. (2005) Characterization of mouse spleen cells by subtractive proteomics. Mol. Cell. Proteomics 4, 1459–1470

90. Gao, J., Opiteck, G. J., Friedrichs, M. S., Dongre, A. R., and Hefta, S. A. (2003) Changes in the protein expression of yeast as a function of carbon source. J. Proteome Res. 2, 643–649

91. Liu, H., Sadygov, R. G., and Yates, J. R. (2004) A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal. Chem. 76, 4193–4201

92. Jacobs, J. M., Diamond, D. L., Chan, E. Y., Gritsenko, M. A., Qian, W. J., Stastna, M., Camp, D. G., Rice, C. M., Carithers, R. L., Katze, M. G., and Smith, R. D. (2005) Proteome analysis of Huh-7.5 cells containing full-length hepatitis C virus replicon and application to HCV infected liver biopsy samples. J. Virol. 79, 7558–7569

93. Zybailov, B., Coleman, M. K., Florens, L., and Washburn, M. P. (2005) Correlation of relative abundance ratios derived from peptide ion chromatograms and spectrum counting for quantitative proteomic analysis using stable isotope labeling. Anal. Chem. 77, 6218–6224

94. Heller, M., Mattou, H., Menzel, C., and Yao, X. (2003) Trypsin catalyzed 16O-to-18O exchange for comparative proteomics: tandem mass spectrometry comparison using MALDI-TOF, ESI-QTOF, and ESI-ion trap mass spectrometers. J. Am. Soc. Mass Spectrom. 14, 704–718

95. Pasa-Tolic, L., Jensen, P. K., Anderson, G. A., Lipton, M. S., Peden, K. K., Martinovic, S., Tolic, N., Bruce, J. E., and Smith, R. D. (1999) High throughput proteome-wide precision measurements of protein expression using mass spectrometry. J. Am. Chem. Soc. 121, 7949–7950

96. Oda, Y., Huang, K., Cross, F. R., Cowburn, D., and Chait, B. T. (1999) Accurate quantitation of protein expression and site-specific phosphorylation. Proc. Natl. Acad. Sci. U. S. A. 96, 6591–6596

97. Ong, S. E., Blagoev, B., Kratchmarova, I., Kristensen, D. B., Steen, H., Pandey, A., and Mann, M. (2002) Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell. Proteomics 1, 376–386

98. Gygi, S. P., Rist, B., Gerber, S. A., Turecek, F., Gelb, M. H., and Aebersold, R. (1999) Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol. 17, 994–999

99. Zhang, Y., Wolf-Yadlin, A., Ross, P. L., Pappin, D. J., Rush, J., Lauffenburger, D. A., and White, F. M. (2005) Time-resolved mass spectrometry of tyrosine phosphorylation sites in the epidermal growth factor receptor signaling network reveals dynamic modules. Mol. Cell. Proteomics 4, 1240–1250

100. DeSouza, L., Diehl, G., Rodrigues, M. J., Guo, J., Romaschin, A. D., Colgan, T. J., and Siu, K. W. (2005) Search for cancer markers from endometrial tissues using differentially labeled tags iTRAQ and cICAT with multidimensional liquid chromatography and tandem mass spectrometry. J. Proteome Res. 4, 377–386

101. Wang, W., Zhou, H., Lin, H., Roy, S., Shaler, T. A., Hill, L. R., Norton, S., Kumar, P., Anderle, M., and Becker, C. H. (2003) Quantification of proteins and metabolites by mass spectrometry without isotope labeling or spiked standards. Anal. Chem. 75, 4818–4826

102. Chelius, D., and Bondarenko, P. V. (2002) Quantitative profiling of proteins in complex mixtures using liquid chromatography and mass spectrometry. J. Proteome Res. 1, 317–323

103. Fang, R., Elias, D. A., Monroe, M. E., Shen, Y., McIntosh, M., Wang, P., Goddard, C. D., Callister, S. J., Moore, R. J., Gorby, Y. A., Adkins, J. N., Fredrickson, J. K., Lipton, M. S., and Smith, R. D. (2006) Differential label-free quantitative proteomic analysis of Shewanella oneidensis cultured under aerobic and suboxic conditions by accurate mass and time tag approach. Mol. Cell. Proteomics 5, 714–725

104. Tang, K., Page, J. S., and Smith, R. D. (2004) Charge competition and the linear dynamic range of detection in electrospray ionization mass spectrometry. J. Am. Soc. Mass Spectrom. 15, 1416–1423

105. Luo, Q., Shen, Y., Hixson, K. K., Zhao, R., Yang, F., Moore, R. J., Mottaz, H. M., and Smith, R. D. (2005) Preparation of 20-μm-i.d. silica-based monolithic columns and their performance for proteomics analyses. Anal. Chem. 77, 5028–5035

106. Juraschek, R., Dulcks, T., and Karas, M. (1999) Nanoelectrospray—more than just a minimized-flow electrospray ionization source. J. Am. Soc. Mass Spectrom. 10, 300–308

107. Alaiya, A., Al-Mohanna, M., and Linder, S. (2005) Clinical cancer proteomics: promises and pitfalls. J. Proteome Res. 4, 1213–1222

108. Zhan, X., and Desiderio, D. M. (2003) Heterogeneity analysis of the human pituitary proteome. Clin. Chem. 49, 1740–1751

109. Mann, K. G., Brummel-Ziedins, K., Undas, A., and Butenas, S. (2004) Does the genotype predict the phenotype? Evaluations of the hemostatic proteome. J. Thromb. Haemostasis 2, 1727–1734

110. Kendziorski, C., Irizarry, R. A., Chen, K. S., Haag, J. D., and Gould, M. N. (2005) On the utility of pooling biological samples in microarray experiments. Proc. Natl. Acad. Sci. U. S. A. 102, 4252–4257

111. Sickmann, A., Marcus, K., Schafer, H., Butt-Dorje, E., Lehr, S., Herkner, A., Suer, S., Bahr, I., and Meyer, H. E. (2001) Identification of post-translationally modified proteins in proteome studies. Electrophoresis 22, 1669–1676

112. Anderson, L. (2005) Candidate-based proteomics in the search for biomarkers of cardiovascular disease. J. Physiol. 563, 23–60

113. Anderson, L., and Hunter, C. L. (2006) Quantitative mass spectrometric multiple reaction monitoring assays for major plasma proteins. Mol. Cell. Proteomics 5, 573–588

114. Anderson, N. L., Anderson, N. G., Haines, L. R., Hardie, D. B., Olafson, R. W., and Pearson, T. W. (2004) Mass spectrometric quantitation of peptides and proteins using Stable Isotope Standards and Capture by Anti-Peptide Antibodies (SISCAPA). J. Proteome Res. 3, 235–244

115. Berger, A. B., Vitorino, P. M., and Bogyo, M. (2004) Activity-based protein profiling: applications to biomarker discovery, in vivo imaging and drug discovery. Am. J. Pharmacogenomics 4, 371–381

116. Speers, A. E., and Cravatt, B. F. (2004) Chemical strategies for activity-based proteomics. Chembiochem 5, 41–47

117. Masselon, C., Pasa-Tolic, L., Tolic, N., Anderson, G. A., Bogdanov, B., Vilkov, A. N., Shen, Y., Zhao, R., Qian, W. J., Lipton, M. S., Camp, D. G., II, and Smith, R. D. (2005) Targeted comparative proteomics by liquid chromatography-tandem Fourier ion cyclotron resonance mass spectrometry. Anal. Chem. 77, 400–406


