

Methods 55 (2011) 1–2

Contents lists available at SciVerse ScienceDirect

Methods

journal homepage: www.elsevier.com/locate/ymeth

Guest Editor’s Introduction

Methods in structural proteomics

Structural proteomics, or structural genomics, describes the determination of protein structures on a genome-wide scale. The field has developed over the last 10 years in response to the explosion in the amount of protein sequence data available from high-throughput genome sequencing. The need to accommodate a large number of potential targets for structure determination has led to technology developments in all aspects of the gene-to-structure process. Hence, structural proteomics has been an important driver of innovation in structural biology, and this is reflected in the articles in this special issue of Methods. In a series of reviews and experimental reports, the issue considers the workflow from target selection through protein production, crystallization and data collection to structure determination by X-ray crystallography.

Beginning with bioinformatics, Overton and Barton [1] describe in silico methods for optimising protein target selection and discuss the merits of algorithms, based on the large data sets obtained by structural proteomics projects, for predicting the behaviour of proteins in crystallization trials. Although protein expression remains largely empirical, with an element of trial and error involved, using all available bioinformatic data in choosing the starting sequences can contribute significantly to overall success. Having defined the protein sequences for study, the next step is to evaluate the expression and solubility of the proteins in one or more heterologous hosts. Typically, an expression screening step at small scale (up to one millilitre culture volume) is used to identify proteins suitable for production at a larger scale (one litre or more culture volume). A common theme of many of the experimental protocols described in this issue is the use of multi-well plates for parallel processing of samples. Multi-well plates in SBS (Society for Biomolecular Screening) format are compatible with standard liquid-handling robotics, enabling the automation of some procedures, for example expression testing, as described by Vincentelli et al. [2]. However, all plate-based procedures can also be carried out manually using multi-channel pipettes.
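
As an illustration of the plate-based bookkeeping that parallel processing implies, the following minimal sketch assigns constructs to wells of a 96-well plate. It assumes the conventional 8 x 12 layout (rows A to H, columns 1 to 12) filled in row-major order; the function names and construct names are hypothetical and are not taken from any of the protocols in this issue.

```python
import string

ROWS, COLS = 8, 12  # assumed 96-well layout: rows A-H, columns 1-12


def well_name(index: int) -> str:
    """Convert a 0-based, row-major well index into a label such as 'A1'."""
    if not 0 <= index < ROWS * COLS:
        raise ValueError(f"index {index} is outside a {ROWS * COLS}-well plate")
    row, col = divmod(index, COLS)
    return f"{string.ascii_uppercase[row]}{col + 1}"


def plate_map(constructs):
    """Assign each construct (hypothetical names) to a well in row-major order."""
    if len(constructs) > ROWS * COLS:
        raise ValueError("more constructs than wells on one plate")
    return {well_name(i): name for i, name in enumerate(constructs)}


if __name__ == "__main__":
    # Illustrative construct names only.
    layout = plate_map([f"construct_{n:02d}" for n in range(1, 15)])
    for well, name in layout.items():
        print(well, name)
```

The same mapping applies whether the plate is filled by a liquid-handling robot or by hand with a multi-channel pipette; only the order in which wells are visited would differ.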

Escherichia coli remains the dominant expression system in structural biology, with nearly 90% of the structures in the Protein Data Bank (PDB) derived from proteins produced in this bacterial host [3]. Highly efficient pipelines have been assembled for producing proteins expressed in E. coli, and a consensus approach has emerged from structural proteomics projects world-wide [4]. Typically, T7 RNA polymerase-based vectors constructed by ligation-independent cloning are used in combination with BL21 E. coli strains for protein expression; see, for example, [5]. An amino-terminal hexahistidine tag is routinely added to the sequences to be expressed to enable purification using metal chelate affinity matrices. The approach is exemplified by the experiences of two protein production pipelines based on E. coli, which are presented in this issue [2,6].

1046-2023/$ - see front matter © 2011 Published by Elsevier Inc. doi:10.1016/j.ymeth.2011.09.024

For some difficult-to-express proteins, library methods have proved highly successful in identifying structured domains that were not otherwise revealed by bioinformatics analyses [7]. However, many mammalian and eukaryotic viral proteins require post-translational modification for proper folding and/or are part of large multimeric complexes. Therefore, expression in higher eukaryotic cell lines from both invertebrate and vertebrate sources is required to produce these proteins. Transient expression in mammalian cells has emerged as the system of choice for the production of secreted glycoproteins [8]. Although higher eukaryotic expression systems are generally more time-consuming and expensive to use than bacteria, improvements in technology have streamlined the processes involved. Protocols designed to improve the throughput of recombinant protein expression in insect and mammalian cells are described by Hitchman et al. [9] and Raymond et al. [10], respectively. Cell-free systems offer an alternative to cell-based expression, and a system based on the protozoan Leishmania tarentolae has been developed by Alexandrov and co-workers [11] that brings the ease of use and low cost of E. coli cell-free methods to a eukaryotic cell-free system.

One of the most significant changes to working practice introduced by structural proteomics has been the use of nanolitre-volume crystallization experiments carried out in 96-well plates. The miniaturization of crystallization experiments not only reduces the amount of purified protein required for screening but also enables extensive exploration of chemical “space” to identify optimal conditions for crystallization. This presents a challenge in terms of handling large numbers of plate experiments, and some of these issues are discussed in the article by Newman [12]. In parallel with developments in protein crystallization, there has been significant innovation at synchrotrons, particularly in the automation of sample handling and data processing. Developments at synchrotrons have played a key part in accelerating structure determination by X-ray crystallography. As an example of the current state of the art, the systems at Diamond Light Source, the United Kingdom’s third-generation synchrotron, are described by Winter and McAuley [13]. At the end of the structural proteomics workflow is protein structure determination. Automated pipelines are now routinely used in structural biology, and one of the leading software suites available is described by Adams et al. [14].
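
To make the saving from miniaturization concrete, the sketch below estimates the protein consumed by one 96-well screen at two drop sizes. The drop volumes (100 nl and 1 µl of protein solution per well) and the concentration (10 mg/ml) are illustrative assumptions, not values taken from the articles in this issue, and the function name is hypothetical.

```python
def protein_per_plate_ug(protein_drop_nl: float,
                         conc_mg_per_ml: float,
                         wells: int = 96) -> float:
    """Micrograms of protein used by one plate with one drop per well.

    protein_drop_nl is the volume of protein solution dispensed per well
    (excluding precipitant), in nanolitres.
    """
    # 1 nl = 1e-3 ul, and 1 mg/ml = 1 ug/ul, so ul * (mg/ml) gives ug.
    total_volume_ul = protein_drop_nl * 1e-3 * wells
    return total_volume_ul * conc_mg_per_ml


if __name__ == "__main__":
    # Assumed, illustrative values: 10 mg/ml protein stock.
    for drop_nl in (100.0, 1000.0):
        used = protein_per_plate_ug(drop_nl, conc_mg_per_ml=10.0)
        print(f"{drop_nl:.0f} nl drops: about {used:.0f} ug per 96-well screen")
```

Under these assumptions, consumption scales linearly with drop volume, so a tenfold reduction in drop size allows roughly ten times as many conditions to be screened with the same amount of purified protein.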

I am grateful to all the contributors to this special issue of Methods for sharing their experience and knowledge in the invited reviews and research articles. I also thank the journal Editor, Dr. Kenneth Adolph, for the opportunity of editing this issue of Methods, and the journal manager, Mr. Jon Stein, for his expert assistance.

References

[1] I.M. Overton, G.J. Barton, Methods 55 (2011) 3–11.
[2] R. Vincentelli, A. Cimino, A. Geerlof, A. Kubo, Y. Satou, C. Cambillau, Methods 55 (2011) 65–72.
[3] J.E. Nettleship, R. Assenberg, J.M. Diprose, N. Rahman-Huq, R.J. Owens, J. Struct. Biol. 172 (2010) 55–65.
[4] Nat. Methods 5 (2008) 135–146.
[5] L. Bird, Methods 55 (2011) 29–37.
[6] Y. Kim, G. Babnigg, R. Jedrzejczak, W.H. Eschenfeldt, H. Li, N. Maltseva, C. Hatzos-Skintges, M. Gu, M. Makowska-Grzyska, R. Wu, H. An, G. Chhor, A. Joachimiak, Methods 55 (2011) 12–28.
[7] H. Yumerefendi, D.C. Desravines, D.J. Hart, Methods 55 (2011) 38–43.
[8] V.T. Chang, M. Crispin, A.R. Aricescu, D.J. Harvey, J.E. Nettleship, J.A. Fennelly, C. Yu, K.S. Boles, E.J. Evans, D.I. Stuart, R.A. Dwek, E.Y. Jones, R.J. Owens, S.J. Davis, Structure 15 (2007) 267–273.
[9] R.B. Hitchman, E. Locanto, R.D. Possee, L.A. King, Methods 55 (2011) 52–57.
[10] C. Raymond, R. Tom, S. Perret, P. Moussouami, D. L’Abbé, G. St-Laurent, Y. Durocher, Methods 55 (2011) 44–51.
[11] O. Kovtun, S. Mureev, W.R. Jung, M. Kubala, W. Johnston, K. Alexandrov, Methods 55 (2011) 58–64.
[12] J. Newman, Methods 55 (2011) 73–80.
[13] G. Winter, K.E. McAuley, Methods 55 (2011) 81–93.
[14] P.D. Adams, P.V. Afonine, G. Bunkóczi, V.B. Chen, N. Echols, J.J. Headd, L.-W. Hung, S. Jain, G.J. Kapral, R.W. Grosse-Kunstleve, A.J. McCoy, N.W. Moriarty, R.D. Oeffner, R.J. Read, D.C. Richardson, J.S. Richardson, T.C. Terwilliger, P.H. Zwart, Methods 55 (2011) 94–106.

Ray Owens
The Oxford Protein Production Facility, Research Complex at Harwell, Rutherford Appleton Laboratory and Division of Structural Biology, Wellcome Trust Centre for Human Genetics, University of Oxford, UK