
Ca2+-dependent carbohydrate-recognition domains in animal proteins


Kurt Drickamer

Columbia University, New York, USA

Many Ca2+-dependent (C-type) animal lectins contain a common sequence motif, with 14 invariant and 18 highly conserved residues distributed over discrete carbohydrate-recognition domains of 115-130 amino acids. Domains that display the C-type carbohydrate-recognition domain motif are found in an increasing number of proteins, although only some are known to bind carbohydrate. Progress is being made towards deducing the structure and activity of these domains from their sequences.

Current Opinion in Structural Biology 1993, 3:393-400

Introduction

Ca2+-dependent carbohydrate-recognition domains (C-type CRDs) were initially identified in the mammalian asialoglycoprotein receptor, its chicken homolog, and serum mannose-binding proteins [1]. Sequence alignments have led to the identification of more than 50 additional proteins that contain domains related to these CRDs. Comparison of these sequences reveals the presence of a common sequence motif consisting of 14 invariant and 18 highly conserved residues (Fig. 1). The CRDs are found associated with various domains that have been identified in other extracellular and cell-surface proteins [2]. C-type animal lectins can be classified into groups based either on the overall architecture of the protein and the position of the CRD relative to other domains (Fig. 2) or on the degree to which the sequences of their CRDs are related (Fig. 3). In general, the groups defined by these two approaches are the same.

Recent additions to the collection of C-type CRDs will be discussed in the first two sections of this review. These are followed by a summary of progress made in understanding the evolutionary, structural, and functional significance of the sequence similarities observed in this family of protein domains. The biological roles of diverse C-type lectins have been discussed in detail elsewhere [3]. Structurally distinct groups of lectins, such as the soluble, β-galactoside-binding (S-type) lectins and the mannose 6-phosphate receptors (P-type lectins), have been considered in other recent reviews [4,5].

Variations on established themes

New examples of C-type CRDs in proteins that are similar to known lectins have been reported during the past year. In addition to the human E-, L- and P-selectins, murine homologs of all three proteins have now been cloned [6,7], as have rabbit E-selectin [8] and rat L-selectin [9]. But, in any one species, at most only three of these cell-adhesion molecules have been described. It remains to be seen whether the selectin-like protein that mediates endothelial-cell capillary formation is a bovine form of E-

Fig. 1. Sequence motif in C-type CRDs. Invariant residues are indicated in the one-letter amino acid code. Residues that are conserved in character are designated by symbols for four classes: aromatic; aliphatic; either aromatic or aliphatic; and oxygen-containing. A version of this motif has been entered in the PROSITE database as the C-type lectin domain signature.
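A motif of this kind, which mixes invariant residues with positions conserved only in character, can be treated as a pattern over residue classes. The following Python sketch is illustrative only: the class memberships, the lower-case class symbols and the short toy pattern are all assumptions introduced here, and none of them reproduces the motif of Fig. 1 or the PROSITE entry.

```python
import re

# Hypothetical class symbols and memberships (assumed for illustration;
# the review does not list which residues belong to each class).
CLASS_SYMBOLS = {
    "a": "FWY",        # aromatic
    "l": "AILV",       # aliphatic
    "h": "FWYAILV",    # either aromatic or aliphatic
    "o": "DENQST",     # oxygen-containing side chain
}


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Translate a motif string into a compiled regular expression.

    Upper-case letters are invariant residues, lower-case letters are
    class symbols from CLASS_SYMBOLS, and '-' matches any residue.
    """
    parts = []
    for symbol in motif:
        if symbol == "-":
            parts.append(".")
        elif symbol in CLASS_SYMBOLS:
            parts.append("[" + CLASS_SYMBOLS[symbol] + "]")
        elif symbol.isalpha() and symbol.isupper():
            parts.append(symbol)
        else:
            raise ValueError("unknown motif symbol: " + repr(symbol))
    return re.compile("".join(parts))


def contains_motif(sequence: str, motif: str) -> bool:
    """Return True if the motif occurs anywhere in the sequence."""
    return motif_to_regex(motif).search(sequence.upper()) is not None


# Toy pattern, NOT the C-type CRD motif: an invariant C, any residue,
# an aromatic residue, two arbitrary residues, an oxygen-containing
# residue, and an invariant G.
TOY_MOTIF = "C-a--oG"
```

With these assumptions, `contains_motif("MKCAWLLDGT", TOY_MOTIF)` succeeds, whereas replacing the tryptophan with alanine breaks the match at the aromatic position.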

Abbreviations: CRD, carbohydrate-recognition domain; EGF, epidermal growth factor.

© Current Biology Ltd ISSN 0959-440X


or P-selectin, or represents a new member of the selectin group (M Nguyen, NA Strubel, J Bischoff, abstract, J Cell Biochem 1992, S16D:175).

The term collectin is now being used to describe the group III C-type lectins, denoting the presence of both collagenous and lectin-like segments in a single polypeptide. Several additional collectin sequences have recently been reported, including murine homologs of the two forms of rat mannose-binding proteins [10], and murine and human forms of a minor pulmonary surfactant protein, SP-D [11-13]. These proteins continue to cluster into two subgroups distinguishable by the size of their collagenous segments. The presence of longer collagenous domains is correlated with the formation of large, cruciform aggregates from trimeric building blocks, whereas the shorter collagen tails generally lead to the association of trimers into bouquet-like structures [14]. Proteins in both subgroups are believed to mediate an innate immune response to pathogens.

One of the most diverse groups of C-type animal lectins is the category of type II transmembrane receptors, which mediate glycoprotein endocytosis and degradation. The group II lectins consist of carboxy-terminal CRDs linked to amino-terminal, internal signal/membrane anchor sequences. As noted above, proteins of this type, including the hepatic asialoglycoprotein receptor, were among the first C-type lectins to be described. A close homolog of this protein isolated from mouse tumoricidal macrophages has recently been cloned [15]. A more divergent member of group II is a mannose-specific receptor isolated from human placenta [16**].

Fig. 2. Summary of the structures of several groups of C-type animal lectins. Representative structures for four groups of membrane-associated lectins are shown. Groups II and V are represented by the chicken hepatic lectin, the homolog of the mammalian asialoglycoprotein receptor. Group IV consists of the selectin cell-adhesion molecules, such as L-selectin. The macrophage mannose receptor is the only lectin in group VI. Three groups of water-soluble lectins are also depicted: group III lectins (collectins), such as mannose-binding protein, are found in extracellular fluids; group I CRD-containing proteins are proteoglycans of the extracellular matrix; and group VII consists of CRDs without flanking domains. Other domains shown: EGF, EGF-like domains; CR, complement-regulatory repeats; FN-II, fibronectin type II repeats; COL, collagen-like sequences; GAG, glycosaminoglycan attachment sites; and HA, link protein homology domains. C, carboxyl terminus; N, amino terminus.

The carboxy-terminal domain structures of extracellular matrix proteoglycan core proteins (group I) are proving to be unexpectedly diverse. Variant splicing forms of the mRNA for aggrecan, the aggregating proteoglycan of cartilage, have been described, in which the epidermal growth factor (EGF)-like domains or the complement-binding repeats, usually found on either side of the C-type CRD, are deleted [17]. A novel brain proteoglycan, neurocan, has also been isolated and cloned [18**]. Like versican, a core protein from fibroblasts, neurocan contains two repeats of the EGF-like domain.

New structural categories

Several recently determined sequences have resulted in the definition of two new groups of C-type animal lectins. The existence of proteins that appear to be free CRDs, not appended to any other polypeptide segment, has been known for some time. But until recently, most of these proteins had been isolated from invertebrate sources and characterized at the protein level. Because of their evolutionary distance from the vertebrate proteins, it is difficult to relate their sequences to the other C-type lectins. In addition, until cDNA sequences for these proteins have been established, it is impossible to be sure that they are not fragments derived from larger precursors by proteolytic processing.

Fig. 3. Dendrogram summarizing sequence similarity between various C-type CRDs. Similarities were determined on the basis of comparison of amino acid sequences of the CRD portion of each protein, using a cluster analysis program [57]. The computer-generated tree has been modified for clarity by the segregation of the CRDs of the mannose receptor from the other proteins. At the bottom are shown the approximate regions of the dendrogram corresponding to either the divergence of protein sequences following duplication of genes to serve new functions or the divergence of proteins with homologous functions in different species. The division between these two regions is not precise, because different proteins diverge at distinct rates. The fact that the C-type animal lectins fall into the same categories in this figure and the preceding one indicates that the shuffling of exons to form a precursor with a given protein architecture occurred only once during evolution. Thus, all proteins with similar domain organization are descended from a common ancestor. References for sequences may be found in [41] and in the text. Sequences of lectins from lower vertebrates and invertebrates are not included. Bo, cow; Ch, chicken; Do, dog; HEP, hepatocyte; Hu, human; LEC, lectin; Mo, mouse; Mø, macrophage; Ra, rat; Rb, rabbit.

Several vertebrate cDNAs encoding free-standing CRDs have recently been described. The sequences of several of these CRDs place them in a cluster (group VII in Fig. 3). Because these proteins have been isolated from several different species, it is not easy to be certain which are homologs of each other. The sequence comparisons suggest the probable existence of at least two distinct mammalian group VII proteins, one represented by the pancreatic stone protein [19,20], and the other represented by the pancreatic thread protein [21], pancreatitis-associated protein [22], and a newly described hepatoma-associated protein (which is also expressed in pancreas) [23**]. Tetranectin, a human serum protein which binds to the fourth kringle domain of plasminogen, is not included in this group because its sequence is quite divergent from the other CRDs in this group, and because it is not known whether this polypeptide is produced by degradation of a larger precursor [24]. The major basic protein of human eosinophil granules, which has been independently isolated as an immunoregulatory factor [25], likewise does not cluster with any of the known groups.

Proteins from lower vertebrates related to some of the mammalian and avian proteins shown in Figs 2 and 3 have been described. Two snake venom proteins, a galactose-binding lectin [26] and a coagulation factor IX/X-binding protein [27], are isolated C-type CRDs that cluster with the group VII CRDs. Two forms of phospholipase A2 inhibitor [28] are also free CRDs and are very closely related in sequence to each other, but do not cluster with any of the groups in Fig. 3. Finally, two antifreeze proteins from fish serum [29*,30*] are distantly related to the group I and II lectins in mammals and birds.

An increasing number of cDNAs encoding type II transmembrane proteins, similar in overall structure to the group II C-type animal lectins, have been isolated from natural killer lymphocyte libraries [31-35]. Similar proteins have also been identified on the surface of T and B cells [36-38].
The putative CRDs in these group V proteins are highly divergent from the other sequences compared in Fig. 3. Some are also highly divergent from each other. The possible functions of these proteins are discussed below.
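The grouping of CRDs by sequence similarity, as summarized in the dendrogram of Fig. 3, can be illustrated with a small clustering example. The sketch below is not the cluster analysis program cited for Fig. 3; it swaps in plain percent identity over pre-aligned sequences and average-linkage agglomerative clustering, and the four toy sequences and all function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Cluster = Tuple[str, ...]


def percent_identity(a: str, b: str) -> float:
    """Percent identical residues between two pre-aligned sequences.

    Columns with a gap ('-') in either sequence are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    same = sum(1 for x, y in compared if x == y)
    return 100.0 * same / len(compared)


def average_linkage(seqs: Dict[str, str]) -> List[Tuple[Cluster, Cluster, float]]:
    """Repeatedly merge the two most similar clusters.

    Returns the merges in order, each as (cluster_a, cluster_b, mean
    pairwise percent identity between the two clusters).
    """
    def similarity(c1: Cluster, c2: Cluster) -> float:
        pairs = [(p, q) for p in c1 for q in c2]
        total = sum(percent_identity(seqs[p], seqs[q]) for p, q in pairs)
        return total / len(pairs)

    clusters: List[Cluster] = [(name,) for name in seqs]
    merges = []
    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2), key=lambda pair: similarity(*pair))
        merges.append((c1, c2, similarity(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical aligned fragments, invented purely for illustration.
TOY_SEQS = {
    "toy_A": "ACDEFGHIKL",
    "toy_B": "ACDEFGHIKV",
    "toy_C": "ACDQFGWIKV",
    "toy_D": "MCDQYGWLRV",
}

if __name__ == "__main__":
    for left, right, score in average_linkage(TOY_SEQS):
        print(left, "+", right, "mean identity %.1f%%" % score)
```

In this toy case the two closest sequences merge first and the most divergent one joins last, which mirrors how closely related homologs sit near the tips of a dendrogram while divergent groups branch off near the root.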

Genomic organization

The intron-exon organization of several genes encoding C-type CRDs has been described. In each case, the CRD-encoding region is separated from the remainder of the gene by an intron. In addition, the coding sequences for CRDs are interrupted by two introns in group I and II proteins [39**], and three introns in group VII proteins [19]. The positions of these introns are identical within each group, but are shifted between any two groups [40]. The CRDs in other groups are encoded entirely on uninterrupted exons. Thus, the genomic organization reflects the same categorization as the sequence comparisons and overall protein architecture. The gene for the macrophage mannose receptor is the most complex yet examined, consisting of 30 exons [41]. The pattern of introns within its CRDs does not correlate with the interruptions seen in the other groups of CRDs, reflecting the early divergence of this receptor.

Structure-evolution correlations

The determination of the three-dimensional structure of the CRD from rat serum mannose-binding protein provides a basis for understanding the role of the residues that make up the C-type CRD motif [42,43**]. Residues that define the motif establish the overall fold of the CRD (Fig. 4) by creating: two conserved disulfide bonds; binding sites for two bound Ca2+; several turns; and an extended hydrophobic core. Hence, it can be concluded that the motif residues have been conserved in order to maintain the basic fold of the domain. Therefore, all of the domains that share this set of conserved residues will probably be folded in similar ways.

Several ancillary pieces of evidence are consistent with the idea that the folds of proteins possessing the C-type CRD motif are similar. Wherever the topology of disulfide-bond formation has been established, it is the same as in the mannose-binding CRD. In addition, circular-dichroism spectroscopy suggests that the CRDs that have been examined contain levels of α and β structure similar to that observed in the crystal structure of the mannose-binding CRD (K Drickamer, unpublished data). Finally, all of the domains that have been examined display Ca2+-dependent activity. Yet the number of Ca2+-binding sites is not always known. For instance, many of the ligands that form the Ca2+-binding site 1 in the mannose-binding CRD are absent from the selectins. In spite of this difference, there is evidence for two divalent cation-binding sites in P-selectin [44]. Studies identifying several Ca2+-binding peptide mimics of subdomains of P-selectin are difficult to interpret, because these do not correspond to either of the Ca2+-binding sites observed in the mannose-binding CRD [45].

An important aspect of the CRD structure is that the amino-terminal and carboxy-terminal ends are located close to each other. This arrangement contrasts with that in some other extracellular protein modules, such as immunoglobulin-like domains, which have a 'pass through' topology. This type of topological consideration clearly affects the way in which domains can be shuffled and juxtaposed.

Sugar-binding activity and specificity

Crystallography

It is one thing to use a sequence motif, such as the one identified with the C-type CRDs, to predict that domains are folded in similar ways, and another to predict the activity of such a domain. An important guide in this respect is the recently described structure of the mannose-binding protein complexed with a mannose-containing oligosaccharide [43**]. The structure reveals that the equatorial 3- and 4-hydroxyl groups of mannose are coordinated to one of the bound Ca2+ ions (Ca2+ 2 in Fig. 4), and that four of the protein side chains that are also coordinated to this Ca2+ (two asparagine and two glutamic acid residues) are hydrogen bonded to these same sugar hydroxyl groups (Fig. 4). The remaining ligands for Ca2+ 2 are contributed by the side chain of an aspartic acid residue.

The presence of these five amino acid side chains becomes an important criterion for predicting that a CRD-like domain will have the capacity to bind sugars in a manner analogous to the mannose-binding CRD. Indeed, these five residues are completely conserved in CRDs that bind mannose, glucose, and other sugars with equatorial 3- and 4-hydroxyl groups. It is also striking that in CRDs known to bind galactose preferentially, two of the liganding residues are different. One asparagine position in the mannose-binding CRD is always occupied by aspartic acid in galactose-binding CRDs, and one of the glutamic acids is replaced by glutamine. Modification of the mannose-binding CRD by incorporation of these two changes results in a change to preferential binding of galactose [46]. The sugar-binding site is not related to that predicted from sequence comparisons with other sugar-binding proteins [47].
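This criterion can be stated as a simple rule over the five Ca2+ 2 ligand positions. The following Python sketch is illustrative only: the ordering of the five positions, the function name and the returned labels are assumptions introduced here, and the residues at each position would have to be read from an alignment against the mannose-binding CRD.

```python
from typing import Sequence

# Assumed ordering of the five Ca2+ site 2 ligand positions (hypothetical):
#   0: the glutamic acid that is glutamine in galactose-binding CRDs
#   1: the asparagine that is aspartic acid in galactose-binding CRDs
#   2: the second glutamic acid
#   3: the second asparagine
#   4: the aspartic acid
MANNOSE_TYPE = ("E", "N", "E", "N", "D")
GALACTOSE_TYPE = ("Q", "D", "E", "N", "D")


def predict_sugar_preference(ligands: Sequence[str]) -> str:
    """Classify a CRD from the residues at its five Ca2+ site 2 positions.

    Returns 'mannose-type' (sugars with equatorial 3- and 4-hydroxyl
    groups), 'galactose-type', or 'uncertain' when one or more of the
    expected ligands is missing or substituted in some other way.
    """
    residues = tuple(r.upper() for r in ligands)
    if len(residues) != 5:
        raise ValueError("exactly five ligand residues are required")
    if residues == MANNOSE_TYPE:
        return "mannose-type"
    if residues == GALACTOSE_TYPE:
        return "galactose-type"
    return "uncertain"
```

For example, `predict_sugar_preference("ENEND")` gives the mannose-type label, while a domain in which an asparagine is replaced by threonine, such as `("E", "T", "E", "N", "D")`, is reported as uncertain. A label from this rule is only a prediction of which domains are better candidates for sugar binding, not evidence of activity.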

Mutagenesis

As expected from the crystallographic results, extensive mutagenesis of the CRD from rat serum mannose-binding protein reveals that residues over much of the surface of the domain can be changed without altering sugar-binding activity, whereas changes in the conserved Ca2+ ligands usually lead to loss of binding activity [48]. More surprising is the finding that changes in many of the conserved residues that make up the hydrophobic core of the domain result in loss of sugar-binding activity as an indirect result of decreased affinity for Ca2+. This decreased affinity must result from subtle changes in the arrangement of the loops that form the Ca2+-binding sites (Fig. 4). The phenotype of these mutants might mimic the changes brought about at low pH, such as in endosomes, when C-type CRDs release their ligands.

A second important source of information about sugar binding has been the mutagenesis of E-selectin [49**]. These studies point to the importance of a second region of the CRD, which includes two lysine residues that are needed for binding to the sialyl Lewis x epitope, a natural ligand for this protein [50]. The residues at Ca2+ 2 in the selectins are the same as those in the mannose-binding CRD. As fucose binds to the mannose-binding protein, probably through the 2- and 3-hydroxyl groups, it is possible that the fucose portion of the sialyl Lewis x tetrasaccharide binds to selectins in a similar way. The critical sialic acid terminal residue might then bind at the second site identified in the mutagenesis studies. Some ligands may bind to this second site in a Ca2+-independent manner [51]. It will be interesting to see whether the sequence in this region can be correlated with saccharide-binding specificity in other CRDs.

Fig. 4. Structure of C-type CRD from rat serum mannose-binding protein complexed with mannose-containing oligosaccharide. (a) Ribbon diagram showing secondary structure elements of the mannose-binding CRD complexed with a glycopeptide from ovalbumin containing six mannose residues. Spheres 1-3 represent Ca2+ ions observed in the crystal structure. Only the first two are believed to be present in the soluble CRD. (b) Detailed view of region surrounding Ca2+ 2. The Ca2+ ion is shown as a light grey sphere. White, dark grey, and black spheres represent carbon, nitrogen and oxygen, respectively. Ca2+-coordination bonds are denoted by long thick dashes, whereas short dashes represent hydrogen bonds. Numbers on the mannose carbon atoms represent ring positions. The α-glycosidic bond to the next sugar of the oligosaccharide (at carbon 1) has been cut off for clarity. Published with permission [43**].

Predictions

On the basis of these findings, it is possible to make more informed interpretations of sequences containing the C-type CRD motif, as CRDs containing the five Ca2+ 2 ligands in either the mannose- or galactose-binding configuration would be better candidates for carbohydrate-binding activity than CRDs that do not. An interesting test case is the lymphocyte low-affinity receptor for the Fc portion of IgE (CD23). The IgE-binding domain of this group II protein has been shown to correspond to the segment most similar to bona fide CRDs [52*]. The mouse form of this protein contains all of the residues expected of a mannose-binding CRD, whereas one of the asparagine residues is replaced by threonine in the human version. Evidence for the involvement of carbohydrates in the CD23-IgE interaction is conflicting. A peptide from IgE is able to inhibit the interaction, indicating that protein forms at least part of the binding determinant [53]. Although sugars do not compete effectively [54], experiments employing tunicamycin to prevent glycosylation suggest that sugars are required for the interaction of CD23 with a novel ligand, CD21 [55*,56*]. One possibility is that, like the selectins, CD23 may have a dual interaction, in this case with both protein and carbohydrate.

One of the most interesting cases for considering possible ligand-binding activity is the group V family of natural killer cell-surface proteins. As noted above, this family is quite diverse in sequence. Many of the members are lacking one or more of the Ca2+ 2 ligands, making it unclear whether or not these proteins bind Ca2+, much less saccharide ligands.

Acknowledgements

I thank Maureen Taylor for comments on the manuscript. K Drickamer is a recipient of a Faculty Salary Award from the American Cancer Society.

References and recommended reading

Papers of particular interest, published within the annual period of review, have been hiRhlinhred 35: . . .

1.

2.

3.

4.

of special interest of outstanding interest

D~ICKAMER K: Two Distinct Classes of Carbohydrate-Recog- nition Domains in Animal Lectins. J Biol U~er77 1988,

263:9557-9560.

WEIS WI, QUEXNBERRY MS, TAYLOR ME, BIZOLISKA K, HENDRICKSON WA, DRIC~IER K: Molecular Mechanisms of Complex Carbohydrate Recognition at the Cell Surface. Cold Spring Harb Symp Quanr Bioi 1993, 52: in press.

DRICMER K, TAYLOR ME: Biology of Animal Lectins. Anr7u

Rev Cell Biol 1993, 9: in press.

WANG JL, LUNG JG, ANDERSON RL: Lectins in the Cell Nucleus. Glymbiolog)~ 1991, 1:243-252.

6.

7.

8.

9.

10.

11.

12.

13.

1-l.

IS.

16. . .

KORNFEUI S: Structure and Function of the Mannose 6. Phosphate/Insulin-Like Gromh Factor II Receptors. A77t7rt

Ret* Biocbern 1992, 61:307-330.

WEUK A, ISENUNN S, V~n~~Hfz D: Cloning of the Mouse Endothelial Selectins: Expression of Both E- and P-Selectin Is Inducible by Tumor Necrosis Factor a. J Biol u1wnf 1992, 267:15176-15183.

S&iolia WE, WILSON RW, B~VLWIZ’NE CM, BI:.A~IXT AL Molec- ular Cloning and Analysis of in vivo Expression of Murine P-Selectin. Blood 1992. 80:795-800.

L\RIGAN JD, TS~G TC, RUXII~ERG~~R Jkl, Bul~vs DK: Chancter- ization of cDNA and Cenomic Sequences Encoding Rabbit ELAM-1: Conservation of Structure and Functional Intenc- dons with Leukocytes. DNA Cell Biol 1972, 11:149-162.

WATMABE T, SONG Y, HIKAYAU Y, T,UIA%\NI T. KLU~A K, Mn’~%ucr\ M: Sequence and Expression of n Rat cDNA for LECAM-1. Bicchin? Biop/ys ACM 1992, 1131:321-324.

SA~TXY K, Z\HEDI K, LI~IJAS J-M, WHITI:.HEA~ AS, Eze~ow?r~ RAB: Molecular Characterization of the Mouse Mannose- Binding Proteins: the Mannose-Binding Protein A but not C Is ?n Acute Phase Reactant. J fn7r77rr~rol 1991, 147:692-697.

RLIS~‘ K. GROSSO L, ZHANG V, CHANG D, PIX~SON A, LONGUORE W, CAI G-Z. CROLICH E: Human Surfactant Protein D: SP-D Contains a C-Type Lectin Carbohydrate Recognition Do- main. &cl) /3ioc/~c~~t Biop/+ 1991, 290:116-126.

SHI~IIZU H. FISHEH JIH. PAPS?’ P. BENSON B, late K. SEASON RJ, \rOELKeK DR: Primzuy Structure of Rat Pulmonary Surfactant Protein D: cDNA and Deduced Amino Acid Sequence. ./ Hiol Gr,cm 1992, 26731853-1857.

LLI J. WIUJS AC, &lo KBM: Purification. Characterization and cDNA Cloning of Human Lung Surfactant Protein D. Biocheru J 1992, 284z795-802.

LLI J, TI-I&L S, WllXXh&~N H, TIUI’I. R, RUG KBM: Binding of the Pentamer/Hexamer Forms of a Mannan-Binding Protein to Zymosan Activates the Proenzyme ClrzClsz Complex of the Classical Pathway of Complement, without Involvement of Clq. J /f~,~r,,lo/ 1990, 144:2287-2194.

SATO M. KAU’AKA\II K. OSAWA T, TOYOSHIAIA S: Molecular Cloning and Expression of cDNA Encoding a Galactose/N- Acetylgalactosamine-Specific Lectin on Mouse Tumoricidal Macrophages. J B~C&JCW? 1992, 111:331-336.

Cl1RTlS BM, SCHARvOW%E S, WATSON AJ: Sequence and Ex- pression of a Membrane-Associated C-Type Lectin that Exhibits CD4-lndependent Binding of Human Immunodefi- ciency Virus Envelope Glycoprotein gpl20. I-‘rw N&l Acd Sci USA 1992, 89:83%-8360.

A novel mannose-binding receptor, cloned from placenta on rhe ba- sis of its ability to confer HIV-sensirivky on cells, is closely related in sequence to other group II receptors but has a unique carbohydmte- binding specilicky.

17. D0EGE KJ, SASAKI M, KIAIURA T, YAUDA Y: Complete Cod- ing Sequence and Deduced Primary Structure of the Human Cartilage Large Aggregating Proteoglycan. Aggre- can: Human-Specific Repeats, and Additional Alternatively Spliced Forms. J Biol C%em 1991, 266:894-9O2.

18. It\ucti u, KARTHIKEYAN 1 MAUREL P, MARGOIJS RU, MAHCOIJS . . RK: Cloning and Primary Structure of Neurocan, a De-

velopmentally Regulated, Aggregating Chondroitin Sulfate PrOteOglycan of Brain. J Biol U~ern 1992, 267:1953619547.

Neuroc;ln, die latest matrix pr&eoglycan 10 be isolated, in this case from brain, is believed to play a role in the guidance of developing neuronal axons. Its snucture, which includes a.C-t)pe CRD, is similar fo that of proteoglycans from cartilage and connective tissue.

19. WATANABE T, YONEKURA H, TIZRAZONO K, YAMAMO~O H, O~IOTO H: Complete Nucleotide Sequence of Human reg Gene and its Expression in Normal and Tumoral Tissues: the reg Protein, Pancreatic Stone Protein. and Pancreatic

Page 7: Ca2+-dependent carbohydrate-recognition domains in animal proteins

Ca2+-dependent carbohydrate-recognition domains Drickamer 399

20.

21.

22.

23. . .

Thread Protein Are One and the Same Product of the Gene. J Biol Cbern 1990, 2657432-7439.

ROL~QUIER S, VERDIER J-M, IO~A~‘NA J, DAGORN J-C, GIORGI D: Rat Pancreatic Stone Protein Messenger RNA: Abundant Ex- pression in Mature Exocrine Cells. Regulation by Food Content, and Sequence Identity with the Endocrine reg Transcript. J Biol aem 1991, 266:786791.

DE IA MONTE SM, OZTURK M, WANDS JR: Enhanced Expres- sion of an Exocrine Pancreas Protein in Alzheimer’s Dis- ease and the Developing Human Brain. J Ch hwes/ 1990, 86:1004-1013.

IOVWNA J, ORELIE B, KEIM V, DAGOK~’ J-C: Messenger RNA Sequence and Expression of Rat Pancreatitis-Associated Protein, a Lectin-Related Protein Overexpressed dur- ing Acute Experimental Pancreatitis. J Rio/ Cbem 1991, 266:24664-24669.

L%%ERRE C, CHIUSTA L, SI~ION M-T, VERNIEH P, BRECHOT C: A Novel Gene (HIP) Activated in Human Priiary Liver Cancer. tinter Res 1992, 52:5089-5095.

In addition to describing a new CRDcontaining protein found in hep atomas, this paper clarifies the relationship between other group VU proteins, suggesting that the pancreatic stone protein and the pancre- atic thread protein are distinct.

24. FUHLENDORFF J, CLEMMENSEN I, MAGNUSSON S: Primary Structure of Tetranectin, a Plasminogen Kringle 4 Binding Plasma Protein: Homology with Asialoglycoprotein Receptor and Cartilage Proteoglycan Core Protein. Biochemistry 1987, 26:6757-6764.

25. YOSHIMATSU K, OHYA Y, SHIKATA Y, SETO T, HASEGAWA Y, TANAKA I, KAWAMURA T, KITOH K, TOYOSHIMA S, OSAWA T: Purification and cDNA Cloning of a Novel Factor Produced by a Human T-Cell Hybridoma: Sequence Homology with Animal Lectins. Mol Immunol 1992, 29:537-546.

26. HIRABAYASHI J, KUSUNOKI T, KASAI K: Complete Primary Structure of a Galactose-Specific Lectin from the Venom of the Rattlesnake Crotalus atrox: Homologies with Ca2+-Dependent-Type Lectins. J Biol Chem 1991, 266:2320-2326.

27. ATODA H, HYUGA M, MORITA T: The Primary Structure of Coagulation Factor IX/Factor X-Binding Protein Isolated from the Venom of Trimeresurus flavoviridis: Homology with Asialoglycoprotein Receptors, Proteoglycan Core Protein, Tetranectin, and Lymphocyte Fcε Receptor for Immunoglobulin E. J Biol Chem 1991, 266:14903-14911.

28. INOUE S, KOGAKI H, IKEDA K, SAMEJIMA Y, OMORI-SATOH T: Amino Acid Sequences of the Two Subunits of a Phospholipase A2 Inhibitor from the Blood Plasma of Trimeresurus flavoviridis: Sequence Homologies with Pulmonary Surfactant Apoprotein and Animal Lectins. J Biol Chem 1991, 266:1001-1007.

29•. NG NFL, HEW CL: Structure of an Antifreeze Polypeptide from the Sea Raven: Disulfide Bonds and Similarity to Lectin-Binding Proteins. J Biol Chem 1992, 267:16069-16075.

See [30•].

30•. EWART KV, RUBINSKY B, FLETCHER GL: Structure and Functional Similarity between Antifreeze Proteins and Calcium-Dependent Lectins. Biochem Biophys Res Commun 1992, 185:335-340.

These two papers [29•,30•] note an unexpected similarity between certain antifreeze proteins and C-type lectins. The relationship was not noticed when the first antifreeze protein sequence was published.

31. GIORDA R, RUDERT WA, VAVASSORI C, CHAMBERS WH, HISERODT JC, TRUCCO M: NKR-P1, a Signal Transduction Molecule on Natural Killer Cells. Science 1990, 249:1298-1300.

32. HOUCHINS JP, YABE T, MCSHERRY C, BACH FH: DNA Sequence Analysis of NKG2, a Family of Related cDNA Clones Encoding Type II Integral Membrane Proteins on Human Natural Killer Cells. J Exp Med 1991, 173:1017-1020.

33. WONG S, FREEDMAN JD, KELLEHER C, MAGER D, TAKEI F: Ly-49 Multigene Family: New Members of a Superfamily of Type II Membrane Proteins with Lectin-Like Domains. J Immunol 1991, 147:1417-1423.

34. GIORDA R, TRUCCO M: Mouse NKR-P1: a Family of Genes Selectively Coexpressed in Adherent Lymphokine-Activated Killer Cells. J Immunol 1991, 147:1701-1708.

35. YOKOYAMA WM, RYAN JC, HUNTER JJ, SMITH HRC, STARK M, SEAMAN WE: cDNA Cloning of Mouse NKR-P1 and Genetic Linkage with Ly-49: Identification of a Natural Killer Cell Gene Complex on Mouse Chromosome 6. J Immunol 1991, 147:3229-3236.

36. CHAN P-Y, TAKEI F: Molecular Cloning and Characterization of a Novel Murine T Cell Surface Antigen, YE1/48. J Immunol 1989, 142:1727-1736.

37. NAKAYAMA E, VON HOEGEN I, PARNES JR: Sequence of the Lyb-2 B-cell Differentiation Antigen Defines a Gene Superfamily of Receptors with Inverted Membrane Orientation. Proc Natl Acad Sci USA 1989, 86:1352-1356.

38. YOKOYAMA WM, JACOBS LB, KANAGAWA O, SHEVACH EM, COHEN DI: A Murine T Lymphocyte Antigen Belongs to a Supergene Family of Type II Integral Membrane Proteins. J Immunol 1989, 143:1379-1386.

39••. DRICKAMER K: Evolution of Ca2+-Dependent Animal Lectins. Prog Nucleic Acid Res Mol Biol 1993, 45:207-232.

A detailed discussion of the evolutionary issues touched on in the present review. Although the data set is slightly less complete than that shown in Fig. 3, the conclusions still hold true.

40. BEZOUSKA K, CRICHLOW GV, ROSE JM, TAYLOR ME, DRICKAMER K: Evolutionary Conservation of Intron Position in a Subfamily of Genes Encoding Carbohydrate-Recognition Domains. J Biol Chem 1991, 266:11604-11609.

41. KIM SJ, RUIZ N, BEZOUSKA K, DRICKAMER K: Organization and Characterization of the Gene Encoding the Human Macrophage Mannose Receptor. Genomics 1992, 14:721-727.

42. WEIS WI, KAHN R, FOURME R, DRICKAMER K, HENDRICKSON WA: Structure of the Calcium-Dependent Lectin Domain from a Rat Mannose-Binding Protein Determined by MAD Phasing. Science 1991, 254:1608-1615.

43••. WEIS WI, DRICKAMER K, HENDRICKSON WA: Structure of a C-Type Mannose-Binding Protein Complexed with an Oligosaccharide. Nature 1992, 360:127-134.

This paper, together with [42], provides the structural basis for interpreting the sequence conservation common to the C-type CRDs.

44.

45.

46.

47.

48.

49. . .

ERBE DV, WOLITZKY BA, PRESTA LG, NORTON CR, RAMOS RJ, BURNS DK, RUMBERGER JM, RAO BNN, FOXALL C, BRANDLEY BK, LASKY LA: Identification of an E-Selectin Region Critical for Carbohydrate Recognition and Cell Adhesion. J Cell Biol 1992, 119:215-227.

Monoclonal antibodies and site-directed mutagenesis are employed to define residues in E-selectin involved in ligand binding. The results provide strong evidence for a binding site different from that observed in the mannose-binding CRD crystal.

GENG J-G, MOORE KL, JOHNSON AE, MCEVER RP: Neutrophil Recognition Requires a Ca2+-Induced Conformation Change in the Lectin Domain of GMP-140. J Biol Chem 1991, 266:22313-22318.

GENG J-G, HEAVNER GA, MCEVER RP: Lectin Domain Peptides from Selectins Interact with Both Cell Surface Ligands and Ca2+ Ions. J Biol Chem 1992, 267:19846-19853.

DRICKAMER K: Engineering Galactose-Binding Activity into a C-type Mannose-Binding Protein. Nature 1992, 360:183-186.

HOLT GD: Identifying Glycoconjugate-Binding Domains: Building on the Past. Glycobiology 1991, 1:329-336.

QUESENBERRY MS, DRICKAMER K: Role of Conserved and Nonconserved Residues in the Ca2+-Dependent Carbohydrate-Recognition Domain of a Rat Mannose-Binding Protein: Analysis by Random Cassette Mutagenesis. J Biol Chem 1992, 267:10831-10841.

50•. LASKY LA: Selectins: Interpreters of Cell-Specific Carbohydrate Information during Inflammation. Science 1992, 258:964-969.

51. ASA D, GANT T, ODA Y, BRANDLEY BK: Evidence for Two Classes of Carbohydrate Binding Sites on Selectins. Glycobiology 1992, 2:395-400.

52•. BETTLER B, TEXIDO G, RAGGINI S, RÜEGG D, HOFSTETTER H: Immunoglobulin E-Binding Site in Fcε Receptor (FcεRII/CD23) Identified by Homolog-Scanning Mutagenesis. J Biol Chem 1992, 267:185-191.

This careful study of the low-affinity IgE Fc receptor (CD23) provides evidence for the disulfide-bonding pattern and shows that the IgE-binding site lies in the CRD-like region.

53. VERCELLI D, HELM B, MARSH P, PADLAN E, GEHA RS, GOULD H: The B-Cell Binding Site on Human Immunoglobulin E. Nature 1989, 338:643-651.

54. RICHARDS ML, KATZ DH: The Binding of IgE to Murine FcεRII Is Calcium-Dependent but Not Inhibited by Carbohydrate. J Immunol 1990, 144:2638-2646.

55•. POCHON S, GRABER P, YEAGER M, JANSEN K, BERNARD AR, AUBRY J-P, BONNEFOY J-Y: Demonstration of a Second Ligand for the Low Affinity Receptor for Immunoglobulin E (CD23) Using Recombinant CD23 Reconstituted into Fluorescent Microsomes. J Exp Med 1992, 176:389-397.

See [56•].

56•. AUBRY J-P, POCHON S, GRABER P, JANSEN KU, BONNEFOY J-Y: CD21 Is a Ligand for CD23 and Regulates IgE Production. Nature 1992, 358:505-507.

Although the exact nature of the CD23-binding determinant on IgE remains poorly understood, evidence provided in these papers [55•,56•] suggests that carbohydrates may be involved in the interaction of CD23 with at least some ligands.

57. HIGGINS DG, SHARP PM: CLUSTAL: a Package for Performing Multiple Sequence Alignment on a Microcomputer. Gene 1988, 73:237-244.

K Drickamer, Department of Biochemistry and Molecular Biophysics, Columbia University, 630 West 168th Street, New York, New York 10032, USA.