© 2004 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim. DOI: 10.1002/cbic.200300753. ChemBioChem 2004, 5, 740–764

Chemical Biology of the Sugar Code

Hans-Joachim Gabius,*[a] Hans-Christian Siebert,[a] Sabine André,[a] Jesús Jiménez-Barbero,[b] and Harold Rüdiger[c]

In respectful and thankful memory of Professor Friedrich Cramer, who died three months before his 80th birthday

A high-density coding system is essential to allow cells to communicate efficiently and swiftly through complex surface interactions. All the structural requirements for forming a wide array of signals with a system of minimal size are met by oligomers of carbohydrates. These molecules surpass amino acids and nucleotides by far in information-storing capacity and serve as ligands in biorecognition processes for the transfer of information. The results of work aiming to reveal the intricate ways in which oligosaccharide determinants of cellular glycoconjugates interact with tissue lectins and thereby trigger multifarious cellular responses (e.g. in adhesion or growth regulation) are teaching amazing lessons about the range of finely tuned activities involved. The ability of enzymes to generate an enormous diversity of biochemical signals is matched by receptor proteins (lectins), which are equally elaborate. The multiformity of lectins ensures accurate signal decoding and transmission. The exquisite refinement of both sides of the protein–carbohydrate recognition system turns the structural complexity of glycans (a demanding but essentially mastered problem for analytical chemistry) into a biochemical virtue. The emerging medical importance of protein–carbohydrate recognition, for example in combating infection and the spread of tumors or in targeting drugs, also explains why this interaction system is no longer below industrial radarscopes. Our review sketches the concept of the sugar code, with a solid description of the historical background.
We also place emphasis on a distinctive feature of the code, that is, the potential of a carbohydrate ligand to adopt various defined shapes, each with its own particular ligand properties (differential conformer selection). Proper consideration of the structure and shape of the ligand enables us to envision the chemical design of potent binding partners for a target (in lectin-mediated drug delivery) or ways to block lectins of medical importance (in infection, tumor spread, or inflammation).

1. Introduction

Biological information storage and transfer are commonly described as being based solely on nucleic acids and proteins. In contrast to nucleotides and amino acids, the most abundant type of biomolecule in nature, the carbohydrate molecule, has been almost completely sidelined in this respect. Sugar molecules have been nearly exclusively assigned as building blocks of protective cell wall constituents (for example cellulose and chitin) or as biochemical fuel in energy metabolism. This paradigm, which is reflected in textbooks, has been questioned occasionally over the years. An exemplary quotation from 1972 points out that glycans do matter more than originally assumed: “The polysaccharides of mammalian connective tissue, and glycoproteins, begin to make biochemical sense for the first time ever. So many exciting developments have occurred that this period seems to have moved us out of a dark age to see polysaccharides in quite a new light. They have become interesting molecules to contemplate in relation to the life of a cell. The ugly ducklings have begun to look a little more like swans.
In this sense, polysaccharides begin to appear attractive molecules, shapely molecules.”[1] With hindsight, the answer to the question of why the exceptional talents of carbohydrates have remained nearly unnoticed for so long appears to be rather simple; in essence, this neglect occurred because “glycoconjugates are much more complex, variegated, and difficult to study than proteins or nucleic acids.”[2] Viewed from the perspective of bioinformatics, however, this structural property in fact makes oligomers of saccharides “ideal for generating compact units with explicit informational properties.”[3] This argument and other reasons listed below explain why it is justified to portray individual monosaccharides as letters of an alphabet. These letters form biochemical code words. The coining of terms such as sugar code or glycomics helps condense the concept into keywords. However, it goes without saying that the reader can expect us to carve out a distinctive image of this fundamental functionality of glycans.[4]

[a] Prof. Dr. H.-J. Gabius, Priv.-Doz. Dr. H.-C. Siebert, Dr. S. André, Institut für Physiologische Chemie, Tierärztliche Fakultät, Ludwig-Maximilians-Universität, Veterinärstraße 13, 80539 München (Germany). Fax: (+49) 89-2180-2508. E-mail: gabius@tiph.vetmed.uni-muenchen.de, gabius@lectins.de
[b] Prof. Dr. J. Jiménez-Barbero, Centro de Investigaciones Biológicas, CSIC, Ramiro de Maeztu 9, 28040 Madrid (Spain)
[c] Prof. Dr. H. Rüdiger, Institut für Pharmazie und Lebensmittelchemie, Julius-Maximilians-Universität, Am Hubland, 97074 Würzburg (Germany)

Hans-Joachim Gabius was born in Bad Bevensen (Germany). He studied biochemistry in Hannover (as a fellow of the Studienstiftung des deutschen Volkes) and obtained an MSc in 1980, then a PhD in 1982 for chemical and biochemical studies on the proofreading mechanisms of aminoacyl-tRNA synthetases, under the direction of F. Cramer, Max Planck Institute for Experimental Medicine, Göttingen. He spent most of 1981 investigating tRNA splicing in the laboratory of J. Abelson (Department of Chemistry) at UC San Diego. After starting work in tumor lectinology in 1983 at the Max Planck Institute in Göttingen, he went on to a post-doctoral research post in the group of S. H. Barondes at UC San Diego (1984–1985) and appointments as assistant professor for biochemistry at the Max Planck Institute for Experimental Medicine in Göttingen (1987), as associate professor for pharmaceutical chemistry at the University of Marburg (1991), and as head of the Institute for Physiological Chemistry, Faculty of Veterinary Medicine, University of Munich (1993). His research awards include the Otto-Hahn-Medal (1983), the Award of the Dr. Carl Duisberg Foundation (1988), and the Award of the Paul Martini Foundation (1990). His research interests comprise chemical, biophysical, and biochemical analysis of protein–carbohydrate interactions relevant to the biological and medical fields, such as the development of glycoscientific strategies for tumor diagnosis and therapy and the elucidation of the functions of mammalian lectins. He was prominently placed in the ranking of researchers by number of hot papers produced by the Institute of Scientific Information in 1998 (http://www.the-scientist.library.upenn.edu/yr1999/june/hotresearch_p1_990621.html; see also www.lectins.de).

Jesús Jiménez-Barbero was born in 1960 in Madrid (Spain) and is Professor at the Center for Biological Research of the CSIC. He obtained his PhD in Organic Chemistry in 1987 for synthesis work and conformational studies on saccharides at the Institute of Organic Chemistry of the CSIC in Madrid under the supervision of M. Bernabé and M. Martín-Lomas. Following post-doctoral training in molecular mechanics and NMR methodology from 1986 to 1988 at CERMAV-CNRS in Grenoble, the University of Zürich, and the National Institute for Medical Research at Mill Hill, he received tenure in the CSIC in 1988. He was Visiting Scientist at the Department of Chemistry of Carnegie Mellon University at Pittsburgh between 1990 and 1992 and then started work on the application of NMR methodology to the study of interactions between carbohydrates and proteins and conformational and structural studies of oligo- and polysaccharides in Madrid. He was promoted to Senior Research Scientist in 1996 and to Full Professor in 2002, when he moved from the Institute of Organic Chemistry to the Center for Biological Research of the CSIC. He is mostly interested in obtaining a 3D view of the molecular recognition processes in which carbohydrates are involved, in particular by the application of NMR spectroscopy and modeling methods. He is a member of the editorial boards of several international journals and has published almost 200 scientific papers, reviews, and book chapters on the above-mentioned topics. He has also given more than sixty lectures at international conferences and research institutions.

Hans-Christian Siebert was born in 1960 in Kiel (Germany) and studied physics in Kiel (1980–1983) and in Heidelberg (1983–1987). He obtained a diploma (1987) for work on dynamic NMR spectroscopy and a PhD (1990) for conformational studies of gangliosides by NMR spectroscopy and computational methods in J. Dabrowski's group at the Department of Organic Chemistry, Max Planck Institute for Medical Research, Heidelberg. Following postdoctoral research in J. F. G. Vliegenthart's and R. Kaptein's groups at the Bijvoet Center for Biomolecular Research, Utrecht University, he joined H.-J. Gabius' research team at the Institute of Pharmaceutical Chemistry in Marburg in 1992 and moved with him to Munich, where he became Dr. med. vet. habil. and received the Venia legendi in Biochemistry in 1999. His research interests include NMR spectroscopy and molecular modeling structural studies of carbohydrate–protein interactions with biomedical relevance.

Harold Rüdiger was born in Stolp (Germany). He studied chemistry and biochemistry at the University of Würzburg, where he earned his PhD in 1963 for studies on the kinetics of peptide synthesis in yeast. He moved to the Biochemistry Institute of the University of Uppsala (Sweden), then headed by Nobel Prize winner A. Tiselius, to do postdoctoral research in J. Porath's group, where he studied modern biochemical analytical and separation techniques and worked out a purification protocol for a plant lectin. Upon returning to Germany in 1966, he joined L. Jaenicke's group at the Biochemistry Institute of the University of Cologne, where he studied the bacterial cobalamin-dependent methionine biosynthesis. In 1971, he became lecturer in biochemistry and in 1974 he moved to Würzburg University to take up a position as a professor of biochemistry. His research interests center on plant lectins and their interaction with various ligands.

Sabine André was born in 1966 in Bad Hersfeld (Germany) and studied biology in Göttingen (1986–1991). In 1991 she joined H.-J. Gabius' group at Philipps University in Marburg. She obtained a diploma (1992) and a PhD (1996) in biochemical and cell/molecular biological analysis of protein–carbohydrate interactions. She gained experience in epidermal growth factor receptor signaling research in A. Villalobo's group at the Higher Council for Scientific Research of Spain (CSIC) in Madrid in 1994. Her research gives special emphasis to the study of glycoclusters and glycomimetics for lectin-targeted drug design, to lectin functions with medical relevance, studied by using cell biological models produced with up-to-date technology for rationally manipulating galectin/glycoconjugate expression, and to developing new diagnostic tools for histopathology.

2. The Hardware of the Sugar Code

Carbohydrates have several exceptional features at their disposal. These features make a strong case for a prominent role of D-glucose and its relatives in information handling. Foremost is the unsurpassed capacity of these molecules to form isomers. In contrast to nucleotides or amino acids, saccharides contain several approximately chemically equivalent sites for chain elongation and, notably, even for branching (Scheme 1). As illustrated for β-linked diglucosides in Scheme 2, chemically distinct compounds are generated when the attachment point for the unit at the reducing end is moved stepwise from the 2’ to the 3’, 4’, or 6’-hydroxy group: Sophorose (Glcpβ1-2Glc) is a constituent of plant glycosides such as the sweetener stevioside from the Composita Stevia rebaudiana, which is popular in Japan, or the glycosides found in root extracts of Uzara (South African Xysmalobium and Pachycarpus species of the Asclepiadaceae family). Laminaribiose (Glcpβ1-3Glc) is a product of the partial hydrolysis of laminarin, an algal polysaccharide from Laminaria (seaweed; Chrysophyceae/Phaeophyceae). Laminarin also occurs in immunomodulatory fungal β1-3/1-6-linked polysaccharides such as schizophyllan (from Schizophyllum commune) or lentinan (from Lentinus edodes). Cellobiose (Glcpβ1-4Glc) is the basic structural unit of the most common carbon compound in nature, cellulose. Gentiobiose (Glcpβ1-6Glc) occurs as the bitter-tasting ingredient of extracts taken from the roots of Gentiana lutea. This diglucoside also forms the carbohydrate part of various plant glycosides, among them amygdalin, the glycoside found in bitter almonds.
This compound was instrumental in experiments delineating the famous “lock-and-key” principle. In 1894, E. Fischer investigated the stereospecificity of the enzyme emulsin and reported that it shows “eine kräftige Wirkung auf Amygdalin” (a strong effect on amygdalin; p. 2990, ref. [5]). He continues (p. 2992): “Invertin und Emulsin haben bekanntlich manche Ähnlichkeit mit den Proteinstoffen und besitzen wie jene unzweifelhaft ein asymmetrisch gebautes Molekül. Ihre beschränkte Wirkung auf die Glucoside liesse sich also auch durch die Annahme erklären, dass nur bei ähnlichem geometrischem Bau diejenige Annäherung der Moleküle stattfinden kann, welche zur Auslösung des chemischen Vorganges erforderlich ist. Um ein Bild zu gebrauchen, will ich sagen, dass Enzym und Glucosid wie Schloss und Schlüssel zu einander passen müssen, um eine chemische Wirkung auf einander ausüben zu können.” (It is known that invertin and emulsin have several features in common with proteinaceous compounds and undoubtedly also harbor an asymmetrically built molecule. Their limited effect on the glucosides might thus also be explained by the assumption that only molecules with a similar geometrical design can approach one another as required for a chemical process to occur. To use a metaphor, I wish to say that enzyme and glucoside must fit like lock and key to be able to exert a chemical effect on each other).[5]

Scheme 1. Illustration of the exquisite chemical versatility of a monosaccharide as a module for chain initiation and elongation. Whereas nucleotides (left; deoxyadenosine monophosphate) or amino acids (center; serine) form linear oligo- and polymers by 5’,3’-phosphodiester-dependent or peptide-bond-dependent elongation (positions marked by arrows), monosaccharide (right; α/β-D-glucose) addition to a growing oligomer can proceed through the four hydroxy groups at C2, C3, C4, and C6 and the two anomeric hydroxy positions (see Schemes 2 and 3 for the structures of the resulting diglucosides).

Scheme 2. Illustration of the structural series of β-diglucosides derived by shifting the position of the β1-linked hydroxy group of the reducing-end glucose moiety from the 2’ to the 3’, 4’, or 6’-site (shown by arrows in Scheme 1). The biological relevance of this variability is underscored by the examples of the natural occurrence of each diglucoside given in the text.

Moving on from this case study of a disaccharide, the structural features described above raise the expectation that carbohydrates will be second to no other biomolecule class in the diversity of the isomers they form, and this expectation is completely vindicated by diversity calculations. The theoretical limit of isomer diversity, that is, the total number of hexamers that can be formed with 20 different building blocks, differs tremendously between types of monomer: 6.4 × 10^7 hexapeptides are possible versus as many as 1.44 × 10^15 hexasaccharides.[6] These calculations include the noted variability of the attachment point for glycosides. Moreover, the level of diversity introduced by the occurrence of the two anomeric variants at each glycosidic linkage is taken into account. The two structures in Scheme 3 illustrate that the seemingly rather minor difference in only one structural parameter between the diglucosides cellobiose and maltose effectively translates into the widely disparate properties of the polymers cellulose and starch/glycogen. Thus, in order to characterize a glycosidic linkage precisely, not one (i.e. the sequence) but three independent parameters are necessary.
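The peptide figure quoted above can be checked by simple counting, and the same bookkeeping shows how linkage point and anomeric configuration multiply the options for sugars. The following is a minimal Python sketch under stated assumptions: it reproduces only the 20^6 count for linear hexapeptides and enumerates the diglucosides that arise from two anomeric configurations and the four acceptor hydroxy groups named in Scheme 1. It does not attempt to reproduce the hexasaccharide figure of ref. [6], whose counting scheme is not given here, and it leaves out linkages between two anomeric centers. All variable and function names are illustrative.

```python
from itertools import product

# Linear hexapeptides: 20 choices at each of 6 positions, one linkage type.
N_BUILDING_BLOCKS = 20
CHAIN_LENGTH = 6
n_hexapeptides = N_BUILDING_BLOCKS ** CHAIN_LENGTH

# Diglucosides Glcp(anomer)1-(position)Glc: the anomeric configuration of the
# non-reducing unit and the acceptor hydroxy group are independent choices.
ANOMERS = ("α", "β")
ACCEPTOR_POSITIONS = (2, 3, 4, 6)

# Trivial names mentioned in the text; other combinations stay unnamed here.
NAMES_FROM_TEXT = {
    ("β", 2): "sophorose",
    ("β", 3): "laminaribiose",
    ("β", 4): "cellobiose",
    ("β", 6): "gentiobiose",
    ("α", 4): "maltose",
}


def enumerate_diglucosides():
    """Return (formula, trivial name or None) for every anomer/position pair."""
    result = []
    for anomer, position in product(ANOMERS, ACCEPTOR_POSITIONS):
        formula = f"Glcp{anomer}1-{position}Glc"
        result.append((formula, NAMES_FROM_TEXT.get((anomer, position))))
    return result


if __name__ == "__main__":
    print(f"hexapeptides from 20 building blocks: {n_hexapeptides:,}")
    for formula, name in enumerate_diglucosides():
        print(formula, name or "(not named in the text)")
```

Even for a dimer of a single monosaccharide, the two extra parameters already give eight distinct compounds where a dipeptide of one amino acid gives one.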
Scheme 3. Illustration of the structural impact of anomer variation on the two otherwise structurally identical diglucosides cellobiose (building block of cellulose; see also Schemes 1 and 2) and maltose (building block of starch and glycogen).

The three parameters needed to characterize a glycosidic linkage (for the first and second dimensions of structural diversity; for the third dimension, see Section 6) are: a) the sequence of the individual monomers, b) the individual linkage points, and c) the anomeric configuration. Amazingly, the potential for structural diversity at the level of the sequence does not end at this point. In structural terms, a further level of diversity is accessible through the introduction of substituents. Glycosaminoglycan chains of proteoglycans found in the extracellular matrix, such as heparan sulfates, provide a telling example of how even a branchless backbone with repeating disaccharide units, whose main function has been thought of as passive structural scaffolding, is turned into a chain of biologically distinct microdomains through substitution.[7] The presentation of substituents facilitates versatile multicontact recognition relevant for the coordination of cell–matrix interactions. Site-specific introduction of sulfate substituents to hydroxy/amino groups and the epimerization of D-glucuronic acid to L-iduronic acid in the basic repeating unit (GlcN-HexA)n are the key to this heparanomic complexity.[8] From the repeating core unit of the initial enzymatic polymer formation, a total of 48 different disaccharides can theoretically be formed by the ensuing modifications. A particular and rare modification pattern results in the anticoagulant pentasaccharide determinant of heparin (Scheme 4), an example of a carbohydrate compound currently used in clinical applications and an object of chemical refinement toward an optimal design.[9] A synthetic pentasaccharide comprising the same features is now commercially available (Table 1).
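The figure of 48 theoretical disaccharides can be made plausible by enumerating modification states of the repeating unit. The sketch below is one possible accounting and is not taken from the source: it assumes two uronic acid epimers, optional 2-O-sulfation of the uronic acid, three states of the glucosamine nitrogen (free amine, N-acetyl, N-sulfate), and optional 3-O- and 6-O-sulfation of the glucosamine. The text itself states only the total and the types of modification (sulfation of hydroxy/amino groups and epimerization), so the individual factors, the label format, and all names in the code are assumptions.

```python
from itertools import product

# Assumed modification states of the disaccharide unit (labels are illustrative).
URONIC_ACID = ("GlcA", "IdoA")        # D-glucuronic acid or its epimer L-iduronic acid
URONIC_2_O_SULFATE = (False, True)
GLCN_NITROGEN = ("NH2", "NAc", "NS")  # free amine, N-acetyl, N-sulfate
GLCN_3_O_SULFATE = (False, True)
GLCN_6_O_SULFATE = (False, True)


def enumerate_disaccharides():
    """Return one tuple per combination of the assumed modification states."""
    return list(
        product(
            URONIC_ACID,
            URONIC_2_O_SULFATE,
            GLCN_NITROGEN,
            GLCN_3_O_SULFATE,
            GLCN_6_O_SULFATE,
        )
    )


def label(state):
    """Build a compact, purely illustrative label such as 'IdoA(2S)-GlcNS(6S)'."""
    uronic, s2, nitrogen, s3, s6 = state
    left = uronic + ("(2S)" if s2 else "")
    sulfates = "".join(tag for tag, present in (("3S", s3), ("6S", s6)) if present)
    right = "Glc" + nitrogen + (f"({sulfates})" if sulfates else "")
    return f"{left}-{right}"


if __name__ == "__main__":
    states = enumerate_disaccharides()
    print(len(states), "theoretical disaccharides under these assumptions")
    for state in states[:5]:
        print(label(state))
```

Under these assumptions the count is 2 × 2 × 3 × 2 × 2, which matches the total given in the text; the order of the two residues in the label is a display choice only.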
As alluded to above, the three-dimensional shape of the molecule comes into play too. L-Iduronic acid can undergo conformer interconversion (¹C₄ chair, ²S_O skew boat).[10] By adopting different conformations at these flexible hingelike sites, the topological display of the neighboring substituents can be easily modulated, and this substituent pattern has an impact on the contact sites for interaction with receptors, for example, antithrombin III and fibroblast growth factors (see Section 6 for further information). Needless to say, aberrations in proteoglycan synthesis and modification have been linked to developmental dysregulation in model organisms, for example, sqv (squashed vulva) genes in the nematode Caenorhabditis elegans or sfl (sulfateless)/pipe genes in the fruitfly Drosophila melanogaster.[11] It is of note that sulfation has also found frequent use as a tool to form “Umlaut-like” letters in the sugar alphabet in the N- and O-glycans of glycoproteins and in glycolipids. For example, GalNAc-4-sulfate (but not GalNAc) is important for routing pituitary glycoprotein hormones, as are GalNAc/Gal/GlcNAc-6-sulfates for lymphocyte homing.[12] So far, 31 carbohydrate sulfotransferases have been described along with their individual ligand spectrum, which underlines the sophisticated ramifications of this type of modification.[12]

Scheme 4. Illustration of the structure of the heparin-derived anticoagulant pentasaccharide that binds antithrombin III with high specificity. The introduction of the 3’-O-sulfate group (circled) into the central substituted (N- and O-sulfated) D-glucosamine residue by a 3-O-sulfotransferase is essential for pharmacologic activity. This rare natural carbohydrate is used as a model for the development of drugs for preventing and treating venous and arterial thromboembolism.

Table 1. Examples of sugar compounds used as pharmaceuticals.
Compound | Target | Disease
acarbose | α-glucosidases (amylases) | diabetes mellitus
heparin/heparinoids | antithrombin III | thrombosis
heparin pentasaccharide (Fondaparinux) | antithrombin III (factor Xa) | thrombosis
derivatives or mimetics of 2-deoxy-2,3-dehydro-N-acetylneuraminic acid | neuraminidase | viral infection
N-butyldeoxynojirimycin | α-glucosidases (N-glycan processing) | viral infection
derivatives or mimetics of milk oligosaccharides | adhesins and toxins (lectins) | bacterial infection
GlcN-(2-O-hexadecyl)phosphatidylinositol | GPI-mannosyltransferase I | protozoan infection (e.g. African sleeping sickness)
derivatives or mimetics of sialylated/sulfated Le^a/x epitopes | selectins | inflammatory reaction
D-Man | phosphomannose isomerase deficiency | congenital disorder of glycosylation Ib
L-Fuc | GDP-fucose transport | congenital disorder of glycosylation IIc (LAD II)
N-butyldeoxygalactonojirimycin and properly glycosylated β-gluco(galacto)cerebrosidase | glycosphingolipid synthesis and enzymatic degradation | glycosphingolipid storage disorders

Coming back to the basic chemical features that are favorable for a role in information transfer, the amphiphilic character of carbohydrates is a boon for intermolecular interactions. This property affords multiple donor/acceptor sites for directional hydrogen bonds.[13] Moreover, a set of suitably positioned polarized C–H bonds can be engaged in C–H/π-electron and stacking interactions in certain cases (e.g. to bring about intimate D-Gal–Trp contact).[13] This principle has been seen to work in organisms from various branches of the evolutionary tree. The enterotoxin of Escherichia coli, plant lectins such as ricin, the agglutinin from Erythrina corallodendron, and animal galectins exploit such a contact between the ligand and an aromatic side chain, preferably that of tryptophan. This duty to make thermodynamically favorable contact to a ligand in the binding site (see Section 6 and illustrations therein for a view of a receptor–ligand complex with such a contact) is one likely reason why tryptophan is indispensable in the panel of proteinogenic amino acids. In the course of establishing contact between the sugar ligand and the aromatic ring, water molecules are dispelled from the rather hydrophobic patches of the carbohydrate, a process that makes a sizable contribution to the thermodynamic driving force of the binding process.[13b] The selectivity of the molecular rendezvous, achieved through a combination of the above-mentioned factors, explains the exquisite way in which individual code letters, for example D-glucose and its 4-epimer D-galactose, are distinguished. Thus, a change of only one hydroxy group from the equatorial to the axial orientation keeps perturbation of the favored “tridymite” water structure minimal and is sufficient to establish distinct letters.[14]

In summary, the structural variability introduced by changes in linkage points, anomeric position, and placement of substituents endows carbohydrates with the features necessary for a high-density coding system. In fact, not only glycosaminoglycans but all N- and O-glycans and glycolipids are representatives of the chemical diversity realized by the enzymatic machinery of glycan production.[15] Good reasons for development of a new paradigm that views oligosaccharides as “multipurpose tools” are as follows: a) the strategic placement of glycan chains in the glycocalyx so that they reach out into the extracellular space like sensors or tentacles, b) the existence of more than 1000 known N-glycan structures (this list is continuously growing), c) the ways of marking proteins with a distinct sugar signal likened to a postal code (e.g. Man-6-phosphate and GalNAc-4-sulfate for routing of lysosomal enzymes and pituitary glycoprotein hormones, respectively) that have already been detected, and d) the overall complexity of the families of glycosyltransferases and glycan-modifying enzymes such as the sulfotransferases mentioned above.[4a, 15a] According to recent accounts, the number of glycosyltransferase-related sequences identified has grown to more than 7200, distributed over 65 distinct sequence-derived families; about 1 % of the open reading frames of each metazoan genome is calculated to be devoted to building up glycans.[15d] It now looks like a foregone conclusion that these determinants equip cells with attractive sensor points/areas for intermolecular contact. If matched on the level of receptor proteins, the well-elaborated processes of code word generation would make sense as a way to establish a versatile communication mode involved in biosignaling, cell-specific targeting, and host-defence pathways.[16] Families of proteins capable of “reading” the sugar-encoded messages, besides the enzymes that tailor glycan determinants, are the missing link needed to turn structure into biological response. This assumption about information transfer pathways has been shown to be correct by the discovery of lectins.

3. Lectins: Tools to Read Sugar-Encoded Messages

The first documented observations of lectin activity were made on clumping red blood cells (Table 2). In 1860, the venom of the rattlesnake proved active in this respect: “one drop of venom was put on a slide and a drop of blood from a pigeon's wounded wing allowed to fall upon it. They were instantly mixed. Within three minutes the mass had coagulated firmly, and within ten it was of arterial redness.”[17] The concern that the term “coagulation” used by Mitchell might reflect the action of procoagulants but not cell agglutination was satisfactorily addressed by deliberately repeating the experiments with washed erythrocytes.[18] The paper by Flexner and Noguchi in which this work is reported was in fact introduced by Mitchell, who commented on it as follows: “I have long desired that the actions of venoms upon blood should be further examined. I finally indicated in a series of propositions the direction I wished the inquiry to take. Starting from these the following very satisfactory study has been made by Professor Flexner and Dr. Noguchi. My own share in it, although so limited, I mention with satisfaction.”[18] Only a few years later, in 1906, an agglutinating activity of activated-complement-coated erythrocytes was detected in bovine serum. To allow the reader to follow the major historical events in this field, we have listed this finding in Table 2.

Table 2. Brief historical account of lectinology.[a]
Year | Event
1860 | Observation of blood “coagulation” by rattlesnake venom (S. W. Mitchell)
1888 | Detection of erythrocyte agglutination by protein fractions from castor beans and other plant seeds (H. Stillmark)
1891 | Toxic plant agglutinins applied as model antigens (P. Ehrlich)
1898 | Introduction of the term “haemagglutinin” or phytohaemagglutinin for plant proteins that agglutinate red blood cells (M. Elfstrand)
1902 | Detection of bacterial agglutinins (R. Kraus)
1902 | Demonstration that blood “coagulation” by snake venom (later shown to depend on a C-type lectin) observed in 1860 was not caused by blood clotting but by cell agglutination (S. Flexner, H. Noguchi)
1906 | Detection of an agglutinin in bovine serum (later characterized as the C-type lectin conglutinin) that acts on activated complement-coated erythrocytes (J. Bordet, F. P. Gay)
1907 | Detection of nontoxic agglutinins in plants (K. Landsteiner, H. Raubitschek)
1913 | Use of intact cells for the purification of lectins (R. Kobert)
1919 | Crystallization of a lectin, concanavalin A (J. B. Sumner)
1936 | Precipitation of starch, glycogen, and mucins by concanavalin A and its interaction with the stromata of erythrocytes define the carbohydrate as a ligand (J. B. Sumner, S. F. Howell)
1941 | Detection of viral agglutinins (G. K. Hirst)
1947–1948 | Detection of lectins specific for human blood groups (W. C. Boyd, K. O. Renkonen)
1952 | Carbohydrate nature of blood group determinants proven by lectin-mediated agglutination and its sugar-dependent inhibition (W. M. Watkins, W. T. J. Morgan)
1954 | Introduction of the term “lectin” for plant agglutinins, primarily for those that are blood-group specific (W. C. Boyd)
1960 | Detection of the mitogenic potency of lectins toward lymphocytes (P. C. Nowell)
1965 | Application of affinity chromatography for the isolation of lectins (I. J. Goldstein, B. B. L. Agrawal)
1972 | Determination of the amino acid sequence and three-dimensional structure of a lectin, concanavalin A (G. M. Edelman, K. O. Hardman, C. F. Ainsworth et al.)
1972–1977 | Discovery of impaired synthesis of a marker for glycoprotein (lysosomal enzymes) routing as the cause of a human disease (mucolipidosis II) and identification of the marker as Man-6-phosphate, the ligand for P-type lectins (E. F. Neufeld et al.; W. S. Sly et al.)
1974 | Isolation of a mammalian Gal/GalNAc-specific lectin from the liver (G. Ashwell)
1978 | First conference focusing on lectins and glycoconjugates, termed Interlec (T. C. Bøg-Hansen)
1979 | Detection of endogenous ligands for plant lectins (H. Rüdiger)
1983 | Detection of the insecticidal action of a plant lectin (L. L. Murdock)
1984 | Isolation of lectins from tumors (H.-J. Gabius; R. Lotan, A. Raz)
1985 | Discovery of immobilized glycoproteins as pan-affinity adsorbents for lectins (H. Rüdiger)
1987 | Introduction of neoglycoconjugates for localization of tissue lectins for tumor diagnosis (H.-J. Gabius et al.)
1989 | Detection of the fungicidal action of a plant lectin (W. J. Peumans)
1992–1993 | Identification of impaired synthesis of lectin (selectin) ligands by defective fucosylation as the cause for leukocyte adhesion deficiency type II (A. Etzioni et al.)
1995 | Structural analysis of a lectin–ligand complex in solution by NMR spectroscopy (J. Jiménez-Barbero et al.)
1996–1998 | Detection of differential conformer selection by plant and animal lectins (H.-J. Gabius et al.; L. Poppe et al.)
2001–2002 | Advances in lectinology and glycosciences honored by dedication of special issues of Biochim. Biophys. Acta, Biochimie, Biol. Chem., Cells Tissues Organs, Chem. Rev., Curr. Opin. Struct. Biol., J. Agric. Food Chem. (Liener symposium), and Science to these topics
[a] Extended and modified from ref. [16d].

As with snake venom, the active protein of bovine serum was later biochemically characterized as a C-type lectin (see also Section 4), in this case from the subgroup of collectins named conglutinin, which binds to the Man8/Man9 N-glycan of human iC3b at Asn917 in the α chain of complement glycoprotein C3.[19] The assay used to look at haemagglutination was also instrumental to the discovery of the cell-bridging capacity of proteins in plant extracts, initially that of toxic castor bean extract.[20] Stillmark remarked in his M.D. thesis, published in 1888: “Das Ricin bewirkt in defibriniertem serumhaltigem Blute eine Zusammenballung der rothen Blutkörperchen unter Bildung einer fibrinähnlichen Substanz.” (Ricin causes a conglomeration (or agglutination) of the red blood corpuscles in defibrinated serum-containing blood that yields a fibrin-like substance).[20] The discovery that plant extracts are rich sources of agglutinins made possible the first purification of such a protein (named concanavalin A) by crystallization (“If jack bean extracts are covered with toluene and simply allowed to stand exposed to the air for several weeks, this protein is precipitated as beautifully formed crystals having a diameter of about 0.1 mm”)[21] and the demonstration of its interaction with carbohydrate groups (Table 2). As summarized by Sumner and Howell in 1936, “concanavalin A unites with some constituent of the stromata and, since concanavalin A unites with starch, glycogen, mucins, etc., it is possible that this may be a carbohydrate group in a protein.”[22] Since the activities of several plant and animal haemagglutinins towards reaction with erythrocytes of different AB0 blood group status resemble those of serum antibodies initially observed by Creite (1869) and Landois (1875) and referred to as isoagglutinins by Landsteiner in 1900,[23] the resulting classification of the haemagglutinins as antibody-like substances sounds logical.
The following quotation explains Boyd's reason for introducing the term lectin in 1954, a term that continues to be commonly used today: “It would appear to be a matter of semantics as to whether a substance not produced in response to an antigen should be called an antibody even though it is a protein and combines specifically with a certain antigen only. It might be better to have a different word for the substances and the present writer would like to propose the word lectin from Latin lectus, the past participle of legere meaning to pick, choose or select.”[24] By building on the pioneering observations made by Sumner and Howell, on the detection of haptenic inhibition of antibody–antigen reactions by Landsteiner and van der Scheer, and on the description of blood-group-specific lectins by Renkonen and Boyd (cited above),[22, 23d–f] a milestone of lectin application was established shortly before 1954 (see Table 2). This breakthrough was the inhibition by L-fucose of haemagglutination mediated by eel (Anguilla anguilla) serum and by seed extracts of the legume Lotus tetragonolobus. These key experiments led to the determination of “the biochemical basis of blood group AB0 and Lewis antigenic specificity”[25e] (for further listings of the course of lectin research history, see Table 2). To reach the present version of the term lectin, its definition had to be subjected to several refinements. The experimental focus on agglutination, which requires at least bivalency for the bridging of two cell surfaces, was dropped completely in the course of this process.
The three criteria that must currently be met by a (glyco)protein for it to qualify as a member of the lectin family are given below.[26]

a) Carbohydrate-binding activity
Assays monitoring binding to carrier-immobilized carbohydrate ligands of (neo)glycoconjugates are now commonly used to detect and quantify lectin activity, irrespective of the presence of bridging functionality.[4a, 27] The presence of a carbohydrate recognition domain (CRD) linked with other bioactive modules in a mosaic-like protein (see also Sections 4 and 5) makes it possible to assign bi- and multifunctional proteins to different protein families.

b) Distinction from immunoglobulins
In the original definition of a lectin given by Boyd in 1954,[24] the groups of immunoglobulins (Ig), such as IgG or IgM, are deliberately excluded. It should be noted that the animal lectins of the I-type class with a distal V-set module and C2-set domains belong to the Ig superfamily and that various lectins, such as galectins, as well as C- and I-type lectins, are produced by lymphocytes along with antibodies.[28]

c) Distinction from enzymes tailoring free saccharides/glycan chains of glycoconjugates, and from sensor or carrier proteins for free mono- or oligosaccharides
Any glycosyltransferase, glycosidase, or enzyme that modifies its cognate carbohydrate (e.g. the sulfotransferases or epimerases), as well as transport/chemotaxis receptors for free mono-, di-, or oligosaccharides, are excluded from the lectin family.

With this explanation of the generic name for (glyco)proteins that read sugar-encoded messages in mind, it is instructive to examine the diversity of these proteins in plants and animals. If lectins were rare inventions of nature, then communication with sugar code words would surely be restricted to only a few messages that can be decoded.

4. Plant Lectins: Occurrence, Functions, and Applications

The richest sources of plant lectins are the seeds or, more generally, the storage organs of plants. For most plants studied so far, lectins have been prepared from the seeds, but roots, tubers, bulbs, bark, or leaves have also served as starting materials for the isolation of lectins.[29] As emphasized above in the context of the inter- and intrafamily diversity of glycosyltransferases (see the last paragraph of Section 2), the wide distribution of lectins is a strong argument for their physiological relevance. Table 3 lists families of higher plants, as defined by the rules of botanical taxonomy, with the numbers of lectin-bearing species in each family. Algal, fern, and fungal lectins are not included. Since an activity assay based solely on haemagglutination without proper controls can yield false-positive results, we limited the compilation to those cases for which further unambiguous evidence for lectin presence is available. The overwhelming majority of lectins characterized up to now has been found in the Angiospermae section. Among these, about three-quarters of the lectin-bearing species belong to the Dicotyledoneae and almost 90 % of these to the Archichlamydeae subclass of the dicot class. Leguminosae played an important role in the early history of lectinology, as outlined above (see also Table 2), and still hold a prominent position in the field. However, to avoid misinterpretation of our systematic compilation, we must add that the search for lectins has not really been carried out strategically by following the rules of botanical systematics and searching species by species. It is thus likely that the literature-based numbers given in Table 3 will promptly increase when researchers begin doing so. Studies have so far often focused on economically relevant plants.
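The proportions quoted above can be checked against the species counts of Table 3. The following Python sketch is only an illustration: it assumes the per-family counts as read from Table 3, grouped by class and subclass, and all variable names are hypothetical.

```python
# Species counts per family as read from Table 3, grouped by taxon.
# The grouping into subclasses/classes and all names are illustrative assumptions.
archichlamydeae = [2, 2, 1, 1, 18, 1, 3, 7, 1, 1, 1, 2, 2, 1, 1, 4, 1, 1,
                   140, 1, 13, 3, 2, 1, 1, 1, 2, 22, 1, 1, 3]
metachlamydeae = [1, 5, 9, 5, 1, 1, 5, 2]
monocotyledoneae = [1, 6, 11, 1, 6, 20, 9, 17, 1, 2, 6]
gymnospermae = [1, 3]

# Dicotyledoneae comprise both subclasses; Angiospermae add the monocots.
dicots = sum(archichlamydeae) + sum(metachlamydeae)
angiosperms = dicots + sum(monocotyledoneae)
total = angiosperms + sum(gymnospermae)

dicot_share = dicots / angiosperms          # share of dicots among angiosperms
archi_share = sum(archichlamydeae) / dicots  # share of Archichlamydeae among dicots

print(f"lectin-bearing species in total: {total}")
print(f"Angiospermae: {angiosperms}")
print(f"Dicotyledoneae share of Angiospermae: {dicot_share:.1%}")
print(f"Archichlamydeae share of Dicotyledoneae: {archi_share:.1%}")
```

With these counts the sketch reports shares of roughly 77 % and 89 %, in line with the "about three-quarters" and "almost 90 %" stated in the text.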
Besides the advantage of easy access to the starting material, reports on compounds from plants of nutritional value are sure to find a wide readership. Consequently, the occurrence of lectins in plants outside the remit of modern agriculture is probably underestimated. Another factor that may have played a role in the count is that such plants often have tiny seeds.

Table 3. Systematic coverage of the occurrence of plant lectins. Families are grouped by order; the number of known lectin-bearing species is given in parentheses after each family.
Section Angiospermae, class Dicotyledoneae, subclass Archichlamydeae
Salicales: Salicaceae (2)
Fagales: Fagaceae (2)
Urticales: Cecropiaceae (1), Urticaceae (1), Moraceae (18)
Santalales: Loranthaceae (1), Viscaceae (3)
Centrospermae: Amaranthaceae (7), Caryophyllaceae (1), Chenopodiaceae (1), Phytolaccaceae (1)
Cactales: Cactaceae (2)
Magnoliales: Lauraceae (2)
Ranunculales: Ranunculaceae (1)
Guttiferales: Theaceae (1)
Papaverales: Cruciferae (4), Papaveraceae (1)
Rosales: Crassulaceae (1), Leguminosae (140), Saxifragaceae (1)
Geraniales: Euphorbiaceae (13)
Sapindales: Sapindaceae (3)
Celastrales: Celastraceae (2)
Rhamnales: Rhamnaceae (1), Vitaceae (1)
Thymelaeales: Eleagnaceae (1)
Violales: Passifloraceae (2)
Cucurbitales: Cucurbitaceae (22)
Myrtiflorae: Myrtaceae (1)
Umbelliflorae: Araliaceae (1), Umbelliferae (3)
Section Angiospermae, class Dicotyledoneae, subclass Metachlamydeae
Ebenales: Ebenaceae (1)
Tubiflorae: Solanaceae (5), Lamiaceae (9), Convolvulaceae (5), Pedaliaceae (1), Verbenaceae (1)
Dipsacales: Caprifoliaceae (5)
Campanulales: Compositae (2)
Section Angiospermae, class Monocotyledoneae
Helobiae: Alismataceae (1)
Liliiflorae: Alliaceae (6), Amaryllidaceae (11), Dioscoreaceae (1), Iridaceae (6), Liliaceae (20)
Graminales: Gramineae (9)
Spathiflorae: Araceae (17)
Cyperales: Cyperaceae (1)
Scitamineae: Musaceae (2)
Microspermae: Orchidaceae (6)
Section Gymnospermae, class Coniferopsida
Coniferae: Araucariaceae (1), Pinaceae (3)

The most popular method for tracing lectin presence is still to test plant extracts for their ability to agglutinate cells, usually human or other mammalian erythrocytes.
This classical method, already used more than a century ago by Mitchell[17] and Stillmark,[20] excels because of its simplicity. However, the technique suffers from several noteworthy disadvantages. Plant extracts can contain active material such as tannins that lead to the above-mentioned false-positive results. The presence of lipids can also lead to misinterpreted results, and erythrocytes tend to agglutinate spontaneously in the presence of only moderate concentrations of bivalent metal ions. Erythrocytes are sensitive to surface-active substances such as saponins, so lectins may easily be overlooked in their presence. Moreover, an agglutination assay only detects lectins that are at least bivalent and can therefore link cells, a factor noted above in criterion (a) of the lectin definition (see Section 3).

Therefore, screening methods have been developed that capitalize on the carbohydrate-binding capacity of lectins. Chemically tailored neoglycoconjugates present carrier-immobilized carbohydrate ligands for interaction, and complex formation is then picked up analytically by using any suitable label.[27, 30] To avoid radioactive labeling, the intrinsic activity of enzymes that are naturally glycosylated, such as horseradish peroxidase, or that can serve as acceptors for glycans through chemical conjugation, such as E. coli β-galactosidase, is used to detect any lectin-like activity presented on a matrix.[31] Recently developed methods employ microarray technology.
This approach can be combined with combinatorial synthesis, which illustrates the emerging importance of the interface between lectin research and carbohydrate chemistry.[32] Needless to say, arrays will prove instrumental in the definition of ligands with optimal affinity and selectivity, a factor of relevance for research aiming to extend the contents of Table 1 in the future. However, at present these methods are too sophisticated for general use. Expensive equipment and considerable expertise are required to master the chemical syntheses and analytical evaluation techniques. In consequence, even recent studies dealing with newly discovered lectins rely on cell agglutination as an analytical tool. The shortcomings of this approach are generally addressed by using controls to prove inhibition of the agglutination by haptenic sugar, as in the elucidation of the determinants of the AB0 histo-blood group epitopes fifty years ago.[25] Once lectin activity has been detected, the next step in the characterization pathway, regardless of the source of the material, is isolation of the lectin(s). This step can certainly be performed by standard protocols for protein purification, which include ion exchange, size exclusion, and hydrophobic chromatography. Investments of time and effort are reduced by taking advantage of the highly efficient method of affinity chromatography on immobilized carbohydrates or glycoconjugates. A very simple means of applying this technique is to use naturally occurring polysaccharides such as dextrans. When the polymer chains are cross-linked, these compounds are high-affinity adsorbents for glucose-binding lectins such as concanavalin A and pea or lentil agglutinins. Surprisingly, the enormous potential of this method was not initially realized.
As was recently pointed out in a commentary on the path of the lectins “from obscurity into the limelight” by Sharon,[33] the manuscript pioneering this approach did not find favorable review at first: “Irwin J. Goldstein from the University of Michigan at Ann Arbor, a leading lectin researcher to this very day, tells that when he sent a note, in 1963, to Biochemical and Biophysical Research Communications describing the purification of concanavalin A by affinity chromatography, it was rejected forthright because ‘this represents a modest advance in an obscure area.’ The note was eventually published in Biochemical Journal[34a] and affinity chromatography soon became the method of choice for lectin isolation” (see Table 2). Among the procedures used to conjugate a saccharide to the matrix, we found divinyl sulfone activation particularly easy in handling and efficient in terms of final lectin yields.[34b–d] To broaden the scope of one-step lectin purification, it is convenient to covalently couple not only saccharides but also naturally occurring glycoproteins to the resin. For this purpose, hog gastric mucin or hen ovomucoid, both easily available in large amounts, was successfully employed.[35] The prevailing method used to elute the lectin exploits the haptenic sugar as a competitive inhibitor. Problems arise when binding is directed to extended glycans, as is the case for Phaseolus bean lectins (or phytohaemagglutinins (PHAs), a formerly used generic name for plant lectins; see below). The presence of these lectins is the biochemical cause of the nausea that results from eating insufficiently cooked beans (see also Table 4). In such instances of binding to the extended glycans, lectin elution from the resin can be performed by lowering the pH value of the buffer. If the lectin is too sensitive to withstand an acidic medium, desorption with a borate-containing buffer offers a simple and affordable alternative.
The elution profiles that result from the use of these two protocols are illustrated in Figure 1. Successive elution with haptenic sugar and borate was helpful for purification of distinct lectins from the same source that differ in carbohydrate specificity. Figure 1 A shows that isolectin family I (specific for Gal/GalNAc) found in Griffonia simplicifolia seed extracts can be easily separated from the type II lectin (specific for (GlcNAc)n). Figure 1 B illustrates that this procedure even allows closely related isolectins such as the Phaseolus bean lectins to be resolved. When the concentration of the eluant borate was increased stepwise, it was possible to obtain the five isolectins L4, L3E, L2E2, LE3, and E4 in separate fractions.[36] The isolectin L4 (listed by commercial suppliers as PHA-L4 or phytohaemagglutinin L4) is a popular laboratory tool used as a mitogen for lymphocytes and the chromatographic method described gives remarkably easy access to pure material without contamination by the isoagglutinin E4 or the other three forms, as explained in detail in the figure legend. The members of the diverse group of plant lectins that are studied and used most frequently are listed in Table 4. The leading position is held by concanavalin A, the “classical” Man/Glc-binding lectin from Jack beans (see above and Table 2 for the central role of this lectin in the history of lectinology). The obtainable yield of concanavalin A from seed material is about 2 g per 100 g and it is chemically stable, key factors for its initial isolation by crystallization (see above). Once purified, the lectin can undergo numerous chemical modifications. All these properties are very favorable for chemical, biochemical, and biomedical applications (see Table 5 for a summary of research areas in which plant lectins are used as tools).
These facts explain why this lectin has attained its status as a reliable and popular workhorse, especially for carbohydrate chemists looking for a lectin to use in an attempt to prove the ligand properties of a sugar compound attached to a new synthetic scaffold. The other lectins listed in Table 4 are capable of following the role model concanavalin A, although they are less prominently used in research. These compounds form a panel of probes for isolation and structural characterization of glycoconjugates (glycoproteins, glycolipids, or polysaccharides), as well as use in various assays in cell biology, histochemistry, and the medical sciences (Table 5).[26g, 34d, 37] The size of the panel of lectins with related specificities (for a selection of frequently used plant lectins, see Table 4) ensures that the optimal tool for a defined purpose can always be found. For example, LCA can be used when the cells to be desorbed from a lectin-containing solid matrix must be handled under gentle conditions, in contrast to concanavalin A, with which harsher conditions are required since binding is comparatively tight.[38]

Table 4. Examples of plant lectins to illustrate the inter- and intrafamily diversity of these proteins.[a]
Plant species and name/abbreviation of lectin | Family | Mono- or disaccharide specificity | Comments
Canavalia ensiformis (concanavalin A, ConA) | Leguminosae | Man/Glc | cheapest and most popular lectin; first lectin isolated by crystallization and demonstrated to interact with carbohydrate (see text and Table 2 for details)
Ricinus communis (ricin) | Euphorbiaceae | Gal | ribosome-inactivating protein, type II (RIP II), used for generating immunotoxins; biohazard
Triticum vulgare (WGA) | Gramineae | (GlcNAc)1-3, Neu5Ac | potential function in plant defence mechanisms
Phaseolus vulgaris (PHA) | Leguminosae | no simple carbohydrate known | isolectin L4 is a strong mitogen for T-lymphocytes, isolectin E4 is a strong erythrocyte agglutinin (see Figure 1 B for chromatographic isolectin separation); distinguish between bisected and nonbisected N-glycans; cause of severe gastrointestinal irritation when ingested in insufficiently cooked beans
Glycine max (SBA) | Leguminosae | GalNAc/Gal | cell sorting, bone marrow purging
Pisum sativum (PSA) | Leguminosae | Man/Glc | binding of N-glycans enhanced by core fucosylation
Viscum album (VAA, viscumin) | Viscaceae | Gal | RIP II used for generating immunotoxins, constituent of proprietary mistletoe extracts (immunomodulatory and growth stimulatory for tumor cells in vitro and in vivo at low doses; see text for details)
Arachis hypogaea (PNA) | Leguminosae | Gal, Galβ3GalNAcα (TF-antigen) | very popular in histochemistry; separates immature from mature thymocytes
Lens culinaris (LCA) | Leguminosae | Man/Glc | binding of N-glycans enhanced by core fucosylation; lymphocyte mitogen
Dolichos biflorus (DBA) | Leguminosae | GalNAcα3GalNAc, GalNAc | cell sorting, agglutinates blood group A erythrocytes
Griffonia simplicifolia (GSA-I) | Leguminosae | Gal/GalNAc | isolectin GSA-I-A4 agglutinates blood group A erythrocytes, isolectin GSA-I-B4 blood group B erythrocytes
Griffonia simplicifolia (GSA-II) | Leguminosae | (GlcNAc)n | insecticidal activity, potential defence role
Artocarpus integrifolia (jacalin) | Moraceae | Gal (Man, TF-antigen) | used for isolation of IgA1 and mucins, mitogenic for CD4+ T-cells
Solanum tuberosum (STA) | Solanaceae | (GlcNAc)n | potential function in plant defence mechanisms
Galanthus nivalis (GNA) | Amaryllidaceae | Man | does not bind Glc as the Leguminosae lectins do, application for insect and nematode defence in transgenic crop plants tested, antiretroviral activity in vitro, selective agglutination of rabbit but not human erythrocytes
Ulex europaeus (isolectin UEA-I) | Leguminosae | L-Fuc | agglutinates blood group 0(H) erythrocytes; selective marker for endothelial cells of primates
Erythrina corallodendron (ECA) | Leguminosae | Galβ4GlcNAc, Gal, GalNAc | mitogen for human lymphocytes
Vicia faba (VFA) | Leguminosae | Man/Glc | binding of N-glycans enhanced by core fucosylation
Sambucus nigra (SNA) | Caprifoliaceae | Neu5Acα6Gal/GalNAc, (Gal/GalNAc) | probe for sialylated glycoconjugates, e.g. in thymocyte differentiation
Abrus precatorius | Leguminosae | Gal | RIP II used for generating immunotoxins
Lotus tetragonolobus (LTA) | Leguminosae | L-Fuc | agglutinates red cells of blood group 0(H), instrumental to the definition of α-L-fucose as a crucial 0(H) epitope (see Table 2)
Lycopersicon esculentum | Solanaceae | (GlcNAc)n | potential function in plant defence mechanisms; marker of endothelium of small vessels in rats
Phaseolus lunatus limensis | Leguminosae | GalNAcα3[Fucα2]Gal, GalNAc | agglutinates blood group A erythrocytes
Datura stramonium (DSA) | Solanaceae | (GlcNAc)n | potential function in plant defence mechanisms
Maackia amurensis (MAA) | Leguminosae | Neu5Acα3Gal/GalNAc | probe for sialylated glycoconjugates
Phytolacca americana (PWM) | Phytolaccaceae | GlcNAc | known as pokeweed mitogen; detected in 1969 in the course of investigating a fatality associated with ingestion of pokeweed berries
Bauhinia purpurea (BPA) | Leguminosae | GalNAcβ3GalNAc, GalNAc | enrichment of B lymphocytes, isolation of T cells producing IL-2
Urtica dioica (UDA) | Urticaceae | (GlcNAc)n | antifungal activity
Hevea brasiliensis (hevein) | Euphorbiaceae | (GlcNAc)n | antifungal activity; allergen in rubber products of poor quality
Maclura pomifera (MPA) | Moraceae | T-antigen > Tn-antigen | mitogen for lymphocytes
[a] The order of the list reflects the share of attention given to each lectin in the literature.

Table 5. Versatility of plant lectins as research tools.[a]
Biochemistry
detection of defined carbohydrate epitopes of glycoconjugates in blots or on thin-layer chromatography plates
purification of lectin-reactive glycoconjugates by affinity chromatography
glycan characterization by serial lectin affinity chromatography (lectin affinity capture)
glycome analysis (glycomics)
quantification of lectin-reactive glycoconjugates in enzyme-linked lectin-binding assays (ELLA)
quantification of activities of glycosyltransferases/glycosidases by lectin-based detection of products of enzymatic reaction
model reagents for the assessment of the ligand functionality of carbohydrate-presenting scaffolds (e.g. glycodendrimers)
Cell biology
characterization of intracellular assembly, routing, and cell surface presentation of glycoconjugates in normal and genetically engineered cells (glycomic profiling, spatially defined)
selection of cell variants (mutants, transfectants) with altered lectin-binding properties as models for dissecting glycosylation machinery and glycan functionality (glycomic profiling, functionally defined)
fractionation of cell populations
modulation of the proliferation and activation status of cells and dissection of the involved signal pathways
model substratum for study of cell aggregation, adhesion, and migration
Medicine
detection of disease-related alterations of glycan synthesis by lectin cyto- and histochemistry
histo-blood group typing and definition of secretor status
quantification of aberrations of cell surface glycan presentation, e.g. in malignancy
cell marker for diagnostic purposes including marking infectious agents (viruses, bacteria, fungi, parasites)
cell marker for functional assays to pinpoint defects in cell activities such as mediator release
[a] Extended and modified from ref. [26g].

Figure 1. Illustration of the chromatographic purification and separation of plant lectins from the same species and source (see Table 4 for further information on these lectins) by using the glycan chains of immobilized glycoproteins as affinity ligands. A) Successive elution with 25 mM D-galactose and 50 mM borate from a column bearing desialylated hog gastric mucin as the affinity ligand and loaded with plant extract as previously described[36] resulted in purification of Gal/GalNAc-specific Griffonia simplicifolia agglutinin I (GSA-I; subunit Mr = 30/32 kDa) and GSA-II ((GlcNAc)n-specific; subunit Mr = 28 kDa). B) Stepwise increases in the borate concentration in the elution buffer resulted in desorption of the five Phaseolus vulgaris isoagglutinins from immobilized ovomucoid. Elution started with PHA-L4 (subunit Mr = 31 kDa) at 15 mM borate and finally reached PHA-E4 at 250 mM borate. Elution was monitored by measuring the absorption at 280 nm (A280) and the agglutination activity, as described previously.[35] The latter assays revealed that potency increases from E1L3 to E2L2 to E3L1, and finally E4 (subunit Mr = 34 kDa), the strongest erythrocyte agglutinin. Lymphocyte stimulation increased from E4 (20-fold at 37 mg mL−1) to L4 (24-fold at 8 mg mL−1).

A frequently encountered application concerns the mitogenic activity of lectins (Table 5). The fact that plant lectins can affect lymphocyte activity and proliferation has led to suggestions that the laboratory tools could be introduced as immunomodulatory therapeutic agents in clinical applications. The example of the galactoside-specific mistletoe lectin (VAA, formerly ML-1), a constituent of proprietary extracts used in Austria, Germany, and Switzerland, shows that immune functions such as secretion of proinflammatory cytokines or priming of granulocytes/
activity of NK cells can indeed be stimulated at nontoxic doses of lectin (VAA concentration needed to elicit in vivo effects: 1–2 ng kg−1 body weight, given subcutaneously).[39] However, this immunomodulatory capacity is unlikely to have a clinical perspective because lectin-dependent increase in the proliferation (and also metastatic capacity) of tumor cells has likewise been described for cell lines, histocultures of human tumors, and animal models in vivo (primum non nocere).[26g, 40] Enhanced availability of proinflammatory cytokines might account for this effect. In more general terms, it is becoming evident that these immune factors can also trigger growth responses in malignant cells.[41] Our understanding of how immune/inflammatory cells influence tumor growth and neovascularization is thus undergoing a paradigmatic shift. This development is reflected in the statement that these cells “conspire with cancer cells in promoting” (rather than inhibiting) these processes,[41e] which has implications for the way we look at immunostimulation in cancer patients. As a consequence, immunomodulation by a lectin can exert a nonbeneficial influence on tumor parameters. Case studies, including an account of a study on melanoma patients in which treatment with a proprietary mistletoe extract appeared to decrease the lengths of overall survival and disease-free intervals of patients with lymph node metastases, underline concerns that herbal treatment modalities in alternative/complementary medicine may not be free of serious risk potential.[42] A recent review on the controversial issue of the clinical use of Viscum album extracts in cancer treatment concluded that “mistletoe therapy has the potential to harm cancer patients.”[42d] These data also caution against intuitive expectations that in vitro modulation of one or more immune parameters (plant lectins are very active elicitors of such a response) will automatically be clinically beneficial.

Table 6. Functions of plant lectins.[a]
Activity | Example of lectin
External activities
protection from fungal attack | Hevea brasiliensis (rubber tree), Urtica dioica (stinging nettle), Solanum tuberosum (potato)
protection from herbivorous animals | Phaseolus vulgaris (French bean), Ricinus communis (castor bean), Galanthus nivalis (snowdrop), Triticum vulgare (wheat)
involvement in establishing symbiosis between plants and bacteria | Pisum sativum (common pea), Lotononis bainesii (miles lotononis), Arachis hypogaea (peanut), Triticum vulgare (wheat), Oryza sativa (rice)
Internal activities
storage proteins | valid for all lectins
ordered deposition of storage proteins and enzymes in protein bodies and mediation of contact between storage proteins and protein body membranes | Pisum sativum (common pea), Lens culinaris (lentil), Glycine max (soybean), Oryza sativa (rice)
modulation of enzymatic activities such as phosphatase activity | Secale cereale (rye), Solanum tuberosum (potato), Pleurotus ostreatus (oyster mushroom), Glycine max (soybean), Dolichos biflorus (horse gram)
participation in growth regulation | Medicago sativa (alfalfa), Cicer arietinum (chick pea)
adjustment to altered environmental conditions | Triticum aestivum (winter wheat)
[a] For further information on carbohydrate specificities, see Table 4. For a recent review, see ref. [26g].

While knowledge on the distribution of lectins in plants has taken enormous strides as a result of documentation of their widespread occurrence, it is difficult to produce a succinct compendium of their functions in situ. In principle, each lectin might have distinct functions at the site of expression and through interplay with binding partners in the cell and the extracellular environment, an idea also valid for animal lectins. One particular protein can thus take care of several tasks. Powerful techniques used to regulate lectin presence on the level of gene expression in vitro and in vivo that were a boon for the elucidation of lectin functions in animals are starting to be exploited in plants,[43] so progress in refining and extending current knowledge of the functions of plant lectins will not be long in coming. Table 6 summarizes current concepts on this topic, together with examples of lectins with the activities concerned. Free oligosaccharides also convey biochemical messages and, although their binding partners do not fit the lectin definition given herein in every respect (see criterion (c) in Section 3), our survey would not be complete without paying tribute to this aspect of oligosaccharide behavior. Indeed, an emerging topic in the area of protein–carbohydrate interaction is the way in which oligosaccharide elicitors interact with their often ill-defined receptors.[44] These elicitors are products of the degradation of plant/fungal cell walls or lipochitooligosaccharides (Nod factors involved in the chemical cross-talk between nitrogen-fixing soil bacteria and their leguminous host plant). Of note is the observation that the rhizobial nodulation protein NodC, a glycosyltransferase responsible for GlcNAc incorporation within the synthetic pathway of the Nod factors, does not appear to be a unique invention of the evolutionary process because similar sequences have been found in Xenopus, zebrafish, and mouse proteins[44e] (the alternative route to chitooligosaccharides employs endochitinases). Members of this glycosylhydrolase family (no. 18), like many other enzymes involved in bacterial/fungal carbohydrate polymer degradation and bacterial sialidases, often contain a second domain besides their catalytic section. This domain has exclusive carbohydrate-binding activity that allows it to guide and firmly position the hydrolytic center.[45] This close cooperation of the two sites toward polysaccharide degradation (see criterion (c), Section 3) explains the reluctance of researchers to count these enzymes with a carbohydrate-binding module as lectins. Equivalent proteins that bring a catalytic and a carbohydrate-binding domain together are found in both plants (e.g. β-galactosidases and endo-β-1-4-glucanase in strawberry) and animals (see below).[26g] A recent example of clinical interest implicates mutations affecting a putative glycogen-binding domain (CBD-4) of laforin in disease onset, which is supposedly a result of mispositioning of the phosphatase activity. This domain is the product of the EPM2A (epilepsy of progressive myoclonus type 2) gene, which is defective in Lafora disease.[46] The detection of a chitinase-related receptor-like kinase (CHRK1) in tobacco and of receptor-like protein kinases with extracellular lectin-like domains in thale cress (Arabidopsis thaliana) and lombardy poplar (Populus nigra var. italica) suggests the existence of an outside/inside signaling route for the transfer of sugar-encoded messages into the plant cell.[16c, 47] Although it is tempting to draw analogies between plant and animal lectin functions, this approach should not be taken too far. The enzymatic apparatus of glycan synthesis is not identical in plants and animals, so the patterns of potential natural ligands for evolutionary adaptation diverge. For example, the structures of the core regions of complex-type N-glycans in plants differ from the structures in animals in that the plant glycans harbor two unique additions to the substitution pattern of the core region (the α1-3-linked fucose attached to the proximal GlcNAc residue and the β1-2-linked xylose in the core mannose residue). Mammalian cells, in contrast, have relatively abundant supplies of β1-4-galactosyltransferases, α1-2/6-fucosyltransferases, and sialyltransferases.[48]

With regard to the lectins, the take-home messages of this section are clear: a) plant lectins are widely found, and b) these lectins are endowed with various functional activities through their carbohydrate-binding activity. The number of ways in which plant lectins are successfully applied as tools (summarized in Table 5) intimates that far-reaching opportunities would be missed if the system of complementary molecular interaction exploited with these laboratory tools were not naturally operative in animals.
In the search for biochemical hardware for programmed “lock-and-key” interactions, lectins and glycans have thus been judged to be “reasonable candidates.”[49] To gauge the extent to which our knowledge of animal lectins has advanced over the last few decades (see also Table 2), it is informative to recall the scepticism with which this concept was confronted three decades ago. At that time, the view was held (as for antibodies and antigens) that lectins and oligosaccharides “are unlikely to provide a general mechanism of recognition and communication of the type postulated by Weiss[50] because one member of each pair is probably not a common cell component. The known lectins generally originate from plants or invertebrates…”[49] Isolation of the C-type hepatocyte asialoglycoprotein receptor in 1974 and the galectin (electrolectin) from the electric eel (Electrophorus electricus) in 1975, along with later work in 1980 leading to the biochemical verification of the presence of lectins in snake venom (originally discovered by Mitchell in 1860),[17, 51] as well as the ensuing work has markedly changed this view.[28c,e] The next section is a brief survey of the current status of knowledge of lectin occurrence and functions in animals.

5. Animal Lectins: Occurrence, Functions, and Applications

The complexity of glycosylation reactions in animals, especially mammals, and of the resulting glycans which form the cellular glycome is matched by that of proteins with a carbohydrate recognition domain that meet the criteria for classification as a lectin given at the end of Section 3.[4, 28c,e] The great strides taken in sequence and three-dimensional analysis of lectins have enabled researchers to pinpoint modules that accommodate glycan epitopes with great precision.[28c,e, 52] A minimum of five lectin families has been solidly defined, the C-, I-, and P-type lectins, pentraxins, and galectins.[28c,e] New additions to this list will very likely include: a) the two molecular chaperones calnexin and calreticulin, which have a folding pattern resembling that of leguminous lectins, b) a mannose-binding lectin from the pufferfish Fugu rubripes with sequence similarity to the agglutinins of monocotyledonous plants with the same binding specificity, c) tachylectin 5A/ficolin, with their fibrinogen-like binding sites, d) the “chitinase-like” Ym1 lectin with its TIM barrel, e) fucose-binding eel lectins, which have a β-barrel with jelly-roll topology, and f) glycosaminoglycan-binding receptors/adhesion molecules.[19f, 28c] This subclassification is evocative of that of glycosyltransferases and each lectin family encompasses more than one member. Table 7 gives an idea of the degree of intrafamily diversity. The table shows the current status of the family of mammalian galectins (Ca2+-independent animal lectins with specificity for β-galactosides and derivatives thereof, a jelly-roll-like folding pattern, and a set of invariant amino acids in the site of contact with the ligand that includes a central Trp residue; see Section 2 for the role of the indolyl side chain).
Scouring genome databases for respective hits is thus a worthwhile activity, and homology-based database mining is becoming a valuable tool for the detection of new family members.[53] A further striking example of intrafamily diversity is the proteins containing the C-type domain (115–130 amino acids with four invariant Cys residues and a characteristic consensus sequence). This domain is often found in mosaic-like proteins with functions involved in cell adhesion (e.g. the selectins in lymphocyte recirculation) or organization of the extracellular matrix (e.g. the hyalectans/lecticans), and in proteins involved in glycan endocytosis. The gene encoding the C-type domain is placed seventh in frequency amongst the 19 099 predicted genes of the nematode Caenorhabditis elegans and thus surpasses even the epidermal-growth-factor-like and Ig-superfamily domains in ranking.[54] With 165 or 183 open reading frames (according to separate calculations), this motif, typical for a member of the family of animal lectins, is well-represented in the genome of the model organism.[52c, 55] To date, over a hundred human proteins with C-type lectin-like domains have been described, which establishes this group of domains as a lectin family. These lectins are divided into six subgroups based on their individual modular and quaternary structures.[52c] These numbers reflect a complex evolutionary genealogy and intimate fine-tuning of ligand specificity for distinct functions. This type of lectin and also members of several other families take advantage of the elaborate enzymatic process line that specifically tailors the branch ends of glycan chains by preferentially targeting the spatially accessible tips of the sugar antennae.
That lectins, through binding to their distinct glycan determinants, are indeed able “to provide a general mechanism of recognition and communication”[49] (the widespread presence of lectins in animals has already convincingly dispelled the concerns quoted above) is proven by the accrued knowledge presented in Table 8. It is immediately clear from the entries in this table that these insights into lectin function offer enormous potential for applied research in chemical biology. Endocytic receptors of the C-type lectin family, with their fixed geometry of binding sites, are ideal as targets for synthetically tailored drug carriers. These receptors render uptake into cells such as hepatocytes feasible.[56] Antiviral drugs can thus be delivered to hepatocytes, for example by using triantennary N-glycans with GalNAc in the terminal position as a post code. Conversely, lectin-dependent clearance of glycosylated pharmaproteins is therapeutically disadvantageous as it reduces the bioavailability of the drug. It is reasonable in this case to modify the glycan structure to reduce or even avoid lectin binding. Integration of chemoenzymatic N-glycan synthesis and bioassays toward this aim has spawned progress in this field.[57] To be specific, biantennary complex-type N-glycans with α2-3(6)-sialylation and/or bisecting GlcNAc or core fucosylation in the bioactive part of the neoglycoproteins have been studied. α2-6-Sialylation of a biantennary complex-type N-glycan is a means of conferring the signal to its carrier for a rather long period of circulation.[57] Addition of a bisecting GlcNAc residue to the biantennary N-glycan considerably increases uptake of the neoglycoprotein into the liver and spleen, which is relevant for clinical imaging. Neither core fucosylation nor use of the glycan free of substitution can achieve the same effect.[57] As the cited reports describe in further detail, glycan modification by substitution can also bring about notable changes in the affinity of the molecule for soluble lectins, an effect emerging from the presence of distinct substitutions with biological/clinical relevance.[57]

Another route towards optimization of glycosylation for clinical use involves glycoengineering. In this approach, new N-glycosylation sequences (the sequon Asn-X-Ser/Thr, where X is any amino acid except Pro) are introduced into protein therapeutics such as recombinant human erythropoietin by site-directed mutagenesis. The duration of serum presence and the activity of the engineered glycoproteins in vivo have been increased in this way.[58] The combined use of these chemoenzymatic strategies, which render N- and O-glycans of choice available,[57, 59] and molecular biological engineering is bound to bring about the rational design of compounds with prolonged bioavailability or refined capacity for specific delivery.

Table 7. Members of the galectin family of mammalian lectins.[a]
Name | Occurrence | Structural features
galectin-1 (galaptin, L-14) | many cell types | homodimer; one CRD per subunit (14–15 kDa): proto type
galectin-2 | gastrointestinal tract; clone from human hepatoma | homodimer; one CRD per subunit (43 % sequence identity to galectin-1; 14 kDa): proto type
galectin-3 (CBP35, Mac-2 antigen, IgE-binding protein, L-29, L-34) | many cell types | monomer with one CRD (oligomer formation in solution and on surfaces); Pro-, Tyr-, and Gly-rich repeats in N-terminal section (27–36 kDa): chimera type
galectin-4 | colon, small intestine, stomach, oral epithelium, esophagus; lung, testis, breast, liver, and placenta by RT-PCR | monomer with two partially homologous but distinct CRDs connected by a link peptide (36 kDa); proteolysis generates truncated proto-type-like products: tandem-repeat type
galectin-5 | reticulocytes, erythrocytes (rat) | monomer with one CRD (17 kDa): proto type
galectin-6 | small intestine, colon | tandem-repeat arrangement of two CRDs (33 kDa)
galectin-7 | keratinocytes, stratified epithelia, carcinoma cells | homodimer; one CRD per subunit (15 kDa): proto type
galectin-8 | several tissues; frequently present in tumor cell lines (link peptide extension possible) | homologous to galectins-4 and -6 (tandem-repeat arrangement of two CRDs with unique link peptide; 34 kDa)
galectin-9 | small intestine, liver, lung, kidney, thymus (rat/mouse; small intestinal isoform with 31/32 amino acid extension of link peptide); lymphatic tissue and B cells, T cells and macrophages, pancreas, colon carcinoma cells (human) | homologous to galectins-4, -6, and -8 (tandem-repeat arrangement of two CRDs with unique link peptide; 36 kDa)
Charcot–Leyden crystal protein (galectin-10) | major autocrystallizing constituent of eosinophils and basophils | one CRD-like structure with specificity for D-Man (16.5 kDa)
galectin-11 (ovgal-11) | sheep gastrointestinal tract, induced upon nematode infection | one CRD, resembles proto-type galectins (14 kDa)
galectin-12 | several tissues (upregulation in cells synchronized at the G1 phase or G1/S boundary of the cell cycle), adipocytes | homologous to galectins-4, -6, -8, and -9 (tandem-repeat arrangement of two CRDs with unique link peptide; 35.3 kDa)
galectin-13 | identical to placental protein 13 (pp13); also expressed in the spleen, kidney, bladder, and in tumor cells | homodimer; one CRD per subunit (16.1 kDa); close similarity to galectin-7 and the Charcot–Leyden crystal protein
galectin-14 | ovine eosinophils, secreted into bronchoalveolar lavage fluid | one CRD resembling proto-type galectins (18.2 kDa)
[a] Taken from ref. [27c], extended, and modified. Please note that the presence of the galectins in humans has not been confirmed in all cases (e.g. rat galectin-5).

Fixed topological presentation of binding sites, as discussed above, is also a prerequisite for blocking access to bacterial/viral lectins, a new concept for interfering with the adhesion step of infections and the binding of AB5-toxins.[30g,k] The fivefold symmetry of the presentation of the binding sites in these toxins provides the potential for extremely tight binding by a suitably designed pentavalent ligand. This configuration is evocative of a starfish and such compounds are 10⁷-fold more potent in inhibition assays than their monomers.[30k] Since soluble lectins also display binding sites in distinct arrangements, the therapeutic concept may be extended beyond infections toward attenuating lymphocyte accumulation or metastatic spread. Glycodendrimers have indeed been shown to impair binding of galectins both in solid-phase assays with selectivity for the glycoprotein ligand and type of galectin, and in cell-binding studies.[60] The recently delineated involvement of galectins in tissue invasion during glioblastoma progression or within the metastatic cascade (e.g. in colon, breast, or prostate carcinoma)[16b, 61] is a potential area of interest for testing these ideas in applications. In addition to the effects of the spatial presentation of ligands on synthetic scaffolds such as wedge-like glycodendrimers,[60c,d] the fine specificity differences between these homologous endogenous lectins are being delineated to enhance probe selectivity, another challenge for chemical biology.[62] A theory is forming that the structure of the ligand and the spatial mode of its presentation modulate binding avidity in markedly different ways for individual lectins of a family. The detection of these differences lends credit to the assumption that intrafamily diversification is accompanied by quantitative alterations of the ligand profile, as mentioned at the start of this section. Systematic chemical mapping with ligand derivatives and screening of arrays/libraries to discover potent ligand mimetics are likely eventually to allow molecules to be devised that fit hand-in-glove into a particular galectin (or any other lectin of clinical interest).[13d, 16d, 32] As in the case of plant lectins, these reagents will be instrumental to the detection of lectin activities and to their cyto- and histochemical localization, which is relevant to histopathology.[27] This approach (i.e. tracking down carbohydrate-binding proteins by using synthetic probes) has been termed “reverse lectin histochemistry” to distinguish it from the routine lectin applications listed in Table 5.[63]

The deployment of mammalian lectins as laboratory tools has lagged behind application of agglutinins from plants. The reasons for this lack of application are definitely the limited availability of, and access to the reagents. The easy-to-follow protocols provided by recombinant technology to solve this problem have paved the way for endogenous proteins to become tools too. As research tools (see Table 5), the endogenous lectins offer the added advantage of being able to act as potential therapeutic agents that exploit natural substances and signal pathways, for example, to limit tumor proliferation or T-cell-dependent immune disorders.[28f, 41c, 64] Since lectins naturally select binding partners for an in situ function, it is a sure bet that assays involving the participation of endogenous lectins will increase in number. In terms of functional considerations, the development of assays with endogenous lectins (instead of plant surrogates) can be considered a quantum leap. Since the fine sugar specificities of plant and mammalian lectins often differ, results obtained with plant lectins suffer from the inevitable drawback that they cannot be reliably extrapolated to in situ functionality.

We have compiled the documented functions of animal lectins for review in Table 8. Evidently, carbohydrates serve as versatile ligands. It is thus logical to ask a fundamental question on the nature of oligosaccharides: “How can flexible molecules act as signals?”[65] This concern was put into words in a recent review in which the author states that, “on several occasions I have heard structural biologist colleagues state that the glycan units in a glycoprotein, for instance, cannot be important because they are too flexible to be seen in an X-ray crystal structure or by NMR. In other words, if they do not have a structure, how can they have a function? That this conclusion is gratuitous”[2] can be seen by turning to the next section.

Table 8. Functions of animal lectins.[a]
Activity | Example of lectin
ligand-selective molecular chaperones in endoplasmic reticulum | calnexin, calreticulin
intracellular routing of glycoproteins and vesicles | ERGIC-53 and VIP-36 (probably also ERGL and VIPL), P-type lectins, comitin
intracellular transport and extracellular assembly | nonintegrin 67-kDa elastin/laminin-binding protein
inducer of membrane superimposition and zippering (formation of Birbeck granules) | langerin (CD207)
cell-type-specific endocytosis | hepatic and macrophage asialoglycoprotein receptors, dendritic cell and macrophage C-type lectins (mannose receptor family members of the tandem-repeat type and single CRD lectins such as langerin/CD207), cysteine-rich domain of the dimeric form of the mannose receptor for GalNAc-4-SO4-bearing glycoprotein hormones in hepatic endothelial cells, P-type lectins
recognition of foreign glycans (β1,3-glucans, LPS) | CR3 (CD11b/CD18), dectin-1, Limulus coagulation factors C and G, earthworm CCF
recognition of foreign or aberrant glycosignatures on cells (including endocytosis or initiation of opsonization or complement activation) | collectins, L-ficolin, C-type macrophage and dendritic cell receptors, α/θ-defensins, pentraxins (CRP, limulin), tachylectins
targeting of enzymatic activity in multimodular proteins | acrosin, laforin, Limulus coagulation factor C
intra- and intermolecular modulation of enzyme activities in vitro | porcine pancreatic α-amylase, galectin-1/α2-6-sialyltransferase
bridging of molecules | homodimeric and tandem-repeat-type galectins, cytokines (e.g. IL-2:IL-2R and CD3 of T-cell receptors), cerebellar soluble lectin
induction or suppression of effector release (H2O2, cytokines, etc.) | galectins, selectins, and other C-type lectins such as CD23, BDCA-2, and dectin-1
cell growth control and induction of apoptosis/anoikis | galectins, C-type lectins, amphoterin-like protein, hyaluronic-acid-binding proteins, cerebellar soluble lectin
cell migration and routing | selectins and other C-type lectins, I-type lectins, galectins, hyaluronic-acid-binding proteins (RHAMM, CD44, hyalectans/lecticans)
cell–cell interactions | selectins and other C-type lectins (e.g. DC-SIGN), galectins, I-type lectins (e.g. siglecs, N-CAM, P0, or L1)
cell–matrix interactions | galectins, heparin- and hyaluronic-acid-binding lectins such as hyalectans/lecticans, calreticulin
matrix network assembly | proteoglycan core proteins (C-type CRD and G1 domain of hyalectans/lecticans), galectins (e.g. galectin-3/hensin), nonintegrin 67-kDa elastin/laminin-binding protein
[a] Taken from ref. [4c], extended, and modified.

6. The Third Dimension of the Sugar Code

It is in principle correct to point critically at the inherent flexibility of oligosaccharides. Rapid intramolecular movements can explain the frustrating futility of attempts to obtain crystals from viscous solutions produced by synthetic carbohydrate chemistry. In glycoproteins, the glycan antennae can even behave as nearly separate entities, a noteworthy factor that allows the proteomic complexity to be increased through distinct template-independent posttranslational modifications, without altering the genomic coding capacity.[15a] Close inspection of this flexibility by molecular modeling (molecular mechanics and dynamics simulations) and NMR spectroscopy[65, 66] has revealed that “certain glycans have highly favored conformations.”[2] Figures 2 and 3 focus on ligands for the galactoside-specific lectins (galectins and mistletoe lectin) introduced above and illustrate that the conformational space of such a free disaccharide is energetically structured like a topographical map is arranged with respect to altitude. The molecules populate low-energy (valley) positions in the molecular dynamics simulations, and this result is experimentally verified by the detection of time-averaged interresidual resonance transfer between water-insensitive C–H protons (Figures 2 and 3).[67] Such disaccharides thus have access to more than one position in the Φ, Ψ, E plot characterizing the distinct sets of energetically favored conformations (Figure 2, Figure 3). Since “the carbohydrate moves in solution through a bunch of shapes each of which may be selected by a receptor,” Hardy has likened such a carbohydrate ligand to a “bunch of keys,”[68] with explicit reference to the “lock-and-key” paradigm introduced by Fischer in 1894 (see Section 2).[5] Taken literally, each individual conformer (“key”) is endowed with the potential to interact with a certain complementary receptor site (“lock”). In other words, a lectin might perform conformer selection, which provides a starting point for hypothesis-driven work. Several agglutinins that share sequence specificity for a disaccharide might subject the ligand population to differential conformer selection. In this sense, recognition is primarily a shape problem (see the passage quoted above), a statement with substantial implications for the design of therapeutic glycomimetics.

Figure 2. Illustration of conformational aspects of the disaccharide Galβ1-3GalNAcα/β. This epitope (the α-anomer is the Thomsen–Friedenreich tumor antigen) is a ligand for galectins (for further information on this family of animal lectins, see Tables 7 and 8), as shown by the occurrence of two interresidual trNOE contact signals in the 2D trNOESY spectrum of a mixture (molar ratio 10:1) of the disaccharide with chicken liver galectin (CG-16), recorded at 500 MHz and 298 K with a mixing time of 100 ms (top). Introduction of this information as two pairs of contour lines into the conformational energy map (Φ, Ψ, E plot) derived from molecular mechanics calculations (ε = 4) limits the conformational space of the bound ligand. It is also limited in this way when the experimental information is introduced into the molecular dynamics profile (300 K, 1000 ps) derived from calculations that explicitly include water molecules and start from the Φ, Ψ coordinates at 0/180° outside the central low-energy valley. These calculations reveal a high population density within this central valley (middle), as described previously.[67b] Three individual low-energy conformations from the central area, marked 1, 2, and 3 in the energy map, were drawn by using these sets of Φ, Ψ angle combinations to visualize the structural impact of Φ, Ψ angle changes (bottom).

Figure 3. Illustration of the structural aspects of differential conformer selection of a digalactoside by a plant and an animal lectin. Relevant parts of 2D ROESY/trNOESY spectra (recorded at 500 MHz, 298 K with a mixing time of 100 ms) of the free disaccharide Gal′β1-2Gal (A) and of this ligand at a 10:1 molar ratio with (B) the galactoside-specific mistletoe lectin (Viscum album L. agglutinin, VAA; see Table 4 for further information) and (C) the chicken liver galectin (CG-16), respectively. The spectra show three interresidual cross-peaks for the free ligand and two such signals for the lectin–ligand complexes, as described previously.[66a, 67a] The interresidual H1′/H2 cross-peak is shared by the three spectra, whereas only one of the interresidual H1′/H1 and H1′/H3 cross-peaks is present in each of the trNOESY spectra of the ligand with the plant and animal lectins. Molecular mechanics (ε = 4) and molecular dynamics calculations (ε = 80, CVFF, 300 K, 1000 ps), combined with the NMR-spectroscopy-based contour line pairs (see refs. [66a, 67a] for details), revealed that only one of the two conformers present in solution (labeled as 1 and 2 in the Φ, Ψ, E plot) was bound by each of these two lectins (D). The plant agglutinin and the animal lectin select different conformers of the digalactoside. The structures of the conformers are shown in (E).
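The picture of a free disaccharide populating a few low-energy valleys of the Φ, Ψ, E plot can be made concrete with Boltzmann weights. The following is only a minimal sketch under stated assumptions: each valley is treated as a single state with one relative (free) energy, the function name `boltzmann_populations` is hypothetical, and the energy values are invented for illustration and are not taken from the cited calculations.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL = 1.987204e-3


def boltzmann_populations(rel_energies_kcal, temperature_k=300.0):
    """Return fractional populations for conformers with the given
    relative energies (kcal/mol) at the given temperature (K).

    Simplification: every conformer is one discrete state; the width
    of an energy valley (its entropy) is not modelled.
    """
    rt = R_KCAL * temperature_k
    weights = [math.exp(-energy / rt) for energy in rel_energies_kcal]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical relative energies for three conformers (illustrative only).
energies = {"conformer 1": 0.0, "conformer 2": 0.5, "conformer 3": 2.0}
populations = boltzmann_populations(list(energies.values()))
for name, fraction in zip(energies, populations):
    print(f"{name}: {fraction:.1%}")
```

In such a model, conformers separated by less than about 1 kcal/mol both remain appreciably populated at 300 K, which is consistent with the idea that a receptor can pick one "key" out of a small set that is already present in solution.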
As illustrated in Figures 2 and 3, experimental data on interresidual proton distances for the ligand in complex with the lectin in solution are obtained by transferred nuclear Overhauser effect (trNOE) spectroscopy, where the signal intensity serves as a molecular ruler.[66, 69] Whereas the definition of the bound conformation is not unambiguous for the example given in Figure 2 and requires further experimental input or a docking analysis (see below for further discussion and also Figure 4), the information presented in Figure 3 clearly demonstrates the principle of differential conformer selection by lectins.[66a, 67] In this instance, a single disaccharide (Galβ1-2Gal) forms two rapidly interconverting shapes. Each specifically interacts with only one of the two different lectins, that is, either with a galectin or a plant lectin. The same ligand can form a bioactive and a bioinert conformation when viewed from the perspective of the galectin tested. As Roseman commented, “it is this interplay between proteins and different conformers that likely allows a single carbohydrate structure (…) to be used in many different ways.”[2] In terms of methodology, it is the interplay of carbohydrate chemistry, molecular modeling, NMR spectroscopy, and biochemical preparation of the receptors that allows the validity of this concept to be convincingly proven.

Figure 4. Illustration of the substantial gain of information about the actual conformation of a carbohydrate ligand provided by access to NOE data for contacts involving water-exchangeable protons, and an experimental example to verify the validity of this concept (discussed in detail previously).[76a] The blurring in (A) demonstrates that even the presence of two interresidual contacts (here H1′ to H3 and H4 of Gal′α1-3Galβ1-R) does not allow accurate definition of the conformation of the disaccharide (see Figure 2 (middle) and Figure 3 D for the size of the area shared by two pairs of contour lines in the E plot). Although the results of molecular mechanics calculations intimate that the bound-state conformations are at low-energy sites in the Φ, Ψ, E plots, further experimental evidence to support this assumption is essential. This verification is symbolized by the substitution of the blurred image by a clear structure (B) after inclusion in calculations of a signal indicating a third contact. In fact, detection of the new water-sensitive contact by analysis of protein–ligand complexes in an aprotic solvent improves the precision of the conformational description by allowing a third pair of contour lines to be added to the E plot. This pair of lines delimits the area of overlap of the two pairs of lines drawn based on water-insensitive contacts (C). Remarkably, this area representing the ligand's bound-state conformation, which is accommodated by a natural immunoglobulin G fraction from human serum, lies within/close to the central low-energy valley.[76a]

The power of this integrated approach is again made evident in Figure 5, which shows the bound-state conformers of a glycomimetic. The tested C-glycoside offers the pharmacodynamic advantage of resistance to hydrolytic cleavage.
However, an increased degree of flexibility relative to that of the O-glycoside results from the introduction of a methylene bridge in place of the oxygen atom (for further information, see the legend of Figure 5).[70] By using exclusive interresidual contacts as fingerprint-like characteristics for a certain bound-state topology, differential conformer selection was established and the conformers selected by galectin-1 (syn-Φ, Ψ), the B chain of ricin (anti-Ψ), and an enzymatically inactive mutant of the bacterial β-galactosidase (anti-Φ) were tracked down.[71] One may wonder whether this result applies only to small ligand structures or also to naturally occurring extended saccharide chains.

A recent example is provided by a combined NMR spectroscopy and molecular modeling study that defined the bound-state topology of a cell-surface-exposed oligosaccharide chain, the pentasaccharide of ganglioside GM1. The obtained data add further strong support to the concept that a certain low-energy conformer is favored for binding. The carbohydrate chain of the ganglioside is the target for both cross-linking by galectin-1 to induce inhibition of the growth of human SK-N-MC neuroblastoma cells and for the AB5 toxin of Vibrio cholerae.[30k, 64a] This ability of one molecule to act as a ligand for two structurally unrelated receptors prompts questions about the topological aspects of these two recognition processes. As shown in Figure 6, in which the two bound-state conformations are compared, there is indeed a difference at the branch point of the carbohydrate chain.[72] The dihedral angles of the Neu5Acα2-3Gal linkage in the bound ligand are either Φ, Ψ = 70°/15° in the case of galectin-1 (in solution) or about 172°/26° for cholera toxin (in crystals). The conformations selected for binding represent two of the three lowest-energy conformations of the free ligand. Binding causes no distortion of the topology of the selected “key”.
This result makes it tempting to suggest that ligand derivatives with the same carbohydrate sequence but conformational restriction at the linkage of the internal branch point could no longer interact with both receptor proteins. After all, it would be clinically desirable to block the action of the AB5 toxin with an inhibitor while lowering the affinity of the inhibitor to the endogenous lectin to avoid undesired side reactions. This challenge at the interface of synthetic carbohydrate chemistry and chemical biology can be tackled rationally given precise topological information. Beyond selectivity, the binding of a deliberately preformed conformer might also help reduce the entropic penalty in the thermodynamic balance sheet of the overall association and accommodation process.[13c,e, 65] When we analyzed the binding of the pentasaccharide to galectin-1 by modeling, we were able to obtain information on the major contact sites and the resulting interaction energy terms, data that provide more input for the design of glycomimetics.[72]

An intriguing example of the intimate relationship between carbohydrate flexibility and molecular recognition is given by iduronic acid in heparin/heparan sulfates, as outlined in Section 2 (for the position of L-iduronic acid in the anticoagulant heparin pentasaccharide that binds to antithrombin III, see Scheme 4). When latched into the recognition site of the plasma protein antithrombin III, 2-O-sulfated L-iduronic acid is driven toward its skewed ²S₀ conformation.
In contrast, the local-kink-forming ¹C₄ conformation is preferred by fibroblast growth factors because it maximizes contact between the target determinant in the glycosaminoglycan and these homologous proteins.[10, 73] This amazing role as a versatile hinge that allows the crucial regions of the glycosaminoglycan to adopt the most favorable spatial topology makes it clear that the development of the epimerase reaction that produces L-iduronic acid was not a fortuitous event but a wise investment.

The given examples teach this lesson: the more we learn about the intricacies underlying the virtues of carbohydrates as ligands, the more refined the ideas on the drawing-board for devising ligands with optimal fit and specificity will become. Consideration of the shape of the molecule and its control will play a major role in this process. Rational synthesis and manipulation of the structural details of the molecule, such as the sulfation pattern, as well as screening of oligosaccharide libraries provide routes to augment the affinity of ligands for certain targets and to obtain substances with special biological properties, such as dissociating anticoagulant and antiangiogenic activities.[9, 74] The guidelines for the synthesis clearly rely on the precision of the information on the bound-state topology of the ligand.

Figure 5. Illustration of the structural aspects of differential conformer selection of C-lactoside by an animal lectin (galectin-1), a plant lectin (the B chain of ricin; see Table 4 for further information), and a catalytically inactive form of E. coli β-galactosidase (the asterisk denotes the E537Q mutant). The glycomimetic, which cannot be hydrolyzed, accesses 23 % of the conformational space in the Φ, Ψ, E plot, while 12 % is accessed by the O-lactose.[71b] The increased flexibility of C-lactoside compared to O-lactoside is accompanied by a shift of population density from the syn conformation (Φ, Ψ: 55°, 20°) to the anti-Ψ conformation (Φ, Ψ: 40°, 180°) to give a 32/54 % ratio of the conformers.[71b] The three conformations of C-lactoside at relative energy minima (syn, anti-Ψ, and anti-Φ) are characterized by the occurrence of distinct interresidual resonance transfer processes, each of which is possible for only one topological constellation and thus establishes an exclusive contact. Each arrow in the figure originates from the respective position in the Φ, Ψ, E plot and points to the relevant part of a spectrum, shown together with a molecular model in which the pair of protons establishing the exclusive contact is indicated: GalH1/GlcH4 (syn), GalH1/GlcH3 (anti-Ψ), and GalH2/GlcH4 (anti-Φ).[69b, 71d] Detection of cross-peaks arising from any of these exclusive contacts in the 2D trNOESY spectra of the three types of lactoside-binding proteins allows the bound-state conformation of the lactoside to be defined. The animal lectin, the plant agglutinin, and the enzymatically inactive bacterial β-galactosidase select different conformers of the ligand.

As shown in Figures 2, 3, and 5, only C–H protons have been exploited as reporter groups so far. Recruiting hydroxy protons to contribute to the fingerprint of 2D trNOESY/ROESY cross-peaks would improve the quality of our view of bound-state ligand topology, as graphically depicted in Figure 4 A and B. To preclude loss of the information from water-exchangeable hydroxy protons, one option was to build on pioneering work with glucosides. Sharp signals were detected for these study objects when they were dissolved in an aprotic solvent (dimethyl sulfoxide).[75] The concern that the activity of carbohydrate-binding proteins might be harmed substantially by the solvent change was addressed by performing systematic binding assays. These assays revealed that the activity of proteins with a well-structured folding pattern, for example, the jelly roll of galectins, the double β-trefoil of the mistletoe lectin, and the Ig fold of immunoglobulin G fractions, is not harmed by such solvent change.[76] These data square well with encouraging experiences with enzymes in organic solvents.[77] The results of such experiments also intimate that the folding pattern, at least around the binding site, is not markedly changed by the solvent. Indeed, the accuracy of this assumption has been ascertained experimentally. Formation of dimers of the homodimeric galectin-1, instead of any indication of unfolding, was observed by small angle neutron scattering.[78] The experimental approach of turning to aprotic solvents for trNOE spectroscopy thus affords the possibility of detecting signals from ligand protons other than those originating from resonance transfer between C–H protons. Figure 4 illustrates results from a proof-of-principle example. The results shown prompted consideration of how the range of applicability of this approach could be extended.

Figure 6. Illustration of the structural aspects of differential conformer selection of the complex carbohydrate chain in ganglioside GM1 by human galectin-1 (left) and cholera toxin (right). The structures are based on analysis of the solution structure of the galectin-bound pentasaccharide[72] and data from the Brookhaven Protein Bank (code no. 2CHB/3CHB) in the case of the cholera toxin. The illustrated difference in the Φ, Ψ angle combination for the glycosidic linkage at the branch point connecting the central galactose unit (Gal′) with the α2-3-linked sialic acid residue reflects the differential selection of two conformers from the three relative energy minima of the free state (Φ, Ψ: 70°, 15° for galectin-1; Φ, Ψ: 172°/26° for cholera toxin).[72]

A recent study demonstrated that addition of measured amounts of water to an aprotic solvent does not prevent measurement of sharp signals from the hydroxy protons of the ligand, even at temperatures well above 0 °C.[79] We thus suggest that the use of binary solvent/water mixtures has the potential to enter the panel of strategies for collecting very detailed information on bound-state topology. The aim of these techniques is to enable the complete structure of the complex, including all details of the receptor, to be revealed by analysis in solution. From this information, the way in which ligand binding affects the conformation of the receptor, including its sites for protein–protein interactions (measured for galectin-1 in solution by small angle neutron scattering),[78] could be discerned. This is a demanding task to accomplish, both for the biochemist, who has to supply (isotope-labeled) material in sufficient quantity and with sufficient solubility for analysis, and for the NMR expert, who is responsible for turning spectra into a structure. This problem has already been solved for a synthetic Thomsen–Friedenreich antigen-binding 15-mer peptide, hevein-domain-containing plant lectins or lectin domains such as the 43 amino acid hevein and GlcNAc oligomers (see Figure 7), the 11-kDa cyanovirin-N from the cyanobacterium (blue-green alga) Nostoc ellipsosporum and Manα1-2Manα, as well as the 198 amino acid adhesin domain of P-pili from uropathogenic E. coli (PapGII) and galabiose (Galα1–4Galβ).[80]

In answer to the question that has guided this section, namely how flexible compounds can act as ligands, it has become clear that the conformational space of carbohydrates is structured into several areas. These areas are distinguishable by their relative energy levels. Only a limited set of conformations ("bunch of keys")[68] is attributed to low-energy valleys, and the accommodation of such conformers is evidently not associated with an insurmountable entropic barrier. Although the molecular details of the overall thermodynamics of the generally enthalpically driven binding reaction are yet to be understood,[13c,e] the merging of synthetic excellence with in silico and in vitro techniques guarantees progress toward resolving this issue eventually.

7. Conclusions and Perspectives

The multifarious intermolecular recognition and regulation processes that underlie the efficient and smooth functioning of cell sociology have hitherto been assigned exclusively to nucleic acids and proteins in the central dogma of molecular biology. Despite a fashionable tendency to write off anything beyond genomics, the problem of how the limited panel of primary gene products is increased to serve all purposes properly and even to allow rapid and reversible regulation has engendered a surge in interest in mechanisms of posttranslational modification. Glycan chains have all the properties required for high-density information storage and are therefore qualified to make a mark in this respect. Their finely tuned synthesis even allows for dynamic modulations in response to external signals, and the ensuing interplay with endogenous lectins furnishes cells with an efficient communication system. This transition in the way we look at glycans, which means that the focus is no longer merely on the role of these molecules as biochemical fuel or protective cell wall constituents, has not passed unnoticed.
As a consequence, cellular glycoconjugates and lectins are receiving increasing attention and respect. The entry at the bottom of Table 2 concerning the years 2001/2 attests to this development. Stepwise refinements in instrumental capacity for structural analysis of carbohydrate oligo- and polymers mean that deciphering the sequence of a glycan is no longer a deterrent.[15a, 81] The same holds true for conformational analysis. The realization of the enormous talents of glycans occurred in a gradual process rather than by a quantum leap.[82] Fittingly, progress in lectinology also followed this pattern, as the historic survey in Table 2 recounts and the steady increase of publications dealing with lectins reflects.[16d] The instrumental role of leguminous and eel lectins in the definition of the structure of AB0 histo-blood group epitopes about 50 years ago (see Section 3) sets a precedent for, and shows the enormous potential of, merging these lines of research in the glycosciences branch of chemical biology. Aims for this research are the design of optimal ligands to block disease-causing lectin activities (e.g. in bacterial infection or tumor invasion) or of lectin-mimetic peptides to elicit clinically beneficial lectin activities (e.g. removal of activated T-cells in autoimmune diseases or destruction of tumor cells by mimicking the capacity of galectin-1 to induce apoptosis/anoikis). As summarized by Sharon recently, "breaking the glycocode and identifying the receptors are of prime importance not only for theoretical reasons, but also to facilitate the development of novel treatments for the many diseases in which carbohydrate recognition plays a key role."[83]

Acknowledgements

The excellent manuscript processing by R. Ohl and the constructive and exceptionally helpful comments of both reviewers are greatly appreciated, as is the support from the Verein zur Förderung des biologisch-technischen Fortschritts in der Medizin e.V.
A sincere apology is directed to colleagues whose original work could not be completely included, discussed, and cited because of space limitations and the scope of this review. With regret regarding this aspect of the paper, we set out to produce a primer on the concept of the sugar code as we see it, illustrated by selected proof-of-principle examples to convey a flavor of the field.

Figure 7. Relevant sections of the 2D NOESY spectra (recorded at 360 MHz, 300 K and with a mixing time of 200 ms) of the 43 amino acid plant lectin hevein (for further information on this lectin, see Table 4) in the absence (A) and in the presence (B) of N,N′-diacetylchitobiose. Characteristic alterations in the Ser19-dependent signals caused by the presence of a ligand are indicated by arrows. Involvement of the aromatic amino acids Trp21, Trp23, and Tyr30 in ligand binding is delineated by laser photo CIDNP difference spectra (aromatic section) of 1 mM hevein in the absence (C) and in the presence (D) of 1 mM N,N′-diacetylchitobiose at pD 4.[84] The spatial proximity of Ser19 and the three aromatic amino acid side chains to the ligand is depicted by the superposition of twenty snapshots (E) of the lectin–ligand complex taken in the course of a molecular dynamics simulation with explicit inclusion of water molecules, as presented in detail previously.[76a]

Keywords: adhesion · bioinformatics · drug design · glycosylation · lectins · molecular modeling · NMR spectroscopy

[1] D. A. Rees, Biochem. J. 1972, 126, 257–273.
[2] S. Roseman, J. Biol. Chem. 2001, 276, 41 527–41 542.
[3] P. J. Winterburn, C. F. Phelps, Nature 1972, 236, 147–151.
[4] a) H.-J.
Gabius, Naturwissenschaften 2000, 87, 108–121; b) J. Hirabayashi, K.-i. Kasai, Trends Glycosci. Glycotechnol. 2000, 12, 1–5; c) H.-J. Gabius, S. André, H. Kaltner, H.-C. Siebert, Biochim. Biophys. Acta 2002, 1572, 165–177.
[5] E. Fischer, Ber. Dt. Chem. Ges. 1894, 27, 2985–2993.
[6] R. A. Laine in Glycosciences: Status and Perspectives (Eds.: H.-J. Gabius, S. Gabius), Chapman and Hall, London, 1997, pp. 1–14.
[7] a) W. D. Comper, J. Theor. Biol. 1990, 145, 497–509; b) H. Kresse in Glycosciences: Status and Perspectives (Eds.: H.-J. Gabius, S. Gabius), Chapman and Hall, London, 1997, pp. 201–222; c) J. Turnbull, A. Powell, S. Guimond, Trends Cell Biol. 2001, 11, 75–82.
[8] a) S. Yamada, K. Sugahara, Trends Glycosci. Glycotechnol. 1998, 10, 95–123; b) R. Sasisekharan, G. Venkataraman, Curr. Opin. Chem. Biol. 2000, 4, 626–631.
[9] a) S. Alban in Carbohydrates in Drug Design (Eds.: Z. J. Witczak, K. A. Nieforth), M. Dekker, New York, 1997, pp. 209–276; b) R. J. Linhardt, T. Toida in Carbohydrates in Drug Design (Eds.: Z. J. Witczak, K. A. Nieforth), M. Dekker, New York, 1997, pp. 277–341; c) B. Casu, U. Lindahl, Adv. Carbohydr. Chem. Biochem. 2001, 57, 159–206; d) I. Capila, R. J. Linhardt, Angew. Chem. 2002, 114, 428–451; Angew. Chem. Int. Ed. 2002, 41, 390–412; e) A. S. Gallus, D. W. Coghlan, Curr. Opin. Hematol. 2002, 9, 422–429; f) M. Sundaram, Y. Qi, Z. Shriver, D. Liu, G. Zhao, G. Venkataraman, R. Langer, R. Sasisekharan, Proc. Natl. Acad. Sci. USA 2003, 100, 651–656.
[10] B. Casu, M. Petitou, M. Provasoli, P. Sinay, Trends Biochem. Sci. 1988, 13, 221–225.
[11] a) N. Perrimon, M. Bernfield, Nature 2000, 404, 725–728; b) S. B. Selleck, Trends Genet. 2000, 16, 206–212; c) M. Princivalle, A. De Agostini, Int. J. Dev. Biol. 2002, 46, 267–278; d) J. Turnbull, K. Drummond, Z. Huang, M. Ford-Perriss, M. Murphy, S. Guimond, Biochem. Soc. Transact. 2003, 31, 343–348.
[12] a) L. V. Hooper, S. M. Manzella, J. U.
Baenziger in Glycosciences: Status and Perspectives (Eds.: H.-J. Gabius, S. Gabius), Chapman and Hall, Weinheim–London, 1997, pp. 261–276; b) M. Fukuda, N. Hiraoka, T. O. Akama, M. N. Fukuda, J. Biol. Chem. 2001, 276, 47 747–47 750; c) J. R. Grunwell, C. R. Bertozzi, Biochemistry 2002, 41, 13 117–13 126; d) K. Honke, N. Taniguchi, Med. Res. Rev. 2002, 22, 637–654; e) J. U. Baenziger, Biochem. Soc. Transact. 2003, 31, 326–330; f) I. Brockhausen, Biochem. Soc. Transact. 2003, 31, 318–325.
[13] a) F. A. Quiocho, Pure Appl. Chem. 1989, 61, 1293–1306; b) R. U. Lemieux, Acc. Chem. Res. 1996, 29, 373–380; c) H.-J. Gabius, Pharmaceut. Res. 1998, 15, 23–30; d) D. Solís, J. Jiménez-Barbero, H. Kaltner, A. Romero, H.-C. Siebert, C.-W. von der Lieth, H.-J. Gabius, Cells Tissues Organs 2001, 168, 5–23; e) T. K. Dam, C. F. Brewer, Chem. Rev. 2002, 102, 387–429.
[14] a) H. Uedeira, H. Uedeira, J. Sol. Chem. 1985, 14, 27–34; b) J. Hirabayashi, Quart. Rev. Biol. 1996, 71, 365–380; c) A. M. Striegel, J. Am. Chem. Soc. 2003, 125, 4146–4148.
[15] a) G. Reuter, H.-J. Gabius, Cell. Mol. Life Sci. 1999, 55, 368–422; b) T. Hennet, Cell. Mol. Life Sci. 2002, 59, 1081–1095; c) R. G. Spiro, Glycobiology 2002, 12, 43R–56R; d) P. M. Coutinho, E. Deleury, G. J. Davies, B. Henrissat, J. Mol. Biol. 2003, 328, 307–317; e) D. J. Becker, J. B. Lowe, Glycobiology 2003, 13, 41R–53R; f) K. G. Ten Hagen, T. A. Fritz, L. A. Tabak, Glycobiology 2003, 13, 1R–16R.
[16] a) S. H. Barondes, Annu. Rev. Biochem. 1981, 50, 207–231; b) H. Kaltner, B. Stierstorfer, Acta Anat. 1998, 161, 162–179; c) A. Villalobo, H.-J. Gabius, Acta Anat. 1998, 161, 110–129; d) H. Rüdiger, H.-C. Siebert, D. Solís, J. Jiménez-Barbero, A. Romero, C.-W. von der Lieth, T. Díaz-Mauriño, H.-J. Gabius, Curr. Med. Chem. 2000, 7, 389–416; e) N. M. Dahms, M. K. Hancock, Biochim. Biophys. Acta 2002, 1572, 317–340; f) S.-i. Kawabata, R. Tsuda, Biochim. Biophys. Acta 2002, 1572, 414–421; g) J. Lu, C.
Teh, U. Kishore, K. B. M. Reid, Biochim. Biophys. Acta 2002, 1572, 387–400; h) P. H. Weigel, J. H. N. Yik, Biochim. Biophys. Acta 2002, 1572, 341–363.
[17] S. W. Mitchell, Smithsonian Contrib. Knowledge 1860, XII, 89–90.
[18] S. Flexner, H. Noguchi, J. Exp. Med. 1902, 6, 277–301.
[19] a) J. Bordet, F. P. Gay, Ann. Inst. Pasteur 1906, 20, 467–498; b) J. Bordet, O. Streng, Zbl. Bakteriol. Parasitenkd. Infektionskrankh. Hyg. Abt. I Orig. 1909, 49, 260–276; c) S. Hirani, J. D. Lambris, H. J. Müller-Eberhard, J. Immunol. 1985, 134, 1105–1109; d) H.-J. Gabius, Int. J. Biochem. 1994, 26, 469–477; e) G. R. Vasta, M. Quesenberry, H. Ahmed, N. O'Leary, Dev. Comp. Immunol. 1999, 23, 401–420; f) D. C. Kilpatrick, Biochim. Biophys. Acta 2002, 1572, 401–413.
[20] H. Stillmark, Über Ricin, ein giftiges Ferment aus den Samen von Ricinus comm. L. und einigen anderen Euphorbiaceen, Inaugural Dissertation, Schnakenburg's Buchdruckerei, Dorpat, 1888.
[21] J. B. Sumner, J. Biol. Chem. 1919, 37, 137–142.
[22] J. B. Sumner, S. F. Howell, J. Bacteriol. 1936, 32, 227–237.
[23] a) A. Creite, Z. Rat. Med. 1869, 36, 90–108; b) K. Landsteiner, Zbl. Bakteriol. Parasitenkd. Infektionskrankh. Hyg. Abtlg. I Orig. 1900, 27, 357–362; c) K. Landsteiner, Wiener Klin. Wschr. 1901, 42, 1020–1024; d) K. Landsteiner, J. van der Scheer, J. Exp. Med. 1931, 54, 295–305; e) K. O. Renkonen, Ann. Med. Exp. Biol. Fenn. 1948, 26, 66–72; f) W. C. Boyd, R. M. Reguera, J. Immunol. 1944, 62, 333–339; g) W. C. Boyd, Vox Sang. 1963, 8, 1–32; h) N. C. Hughes-Jones, B. Gardner, Br. J. Haematol. 2002, 119, 889–893.
[24] W. C. Boyd in The Proteins (Eds.: H. Neurath, K. Bailey), Academic Press, New York, 1954, Vol. 2, Part 2, pp. 756–844.
[25] a) W. M. Watkins, W. T. J. Morgan, Nature 1952, 169, 825–826; b) W. T. J. Morgan, W. M. Watkins, Br. J. Exp. Pathol. 1953, 34, 94–103; c) W. J. Judd, CRC Crit. Rev. Clin. Lab. Sci. 1980, 12, 172–214; d) G. W. G. Bird, Transfusion Med.
Rev. 1989, 3, 55–62; e) W. M. Watkins, Trends Glycosci. Glycotechnol. 1999, 11, 391–411; f) H. P. Schwarz, F. Dorner, Br. J. Haematol. 2003, 121, 556–565.
[26] a) I. J. Goldstein, R. C. Hughes, M. Monsigny, T. Osawa, N. Sharon, Nature 1980, 285, 66; b) J. Kocourek, V. Hořejší, Nature 1981, 290, 188; c) M. B. F. Dixon, Nature 1981, 292, 192; d) J. Kocourek, V. Hořejší in Lectins. Biology, Biochemistry, Clinical Biochemistry (Eds.: T. C. Bøg-Hansen, G. A. Spengler), W. de Gruyter, Berlin, 1983, Vol. 3, pp. 3–6; e) S. H. Barondes, Trends Biochem. Sci. 1988, 13, 480–482; f) H.-J. Gabius, Biochim. Biophys. Acta 1991, 1071, 1–18; g) H. Rüdiger, H.-J. Gabius, Glycoconjugate J. 2001, 18, 589–613.
[27] a) H.-J. Gabius, S. André, A. Danguy, K. Kayser, S. Gabius, Methods Enzymol. 1994, 242, 37–46; b) H.-J. Gabius, C. Unverzagt, K. Kayser, Biotech. Histochem. 1998, 73, 263–277; c) H.-J. Gabius, Anat. Histol. Embryol. 2001, 30, 3–31.
[28] a) H.-J. Gabius, Cancer Invest. 1987, 5, 39–46; b) L. D. Powell, A. Varki, J. Biol. Chem. 1995, 270, 14 243–14 246; c) H.-J. Gabius, Eur. J. Biochem. 1997, 243, 543–576; d) T. Angata, E. C. M. Brinkman-Van der Linden, Biochim. Biophys. Acta 2002, 1572, 294–316; e) D. C. Kilpatrick, Biochim. Biophys. Acta 2002, 1572, 187–197; f) G. A. Rabinovich, N. Rubinstein, M. A. Toscano, Biochim. Biophys. Acta 2002, 1572, 274–284.
[29] H. Rüdiger, Acta Anat. 1998, 161, 130–152.
[30] a) C. P. Stowell, Y. C. Lee, Adv. Carbohydr. Chem. Biochem. 1980, 37, 225–281; b) J. D. Aplin, J. C. Wriston, Jr., CRC Crit. Rev. Biochem. 1981, 10, 259–306; c) H.-J. Gabius, Angew. Chem. 1988, 100, 1321–1330; Angew. Chem. Int. Ed. Engl. 1988, 27, 1267–1276; d) Neoglycoconjugates. Preparation and Applications (Eds.: Y. C. Lee, R. T. Lee), Academic Press, San Diego, 1994; e) N. V. Bovin, H.-J. Gabius, Chem. Soc. Rev. 1995, 24, 413–421; f) R. Roy, Trends Glycosci. Glycotechnol. 1996, 8, 79–99; g) M. Mammen, S.-K. Choi, G. M.
Whitesides, Angew. Chem. 1998, 110, 2908–2953; Angew. Chem. Int. Ed. 1998, 37, 2754–2794; h) L. L. Kiessling, L. E. Strong, J. E. Gestwicki, Annu. Rep. Med. Chem. 2000, 35, 321–330; i) N. Yamazaki, S. Kojima, N. V. Bovin, S. André, S. Gabius, H.-J. Gabius, Adv. Drug Deliv. Rev. 2000, 43, 225–244; j) B. T. Houseman, M. Mrksich, Top. Curr. Chem. 2002, 218, 1–44; k) C. L. Schengrund, Biochem. Pharmacol. 2003, 65, 699–707.
[31] a) W. Straus, J. Histochem. Cytochem. 1983, 31, 78–84; b) H.-J. Gabius, R. Engelhardt, K. P. Hellmann, T. Hellmann, A. Ochsenfahrt, Anal. Biochem. 1987, 165, 349–355; c) S. Gabius, K. P. Hellmann, T. Hellmann, U. Brinck, H.-J. Gabius, Anal. Biochem. 1989, 182, 447–451.
[32] a) A. Barkley, P. Arya, Chem. Eur. J. 2001, 7, 555–563; b) S.-I. Nishimura, Curr. Opin. Chem. Biol. 2001, 5, 325–335; c) K. R. Love, P. H. Seeberger, Angew. Chem. 2002, 114, 3733–3736; Angew. Chem. Int. Ed. 2002, 41, 3583–3586; d) L. A. Marcaurelle, P. H. Seeberger, Curr. Opin. Chem. Biol. 2002, 6, 289–296; e) C. O. Mellet, J. M. G. Fernández, ChemBioChem 2002, 3, 819–822; f) O. Ramström, T. Bunyapaiboonsri, S. Lohmann, J.-M. Lehn, Biochim. Biophys. Acta 2002, 1572, 178–186.
[33] N. Sharon, Protein Sci. 1998, 7, 2042–2048.
[34] a) B. B. L. Agrawal, I. J. Goldstein, Biochem. J. 1965, 26, 23c; b) J. H. Pazur, Adv. Carbohydr. Chem. Biochem. 1981, 39, 405–447; c) H.-J. Gabius, Anal. Biochem. 1990, 189, 91–94; d) H.-J. Gabius in Protein Liquid Chromatography (Ed.: M. Kastner), Elsevier, Amsterdam, 2000, pp. 619–638.
[35] a) T. Freier, G. Fleischmann, H. Rüdiger, Biol. Chem. Hoppe-Seyler 1985, 366, 1023–1028; b) H. Rüdiger in Lectins and Glycobiology (Eds.: H.-J. Gabius, S. Gabius), Springer, Heidelberg, 1993, pp. 31–46.
[36] G. Fleischmann, I. Mauder, W.
Illert, H. Rüdiger, Biol. Chem. Hoppe-Seyler 1985, 366, 1029–1032.
[37] a) H. Lis, N. Sharon, Annu. Rev. Biochem. 1986, 55, 35–67; b) I. Damjanov, Lab. Invest. 1987, 57, 5–20; c) T. Osawa, T. Tsuji, Annu. Rev. Biochem. 1987, 56, 21–42; d) A. Danguy, F. Akif, B. Pajak, H.-J. Gabius, Histol. Histopathol. 1994, 9, 155–171; e) J. F. Kennedy, P. M. G. Palva, M. T. S. Corella, M. S. M. Cavalcanti, L. C. B. B. Coelho, Carbohydr. Polym. 1995, 26, 219–230; f) R. D. Cummings in Glycosciences: Status and Perspectives (Eds.: H.-J. Gabius, S. Gabius), Chapman and Hall, London, 1997, pp. 191–199; g) W. J. Peumans, E. J. M. van Damme, Crit. Rev. Biochem. Mol. Biol. 1998, 33, 209–259.
[38] a) V. Kinzel, D. Kübler, J. Richards, M. Stöhr in Concanavalin A as a Tool (Eds.: H. Bittiger, H. P. Schnebli), Wiley, London, 1976, pp. 467–478; b) V. Kinzel, D. Kübler, J. Richards, M. Stöhr, Science 1976, 192, 487–489.
[39] a) T. Hajto, K. Hostanska, H.-J. Gabius, Cancer Res. 1989, 49, 4803–4808; b) T. Hajto, K. Hostanska, K. Frei, C. Rordorf, H.-J. Gabius, Cancer Res. 1990, 50, 3322–3326.
[40] a) E. Kunze, H. Schulz, M. Adamek, H.-J. Gabius, J. Cancer Res. Clin. Oncol. 2000, 126, 125–138; b) H.-J. Gabius, F. Darro, M. Remmelink, S. André, J. Kopitz, A. Danguy, S. Gabius, I. Salmon, R. Kiss, Cancer Invest. 2001, 19, 114–126; c) A. V. Timoshenko, Y. Lan, H.-J. Gabius, P. K. Lala, Eur. J. Cancer 2001, 37, 1910–1920; d) S. Gabius, H.-J. Gabius, Dtsch. med. Wschr. 2002, 127, 457–459.
[41] a) S. Chouaib, C. Asselin-Paturel, F. Mami-Chouaib, A. Caignard, J. Y. Blay, Immunol. Today 1997, 18, 493–497; b) J. A. Sogn, Immunity 1998, 9, 757–763; c) H.-J. Gabius, Biochimie 2001, 83, 659–666; d) L. M. Coussens, Z. Werb, Nature 2002, 420, 860–867; e) J. L. Yu, J. W. Rak, Breast Cancer Res. 2003, 5, 83–88.
[42] a) H. Heimpel, Dtsch. med. Wschr. 1995, 16, 205; b) W. Hagenah, I. Dörges, E. Gafumbegete, T. Wagner, Dtsch. med. Wschr. 1998, 123, 1001–1004; c) A. M. M. Eggermont, U. R. Kleeberg, D.
J. Ruiter, S. Suciu in ASCO Educational Book (Ed.: M. C. Perry), American Society of Clinical Oncology, Alexandria, VA, USA, 2001, pp. 88–93; d) E. Ernst, K. Schmidt, M. K. Steuer-Vogt, Int. J. Cancer 2003, 107, 262–267.
[43] a) L. M. Brill, C. J. Evans, A. M. Hirsch, Plant J. 2001, 25, 453–461; b) R. Esteban, B. Dopico, F. J. Muñoz, S. Romo, E. Labrador, Physiol. Plant. 2002, 114, 619–626; c) W.-d. Yong, Y.-y. Xu, W.-z. Xu, X. Wang, N. Li, J.-s. Wu, T.-b. Liang, K. Chong, Z.-h. Xu, K.-h. Tan, Z.-q. Zhu, Planta 2003, 217, 261–270.
[44] a) J.-C. Promé, Curr. Opin. Struct. Biol. 1996, 6, 671–678; b) A. M. Hirsch, Curr. Opin. Plant Biol. 1999, 2, 320–326; c) P. Potin, K. Bouarab, F. Küpper, B. Kloareg, Curr. Opin. Microbiol. 1999, 2, 276–283; d) T. Yamaguchi, Y. Ito, N. Shibuya, Trends Glycosci. Glycotechnol. 2000, 12, 113–120; e) P. P. G. van der Holst, H. R. M. Schlaman, H. P. Spaink, Curr. Opin. Struct. Biol. 2001, 11, 608–616.
[45] a) P. Tomme, R. A. J. Warren, N. R. Gilkes, Adv. Microbiol. Physiol. 1995, 37, 1–81; b) C. Khosla, P. B. Harbury, Nature 2001, 409, 247–252; c) B. W. McLean, A. B. Boraston, D. Brouwer, N. Sanaie, C. A. Fyfe, R. A. J. Warren, D. G. Kilburn, C. A. Haynes, J. Biol. Chem. 2002, 277, 50 245–50 254; d) S. Thobhani, B. Ember, A. Siriwardena, G.-J. Boons, J. Am. Chem. Soc. 2003, 125, 7154–7155.
[46] a) J. Wang, J. A. Stuckey, M. J. Wishart, J. E. Dixon, J. Biol. Chem. 2002, 277, 2377–2380; b) S. Ganesh, N. Tsurutani, T. Suzuki, Y. Hishii, T. Ishihara, A. V. Delgado-Escueta, K. Yamakawa, Biochem. Biophys. Res. Commun. 2004, 313, 1101–1109.
[47] a) Y. S. Kim, J. H. Lee, G. M. Yoon, H. S. Cho, S.-W. Park, M. C. Suh, D. Choi, H. J. Ha, J. R. Liu, H.-S. Pai, Plant Physiol. 2000, 123, 905–915; b) A. Barre, C. Hervé, B. Lescure, P. Rougé, Crit. Rev. Plant Sci. 2002, 21, 379–399; c) M. Nishiguchi, K.
Yoshida, T. Sumizono, K. Tazaki, Mol. Genet. Genomics 2002, 267, 506–514.
[48] a) P. A. Gleeson, Curr. Top. Microbiol. Immunol. 1988, 139, 1–34; b) H. Ueda, H. Ogawa, Trends Glycosci. Glycotechnol. 1999, 11, 413–428; c) S. Chen, A. M. Spence, H. Schachter, Trends Glycosci. Glycotechnol. 2001, 13, 447–462; d) I. B. H. Wilson, Curr. Opin. Struct. Biol. 2002, 12, 569–577.
[49] S. Roth, Quart. Rev. Biol. 1973, 48, 541–563.
[50] P. Weiss, Yale J. Biol. Med. 1947, 19, 235–278.
[51] a) R. L. Hudgin, W. E. Pricer, Jr., G. Ashwell, R. J. Stockert, A. G. Morell, J. Biol. Chem. 1974, 249, 5536–5543; b) V. I. Teichberg, I. Silman, D. D. Beitsch, G. Resheff, Proc. Natl. Acad. Sci. USA 1975, 72, 1383–1387; c) T. K. Gartner, K. Stocker, D. C. Williams, FEBS Lett. 1980, 117, 13–16.
[52] a) H. Lis, N. Sharon, Chem. Rev. 1998, 98, 637–674; b) W. J. Peumans, A. Barre, Q. Hao, P. Rougé, E. J. M. van Damme, Trends Glycosci. Glycotechnol. 2000, 12, 83–101; c) R. B. Dodd, K. Drickamer, Glycobiology 2001, 11, 71R–79R; d) R. Loris, Biochim. Biophys. Acta 2002, 1572, 198–208.
[53] D. N. W. Cooper, Biochim. Biophys. Acta 2002, 1572, 209–231.
[54] The C. elegans Sequencing Consortium, Science 1998, 282, 2012–2018.
[55] R. O. Hynes, Q. Zhao, J. Cell Biol. 2000, 150, F89–F95.
[56] a) J. C. Rogers, S. Kornfeld, Biochem. Biophys. Res. Commun. 1971, 45, 622–629; b) Y. C. Lee, FASEB J. 1992, 6, 3193–3200; c) L. Fiume, C. Busi, G. Di Stefano, A. Mattioli, Adv. Drug Deliv. Rev. 1994, 14, 51–65; d) D. K. F. Meijer, G. Molema, Sem. Liver Dis. 1995, 15, 202–256; e) H.-J. Gabius, Cancer Investig. 1997, 15, 454–464; f) K. G. Rice in Glycosciences: Status and Perspectives (Eds.: H.-J. Gabius, S. Gabius), Chapman and Hall, London–Weinheim, 1997, pp. 471–483; g) B. G. Davis, M. A. Robinson, Curr. Opin. Drug Discov. Develop. 2002, 5, 279–288.
[57] a) S. André, C. Unverzagt, S. Kojima, X. Dong, C. Fink, K. Kayser, H.-J. Gabius, Bioconjugate Chem. 1997, 8, 845–855; b) C. Unverzagt, S. André, J. Seifert, S.
Kojima, C. Fink, G. Srikrishna, H. Freeze, K. Kayser, H.-J. Gabius, J. Med. Chem. 2002, 45, 478–491; c) S. André, C. Unverzagt, S. Kojima, M. Frank, J. Seifert, C. Fink, K. Kayser, C.-W. von der Lieth, H.-J. Gabius, Eur. J. Biochem. 2004, 271, 118–134.
[58] S. Elliott, T. Lorenzini, S. Asher, K. Aoki, D. Brankow, L. Buck, L. Busse, D. Chang, J. Fuller, J. Grant, N. Hernday, M. Hokum, S. Hu, A. Knudten, N. Levin, R. Komorowski, F. Martin, R. Navarro, T. Osslund, G. Rogers, N. Rogers, G. Trail, J. Egrie, Nat. Biotechnol. 2003, 21, 414–421.
[59] a) O. Seitz, ChemBioChem 2000, 1, 214–246; b) M. Mizuno, Trends Glycosci. Glycotechnol. 2001, 13, 11–30; c) M. J. Grogan, M. R. Pratt, L. A. Marcaurelle, C. R. Bertozzi, Annu. Rev. Biochem. 2002, 71, 593–634.
[60] a) S. André, P. J. Cejas Ortega, M. Alamino Perez, R. Roy, H.-J. Gabius, Glycobiology 1999, 9, 1253–1261; b) S. André, B. Frisch, H. Kaltner, D. L. Desouza, F. Schuber, H.-J. Gabius, Pharmaceut. Res. 2000, 17, 985–990; c) S. André, R. J. Pieters, I. Vrasidas, H. Kaltner, I. Kuwabara, F.-T. Liu, R. M. J. Liskamp, H.-J. Gabius, ChemBioChem 2001, 2, 822–830; d) I. Vrasidas, S. André, P. Valentini, C. Böck, M. Lensch, H. Kaltner, R. M. J. Liskamp, H.-J. Gabius, R. J. Pieters, Org. Biomol. Chem. 2003, 1, 803–810; e) S. André, B. Liu, H.-J. Gabius, R. Roy, Org. Biomol. Chem. 2003, 1, 3909–3916; f) S. André, H. Kaltner, T. Furuike, S.-I. Nishimura, H.-J. Gabius, Bioconjugate Chem. 2004, 15, 87–98.
[61] a) D. W. Ohannesian, R. Lotan in Glycosciences: Status and Perspectives (Eds.: H.-J. Gabius, S. Gabius), Chapman and Hall, London, 1997, pp. 459–469; b) S. André, S. Kojima, N. Yamazaki, C. Fink, H. Kaltner, K. Kayser, H.-J. Gabius, J. Cancer Res. Clin. Oncol. 1999, 125, 461–474; c) H. Lahm, S. André, A. Höflich, J. R. Fischer, B. Sordat, H. Kaltner, E. Wolf, H.-J. Gabius, J. Cancer Res. Clin. Oncol. 2001, 127, 375–386; d) A. Danguy, I. Camby, R. Kiss, Biochim. Biophys. Acta 2002, 1572, 285–293; e) I. Camby, N.
Belot, F. Lefranc, N. Sadeghi, Y. de Launoit, H. Kaltner, S. Musette, F. Darro, A. Danguy, I. Salmon, H.-J. Gabius, R. Kiss, J. Neuropathol. Exp. Neurol. 2002, 61, 585–596; f) P. Nangia-Makker, J. Conklin, V. Hogan, A. Raz, Trends Mol. Med. 2002, 8, 187–192; g) N. Nagy, H. Legendre, O. Engels, S. André, H. Kaltner, K. Wasano, Y. Zick, J.-C. Pector, C. Decaestecker, H.-J. Gabius, I. Salmon, R. Kiss, Cancer 2003, 97, 1849–1858.
[62] a) N. Ahmad, H.-J. Gabius, H. Kaltner, S. André, I. Kuwabara, F.-T. Liu, S. Oscarson, T. Norberg, C. F. Brewer, Can. J. Chem. 2002, 80, 1096–1104; b) J. Hirabayashi, T. Hashidate, Y. Arata, N. Nishi, T. Nakamura, M. Hirashima, T. Urashima, T. Oka, M. Futai, W. E. G. Müller, F. Yagi, K.-i. Kasai, Biochim. Biophys. Acta 2002, 1572, 232–354; c) A. M. Wu, J. H. Wu, M.-S. Tsai, J.-H. Liu, S. André, K. Wasano, H. Kaltner, H.-J. Gabius, Biochem. J. 2002, 367, 653–664.
[63] H.-J. Gabius, S. Gabius, T. V. Zemlyanukhina, N. V. Bovin, U. Brinck, A. Danguy, S. S. Joshi, K. Kayser, J. Schottelius, F. Sinowatz, L. F. Tietze, F. Vidal-Vanaclocha, J.-P. Zanetta, Histol. Histopathol. 1993, 8, 369–383.
[64] a) J. Kopitz, C. von Reitzenstein, S. André, H. Kaltner, J. Uhl, V. Ehemann, M. Cantz, H.-J. Gabius, J. Biol. Chem. 2001, 276, 35 917–35 923; b) G. Rappl, H. Abken, J. M. Muche, W. Sterry, W. Tilgen, S. André, H. Kaltner, S. Ugurel, H.-J. Gabius, U. Reinhold, Leukemia 2002, 16, 840–845; c) L. Santucci, S. Fiorucci, N. Rubinstein, A. Mencarelli, B. Palazzetti, B. Federici, G. A. Rabinovich, A. Morelli, Gastroenterology 2003, 124, 1381–1394; d) J. Kopitz, S. André, C. von Reitzenstein, K. Versluis, H. Kaltner, R. J. Pieters, K. Wasano, I. Kuwabara, F.-T. Liu, M. Cantz, A. J. R. Heck, H.-J. Gabius, Oncogene 2003, 22, 6277–6288.
[65] J. P. Carver, Pure Appl. Chem. 1993, 65, 763–770.
[66] a) C.-W. von der Lieth, H.-C. Siebert, T.
Kozár, M. Burchert, M. Frank, M. Gilleron, H. Kaltner, G. Kayser, E. Tajkhorshid, N. V. Bovin, J. F. G. Vliegenthart, H.-J. Gabius, Acta Anat. 1998, 161, 91–109; b) R. J. Woods, Glycoconjugate J. 1998, 15, 209–216; c) J. Jiménez-Barbero, J. L. Asensio, F. J. Cañada, A. Poveda, Curr. Opin. Struct. Biol. 1999, 9, 549–555; d) A. Imberty, S. Pérez, Chem. Rev. 2000, 100, 4567–4588; e) M. R. Wormald, A. J. Petrescu, Y.-L. Pao, A. Glithero, T. Elliott, R. A. Dwek, Chem. Rev. 2002, 102, 371–386; f) T. Weimar, R. J. Woods in NMR Spectroscopy of Glycoconjugates (Eds.: J. Jiménez-Barbero, T. Peters), Wiley-VCH, Weinheim, 2003, pp. 111–144.
[67] a) H.-C. Siebert, M. Gilleron, H. Kaltner, C.-W. von der Lieth, T. Kozár, N. V. Bovin, E. Y. Korchagina, J. F. G. Vliegenthart, H.-J. Gabius, Biochem. Biophys. Res. Commun. 1996, 219, 205–212; b) M. Gilleron, H.-C. Siebert, H. Kaltner, C.-W. von der Lieth, T. Kozár, K. M. Halkes, E. Y. Korchagina, N. V. Bovin, H.-J. Gabius, J. F. G. Vliegenthart, Eur. J. Biochem. 1998, 252, 416–427.
[68] B. J. Hardy, J. Mol. Struct. 1997, 395–396, 187–200.
[69] a) B. Meyer, T. Peters, Angew. Chem. 2003, 115, 890–918; Angew. Chem. Int. Ed. 2003, 42, 864–890; b) H.-C. Siebert, J. Jiménez-Barbero, S. André, H. Kaltner, H.-J. Gabius, Methods Enzymol. 2003, 362, 417–434.
[70] a) J. Jiménez-Barbero, J. F. Espinosa, J. L. Asensio, F. J. Cañada, A. Poveda, Adv. Carbohydr. Chem. Biochem. 2001, 56, 235–284; b) H. Yuasa, H. Hashimoto, Trends Glycosci. Glycotechnol. 2001, 13, 31–55.
[71] a) J. F. Espinosa, F. J. Cañada, J. L. Asensio, H. Dietrich, M. Martín-Lomas, R. R. Schmidt, J. Jiménez-Barbero, Angew. Chem. 1996, 108, 323–326; Angew. Chem. Int. Ed. 1996, 35, 303–306; b) J. F. Espinosa, F. J. Cañada, J. L. Asensio, M. Martin-Pastor, H. Dietrich, M. Martín-Lomas, R. R. Schmidt, J. Jiménez-Barbero, J. Am. Chem. Soc. 1996, 118, 10 862–10 871; c) J. F. Espinosa, E. Montero, A. Vián, J. L. García, H. Dietrich, R. R. Schmidt, M. Martín-Lomas, A. Imberty, F.
J. Cañada, J. Jiménez-Barbero, J. Am. Chem. Soc. 1998, 120, 1309–1318; d) J. L. Asensio, J. F. Espinosa, H. Dietrich, F. J. Cañada, R. R. Schmidt, M. Martín-Lomas, S. André, H.-J. Gabius, J. Jiménez-Barbero, J. Am. Chem. Soc. 1999, 121, 8995–9000; e) J. M. Alonso-Plaza, M. A. Canales, M. Jiménez, J. L. Roldán, A. García-Herrero, L. Iturrino, J. L. Asensio, F. J. Cañada, A. Romero, H.-C. Siebert, S. André, D. Solís, H.-J. Gabius, J. Jiménez-Barbero, Biochim. Biophys. Acta 2001, 1568, 225–236.
[72] H.-C. Siebert, S. André, S.-Y. Lü, M. Frank, H. Kaltner, J. A. van Kuik, E. Y. Korchagina, N. V. Bovin, E. Tajkhorshid, R. Kaptein, J. F. G. Vliegenthart, C.-W. von der Lieth, J. Jiménez-Barbero, J. Kopitz, H.-J. Gabius, Biochemistry 2003, 42, 14762–14773.
[73] a) S. K. Das, J.-M. Mallet, J. Esnault, P.-A. Driguez, P. Duchaussoy, P. Sizun, J.-P. Hérault, J.-M. Herbert, M. Petitou, P. Sinaÿ, Angew. Chem. 2001, 113, 1723–1726; Angew. Chem. Int. Ed. 2001, 40, 1670–1673; b) M. Hricovíni, M. Guerrini, A. Bisio, G. Torri, A. Naggi, B. Casu, Semin. Thromb. Hemost. 2002, 28, 325–334; c) R. Raman, G. Venkataraman, S. Ernst, V. Sasisekharan, R. Sasisekharan, Proc. Natl. Acad. Sci. USA 2003, 100, 2357–2362.
[74] a) B. Casu, A. Naggi, G. Torri, Semin. Thromb. Hemost. 2002, 28, 335–342; b) P. Jemth, J. Kreuger, M. Kusche-Gullberg, L. Sturiale, G. Giménez-Gallego, U. Lindahl, J. Biol. Chem. 2002, 277, 30567–30573; c) R. Ojeda, J. Angulo, P. M. Nieto, M. Martín-Lomas, Can. J. Chem. 2002, 80, 917–936.
[75] B. Casu, M. Reggiani, G. G. Gallo, A. Vigevani, Tetrahedron 1966, 22, 3061–3083.
[76] a) H.-C. Siebert, S. André, J. L. Asensio, F. J. Cañada, X. Dong, J.-F. Espinosa, M. Frank, M. Gilleron, H. Kaltner, T. Kozár, N. V. Bovin, C.-W. von der Lieth, J. F. G. Vliegenthart, J. Jiménez-Barbero, H.-J. Gabius, ChemBioChem 2000, 1, 181–195; b) H.-C. Siebert, M. Frank, C.-W. von der Lieth, J. Jiménez-Barbero, H.-J.
Gabius in NMR Spectroscopy of Glycoconjugates (Eds.: J. Jiménez-Barbero, T. Peters), Wiley-VCH, Weinheim, 2003, pp. 39–57.
[77] a) A. M. Klibanov, Nature 2001, 409, 241–246; b) C. Mattos, D. Ringe, Curr. Opin. Struct. Biol. 2001, 11, 761–764.
[78] L. He, S. André, H.-C. Siebert, H. Helmholz, B. Niemeyer, H.-J. Gabius, Biophys. J. 2003, 85, 511–524.
[79] H.-C. Siebert, S. André, J. F. G. Vliegenthart, H.-J. Gabius, M. J. Minch, J. Biomol. NMR 2003, 25, 197–215.
[80] a) J. L. Asensio, F. J. Cañada, M. Bruix, A. Rodríguez-Romero, J. Jiménez-Barbero, Eur. J. Biochem. 1995, 230, 621–633; b) J. L. Asensio, F. J. Cañada, M. Bruix, C. González, N. Khiar, A. Rodríguez-Romero, J. Jiménez-Barbero, Glycobiology 1998, 8, 569–577; c) J. L. Asensio, H.-C. Siebert, C.-W. von der Lieth, J. Laynez, M. Bruix, U. M. Soedjanaatmadja, J. J. Beintema, F. J. Cañada, H.-J. Gabius, J. Jiménez-Barbero, Proteins 2000, 40, 218–236; e) J. F. Espinosa, J. L. Asensio, J. L. García, J. Laynez, M. Bruix, C. Wright, H.-C. Siebert, H.-J. Gabius, F. J. Cañada, J. Jiménez-Barbero, Eur. J. Biochem. 2000, 267, 3965–3978; f) C. A. Bewley, Structure 2001, 9, 931–940; g) M.-s. Sung, K. Fleming, H. A. Cheng, S. Matthews, EMBO Rep. 2001, 2, 621–627; h) H.-C. Siebert, S.-Y. Lü, M. Frank, J. Kramer, R. Wechselberger, J. Joosten, S. André, K. Rittenhouse-Olson, R. Roy, C.-W. von der Lieth, R. Kaptein, J. F. G. Vliegenthart, A. J. R. Heck, H.-J. Gabius, Biochemistry 2002, 41, 9707–9717.
[81] a) E. F. Hounsell in Glycosciences: Status and Perspectives (Eds.: H.-J. Gabius, S. Gabius), Chapman and Hall, London, 1997, pp. 15–29; b) H. Geyer, R. Geyer, Acta Anat. 1998, 161, 18–35.
[82] J. Montreuil in Glycoproteins (Eds.: J. Montreuil, J. F. G. Vliegenthart, H. Schachter), Elsevier, Amsterdam, 1995, pp. 1–12.
[83] N. Sharon, Acta Anat. 1998, 161, 7–17.
[84] H.-C. Siebert, C.-W. von der Lieth, R. Kaptein, J. J. Beintema, K. Dijkstra, N. van Nuland, U. M. S. Soedjanaatmadja, A. Rice, J. F. G. Vliegenthart, C. S. Wright, H.-J.
Gabius, Proteins 1997, 28, 268–284.

Received: August 25, 2003