© 2004 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim
DOI: 10.1002/cbic.200300753
ChemBioChem 2004, 5, 740–764
Chemical Biology of the Sugar Code
Hans-Joachim Gabius,*[a] Hans-Christian Siebert,[a] Sabine André,[a]
Jesús Jiménez-Barbero,[b] and Harold Rüdiger[c]
In respectful and thankful memory of Professor Friedrich Cramer, who died three months before his 80th birthday
A high-density coding system is essential to allow cells to communicate efficiently and swiftly through complex surface interactions. All the structural requirements for forming a wide array of
signals with a system of minimal size are met by oligomers of
carbohydrates. These molecules surpass amino acids and nucleotides by far in information-storing capacity and serve as ligands
in biorecognition processes for the transfer of information. The
results of work aiming to reveal the intricate ways in which oligosaccharide determinants of cellular glycoconjugates interact with
tissue lectins and thereby trigger multifarious cellular responses
(e.g. in adhesion or growth regulation) are teaching amazing lessons about the range of finely tuned activities involved. The ability of enzymes to generate an enormous diversity of biochemical
signals is matched by receptor proteins (lectins), which are equally elaborate. The multiformity of lectins ensures accurate signal
decoding and transmission. The exquisite refinement of both
sides of the protein–carbohydrate recognition system turns the
structural complexity of glycans (a demanding but essentially
mastered problem for analytical chemistry) into a biochemical
virtue. The emerging medical importance of protein–carbohydrate recognition, for example in combating infection and the
spread of tumors or in targeting drugs, also explains why this interaction system is no longer below industrial radarscopes. Our
review sketches the concept of the sugar code, with a solid description of the historical background. We also place emphasis
on a distinctive feature of the code, that is, the potential of a carbohydrate ligand to adopt various defined shapes, each with its
own particular ligand properties (differential conformer selection).
Proper consideration of the structure and shape of the ligand enables us to envision the chemical design of potent binding partners for a target (in lectin-mediated drug delivery) or ways to
block lectins of medical importance (in infection, tumor spread,
or inflammation).
1. Introduction
Biological information storage and transfer are commonly described as being based solely on nucleic acids and proteins. In
contrast to nucleotides and amino acids, the most abundant
type of biomolecule in nature, the carbohydrate molecule, has
been almost completely sidelined in this respect. Sugar molecules have been nearly exclusively assigned as building blocks
of protective cell wall constituents (for example cellulose and
chitin) or as biochemical fuel in energy metabolism. This paradigm, which is reflected in textbooks, has been questioned
occasionally over the years. An exemplary quotation from 1972
points out that glycans do matter more than originally assumed: "The polysaccharides of mammalian connective tissue,
and glycoproteins, begin to make biochemical sense for the
first time ever. So many exciting developments have occurred
that this period seems to have moved us out of a dark age to
see polysaccharides in quite a new light. They have become interesting molecules to contemplate in relation to the life of a
cell. The ugly ducklings have begun to look a little more like
swans. In this sense, polysaccharides begin to appear attractive
molecules, shapely molecules."[1] With hindsight, the answer to
the question of why the exceptional talents of carbohydrates
have remained nearly unnoticed for so long appears to be
rather simple; in essence, this neglect occurred because "glycoconjugates are much more complex, variegated, and difficult
to study than proteins or nucleic acids."[2] Viewed from the perspective of bioinformatics, however, this structural property in
fact makes oligomers of saccharides "ideal for generating compact units with explicit informational properties."[3] This argument and other reasons listed below explain why it is justified
to portray individual monosaccharides as letters of an alphabet. These letters form biochemical code words. The coining of
terms such as sugar code or glycomics helps condense the concept into keywords. However, it goes without saying that the
reader can expect us to carve out a distinctive image of this
fundamental functionality of glycans.[4]
[a] Prof. Dr. H.-J. Gabius, Priv.-Doz. Dr. H.-C. Siebert, Dr. S. André
Institut für Physiologische Chemie, Tierärztliche Fakultät
Ludwig-Maximilians-Universität
Veterinärstraße 13, 80539 München (Germany)
Fax: (+49) 89-2180-2508
E-mail: gabius@tiph.vetmed.uni-muenchen.de
gabius@lectins.de
[b] Prof. Dr. J. Jiménez-Barbero
Centro de Investigaciones Biológicas
CSIC, Ramiro de Maeztu 9, 28040 Madrid (Spain)
[c] Prof. Dr. H. Rüdiger
Institut für Pharmazie und Lebensmittelchemie
Julius-Maximilians-Universität
Am Hubland, 97074 Würzburg (Germany)
Hans-Joachim Gabius was born in Bad Bevensen (Germany). He studied biochemistry in Hannover (as a fellow of the Studienstiftung des deutschen Volkes) and obtained an MSc. in 1980, then a PhD in 1982 for chemical and biochemical studies on the proofreading mechanisms of aminoacyl-tRNA synthetases, under the direction of F. Cramer, Max Planck Institute for Experimental Medicine, Göttingen. He spent most of 1981 investigating tRNA splicing in the laboratory of J. Abelson (Department of Chemistry) at UC San Diego. After starting work in tumor lectinology in 1983 at the Max Planck Institute in Göttingen, he went on to a post-doctoral research post in the group of S. H. Barondes at UC San Diego (1984–1985) and appointments as assistant professor for biochemistry at the Max Planck Institute for Experimental Medicine in Göttingen (1987), as associate professor for pharmaceutical chemistry at the University of Marburg (1991), and as head of the Institute for Physiological Chemistry, Faculty of Veterinary Medicine, University of Munich (1993). His research awards include the Otto-Hahn-Medal (1983), the Award of the Dr. Carl Duisberg Foundation (1988), and the Award of the Paul Martini Foundation (1990). His research interests comprise chemical, biophysical, and biochemical analysis of protein–carbohydrate interactions relevant to the biological and medical fields, such as the development of glycoscientific strategies for tumor diagnosis and therapy and the elucidation of the functions of mammalian lectins. He was prominently placed in the ranking of researchers by number of hot papers produced by the Institute of Scientific Information in 1998 (http://www.the-scientist.library.upenn.edu/yr1999/june/hotresearch_p1_990621.html; see also, www.lectins.de).
Jesús Jiménez-Barbero was born in 1960 in Madrid (Spain) and is Professor at the Center for Biological Research of the CSIC. He obtained his PhD in Organic Chemistry in 1987 for synthesis work and conformational studies on saccharides at the Institute of Organic Chemistry of the CSIC in Madrid under the supervision of M. Bernabé and M. Martín-Lomas. Following post-doctoral training in molecular mechanics and NMR methodology from 1986 to 1988 at CERMAV-CNRS in Grenoble, the University of Zürich, and the National Institute for Medical Research at Mill Hill, he received tenure in the CSIC in 1988. He was Visiting Scientist at the Department of Chemistry of Carnegie Mellon University at Pittsburgh between 1990 and 1992 and then started work on the application of NMR methodology to the study of interactions between carbohydrates and proteins and conformational and structural studies of oligo- and polysaccharides in Madrid. He was promoted to Senior Research Scientist in 1996 and to Full Professor in 2002, when he moved from the Institute of Organic Chemistry to the Center for Biological Research of the CSIC. He is mostly interested in obtaining a 3D view of the molecular recognition processes in which carbohydrates are involved, in particular by the application of NMR spectroscopy and modeling methods. He is a member of the editorial boards of several international journals and has published almost 200 scientific papers, reviews, and book chapters on the above-mentioned topics. He has also given more than sixty lectures at international conferences and research institutions.
Hans-Christian Siebert was born in 1960 in Kiel (Germany) and studied physics in Kiel (1980–1983) and in Heidelberg (1983–1987). He obtained a diploma (1987) for work on dynamic NMR spectroscopy and a PhD (1990) for conformational studies of gangliosides by NMR spectroscopy and computational methods in J. Dabrowski's group at the Department of Organic Chemistry, Max Planck Institute for Medical Research, Heidelberg. Following postdoctoral research in J. F. G. Vliegenthart's and R. Kaptein's groups at the Bijvoet Center for Biomolecular Research, Utrecht University, he joined H.-J. Gabius' research team at the Institute of Pharmaceutical Chemistry in Marburg in 1992 and moved with him to Munich, where he became Dr. med. vet. habil. and received the Venia legendi in Biochemistry in 1999. His research interests include NMR spectroscopy and molecular modeling structural studies of carbohydrate–protein interactions with biomedical relevance.
Harold Rüdiger was born in Stolp (Germany). He studied chemistry and biochemistry at the University of Würzburg, where he earned his PhD in 1963 for studies on the kinetics of peptide synthesis in yeast. He moved to the Biochemistry Institute of the University of Uppsala (Sweden), then headed by Nobel Prize winner A. Tiselius, to do postdoctoral research in J. Porath's group, where he studied modern biochemical analytical and separation techniques and worked out a purification protocol for a plant lectin. Upon returning to Germany in 1966, he joined L. Jaenicke's group at the Biochemistry Institute of the University of Cologne, where he studied the bacterial cobalamin-dependent methionine biosynthesis. In 1971, he became lecturer in biochemistry and in 1974 he moved to Würzburg University to take up a position as a professor of biochemistry. His research interests center on plant lectins and their interaction with various ligands.
Sabine André was born in 1966 in Bad Hersfeld (Germany) and studied biology in Göttingen (1986–1991). In 1991 she joined H.-J. Gabius' group at Philipps University in Marburg. She obtained a diploma (1992) and a PhD (1996) in biochemical and cell/molecular biological analysis of protein–carbohydrate interactions. She gained experience in epidermal growth factor receptor signaling research in A. Villalobo's group at the Higher Council for Scientific Research of Spain (CSIC) in Madrid in 1994. Her research gives special emphasis to the study of glycoclusters and glycomimetics for lectin-targeted drug design, of lectin functions with medical relevance by using cell biological models produced with up-to-date technology for rationally manipulating galectin/glycoconjugate expression, and developing new diagnostic tools for histopathology.
2. The Hardware of the Sugar Code
Carbohydrates have several exceptional features at their disposal. These features make a strong case for a prominent role
of D-glucose and its relatives in information handling. Foremost is the unsurpassed capacity of these molecules to
form isomers. In contrast to nucleotides or amino acids, saccharides contain several approximately chemically equivalent sites for chain elongation and, notably, even for branching (Scheme 1). As illustrated for β-linked diglucosides in Scheme 2, chemically distinct compounds are generated when the attachment point for the unit at the reducing end is moved stepwise from the 2’ to the 3’, 4’, or 6’-hydroxy group: Sophorose (Glcpβ1-2Glc) is a constituent of plant glycosides such as the sweetener stevioside from the Composita Stevia rebaudiana, which is popular in Japan, or the glycosides found in root extracts of Uzara (South African Xysmalobium and Pachycarpus species of the Asclepiadaceae family). Laminaribiose (Glcpβ1-3Glc) is a product of the partial hydrolysis of laminarin, an algal polysaccharide from Laminaria (seaweed; Chrysophyceae/Phaeophyceae). Laminarin also occurs in immunomodulatory fungal β1-3/1-6-linked polysaccharides such as schizophyllan (from Schizophyllum commune) or lentinan (from Lentinus edodes). Cellobiose (Glcpβ1-4Glc) is the basic structural unit of the most common carbon compound in nature, cellulose. Gentiobiose (Glcpβ1-6Glc) occurs as the bitter-tasting ingredient of extracts taken from the roots of Gentiana lutea. This diglucoside also forms the carbohydrate part of various plant glycosides, among them amygdalin, the glycoside found in bitter almonds. This compound was instrumental in experiments delineating the famous "lock-and-key" principle. In 1894, E. Fischer investigated the stereospecificity of the enzyme emulsin and reported that it shows "eine kräftige Wirkung auf Amygdalin" (a strong effect on amygdalin; p. 2990, ref. [5]). He continues (p. 2992): "Invertin und Emulsin haben bekanntlich manche Ähnlichkeit mit den Proteinstoffen und besitzen wie jene unzweifelhaft ein asymmetrisch gebautes Molekül. Ihre beschränkte Wirkung auf die Glucoside liesse sich also auch durch die Annahme erklären, dass nur bei ähnlichem geometrischem Bau diejenige Annäherung der Moleküle stattfinden kann, welche zur Auslösung des chemischen Vorganges erforderlich ist. Um ein Bild zu gebrauchen, will ich sagen, dass Enzym und Glucosid wie Schloss und Schlüssel zu einander passen müssen, um eine chemische Wirkung auf einander ausüben zu können." (It is known that invertin and emulsin have several features in common with proteinaceous compounds and undoubtedly also harbor an asymmetrically built molecule. Their limited effect on the glucosides might thus also be explained by the assumption that only molecules with a similar geometrical design can approach one another as required for a chemical process to occur. To use a metaphor, I wish to say that enzyme and glucoside must fit like lock and key to be able to exert a chemical effect on each other).[5] Moving on from this case study of a disaccharide, the structural features described above raise the expectation that carbohydrates will be second to no other biomolecule class in the diversity of the isomers they form, and this expectation is completely vindicated by diversity calculations. The theoretical limit of isomer diversity, that is, the total number of hexamers that can be formed with 20 different building blocks, differs tremendously between types of monomer: 6.4 × 10^7 hexapeptides are possible versus as many as 1.44 × 10^15 hexasaccharides.[6] These calculations include the noted variability of the attachment point for glycosides. Moreover, the level of diversity introduced by the occurrence of the two anomeric variants at each glycosidic linkage is taken into account. The two structures in Scheme 3 illustrate that the seemingly rather minor difference in only one structural parameter between the diglucosides cellobiose and maltose effectively translates into the widely disparate properties of the polymers cellulose and starch/glycogen. Thus, in order to characterize a glycosidic linkage precisely, not one (i.e. the sequence) but three independent parameters are necessary. These parameters (for the first and second dimensions of structural diversity; for the third dimension, see Section 6) are: a) the sequence of the individual monomers, b) the individual linkage points, and c) the anomeric configuration. Amazingly, the potential for structural diversity at the level of the sequence does not end at this point.
Scheme 1. Illustration of the exquisite chemical versatility of a monosaccharide as a module for chain initiation and elongation. Whereas nucleotides (left; deoxyadenosine monophosphate) or amino acids (center; serine) form linear oligo- and polymers by 5’,3’-phosphodiester-dependent or peptide-bond-dependent elongation (positions marked by arrows), monosaccharide (right; α/β-D-glucose) addition to a growing oligomer can proceed through the four hydroxy groups at C2, C3, C4, and C6 and the two anomeric hydroxy positions (see Schemes 2 and 3 for the structures of the resulting diglucosides).
Scheme 2. Illustration of the structural series of β-diglucosides derived by shifting the position of the β1-linked hydroxy group of the reducing-end glucose moiety from the 2’ to the 3’, 4’, or 6’-site (shown by arrows in Scheme 1). The biological relevance of this variability is underscored by the examples of the natural occurrence of each diglucoside given in the text.
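The arithmetic behind the hexapeptide figure can be checked in a few lines. The sketch below is only an illustration: the function name and the simplified saccharide model (linear chains only, four acceptor hydroxy groups as drawn for glucose in Scheme 1, two anomers per linkage) are assumptions made here. It does not reproduce the published value of 1.44 × 10^15, which is taken from ref. [6] and is not re-derived in this sketch.

```python
def linear_oligomer_count(n_letters: int, length: int, linkages_per_bond: int = 1) -> int:
    """Count linear chains of `length` units drawn from `n_letters` monomers.

    Each of the (length - 1) bonds between neighbouring units can be formed
    in `linkages_per_bond` chemically distinct ways.
    """
    return n_letters ** length * linkages_per_bond ** (length - 1)


# Peptides: one way to join two residues, so only the sequence matters.
hexapeptides = linear_oligomer_count(20, 6)

# Simplified saccharide model (assumption): alpha or beta anomer, and four
# acceptor hydroxy groups (C2, C3, C4, C6), with no branching considered.
ANOMERS = 2
ACCEPTOR_HYDROXYLS = 4
linear_hexasaccharides = linear_oligomer_count(20, 6, ANOMERS * ACCEPTOR_HYDROXYLS)

print(f"hexapeptides:           {hexapeptides:.2e}")
print(f"linear hexasaccharides: {linear_hexasaccharides:.2e}")
```

Even this restricted model exceeds the peptide count by more than four orders of magnitude, which conveys why linkage position and anomeric configuration count as independent coding parameters.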
In structural terms, a further level of diversity is accessible through the introduction of substituents. Glycosaminoglycan chains of proteoglycans found in the extracellular matrix, such as heparan sulfates, provide a telling example of how even a branchless backbone with repeating disaccharide units whose main function has been thought of as passive structural scaffolding is turned into a chain of biologically distinct microdomains through substitution.[7] The presentation of substituents facilitates versatile multicontact recognition relevant for the coordination of cell–matrix interactions. Site-specific introduction of sulfate substituents to hydroxy/amino groups and the epimerization of D-glucuronic acid to L-iduronic acid in the basic repeating unit (GlcN-HexA)n are the key to this heparanomic complexity.[8] From the repeating core unit of the initial enzymatic polymer formation, a total of 48 different disaccharides can theoretically be formed by the ensuing modifications. A particular and rare modification pattern results in the anticoagulant pentasaccharide determinant of heparin (Scheme 4), an example of a carbohydrate compound currently used in clinical applications and an object of chemical refinement toward an optimal design.[9] A synthetic pentasaccharide comprising the same features is now commercially available (Table 1). As alluded to above, the three-dimensional shape of the molecule comes into play too. L-Iduronic acid can undergo conformer interconversion (¹C₄ chair, ²S_O skew boat).[10] By adopting different conformations at these flexible hingelike sites, the topological display of the neighboring substituents can be easily modulated, and this substituent pattern has an impact on the contact sites for interaction with receptors, for example, antithrombin III and fibroblast growth factors (see Section 6 for further information). Needless to say, aberrations in proteoglycan synthesis and modification have been linked to developmental dysregulation in model organisms, for example, sqv (squashed vulva) genes in the nematode Caenorhabditis elegans or sfl (sulfateless)/pipe genes in the fruitfly Drosophila melanogaster.[11] It is of note that sulfation has also found frequent use as a tool to form "Umlaut-like" letters in the sugar alphabet in the N- and O-glycans of glycoproteins and in glycolipids. For example, GalNAc-4-sulfate (but not GalNAc) is important for routing pituitary glycoprotein hormones, as are GalNAc/Gal/GlcNAc-6-sulfates for lymphocyte homing.[12] So far, 31 carbohydrate sulfotransferases have been described along with their individual ligand spectrum, which underlines the sophisticated ramifications of this type of modification.[12]
Scheme 3. Illustration of the structural impact of anomer variation on the two otherwise structurally identical diglucosides cellobiose (building block of cellulose; see also Schemes 1 and 2) and maltose (building block of starch and glycogen).
Scheme 4. Illustration of the structure of the heparin-derived anticoagulant pentasaccharide that binds antithrombin III with high specificity. The introduction of the 3’-O-sulfate group (circled) into the central substituted (N- and O-sulfated) D-glucosamine residue by a 3-O-sulfotransferase is essential for pharmacologic activity. This rare natural carbohydrate is used as a model for the development of drugs for preventing and treating venous and arterial thromboembolism.
Table 1. Examples of sugar compounds used as pharmaceuticals.
Compound | Target | Disease
acarbose | α-glucosidases (amylases) | diabetes mellitus
heparin/heparinoids | antithrombin III | thrombosis
heparin pentasaccharide (Fondaparinux) | antithrombin III (factor Xa) | thrombosis
derivatives or mimetics of 2-deoxy-2,3-dehydro-N-acetylneuraminic acid | neuraminidase | viral infection
N-butyldeoxynojirimycin | α-glucosidases (N-glycan processing) | viral infection
derivatives or mimetics of milk oligosaccharides | adhesins and toxins (lectins) | bacterial infection
GlcN-(2-O-hexadecyl)phosphatidylinositol | GPI-mannosyltransferase I | protozoan infection (e.g. African sleeping sickness)
derivatives or mimetics of sialylated/sulfated Le^a/x epitopes | selectins | inflammatory reaction
D-Man | phosphomannose isomerase deficiency | congenital disorder of glycosylation Ib
L-Fuc | GDP-fucose transport | congenital disorder of glycosylation IIc (LAD II)
N-butyldeoxygalactonojirimycin and properly glycosylated β-gluco(galacto)cerebrosidase | glycosphingolipid synthesis and enzymatic degradation | glycosphingolipid storage disorders
Coming back to the basic chemical features that are favorable for a role in information transfer, the amphiphilic character of carbohydrates is a boon for intermolecular interactions. This property affords multiple donor/acceptor sites for directional hydrogen bonds.[13] Moreover, a set of suitably positioned polarized C–H bonds can be engaged in C–H/π-electron and stacking interactions in certain cases (e.g. to bring about intimate D-Gal–Trp contact).[13] This principle has been seen to work in organisms from various branches of the evolutionary tree. The enterotoxin of Escherichia coli, plant lectins such as ricin, the agglutinin from Erythrina corallodendron, and animal galectins exploit such a contact between the ligand and an aromatic side chain, preferably that of tryptophan. This duty to make thermodynamically favorable contact to a ligand in the binding site (see Section 6 and illustrations therein for a view of a receptor–ligand complex with such a contact) is one likely reason why tryptophan is indispensable in the panel of proteinogenic amino acids. In the course of establishing contact between the sugar ligand and the aromatic ring, water molecules are dispelled from the rather hydrophobic patches of the carbohydrate, a process that makes a sizable contribution to the thermodynamic driving force of the binding process.[13b] The selectivity of the molecular rendezvous, achieved through a combination of the above-mentioned factors, explains the exquisite way in which individual code letters, for example D-glucose and its 4-epimer D-galactose, are distinguished. Thus, a change of only one hydroxy group from the equatorial to the axial orientation keeps perturbation of the favored "tridymite" water structure minimal and is sufficient to establish distinct letters.[14]
In summary, the structural variability introduced by changes in linkage points, anomeric position, and placement of substituents endows carbohydrates with the features necessary for a high-density coding system. In fact, not only glycosaminoglycans but all N- and O-glycans and glycolipids are representatives of the chemical diversity realized by the enzymatic machinery of glycan production.[15] Good reasons for development of a new paradigm that views oligosaccharides as "multipurpose tools" are as follows: a) the strategic placement of glycan chains in the glycocalyx so that they reach out into the extracellular space like sensors or tentacles, b) the existence of more than 1000 known N-glycan structures (this list is continuously growing), c) the ways of marking proteins with a distinct sugar signal likened to a postal code (e.g. Man-6-phosphate and GalNAc-4-sulfate for routing of lysosomal enzymes and pituitary glycoprotein hormones, respectively) that have already been detected, and d) the overall complexity of the families of glycosyltransferases and glycan-modifying enzymes such as the sulfotransferases mentioned above.[4a, 15a] According to recent accounts, the number of glycosyltransferase-related sequences identified has grown to more than 7200, distributed over 65 distinct sequence-derived families; about 1 % of the open reading frames of each metazoan genome is calculated to be devoted to building up glycans.[15d] It now looks like a foregone conclusion that these determinants equip cells with attractive sensor points/areas for intermolecular contact. If matched on the level of receptor proteins, the well-elaborated processes of code word generation would make sense as a way to establish a versatile communication mode involved in biosignaling, cell-specific targeting, and host-defence pathways.[16] Families of proteins capable of "reading" the sugar-encoded messages, besides the enzymes that tailor glycan determinants, are the missing link needed to turn structure into biological response. This assumption about information transfer pathways has been shown to be correct by the discovery of lectins.
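The figure of 48 theoretically possible heparan sulfate disaccharides quoted earlier in this section can be made plausible by a simple enumeration. The text names only the types of modification (sulfation of hydroxy/amino groups and epimerization of the uronic acid); the specific positions and the three states of the amino group used below are assumptions of this sketch, as are all identifiers.

```python
from itertools import product

# Assumed modification sites of the (GlcN-HexA) repeating unit.
URONIC_ACID = ("GlcA", "IdoA")    # D-glucuronic acid or its epimer L-iduronic acid
URONIC_2_O = ("", "2S")           # uronic acid 2-O position: free or sulfated
GLCN_AMINE = ("NAc", "NS", "N")   # amino group: acetylated, sulfated, or free
GLCN_3_O = ("", "3S")             # glucosamine 3-O position: free or sulfated
GLCN_6_O = ("", "6S")             # glucosamine 6-O position: free or sulfated


def enumerate_disaccharides() -> list[str]:
    """Return a short label for every combination of the assumed modifications."""
    labels = []
    for amine, o3, o6, acid, o2 in product(
        GLCN_AMINE, GLCN_3_O, GLCN_6_O, URONIC_ACID, URONIC_2_O
    ):
        labels.append(f"Glc{amine}{o3}{o6}-{acid}{o2}")
    return labels


disaccharides = enumerate_disaccharides()
print(len(disaccharides))   # 3 * 2 * 2 * 2 * 2 combinations
print(disaccharides[:4])
```

Under these assumptions the product of the independent choices is 3 × 2 × 2 × 2 × 2 = 48, matching the number given in the text; whether every combination is actually produced enzymatically is a separate question that the enumeration does not address.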
Table 2. Brief historical account of lectinology.[a]
1860
1888
1891
1898
3. Lectins: Tools to Read Sugar-Encoded
Messages
The first documented observations of lectin activity were
made on clumping red blood cells (Table 2). In 1860, the
venom of the rattlesnake proved active in this respect:
™one drop of venom was put on a slide and a drop of
blood from a pigeon's wounded wing allowed to fall
upon it. They were instantly mixed. Within three minutes
the mass had coagulated firmly, and within ten it was of
arterial redness.∫[17] The concern that the term ™coagulation∫ used by Mitchell might reflect the action of procoagulants but not cell agglutination was satisfactorily addressed by deliberately repeating the experiments with
washed erythrocytes.[18] The paper by Flexner and Noguchi in which this work is reported was in fact introduced
by Mitchell who commented on it as follows: ™I have
long desired that the actions of venoms upon blood
should be further examined. I finally indicated in a series
of propositions the direction I wished the inquiry to take.
Starting from these the following very satisfactory study
has been made by Professor Flexner and Dr. Noguchi. My
own share in it, although so limited, I mention with satisfaction.∫[18] Only a few years later, in 1906, an agglutinating activity of activated-complement-coated erythrocytes
was detected in bovine serum. To allow the reader to
follow the major historical events in this field, we have
listed this finding in Table 2. As with snake venom, the
active protein of bovine serum was later biochemically
characterized as a C-type lectin (see also Section 4), in
this case from the subgroup of collectins named conglutinin, which binds to the Man8/Man9 N-glycan of human
iC3b at Asn917 in the a chain of complement glycoprotein C3.[19]
The assay used to look at haemagglutination was also
instrumental to the discovery of the cell-bridging capacity of proteins in plant extracts, initially that of toxic
castor bean extract.[20] Stillmark remarked in his M.D.
thesis, published in 1888: ™Das Ricin bewirkt in defibriniertem serumhaltigem Blute eine Zusammenballung der
rothen Blutkˆrperchen unter Bildung einer fibrin‰hnlichen Substanz.∫ (Ricin causes a conglomeration (or agglutination) of the red blood corpuscles in defibrinated
serum-containing blood that yields a fibrin-like substance).[20] The discovery that plant extracts are rich sources of agglutinins made possible the first purification of
such a protein (named concanavalin A) by crystallization
(™If jack bean extracts are covered with toluene and
simply allowed to stand exposed to the air for several
weeks, this protein is precipitated as beautifully formed
crystals having a diameter of about 0.1 mm∫)[21] and the
746
1902
1902
1906
1907
1913
1919
1936
1941
1947±1948
1952
1954
1960
1965
1972
1972±1977
1974
1978
1979
1983
1984
1985
1987
1989
1992±1993
1995
1996±1998
2001±2002
Observation of blood ™coagulation∫ by rattlesnake venom (S. W.
Mitchell)
Detection of erythrocyte agglutination by protein fractions from
castor beans and other plant seeds (H. Stillmark)
Toxic plant agglutinins applied as model antigens (P. Ehrlich)
Introduction of the term ™haemagglutinin∫ or phytohaemagglutinin for plant proteins that agglutinate red blood cells (M.
Elfstrand)
Detection of bacterial agglutinins (R. Kraus)
Demonstration that blood ™coagulation∫ by snake venom (later
shown to depend on a C-type lectin) observed in 1860 was not
caused by blood clotting but by cell agglutination (S. Flexner, H.
Noguchi)
Detection of an agglutinin in bovine serum (later characterized as
the C-type lectin conglutinin) that acts on activated complementcoated erythrocytes (J. Bordet, F. P. Gay)
Detection of nontoxic agglutinins in plants (K. Landsteiner, H.
Raubitschek)
Use of intact cells for the purification of lectins (R. Kobert)
Crystallization of a lectin, concanavalin A (J. B. Sumner)
Precipitation of starch, glycogen, and mucins by concanavalin A
and its interaction with the stromata of erythrocytes define the
carbohydrate as a ligand (J. B. Sumner, S. F. Howell)
Detection of viral agglutinins (G. K. Hirst)
Detection of lectins specific for human blood groups (W. C. Boyd,
K. O. Renkonen)
Carbohydrate nature of blood group determinants proven by
lectin-mediated agglutination and its sugar-dependent inhibition
(W. M. Watkins, W. T. J. Morgan)
Introduction of the term ™lectin∫ for plant agglutinins, primarily
for those that are blood-group specific (W. C. Boyd)
Detection of the mitogenic potency of lectins toward lymphocytes (P. C. Nowell)
Application of affinity chromatography for the isolation of lectins
(I. J. Goldstein, B. B. L. Agrawal)
Determination of the amino acid sequence and three-dimensional
structure of a lectin, concanavalin A (G. M. Edelman, K. O. Hardman, C. F. Ainsworth et al.)
Discovery of impaired synthesis of a marker for glycoprotein (lysosomal enzymes) routing as the cause of a human disease (mucolipidosis II) and identification of the marker as Man-6-phosphate,
the ligand for P-type lectins (E. F. Neufeld et al.; W. S. Sly et al.)
Isolation of a mammalian Gal/GalNAc-specific lectin from the liver
(G. Ashwell)
First conference focusing on lectins and glycoconjugates, termed
Interlec (T. C. B˘g-Hansen)
Detection of endogenous ligands for plant lectins (H. R¸diger)
Detection of the insecticidal action of a plant lectin (L. L.
Murdock)
Isolation of lectins from tumors (H.-J. Gabius; R. Lotan, A. Raz)
Discovery of immobilized glycoproteins as pan-affinity adsorbents
for lectins (H. R¸diger)
Introduction of neoglycoconjugates for localization of tissue
lectins for tumor diagnosis (H.-J. Gabius et al.)
Detection of the fungicidal action of a plant lectin (W. J. Peumans)
Identification of impaired synthesis of lectin (selectin) ligands by
defective fucosylation as the cause for leukocyte adhesion deficiency type II (A. Etzioni et al.)
Structural analysis of a lectin–ligand complex in solution by NMR
spectroscopy (J. Jiménez-Barbero et al.)
Detection of differential conformer selection by plant and animal
lectins (H.-J. Gabius et al.; L. Poppe et al.)
Advances in lectinology and glycosciences honored by dedication
of special issues of Biochim. Biophys. Acta, Biochimie, Biol. Chem.,
Cells Tissues Organs, Chem. Rev., Curr. Opin. Struct. Biol., J. Agric.
Food Chem. (Liener symposium), and Science to these topics
[a] Extended and modified from ref. [16d].
demonstration of its interaction with carbohydrate groups
(Table 2). As summarized by Sumner and Howell in 1936, “concanavalin A unites with some constituent of the stromata and,
since concanavalin A unites with starch, glycogen, mucins, etc.,
it is possible that this may be a carbohydrate group in a
protein.”[22] The reactivity of several plant and animal
haemagglutinins towards erythrocytes of different AB0 blood group status resembles that of the serum antibodies initially observed by Creite (1869) and Landois (1875) and
referred to as isoagglutinins by Landsteiner in 1900,[23] so the
resulting classification of the haemagglutinins as antibody-like
substances sounds logical. The following quotation explains
Boyd's reason for introducing the term lectin in 1954, a term
that continues to be commonly used today: “It would appear
to be a matter of semantics as to whether a substance not
produced in response to an antigen should be called an antibody even though it is a protein and combines specifically
with a certain antigen only. It might be better to have a different word for the substances and the present writer would
like to propose the word lectin from Latin lectus, the past
participle of legere meaning to pick, choose or select.”[24] By
building on the pioneering observations made by Sumner and
Howell, on the detection of haptenic inhibition of antibody–antigen reactions by Landsteiner and van der Scheer,
and on the description of blood-group-specific lectins by Renkonen and Boyd (cited above),[22, 23d–f] a milestone of lectin
application was established shortly before 1954 (see Table 2).
This breakthrough was the inhibition of haemagglutination,
mediated by eel (Anguilla anguilla) serum and seed extracts of
the Leguminosa Lotus tetragonolobus, by L-fucose. These key
experiments led to the determination of “the biochemical
basis of blood group AB0 and Lewis antigenic specificity”[25e]
(for further listings of the course of lectin research history, see
Table 2).
To reach the present version of the term lectin, its definition
had to be subjected to several refinements. The experimental
focus on agglutination, which requires at least bivalency for
the bridging of two cell surfaces, was dropped completely in
the course of this process. The three criteria that must currently be met by a (glyco)protein for it to qualify as a member of
the lectin family are given below.[26]
a) Carbohydrate-binding activity
Assays monitoring binding to carrier-immobilized carbohydrate
ligands of (neo)glycoconjugates are now commonly used to
detect and quantify lectin activity, irrespective of the presence
of bridging functionality.[4a, 27] The presence of a carbohydrate
recognition domain (CRD) linked with other bioactive modules
in a mosaic-like protein (see also Sections 4 and 5) makes it
possible to assign bi- and multifunctional proteins to different
protein families.
b) Distinction from immunoglobulins
In the original definition of a lectin given by Boyd in 1954,[24]
the groups of immunoglobulins (Ig), such as IgG or IgM, are
deliberately excluded. It should be noted that the animal lectins of the I-type class with a distal V-set module and C2-set
domains belong to the Ig superfamily and that various lectins,
such as galectins, as well as C- and I-type lectins, are produced
from lymphocytes along with antibodies.[28]
c) Distinction from enzymes tailoring free saccharides/
glycan chains of glycoconjugates, and from sensor or carrier
proteins for free mono- or oligosaccharides
Any glycosyltransferase, glycosidase, or enzyme that modifies
its cognate carbohydrate (e.g. the sulfotransferases or epimerases), as well as transport/chemotaxis receptors for free mono-,
di-, or oligosaccharides are excluded from the lectin family.
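
The three criteria above can be read as a simple decision rule. The following Python sketch is only an illustration under stated assumptions: the record type `ProteinAnnotation`, its field names, and the example annotations are hypothetical and hand-assigned for this sketch; they are not taken from a database or from the cited references.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinAnnotation:
    """Hypothetical, hand-assigned annotations for one (glyco)protein."""
    name: str
    binds_carbohydrate: bool            # criterion (a)
    is_immunoglobulin: bool             # criterion (b): antibodies such as IgG or IgM
    modifies_bound_carbohydrate: bool   # criterion (c): glycosyltransferases, glycosidases, etc.
    senses_or_carries_free_sugar: bool  # criterion (c): transport/chemotaxis receptors


def qualifies_as_lectin(protein: ProteinAnnotation) -> bool:
    """Apply criteria (a)-(c): all three must be met."""
    return (
        protein.binds_carbohydrate
        and not protein.is_immunoglobulin
        and not protein.modifies_bound_carbohydrate
        and not protein.senses_or_carries_free_sugar
    )


# Illustrative annotations (assumptions made for this sketch only).
examples = [
    ProteinAnnotation("concanavalin A", True, False, False, False),
    ProteinAnnotation("carbohydrate-specific IgM", True, True, False, False),
    ProteinAnnotation("a glycosidase", True, False, True, False),
    ProteinAnnotation("a sugar chemotaxis receptor", True, False, False, True),
]

for protein in examples:
    print(f"{protein.name}: lectin = {qualifies_as_lectin(protein)}")
```

Note that agglutination does not appear among the fields: as stated above, bridging functionality is no longer part of the definition. An I-type lectin would carry `is_immunoglobulin=False` in this sketch, since membership of the Ig superfamily is not the same as being an immunoglobulin.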
With this explanation of the generic name for (glyco)proteins that read sugar-encoded messages in mind, it is instructive to examine the diversity of these proteins in plants and
animals. If lectins were rare inventions of nature, then communication with sugar code words would surely be restricted to
only a few messages that can be decoded.
4. Plant Lectins: Occurrence, Functions, and Applications
The richest sources of plant lectins are the seeds or, more generally, the storage organs of plants. For most plants studied so
far, lectins have been prepared from the seeds, but roots,
tubers, bulbs, bark, or leaves have also served as starting materials for the isolation of lectins.[29] As emphasized above in the
context of the inter- and intrafamily diversity of glycosyltransferases (see the last paragraph of Section 2), the wide distribution of lectins is a strong argument for their physiological relevance. Table 3 lists families of higher plants, as defined by the
rules of botanical taxonomy, with the numbers of lectin-bearing species in each family. Algal, fern, and fungal lectins are
not included. Since an activity assay solely with haemagglutination without proper controls can yield false-positive results,
we limited the compilation to those cases for which further unambiguous evidence for lectin presence is available. The overwhelming majority of lectins characterized up to now has
been found in the Angiospermae section. Among these, about
three-quarters of the lectin-bearing species belong to the Dicotyledoneae and almost 90 % of these to the Archichlamydeae subclass of the dicot class. Leguminosae played an important role in the early history of lectinology, as outlined
above (see also Table 2), and still hold a prominent position in
the field. However, to avoid misinterpretation of our systematic
compilation, we must add that the search for lectins has not
really been carried out strategically by following the rules of
botanic systematics and searching species by species. It is thus
likely that the literature-based numbers given in Table 3 will
promptly increase when researchers begin doing so. Studies
have so far often focused on economically relevant plants. Besides the advantage of easy access to the starting material, reports on compounds from plants of nutritional value are sure
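to find a wide readership. Consequently, the occurrence of lectins in plants outside the remit of modern agriculture is probably underestimated. Another factor that may have played a
role in the count is that such plants often have tiny seeds.

Table 3. Systematic coverage of the occurrence of plant lectins.
Section | Class | Subclass | Order | Family | Number of known lectin-bearing species
Angiospermae | Dicotyledoneae | Archichlamydeae | Salicales | Salicaceae | 2
Angiospermae | Dicotyledoneae | Archichlamydeae | Fagales | Fagaceae | 2
Angiospermae | Dicotyledoneae | Archichlamydeae | Urticales | Cecropiaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Urticales | Urticaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Urticales | Moraceae | 18
Angiospermae | Dicotyledoneae | Archichlamydeae | Santalales | Loranthaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Santalales | Viscaceae | 3
Angiospermae | Dicotyledoneae | Archichlamydeae | Centrospermae | Amaranthaceae | 7
Angiospermae | Dicotyledoneae | Archichlamydeae | Centrospermae | Caryophyllaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Centrospermae | Chenopodiaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Centrospermae | Phytolaccaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Cactales | Cactaceae | 2
Angiospermae | Dicotyledoneae | Archichlamydeae | Magnoliales | Lauraceae | 2
Angiospermae | Dicotyledoneae | Archichlamydeae | Ranunculales | Ranunculaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Guttiferales | Theaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Papaverales | Cruciferae | 4
Angiospermae | Dicotyledoneae | Archichlamydeae | Papaverales | Papaveraceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Rosales | Crassulaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Rosales | Leguminosae | 140
Angiospermae | Dicotyledoneae | Archichlamydeae | Rosales | Saxifragaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Geraniales | Euphorbiaceae | 13
Angiospermae | Dicotyledoneae | Archichlamydeae | Sapindales | Sapindaceae | 3
Angiospermae | Dicotyledoneae | Archichlamydeae | Celastrales | Celastraceae | 2
Angiospermae | Dicotyledoneae | Archichlamydeae | Rhamnales | Rhamnaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Rhamnales | Vitaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Thymelaeales | Eleagnaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Violales | Passifloraceae | 2
Angiospermae | Dicotyledoneae | Archichlamydeae | Cucurbitales | Cucurbitaceae | 22
Angiospermae | Dicotyledoneae | Archichlamydeae | Myrtiflorae | Myrtaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Umbelliflorae | Araliaceae | 1
Angiospermae | Dicotyledoneae | Archichlamydeae | Umbelliflorae | Umbelliferae | 3
Angiospermae | Dicotyledoneae | Metachlamydeae | Ebenales | Ebenaceae | 1
Angiospermae | Dicotyledoneae | Metachlamydeae | Tubiflorae | Solanaceae | 5
Angiospermae | Dicotyledoneae | Metachlamydeae | Tubiflorae | Lamiaceae | 9
Angiospermae | Dicotyledoneae | Metachlamydeae | Tubiflorae | Convolvulaceae | 5
Angiospermae | Dicotyledoneae | Metachlamydeae | Tubiflorae | Pedaliaceae | 1
Angiospermae | Dicotyledoneae | Metachlamydeae | Tubiflorae | Verbenaceae | 1
Angiospermae | Dicotyledoneae | Metachlamydeae | Dipsacales | Caprifoliaceae | 5
Angiospermae | Dicotyledoneae | Metachlamydeae | Campanulales | Compositae | 2
Angiospermae | Monocotyledoneae | – | Helobiae | Alismataceae | 1
Angiospermae | Monocotyledoneae | – | Liliiflorae | Alliaceae | 6
Angiospermae | Monocotyledoneae | – | Liliiflorae | Amaryllidaceae | 11
Angiospermae | Monocotyledoneae | – | Liliiflorae | Dioscoreaceae | 1
Angiospermae | Monocotyledoneae | – | Liliiflorae | Iridaceae | 6
Angiospermae | Monocotyledoneae | – | Liliiflorae | Liliaceae | 20
Angiospermae | Monocotyledoneae | – | Graminales | Gramineae | 9
Angiospermae | Monocotyledoneae | – | Spathiflorae | Araceae | 17
Angiospermae | Monocotyledoneae | – | Cyperales | Cyperaceae | 1
Angiospermae | Monocotyledoneae | – | Scitamineae | Musaceae | 2
Angiospermae | Monocotyledoneae | – | Microspermae | Orchidaceae | 6
Gymnospermae | Coniferopsida | – | Coniferae | Araucariaceae | 1
Gymnospermae | Coniferopsida | – | Coniferae | Pinaceae | 3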
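
The proportions quoted in the text for Table 3 can be checked with a few lines of arithmetic. The sketch below assumes that the species counts of Table 3 group into subclasses and classes in the order in which the families are listed; the variable names are illustrative only.

```python
# Species counts per family, copied from Table 3 in listing order.
# Assumption: the grouping below follows the row order of the table.
ARCHICHLAMYDEAE = [2, 2, 1, 1, 18, 1, 3, 7, 1, 1, 1, 2, 2, 1, 1, 4,
                   1, 1, 140, 1, 13, 3, 2, 1, 1, 1, 2, 22, 1, 1, 3]
METACHLAMYDEAE = [1, 5, 9, 5, 1, 1, 5, 2]
MONOCOTYLEDONEAE = [1, 6, 11, 1, 6, 20, 9, 17, 1, 2, 6]
GYMNOSPERMAE = [1, 3]

archichlamydeae = sum(ARCHICHLAMYDEAE)
dicots = archichlamydeae + sum(METACHLAMYDEAE)
angiosperms = dicots + sum(MONOCOTYLEDONEAE)
all_species = angiosperms + sum(GYMNOSPERMAE)

# Share of dicots among angiosperm species, and of Archichlamydeae among dicots.
dicot_share = dicots / angiosperms
archichlamydeae_share = archichlamydeae / dicots

print(f"Angiospermae: {angiosperms} of {all_species} listed species")
print(f"Dicotyledoneae share of Angiospermae: {dicot_share:.1%}")
print(f"Archichlamydeae share of Dicotyledoneae: {archichlamydeae_share:.1%}")
```

Under this grouping, 270 of the 350 angiosperm species (about 77 %) are dicots, and 241 of these 270 (about 89 %) belong to the Archichlamydeae, in line with the "about three-quarters" and "almost 90 %" stated in the text.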
The most popular method for tracing lectin presence is still
to test plant extracts for their ability to agglutinate cells, usually human or other mammalian erythrocytes. This classical
method, already used more than a century ago by Mitchell[17]
and Stillmark,[20] excels because of its simplicity. However, the
technique suffers from several noteworthy disadvantages.
Plant extracts can contain active material such as tannins that
leads to the above-mentioned false-positive results. The presence of lipids can also lead to misinterpreted results, and erythrocytes tend to agglutinate spontaneously in the presence
of only moderate concentrations of bivalent metal ions. Erythrocytes are sensitive to surface-active substances such as saponins, so lectins may easily be overlooked in their presence.
Moreover, an agglutination assay only detects lectins that are
at least bivalent and can therefore link cells, a factor noted
above in criterion (a) of the lectin definition (see Section 3).
Therefore, screening methods have been developed that capitalize on the carbohydrate-binding capacity of lectins. Chemically tailored neoglycoconjugates present carrier-immobilized
carbohydrate ligands for interaction, and complex formation is
then picked up analytically by using any suitable label.[27, 30] To
avoid radioactive labeling, the intrinsic activity of enzymes that
are naturally glycosylated, such as horseradish peroxidase, or
that can serve as acceptors for glycans through chemical conjugation, such as E. coli β-galactosidase, is used to detect any
lectin-like activity presented on a matrix.[31] Recently developed
methods employ microarray technology. This approach can
be combined with combinatorial synthesis, which illustrates
the emerging importance of the interface between lectin research and carbohydrate chemistry.[32] Needless to say, arrays
will prove instrumental in the definition of ligands with optimal affinity and selectivity, a factor of relevance for research
aiming to extend the contents of Table 1 in the future. However, at present these methods are too sophisticated for general
use. Expensive equipment and considerable expertise are
required to master the chemical syntheses and analytical evaluation techniques. In consequence, even recent studies dealing with newly discovered lectins rely on cell agglutination
as an analytical tool. The shortcomings of this approach are
generally addressed by using controls to prove inhibition of
the agglutination by haptenic sugar, as in the elucidation of
the determinants of the AB0 histo-blood group epitopes fifty
years ago.[25]
Once lectin activity has been detected, the next step in the
characterization pathway, regardless of the source of the material, is isolation of the lectin(s). This step can certainly be performed by standard protocols for protein purification, which
include ion exchange, size exclusion, and hydrophobic chromatography. Investments of time and effort are reduced by taking
advantage of the highly efficient method of affinity chromatography on immobilized carbohydrates or glycoconjugates. A
very simple means of applying this technique is to use naturally occurring polysaccharides such as dextrans. These compounds are high-affinity adsorbents for glucose-binding lectins
such as concanavalin A and pea or lentil agglutinins when the
polymer chains are cross-linked. Surprisingly, the enormous potential of this method was not initially realized. As was recently
pointed out in a commentary on the path of the lectins “from
obscurity into the limelight” by Sharon,[33] the manuscript pioneering this approach was not favorably reviewed at first:
“Irwin J. Goldstein from the University of Michigan at Ann
Arbor, a leading lectin researcher to this very day, tells that
when he sent a note, in 1963, to Biochemical and Biophysical
Research Communications describing the purification of concanavalin A by affinity chromatography, it was rejected forthright
because ‘this represents a modest advance in an obscure area.’
The note was eventually published in Biochemical Journal[34a]
and affinity chromatography soon became the method of
choice for lectin isolation” (see Table 2). Among the procedures
used to conjugate a saccharide to the matrix, we found divinyl
sulfone activation particularly easy in handling and efficient in
terms of final lectin yields.[34b–d] To broaden the scope of one-step lectin purification, it is convenient to covalently couple
not only saccharides but also naturally occurring glycoproteins
to the resin. For this purpose, hog gastric mucin and hen ovomucoid, both easily available in large amounts, have been successfully employed.[35] The prevailing method used to elute the lectin
exploits the haptenic sugar as a competitive inhibitor. Problems arise when binding is directed to extended glycans, as is
the case for Phaseolus bean lectins (or phytohaemagglutinins
(PHAs), a formerly used generic name for plant lectins; see
below). The presence of these lectins is the biochemical cause
of the nausea that results from eating insufficiently cooked
beans (see also Table 4). In such instances of binding to the extended glycans, lectin elution from the resin can be performed
by lowering the pH value of the buffer. If the lectin is too
sensitive to withstand an acidic medium, desorption with a
borate-containing buffer offers a simple and affordable alternative. The elution profiles that result from the use of these two
protocols are illustrated in Figure 1. Successive elution with
haptenic sugar and borate was helpful for purification of distinct lectins from the same source that differ in carbohydrate
specificity. Figure 1 A shows that isolectin family I (specific for
Gal/GalNAc) found in Griffonia simplicifolia seed extracts can
be easily separated from the type II lectin (specific for
(GlcNAc)n). Figure 1 B illustrates that this procedure even allows
closely related isolectins such as the Phaseolus bean lectins to
be resolved. When the concentration of the eluant borate was
increased stepwise, it was possible to obtain the five isolectins
L4, L3E, L2E2, LE3, and E4 in separate fractions.[36] The isolectin L4
(listed by commercial suppliers as PHA-L4 or phytohaemagglutinin L4) is a popular laboratory tool used as a mitogen for lymphocytes and the chromatographic method described gives remarkably easy access to pure material without contamination
by the isoagglutinin E4 or the other three forms, as explained
in detail in the figure legend.
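
The set of five Phaseolus isolectins follows from combining two subunit types in a four-subunit protein. As a minimal sketch, assuming (as the names L4 and E4 imply) that each isolectin is a tetramer of L and E subunits, and using the subunit Mr values of 31 kDa (L) and 34 kDa (E) given in the legend of Figure 1, the compositions can be enumerated as follows. The function names are illustrative, and the summed values are nominal estimates only, not measured molecular masses.

```python
# Subunit Mr values (kDa) as given in the legend of Figure 1.
SUBUNIT_MR_KDA = {"L": 31, "E": 34}
# Assumption: each isolectin consists of four subunits, as the names L4 and E4 imply.
SUBUNITS_PER_ISOLECTIN = 4


def isolectin_name(n_l: int, n_e: int) -> str:
    """Build a composition label such as 'L3E1' from subunit counts."""
    parts = []
    if n_l:
        parts.append(f"L{n_l}")
    if n_e:
        parts.append(f"E{n_e}")
    return "".join(parts)


def enumerate_isolectins():
    """List all L/E compositions with their nominal summed subunit Mr (kDa)."""
    result = []
    for n_e in range(SUBUNITS_PER_ISOLECTIN + 1):
        n_l = SUBUNITS_PER_ISOLECTIN - n_e
        nominal_mr = n_l * SUBUNIT_MR_KDA["L"] + n_e * SUBUNIT_MR_KDA["E"]
        result.append((isolectin_name(n_l, n_e), nominal_mr))
    return result


for name, nominal_mr in enumerate_isolectins():
    print(f"{name}: nominal Mr of about {nominal_mr} kDa")
```

The enumeration yields exactly five compositions (L4, L3E1, L2E2, L1E3, E4), which matches the five separate fractions obtained by stepwise borate elution.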
The members of the diverse group of plant lectins that are
studied and used most frequently are listed in Table 4. The
leading position is held by concanavalin A, the “classical” Man/Glc-binding
lectin from Jack beans (see above and Table 2 for
the central role of this lectin in the history of lectinology). The
obtainable yield of concanavalin A from seed material is about
2 g per 100 g and it is chemically stable, key factors for its initial isolation by crystallization (see above). Once purified, the
lectin can undergo numerous chemical modifications. All these
properties are very favorable for chemical, biochemical, and biomedical applications (see Table 5 for a summary of research
areas in which plant lectins are used as tools). These facts explain why this lectin has attained its status as a reliable and
popular workhorse, especially for carbohydrate chemists looking for a lectin to use in an attempt to prove the ligand properties of a sugar compound attached to a new synthetic scaffold. The other lectins listed in Table 4 are capable of following
the role model concanavalin A, although they are less prominently used in research. These compounds form a panel of
probes for isolation and structural characterization of glycoconjugates (glycoproteins, glycolipids, or polysaccharides), as
well as use in various assays in cell biology, histochemistry, and
the medical sciences (Table 5).[26g, 34d, 37] The size of the panel of
lectins with related specificities (for a selection of frequently
used plant lectins, see Table 4) ensures that the optimal tool
for a defined purpose can always be found. For example, LCA
can be used when the cells to be desorbed from a lectin-containing
solid matrix must be handled under gentle conditions,
in contrast to concanavalin A, with which harsher conditions
are required since binding is comparatively tight.[38]

Table 4. Examples of plant lectins to illustrate the inter- and intrafamily diversity of these proteins.[a]
Plant species and name/abbreviation of lectin | Family | Mono- or disaccharide specificity | Comments
Canavalia ensiformis (concanavalin A, ConA) | Leguminosae | Man/Glc | cheapest and most popular lectin; first lectin isolated by crystallization and demonstrated to interact with carbohydrate (see text and Table 2 for details)
Ricinus communis (ricin) | Euphorbiaceae | Gal | ribosome-inactivating protein, type II (RIP II), used for generating immunotoxins; biohazard
Triticum vulgare (WGA) | Gramineae | (GlcNAc)1-3, Neu5Ac | potential function in plant defence mechanisms
Phaseolus vulgaris (PHA) | Leguminosae | no simple carbohydrate known | isolectin L4 is a strong mitogen for T-lymphocytes, isolectin E4 is a strong erythrocyte agglutinin (see Figure 1 B for chromatographic isolectin separation); distinguish between bisected and nonbisected N-glycans; cause of severe gastrointestinal irritation when ingested in insufficiently cooked beans
Glycine max (SBA) | Leguminosae | GalNAc/Gal | cell sorting, bone marrow purging
Pisum sativum (PSA) | Leguminosae | Man/Glc | binding of N-glycans enhanced by core fucosylation
Viscum album (VAA, viscumin) | Viscaceae | Gal | RIP II used for generating immunotoxins, constituent of proprietary mistletoe extracts (immunomodulatory and growth stimulatory for tumor cells in vitro and in vivo at low doses; see text for details)
Arachis hypogaea (PNA) | Leguminosae | Gal, Galβ3GalNAcα (TF-antigen) | very popular in histochemistry; separates immature from mature thymocytes
Lens culinaris (LCA) | Leguminosae | Man/Glc | binding of N-glycans enhanced by core fucosylation; lymphocyte mitogen
Dolichos biflorus (DBA) | Leguminosae | GalNAcα3GalNAc, GalNAc | cell sorting, agglutinates blood group A erythrocytes
Griffonia simplicifolia (GSA-I) | Leguminosae | Gal/GalNAc | isolectin GSA-I-A4 agglutinates blood group A erythrocytes, isolectin GSA-I-B4 blood group B erythrocytes
Griffonia simplicifolia (GSA-II) | Leguminosae | (GlcNAc)n | insecticidal activity, potential defence role
Artocarpus integrifolia (jacalin) | Moraceae | Gal (Man, TF-antigen) | used for isolation of IgA1 and mucins, mitogenic for CD4+ T-cells
Solanum tuberosum (STA) | Solanaceae | (GlcNAc)n | potential function in plant defence mechanisms
Galanthus nivalis (GNA) | Amaryllidaceae | Man | does not bind Glc as the Leguminosae lectins do, application for insect and nematode defence in transgenic crop plants tested, antiretroviral activity in vitro, selective agglutination of rabbit but not human erythrocytes
Ulex europaeus (isolectin UEA-I) | Leguminosae | L-Fuc | agglutinates blood group 0(H) erythrocytes; selective marker for endothelial cells of primates
Erythrina corallodendron (ECA) | Leguminosae | Galβ4GlcNAc, Gal, GalNAc | mitogen for human lymphocytes
Vicia faba (VFA) | Leguminosae | Man/Glc | binding of N-glycans enhanced by core fucosylation
Sambucus nigra (SNA) | Caprifoliaceae | Neu5Acα6Gal/GalNAc, (Gal/GalNAc) | probe for sialylated glycoconjugates, e.g. in thymocyte differentiation
Abrus precatorius | Leguminosae | Gal | RIP II used for generating immunotoxins
Lotus tetragonolobus (LTA) | Leguminosae | L-Fuc | agglutinates red cells of blood group 0(H), instrumental to the definition of α-L-fucose as a crucial 0(H) epitope (see Table 2)
Lycopersicon esculentum | Solanaceae | (GlcNAc)n | potential function in plant defence mechanisms; marker of endothelium of small vessels in rats
Phaseolus lunatus limensis | Leguminosae | GalNAcα3[Fucα2]Gal, GalNAc | agglutinates blood group A erythrocytes
Datura stramonium (DSA) | Solanaceae | (GlcNAc)n | potential function in plant defence mechanisms
Maackia amurensis (MAA) | Leguminosae | Neu5Acα3Gal/GalNAc | probe for sialylated glycoconjugates
Phytolacca americana (PWM) | Phytolaccaceae | GlcNAc | known as pokeweed mitogen; detected in 1969 in the course of investigating a fatality associated with ingestion of pokeweed berries
Bauhinia purpurea (BPA) | Leguminosae | GalNAcβ3GalNAc, GalNAc | enrichment of B lymphocytes, isolation of T cells producing IL-2
Urtica dioica (UDA) | Urticaceae | (GlcNAc)n | antifungal activity
Hevea brasiliensis (hevein) | Euphorbiaceae | (GlcNAc)n | antifungal activity; allergen in rubber products of poor quality
Maclura pomifera (MPA) | Moraceae | T-antigen > Tn-antigen | mitogen for lymphocytes
[a] The order of the list reflects the share of attention given to each lectin in the literature.

Figure 1. Illustration of the chromatographic purification and separation of plant lectins from the same species and source (see Table 4 for further information on these lectins) by using the glycan chains of immobilized glycoproteins as affinity ligands. A) Successive elution with 25 mM D-galactose and 50 mM borate from a column bearing desialylated hog gastric mucin as the affinity ligand and loaded with plant extract as previously described[36] resulted in purification of Gal/GalNAc-specific Griffonia simplicifolia agglutinin I (GSA-I; subunit Mr = 30/32 kDa) and GSA-II ((GlcNAc)n-specific; subunit Mr = 28 kDa). B) Stepwise increases in the borate concentration in the elution buffer resulted in desorption of the five Phaseolus vulgaris isoagglutinins from immobilized ovomucoid. Elution started with PHA-L4 (subunit Mr = 31 kDa) at 15 mM borate and finally reached PHA-E4 at 250 mM borate. Elution was monitored by measuring the absorption at 280 nm (A280) and the agglutination activity, as described previously.[35] The latter assays revealed that potency increases from E1L3 to E2L2 to E3L1, and finally E4 (subunit Mr = 34 kDa), the strongest erythrocyte agglutinin. Lymphocyte stimulation increased from E4 (20-fold at 37 µg mL⁻¹) to L4 (24-fold at 8 µg mL⁻¹).

Table 5. Versatility of plant lectins as research tools.[a]
Biochemistry
  detection of defined carbohydrate epitopes of glycoconjugates in blots or on thin-layer chromatography plates
  purification of lectin-reactive glycoconjugates by affinity chromatography
  glycan characterization by serial lectin affinity chromatography (lectin affinity capture)
  glycome analysis (glycomics)
  quantification of lectin-reactive glycoconjugates in enzyme-linked lectin-binding assays (ELLA)
  quantification of activities of glycosyltransferases/glycosidases by lectin-based detection of products of enzymatic reaction
  model reagents for the assessment of the ligand functionality of carbohydrate-presenting scaffolds (e.g. glycodendrimers)
Cell biology
  characterization of intracellular assembly, routing, and cell surface presentation of glycoconjugates in normal and genetically engineered cells (glycomic profiling, spatially defined)
  selection of cell variants (mutants, transfectants) with altered lectin-binding properties as models for dissecting glycosylation machinery and glycan functionality (glycomic profiling, functionally defined)
  fractionation of cell populations
  modulation of the proliferation and activation status of cells and dissection of the involved signal pathways
  model substratum for study of cell aggregation, adhesion, and migration
Medicine
  detection of disease-related alterations of glycan synthesis by lectin cyto- and histochemistry
  histo-blood group typing and definition of secretor status
  quantification of aberrations of cell surface glycan presentation, e.g. in malignancy
  cell marker for diagnostic purposes including marking infectious agents (viruses, bacteria, fungi, parasites)
  cell marker for functional assays to pinpoint defects in cell activities such as mediator release
[a] Extended and modified from ref. [26g].

A frequently encountered application concerns the mitogenic activity of
lectins (Table 5). The fact that plant lectins can affect lymphocyte activity and proliferation has led to suggestions that the
laboratory tools could be introduced as immunomodulatory
therapeutic agents in clinical applications. The example of the
galactoside-specific mistletoe lectin (VAA, formerly ML-1), a
constituent of proprietary extracts used in Austria, Germany,
and Switzerland, shows that immune functions such as secretion of proinflammatory cytokines or priming of granulocytes/
activity of NK cells can indeed be stimulated at nontoxic doses
of lectin (VAA concentration needed to elicit in vivo effects: 1–2 ng kg⁻¹
body weight, given subcutaneously).[39] However, this
immunomodulatory capacity is unlikely to have a clinical perspective because lectin-dependent increase in the proliferation
(and also metastatic capacity) of tumor cells has likewise been
described for cell lines, histocultures of human tumors, and
animal models in vivo (primum non nocere).[26g, 40] Enhanced
availability of proinflammatory cytokines might account for
this effect. In more general terms, it is becoming evident that
these immune factors can also trigger growth responses in malignant cells.[41] Our understanding of how immune/inflammatory cells influence tumor growth and neovascularization is
thus undergoing a paradigmatic shift. This development is reflected in the statement that these cells “conspire with cancer
cells in promoting” (rather than inhibiting) these processes,[41e]
which has implications for the way we look at immunostimulation in cancer patients. As a consequence, immunomodulation
by a lectin can exert a nonbeneficial influence on tumor parameters. Case studies, including an account of a study on melanoma patients in which treatment with a proprietary mistle-
¹ 2004 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim
751
H.-J. Gabius et al.
toe extract appeared to decrease
Table 6. Functions of plant lectins.[a]
the lengths of overall survival
Activity
Example of lectin
and disease-free intervals of patients with lymph node metastaexternal
protection from fungal attack
Hevea brasiliensis (rubber tree), Urtica dioica
ses, underline concerns that
activities
(stinging nettle), Solanum tuberosum (potato)
protection from herbivorous animals
Phaseolus vulgaris (French bean), Ricinus comherbal treatment modalities in
munis (castor bean), Galanthus nivalis (snowalternative/complementary meddrop), Triticum vulgare (wheat)
icine may not be free of serious
involvement in establishing symbiosis between
Pisum sativum (common pea), Lotononis bainesii
risk potential.[42] A recent review
plants and bacteria
(miles lotononis), Arachis hypogaea (peanut), Triticum vulgare (wheat), Oryza sativa (rice)
on the controversial issue of the
Internal
storage proteins
valid for all lectins
clinical use of Viscum album exactivities ordered deposition of storage proteins and enPisum sativum (common pea), Lens culinaris
tracts in cancer treatment conzymes in protein bodies and mediation of con(lentil), Glycine max (soybean), Oryza sativa (rice)
cluded that ™mistletoe therapy
tact between storage proteins and protein body
membranes
has the potential to harm cancer
modulation of enzymatic activities such as phos- Secale cereale (rye), Solanum tuberosum (potato),
[42d]
patients.∫
These data also
phatase activity
Pleurotus ostreatus (oyster mushroom), Glycine
caution against intuitive expectmax (soybean), Dolichos biflorus (horse gram)
ations that in vitro modulation
participation in growth regulation
Medicago sativa (alfalfa), Cicer arietinum (chick
pea)
of one or more immune paramadjustment to altered environmental conditions Triticum aestivum (winter wheat)
eters (plant lectins are very
[a] For further information on carbohydrate specificities, see Table 4. For a recent review, see ref. [26g].
active elicitors of such a response) will automatically be
clinically beneficial.
While knowledge on the distribution of lectins in plants has
and bacterial sialidases, often contain a second domain besides
taken enormous strides as a result of documentation of their
their catalytic section. This domain has exclusive carbohydratewidespread occurrence, it is difficult to produce a succinct
binding activity that allows it to guide and firmly position the
compendium of their functions in situ. In principle, each lectin
hydrolytic center.[45] This close cooperation of the two sites
might have distinct functions at the site of expression and
toward polysaccharide degradation (see criterion (c), Section 3)
through interplay with binding partners in the cell and the exexplains the reluctance of researchers to count these enzymes
tracellular environment, an idea also valid for animal lectins.
with a carbohydrate-binding module as lectins. Equivalent proOne particular protein can thus take care of several tasks. Powteins that bring a catalytic and a carbohydrate-binding domain
erful techniques used to regulate lectin presence on the level
together are found in both plants (e.g. b-galactosidases and
of gene expression in vitro and in vivo that were a boon for
endo-b-1-4-glucanase in strawberry) and animals (see
the elucidation of lectin functions in animals are starting to be
[43] exploited in plants, so progress in refining and extending current knowledge of the functions of plant lectins will not be long in coming. Table 6 summarizes current concepts on this topic, together with examples of lectins with the activities concerned. Free oligosaccharides also convey biochemical messages and, although their binding partners do not fit the lectin definition given herein in every respect (see criterion (c) in Section 3), our survey would not be complete without paying tribute to this aspect of oligosaccharide behavior. Indeed, an emerging topic in the area of protein–carbohydrate interaction is the way in which oligosaccharide elicitors interact with their often ill-defined receptors.[44] These elicitors are products of the degradation of plant/fungal cell walls or lipochitooligosaccharides (Nod factors involved in the chemical cross-talk between nitrogen-fixing soil bacteria and their leguminous host plant). Of note is the observation that the rhizobial nodulation protein NodC, a glycosyltransferase responsible for GlcNAc incorporation within the synthetic pathway of the Nod factors, does not appear to be a unique invention of the evolutionary process because similar sequences have been found in Xenopus, zebrafish, and mouse proteins[44e] (the alternative route to chitooligosaccharides employs endochitinases). Members of this glycosylhydrolase family (no. 18), like many other enzymes involved in bacterial/fungal carbohydrate polymer degradation

below).[26g] A recent example of clinical interest implicates mutations affecting a putative glycogen-binding domain (CBD-4) of laforin in disease onset, which is supposedly a result of mispositioning of the phosphatase activity. This domain is the product of the EPM2A (epilepsy of progressive myoclonus type 2) gene, which is defective in Lafora disease.[46] The detection of a chitinase-related receptor-like kinase (CHRK1) in tobacco and of receptor-like protein kinases with extracellular lectinlike domains in thale cress (Arabidopsis thaliana) and lombardy poplar (Populus nigra var. italica) suggests the existence of an outside/inside signaling route for the transfer of sugar-encoded messages into the plant cell.[16c, 47] Although it is tempting to draw analogies between plant and animal lectin functions, this approach should not be taken too far. The enzymatic apparatus of glycan synthesis is not identical in plants and animals, so the patterns of potential natural ligands for evolutionary adaptation diverge. For example, the structures of the core regions of complex-type N-glycans in plants differ from the structures in animals in that the plant glycans harbor two unique additions to the substitution pattern of the core region (the α1-3-linked fucose attached to the proximal GlcNAc residue and the β1-2-linked xylose in the core mannose residue). Mammalian cells, in contrast, have relatively abundant supplies of β1-4-galactosyltransferases, α1-2/6-fucosyltransferases, and sialyltransferases.[48]
With regard to the lectins, the take-home messages of this
section are clear: a) plant lectins are widely found, and
b) these lectins are endowed with various functional activities
through their carbohydrate-binding activity. The number of
ways in which plant lectins are successfully applied as tools
(summarized in Table 5) intimates that far-reaching opportunities would be missed if the system of complementary molecular interaction exploited with these laboratory tools were not
naturally operative in animals. In the search for biochemical
hardware for programmed “lock-and-key” interactions, lectins
and glycans have thus been judged to be “reasonable candidates.”[49] To gauge the extent to which our knowledge of
animal lectins has advanced over the last few decades (see
also Table 2), it is informative to recall the scepticism with
which this concept was confronted three decades ago. At that
time, the view was held (as for antibodies and antigens) that
lectins and oligosaccharides “are unlikely to provide a general
mechanism of recognition and communication of the type
postulated by Weiss[50] because one member of each pair is
probably not a common cell component. The known lectins
generally originate from plants or invertebrates…”[49] The isolation
of the C-type hepatocyte asialoglycoprotein receptor in 1974
and of the galectin (electrolectin) from the electric eel (Electrophorus electricus) in 1975, the biochemical verification in 1980 of the presence of lectins in
snake venom (originally discovered by Mitchell in 1860),[17, 51] and
the work that ensued have markedly changed this view.[28c,e]
The next section is a brief survey of the current status of
knowledge of lectin occurrence and functions in animals.
5. Animal Lectins: Occurrence, Functions, and
Applications
The complexity of glycosylation reactions in animals, especially
mammals, and of the resulting glycans which form the cellular
glycome is matched by that of proteins with a carbohydrate
recognition domain that meet the criteria for classification as a
lectin given at the end of Section 3.[4, 28c,e] The great strides
taken in sequence and three-dimensional analysis of lectins
have enabled researchers to pinpoint modules that accommodate glycan epitopes with great precision.[28c,e, 52] A minimum of
five lectin families has been solidly defined: the C-, I-, and P-type lectins, pentraxins, and galectins.[28c,e] New additions to
this list will very likely include: a) the two molecular chaperones calnexin and calreticulin, which have a folding pattern resembling that of leguminous lectins, b) a mannose-binding
lectin from the pufferfish Fugu rubripes with sequence similarity to the agglutinins of monocotyledonous plants with the
same binding specificity, c) tachylectin 5A/ficolin, with their
fibrinogen-like binding sites, d) the “chitinase-like” Ym1 lectin
with its TIM barrel, e) fucose-binding eel lectins, which have a
β-barrel with jelly-roll topology, and f) glycosaminoglycan-binding receptors/adhesion molecules.[19f, 28c] This subclassification is
evocative of that of glycosyltransferases and each lectin family
encompasses more than one member. Table 7 gives an idea of
the degree of intrafamily diversity. The table shows the current
status of the family of mammalian galectins (Ca²⁺-independent
animal lectins with specificity for β-galactosides and derivatives
thereof, a jelly-roll-like folding pattern, and a set of invariant
amino acids in the site of contact with the ligand that includes
a central Trp residue; see Section 2 for the role of the indolyl
side chain). Scouring genome databases for respective hits is
thus a worthwhile activity, and homology-based database
mining is becoming a valuable tool for the detection of new
family members.[53] A further striking example of intrafamily diversity is provided by the proteins containing the C-type domain (115–130
amino acids with four invariant Cys residues and a characteristic consensus sequence). This domain is often found in mosaiclike proteins with functions involved in cell adhesion (e.g. the
selectins in lymphocyte recirculation) or organization of the extracellular matrix (e.g. the hyalectans/lecticans), and in proteins
involved in glycan endocytosis. The C-type
domain ranks seventh in frequency among the domains encoded by the 19 099
predicted genes of the nematode Caenorhabditis elegans and
thus surpasses even the epidermal-growth-factor-like and Ig-superfamily domains in ranking.[54] With 165 or 183 open reading frames (according to separate calculations), this motif, typical for a member of the family of animal lectins, is well-represented in the genome of the model organism.[52c, 55] To date,
over a hundred human proteins with C-type lectinlike domains
have been described, which establishes this group of domains
as a lectin family. These lectins are divided into six subgroups
based on their individual modular and quaternary structures.[52c] These numbers reflect a complex evolutionary genealogy and intimate fine-tuning of ligand specificity for distinct
functions.
This type of lectin and also members of several other families take advantage of the elaborate enzymatic process line
that specifically tailors the branch ends of glycan chains by
preferentially targeting the spatially accessible tips of the
sugar antennae. That lectins, through binding to their distinct
glycan determinants, are indeed able “to provide a general
mechanism of recognition and communication”[49] (the widespread presence of lectins in animals has already convincingly
dispelled the concerns quoted above) is proven by the accrued
knowledge presented in Table 8. It is immediately clear from
the entries in this table that these insights into lectin function
offer enormous potential for applied research in chemical biology. Endocytic receptors of the C-type lectin family, with their
fixed geometry of binding sites, are ideal as targets for synthetically tailored drug carriers. These receptors render uptake
into cells such as hepatocytes feasible.[56] Antiviral drugs can
thus be delivered to hepatocytes, for example by using triantennary N-glycans with GalNAc in the terminal position as a
post code. Conversely, lectin-dependent clearance of glycosylated pharmaproteins is therapeutically disadvantageous as it
reduces the bioavailability of the drug. It is reasonable in this
case to modify the glycan structure to reduce or even avoid
lectin binding. Integration of chemoenzymatic N-glycan synthesis and bioassays toward this aim has spawned progress in
this field.[57] To be specific, biantennary complex-type N-glycans with α2-3(6)-sialylation and/or bisecting GlcNAc or core fucosylation in the bioactive part of the neoglycoproteins have been studied. α2-6-Sialylation of a biantennary complex-type N-glycan is a means of conferring the signal to its carrier for a rather long period of circulation.[57] Addition of a bisecting GlcNAc residue to the biantennary N-glycan considerably increases uptake of the neoglycoprotein into the liver and spleen, which is relevant for clinical imaging. Neither core fucosylation nor use of the glycan free of substitution can achieve the same effect.[57] As the cited reports describe in further detail, glycan modification by substitution can also bring about notable changes in the affinity of the molecule for soluble lectins, an effect emerging from the presence of distinct substitutions with biological/clinical relevance.[57] Another route towards optimization of glycosylation for clinical use involves glycoengineering. In this approach, new N-glycosylation sequences (the sequon Asn-X-Ser/Thr, where X is any amino acid except Pro) are introduced into protein therapeutics such as recombinant human erythropoietin by site-directed mutagenesis. The duration of serum presence and the activity of the engineered glycoproteins in vivo have been increased in this way.[58] The combined use of these chemoenzymatic strategies, which render N- and O-glycans of choice available,[57, 59] and molecular biological engineering is bound to bring about the rational design of compounds with prolonged bioavailability or refined capacity for specific delivery.

Table 7. Members of the galectin family of mammalian lectins.[a]
Name | Occurrence | Structural features
galectin-1 (galaptin, L-14) | many cell types | homodimer; one CRD per subunit (14–15 kDa): proto type
galectin-2 | gastrointestinal tract; clone from human hepatoma | homodimer; one CRD per subunit (43 % sequence identity to galectin-1; 14 kDa): proto type
galectin-3 (CBP35, Mac-2 antigen, IgE-binding protein, L-29, L-34) | many cell types | monomer with one CRD (oligomer formation in solution and on surfaces); Pro-, Tyr-, and Gly-rich repeats in N-terminal section (27–36 kDa): chimera type
galectin-4 | colon, small intestine, stomach, oral epithelium, esophagus; lung, testis, breast, liver, and placenta by RT-PCR | monomer with two partially homologous but distinct CRDs connected by a link peptide (36 kDa); proteolysis generates truncated proto-type-like products: tandem-repeat type
galectin-5 | reticulocytes, erythrocytes (rat) | monomer with one CRD (17 kDa): proto type
galectin-6 | small intestine, colon | tandem-repeat arrangement of two CRDs (33 kDa)
galectin-7 | keratinocytes, stratified epithelia, carcinoma cells | homodimer; one CRD per subunit (15 kDa): proto type
galectin-8 | several tissues; frequently present in tumor cell lines (link peptide extension possible) | homologous to galectins-4 and -6 (tandem-repeat arrangement of two CRDs with unique link peptide; 34 kDa)
galectin-9 | small intestine, liver, lung, kidney, thymus (rat/mouse; small intestinal isoform with 31/32 amino acid extension of link peptide); lymphatic tissue and B cells, T cells and macrophages, pancreas, colon carcinoma cells (human) | homologous to galectins-4, -6, and -8 (tandem-repeat arrangement of two CRDs with unique link peptide; 36 kDa)
Charcot–Leyden crystal protein (galectin-10) | major autocrystallizing constituent of eosinophils and basophils | one CRD-like structure with specificity for D-Man (16.5 kDa)
galectin-11 (ovgal-11) | sheep gastrointestinal tract, induced upon nematode infection | one CRD, resembles proto-type galectins (14 kDa)
galectin-12 | several tissues (upregulation in cells synchronized at the G1 phase or G1/S boundary of the cell cycle), adipocytes | homologous to galectins-4, -6, -8, and -9 (tandem-repeat arrangement of two CRDs with unique link peptide; 35.3 kDa)
galectin-13 | identical to placental protein 13 (pp13); also expressed in the spleen, kidney, bladder, and in tumor cells | homodimer; one CRD per subunit (16.1 kDa); close similarity to galectin-7 and the Charcot–Leyden crystal protein
galectin-14 | ovine eosinophils, secreted into bronchoalveolar lavage fluid | one CRD resembling proto-type galectins (18.2 kDa)
[a] Taken from ref. [27c], extended, and modified. Please note that the presence of the galectins in humans has not been confirmed in all cases (e.g. rat galectin-5).

Fixed topological presentation of binding sites, as discussed above, is also a prerequisite for blocking access to bacterial/viral lectins, a new concept for interfering with the adhesion step of infections and the binding of AB5 toxins.[30g,k] The fivefold symmetry of the presentation of the binding sites in these toxins provides the potential for extremely tight binding by a suitably designed pentavalent ligand. This configuration is evocative of a starfish and such compounds are 10⁷-fold more potent in inhibition assays than their monomers.[30k] Since soluble lectins also display binding sites in distinct arrangements, the therapeutic concept may be extended beyond infections toward attenuating lymphocyte accumulation or metastatic spread. Glycodendrimers have indeed been shown to impair binding of galectins both in solid-phase assays with selectivity for the glycoprotein ligand and type of galectin, and in cell-binding studies.[60] The recently delineated involvement of galectins in tissue invasion during glioblastoma progression or within the metastatic cascade (e.g. in colon, breast, or prostate carcinoma)[16b, 61] is a potential area of interest for testing these ideas in applications. In addition to the effects of the spatial presentation of ligands on synthetic scaffolds such as wedge-like glycodendrimers,[60c,d] the fine specificity differences between these homologous endogenous lectins are being delineated to enhance probe selectivity, another challenge for chemical biology.[62] A theory is forming that the structure of the ligand and the spatial mode of its presentation modulate binding avidity in markedly different ways for individual lectins of a family. The detection of these differences lends credit to the assumption that intrafamily diversification is accompanied by quantitative alterations of the ligand profile,
as mentioned at the start of this section. Systematic chemical mapping with ligand derivatives and screening of arrays/libraries to discover potent ligand mimetics are likely eventually to allow molecules to be devised that fit hand-in-glove into a particular galectin (or any other lectin of clinical interest).[13d, 16d, 32] As in the case of plant lectins, these reagents will be instrumental to the detection of lectin activities and to their cyto- and histochemical localization, which is relevant to histopathology.[27] This approach (i.e. tracking down carbohydrate-binding proteins by using synthetic probes) has been termed “reverse lectin histochemistry” to distinguish it from the routine lectin applications listed in Table 5.[63]

The deployment of mammalian lectins as laboratory tools has lagged behind application of agglutinins from plants. The reasons for this lack of application are definitely the limited availability of, and access to the reagents. The easy-to-follow protocols provided by recombinant technology to solve this problem have paved the way for endogenous proteins to become tools too. As research tools (see Table 5), the endogenous lectins offer the added advantage of being able to act as potential therapeutic agents that exploit natural substances and signal pathways, for example, to limit tumor proliferation or T-cell-dependent immune disorders.[28f, 41c, 64] Since lectins naturally select binding partners for an in situ function, it is a sure bet that assays involving the participation of endogenous lectins will increase in number. In terms of functional considerations, the development of assays with endogenous lectins (instead of plant surrogates) can be considered a quantum leap. Since the fine sugar specificities of plant and mammalian lectins often differ, results obtained with plant lectins suffer from the inevitable drawback that they cannot be reliably extrapolated to in situ functionality.

We have compiled the documented functions of animal lectins for review in Table 8. Evidently, carbohydrates serve as versatile ligands. It is thus logical to ask a fundamental question on the nature of oligosaccharides: “How can flexible molecules act as signals?”[65] This concern was put into words in a recent review in which the author states that, “on several occasions I have heard structural biologist colleagues state that the glycan units in a glycoprotein, for instance, cannot be important because they are too flexible to be seen in an X-ray crystal structure or by NMR. In other words, if they do not have a structure, how can they have a function? That this conclusion is gratuitous”[2] can be seen by turning to the next section.

Table 8. Functions of animal lectins.[a]
Activity | Example of lectin
ligand-selective molecular chaperones in endoplasmic reticulum | calnexin, calreticulin
intracellular routing of glycoproteins and vesicles | ERGIC-53 and VIP-36 (probably also ERGL and VIPL), P-type lectins, comitin
intracellular transport and extracellular assembly | nonintegrin 67-kDa elastin/laminin-binding protein
inducer of membrane superimposition and zippering (formation of Birbeck granules) | langerin (CD207)
cell-type-specific endocytosis | hepatic and macrophage asialoglycoprotein receptors, dendritic cell and macrophage C-type lectins (mannose receptor family members of the tandem-repeat type and single CRD lectins such as langerin/CD207), cysteine-rich domain of the dimeric form of the mannose receptor for GalNAc-4-SO4-bearing glycoprotein hormones in hepatic endothelial cells, P-type lectins
recognition of foreign glycans (β1,3-glucans, LPS) | CR3 (CD11b/CD18), dectin-1, Limulus coagulation factors C and G, earthworm CCF
recognition of foreign or aberrant glycosignatures on cells (including endocytosis or initiation of opsonization or complement activation) | collectins, L-ficolin, C-type macrophage and dendritic cell receptors, α/θ-defensins, pentraxins (CRP, limulin), tachylectins
targeting of enzymatic activity in multimodular proteins | acrosin, laforin, Limulus coagulation factor C
intra- and intermolecular modulation of enzyme activities in vitro | porcine pancreatic α-amylase, galectin-1/α2-6-sialyltransferase
bridging of molecules | homodimeric and tandem-repeat-type galectins, cytokines (e.g. IL-2:IL-2R and CD3 of T-cell receptors), cerebellar soluble lectin
induction or suppression of effector release (H2O2, cytokines, etc.) | galectins, selectins, and other C-type lectins such as CD23, BDCA-2, and dectin-1
cell growth control and induction of apoptosis/anoikis | galectins, C-type lectins, amphoterin-like protein, hyaluronic-acid-binding proteins, cerebellar soluble lectin
cell migration and routing | selectins and other C-type lectins, I-type lectins, galectins, hyaluronic-acid-binding proteins (RHAMM, CD44, hyalectans/lecticans)
cell–cell interactions | selectins and other C-type lectins (e.g. DC-SIGN), galectins, I-type lectins (e.g. siglecs, N-CAM, P0, or L1)
cell–matrix interactions | galectins, heparin- and hyaluronic-acid-binding lectins such as hyalectans/lecticans, calreticulin
matrix network assembly | proteoglycan core proteins (C-type CRD and G1 domain of hyalectans/lecticans), galectins (e.g. galectin-3/hensin), nonintegrin 67-kDa elastin/laminin-binding protein
[a] Taken from ref. [4c], extended, and modified.

6. The Third Dimension of the Sugar Code
It is in principle correct to point critically at the inherent flexibility of oligosaccharides. Rapid intramolecular movements can explain the frustrating futility of attempts to obtain crystals from viscous solutions produced by synthetic carbohydrate chemistry. In glycoproteins, the glycan antennae can even behave as nearly separate entities, a noteworthy factor that allows the proteomic complexity to be increased through distinct template-independent posttranslational modifications,
without altering the genomic coding capacity.[15a] Close inspection of this flexibility by molecular modeling (molecular mechanics and dynamics simulations) and NMR spectroscopy[65, 66] has revealed that “certain glycans have highly favored conformations.”[2] Figures 2 and 3 focus on ligands for the galactoside-specific lectins (galectins and mistletoe lectin) introduced above and illustrate that the conformational space of such a free disaccharide is structured energetically in the way a topographical map is arranged by altitude. The molecules populate low-energy (valley) positions in the molecular dynamics simulations, and this result is experimentally verified by the detection of time-averaged interresidual resonance transfer between water-insensitive C–H protons (Figures 2 and 3).[67] Such disaccharides thus have access to more than one position in the Φ, Ψ, E plot characterizing the distinct sets of energetically favored conformations (Figure 2, Figure 3). Since “the carbohydrate moves in solution through a bunch of shapes each of which may be selected by a receptor,” Hardy has likened such a carbohydrate ligand to a “bunch of keys,”[68] with explicit reference to the “lock-and-key” paradigm introduced by Fischer in 1894 (see Section 2).[5] Taken literally, each individual conformer (“key”) is endowed with the potential to interact with a certain complementary receptor site (“lock”).

Figure 2. Illustration of conformational aspects of the disaccharide Galβ1-3GalNAcα/β. This epitope (the α-anomer is the Thomsen–Friedenreich tumor antigen) is a ligand for galectins (for further information on this family of animal lectins, see Tables 7 and 8), as shown by the occurrence of two interresidual trNOE contact signals in the 2D trNOESY spectrum of a mixture (molar ratio 10:1) of the disaccharide with chicken liver galectin (CG-16), recorded at 500 MHz and 298 K with a mixing time of 100 ms (top). Introduction of this information as two pairs of contour lines into the conformational energy map (Φ, Ψ, E plot) derived from molecular mechanics calculations (ε = 4) limits the conformational space of the bound ligand. It is also limited in this way when the experimental information is introduced into the molecular dynamics profile (300 K, 1000 ps) derived from calculations that explicitly include water molecules and start from the Φ, Ψ coordinates at 0/180° outside the central low-energy valley. These calculations reveal a high population density within this central valley (middle), as described previously.[67b] Three individual low-energy conformations from the central area, marked 1, 2, and 3 in the energy map, were drawn by using these sets of Φ, Ψ angle combinations to visualize the structural impact of Φ, Ψ angle changes (bottom).

Figure 3. Illustration of the structural aspects of differential conformer selection of a digalactoside by a plant and an animal lectin. Relevant parts of 2D ROESY/trNOESY spectra (recorded at 500 MHz, 298 K with a mixing time of 100 ms) of the free disaccharide Gal'β1-2Gal (A) and of this ligand at a 10:1 molar ratio with (B) the galactoside-specific mistletoe lectin (Viscum album L. agglutinin, VAA; see Table 4 for further information) and (C) the chicken liver galectin (CG-16), respectively. The spectra show three interresidual cross-peaks for the free ligand and two such signals for the lectin–ligand complexes, as described previously.[66a, 67a] The interresidual H1'/H2 cross-peak is shared by the three spectra, whereas only one of the interresidual H1'/H1 and H1'/H3 cross-peaks is present in each of the trNOESY spectra of the ligand with the plant and animal lectins. Molecular mechanics (ε = 4) and molecular dynamics calculations (ε = 80, CVFF, 300 K, 1000 ps), combined with the NMR-spectroscopy-based contour line pairs (see refs. [66a, 67a] for details), revealed that only one of the two conformers present in solution (labeled as 1 and 2 in the Φ, Ψ, E plot) was bound by each of these two lectins (D). The plant agglutinin and the animal lectin select different conformers of the digalactoside. The structures of the conformers are shown in (E).
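The picture of a conformational energy map whose low-energy valleys are the populated positions can be illustrated with a short calculation. The following is a minimal sketch, not the modeling protocol used in the cited work: it assumes a ligand with a handful of discrete conformers at thermal equilibrium and converts relative energies into Boltzmann populations. The function name and the energy values are hypothetical and chosen only for illustration.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def boltzmann_populations(rel_energies_kj_mol, temperature_k=300.0):
    """Return fractional populations of conformers from their relative energies.

    Assumes discrete conformers in thermal equilibrium (an idealization).
    """
    e_min = min(rel_energies_kj_mol)
    weights = [
        math.exp(-(e - e_min) / (R_KJ * temperature_k))
        for e in rel_energies_kj_mol
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical relative energies (kJ/mol) of three valleys in a Φ, Ψ, E map.
valleys = {"conformer 1": 0.0, "conformer 2": 2.0, "conformer 3": 8.0}
populations = boltzmann_populations(list(valleys.values()))
for name, fraction in zip(valleys, populations):
    print(f"{name}: {fraction:.2f}")
```

Under these assumptions, a difference of a few kJ/mol between valleys still leaves more than one shape appreciably populated, which is the situation the "bunch of keys" metaphor describes.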
In other words, a lectin might perform conformer
selection, which provides a starting point for hypothesis-driven work. Several agglutinins that share sequence specificity for a disaccharide might subject
the ligand population to differential conformer selection. In this sense, recognition is primarily a shape
problem (see the passage quoted above), a statement
with substantial implications for the design of therapeutic glycomimetics. As illustrated in Figures 2 and
3, experimental data on interresidual proton distances
for the ligand in complex with the lectin in solution
are obtained by transferred nuclear Overhauser effect
(trNOE) spectroscopy, where the signal intensity
serves as a molecular ruler.[66, 69] Whereas the definition of the bound conformation is not unambiguous
for the example given in Figure 2 and requires further
experimental input or a docking analysis (see below
for further discussion and also Figure 4), the information presented in Figure 3 clearly demonstrates the
principle of differential conformer selection by lectins.[66a, 67] In this instance, a single disaccharide (Galβ1-2Gal) forms two rapidly interconverting shapes. Each specifically interacts with only one of the two different lectins, that is, either with a galectin or a plant lectin. The same ligand can form a bioactive and a bioinert conformation when viewed from the perspective of the galectin tested. As Roseman commented, “it is this interplay between proteins and different conformers that likely allows a single carbohydrate structure (…) to be used in many different ways.”[2] In terms of methodology, it is the interplay of carbohydrate chemistry, molecular modeling, NMR spectroscopy, and biochemical preparation of the receptors that allows the validity of this concept to be convincingly proven.

Figure 4. Illustration of the substantial gain of information about the actual conformation of a carbohydrate ligand provided by access to NOE data for contacts involving water-exchangeable protons, and an experimental example to verify the validity of this concept (discussed in detail previously).[76a] The blurring in (A) demonstrates that even the presence of two interresidual contacts (here H1' to H3 and H4 of Gal'α1-3Galβ1-R) does not allow accurate definition of the conformation of the disaccharide (see Figure 2 (middle) and Figure 3 D for the size of the area shared by two pairs of contour lines in the E plot). Although the results of molecular mechanics calculations intimate that the bound-state conformations are at low-energy sites in the Φ, Ψ, E plots, further experimental evidence to support this assumption is essential. This verification is symbolized by the substitution of the blurred image by a clear structure (B) after inclusion in calculations of a signal indicating a third contact. In fact, detection of the new water-sensitive contact by analysis of protein–ligand complexes in an aprotic solvent improves the precision of the conformational description by allowing a third pair of contour lines to be added to the E plot. This pair of lines delimits the area of overlap of the two pairs of lines drawn based on water-insensitive contacts (C). Remarkably, this area representing the ligand's bound-state conformation, which is accommodated by a natural immunoglobulin G fraction from human serum, lies within/close to the central low-energy valley.[76a]
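The idea that trNOE signal intensity serves as a molecular ruler can be made concrete with the textbook isolated-spin-pair approximation, in which the cross-peak intensity scales with the inverse sixth power of the interproton distance. The sketch below is an assumption-laden illustration rather than the analysis used in the cited studies: it calibrates an unknown distance against a reference proton pair of known separation, and it ignores spin diffusion, exchange kinetics, and mixing-time dependence. All numbers and names are hypothetical.

```python
def noe_distance(i_ref, r_ref, i_obs):
    """Estimate an interproton distance from NOE cross-peak intensities.

    Uses the isolated-spin-pair approximation I ~ r**-6, calibrated against
    a reference pair with intensity i_ref and known distance r_ref.
    The result is in the same unit as r_ref.
    """
    if i_ref <= 0 or i_obs <= 0 or r_ref <= 0:
        raise ValueError("intensities and reference distance must be positive")
    return r_ref * (i_ref / i_obs) ** (1.0 / 6.0)


# Hypothetical example: a reference contact at 2.5 Å and an interresidual
# cross-peak that is eight times weaker than the reference cross-peak.
estimate = noe_distance(i_ref=8.0, r_ref=2.5, i_obs=1.0)
print(f"estimated interresidual distance: {estimate:.2f} Å")
```

Because of the sixth-root dependence, even large errors in measured intensity translate into modest errors in distance, which is one reason a small number of interresidual contacts can already restrict the allowed region of a Φ, Ψ map.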
The power of this integrated approach is again made evident in Figure 5, which shows the bound-state conformers of a
glycomimetic. The tested C-glycoside offers the pharmacodynamic advantage of resistance to hydrolytic cleavage. However,
an increased degree of flexibility relative to that of the O-glycoside results from the introduction of a methylene bridge in
place of the oxygen atom (for further information, see the
legend of Figure 5).[70] By using exclusive interresidual contacts
as fingerprint-like characteristics for a certain bound-state topology, differential conformer selection was established and the
conformers selected by galectin-1 (syn-Φ, Ψ), the B chain of
ricin (anti-Ψ), and an enzymatically inactive mutant of the bacterial β-galactosidase (anti-Φ) were tracked down.[71] One may
wonder whether this result applies only to small ligand structures or also to naturally occurring extended saccharide chains.
A recent example is provided by a combined NMR spectroscopy and molecular modeling study that defined the boundstate topology of a cell-surface-exposed oligosaccharide chain,
the pentasaccharide of ganglioside GM1. The obtained data
add further strong support to the concept that a certain low-energy conformer is favored for binding. The carbohydrate
chain of the ganglioside is the target both for cross-linking by
galectin-1, which induces inhibition of the growth of human SK-N-MC neuroblastoma cells, and for the AB5 toxin of Vibrio cholerae.[30k, 64a] This ability of one molecule to act as a ligand for two
structurally unrelated receptors prompts questions about the
topological aspects of these two recognition processes. As
shown in Figure 6, in which the two bound-state conformations are compared, there is indeed a difference at the branch
point of the carbohydrate chain.[72] The dihedral angles of the
Neu5Acα2-3Gal linkage in the bound ligand are either Φ, Ψ = 70°/15° in the case of galectin-1 (in solution) or about 172°/26° for cholera toxin (in crystals). The conformations selected
for binding represent two of the three lowest-energy conformations of the free ligand. Binding causes no distortion of the
topology of the selected “key”. This result makes it tempting
to suggest that ligand derivatives with the same carbohydrate
sequence but conformational restriction at the linkage of the
internal branch point could no longer interact with both receptor proteins. After all, it would be clinically desirable to block
the action of the AB5 toxin with an inhibitor while lowering
the affinity of the inhibitor to the endogenous lectin to avoid
undesired side reactions. This challenge at the interface of synthetic carbohydrate chemistry and chemical biology can be
tackled rationally given precise topological information.
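Comparisons of bound-state topologies such as the one above rest on dihedral angles computed from atomic coordinates. As a minimal sketch, assuming Cartesian coordinates of the four atoms that define a torsion are available (for example from a model or a crystal structure), the angle can be calculated as follows. The helper names and the test coordinates are hypothetical, and which four atoms define Φ and Ψ for a given glycosidic linkage depends on the convention adopted, which is not fixed here.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points p0-p1-p2-p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def angle_difference_deg(a, b):
    """Smallest signed difference a - b between two angles, in (-180, 180]."""
    diff = (a - b + 180.0) % 360.0 - 180.0
    return 180.0 if diff == -180.0 else diff


# Hypothetical coordinates forming a right-angle torsion.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

The `angle_difference_deg` helper handles the periodicity of torsion angles, so that two conformations at, say, 170° and −170° are recognized as lying only 20° apart when bound-state geometries are compared.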
Beyond selectivity, the binding of a deliberately preformed
conformer might also help reduce the entropic penalty in the
thermodynamic balance sheet of the overall association and
accommodation process.[13c,e, 65] When we analyzed the binding
of the pentasaccharide to galectin-1 by modeling, we were
able to obtain information on the major contact sites and the
resulting interaction energy terms, data that provide more
input for the design of glycomimetics.[72]
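The argument that a preformed conformer lowers the entropic penalty of binding can be illustrated with a deliberately idealized estimate. The sketch below assumes that the free ligand occupies a few discrete conformers with known populations and that binding collapses this ensemble onto a single conformer. It then reports the conformational entropy lost, expressed as −TΔS. This is not the interaction-energy analysis of the cited modeling work, and it neglects solvent, protein, and vibrational contributions. The populations used are hypothetical.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)


def conformational_entropy(populations):
    """Gibbs entropy -R * sum(p ln p) of a discrete conformer ensemble, J/(mol*K)."""
    if abs(sum(populations) - 1.0) > 1e-6:
        raise ValueError("populations must sum to 1")
    return -R * sum(p * math.log(p) for p in populations if p > 0.0)


def entropic_penalty_kj(populations, temperature_k=300.0):
    """-T*dS in kJ/mol for collapsing the ensemble onto one bound conformer."""
    return temperature_k * conformational_entropy(populations) / 1000.0


# Hypothetical ensembles: a flexible ligand versus a preorganized derivative.
flexible = [0.4, 0.3, 0.3]
preorganized = [0.95, 0.03, 0.02]
print(f"flexible:     {entropic_penalty_kj(flexible):.2f} kJ/mol")
print(f"preorganized: {entropic_penalty_kj(preorganized):.2f} kJ/mol")
```

In this simplified picture the penalty vanishes as one conformer approaches full population, which is the rationale for conformationally restricted ligand derivatives.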
An intriguing example of the intimate relationship between
carbohydrate flexibility and molecular recognition is given by
iduronic acid in heparin/heparan sulfates, as outlined in Section 2 (for the position of L-iduronic acid in the anticoagulant
heparin pentasaccharide that binds to antithrombin III, see
Scheme 4). When latched into the recognition site of the
plasma protein antithrombin III, 2-O-sulfated L-iduronic acid is
driven toward its skewed ²S₀ conformation. In contrast, the
local-kink-forming ¹C₄ conformation is preferred by fibroblast
growth factors because it maximizes contact between the
target determinant in the glycosaminoglycan and these homologous proteins.[10, 73] This amazing role as a versatile hinge that
allows the crucial regions of the glycosaminoglycan to adopt
the most favorable spatial topology makes it clear that the development of the epimerase reaction that produces L-iduronic
acid was not a fortuitous event but a wise investment. The
given examples teach this lesson: the more we learn about the
intricacies underlying the virtues of carbohydrates as ligands,
the more refined the ideas on the drawing-board for devising
ligands with optimal fit and specificity will become. Consideration of the shape of the molecule and its control will play a major role in this process. Rational synthesis and manipulation of the structural details of the molecule, such as the sulfation pattern, as well as screening of oligosaccharide libraries provide routes to augment the affinity of ligands for certain targets and to obtain substances with special biological properties, such as dissociating anticoagulant and antiangiogenic activities.[9, 74] The guidelines for the synthesis clearly rely on the precision of the information on the bound-state topology of the ligand. As shown in Figures 2, 3, and 5, only C–H protons have been exploited as reporter groups so far. Recruiting hydroxy protons to contribute to the fingerprint of 2D trNOESY/ROESY cross-peaks would improve the quality of our view of bound-state ligand topology, as graphically depicted in Figure 4 A and B. To preclude loss of the information from water-exchangeable hydroxy protons, one option was to build on pioneering work with glucosides. Sharp signals were detected for these study objects when they were dissolved in an aprotic solvent (dimethyl sulfoxide).[75] The concern that the activity of carbohydrate-binding proteins might be harmed substantially by the solvent change was addressed by performing systematic binding assays. These assays revealed that the activity of proteins with a well-structured folding pattern, for example, the jelly roll of galectins, the double β trefoil of the mistletoe lectin, and the Ig fold of immunoglobulin G fractions, is not harmed by such solvent change.[76] These data square well with encouraging experiences with enzymes in organic solvents.[77] The results of such experiments also intimate that the folding pattern, at least around the binding site, is not markedly changed by the solvent. Indeed, the accuracy of this assumption has been ascertained experimentally. Formation of dimers of the homodimeric galectin-1, instead of any indication of unfolding, was observed by small angle neutron scattering.[78] The experimental approach of turning to aprotic solvents for trNOE spectroscopy thus affords the possibility of detecting signals from ligand protons other than those originating from resonance transfer between C–H protons. Figure 4 illustrates results from a proof-of-principle example. The results shown prompted consideration of how the range of applicability of this approach could be extended.

Figure 5. Illustration of the structural aspects of differential conformer selection of C-lactoside by an animal lectin (galectin-1), a plant lectin (the B chain of ricin; see Table 4 for further information), and a catalytically inactive form of E. coli β-galactosidase (the asterisk denotes the E537Q mutant). The glycomimetic, which cannot be hydrolyzed, accesses 23 % of the conformational space in the Φ, Ψ, E plot, while 12 % is accessed by the O-lactose.[71b] The increased flexibility of C-lactoside compared to O-lactoside is accompanied by a shift of population density from the syn conformation (Φ, Ψ: 55°, 20°) to the anti-Ψ conformation (Φ, Ψ: 40°, 180°) to give a 32/54 % ratio of the conformers.[71b] The three conformations of C-lactoside at relative energy minima (syn, anti-Ψ, and anti-Φ) are characterized by the occurrence of distinct interresidual resonance transfer processes, each of which is possible for only one topological constellation and thus establishes an exclusive contact. Each arrow in the figure originates from the respective position in the Φ, Ψ, E plot and points to the relevant part of a spectrum, shown together with a molecular model in which the pair of protons establishing the exclusive contact is indicated: GalH1/GlcH4 (syn), GalH1/GlcH3 (anti-Ψ), and GalH2/GlcH4 (anti-Φ).[69b, 71d] Detection of cross-peaks arising from any of these exclusive contacts in the 2D trNOESY spectra of the three types of lactoside-binding proteins allows the bound-state conformation of the lactoside to be defined. The animal lectin, the plant agglutinin, and the enzymatically inactive bacterial β-galactosidase select different conformers of the ligand.
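As a minimal sketch of how a measured or modeled Φ, Ψ pair could be assigned to one of the three conformer families named in the Figure 5 caption, the snippet below picks the nearest energy minimum on the periodic torsion-angle map. The syn and anti-Ψ centers are the values quoted in the caption; the anti-Φ center (180°, 0°) is an assumed placeholder, since the text gives no numbers for it, and the function names are hypothetical.

```python
import math

# Centers of the relative energy minima in degrees (Phi, Psi).
MINIMA = {
    "syn": (55.0, 20.0),        # quoted in the Figure 5 caption
    "anti-Psi": (40.0, 180.0),  # quoted in the Figure 5 caption
    "anti-Phi": (180.0, 0.0),   # ASSUMED illustrative value, not from the text
}


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def assign_conformer(phi, psi, minima=MINIMA):
    """Return the name of the minimum closest to (phi, psi) on the periodic map."""
    def distance(center):
        return math.hypot(angular_difference(phi, center[0]),
                          angular_difference(psi, center[1]))
    return min(minima, key=lambda name: distance(minima[name]))


for phi, psi in [(60.0, 10.0), (35.0, -175.0), (170.0, 5.0)]:
    print((phi, psi), "->", assign_conformer(phi, psi))
```

Nearest-center assignment is a simplification; a real analysis would rather use the energy map itself or the exclusive interresidual contacts described in the caption.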
Figure 6. Illustration of the structural aspects of differential conformer selection of the complex carbohydrate chain in ganglioside GM1 by human galectin-1 (left) and cholera toxin (right). The structures are based on analysis of the solution structure of the galectin-bound pentasaccharide[72] and data from the Brookhaven Protein Bank (code no. 2CHB/3CHB) in the case of the cholera toxin. The illustrated difference in the Φ, Ψ angle combination for the glycosidic linkage at the branch point connecting the central galactose unit (Gal′) with the α2-3-linked sialic acid residue reflects the differential selection of two conformers from the three relative energy minima of the free state (Φ, Ψ: 70°, 15° for galectin-1; Φ, Ψ: 172°/26° for cholera toxin).[72]

A recent study demonstrated that addition of measured amounts of water to an aprotic solvent does not prevent measurement of sharp signals from the hydroxy protons of the ligand, even at temperatures well above 0 °C.[79] We thus suggest that the use of binary solvent/water mixtures has the potential to enter the panel of strategies for collecting very detailed information on bound-state topology. The aim of these techniques is to enable the complete structure of the complex, including all details of the receptor, to be revealed by analysis in solution. From this information, the way in which ligand binding affects the conformation of the receptor, including its sites for protein–protein interactions (measured for galectin-1 in solution by small angle neutron scattering),[78] could be discerned. This is a demanding task to accomplish, both for the biochemist, who has to supply (isotope-labeled) material in sufficient quantity and with sufficient solubility for analysis, and for the NMR expert, who is responsible for turning spectra into a structure. This problem has already been solved for a synthetic Thomsen–Friedenreich antigen-binding 15-mer peptide, hevein-domain-containing plant lectins or lectin domains such as the 43 amino acid hevein and GlcNAc oligomers (see Figure 7), the 11-kDa cyanovirin-N from the cyanobacterium (blue-green alga) Nostoc ellipsosporum and Manα1-2Manα, as well as the 198 amino acid adhesin domain of P-pili from uropathogenic E. coli (PapGII) and galabiose (Galα1–4Galβ).[80]

In answer to the question that has guided this section, namely how flexible compounds can act as ligands, it has become clear that the conformational space of carbohydrates is structured into several areas. These areas are distinguishable by their relative energy levels. Only a limited set of conformations (“bunch of keys”)[68] is attributed to low-energy valleys, and the accommodation of such conformers is evidently not associated with an insurmountable entropic barrier. Although the molecular details of the overall thermodynamics of the generally enthalpically driven binding reaction are yet to be understood,[13c,e] the merging of synthetic excellence with in silico and in vitro techniques guarantees progress toward resolving this issue eventually.

7. Conclusions and Perspectives
The multifarious intermolecular recognition and regulation
processes that underlie the efficient and smooth functioning
of cell sociology have hitherto been assigned exclusively to nucleic acids and proteins in the central dogma of molecular biology. Despite a fashionable tendency to write off anything
beyond genomics, the problem of how the limited panel of
primary gene products is increased to serve all purposes properly and even to allow rapid and reversible regulation has engendered a surge in interest in mechanisms of posttranslational modification. Glycan chains have all the properties required
for high-density information storage and are therefore qualified to make a mark in this respect. Their finely tuned synthesis
even allows for dynamic modulations in response to external
signals, and the ensuing interplay with endogenous lectins furnishes cells with an efficient communication system. This transition in the way we look at glycans, which means that the
focus is no longer merely on the role of these molecules as
biochemical fuel or protective cell wall constituents, has not
passed unnoticed. As a consequence, cellular glycoconjugates
and lectins are receiving increasing attention and respect. The
entry at the bottom of Table 2 concerning the years 2001/2 attests to this development. Stepwise refinements in instrumental
capacity for structural analysis of carbohydrate oligo- and polymers mean that deciphering the sequence of a glycan is no longer a daunting prospect.[15a, 81] The same holds
true for conformational analysis. The realization of the enormous talents of glycans occurred in a gradual process rather
than by a quantum leap.[82] Fittingly, progress in lectinology
also followed this pattern, as the historic survey in Table 2 recounts and the steady increase of publications dealing with
lectins reflects.[16d] The instrumental role of leguminous and eel
lectins in the definition of the structure of AB0 histo-blood
group epitopes about 50 years ago (see Section 3) sets a precedent for, and shows the enormous potential of, merging these
lines of research in the glycosciences branch of chemical biology. The design of optimal ligands to block disease-causing
lectin activities (e.g. in bacterial infection or tumor invasion) or
of lectin-mimetic peptides to elicit clinically beneficial lectin activities (e.g. removal of activated T-cells in autoimmune diseases or destruction of tumor cells by mimicking the capacity of
galectin-1 to induce apoptosis/anoikis) are aims for this research. As summarized by Sharon recently, “breaking the glycocode and identifying the receptors are of prime importance
not only for theoretical reasons, but also to facilitate the development of novel treatments for the many diseases in which
carbohydrate recognition plays a key role.”[83]
Acknowledgements
The excellent manuscript processing by R. Ohl and the constructive and exceptionally helpful comments of both reviewers are greatly appreciated, as is the support from the Verein zur Förderung des biologisch-technischen Fortschritts in der Medizin e.V. A sincere apology is directed to colleagues whose original work could not be completely included, discussed, and cited because of space limitations and the scope of this review. With regret regarding this aspect of the paper, we set out to produce a primer on the concept of the sugar code as we see it, illustrated by selected proof-of-principle examples to convey a flavor of the field.

Figure 7. Relevant sections of the 2D NOESY spectra (recorded at 360 MHz, 300 K and with a mixing time of 200 ms) of the 43 amino acid plant lectin hevein (for further information on this lectin, see Table 4) in the absence (A) and in the presence (B) of N,N′-diacetylchitobiose. Characteristic alterations in the Ser19-dependent signals caused by the presence of a ligand are indicated by arrows. Involvement of the aromatic amino acids Trp21, Trp23, and Tyr30 in ligand binding is delineated by laser photo CIDNP difference spectra (aromatic section) of 1 mM hevein in the absence (C) and in the presence (D) of 1 mM N,N′-diacetylchitobiose at pD 4.[84] The spatial proximity of Ser19 and the three aromatic amino acid side chains to the ligand is depicted by the superposition of twenty snapshots (E) of the lectin–ligand complex taken in the course of a molecular dynamics simulation with explicit inclusion of water molecules, as presented in detail previously.[76a]
Keywords: adhesion · bioinformatics · drug design · glycosylation · lectins · molecular modeling · NMR spectroscopy
[1] D. A. Rees, Biochem. J. 1972, 126, 257–273.
[2] S. Roseman, J. Biol. Chem. 2001, 276, 41 527–41 542.
[3] P. J. Winterburn, C. F. Phelps, Nature 1972, 236, 147–151.
[4] a) H.-J. Gabius, Naturwissenschaften 2000, 87, 108–121; b) J. Hirabayashi, K.-i. Kasai, Trends Glycosci. Glycotechnol. 2000, 12, 1–5; c) H.-J. Gabius, S. André, H. Kaltner, H.-C. Siebert, Biochim. Biophys. Acta 2002, 1572, 165–177.
[5] E. Fischer, Ber. Dt. Chem. Ges. 1894, 27, 2985–2993.
[6] R. A. Laine in Glycosciences: Status and Perspectives (Eds.: H.-J. Gabius, S. Gabius), Chapman and Hall, London, 1997, pp. 1–14.
[7] a) W. D. Comper, J. Theor. Biol. 1990, 145, 497–509; b) H. Kresse in Glycosciences: Status and Perspectives (Eds.: H.-J. Gabius, S. Gabius), Chapman and Hall, London, 1997, pp. 201–222; c) J. Turnbull, A. Powell, S. Guimond, Trends Cell Biol. 2001, 11, 75–82.
[8] a) S. Yamada, K. Sugahara, Trends Glycosci. Glycotechnol. 1998, 10, 95–123; b) R. Sasisekharan, G. Venkataraman, Curr. Opin. Chem. Biol. 2000, 4, 626–631.
[9] a) S. Alban in Carbohydrates in Drug Design (Eds.: Z. J. Witczak, K. A. Nieforth), M. Dekker, New York, 1997, pp. 209–276; b) R. J. Linhardt, T. Toida in Carbohydrates in Drug Design (Eds.: Z. J. Witczak, K. A. Nieforth), M. Dekker, New York, 1997, pp. 277–341; c) B. Casu, U. Lindahl, Adv. Carbohydr. Chem. Biochem. 2001, 57, 159–206; d) I. Capila, R. J. Linhardt, Angew. Chem. 2002, 114, 428–451; Angew. Chem. Int. Ed. 2002, 41, 390–412; e) A. S. Gallus, D. W. Coghlan, Curr. Opin. Hematol. 2002, 9, 422–429; f) M. Sundaram, Y. Qi, Z. Shriver, D. Liu, G. Zhao, G. Venkataraman, R. Langer, R. Sasisekharan, Proc. Natl. Acad. Sci. USA 2003, 100, 651–656.
[10] B. Casu, M. Petitou, M. Provasoli, P. Sinay, Trends Biochem. Sci. 1988, 13, 221–225.
[11] a) N. Perrimon, M. Bernfield, Nature 2000, 404, 725–728; b) S. B. Selleck, Trends Genet. 2000, 16, 206–212; c) M. Princivalle, A. De Agostini, Int. J. Dev. Biol. 2002, 46, 267–278; d) J. Turnbull, K. Drummond, Z. Huang, M. Ford-Perriss, M. Murphy, S. Guimond, Biochem. Soc. Transact. 2003, 31, 343–348.
[12] a) L. V. Hooper, S. M. Manzella, J. U. Baenziger in Glycosciences: Status and Perspectives (Eds.: H.-J. Gabius, S. Gabius), Chapman and Hall, Weinheim–London, 1997, pp. 261–276; b) M. Fukuda, N. Hiraoka, T. O. Akama, M. N. Fukuda, J. Biol. Chem. 2001, 276, 47 747–47 750; c) J. R. Grunwell, C. R. Bertozzi, Biochemistry 2002, 41, 13 117–13 126; d) K. Honke, N. Taniguchi, Med. Res. Rev. 2002, 22, 637–654; e) J. U. Baenziger, Biochem. Soc. Transact. 2003, 31, 326–330; f) I. Brockhausen, Biochem. Soc. Transact. 2003, 31, 318–325.
[13] a) F. A. Quiocho, Pure Appl. Chem. 1989, 61, 1293–1306; b) R. U. Lemieux, Acc. Chem. Res. 1996, 29, 373–380; c) H.-J. Gabius, Pharmaceut. Res. 1998, 15, 23–30; d) D. Solís, J. Jiménez-Barbero, H. Kaltner, A. Romero, H.-C. Siebert, C.-W. von der Lieth, H.-J. Gabius, Cells Tissues Organs 2001, 168, 5–23; e) T. K. Dam, C. F. Brewer, Chem. Rev. 2002, 102, 387–429.
[14] a) H. Uedeira, H. Uedeira, J. Sol. Chem. 1985, 14, 27–34; b) J. Hirabayashi, Quart. Rev. Biol. 1996, 71, 365–380; c) A. M. Striegel, J. Am. Chem. Soc. 2003, 125, 4146–4148.
[15] a) G. Reuter, H.-J. Gabius, Cell. Mol. Life Sci. 1999, 55, 368–422; b) T. Hennet, Cell. Mol. Life Sci. 2002, 59, 1081–1095; c) R. G. Spiro, Glycobiology 2002, 12, 43R–56R; d) P. M. Coutinho, E. Deleury, G. J. Davies, B. Henrissat, J. Mol. Biol. 2003, 328, 307–317; e) D. J. Becker, J. B. Lowe, Glycobiology 2003, 13, 41R–53R; f) K. G. Ten Hagen, T. A. Fritz, L. A. Tabak, Glycobiology 2003, 13, 1R–16R.
[16] a) S. H. Barondes, Annu. Rev. Biochem. 1981, 50, 207–231; b) H. Kaltner, B. Stierstorfer, Acta Anat. 1998, 161, 162–179; c) A. Villalobo, H.-J. Gabius, Acta Anat. 1998, 161, 110–129; d) H. Rüdiger, H.-C. Siebert, D. Solís, J. Jiménez-Barbero, A. Romero, C.-W. von der Lieth, T. Díaz-Mauriño, H.-J. Gabius, Curr. Med. Chem. 2000, 7, 389–416; e) N. M. Dahms, M. K. Hancock, Biochim. Biophys. Acta 2002, 1572, 317–340; f) S.-i. Kawabata, R. Tsuda, Biochim. Biophys. Acta 2002, 1572, 414–421; g) J. Lu, C. Teh, U. Kishore, K. B. M. Reid, Biochim. Biophys. Acta 2002, 1572, 387–400; h) P. H. Weigel, J. H. N. Yik, Biochim. Biophys. Acta 2002, 1572, 341–363.
[17] S. W. Mitchell, Smithsonian Contrib. Knowledge 1860, XII, 89–90.
[18] S. Flexner, H. Noguchi, J. Exp. Med. 1902, 6, 277–301.
[19] a) J. Bordet, F. P. Gay, Ann. Inst. Pasteur 1906, 20, 467–498; b) J. Bordet, O. Streng, Zbl. Bakteriol. Parasitenkd. Infektionskrankh. Hyg. Abt. I Orig. 1909, 49, 260–276; c) S. Hirani, J. D. Lambris, H. J. Müller-Eberhard, J. Immunol. 1985, 134, 1105–1109; d) H.-J. Gabius, Int. J. Biochem. 1994, 26, 469–477; e) G. R. Vasta, M. Quesenberry, H. Ahmed, N. O'Leary, Dev. Comp. Immunol. 1999, 23, 401–420; f) D. C. Kilpatrick, Biochim. Biophys. Acta 2002, 1572, 401–413.
[20] H. Stillmark, Über Ricin, ein giftiges Ferment aus den Samen von Ricinus comm. L. und einigen anderen Euphorbiaceen, Inaugural Dissertation, Schnakenburg's Buchdruckerei, Dorpat, 1888.
[21] J. B. Sumner, J. Biol. Chem. 1919, 37, 137–142.
[22] J. B. Sumner, S. F. Howell, J. Bacteriol. 1936, 32, 227–237.
[23] a) A. Creite, Z. Rat. Med. 1869, 36, 90–108; b) K. Landsteiner, Zbl. Bakteriol. Parasitenkd. Infektionskrankh. Hyg. Abtlg. I Orig. 1900, 27, 357–362; c) K. Landsteiner, Wiener Klin. Wschr. 1901, 42, 1020–1024; d) K. Landsteiner, J. van der Scheer, J. Exp. Med. 1931, 54, 295–305; e) K. O. Renkonen, Ann. Med. Exp. Biol. Fenn. 1948, 26, 66–72; f) W. C. Boyd, R. M. Reguera, J. Immunol. 1944, 62, 333–339; g) W. C. Boyd, Vox Sang. 1963, 8, 1–32; h) N. C. Hughes-Jones, B. Gardner, Br. J. Haematol. 2002, 119, 889–893.
[24] W. C. Boyd in The Proteins (Eds.: H. Neurath, K. Bailey), Academic Press, New York, 1954, Vol. 2, Part 2, pp. 756–844.
[25] a) W. M. Watkins, W. T. J. Morgan, Nature 1952, 169, 825–826; b) W. T. J. Morgan, W. M. Watkins, Br. J. Exp. Pathol. 1953, 34, 94–103; c) W. J. Judd, CRC Crit. Rev. Clin. Lab. Sci. 1980, 12, 172–214; d) G. W. G. Bird, Transfusion Med. Rev. 1989, 3, 55–62; e) W. M. Watkins, Trends Glycosci. Glycotechnol. 1999, 11, 391–411; f) H. P. Schwarz, F. Dorner, Br. J. Haematol. 2003, 121, 556–565.
[26] a) I. J. Goldstein, R. C. Hughes, M. Monsigny, T. Osawa, N. Sharon, Nature 1980, 285, 66; b) J. Kocourek, V. Hořejší, Nature 1981, 290, 188; c) M. B. F. Dixon, Nature 1981, 292, 192; d) J. Kocourek, V. Hořejší in Lectins. Biology, Biochemistry, Clinical Biochemistry (Eds.: T. C. Bøg-Hansen, G. A. Spengler), W. de Gruyter, Berlin, 1983, Vol. 3, pp. 3–6; e) S. H. Barondes, Trends Biochem. Sci. 1988, 13, 480–482; f) H.-J. Gabius, Biochim. Biophys. Acta 1991, 1071, 1–18; g) H. Rüdiger, H.-J. Gabius, Glycoconjugate J. 2001, 18, 589–613.
[27] a) H.-J. Gabius, S. André, A. Danguy, K. Kayser, S. Gabius, Methods Enzymol. 1994, 242, 37–46; b) H.-J. Gabius, C. Unverzagt, K. Kayser, Biotech. Histochem. 1998, 73, 263–277; c) H.-J. Gabius, Anat. Histol. Embryol. 2001, 30, 3–31.
[28] a) H.-J. Gabius, Cancer Invest. 1987, 5, 39–46; b) L. D. Powell, A. Varki, J. Biol. Chem. 1995, 270, 14 243–14 246; c) H.-J. Gabius, Eur. J. Biochem. 1997, 243, 543–576; d) T. Angata, E. C. M. Brinkman-Van der Linden, Biochim. Biophys. Acta 2002, 1572, 294–316; e) D. C. Kilpatrick, Biochim. Biophys. Acta 2002, 1572, 187–197; f) G. A. Rabinovich, N. Rubinstein, M. A. Toscano, Biochim. Biophys. Acta 2002, 1572, 274–284.
[29] H. Rüdiger, Acta Anat. 1998, 161, 130–152.
[30] a) C. P. Stowell, Y. C. Lee, Adv. Carbohydr. Chem. Biochem. 1980, 37, 225–281; b) J. D. Aplin, J. C. Wriston, Jr., CRC Crit. Rev. Biochem. 1981, 10, 259–306; c) H.-J. Gabius, Angew. Chem. 1988, 100, 1321–1330; Angew. Chem. Int. Ed. Engl. 1988, 27, 1267–1276; d) Neoglycoconjugates. Preparation and Applications (Eds.: Y. C. Lee, R. T. Lee), Academic Press, San Diego, 1994; e) N. V. Bovin, H.-J. Gabius, Chem. Soc. Rev. 1995, 24, 413–421; f) R. Roy, Trends Glycosci. Glycotechnol. 1996, 8, 79–99; g) M. Mammen, S.-K. Choi, G. M. Whitesides, Angew. Chem. 1998, 110, 2908–2953; Angew. Chem. Int. Ed. 1998, 37, 2754–2794; h) L. L. Kiessling, L. E. Strong, J. E. Gestwicki, Annu. Rep. Med. Chem. 2000, 35, 321–330; i) N. Yamazaki, S. Kojima, N. V. Bovin, S. André, S. Gabius, H.-J. Gabius, Adv. Drug Deliv. Rev. 2000, 43, 225–244; j) B. T. Houseman, M. Mrksich, Top.
Curr. Chem. 2002, 218, 1–44; k) C. L. Schengrund, Biochem. Pharmacol. 2003, 65, 699–707.
[31] a) W. Straus, J. Histochem. Cytochem. 1983, 31, 78–84; b) H.-J. Gabius, R. Engelhardt, K. P. Hellmann, T. Hellmann, A. Ochsenfahrt, Anal. Biochem. 1987, 165, 349–355; c) S. Gabius, K. P. Hellmann, T. Hellmann, U. Brinck, H.-J. Gabius, Anal. Biochem. 1989, 182, 447–451.
[32] a) A. Barkley, P. Arya, Chem. Eur. J. 2001, 7, 555–563; b) S.-I. Nishimura, Curr. Opin. Chem. Biol. 2001, 5, 325–335; c) K. R. Love, P. H. Seeberger, Angew. Chem. 2002, 114, 3733–3736; Angew. Chem. Int. Ed. 2002, 41, 3583–3586; d) L. A. Marcaurelle, P. H. Seeberger, Curr. Opin. Chem. Biol. 2002, 6, 289–296; e) C. O. Mellet, J. M. G. Fernández, ChemBioChem 2002, 3, 819–822; f) O. Ramström, T. Bunyapaiboonsri, S. Lohmann, J.-M. Lehn, Biochim. Biophys. Acta 2002, 1572, 178–186.
[33] N. Sharon, Protein Sci. 1998, 7, 2042–2048.
[34] a) B. B. L. Agrawal, I. J. Goldstein, Biochem. J. 1965, 26, 23c; b) J. H. Pazur, Adv. Carbohydr. Chem. Biochem. 1981, 39, 405–447; c) H.-J. Gabius, Anal. Biochem. 1990, 189, 91–94; d) H.-J. Gabius in Protein Liquid Chromatography (Ed.: M. Kastner), Elsevier, Amsterdam, 2000, pp. 619–638.
[35] a) T. Freier, G. Fleischmann, H. Rüdiger, Biol. Chem. Hoppe-Seyler 1985, 366, 1023–1028; b) H. Rüdiger in Lectins and Glycobiology (Eds.: H.-J. Gabius, S. Gabius), Springer, Heidelberg, 1993, pp. 31–46.
[36] G. Fleischmann, I. Mauder, W. Illert, H. Rüdiger, Biol. Chem. Hoppe-Seyler 1985, 366, 1029–1032.
[37] a) H. Lis, N. Sharon, Annu. Rev. Biochem. 1986, 55, 35–67; b) I. Damjanov, Lab. Invest. 1987, 57, 5–20; c) T. Osawa, T. Tsuji, Annu. Rev. Biochem. 1987, 56, 21–42; d) A. Danguy, F. Akif, B. Pajak, H.-J. Gabius, Histol. Histopathol. 1994, 9, 155–171; e) J. F. Kennedy, P. M. G. Palva, M. T. S. Corella, M. S. M. Cavalcanti, L. C. B. B. Coelho, Carbohydr. Polym. 1995, 26, 219–230; f) R. D. Cummings in Glycosciences: Status and Perspectives (Eds.: H.-J. Gabius, S. Gabius), Chapman and Hall, London, 1997, pp. 191–199; g) W. J. Peumans, E. J. M. van Damme, Crit. Rev. Biochem. Mol. Biol. 1998, 33, 209–259.
[38] a) V. Kinzel, D. Kübler, J. Richards, M. Stöhr in Concanavalin A as a Tool (Eds.: H. Bittiger, H. P. Schnebli), Wiley, London, 1976, pp. 467–478; b) V. Kinzel, D. Kübler, J. Richards, M. Stöhr, Science 1976, 192, 487–489.
[39] a) T. Hajto, K. Hostanska, H.-J. Gabius, Cancer Res. 1989, 49, 4803–4808; b) T. Hajto, K. Hostanska, K. Frei, C. Rordorf, H.-J. Gabius, Cancer Res. 1990, 50, 3322–3326.
[40] a) E. Kunze, H. Schulz, M. Adamek, H.-J. Gabius, J. Cancer Res. Clin. Oncol. 2000, 126, 125–138; b) H.-J. Gabius, F. Darro, M. Remmelink, S. André, J. Kopitz, A. Danguy, S. Gabius, I. Salmon, R. Kiss, Cancer Invest. 2001, 19, 114–126; c) A. V. Timoshenko, Y. Lan, H.-J. Gabius, P. K. Lala, Eur. J. Cancer 2001, 37, 1910–1920; d) S. Gabius, H.-J. Gabius, Dtsch. med. Wschr. 2002, 127, 457–459.
[41] a) S. Chouaib, C. Asselin-Paturel, F. Mami-Chouaib, A. Caignard, J. Y. Blay, Immunol. Today 1997, 18, 493–497; b) J. A. Sogn, Immunity 1998, 9, 757–763; c) H.-J. Gabius, Biochimie 2001, 83, 659–666; d) L. M. Coussens, Z. Werb, Nature 2002, 420, 860–867; e) J. L. Yu, J. W. Rak, Breast Cancer Res. 2003, 5, 83–88.
[42] a) H. Heimpel, Dtsch. med. Wschr. 1995, 16, 205; b) W. Hagenah, I. Dörges, E. Gafumbegete, T. Wagner, Dtsch. med. Wschr. 1998, 123, 1001–1004; c) A. M. M. Eggermont, U. R. Kleeberg, D. J. Ruiter, S. Suciu in ASCO Educational Book (Ed.: M. C. Perry), American Society of Clinical Oncology, Alexandria, VA, USA, 2001, pp. 88–93; d) E. Ernst, K. Schmidt, M. K. Steuer-Vogt, Int. J. Cancer 2003, 107, 262–267.
[43] a) L. M. Brill, C. J. Evans, A. M. Hirsch, Plant J. 2001, 25, 453–461; b) R. Esteban, B. Dopico, F. J. Muñoz, S. Romo, E. Labrador, Physiol. Plant. 2002, 114, 619–626; c) W.-d. Yong, Y.-y. Xu, W.-z. Xu, X. Wang, N. Li, J.-s. Wu, T.-b. Liang, K. Chong, Z.-h. Xu, K.-h. Tan, Z.-q. Zhu, Planta 2003, 217, 261–270.
[44] a) J.-C. Promé, Curr. Opin. Struct. Biol. 1996, 6, 671–678; b) A. M. Hirsch, Curr. Opin. Plant Biol. 1999, 2, 320–326; c) P. Potin, K. Bouarab, F. Küpper, B. Kloareg, Curr. Opin. Microbiol. 1999, 2, 276–283; d) T. Yamaguchi, Y. Ito, N. Shibuya, Trends Glycosci. Glycotechnol. 2000, 12, 113–120; e) P. P. G. van der Holst, H. R. M. Schlaman, H. P. Spaink, Curr. Opin. Struct. Biol. 2001, 11, 608–616.
[45] a) P. Tomme, R. A. J. Warren, N. R. Gilkes, Adv. Microbiol. Physiol. 1995, 37, 1–81; b) C. Khosla, P. B. Harbury, Nature 2001, 409, 247–252; c) B. W. McLean, A. B. Boraston, D. Brouwer, N. Sanaie, C. A. Fyfe, R. A. J. Warren, D. G. Kilburn, C. A. Haynes, J. Biol. Chem. 2002, 277, 50 245–
[46]
[47]
[48]
[49]
[50]
[51]
[52]
[53]
[54]
[55]
[56]
[57]
[58]
[59]
[60]
[61]
50 254; d) S. Thobhani, B. Ember, A. Siriwardena, G.-J. Boons, J. Am.
Chem. Soc. 2003, 125, 7154 ± 7155.
a) J. Wang, J. A. Stuckey, M. J. Wishart, J. E. Dixon, J. Biol. Chem. 2002,
277, 2377 ± 2380; b) S. Ganesh, N. Tsurutani, T. Suzuki, Y. Hishii, T. Ishihara, A. V. Delgado-Escueta, K. Yamakawa, Biochem. Biophys. Res.
Commun. 2004, 313, 1101 ± 1109.
a) Y. S. Kim, J. H. Lee, G. M. Yoon, H. S. Cho, S.-W. Park, M. C. Suh, D.
Choi, H. J. Ha, J. R. Liu, H.-S. Pai, Plant Physiol. 2000, 123, 905 ± 915; b) A.
Barre, C. Hervÿ, B. Lescure, P. Rougÿ, Crit. Rev. Plant Sci. 2002, 21, 379 ±
399; c) M. Nishiguchi, K. Yoshida, T. Sumizono, K. Tazaki, Mol. Genet.
Genomics 2002, 267, 506 ± 514.
a) P. A. Gleeson, Curr. Top. Microbiol. Immunol. 1988, 139, 1 ± 34; b) H.
Ueda, H. Ogawa, Trends Glycosci. Glycotechnol. 1999, 11, 413 ± 428; c) S.
Chen, A. M. Spence, H. Schachter, Trends Glycosci. Glycotechnol. 2001,
13, 447 ± 462; d) I. B. H. Wilson, Curr. Opin. Struct. Biol. 2002, 12, 569 ±
577.
S. Roth, Quart. Rev. Biol. 1973, 48, 541 ± 563.
P. Weiss, Yale J. Biol. Med. 1947, 19, 235 ± 278.
a) R. L. Hudgin, W. E. Pricer, Jr., G. Ashwell, R. J. Stockert, A. G. Morell, J.
Biol. Chem. 1974, 249, 5536 ± 5543; b) V. I. Teichberg, I. Silman, D. D.
Beitsch, G. Resheff, Proc. Natl. Acad. Sci. USA 1975, 72, 1383 ± 1387;
c) T. K. Gartner, K. Stocker, D. C. Williams, FEBS Lett. 1980, 117, 13 ± 16.
a) H. Lis, N. Sharon, Chem. Rev. 1998, 98, 637 ± 674; b) W. J. Peumans, A.
Barre, Q. Hao, P. Rougÿ, E. J. M. van Damme, Trends Glycosci. Glycotechnol. 2000, 12, 83 ± 101; c) R. B. Dodd, K. Drickamer, Glycobiology 2001,
11, 71R ± 79R; d) R. Loris, Biochim. Biophys. Acta 2002, 1572, 198 ± 208.
D. N. W. Cooper, Biochim. Biophys. Acta 2002, 1572, 209 ± 231.
The C. elegans Sequencing Consortium, Science 1998, 282, 2012 ± 2018.
R. O. Hynes, Q. Zhao, J. Cell Biol. 2000, 150, F89 ± F95.
a) J. C. Rogers, S. Kornfeld, Biochem. Biophys. Res. Commun. 1971, 45,
622 ± 629; b) Y. C. Lee, FASEB J. 1992, 6, 3193 ± 3200; c) L. Fiume, C. Busi,
G. Di Stefano, A. Mattioli, Adv. Drug Deliv. Rev. 1994, 14, 51 ± 65;
d) D. K. F. Meijer, G. Molema, Sem. Liver Dis. 1995, 15, 202 ± 256; e) H.-J.
Gabius, Cancer Investig. 1997, 15, 454 ± 464; f) K. G. Rice in Glycosciences:
Status and Perspectives (Eds.: H.-J. Gabius, S. Gabius), Chapman and
Hall, London±Weinheim, 1997, pp. 471 ± 483; g) B. G. Davis, M. A. Robinson, Curr. Opin. Drug Discov. Develop. 2002, 5, 279 ± 288.
a) S. Andrÿ, C. Unverzagt, S. Kojima, X. Dong, C. Fink, K. Kayser, H.-J.
Gabius, Bioconjugate Chem. 1997, 8, 845 ± 855; b) C. Unverzagt, S.
Andrÿ, J. Seifert, S. Kojima, C. Fink, G. Srikrishna, H. Freeze, K. Kayser, H.J. Gabius, J. Med. Chem. 2002, 45, 478 ± 491; c) S. Andrÿ, C. Unverzagt, S.
Kojima, M. Frank, J. Seifert, C. Fink, K. Kayser, C.-W. von der Lieth, H.-J.
Gabius, Eur. J. Biochem. 2004, 271, 118 ± 134.
S. Elliott, T. Lorenzini, S. Asher, K. Aoki, D. Brankow, L. Buck, L. Busse, D.
Chang, J. Fuller, J. Grant, N. Hernday, M. Hokum, S. Hu, A. Knudten, N.
Levin, R. Komorowski, F. Martin, R. Navarro, T. Osslund, G. Rogers, N.
Rogers, G. Trail, J. Egrie, Nat. Biotechnol. 2003, 21, 414 ± 421.
a) O. Seitz, ChemBioChem 2000, 1, 214 ± 246; b) M. Mizuno, Trends Glycosci. Glycotechnol. 2001, 13, 11 ± 30; c) M. J. Grogan, M. R. Pratt, L. A.
Marcaurelle, C. R. Bertozzi, Annu. Rev. Biochem. 2002, 71, 593 ± 634.
a) S. Andrÿ, P. J. Cejas Ortega, M. Alamino Perez, R. Roy, H.-J. Gabius, Glycobiology 1999, 9, 1253 ± 1261; b) S. Andrÿ, B. Frisch, H. Kaltner, D. L.
Desouza, F. Schuber, H.-J. Gabius, Pharmaceut. Res. 2000, 17, 985 ± 990;
c) S. Andrÿ, R. J. Pieters, I. Vrasidas, H. Kaltner, I. Kuwabara, F.-T. Liu,
R. M. J. Liskamp, H.-J. Gabius, ChemBioChem 2001, 2, 822 ± 830; d) I.
Vrasidas, S. Andrÿ, P. Valentini, C. Bˆck, M. Lensch, H. Kaltner, R. M. J. Liskamp, H.-J. Gabius, R. J. Pieters, Org. Biomol. Chem. 2003, 1, 803 ± 810;
e) S. Andrÿ, B. Liu, H.-J. Gabius, R. Roy, Org. Biomol. Chem. 2003, 1,
3909 ± 3916; f) S. Andrÿ, H. Kaltner, T. Furuike, S.-I. Nishimura, H.-J.
Gabius, Bioconjugate Chem. 2004, 15, 87 ± 98.
[61] a) D. W. Ohannesian, R. Lotan in Glycosciences: Status and Perspectives
(Eds.: H.-J. Gabius, S. Gabius), Chapman and Hall, London, 1997,
pp. 459–469; b) S. André, S. Kojima, N. Yamazaki, C. Fink, H. Kaltner, K.
Kayser, H.-J. Gabius, J. Cancer Res. Clin. Oncol. 1999, 125, 461–474; c) H.
Lahm, S. André, A. Höflich, J. R. Fischer, B. Sordat, H. Kaltner, E. Wolf,
H.-J. Gabius, J. Cancer Res. Clin. Oncol. 2001, 127, 375–386; d) A.
Danguy, I. Camby, R. Kiss, Biochim. Biophys. Acta 2002, 1572, 285–293;
e) I. Camby, N. Belot, F. Lefranc, N. Sadeghi, Y. de Launoit, H. Kaltner, S.
Musette, F. Darro, A. Danguy, I. Salmon, H.-J. Gabius, R. Kiss, J. Neuropathol. Exp. Neurol. 2002, 61, 585–596; f) P. Nangia-Makker, J. Conklin, V.
Hogan, A. Raz, Trends Mol. Med. 2002, 8, 187–192; g) N. Nagy, H. Legendre,
O. Engels, S. André, H. Kaltner, K. Wasano, Y. Zick, J.-C. Pector, C.
Decaestecker, H.-J. Gabius, I. Salmon, R. Kiss, Cancer 2003, 97, 1849–1858.
[62] a) N. Ahmad, H.-J. Gabius, H. Kaltner, S. André, I. Kuwabara, F.-T. Liu, S.
Oscarson, T. Norberg, C. F. Brewer, Can. J. Chem. 2002, 80, 1096–1104;
b) J. Hirabayashi, T. Hashidate, Y. Arata, N. Nishi, T. Nakamura, M. Hirashima, T. Urashima, T. Oka, M. Futai, W. E. G. Müller, F. Yagi, K.-i. Kasai, Biochim. Biophys. Acta 2002, 1572, 232–354; c) A. M. Wu, J. H. Wu, M.-S.
Tsai, J.-H. Liu, S. André, K. Wasano, H. Kaltner, H.-J. Gabius, Biochem. J.
2002, 367, 653–664.
[63] H.-J. Gabius, S. Gabius, T. V. Zemlyanukhina, N. V. Bovin, U. Brinck, A.
Danguy, S. S. Joshi, K. Kayser, J. Schottelius, F. Sinowatz, L. F. Tietze, F.
Vidal-Vanaclocha, J.-P. Zanetta, Histol. Histopathol. 1993, 8, 369–383.
[64] a) J. Kopitz, C. von Reitzenstein, S. André, H. Kaltner, J. Uhl, V. Ehemann,
M. Cantz, H.-J. Gabius, J. Biol. Chem. 2001, 276, 35 917–35 923; b) G.
Rappl, H. Abken, J. M. Muche, W. Sterry, W. Tilgen, S. André, H. Kaltner,
S. Ugurel, H.-J. Gabius, U. Reinhold, Leukemia 2002, 16, 840–845; c) L.
Santucci, S. Fiorucci, N. Rubinstein, A. Mencarelli, B. Palazzetti, B. Federici, G. A. Rabinovich, A. Morelli, Gastroenterology 2003, 124, 1381–1394;
d) J. Kopitz, S. André, C. von Reitzenstein, K. Versluis, H. Kaltner, R. J.
Pieters, K. Wasano, I. Kuwabara, F.-T. Liu, M. Cantz, A. J. R. Heck,
H.-J. Gabius, Oncogene 2003, 22, 6277–6288.
[65] J. P. Carver, Pure Appl. Chem. 1993, 65, 763–770.
[66] a) C.-W. von der Lieth, H.-C. Siebert, T. Kozár, M. Burchert, M. Frank, M.
Gilleron, H. Kaltner, G. Kayser, E. Tajkhorshid, N. V. Bovin, J. F. G. Vliegenthart, H.-J. Gabius, Acta Anat. 1998, 161, 91–109; b) R. J. Woods, Glycoconjugate J. 1998, 15, 209–216; c) J. Jiménez-Barbero, J. L. Asensio, F. J.
Cañada, A. Poveda, Curr. Opin. Struct. Biol. 1999, 9, 549–555; d) A. Imberty, S. Pérez, Chem. Rev. 2000, 100, 4567–4588; e) M. R. Wormald, A. J.
Petrescu, Y.-L. Pao, A. Glithero, T. Elliott, R. A. Dwek, Chem. Rev. 2002,
102, 371–386; f) T. Weimar, R. J. Woods in NMR Spectroscopy of Glycoconjugates (Eds.: J. Jiménez-Barbero, T. Peters), Wiley-VCH, Weinheim,
2003, pp. 111–144.
[67] a) H.-C. Siebert, M. Gilleron, H. Kaltner, C.-W. von der Lieth, T. Kozár, N. V.
Bovin, E. Y. Korchagina, J. F. G. Vliegenthart, H.-J. Gabius, Biochem. Biophys. Res. Commun. 1996, 219, 205–212; b) M. Gilleron, H.-C. Siebert, H.
Kaltner, C.-W. von der Lieth, T. Kozár, K. M. Halkes, E. Y. Korchagina, N. V.
Bovin, H.-J. Gabius, J. F. G. Vliegenthart, Eur. J. Biochem. 1998, 252, 416–427.
[68] B. J. Hardy, J. Mol. Struct. 1997, 395–396, 187–200.
[69] a) B. Meyer, T. Peters, Angew. Chem. 2003, 115, 890–918; Angew. Chem.
Int. Ed. 2003, 42, 864–890; b) H.-C. Siebert, J. Jiménez-Barbero, S.
André, H. Kaltner, H.-J. Gabius, Methods Enzymol. 2003, 362, 417–434.
[70] a) J. Jiménez-Barbero, J. F. Espinosa, J. L. Asensio, F. J. Cañada, A.
Poveda, Adv. Carbohydr. Chem. Biochem. 2001, 56, 235–284; b) H.
Yuasa, H. Hashimoto, Trends Glycosci. Glycotechnol. 2001, 13, 31–55.
[71] a) J. F. Espinosa, F. J. Cañada, J. L. Asensio, H. Dietrich, M. Martín-Lomas,
R. R. Schmidt, J. Jiménez-Barbero, Angew. Chem. 1996, 108, 323–326;
Angew. Chem. Int. Ed. 1996, 35, 303–306; b) J. F. Espinosa, F. J. Cañada,
J. L. Asensio, M. Martin-Pastor, H. Dietrich, M. Martín-Lomas, R. R.
Schmidt, J. Jiménez-Barbero, J. Am. Chem. Soc. 1996, 118, 10 862–10 871;
c) J. F. Espinosa, E. Montero, A. Vián, J. L. García, H. Dietrich, R. R.
Schmidt, M. Martín-Lomas, A. Imberty, F. J. Cañada, J. Jiménez-Barbero,
J. Am. Chem. Soc. 1998, 120, 1309–1318; d) J. L. Asensio, J. F. Espinosa,
H. Dietrich, F. J. Cañada, R. R. Schmidt, M. Martín-Lomas, S. André, H.-J.
Gabius, J. Jiménez-Barbero, J. Am. Chem. Soc. 1999, 121, 8995–9000;
e) J. M. Alonso-Plaza, M. A. Canales, M. Jiménez, J. L. Roldán, A. Garcia-Herrero, L. Iturrino, J. L. Asensio, F. J. Cañada, A. Romero, H.-C. Siebert,
S. André, D. Solís, H.-J. Gabius, J. Jiménez-Barbero, Biochim. Biophys.
Acta 2001, 1568, 225–236.
[72] H.-C. Siebert, S. André, S.-Y. Lü, M. Frank, H. Kaltner, J. A. van Kuik, E. Y.
Korchagina, N. V. Bovin, E. Tajkhorshid, R. Kaptein, J. F. G. Vliegenthart,
C.-W. von der Lieth, J. Jiménez-Barbero, J. Kopitz, H.-J. Gabius, Biochemistry 2003, 42, 14 762–14 773.
[73] a) S. K. Das, J.-M. Mallet, J. Esnault, P.-A. Driguez, P. Duchaussoy, P. Sizun,
J.-P. Hérault, J.-M. Herbert, M. Petitou, P. Sinay, Angew. Chem. 2001, 113,
1723–1726; Angew. Chem. Int. Ed. 2001, 40, 1670–1673; b) M. Hricovíni,
M. Guerrini, A. Bisio, G. Torri, A. Naggi, B. Casu, Semin. Thromb. Hemost.
2002, 28, 325–334; c) R. Raman, G. Venkataraman, S. Ernst, V. Sasisekharan, R. Sasisekharan, Proc. Natl. Acad. Sci. USA 2003, 100, 2357–2362.
[74] a) B. Casu, A. Naggi, G. Torri, Semin. Thromb. Hemost. 2002, 28, 335–342;
b) P. Jemth, J. Kreuger, M. Kusche-Gullberg, L. Sturiale, G. Giménez-Gallego, U. Lindahl, J. Biol. Chem. 2002, 277, 30 567–30 573; c) R. Ojeda,
J. Angulo, P. M. Nieto, M. Martín-Lomas, Can. J. Chem. 2002, 80, 917–936.
[75] B. Casu, M. Reggiani, G. G. Gallo, A. Vigevani, Tetrahedron 1966, 22,
3061–3083.
[76] a) H.-C. Siebert, S. André, J. L. Asensio, F. J. Cañada, X. Dong, J.-F. Espinosa, M. Frank, M. Gilleron, H. Kaltner, T. Kozár, N. V. Bovin, C.-W. von der
Lieth, J. F. G. Vliegenthart, J. Jiménez-Barbero, H.-J. Gabius, ChemBioChem 2000, 1, 181–195; b) H.-C. Siebert, M. Frank, C.-W. von der Lieth,
J. Jiménez-Barbero, H.-J. Gabius in NMR Spectroscopy of Glycoconjugates
(Eds.: J. Jiménez-Barbero, T. Peters), Wiley-VCH, Weinheim, 2003,
pp. 39–57.
[77] a) A. M. Klibanov, Nature 2001, 409, 241–246; b) C. Mattos, D. Ringe,
Curr. Opin. Struct. Biol. 2001, 11, 761–764.
[78] L. He, S. André, H.-C. Siebert, H. Helmholz, B. Niemeyer, H.-J. Gabius, Biophys. J. 2003, 85, 511–524.
[79] H.-C. Siebert, S. André, J. F. G. Vliegenthart, H.-J. Gabius, M. J. Minch, J.
Biomol. NMR 2003, 25, 197–215.
[80] a) J. L. Asensio, F. J. Cañada, M. Bruix, A. Rodríguez-Romero, J. Jiménez-Barbero, Eur. J. Biochem. 1995, 230, 621–633; b) J. L. Asensio, F. J.
Cañada, M. Bruix, C. González, N. Khiar, A. Rodríguez-Romero, J. Jiménez-Barbero, Glycobiology 1998, 8, 569–577; c) J. L. Asensio, H.-C.
Siebert, C.-W. von der Lieth, J. Laynes, M. Bruix, U. M. Soedjanaatmadja,
J. J. Beintema, F. J. Cañada, H.-J. Gabius, J. Jiménez-Barbero, Proteins
2000, 40, 218–236; e) J. F. Espinosa, J. L. Asensio, J. L. García, J. Laynez,
M. Bruix, C. Wright, H.-C. Siebert, H.-J. Gabius, F. J. Cañada, J. Jiménez-Barbero, Eur. J. Biochem. 2000, 267, 3965–3978; f) C. A. Bewley, Structure
2001, 9, 931–940; g) M.-s. Sung, K. Fleming, H. A. Cheng, S. Matthews,
EMBO Rep. 2001, 2, 621–627; h) H.-C. Siebert, S.-Y. Lü, M. Frank, J.
Kramer, R. Wechselberger, J. Joosten, S. André, K. Rittenhouse-Olson, R.
Roy, C.-W. von der Lieth, R. Kaptein, J. F. G. Vliegenthart, A. J. R. Heck,
H.-J. Gabius, Biochemistry 2002, 41, 9707–9717.
[81] a) E. F. Hounsell in Glycosciences: Status and Perspectives (Eds.: H.-J.
Gabius, S. Gabius), Chapman and Hall, London, 1997, pp. 15–29; b) H.
Geyer, R. Geyer, Acta Anat. 1998, 161, 18–35.
[82] J. Montreuil in Glycoproteins (Eds.: J. Montreuil, J. F. G. Vliegenthart, H.
Schachter), Elsevier, Amsterdam, 1995, pp. 1–12.
[83] N. Sharon, Acta Anat. 1998, 161, 7–17.
[84] H.-C. Siebert, C.-W. von der Lieth, R. Kaptein, J. J. Beintema, K. Dijkstra,
N. van Nuland, U. M. S. Soedjanaatmadja, A. Rice, J. F. G. Vliegenthart,
C. S. Wright, H.-J. Gabius, Proteins 1997, 28, 268–284.
Received: August 25, 2003