Unusually Acidic Proteins in Biomineralization

Calcium carbonate biominerals are the most abundant mineral on the surface of the Earth. In eukaryotes, all biologically controlled calcium carbonate minerals are associated with a minor organic matrix, which displays several essential functions: crystal nucleation, control of crystal shape, and crystal growth inhibition. In addition, the matrix may be involved in enzymatic functions and may mediate cell–cell and cell–matrix interactions. The matrix is a mixture of proteins, glycoproteins, complex carbohydrates, proteoglycans, glycosaminoglycans and, sometimes, lipids. The biochemical properties of this matrix have been studied in numerous cases. One peculiarity shared by most (if not all) matrices associated with calcium carbonate biominerals is the presence of unusually acidic proteins; very often, these are rich in aspartic acid residues. The nature and biochemical properties of these proteins, their study and their unusual behaviour in solution remain topics of debate. In this chapter, we review our present knowledge of these unusually acidic proteins associated with calcium carbonate biomineralizations in selected eukaryotic phyla.

In the biosphere, eukaryotes are major contributors to the production of calcium carbonate biominerals. These biominerals play a pivotal role, for several reasons [1]: (i) they are a major actor in the carbonate cycle; (ii) they represent a major sink for both calcium and carbon, and thus participate in climate regulation; and (iii) they contribute to maintaining the Earth's homeostasis by buffering the oceans and maintaining them at a reasonable degree of supersaturation.
All calcium carbonate biominerals produced by eukaryotes share a remarkable property. They are all organo-mineral composites, in which the organic phase (the matrix) represents only a small fraction (from <0.1 wt% to a few wt%) of the total biomineral. Today, it is known that this matrix plays essential functions in mineralization [2]. In particular, it acts as a template for calcium carbonate deposition, by favoring the growth of calcium carbonate crystals in privileged directions, and then stopping their growth. The matrix also stabilizes unstable or metastable polymorphs of calcium carbonate (vaterite, aragonite, amorphous calcium carbonate). In addition to these physico-chemical functions, the organic matrix may also be involved in cell-matrix interactions and cell-cell communication. This organic matrix is not homogeneous, but rather is composed of a mixture of different macromolecular components, including proteins, saccharides, glycans, and lipids.
Within the proteinaceous moiety, the key components of the matrix are the unusually acidic proteins. These are central to classical hypotheses on biomineralization, because they are always associated with calcium carbonate biominerals, and because they have been shown to interact strongly with them [3].
The concept that eukaryote calcium carbonate biominerals are organo-mineral composites has long been known. In fact, such reports extend back to the mid-19th century, when Frémy [4] first analyzed conchiolin, the insoluble organic residue of mother-of-pearl. The finding of unusually acidic proteins associated with calcium carbonate biominerals is much more recent, however. It is difficult to determine with precision who was the first to claim that acidic macromolecules were needed to bind calcium ions and nucleate calcium carbonate. The proposal was first made during the early 1960s, and formulated successively by Glimcher [5], Hare [6], Simkiss [7], and Degens and co-workers [8]. Hare, in 1963, had a surprisingly modern view: ''. . . the role of the organic matrix in mineralization is probably to provide a set of highly specific templates which act as the sites for the nucleation of the mineral phase . . . Aspartic and glutamic acid side chains could provide negatively charged sites, which would attract calcium ions.'' In 1967, Degens also said: ''. . . the most essential factor in nucleating a mineral phase appears to be the availability of free carboxyl groups provided by certain acidic amino acids . . .''. What appears certain, however, is that the detection of acidic proteinaceous fractions associated with calcium carbonate biominerals was correlated with the development and extensive use of amino acid analyses of protein hydrolyzates.
An important step was taken almost simultaneously by Meenakshi and co-workers, and by Crenshaw, with the discovery of the EDTA-soluble matrix [9], extracted from molluskan shells. In both cases, this matrix appeared to be singularly enriched in aspartate residues, in comparison to the EDTA-insoluble matrix, which was more hydrophobic. Although Crenshaw believed that most of the aspartate residues were in their amide form, he demonstrated that the soluble matrix was intracrystalline, since it was not degraded by NaOCl treatment of the shell powder. This meant that the soluble acidic matrix is tightly bound to the mineral phase. A few years later, Weiner and Hood, in a key report, showed that the soluble proteins of the molluskan shell matrix were truly aspartic acid-rich and that they could act as a template for crystal nucleation [10]. Consequently, these authors proposed a first model, in which the binding of calcium ions is performed via the negatively charged carboxyl radicals of aspartic acids, in a hypothetical (D-Y)n sequence (where Y can be any amino acid). They explained that the atomic distance between two consecutive calcium ions in the aragonite lattice would match the distance between two consecutive negatively charged radicals of aspartic acid in such a sequence (which implies that one calcium ion is chelated by two consecutive carboxylates). Some years later, Wheeler et al. added another missing piece to the puzzle when they observed that the acidic soluble matrix had the ability to delay the in-vitro precipitation of calcium carbonate [11], an effect which was found to be dose-dependent. At that time, it appeared that the acidic proteins of the shell played two antagonistic roles in calcium carbonate biomineralization: nucleation and inhibition. These two concepts were central to the discipline for the two decades following their discovery.
Nowadays, although still important, these concepts are placed in a more dynamic perspective, as we will see in the following sections.
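The hypothetical (D-Y)n motif of Weiner and Hood lends itself to a simple sequence scan. The following is a minimal sketch only: the function name `find_dy_repeats`, the one-letter sequence input, and the default threshold of three consecutive D-Y pairs are illustrative assumptions, not criteria taken from the original reports.

```python
import re


def find_dy_repeats(sequence, min_repeats=3):
    """Locate runs of at least `min_repeats` consecutive D-Y pairs.

    `sequence` is a one-letter amino acid string; Y stands for any residue.
    The default threshold of 3 repeats is an arbitrary, illustrative choice.
    Returns a list of (start, end, motif) tuples with 0-based, end-exclusive
    coordinates. Matches are non-overlapping and greedy.
    """
    pattern = re.compile(r"(?:D[A-Z]){%d,}" % min_repeats)
    return [(m.start(), m.end(), m.group())
            for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Invented toy sequence, not a real matrix protein.
    print(find_dy_repeats("GDSDGDAK"))
```

Such a scan only flags candidate stretches in a primary structure; it says nothing about whether the carboxylate spacing actually matches the mineral lattice.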

What Makes a Protein Unusually Acidic?
In the light of our introduction, it seems necessary first to define an unusually acidic protein. In the living world, the majority of proteins have an isoelectric point that is either neutral or slightly acidic. The isoelectric point (pI) is the pH value at which a protein has no net electric charge. At a pH below the pI, proteins carry a net positive charge, whereas above the pI their net charge is negative. In the proteome of the bacterium Escherichia coli, more than 90% of the abundant proteins lie in an isoelectric point window of 4 to 7 [12]. Thus, those proteins which exhibit a low pI (below 4 to 4.5) can be considered as ''unusually acidic''. However, such a definition is restrictive, as a protein with a pI around neutrality can exhibit very acidic functional domains, compensated by basic ones.
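The definition of the pI given above can be made concrete with a short calculation. The sketch below estimates the net charge of a sequence from the Henderson-Hasselbalch relation and finds the pH of zero net charge by bisection. The pKa values are approximate textbook figures chosen for illustration (published scales differ), post-translational modifications are ignored, and the function names are hypothetical.

```python
# Approximate side-chain and terminal pKa values (illustrative assumption;
# different published scales give slightly different numbers).
PKA_BASIC = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_ACIDIC = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(sequence, ph):
    """Net charge of an unmodified peptide at a given pH."""
    seq = sequence.upper()
    # Protonated (positively charged) fraction of each basic group.
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    for residue, pka in PKA_BASIC.items():
        positive += seq.count(residue) / (1.0 + 10 ** (ph - pka))
    # Deprotonated (negatively charged) fraction of each acidic group.
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for residue, pka in PKA_ACIDIC.items():
        negative += seq.count(residue) / (1.0 + 10 ** (pka - ph))
    return positive - negative


def isoelectric_point(sequence, tolerance=1e-4):
    """pH at which the net charge is zero, found by bisection on [0, 14]."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: the pI lies above mid
        else:
            high = mid  # negatively charged (or neutral): the pI lies below
    return (low + high) / 2.0
```

Bisection works here because the net charge decreases monotonically with pH. An Asp-rich peptide yields a pI well below 4.5 with these values, whereas a Lys-rich one yields a strongly basic pI.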
There are two ways in which a protein can be ''unusually acidic''. First, in its primary structure, acidic amino acids dominate, and the ratio between acidic and basic residues is largely in favor of the acidic residues. Second, the protein exhibits post-translational modifications, which bring additional negative charges to the peptide core. In the first case, the acidity of a protein is generally determined by the amount of aspartic acid (Asp, or D) and glutamic acid (Glu, or E) residues in the sequence. Aspartic and glutamic acids are the two natural amino acids that exhibit a carboxylic acid group in their side chain (-CH2COOH for Asp, -C2H4COOH for Glu). Their frequencies in ''standard proteins'' are 5.3% and 6.2%, respectively [13]. The other parameter to consider is the ratio between these amino acids and the sum of lysine (Lys, or K), arginine (Arg, or R) and histidine (His, or H): a protein with only a few D or E residues can indeed exhibit a relatively acidic pI if it contains very few basic residues. For example, this is the case for GAMP, a crustacean matrix protein (see Section 16.5.2.5, Arthropods).
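The two sequence-based parameters discussed above (the D and E content, and the ratio of acidic to basic residues) can be tabulated directly from a primary structure. This is a minimal sketch; the function name `acidity_profile` and the returned keys are illustrative choices, and the reference frequencies of 5.3% (Asp) and 6.2% (Glu) are those quoted in the text.

```python
def acidity_profile(sequence):
    """Summarize the acidic and basic residue content of a protein sequence.

    Returns the percentages of Asp, Glu and Asp + Glu, together with the
    ratio of acidic (D, E) to basic (K, R, H) residues. The ratio is
    reported as infinity when the sequence contains no basic residue.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    n_asp, n_glu = seq.count("D"), seq.count("E")
    n_basic = sum(seq.count(residue) for residue in "KRH")
    n_acidic = n_asp + n_glu
    return {
        "asp_pct": 100.0 * n_asp / len(seq),
        "glu_pct": 100.0 * n_glu / len(seq),
        "acidic_pct": 100.0 * n_acidic / len(seq),
        "acidic_to_basic": n_acidic / n_basic if n_basic else float("inf"),
    }


if __name__ == "__main__":
    # Invented toy sequence, not a real matrix protein.
    profile = acidity_profile("DDEEKRGSGS")
    print(profile)  # compare asp_pct and glu_pct with 5.3% and 6.2%
```

A protein can score modestly on `acidic_pct` yet still have a high `acidic_to_basic` ratio, which is the situation described for GAMP.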
Post-translational modifications represent a second way to increase the polyanionic characteristic of a protein. In the case of proteins associated with biominerals, the most frequent modifications are glycosylation, phosphorylation, sulfation, and carboxylation of glutamic acid [14]. Glycosylation is the enzymatic addition of a saccharide moiety to a protein core [15]. Two types of glycosylation exist: N-linked glycosylation to the amide nitrogen of asparagine side chain, and O-linked glycosylation to the hydroxyl group of serine or threonine side chain. The first type is found for example in the molluskan shell protein dermatopontin [16], whereas the second type is suspected to be the main glycosylation type of mucoperlin, another shell protein [17]. Although not all glycosylations modify the net charge of a protein, some do, such as the addition of sialic acids (a family of nine-carbon, negatively charged monosaccharides), the addition of polysaccharides composed of acidic sugars such as glucuronic acid (e.g., hyaluronic acids), or the addition of saccharides which exhibit a terminal sulfate group.
The second important post-translational modification is phosphorylation, which represents the addition of a phosphate group (PO4) on to serine (the most frequent case), threonine, or, more rarely, on to tyrosine [15]. Phosphorylations are catalyzed by protein kinases, an extremely diversified family of enzymes. Phosphorylations are extremely important in biomineralization systems and fulfill important functions. Phosphate groups are thought to bind calcium ions in cooperation with carboxylic groups [18]. For example, phosphophoryn, which is found in the dentin matrix, is highly phosphorylated, and the phosphate groups are important for its function as a mediator of dentin biomineralization [19]. RP-1, a molluskan shell protein, is a potent inhibitor of calcium carbonate precipitation in vitro, but this effect is completely dependent on phosphate groups, as the dephosphorylation of RP-1 results in a complete loss of any inhibitory effect [20]. Orchestin, a protein associated with calcium storage concretions in a terrestrial crustacean, is a calcium-binding protein. Here again, this ability, which is conveyed by phosphorylated serine residues, is lost when serine is dephosphorylated [21].
Another post-translational modification that can increase the negative charge of a protein involved in biomineralization is sulfation, that is, the addition of a sulfate group to the side chain hydroxyl group of a tyrosine residue. Several extracellular matrix proteins exhibit sulfated tyrosine residues [22]. In biomineralization research, however, this post-translational modification is poorly documented, although its existence is suspected on the basis of computer-based analyses of the primary structures of several biomineralizing proteins [23].
Finally, the carboxylation of glutamic acid residues leads to the formation of γ-carboxyglutamic acid (abbreviated as Gla). Proteins containing Gla residues, such as osteocalcin and matrix Gla protein, are important constituents of calcium phosphate biominerals, such as bone [24]. However, to our knowledge, Gla has not yet been detected in proteins associated with calcium carbonate biominerals.

Biochemical Techniques for Studying Unusually Acidic Proteins
Because unusually acidic proteins are strongly bound to the mineral phase, they are usually released by dissolving the calcium carbonate powder, with concentrated EDTA (directly in solution or by dialysis against EDTA) [10,25], with dilute acetic acid [26], or more rarely with hydrochloric acid [27] or formic acid [28]. EDTA functions at neutral pH and therefore does not denature matrix proteins. On the other hand, EDTA forms aggregates which are difficult to remove, even by extensive dialysis [29]. Thus, when not completely removed, EDTA can dramatically interfere with subsequent investigations such as inhibition tests or the in-vitro growth of calcite. In this regard, our preference is to use dilute (5% v/v) cold (4 °C) acetic acid, which is progressively added to the decalcifying solution using an automatic titrimeter, such that the pH never falls below 4. Another interesting possibility is the soft demineralization of calcium carbonate powder on a cation-exchange resin [30]. Extraction of the matrix with bi-distilled water, as has been performed in a few cases [31], is insufficient to extract acidic proteins.
Once solubilized, the acidic proteins, which are considered to be major components of the matrix soluble fraction [10], are then separated from the insoluble fraction by centrifugation. They are subsequently concentrated and purified from the mineral ions, by combining ultrafiltration and dialysis. When they are free from salts, they can be analyzed by using different biochemical methods, in order to obtain structural information. Usually, because these proteins do not have a globular shape, gel permeation chromatography is not very effective in resolving discrete molecules. To date, the most relevant fractionation techniques have been a combination of high-performance liquid chromatography (HPLC) and ion-exchange chromatography [32], or polyacrylamide gel electrophoresis under denaturing conditions (SDS-PAGE) [33]. Even when the fractionation conditions are optimized, unusually acidic proteins have a tendency to smear, except when extracted from amorphous calcium carbonate (ACC) structures [34] or from eggshells [35].
The analysis of acidic proteins by gel electrophoresis requires some precautions to be taken. First, denaturing conditions must be used [36] in order to dissociate entirely the macromolecular complexes formed in the matrix. Second, acidic proteins do not stain easily with Coomassie Brilliant Blue (CBB), because they are often poor in, and sometimes devoid of, any aromatic and basic amino acids, with which the anionic blue form of CBB reacts exclusively [37]. Classical silver staining can be employed [38], although, despite the high sensitivity of this method, ''negative staining'' patterns of acidic proteins are often observed [19a]. Furthermore, because of their charge, acidic proteins tend to diffuse quickly out of the gel. This obstacle can be overcome by a double fixation of the gel after the electrophoresis, and a modification of the staining procedure, which visualizes all of the acidic macromolecules [39]. Other staining procedures can be used for visualizing highly acidic proteins, in particular Alcian blue [40] or carbocyanine (Stains-all) [41]. Alcian blue stains polyanionic macromolecules (proteins, glycoproteins, proteoglycans, acidic polysaccharides) blue and seems well suited to acidic proteins, but is not very sensitive. Carbocyanine dye stains polyanionic proteins (e.g., sialoglycoproteins, phosphoproteins, calcium-binding proteins) blue, and other proteins red. One peculiarity of the Stains-all staining is that calcium-binding proteins generally exhibit a distinctive metachromatic blue color [21,42]. However, in some cases, polyanionic molecules such as calcium-binding proteins stain purple with carbocyanine [42]. Our own experience has shown that some unusually acidic proteins of the shell of the bivalve Pinna nobilis also stain purple with carbocyanine [43]. However, this staining is not very sensitive and fades rapidly when exposed to light.
Modifications have been made to improve the sensitivity and stability of carbocyanine staining by combining Stains-all and silver staining [44]. In addition to carbocyanine, ruthenium red, which was initially used as a histochemical dye for acidic glycosaminoglycans, specifically stains calcium-binding proteins red [45], and may consequently be used to detect such proteins among matrix acidic components.
One peculiarity observed for highly acidic proteins is their behavior in SDS-PAGE: they migrate at an apparent molecular mass higher than the molecular mass deduced from their sequence or evaluated by mass spectrometry. This is the case for nacrein, a molluskan shell protein (see Section 16.5.2.6), and for GAMP and orchestin (two proteins of crustacean calcium storage structures; see Section 16.5.2.5). Although this discrepancy can be attributed to the putative presence of post-translational modifications, another explanation lies in the presence of strongly biased amino acid domains which are known to bind SDS poorly [46]. Among these domains are those rich in acidic amino acids (e.g., Asp-rich).
Following electrophoresis, unusually acidic proteins can be blotted onto nitrocellulose membranes, and studied for their ability to bind radioactive calcium ions [47]. This test has been applied successfully in a number of cases [21,48], although some acidic proteins associated with calcium carbonate biominerals do not bind at all (because they are unable to renature during electro-transfer on the membrane [49]), or they bind calcium very poorly. The calcium overlay test developed by Maruyama and co-workers was primarily adapted to high-affinity, low-capacity calcium-binding proteins, that is, proteins with canonical calcium-binding domains, such as the EF-hand [50]. Several extracellular calcium-binding proteins do not exhibit EF-hand motifs or other canonical calcium-binding domains, and bind calcium only with a low affinity [51]. Because unusually acidic proteins associated with calcium carbonate biominerals exhibit domains rich in glutamate or aspartate when they are calcium-binding, chelation of the calcium ions (sometimes in cooperation with other anionic groups such as phosphate or sulfate) is performed with a high capacity and a low affinity. This type of affinity is compatible with the nucleation process, which requires a reversible binding of calcium ions [52]. Our own practical experience has shown that the Maruyama test must be adapted in some cases for such proteins, for example by reducing the time for rinsing the membranes, and by increasing the time of film exposure.
Other characterizations of unusually acidic proteins blotted onto nitrocellulose membranes include studies of their putative post-translational modifications. For example, phosphorylations can be studied with commercially available antibodies raised against phosphoserine, phosphotyrosine or phosphothreonine, or by performing either a complete (e.g., by using the lambda protein phosphatase) or specific (using Ser/Thr- or Tyr-protein phosphatase) enzymatic cleavage. Another approach consists, after in-vivo labeling of phosphoproteins with radioactive phosphate, of identifying the labeled amino acids. Following the in-vivo injection of 32Pi, the whole organic matrix proteins are classically extracted, then submitted to an acidic hydrolysis followed by a separation by thin-layer chromatography. The radiolabeled amino acids are revealed by autoradiography and identified by comparison with standard phosphoamino acids [21].
Glycosylation studies can be approached via analysis with lectins (carbohydrate-binding proteins), via chemical treatment using trifluoromethanesulfonic acid (TFMS) followed by high-performance anion-exchange chromatography, or via enzymatic deglycosylations (with a mixture of glycosidases or with single specific glycosidases) [43].
During the past 20 years, a few functional tests have been developed for monitoring the effect of unusually acidic proteins on the crystallization of calcium carbonate. For the in-vitro inhibition test, known amounts of an acidic protein are added to a solution of sodium bicarbonate (10 mM) to which a solution of calcium chloride is quickly added [25,53]. Variations of the pH values indicate whether the spontaneous precipitation of calcium carbonate (calcite) occurs normally, or whether the process is delayed. If the latter, the delay corresponds to the inhibiting effect of the tested acidic protein. A variant of this test consists of maintaining a constant pH (the pH stat test) by adding the required volume of sodium hydroxide via an automatic titrimeter [11]. The volume added is inversely proportional to the inhibiting capacity of the tested molecule. Another semiquantitative variant of this test consists of measuring in a Petri dish the inhibition zone caused by acidic proteins on calcium chloride-containing agarose hydrogel immersed in sodium bicarbonate solution [54]. The second test, the ''interference test'', consists of growing calcium carbonate crystals, by slow diffusion of ammonium bicarbonate vapors into a solution of calcium chloride, to which known amounts of acidic proteins are added [27]. The crystals are then observed by scanning electron microscopy. In ''blank'' tests, most of the crystals have the typical rhombohedral shape. Interfering acidic proteins can drastically modify the shape of the crystals, by provoking the development of new crystal faces, or by promoting the appearance of polycrystalline aggregates. The effect is also dose-dependent. Several acidic proteins associated with calcium carbonate biominerals are known to exert an effect at concentrations as low as 1 μg mL⁻¹.
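The reading of the in-vitro inhibition test can be sketched numerically. The example below assumes that the onset of precipitation appears as a fall in pH relative to the starting value, and it measures the delay of a protein-containing run against a blank run recorded at the same time points. The 0.05 pH-unit threshold, the function names and the toy traces are illustrative assumptions, not part of any published protocol.

```python
def onset_time(times, ph_values, drop=0.05):
    """First time at which pH has fallen by at least `drop` from its start.

    Returns None if no such fall is seen within the recording.
    The 0.05 pH-unit threshold is an arbitrary, illustrative choice.
    """
    start = ph_values[0]
    for t, ph in zip(times, ph_values):
        if start - ph >= drop:
            return t
    return None


def inhibition_delay(times, ph_blank, ph_protein, drop=0.05):
    """Delay of precipitation onset in the protein run relative to the blank.

    Returns None when no precipitation is detected in the protein run,
    i.e. when the inhibition outlasts the recording.
    """
    t_blank = onset_time(times, ph_blank, drop)
    if t_blank is None:
        raise ValueError("no precipitation detected in the blank run")
    t_protein = onset_time(times, ph_protein, drop)
    if t_protein is None:
        return None
    return t_protein - t_blank


if __name__ == "__main__":
    # Invented toy traces (time in minutes), for illustration only.
    minutes = [0, 1, 2, 3, 4, 5]
    blank = [8.70, 8.70, 8.60, 8.50, 8.45, 8.40]
    with_protein = [8.70, 8.70, 8.70, 8.70, 8.62, 8.50]
    print(inhibition_delay(minutes, blank, with_protein))
```

Repeating the measurement at several protein concentrations would give the dose dependence mentioned in the text; a real analysis would also need to deal with electrode noise and drift.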

Interactions of Acidic Proteins with Calcium Carbonate Crystals and Organo-Mineral Models
At the nanoscale and molecular scale, the problem of the interactions of acidic proteins with calcium carbonate biominerals has been addressed by a variety of complementary techniques, including classical scanning electron microscopy (SEM) observations, cryo-transmission electron microscopy (TEM), X-ray diffraction, atomic force microscopy (AFM), computer-based molecular modeling, crystal growth experiments in the presence of acidic protein followed by SEM observations, and circular dichroism (CD) and NMR structural analyses of synthetic peptides. Much of our knowledge has been derived from studies conducted by Weiner, Addadi, Aizenberg and co-workers [27,55], the group of Stephen Mann [56], Wierzbicki, Sikes and co-workers [57], the group of De Yoreo [58], DeOliveira and Laursen [59], the group of John Evans [60], and Valiyaveettil and co-workers [61]. In a number of studies, the molluskan shell or sea urchin spines were used as a model system. Briefly, different modes of interactions of unusually acidic macromolecules with calcium carbonate crystals can be distinguished. Acidic proteins can induce nucleation, adsorb specifically onto some crystal faces, and/or intercalate in a controlled manner into the crystal lattice. It has also been suggested that, in some cases, they stabilize amorphous calcium carbonate.
As noted above, the consecutive discoveries of Weiner and Hood [10] and of Wheeler and co-workers [11] led to the idea that acidic proteins play two antagonistic roles, depending on their state. Thus, when adsorbed onto an organic insoluble template, the acidic proteins promote crystal nucleation, but when they are free in solution they play an opposite role, by inhibiting crystal growth. In the initial nucleation model of Weiner et al. [62], which was developed from molluskan shell biomineralization, the acidic (Asp-rich) proteins were bound to a ''sole'' of hydrophobic silk-fibroin-like proteins (adopting the antiparallel β-sheet conformation). These insoluble proteins were in turn attached to a chitin core. In this ''sandwich'' model, the carboxylate functions of the side chains of aspartic acid residues were accessible to calcium ions. Sulfated polysaccharides, which supposedly were bound to the acidic protein core, could also contribute by attracting and concentrating calcium in the vicinity of the soluble template [55a]. The crystals then could grow on top of the polyanionic layer. As described, the model was related to hetero-epitaxy.
The initial molluskan biomineralization model presented above has undergone drastic evolution during the past few years, for two reasons. First, recent cryo-TEM observations of nacre samples have brought to light a completely different organization of organo-mineral assembly [63]. Second, very recently Cölfen and co-workers discovered that each nacre tablet in the nacreous shell layer of Haliotis laevigata is surrounded by a thin layer of ACC [64]. With regard to the first aspect, Levi-Kalisman and co-workers proposed a molluskan nacre model where β-chitin provides the insoluble organic framework, in which the hydrophobic (Gly/Ala-rich) proteins (the silk-fibroin-like proteins) form a gel. The acidic proteins are thought to be clustered at the interface between the framework and the gel, and also to be entrapped within the gel. Aragonite tablets nucleate within the gel, and push away the gel as they expand. The second aspect concerns the formation of transient amorphous calcium carbonate. The fact that the elaboration of any crystalline calcium carbonate biomineral (calcite or aragonite) could be preceded by a preliminary step, during which ACC forms, seems to be a general phenomenon which is shared by mollusks, crustaceans, and echinoderms [65]. This places the role of acidic proteins in a more general context, where they cooperate with other components of the matrix, in a highly controlled sequence [66]. First, the framework (chitin) is built, after which the initial mineral granules (ACC) are secreted (they may be formed intracellularly and exported to the site of crystallization). The role of acidic proteins, at this stage, remains obscure: it is unclear whether they stabilize the ACC for a short period, during the transit of amorphous precursor granules from the cell to the site of mineralization, or whether they provide the inhibiting micro-environment between the secreting cells and the location of the mineral assembly process.
The third stage is nucleation of the crystals (aragonite tablets), driven by acidic proteins, in cooperation with sulfate groups. The final stage is growth of the formed crystals, and completion of the mineralizing cycle.
Because acidic proteins exhibit several anionic sites, they can interact strongly with calcium carbonate crystals. The crystal-binding properties imply that there is a molecular recognition between acidic proteins and the mineral surfaces. This recognition involves both the primary (Asp-rich domains) and the secondary structure (β-sheet) of acidic proteins. In numerous cases, it has been shown that negatively charged polypeptides (e.g., poly-aspartic acid) are extremely effective inhibitors of calcium carbonate growth [67], as they are much more potent inhibitors than small mineral ions (Mg²⁺) or small organic molecules such as free aspartic acid. Their effect is reinforced when they are associated with a hydrophobic domain [67]. One mechanism involved is that they are adsorbed onto crystal nuclei, and ''poison'' them. When the acidic proteins cover all the faces of small growing nuclei, the crystals can no longer grow and the inhibition is complete. When acidic proteins are adsorbed onto specific faces, the crystals grow in privileged directions [55a] and the final crystal shapes are different from those of calcium carbonate crystals synthesized in a purely chemical manner.
The controlled intercalation of acidic proteins into the crystal lattice of calcite or aragonite serves as a general mechanism, and explains, for example, why the sea urchin calcitic spine has a better resistance to fracture than pure calcite. In particular, the spine does not break along the {104} plane of calcite, which is the most frequent cleavage plane. The intercalation of acidic glycoproteins along crystal planes that are oblique to the {104} plane modifies the mechanical properties of the spine. Consequently, this latter crystallographic plane is not privileged when mechanical constraints are applied. The intercalation is controlled and differential, according to the type of acidic proteins. This has been demonstrated in two cases. First, partly purified proteins from the spines of the sea urchin Paracentrotus lividus were found to interact only with faces roughly parallel to the c crystallographic axis of calcite. Second, the soluble matrix of the calcitic prisms of Atrina rigida, a bivalve, can be divided into two populations of proteins: the most acidic interacting with the {001} set of faces, and the less acidic with the {01l} set of faces [27, 55h]. Computer simulations and AFM observations performed with synthetic poly-Asp peptides show that they bind {110} faces of calcite, and that they stretch in parallel rows in a direction parallel to the c crystallographic axis [68].
Recently, another effect of acidic proteins in calcite biominerals was detected. Pokroy and co-workers [69], by measuring the lattice parameters of different biogenic molluskan calcite biominerals, observed a slight lattice distortion (about 2 × 10⁻³), after taking into account the required corrections for magnesium and sulfur. This distortion is not isotropic, but rather is maximal along the c axis. It was further demonstrated that calcite crystals grown in the presence of one acidic protein, caspartin, exhibit a similar lattice distortion. Although this effect can only be measured with an extremely precise X-ray diffractometer, it is clearly significant.

Occurrence of Unusually Acidic Proteins in Selected Metazoan CaCO3-Mineralizing Phyla
So far, unusually acidic proteins are a feature seen to be shared by most, if not all, calcium carbonate-mineralizing phyla. However, it must be borne in mind that for some eukaryotic phyla, our knowledge is extremely limited: for example, we know virtually nothing of the molecular aspects of the biomineralization of the bryozoans, a group of colonial animals that, in geological times, has had considerable importance as reef-builders. The situation is similar for several calcifying green or red algae. In contrast, a wealth of data is becoming available for mollusks, crustaceans, echinoderms, and vertebrates. In between these situations, the foraminifera, calcifying sponges, brachiopods, and urochordates (tunicates) have also been studied, though much of our knowledge is based on a limited number of amino acid compositions of bulk matrices or, in the best cases, the amino acid compositions of purified proteins. Cnidarians represent a phylum for which several amino acid compositions of bulk skeletal matrices were determined during the 1970s and 1980s, and for which sequence data on skeletal proteins have just begun to be published.
Here, we have deliberately chosen to assemble into three tables the data on unusually acidic proteins characterized from three different mineralizing systems, for which several protein sequences have been published and are available in protein databases. These comprise the molluskan shell (Table 16.1), the cuticle and calcium storage structures of crustaceans (Table 16.2), and the eggshell (Table 16.3). We selected only those proteins which exhibit a theoretical pI below 6. For mollusks, approximately 22 protein sequences are known, of which 14 are acidic and four can be considered as extremely acidic (pI < 4). For crustaceans, 28 complete cuticular protein sequences (plus nine partial sequences) have been retrieved. In addition, two complete protein sequences of calcium storage structures have been determined. Of these 30 sequences, 21 can be considered as unusually acidic and are consequently presented in Table 16.2. For the eggshell, 17 sequences are known, but only nine are acidic. There is no doubt that, in the near future, these three lists will be considerably extended, in particular because several genome sequencing projects are currently under way (see www.genomesonline.org), and will soon provide a wealth of information on the organization (introns and exons) and the location of each gene involved in calcification. This is already the case for the sea urchin Strongylocentrotus purpuratus, for the edible mussel Mytilus californianus, and the lobster Homarus americanus.
Note to Table 16.1: The pI and D + E percentage values are calculated from each protein primary sequence, retrieved from the Swiss-Prot/TrEMBL database, after removal of the signal peptide where one could be predicted. This does not take into account the putative post-translational modifications, which could occur in vivo and modify the pI of the mature proteins. Only completely sequenced proteins, the pI of which is <6, have been listed. These were retrieved from bivalves (BIVALVIA) and gastropods (GAS).
Note to Table 16.2: The pI and D + E percentage values are calculated from each protein primary sequence, retrieved from the Swiss-Prot/TrEMBL database, after removal of the signal peptide where one could be predicted. This does not take into account the putative post-translational modifications, which could occur in vivo and modify the pI of the mature proteins. Only completely sequenced proteins, the pI of which is <6, have been listed. These are essentially components of the cuticle matrix, except for the final two, which are from calcium storage structures.
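The two quantities used as selection criteria for the tables, the D + E percentage and the theoretical pI, can both be derived from a primary sequence alone. The following is a minimal sketch of such a calculation, not the procedure actually used to build the tables. It assumes that the signal peptide has already been removed, it ignores post-translational modifications, and it relies on an illustrative set of pKa values (different published tables can shift the result by several tenths of a pH unit). The function names are hypothetical.

```python
# Illustrative pKa values (an assumption: published tables differ, and
# database tools may use their own set, giving slightly different pI values).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def acidic_percentage(sequence):
    """Return the percentage of Asp (D) plus Glu (E) residues."""
    seq = sequence.upper()
    return 100.0 * (seq.count("D") + seq.count("E")) / len(seq)


def net_charge(sequence, ph):
    """Net charge at a given pH, from the Henderson-Hasselbalch equation."""
    seq = sequence.upper()
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))   # free N-terminus
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))  # free C-terminus
    for residue, pka in PKA_POSITIVE.items():
        charge += seq.count(residue) / (1.0 + 10 ** (ph - pka))
    for residue, pka in PKA_NEGATIVE.items():
        charge -= seq.count(residue) / (1.0 + 10 ** (pka - ph))
    return charge


def theoretical_pi(sequence, tolerance=1e-4):
    """Find the pH of zero net charge by bisection on the interval [0, 14]."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid  # negatively charged or neutral: pI lies at lower pH
    return (low + high) / 2.0


def classify(pi):
    """Thresholds from the text: pI < 6 is acidic, pI < 4 extremely acidic."""
    if pi < 4.0:
        return "extremely acidic"
    if pi < 6.0:
        return "acidic"
    return "not acidic"
```

Applied to a mature sequence, `classify(theoretical_pi(sequence))` reproduces the cut-offs used above. With these thresholds, a protein with a pI of 4.11 is classed as acidic but not extremely acidic.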

Concluding Remarks
This review of unusually acidic proteins involved in biomineralization calls for a few remarks.
First, most of our present knowledge is restricted to a limited number of species, namely two coccolithophorid algae, a few foraminifera, four cnidarians, eight mollusks (four bivalve species, four gastropod species), seven arthropods, and five echinoderms. The question thus remains as to whether this sampling is sufficient to provide a good representation of the acidic proteins associated with calcium carbonate biominerals. The answer is probably not, considering the huge size of some phyla (mollusks and arthropods) and the diversity of skeletal textures (corals and mollusks). Most likely, we have a very partial view of the diversity of all unusually acidic proteins involved in biomineralization, and are just "discovering the tip of the iceberg".
Second, the acidity of unusually acidic proteins is often due to aspartic acid, rather than to glutamic acid. Although there are several exceptions to this (e.g., some crustacean proteins), there is a clear tendency for biological systems to select aspartic acid, a remarkable preference which to date is totally unexplained.
Note to Table 16.3: The pI and D + E percentage values are calculated from each protein primary sequence, retrieved from the Swiss-Prot/TrEMBL database, after removal of the signal peptide where one could be predicted. This does not take into account the putative post-translational modifications, which could occur in vivo and modify the pI of the mature proteins. Only completely sequenced proteins, the pI of which is <6, have been listed. All protein sequences were obtained from avian species (except pelovaterin, from a turtle).
Third, not all of the known skeletal proteins are acidic when considering their primary structure. For example, in the case of mollusks about one-fifth of the proteins are unusually acidic, while the remainder are acidic, neutral, or even basic. This is even more striking in the case of sea urchins, where most of the spicule proteins exhibit a (theoretical) neutral or basic pI. One aspect to consider is, of course, post-translational modifications, which can drastically modify the properties of biomineralization-associated proteins and increase their polyanionic properties.
Fourth, most of the unusually acidic proteins belong to the soluble fraction. However, some borderline cases exist, such as the molluskan shell protein MSI31 [73] or the crustacean GAMP [87]. MSI31 exhibits hydrophobic domains typical of insoluble framework proteins, in addition to acidic domains, which are characteristic of soluble shell proteins. GAMP is an insoluble protein, rich in glutamine residues (20%), with an acidic pI (4.11) and glutamic acid residues (9%) dispersed throughout its N-terminal half. These counter-examples make the classic dichotomy between acidic soluble proteins and hydrophobic insoluble proteins less pertinent.
This brings us to the final remark. Many of the unusually acidic proteins and, more generally, of the proteins associated with calcium carbonate biominerals, are modular, with each module corresponding to a functional domain. As a consequence, most of these proteins must be multifunctional, though how they function is still beyond our comprehension. Clearly, a major effort must be made to understand the function(s) of these proteins, and in this respect the morpholino approach, as used in the otolith system [95], may provide some promising results.