Journal article, Ecological Applications, 2011

Selecting statistical models and variable combinations for optimal classification using otolith microchemistry

Olivier Bruguier
Rita P. Vasconcelos
Henrique Cabral
Maria J. Costa
Monica Lara
David L. Jones
David Mouillot

Abstract

Reliable assessment of fish origin is of critical importance for exploited species, since nursery areas must be identified and protected to maintain recruitment to the adult stock. During the last two decades, otolith chemical signatures (or "fingerprints") have been increasingly used as tools to discriminate between coastal habitats. However, correct assessment of fish origin from otolith fingerprints depends on various environmental and methodological parameters, including the choice of the statistical method used to assign fish of unknown origin. Among the available methods of classification, Linear Discriminant Analysis (LDA) is the most frequently used, although it assumes data are multivariate normal with homogeneous within-group dispersions, conditions that are not always met by otolith chemical data, even after transformation. Other less constrained classification methods are available, but there is a current lack of comparative analysis in applications to otolith microchemistry. Here, we assessed stock identification accuracy for four classification methods (LDA, Quadratic Discriminant Analysis [QDA], Random Forests [RF], and Artificial Neural Networks [ANN]), through the use of three distinct data sets. In each case, all possible combinations of chemical elements were examined to identify the elements to be used for optimal accuracy in fish assignment to their actual origin. Our study shows that accuracy differs according to the model and the number of elements considered. Best combinations did not include all the elements measured, and it was not possible to define an ad hoc multielement combination for accurate site discrimination. Among all the models tested, RF and ANN performed best, especially for complex data sets (e.g., with numerous fish species and/or chemical elements involved). However, for these data, RF was less time-consuming and more interpretable than ANN, and far more efficient and less demanding in terms of assumptions than LDA or QDA. Therefore, when LDA and QDA assumptions cannot be met, the use of machine learning methods, such as RF, should be preferred for stock assessment and nursery identification based on otolith microchemistry, especially when data sets include multispecific otolith signatures and/or many chemical elements.
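
As a rough illustration of the exhaustive search over element combinations described in the abstract, the sketch below scores every non-empty subset of elements by leave-one-out classification accuracy. It is not the authors' code. A simple nearest-centroid classifier stands in for LDA, QDA, RF, and ANN so that the example needs only the Python standard library, and the element names, site labels, and data are synthetic assumptions made up for the example.

```python
from itertools import combinations
import random


def nearest_centroid_predict(train_X, train_y, x):
    """Assign x to the site whose mean signature is closest (squared Euclidean)."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        dist = sum((acc[j] / counts[label] - x[j]) ** 2 for j in range(len(x)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y, cols):
    """Leave-one-out accuracy using only the element columns listed in cols."""
    sub = [[row[c] for c in cols] for row in X]
    hits = 0
    for i in range(len(sub)):
        train_X = sub[:i] + sub[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, sub[i]) == y[i]
    return hits / len(sub)


def rank_element_subsets(X, y, elements):
    """Score every non-empty combination of elements, best accuracy first."""
    results = []
    for k in range(1, len(elements) + 1):
        for cols in combinations(range(len(elements)), k):
            names = tuple(elements[c] for c in cols)
            results.append((loo_accuracy(X, y, cols), names))
    # Ties are broken in favour of the smaller subset.
    results.sort(key=lambda r: (-r[0], len(r[1])))
    return results


# Synthetic toy data (assumption): Sr and Ba separate the sites, Mg and Mn are noise.
rng = random.Random(0)
elements = ["Sr", "Ba", "Mg", "Mn"]
site_means = {"site_A": (0.0, 0.0), "site_B": (4.0, 0.0), "site_C": (0.0, 4.0)}
X, y = [], []
for site, (sr, ba) in site_means.items():
    for _ in range(20):
        X.append([rng.gauss(sr, 1.0), rng.gauss(ba, 1.0),
                  rng.gauss(0.0, 3.0), rng.gauss(0.0, 3.0)])
        y.append(site)

results = rank_element_subsets(X, y, elements)
for accuracy, names in results[:3]:
    print(f"{accuracy:.2f}  {'+'.join(names)}")
```

With n elements the search covers 2^n - 1 subsets, so the cost doubles with each added element. The same loop could wrap any classifier that exposes a fit-and-predict step.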
No file deposited

Dates and versions

hal-00644549, version 1 (24-11-2011)

Identifiers

HAL Id: hal-00644549
DOI: 10.1890/09-1887.1

Cite

Leny Mercier, Audrey M. Darnaude, Olivier Bruguier, Rita P. Vasconcelos, Henrique Cabral, et al. Selecting statistical models and variable combinations for optimal classification using otolith microchemistry. Ecological Applications, 2011, 21, pp.1352-1364. ⟨10.1890/09-1887.1⟩. ⟨hal-00644549⟩