A Novel Gradient Accumulation Method for Calibration of Named Entity Recognition Models
Abstract
The adoption of deep learning models has brought significant performance improvements across several research fields, such as computer vision and natural language processing. However, their "black-box" nature comes with the downside of poor explainability: in particular, several real-world applications require, to varying extents, reliable confidence scores associated with a model's predictions. The relation between a model's accuracy and its confidence is typically referred to as calibration. In this work, we propose a novel calibration method based on gradient accumulation in conjunction with existing loss regularization techniques. Our experiments on the Named Entity Recognition task show an improved performance/calibration ratio compared to current methods.
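To make the notion of calibration concrete, the following minimal sketch computes the expected calibration error (ECE), a commonly used measure of the gap between confidence and accuracy. The abstract does not say which metric the authors use, so the choice of ECE, the function name `expected_calibration_error`, and the equal-width binning with `n_bins=10` are all assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Illustrative ECE: weighted mean of |confidence - accuracy| over bins.

    confidences: predicted confidence in [0, 1] for each prediction.
    correct:     booleans, True where the prediction was right.
    n_bins:      number of equal-width confidence bins (assumed value).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Group predictions into equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_confidence = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Each bin contributes in proportion to the share of predictions it holds.
        ece += (len(members) / total) * abs(mean_confidence - accuracy)
    return ece


# Four predictions made at 90% confidence, of which three are correct.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                           [True, True, True, False])
```

Under this definition a perfectly calibrated model scores 0, and larger values mean that the reported confidences diverge further from the observed accuracy.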
Domains
Computer Science [cs]
Origin: Files produced by the author(s)