Using automatic speech recognition to predict aided speech-in-noise intelligibility - Archive ouverte HAL
Conference paper, Year: 2020

Using automatic speech recognition to predict aided speech-in-noise intelligibility

Lionel Fontan
Bertrand Segura
Michael Stone

Abstract

As the main complaint of people with age-related hearing loss (ARHL) is difficulty understanding speech, the success of rehabilitation through hearing aids (HAs) is often measured through speech intelligibility tests. These tests can be fairly lengthy and therefore cannot be conducted for all HA settings that might yield optimal speech intelligibility to the hearing-impaired listener.

Recent studies showed that automatic speech recognition (ASR) can be used as an objective measure for the prediction of unaided speech intelligibility in quiet in people with real or simulated ARHL (Fontan et al., 2017; Fontan et al., in revision). The aim of the present study was to assess the applicability of ASR to a wider range of listening conditions, involving unaided and aided speech-in-noise perception in older hearing-impaired (OHI) listeners.

Twenty-eight OHI participants (mean age = 73.3 years) were recruited for this study. They completed several speech-identification tasks involving logatoms, words, and sentences. All speech materials were mixed with a background noise with the long-term average speech spectrum (LTASS) and presented monaurally through headphones at 60 dB SPL. The signal-to-noise ratio was -1.5 dB. Participants completed the identification tasks unaided and aided, using a HA simulator implementing individual gains prescribed by the CAM2b fitting rule.

A speech-intelligibility prediction system was set up, consisting of: (1) the HA simulator used for the OHI participants (Moore et al., 2010), (2) an age-related-hearing-loss simulator implementing the algorithms described by Nejime and Moore (1997), and (3) an HMM-GMM-based ASR system using the Julius decoder software (Nagoya Institute of Technology, Japan), with acoustic models trained on speech in LTASS noise, and a different language model for each of the speech materials.
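The noise-mixing step described above (speech mixed with masking noise at a signal-to-noise ratio of -1.5 dB) can be sketched in Python. This is an illustrative sketch only: the helper name `mix_at_snr`, the sine-wave "speech" signal, and the white-noise stand-in for LTASS-shaped noise are assumptions, not the study's actual stimuli or code.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that the mix has the requested speech-to-noise
    ratio in dB, then return speech + scaled noise (hypothetical helper)."""
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)[: len(speech)]
    p_speech = np.mean(speech ** 2)          # mean power of the speech signal
    p_noise = np.mean(noise ** 2)            # mean power of the raw noise
    # Target noise power: p_speech / p_noise_target = 10^(snr_db / 10)
    target_noise_power = p_speech / (10 ** (snr_db / 10))
    noise_scaled = noise * np.sqrt(target_noise_power / p_noise)
    return speech + noise_scaled

# Toy signals standing in for a speech recording and LTASS-shaped noise
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noise = rng.standard_normal(16000)
mix = mix_at_snr(speech, noise, -1.5)        # SNR = -1.5 dB, as in the study
```

In practice the noise would be spectrally shaped to the LTASS rather than white, but the power-scaling arithmetic for hitting a target SNR is the same.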
Human and machine intelligibility scores were calculated as the percentage of logatoms or words that were correctly identified. The results show that, on average, the implementation of CAM2b gains significantly improved speech-in-noise intelligibility performance both for OHI listeners and for the ASR system.
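The percent-correct scoring described above can be sketched as follows; the function name and the toy logatom lists are hypothetical, chosen only to illustrate the metric.

```python
def intelligibility_score(responses, targets):
    """Percentage of items (logatoms or words) correctly identified:
    the score used for both human listeners and the ASR system."""
    correct = sum(r == t for r, t in zip(responses, targets))
    return 100.0 * correct / len(targets)

# Toy example: 3 of 4 logatoms identified correctly
targets = ["pa", "ta", "ka", "ba"]
asr_output = ["pa", "da", "ka", "ba"]
print(intelligibility_score(asr_output, targets))  # 75.0
```

Because the same metric is computed for listeners and for the ASR system, the two sets of scores can be compared directly across unaided and aided conditions.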
File not deposited

Dates and versions

hal-02960442, version 1 (07-10-2020)

Identifiers

  • HAL Id: hal-02960442, version 1

Cite

Lionel Fontan, Jérôme Farinas, Bertrand Segura, Michael Stone, Christian Füllgrabe. Using automatic speech recognition to predict aided speech-in-noise intelligibility. Speech In Noise Workshop, Jan 2020, Toulouse, France. ⟨hal-02960442⟩