Deep quaternion neural networks for spoken language understanding

Titouan Parcollet; Mohamed Morchid; Georges Linarès

Conference paper, Year: 2017

Titouan Parcollet

Role: Corresponding author
PersonId: 174514
IdHAL: titouan-parcollet
ORCID: 0000-0003-0672-1346

Laboratoire Informatique d'Avignon

Mohamed Morchid

Role: Author
PersonId: 21451
IdHAL: morchid
ORCID: 0000-0002-4427-2468
IdRef: 188328343

Laboratoire Informatique d'Avignon

Georges Linarès

Role: Author
PersonId: 4977
IdHAL: georges-linares
IdRef: 079368794

Laboratoire Informatique d'Avignon

Abstract

The availability of open-source software is playing a remarkable role in the popularization of speech recognition and deep learning. Kaldi, for instance, is now an established framework for developing state-of-the-art speech recognizers. PyTorch is used to build neural networks in Python and has recently spawned tremendous interest within the machine learning community thanks to its simplicity and flexibility. The PyTorch-Kaldi project aims to bridge the gap between these popular toolkits, inheriting the efficiency of Kaldi and the flexibility of PyTorch. PyTorch-Kaldi is not merely an interface between these toolkits; it also embeds several useful features for developing modern speech recognizers. For instance, the code is specifically designed so that user-defined acoustic models can be plugged in naturally. Alternatively, users can exploit several pre-implemented neural networks that can be customized using intuitive configuration files. PyTorch-Kaldi supports multiple feature and label streams as well as combinations of neural networks, enabling the use of complex neural architectures. The toolkit is publicly released along with rich documentation and is designed to work properly locally or on HPC clusters. Experiments conducted on several datasets and tasks show that PyTorch-Kaldi can effectively be used to develop modern state-of-the-art speech recognizers.

Keywords

deep learning

Domains

Computer Science [cs] / Artificial Intelligence [cs.AI]

Main file

asru2017_titouan_parcollet.pdf (404.79 KB)

Origin: Files produced by the author(s)

Contributor: Titouan Parcollet

https://hal.science/hal-02107630

Submitted on: Tuesday, June 18, 2019, 09:08:38

Last modified on: Wednesday, November 3, 2021, 09:59:44

Dates and versions

hal-02107630, version 1 (18-06-2019)

Identifiers

HAL Id: hal-02107630, version 1

Cite

Titouan Parcollet, Mohamed Morchid, Georges Linarès. Deep quaternion neural networks for spoken language understanding. 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Dec 2017, Okinawa, Japan. pp.504-511. ⟨hal-02107630⟩

Collections

UNIV-AVIGNON LIA

110 views

309 downloads
