Conference papers

CGCNN: COMPLEX GABOR CONVOLUTIONAL NEURAL NETWORK ON RAW SPEECH

Abstract : Convolutional Neural Networks (CNNs) have been used in Automatic Speech Recognition (ASR) to learn representations directly from the raw signal instead of hand-crafted acoustic features, providing a richer and lossless input signal. Recent work proposes injecting prior acoustic knowledge into the first convolutional layer by integrating the shape of the impulse responses, in order to increase both the interpretability of the learnt acoustic model and its performance. We propose to combine the complex Gabor filter with complex-valued deep neural networks to replace the usual CNN weight kernels, to take full advantage of its optimal time-frequency resolution and of the complex domain. Experiments conducted on the TIMIT phoneme recognition task show that the proposed approach reaches top-of-the-line performance while remaining interpretable.
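
To make the idea in the abstract concrete, the following is a minimal NumPy sketch of a first layer built from complex Gabor filters, that is, a Gaussian envelope multiplied by a complex sinusoid. It is an illustration only and not the authors' implementation: the function names (`complex_gabor_kernel`, `gabor_layer`), the parameterisation by centre frequency and envelope width, and all numeric values are assumptions made for this example. In the paper's setting the filter parameters would be learnt and the outputs fed to a complex-valued network; here they are fixed by hand.

```python
import numpy as np

def complex_gabor_kernel(center_hz, sigma_s, sample_rate, length):
    """Hypothetical helper: complex Gabor impulse response of `length` samples."""
    # Time axis in seconds, centred on zero.
    t = (np.arange(length) - (length - 1) / 2) / sample_rate
    envelope = np.exp(-0.5 * (t / sigma_s) ** 2)   # Gaussian window
    carrier = np.exp(2j * np.pi * center_hz * t)   # complex sinusoid
    return envelope * carrier

def gabor_layer(signal, center_freqs_hz, sigma_s, sample_rate, length):
    """Hypothetical helper: filter a raw 1-D signal with a bank of Gabor kernels."""
    kernels = [
        complex_gabor_kernel(f, sigma_s, sample_rate, length)
        for f in center_freqs_hz
    ]
    # One complex-valued output channel per kernel.
    return np.stack([np.convolve(signal, k, mode="valid") for k in kernels])

# Assumed toy input: 100 ms of a 1 kHz sine sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate // 10) / sample_rate
signal = np.sin(2 * np.pi * 1000 * t)

out = gabor_layer(
    signal,
    center_freqs_hz=[500.0, 1000.0, 2000.0],
    sigma_s=0.002,
    sample_rate=sample_rate,
    length=401,
)
# Mean energy per channel; the channel tuned to 1 kHz should dominate.
energy = np.mean(np.abs(out) ** 2, axis=1)
```

Each kernel in this sketch is described by two numbers (centre frequency and envelope width) rather than by one free weight per tap, which is what makes such a layer easy to inspect: every channel can be read as a band-pass filter centred on a known frequency.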
https://hal.archives-ouvertes.fr/hal-02474746
Contributor : Paul-Gauthier Noé
Submitted on : Tuesday, February 11, 2020 - 3:46:50 PM
Last modification on : Monday, February 8, 2021 - 11:18:02 AM
Long-term archiving on : Tuesday, May 12, 2020 - 3:18:43 PM

File

gabor_complex_cnn_final.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02474746, version 1

Citation

Paul-Gauthier Noé, Titouan Parcollet, Mohamed Morchid. CGCNN: COMPLEX GABOR CONVOLUTIONAL NEURAL NETWORK ON RAW SPEECH. ICASSP 2020, May 2020, Barcelona, Spain. ⟨hal-02474746⟩
