Using values of the human cochlea in the macro and micro mechanical model for automatic speech recognition

José Luis Oropeza Rodríguez, Sergio Suárez Guerra

Research output: Chapter in book/report/conference proceeding; conference contribution; peer-reviewed

Abstract

Parametric representations based on cochlear behavior have recently been used in several studies on Automatic Speech Recognition (ASR). The reason is that, in mammals, the cochlea is the most important element in the transduction of the sound pressure received by the outer ear. This paper shows how the macro and micro mechanical model of the cochlea is used in ASR tasks. The values that Neely, Elliot and Ku reported in their works on the macro and micro mechanical model were used to set the central frequencies of a filter bank, in order to obtain parameters from speech in a way similar to the construction of MFCC (Mel Frequency Cepstral Coefficients).

An approach that distributes the filter bank of our parametric representation in a new way is proposed. This distribution yields a frequency-domain representation of speech that differs from MFCC. The responses of the macro and micro mechanical model to the three sets of values mentioned above were used to create the central frequencies of the filter bank, so that the Mel scale function is replaced by a representation based on the cochlear response of the Neely model. The model was used with different sets of cochlear parameters, such as mass, damping and stiffness, among others, taken from the works of Neely, Elliot and Ku. A performance of 98 to 100% was reached on a task with Spanish isolated digits pronounced by 5 different speakers. The approach was also applied to neutral recordings from the SUSAS corpus, with some advantages in comparison with MFCC.
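The idea of swapping the Mel scale for another frequency distribution can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the FFT size and the sample rate are assumptions chosen for illustration, and the Mel formula is the common 2595·log10(1 + f/700) variant. The filter bank builder accepts any strictly increasing array of edge frequencies, so edges derived from a cochlear model could be passed in place of the Mel-spaced ones shown here.

```python
import numpy as np

def hz_to_mel(f_hz):
    """Common Mel-scale formula (one of several variants in use)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_edges(n_filters, f_min, f_max):
    """Return n_filters + 2 edge frequencies equally spaced on the Mel scale."""
    mels = np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_filters + 2)
    return mel_to_hz(mels)

def triangular_filter_bank(edges_hz, n_fft, sample_rate):
    """Build triangular filters from strictly increasing edge frequencies.

    Filter i rises from edges_hz[i] to its peak at edges_hz[i + 1]
    and falls back to zero at edges_hz[i + 2].
    """
    edges_hz = np.asarray(edges_hz, dtype=float)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    n_filters = len(edges_hz) - 2
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        # Keep the lower of the two slopes and clip negative values to zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

# Baseline: Mel-spaced edges (assumed settings: 20 filters, 16 kHz audio).
# Edges obtained from a cochlear model would replace `edges` here.
edges = mel_edges(n_filters=20, f_min=0.0, f_max=8000.0)
bank = triangular_filter_bank(edges, n_fft=512, sample_rate=16000)
```

The remaining steps of an MFCC-style pipeline (power spectrum, log filter-bank energies, discrete cosine transform) would stay the same under this sketch; only the edge frequencies change.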

Original language: English
Title of host publication: Nature-Inspired Computation and Machine Learning - 13th Mexican International Conference on Artificial Intelligence, MICAI 2014, Proceedings
Editors: Alexander Gelbukh, Félix A. Castro-Espinoza, Sofía N. Galicia-Haro
Publisher: Springer Verlag
Pages: 242-251
Number of pages: 10
ISBN (electronic): 9783319136493
Status: Published - 2014
Event: 13th Mexican International Conference on Artificial Intelligence, MICAI 2014 - Tuxtla Gutiérrez, Mexico
Duration: 16 Nov 2014 to 22 Nov 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8857
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 13th Mexican International Conference on Artificial Intelligence, MICAI 2014
Country/Territory: Mexico
City: Tuxtla Gutiérrez
Period: 16/11/14 to 22/11/14