Using values of the human cochlea in the macro and micro mechanical model for automatic speech recognition

José Luis Oropeza Rodríguez, Sergio Suárez Guerra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Parametric representations based on cochlear behavior have recently been used in several studies on Automatic Speech Recognition (ASR), because the cochlea is the most important organ in mammals for transducing the sound pressure received by the outer ear. This paper shows how the macro and micro mechanical model of the cochlea can be used in ASR tasks. The values that Neely, Elliot and Ku reported in their works on macro and micro mechanical models, such as Neely's model, were used to set the central frequencies of a filter bank that extracts parameters from speech in a way similar to the construction of MFCC (Mel Frequency Cepstrum Coefficients).

We propose an approach that distributes the filter bank in a new way in our parametric representation, which yields a frequency-domain representation of speech different from that of MFCC. The responses of the macro and micro mechanical model to the three sets of values mentioned above were used to create the central frequencies of the filter bank, so that the Mel scale function is replaced by a representation based on the cochlear response of the Neely model. The model was run with different sets of cochlear parameters used by Neely, Elliot and Ku in their works, such as mass, damping and stiffness, among others. A performance of 98 to 100% was reached on a task using Spanish isolated digits pronounced by 5 different speakers. The approach was also applied to neutral recordings from the SUSAS corpus, with some advantages in comparison with MFCC.
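As a rough illustration of the idea of replacing the Mel scale with a cochlea-derived frequency map, the sketch below builds a triangular filter bank whose central frequencies are equally spaced along the cochlea. It is not the authors' method: the abstract does not give the parameter values of the Neely model, so the sketch substitutes Greenwood's place-to-frequency function for the human cochlea as a stand-in. All function names, the number of filters, the frequency range, the FFT size and the sample rate are assumptions chosen for illustration.

```python
import math

# Assumption: Greenwood's place-to-frequency map is used here as a stand-in
# for the response of the macro and micro mechanical model, whose parameter
# values are not given in the abstract.
def greenwood_hz(x, A=165.4, a=2.1, k=0.88):
    """Frequency in Hz at relative cochlear place x (0 = apex, 1 = base)."""
    return A * (10.0 ** (a * x) - k)

def greenwood_place(f_hz, A=165.4, a=2.1, k=0.88):
    """Inverse map: relative cochlear place for a frequency in Hz."""
    return math.log10(f_hz / A + k) / a

def center_frequencies(n_filters, f_min, f_max):
    """Return n_filters + 2 points (edges and centers) equally spaced in place."""
    x_lo, x_hi = greenwood_place(f_min), greenwood_place(f_max)
    step = (x_hi - x_lo) / (n_filters + 1)
    return [greenwood_hz(x_lo + i * step) for i in range(n_filters + 2)]

def triangular_filter_bank(points_hz, n_fft, sample_rate):
    """One triangular filter per interior point, sampled on the FFT bin grid."""
    n_bins = n_fft // 2 + 1
    bin_hz = sample_rate / n_fft
    bank = []
    for lo, mid, hi in zip(points_hz, points_hz[1:], points_hz[2:]):
        row = []
        for b in range(n_bins):
            f = b * bin_hz
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

# Hypothetical settings, chosen only for the example.
points = center_frequencies(n_filters=20, f_min=100.0, f_max=4000.0)
bank = triangular_filter_bank(points, n_fft=512, sample_rate=8000)
```

In an MFCC-style pipeline, each row of `bank` would weight the power spectrum of a speech frame, and the logarithm of the resulting filter energies would be passed through a discrete cosine transform to obtain the coefficients.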

Original language: English
Title of host publication: Nature-Inspired Computation and Machine Learning - 13th Mexican International Conference on Artificial Intelligence, MICAI 2014, Proceedings
Editors: Alexander Gelbukh, Félix A. Castro-Espinoza, Sofía N. Galicia-Haro
Publisher: Springer Verlag
Pages: 242-251
Number of pages: 10
ISBN (Electronic): 9783319136493
State: Published - 2014
Event: 13th Mexican International Conference on Artificial Intelligence, MICAI 2014 - Tuxtla Gutiérrez, Mexico
Duration: 16 Nov 2014 to 22 Nov 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8857
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th Mexican International Conference on Artificial Intelligence, MICAI 2014
Country/Territory: Mexico
City: Tuxtla Gutiérrez
Period: 16/11/14 to 22/11/14

Keywords

  • Cochlea
  • Place theory and bank filter
  • Speech recognition
