TY - GEN
T1 - The place theory as an alternative solution in Automatic Speech Recognition tasks
AU - Oropeza-Rodríguez, José Luis
AU - Suárez-Guerra, Sergio
AU - Jiménez-Hernández, Mario
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2014.
PY - 2014
Y1 - 2014
AB - Parametric representations based on cochlear behavior have recently been used in several studies related to Automatic Speech Recognition (ASR). This paper shows how an alternative solution, reported in the state of the art, to Lesser and Berkeley’s cochlea model can be applied in ASR tasks. A new way of constructing the filter bank in the parametric representation used to extract MFCC is proposed, and the resulting filter distribution is used to obtain a new frequency-domain representation of speech. MFCC parameters conventionally use the Mel scale to create the filter bank; here, the Mel scale function is replaced by center frequencies derived from cochlear behavior according to the place theory. A performance of 98.5% was reached on a task using isolated digits pronounced by 5 different speakers in Spanish and the SUSAS corpus with neutral sound recordings, with some advantages in comparison with MFCC.
KW - Automatic speech recognition
KW - Cochlea operation
KW - Place theory and bank filter component
KW - Speech recognition
UR - http://www.scopus.com/inward/record.url?scp=84949132095&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-12568-8_21
DO - 10.1007/978-3-319-12568-8_21
M3 - Conference contribution
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 167
EP - 174
BT - Progress in Pattern Recognition Image Analysis, Computer Vision and Applications - 19th Iberoamerican Congress, CIARP 2014, Proceedings
A2 - Bayro-Corrochano, Eduardo
A2 - Hancock, Edwin
PB - Springer Verlag
T2 - 19th Iberoamerican Congress on Pattern Recognition, CIARP 2014
Y2 - 2 November 2014 through 5 November 2014
ER -