TY - JOUR
T1 - Etiquetado fonético automático al nivel palabra usando la dinámica de cambio de los vectores del libro código
AU - Guerra, Sergio Suárez
AU - Rodríguez, José Luis Oropeza
N1 - Publisher Copyright: © 2020 Instituto Politecnico Nacional. All rights reserved.
PY - 2020
Y1 - 2020
N2 - An alternative approach is described for the phonetic labeling of a set of words pronounced by a speaker; the approach can be applied to any language, subject to the needs and characteristics of the application. The procedure monitors the dynamics of change of the Mel-frequency cepstral coefficient (MFCC) vectors that make up the codebook (LC, from the Spanish "libro código") extracted from the word to be labeled. The analysis identifies where a transition from one codebook vector to another occurs, as well as the disturbances that arise in the transition zone because of phonetic concatenation. Metrics are established to account for coarticulation noise and to locate the phonetic separation boundary. Two methods are used to evaluate the dynamics of vector change and to deliver the most accurate labeling. The rate of recognition and correct labeling obtained with this approach is 97.9%, which is 1.06% lower than the recognition rate obtained on the same word corpus using manual labeling. Most importantly, automatic labeling of the voice corpus takes significantly less time than the estimated time for manual labeling, and it eliminates personal subjectivity from the labeling work.
AB - An alternative approach is described for the phonetic labeling of a set of words pronounced by a speaker; the approach can be applied to any language, subject to the needs and characteristics of the application. The procedure monitors the dynamics of change of the Mel-frequency cepstral coefficient (MFCC) vectors that make up the codebook (LC, from the Spanish "libro código") extracted from the word to be labeled. The analysis identifies where a transition from one codebook vector to another occurs, as well as the disturbances that arise in the transition zone because of phonetic concatenation. Metrics are established to account for coarticulation noise and to locate the phonetic separation boundary. Two methods are used to evaluate the dynamics of vector change and to deliver the most accurate labeling. The rate of recognition and correct labeling obtained with this approach is 97.9%, which is 1.06% lower than the recognition rate obtained on the same word corpus using manual labeling. Most importantly, automatic labeling of the voice corpus takes significantly less time than the estimated time for manual labeling, and it eliminates personal subjectivity from the labeling work.
KW - Phonetic labeling
KW - Voice recognition
UR - http://www.scopus.com/inward/record.url?scp=85089089453&partnerID=8YFLogxK
U2 - 10.13053/CyS-24-2-3229
DO - 10.13053/CyS-24-2-3229
M3 - Article
AN - SCOPUS:85089089453
SN - 1405-5546
VL - 24
SP - 855
EP - 868
JO - Computacion y Sistemas
JF - Computacion y Sistemas
IS - 2
ER -