TY - JOUR
T1 - Toward Universal Word Sense Disambiguation Using Deep Neural Networks
AU - Calvo, Hiram
AU - Rocha-Ramirez, Arturo P.
AU - Moreno-Armendariz, Marco A.
AU - Duchanoy, Carlos A.
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2019
Y1 - 2019
N2 - Traditionally, neural-network approaches to word sense disambiguation (WSD) use a set of classifiers at the end, which results in specialization in a single set of words, namely those for which they were trained. This makes it impossible to apply the learned models to words not previously seen in the training corpus. This paper addresses a generalization of the WSD problem in order to solve it with deep neural networks without limiting the method to a fixed set of words, with performance close to the state of the art and an acceptable computational cost. We explore different architectures based on multilayer perceptrons, recurrent cells (Long Short-Term Memory, LSTM, and Gated Recurrent Units, GRU), and a classifier model. Different sources and dimensions of embeddings were tested as well. The main evaluation was performed on the Senseval 3 English Lexical Sample. To evaluate the application to an unseen set of words, the learned models are evaluated on the completely unseen words of a different corpus (Senseval 2 English Lexical Sample), surpassing the random baseline.
AB - Traditionally, neural-network approaches to word sense disambiguation (WSD) use a set of classifiers at the end, which results in specialization in a single set of words, namely those for which they were trained. This makes it impossible to apply the learned models to words not previously seen in the training corpus. This paper addresses a generalization of the WSD problem in order to solve it with deep neural networks without limiting the method to a fixed set of words, with performance close to the state of the art and an acceptable computational cost. We explore different architectures based on multilayer perceptrons, recurrent cells (Long Short-Term Memory, LSTM, and Gated Recurrent Units, GRU), and a classifier model. Different sources and dimensions of embeddings were tested as well. The main evaluation was performed on the Senseval 3 English Lexical Sample. To evaluate the application to an unseen set of words, the learned models are evaluated on the completely unseen words of a different corpus (Senseval 2 English Lexical Sample), surpassing the random baseline.
KW - LSTM
KW - Word sense disambiguation
KW - multilayer perceptron
KW - recurrent neural networks
KW - Senseval English Lexical Sample test
UR - http://www.scopus.com/inward/record.url?scp=85065968247&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2019.2914921
DO - 10.1109/ACCESS.2019.2914921
M3 - Article
AN - SCOPUS:85065968247
SN - 2169-3536
VL - 7
SP - 60264
EP - 60275
JO - IEEE Access
JF - IEEE Access
M1 - 8706934
ER -