TY - JOUR
T1 - Toward Universal Word Sense Disambiguation Using Deep Neural Networks
AU - Calvo, Hiram
AU - Rocha-Ramirez, Arturo P.
AU - Moreno-Armendariz, Marco A.
AU - Duchanoy, Carlos A.
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2019
Y1 - 2019
N2 - Traditionally, neural-network approaches to word sense disambiguation (WSD) use a set of classifiers at the end, which results in specialization in a single set of words, namely those for which they were trained. This makes it impossible to apply the learned models to words not previously seen in the training corpus. This paper addresses a generalization of the WSD problem in order to solve it with deep neural networks without limiting the method to a fixed set of words, with performance close to the state of the art and an acceptable computational cost. We explore different architectures based on multilayer perceptrons, recurrent cells (Long Short-Term Memory, LSTM, and Gated Recurrent Units, GRU), and a classifier model. Different sources and dimensions of embeddings were tested as well. The main evaluation was performed on the Senseval 3 English Lexical Sample. To evaluate the application to an unseen set of words, the learned models are evaluated on the completely unseen words of a different corpus (Senseval 2 English Lexical Sample), surpassing the random baseline.
AB - Traditionally, neural-network approaches to word sense disambiguation (WSD) use a set of classifiers at the end, which results in specialization in a single set of words, namely those for which they were trained. This makes it impossible to apply the learned models to words not previously seen in the training corpus. This paper addresses a generalization of the WSD problem in order to solve it with deep neural networks without limiting the method to a fixed set of words, with performance close to the state of the art and an acceptable computational cost. We explore different architectures based on multilayer perceptrons, recurrent cells (Long Short-Term Memory, LSTM, and Gated Recurrent Units, GRU), and a classifier model. Different sources and dimensions of embeddings were tested as well. The main evaluation was performed on the Senseval 3 English Lexical Sample. To evaluate the application to an unseen set of words, the learned models are evaluated on the completely unseen words of a different corpus (Senseval 2 English Lexical Sample), surpassing the random baseline.
KW - LSTM
KW - Word sense disambiguation
KW - multilayer perceptron
KW - recurrent neural networks
KW - Senseval English Lexical Sample test
UR - http://www.scopus.com/inward/record.url?scp=85065968247&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2019.2914921
DO - 10.1109/ACCESS.2019.2914921
M3 - Article
AN - SCOPUS:85065968247
SN - 2169-3536
VL - 7
SP - 60264
EP - 60275
JO - IEEE Access
JF - IEEE Access
M1 - 8706934
ER -