Improving unsupervised WSD with a dynamic thesaurus

Javier Tejada-Cárcamo; Hiram Calvo; Alexander Gelbukh

doi:10.1007/978-3-540-87391-4_27

Centro de Investigación en Computación (CIC)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

The method proposed by Diana McCarthy et al. [1] determines the predominant sense of an ambiguous word from a weighted list of terms related to that word. This list is taken from a thesaurus built with the distributional similarity method proposed by Lin [2]. In that method, every occurrence of the ambiguous word, and every different word to be disambiguated, uses the same thesaurus, regardless of the context in which it occurs. In this paper we explore a different method that takes the context of a word into account when determining its most frequent sense (MFS): the list of distributionally similar words is built from the syntactic context of the ambiguous word. We attain a precision of 69.86%, which is 7% higher than the supervised baseline that takes the MFS from 90% of SemCor and evaluates it on the remaining 10% of SemCor.
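The contrast between a static and a context-dependent list of related terms can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the sense labels, and all similarity values are invented for illustration, and the scoring (each related term distributes its weight over the senses in proportion to a sense-to-term similarity) is only an assumed, simplified form of weighted voting.

```python
from collections import defaultdict


def predominant_sense(neighbors, sense_similarity):
    """Pick the sense that receives the most weighted votes.

    neighbors: dict mapping a related term to its weight in the thesaurus.
    sense_similarity: dict mapping (sense, term) to a similarity score.
    Both structures are hypothetical stand-ins for real resources.
    """
    senses = {sense for sense, _ in sense_similarity}
    scores = defaultdict(float)
    for term, weight in neighbors.items():
        # Total similarity of this term to all senses, used for normalization.
        total = sum(sense_similarity.get((s, term), 0.0) for s in senses)
        if total == 0.0:
            continue  # the term says nothing about any sense
        for s in senses:
            share = sense_similarity.get((s, term), 0.0) / total
            scores[s] += weight * share
    return max(scores, key=scores.get) if scores else None


# Toy similarity scores between two senses of "star" and some related terms.
SENSE_SIMILARITY = {
    ("star/celestial_body", "planet"): 0.9,
    ("star/celebrity", "planet"): 0.1,
    ("star/celestial_body", "galaxy"): 0.8,
    ("star/celebrity", "galaxy"): 0.2,
    ("star/celestial_body", "actor"): 0.1,
    ("star/celebrity", "actor"): 0.9,
}

# Static thesaurus: one list of related terms for every occurrence of "star".
STATIC_NEIGHBORS = {"planet": 0.5, "galaxy": 0.4, "actor": 0.3}

# Dynamic thesaurus: a list assumed to be built for one syntactic context,
# for example "star" as the subject of "sign a contract".
DYNAMIC_NEIGHBORS = {"actor": 0.7, "planet": 0.1}

static_choice = predominant_sense(STATIC_NEIGHBORS, SENSE_SIMILARITY)
dynamic_choice = predominant_sense(DYNAMIC_NEIGHBORS, SENSE_SIMILARITY)
```

With these toy numbers the static list selects the celestial-body sense for every occurrence, while the context-specific list selects the celebrity sense, which is the kind of difference a dynamic thesaurus is meant to capture.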

Original language	English
Title of host publication	Text, Speech and Dialogue - 11th International Conference, TSD 2008, Proceedings
Pages	201-210
Number of pages	10
DOI	https://doi.org/10.1007/978-3-540-87391-4_27
Publication status	Published - 2008
Event	11th International Conference on Text, Speech and Dialogue, TSD 2008 - Brno, Czech Republic. Duration: 8 Sep 2008 → 12 Sep 2008

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	5246 LNAI
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	11th International Conference on Text, Speech and Dialogue, TSD 2008
Country/Territory	Czech Republic
City	Brno
Period	8 Sep 2008 → 12 Sep 2008

Cite this

Tejada-Cárcamo, J., Calvo, H., & Gelbukh, A. (2008). Improving unsupervised WSD with a dynamic thesaurus. In Text, Speech and Dialogue - 11th International Conference, TSD 2008, Proceedings (pp. 201-210). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5246 LNAI). https://doi.org/10.1007/978-3-540-87391-4_27

@inproceedings{7786c7ed631441b7b4e9696531ab2076,

title = "Improving unsupervised WSD with a dynamic thesaurus",

author = "Javier Tejada-C{\'a}rcamo and Hiram Calvo and Alexander Gelbukh",

note = "Funding Information: Work done under partial support of Mexican Government (CONACyT, SNI), IPN (PIFI, SIP). The authors wish to thank Rada Mihalcea for her useful comments and discussion.; 11th International Conference on Text, Speech and Dialogue, TSD 2008 ; Conference date: 08-09-2008 Through 12-09-2008",

year = "2008",

doi = "10.1007/978-3-540-87391-4_27",

language = "English",

isbn = "3540873902",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

pages = "201--210",

booktitle = "Text, Speech and Dialogue - 11th International Conference, TSD 2008, Proceedings",

}

Tejada-Cárcamo, J, Calvo, H & Gelbukh, A 2008, Improving unsupervised WSD with a dynamic thesaurus. In Text, Speech and Dialogue - 11th International Conference, TSD 2008, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5246 LNAI, pp. 201-210, 11th International Conference on Text, Speech and Dialogue, TSD 2008, Brno, Czech Republic, 8/09/08. https://doi.org/10.1007/978-3-540-87391-4_27

Improving unsupervised WSD with a dynamic thesaurus. / Tejada-Cárcamo, Javier; Calvo, Hiram ; Gelbukh, Alexander.
Text, Speech and Dialogue - 11th International Conference, TSD 2008, Proceedings. 2008. p. 201-210 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5246 LNAI).

TY - GEN

T1 - Improving unsupervised WSD with a dynamic thesaurus

AU - Tejada-Cárcamo, Javier

AU - Calvo, Hiram

AU - Gelbukh, Alexander

N1 - Funding Information: Work done under partial support of Mexican Government (CONACyT, SNI), IPN (PIFI, SIP). The authors wish to thank Rada Mihalcea for her useful comments and discussion.

PY - 2008

Y1 - 2008

UR - http://www.scopus.com/inward/record.url?scp=53049104018&partnerID=8YFLogxK

U2 - 10.1007/978-3-540-87391-4_27

DO - 10.1007/978-3-540-87391-4_27

M3 - Conference contribution

SN - 3540873902

SN - 9783540873907

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 201

EP - 210

BT - Text, Speech and Dialogue - 11th International Conference, TSD 2008, Proceedings

T2 - 11th International Conference on Text, Speech and Dialogue, TSD 2008

Y2 - 8 September 2008 through 12 September 2008

ER -
