Improving unsupervised WSD with a dynamic thesaurus

Javier Tejada-Cárcamo; Hiram Calvo; Alexander Gelbukh

doi:10.1007/978-3-540-87391-4_27

Centro de Investigación en Computación (CIC)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

The method proposed by Diana McCarthy et al. [1] obtains the predominant sense for an ambiguous word based on a weighted list of terms related to the ambiguous word. This list of terms is obtained using the distributional similarity method proposed by Lin [2] to obtain a thesaurus. In that method, every occurrence of the ambiguous word uses the same thesaurus, regardless of the context where it occurs. Every different word to be disambiguated uses the same thesaurus. In this paper we explore a different method that accounts for the context of a word when determining the most frequent sense of an ambiguous word. In our method the list of distributionally similar words is built based on the syntactic context of the ambiguous word. We attain a precision of 69.86%, which is 7% higher than the supervised baseline of using the MFS of 90% SemCor against the remaining 10% of SemCor.
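
As a rough illustration of the idea in the abstract, the Python sketch below ranks the senses of a word from a weighted list of related terms, loosely following the general form of the prevalence score of McCarthy et al. It is not the authors' implementation. The function name `rank_senses`, the sense labels, both neighbour lists, and every similarity value are hypothetical toy data invented for this example. The static list stands in for a single thesaurus entry shared by every occurrence of the word, and the context-specific list stands in for neighbours built from one syntactic context.

```python
from typing import Dict, List, Tuple


def rank_senses(
    neighbours: List[Tuple[str, float]],
    sense_sim: Dict[str, Dict[str, float]],
) -> List[Tuple[str, float]]:
    """Rank senses by a weighted vote of related terms.

    neighbours: (term, distributional weight) pairs for the ambiguous word.
    sense_sim:  sense -> {term: similarity between that sense and the term}.
    """
    scores = {sense: 0.0 for sense in sense_sim}
    for term, weight in neighbours:
        # Normalise each term's vote over all senses of the word.
        total = sum(sims.get(term, 0.0) for sims in sense_sim.values())
        if total == 0.0:
            continue  # the term says nothing about any sense
        for sense, sims in sense_sim.items():
            scores[sense] += weight * sims.get(term, 0.0) / total
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sense/term similarities for the word "star".
SENSE_SIM = {
    "celestial_body": {"planet": 0.8, "galaxy": 0.9, "actor": 0.1, "singer": 0.1},
    "celebrity": {"planet": 0.2, "galaxy": 0.1, "actor": 0.9, "singer": 0.9},
}

# Static thesaurus entry: the same neighbours for every occurrence.
static_neighbours = [("planet", 0.5), ("actor", 0.3), ("galaxy", 0.2)]

# Context-specific neighbours, e.g. "star" as the subject of "perform".
dynamic_neighbours = [("actor", 0.6), ("singer", 0.4)]

static_rank = rank_senses(static_neighbours, SENSE_SIM)
dynamic_rank = rank_senses(dynamic_neighbours, SENSE_SIM)

print("static :", static_rank)
print("dynamic:", dynamic_rank)
```

With these toy values the static list favours the `celestial_body` sense while the context-specific list favours `celebrity`, which is the kind of difference a context-dependent thesaurus is meant to capture.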

Original language	English
Title of host publication	Text, Speech and Dialogue - 11th International Conference, TSD 2008, Proceedings
Pages	201-210
Number of pages	10
DOIs	https://doi.org/10.1007/978-3-540-87391-4_27
State	Published - 2008
Event	11th International Conference on Text, Speech and Dialogue, TSD 2008 - Brno, Czech Republic (8 Sep 2008 → 12 Sep 2008)

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	5246 LNAI
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	11th International Conference on Text, Speech and Dialogue, TSD 2008
Country/Territory	Czech Republic
City	Brno
Period	8/09/08 → 12/09/08

Access to Document

10.1007/978-3-540-87391-4_27

Cite this

Tejada-Cárcamo, J., Calvo, H., & Gelbukh, A. (2008). Improving unsupervised WSD with a dynamic thesaurus. In Text, Speech and Dialogue - 11th International Conference, TSD 2008, Proceedings (pp. 201-210). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5246 LNAI). https://doi.org/10.1007/978-3-540-87391-4_27

@inproceedings{7786c7ed631441b7b4e9696531ab2076,

title = "Improving unsupervised WSD with a dynamic thesaurus",

abstract = "The method proposed by Diana McCarthy et al. [1] obtains the predominant sense for an ambiguous word based on a weighted list of terms related to the ambiguous word. This list of terms is obtained using the distributional similarity method proposed by Lin [2] to obtain a thesaurus. In that method, every occurrence of the ambiguous word uses the same thesaurus, regardless of the context where it occurs. Every different word to be disambiguated uses the same thesaurus. In this paper we explore a different method that accounts for the context of a word when determining the most frequent sense of an ambiguous word. In our method the list of distributed similar words is built based on the syntactic context of the ambiguous word. We attain a precision of 69.86%, which is 7% higher than the supervised baseline of using the MFS of 90% SemCor against the remaining 10% of SemCor.",

author = "Javier Tejada-C{\'a}rcamo and Hiram Calvo and Alexander Gelbukh",

note = "Funding Information: Work done under partial support of Mexican Government (CONACyT, SNI), IPN (PIFI, SIP). The authors wish to thank Rada Mihalcea for her useful comments and discussion. 11th International Conference on Text, Speech and Dialogue, TSD 2008; Conference date: 08-09-2008 through 12-09-2008",

year = "2008",

doi = "10.1007/978-3-540-87391-4_27",

language = "English",

isbn = "3540873902",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

pages = "201--210",

booktitle = "Text, Speech and Dialogue - 11th International Conference, TSD 2008, Proceedings",

}

Tejada-Cárcamo, J, Calvo, H & Gelbukh, A 2008, Improving unsupervised WSD with a dynamic thesaurus. in Text, Speech and Dialogue - 11th International Conference, TSD 2008, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5246 LNAI, pp. 201-210, 11th International Conference on Text, Speech and Dialogue, TSD 2008, Brno, Czech Republic, 8/09/08. https://doi.org/10.1007/978-3-540-87391-4_27

Improving unsupervised WSD with a dynamic thesaurus. / Tejada-Cárcamo, Javier; Calvo, Hiram; Gelbukh, Alexander.
Text, Speech and Dialogue - 11th International Conference, TSD 2008, Proceedings. 2008. p. 201-210 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5246 LNAI).

TY - GEN

T1 - Improving unsupervised WSD with a dynamic thesaurus

AU - Tejada-Cárcamo, Javier

AU - Calvo, Hiram

AU - Gelbukh, Alexander

N1 - Funding Information: Work done under partial support of Mexican Government (CONACyT, SNI), IPN (PIFI, SIP). The authors wish to thank Rada Mihalcea for her useful comments and discussion.

PY - 2008

Y1 - 2008

N2 - The method proposed by Diana McCarthy et al. [1] obtains the predominant sense for an ambiguous word based on a weighted list of terms related to the ambiguous word. This list of terms is obtained using the distributional similarity method proposed by Lin [2] to obtain a thesaurus. In that method, every occurrence of the ambiguous word uses the same thesaurus, regardless of the context where it occurs. Every different word to be disambiguated uses the same thesaurus. In this paper we explore a different method that accounts for the context of a word when determining the most frequent sense of an ambiguous word. In our method the list of distributed similar words is built based on the syntactic context of the ambiguous word. We attain a precision of 69.86%, which is 7% higher than the supervised baseline of using the MFS of 90% SemCor against the remaining 10% of SemCor.

AB - The method proposed by Diana McCarthy et al. [1] obtains the predominant sense for an ambiguous word based on a weighted list of terms related to the ambiguous word. This list of terms is obtained using the distributional similarity method proposed by Lin [2] to obtain a thesaurus. In that method, every occurrence of the ambiguous word uses the same thesaurus, regardless of the context where it occurs. Every different word to be disambiguated uses the same thesaurus. In this paper we explore a different method that accounts for the context of a word when determining the most frequent sense of an ambiguous word. In our method the list of distributed similar words is built based on the syntactic context of the ambiguous word. We attain a precision of 69.86%, which is 7% higher than the supervised baseline of using the MFS of 90% SemCor against the remaining 10% of SemCor.

UR - http://www.scopus.com/inward/record.url?scp=53049104018&partnerID=8YFLogxK

U2 - 10.1007/978-3-540-87391-4_27

DO - 10.1007/978-3-540-87391-4_27

M3 - Conference contribution

SN - 3540873902

SN - 9783540873907

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 201

EP - 210

BT - Text, Speech and Dialogue - 11th International Conference, TSD 2008, Proceedings

T2 - 11th International Conference on Text, Speech and Dialogue, TSD 2008

Y2 - 8 September 2008 through 12 September 2008

ER -
