TY - GEN
T1 - Improving unsupervised WSD with a dynamic thesaurus
AU - Tejada-Cárcamo, Javier
AU - Calvo, Hiram
AU - Gelbukh, Alexander
N1 - Funding Information:
This work was partially supported by the Mexican Government (CONACyT, SNI) and IPN (PIFI, SIP). The authors wish to thank Rada Mihalcea for her useful comments and discussion.
PY - 2008
Y1 - 2008
AB - The method proposed by Diana McCarthy et al. [1] obtains the predominant sense of an ambiguous word from a weighted list of terms related to that word. This list is obtained with the distributional similarity method that Lin [2] proposed for building a thesaurus. In that method, every occurrence of the ambiguous word uses the same thesaurus, regardless of the context in which it occurs, and every different word to be disambiguated draws on that same thesaurus. In this paper we explore a different method that takes the context of a word into account when determining its most frequent sense: the list of distributionally similar words is built from the syntactic context of the ambiguous word. We attain a precision of 69.86%, which is 7% higher than the supervised baseline of applying the most frequent sense (MFS) computed from 90% of SemCor to the remaining 10% of SemCor.
UR - http://www.scopus.com/inward/record.url?scp=53049104018&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-87391-4_27
DO - 10.1007/978-3-540-87391-4_27
M3 - Conference contribution
SN - 3540873902
SN - 9783540873907
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 201
EP - 210
BT - Text, Speech and Dialogue - 11th International Conference, TSD 2008, Proceedings
T2 - 11th International Conference on Text, Speech and Dialogue, TSD 2008
Y2 - 8 September 2008 through 12 September 2008
ER -