Augmenting word space models for Word Sense Discrimination using an automatic thesaurus

Research output: Chapter in book/report/conference proceeding › Conference contribution › peer-reviewed

Abstract

This paper presents an algorithm for Word Sense Discrimination that divides the global representation of a word into a number of classes by determining, for any two occurrences, whether they belong to the same sense. We rely on the notion that words used in similar contexts have the same or a closely related meaning. Thus, given a target word, we group its dependency co-occurrences in a Word Space Model, and each cluster represents a distinct meaning or sense of that word. We experiment with three options: augmenting the bag of words of each cluster of co-occurrences, augmenting the dictionary of sense definitions, and augmenting both. We then count the intersections between the words in the bag of each clustered sense and the words in the bag of each dictionary sense, following the Lesk method. We find an increase in recall and a decrease in precision when augmenting. However, the best resulting F-measure is obtained by augmenting both the dictionary of senses and the bag of words from the clusters.
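The Lesk-style matching step with optional augmentation can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`augment`, `assign_sense`), the representation of the thesaurus as a plain dictionary from a word to a set of related words, and the toy data are all assumptions made for illustration. The clustering of dependency co-occurrences is assumed to have happened already.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

Thesaurus = Dict[str, Set[str]]  # hypothetical format: word -> related words


def augment(bag: Iterable[str], thesaurus: Thesaurus) -> Set[str]:
    """Return the bag plus the thesaurus entries of each of its words."""
    original = set(bag)
    expanded = set(original)
    for word in original:
        expanded |= thesaurus.get(word, set())
    return expanded


def assign_sense(
    cluster_bag: Iterable[str],
    sense_definitions: Dict[str, Set[str]],
    thesaurus: Optional[Thesaurus] = None,
    augment_cluster: bool = False,
    augment_dictionary: bool = False,
) -> Tuple[Optional[str], int]:
    """Pick the dictionary sense whose bag shares the most words with the cluster.

    Returns (sense, overlap); sense is None when no word is shared at all.
    """
    thesaurus = thesaurus or {}
    cluster = augment(cluster_bag, thesaurus) if augment_cluster else set(cluster_bag)
    best_sense, best_score = None, 0
    for sense, definition in sense_definitions.items():
        bag = augment(definition, thesaurus) if augment_dictionary else set(definition)
        score = len(cluster & bag)  # Lesk-style overlap count
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense, best_score


# Toy data, invented for this example only.
thesaurus = {"money": {"cash", "loan"}, "loan": {"money", "credit"}}
senses = {
    "bank/finance": {"institution", "loan", "deposit"},
    "bank/river": {"land", "river", "edge"},
}
cluster = {"money", "account"}

no_aug = assign_sense(cluster, senses)
both_aug = assign_sense(cluster, senses, thesaurus, True, True)
```

In this toy case the unaugmented bags share no words, so no sense is assigned, while augmenting both sides yields an overlap of two words with the finance sense. That is consistent in spirit with the reported gain in recall, although a single example of this kind says nothing about precision.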

Original language: English
Title of host publication: Advances in Natural Language Processing - 6th International Conference, GoTAL 2008, Proceedings
Pages: 100-107
Number of pages: 8
DOI
Status: Published - 2008
Event: 6th International Conference on Natural Language Processing, GoTAL 2008 - Gothenburg, Sweden
Duration: 25 Aug 2008 to 27 Aug 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5221 LNAI
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 6th International Conference on Natural Language Processing, GoTAL 2008
Country/Territory: Sweden
City: Gothenburg
Period: 25/08/08 to 27/08/08
