Automatic evaluation of quality of an explanatory dictionary by comparison of word senses

Alexander Gelbukh; Grigori Sidorov; Sang Yong Han; Liliana Chanona-Hernandez

Centro de Investigación en Computación (CIC)

Research output: Contribution to journal › Review article › peer-reviewed

1 Citation (Scopus)

Abstract

Words in an explanatory dictionary have different meanings (senses), each described by a natural language definition. If the definitions of two senses of the same word are too similar, it is difficult to grasp the difference between them and thus to judge which of the two senses is intended in a particular context, especially when the decision is to be made automatically, as in the task of automatic word sense disambiguation. We suggest a method for formally evaluating this aspect of the quality of an explanatory dictionary by calculating the similarity of different senses of the same word. We calculate the similarity between two given senses as the relative number of equal or synonymous words in their definitions. In addition to the general assessment of the dictionary, individual suspicious definitions are reported for possible improvement. In our experiments we used the Anaya explanatory dictionary of Spanish. The experiments show that about 10% of the definitions in this dictionary are substantially similar, which indicates rather low quality.
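
The similarity measure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the "relative number" is normalized, so dividing by the length of the longer definition is an assumption, as are the whitespace tokenizer, the toy `SYNONYMS` table, the greedy one-to-one matching, and the `threshold` value. The function names and the English example definitions are hypothetical.

```python
from itertools import combinations

# Hypothetical toy synonym table; a real run would need a synonym dictionary.
SYNONYMS = {
    "big": {"large"},
    "large": {"big"},
}


def tokenize(definition):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,;:()\"'").lower() for w in definition.split())
    return [w for w in words if w]


def matches(a, b):
    """True if two words are equal or listed as synonyms of each other."""
    return a == b or b in SYNONYMS.get(a, set()) or a in SYNONYMS.get(b, set())


def sense_similarity(def_a, def_b):
    """Share of words matched one-to-one, relative to the longer definition.

    Assumption: normalizing by the longer definition is one possible reading
    of "relative number of equal or synonymous words".
    """
    words_a, remaining = tokenize(def_a), tokenize(def_b)
    longest = max(len(words_a), len(remaining))
    if longest == 0:
        return 0.0
    matched = 0
    for word in words_a:
        for i, other in enumerate(remaining):
            if matches(word, other):
                matched += 1
                del remaining[i]  # each word may be matched only once
                break
    return matched / longest


def suspicious_pairs(senses, threshold=0.5):
    """Return (i, j, similarity) for sense pairs at or above the threshold."""
    result = []
    for (i, a), (j, b) in combinations(enumerate(senses), 2):
        score = sense_similarity(a, b)
        if score >= threshold:
            result.append((i, j, score))
    return result


# Made-up definitions of three senses of one hypothetical headword.
senses = ["a big house", "a large house", "small boat"]
print(suspicious_pairs(senses))
```

Applied to every headword, a routine like `suspicious_pairs` would give both the overall share of near-duplicate senses and the list of individual definitions to review. A fuller version would probably lemmatize the words and drop function words before matching.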

Original language	English
Pages (from-to)	556-562
Number of pages	7
Publication	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	2890
Status	Published - 2003

Other files and links

Link to publication in Scopus

Cite this

@article{36478e2b82544334a512ef9d99d4d802,
title = "Automatic evaluation of quality of an explanatory dictionary by comparison of word senses",
abstract = "Words in an explanatory dictionary have different meanings (senses), each described by a natural language definition. If the definitions of two senses of the same word are too similar, it is difficult to grasp the difference between them and thus to judge which of the two senses is intended in a particular context, especially when the decision is to be made automatically, as in the task of automatic word sense disambiguation. We suggest a method for formally evaluating this aspect of the quality of an explanatory dictionary by calculating the similarity of different senses of the same word. We calculate the similarity between two given senses as the relative number of equal or synonymous words in their definitions. In addition to the general assessment of the dictionary, individual suspicious definitions are reported for possible improvement. In our experiments we used the Anaya explanatory dictionary of Spanish. The experiments show that about 10% of the definitions in this dictionary are substantially similar, which indicates rather low quality.",
author = "Alexander Gelbukh and Grigori Sidorov and Han, {Sang Yong} and Liliana Chanona-Hernandez",
year = "2003",
language = "English",
volume = "2890",
pages = "556--562",
journal = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
issn = "0302-9743",
publisher = "Springer Verlag",
}

Automatic evaluation of quality of an explanatory dictionary by comparison of word senses. / Gelbukh, Alexander; Sidorov, Grigori; Han, Sang Yong et al.
In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 2890, 2003, p. 556-562.

TY - JOUR

T1 - Automatic evaluation of quality of an explanatory dictionary by comparison of word senses

AU - Gelbukh, Alexander

AU - Sidorov, Grigori

AU - Han, Sang Yong

AU - Chanona-Hernandez, Liliana

PY - 2003

Y1 - 2003

N2 - Words in an explanatory dictionary have different meanings (senses), each described by a natural language definition. If the definitions of two senses of the same word are too similar, it is difficult to grasp the difference between them and thus to judge which of the two senses is intended in a particular context, especially when the decision is to be made automatically, as in the task of automatic word sense disambiguation. We suggest a method for formally evaluating this aspect of the quality of an explanatory dictionary by calculating the similarity of different senses of the same word. We calculate the similarity between two given senses as the relative number of equal or synonymous words in their definitions. In addition to the general assessment of the dictionary, individual suspicious definitions are reported for possible improvement. In our experiments we used the Anaya explanatory dictionary of Spanish. The experiments show that about 10% of the definitions in this dictionary are substantially similar, which indicates rather low quality.

AB - Words in an explanatory dictionary have different meanings (senses), each described by a natural language definition. If the definitions of two senses of the same word are too similar, it is difficult to grasp the difference between them and thus to judge which of the two senses is intended in a particular context, especially when the decision is to be made automatically, as in the task of automatic word sense disambiguation. We suggest a method for formally evaluating this aspect of the quality of an explanatory dictionary by calculating the similarity of different senses of the same word. We calculate the similarity between two given senses as the relative number of equal or synonymous words in their definitions. In addition to the general assessment of the dictionary, individual suspicious definitions are reported for possible improvement. In our experiments we used the Anaya explanatory dictionary of Spanish. The experiments show that about 10% of the definitions in this dictionary are substantially similar, which indicates rather low quality.

UR - http://www.scopus.com/inward/record.url?scp=35248882583&partnerID=8YFLogxK

M3 - Review article

AN - SCOPUS:35248882583

SN - 0302-9743

VL - 2890

SP - 556

EP - 562

JO - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

JF - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -
