Abstract
MEDLINE is a representative collection of medical documents. Each document is supplied with its original full-text natural-language abstract and with representative keywords (called MeSH-terms) that expert annotators select manually from a predefined ontology and structure according to their relation to the document. We show how these structured, manually assigned semantic descriptions can be combined with the original abstracts to improve the quality of clustering the documents into a small number of clusters. As baselines, we use clustering on abstracts only and on MeSH-terms only. Our experiments show 36% to 47% higher cluster coherence, as well as more refined keywords for the produced clusters.
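
The idea of combining the two descriptions can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not state the weighting scheme, the similarity measure, or the clustering algorithm, so the function names `combined_vector` and `cosine`, the `mesh_weight` parameter, and the use of plain bag-of-words with cosine similarity are all assumptions made for illustration.

```python
import math
from collections import Counter


def combined_vector(abstract, mesh_terms, mesh_weight=2.0):
    """Bag-of-words over the abstract plus weighted MeSH-term features.

    Hypothetical helper: mesh_weight is an illustrative knob, not a value
    taken from the paper.
    """
    vec = Counter(abstract.lower().split())
    for term in mesh_terms:
        # The prefix keeps MeSH features distinct from abstract words.
        vec["mesh:" + term.lower()] += mesh_weight
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Two toy documents (invented text) with no words in common
# but one shared MeSH-term.
doc_a = ("aspirin reduces fever", ["Fever"])
doc_b = ("ibuprofen lowers temperature", ["Fever"])

# Abstracts only: the documents look unrelated.
sim_abstract_only = cosine(combined_vector(doc_a[0], []),
                           combined_vector(doc_b[0], []))

# Abstracts plus MeSH-terms: the shared annotation makes them similar.
sim_combined = cosine(combined_vector(*doc_a), combined_vector(*doc_b))
```

In this toy case the abstract-only similarity is zero, while the combined vectors overlap on the shared annotation. Any similarity-based clustering algorithm could then operate on the combined vectors.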
Original language | English |
---|---|
Pages (from-to) | 322-331 |
Number of pages | 10 |
Publication | Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) |
Volume | 2972 |
DOI | |
Status | Published - 2004 |
Event | Third Mexican International Conference on Artificial Intelligence - Mexico City, Mexico. Duration: 26 Apr 2004 → 30 Apr 2004 |