Advanced clustering technique for medical data using semantic information

Kwangcheol Shin, Sang Yong Han, Alexander Gelbukh

Research output: Contribution to journal, Conference article, peer-reviewed

1 Citation (Scopus)

Abstract

MEDLINE is a representative collection of medical documents, each supplied with its original full-text natural-language abstract and with representative keywords (called MeSH-terms) that expert annotators select manually from a predefined ontology and structure according to their relation to the document. We show how these structured, manually assigned semantic descriptions can be combined with the original full-text abstracts to improve the quality of clustering the documents into a small number of clusters. As baselines, we use clustering with only abstracts and with only MeSH-terms. Our experiments show 36% to 47% higher cluster coherence, as well as more refined keywords for the produced clusters.

Original language: English
Pages (from-to): 322-331
Number of pages: 10
Publication: Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)
Volume: 2972
Status: Published - 2004
Event: Third Mexican International Conference on Artificial Intelligence - Mexico City, Mexico
Duration: 26 Apr 2004 to 30 Apr 2004
