A method of describing document contents through topic selection

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

14 Citations (Scopus)

Abstract

Given a large hierarchical dictionary of concepts, this paper considers the task of selecting the concepts that describe the contents of a given document. The main difficulty lies in the proper handling of the top-level concepts in the hierarchy. The document is represented as a histogram of topics, each weighted by its contribution to the document. A topic's contribution is determined by comparing the document with the «ideal» document for that topic in the dictionary. The «ideal» document for a concept is one that contains only the keywords belonging to that concept, in proportion to their occurrences in the training corpus. A fast comparison algorithm is proposed for some types of metrics. The application of the method in a classifier system is discussed.
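The representation described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not name the metric, so cosine similarity is assumed here, the hierarchy is flattened to a single level, and the dictionary `TOPIC_KEYWORDS`, its counts, and the function names are all hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy dictionary: topic -> {keyword: occurrences in a training corpus}.
# A real dictionary would be hierarchical; this sketch uses one flat level.
TOPIC_KEYWORDS = {
    "animals": {"dog": 30, "cat": 20, "horse": 10},
    "vehicles": {"car": 40, "truck": 10},
}


def ideal_document(keyword_counts):
    """Keyword proportions for one topic: only its keywords, scaled by corpus counts."""
    total = sum(keyword_counts.values())
    return {word: count / total for word, count in keyword_counts.items()}


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts (assumed metric)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def topic_histogram(text, topic_keywords=TOPIC_KEYWORDS):
    """Score the document against each topic's ideal document, then normalize to sum 1."""
    doc = Counter(text.lower().split())
    scores = {
        topic: cosine(doc, ideal_document(counts))
        for topic, counts in topic_keywords.items()
    }
    total = sum(scores.values())
    if total == 0:
        return {topic: 0.0 for topic in scores}
    return {topic: score / total for topic, score in scores.items()}


if __name__ == "__main__":
    print(topic_histogram("the dog chased the cat past a car"))
```

In this toy setting the sample sentence scores higher for "animals" than for "vehicles", because two of its words match the first topic's keywords and only one matches the second. The sketch recomputes every comparison from scratch, so it says nothing about the fast comparison algorithm the abstract mentions.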

Original language: English
Title of host publication: String Processing and Information Retrieval Symposium and International Workshop on Groupware, SPIRE 1999 and CRIWG 1999
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 73-80
Number of pages: 8
ISBN (electronic): 0769502687, 9780769502687
Status: Published - 1999
Event: 1999 String Processing and Information Retrieval Symposium and International Workshop on Groupware, SPIRE 1999 and CRIWG 1999 - Cancun, Mexico
Duration: 22 Sep 1999 to 24 Sep 1999

Publication series

Name: String Processing and Information Retrieval Symposium and International Workshop on Groupware, SPIRE 1999 and CRIWG 1999

Conference

Conference: 1999 String Processing and Information Retrieval Symposium and International Workshop on Groupware, SPIRE 1999 and CRIWG 1999
Country/Territory: Mexico
City: Cancun
Period: 22/09/99 to 24/09/99
