Compression of Boolean inverted files by document ordering

Research output: Chapter in book/report/conference proceeding, conference contribution, peer-reviewed

2 Citations (Scopus)

Abstract

Boolean queries are used to search a document collection for the documents that contain specific terms, independently of the frequency of a term in the document. To perform such queries, a search engine maintains an inverted file, which lists for each keyword the documents containing it. The size of such a file is comparable with that of the document collection, which is a considerable storage overhead. We show how the inverted file can be compressed by ordering the documents in the collection in a specific way. Finding the near-optimal order can be recast as a Hamming-distance traveling salesman problem.
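The effect of document order on inverted-file size can be illustrated with a small sketch. The abstract does not say which posting-list encoding or which tour heuristic the paper uses, so everything here is an assumption made for illustration: postings are stored as Elias-gamma-coded gaps, the Hamming distance between two documents is taken as the number of terms that appear in exactly one of them, and the tour is built with a greedy nearest-neighbor heuristic. The function names and the toy collection are hypothetical.

```python
def build_inverted_file(docs):
    """Map each term to the increasing list of document positions containing it."""
    index = {}
    for doc_id, terms in enumerate(docs):
        for term in terms:
            index.setdefault(term, []).append(doc_id)
    return index


def gamma_gap_bits(index):
    """Total bits needed if every posting list is stored as Elias-gamma-coded gaps.

    Assumption: this encoding is chosen for illustration only.
    An Elias-gamma code for n >= 1 takes 2 * floor(log2(n)) + 1 bits.
    """
    total = 0
    for postings in index.values():
        previous = -1  # so the first gap is position + 1, which is >= 1
        for doc_id in postings:
            gap = doc_id - previous
            total += 2 * (gap.bit_length() - 1) + 1
            previous = doc_id
    return total


def hamming(a, b):
    """Hamming distance between two documents viewed as term-incidence vectors."""
    return len(a ^ b)  # terms present in exactly one of the two documents


def greedy_order(docs, start=0):
    """Nearest-neighbor tour: repeatedly append the closest unvisited document.

    Assumption: a simple heuristic stand-in, not necessarily the paper's method.
    Ties are broken by the lower document number to keep the result deterministic.
    """
    remaining = set(range(len(docs))) - {start}
    order = [start]
    while remaining:
        last = docs[order[-1]]
        nearest = min(remaining, key=lambda i: (hamming(last, docs[i]), i))
        order.append(nearest)
        remaining.remove(nearest)
    return order


# Hypothetical toy collection: each document is its set of distinct terms.
docs = [
    {"boolean", "query"},
    {"salesman", "tour"},
    {"boolean", "query"},
    {"salesman", "tour"},
]
order = greedy_order(docs)
reordered = [docs[i] for i in order]
bits_before = gamma_gap_bits(build_inverted_file(docs))
bits_after = gamma_gap_bits(build_inverted_file(reordered))
print(order, bits_before, bits_after)
```

On this toy collection the greedy tour places identical documents next to each other (order `[0, 2, 1, 3]`), and the gap-coded size drops from 20 bits to 12 bits. Similar neighbors make the gaps within each posting list small, which is why a short Hamming-distance tour tends to yield a smaller file.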

Original language: English
Title of host publication: NLP-KE 2003 - 2003 International Conference on Natural Language Processing and Knowledge Engineering, Proceedings
Editors: Chengqing Zong
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 244-249
Number of pages: 6
ISBN (electronic): 0780379020, 9780780379022
Status: Published - 2003
Event: International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2003 - Beijing, China
Duration: 26 Oct 2003 to 29 Oct 2003

Publication series

Name: NLP-KE 2003 - 2003 International Conference on Natural Language Processing and Knowledge Engineering, Proceedings

Conference

Conference: International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2003
Country/Territory: China
City: Beijing
Period: 26/10/03 to 29/10/03
