Lazy query enrichment: A method for indexing large specialized document bases with morphology and concept hierarchy

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

3 Scopus citations

Abstract

A full-text information retrieval system has to deal with various phenomena of string equivalence: case-insensitive matching, morphological inflection, derivation, synonymy, and hyponymy or hyperonymy. Technically, this can be handled either at indexing time, by reducing equivalent strings to a common form, or at query-processing time, by enriching the query with the whole set of equivalent forms. We argue that the latter approach allows for greater flexibility and easier maintenance, while being more affordable than is usually assumed. Our proposal is to enrich the query only with those forms that actually appear in the document base. Our experiments with a thesaurus-based information retrieval system showed only an insignificant increase in average query size with a 200-megabyte document base, even with the highly inflective Spanish language.
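
The core idea of the abstract, enriching a query only with equivalent forms that occur in the document base, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy documents, and the hand-written `equivalents` table (standing in for a morphological generator and thesaurus) are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Index exact (lowercased) surface forms; nothing is normalized at index time."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def enrich_query(term, equivalents, index):
    """Expand a term to its equivalent forms, keeping only forms present in the index."""
    term = term.lower()
    candidates = {term} | {form.lower() for form in equivalents.get(term, ())}
    # The "lazy" filter: forms that never occur in the document base are dropped.
    return sorted(form for form in candidates if form in index)


def search(term, equivalents, index):
    """Return the ids of documents matching any surviving form of the term."""
    hits = set()
    for form in enrich_query(term, equivalents, index):
        hits |= index[form]
    return hits


# Hypothetical toy document base (assumption, not data from the paper).
docs = {
    1: "El gato come pescado",
    2: "Los gatos comen",
    3: "Un perro ladra",
}

# Hypothetical equivalence table mixing inflected, derived, and synonymous forms.
equivalents = {
    "gato": ["gatos", "gata", "gatas", "gatito", "felino"],
}

index = build_index(docs)
enriched = enrich_query("gato", equivalents, index)
results = search("gato", equivalents, index)
```

In this sketch the six candidate forms of "gato" shrink to the two that occur in the toy document base, so the enriched query stays small even though the equivalence table is large.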

Original language: English
Title of host publication: Database and Expert Systems Applications - 11th International Conference, DEXA 2000, Proceedings
Editors: Mohamed Ibrahim, Josef Kung, Norman Revell
Publisher: Springer Verlag
Pages: 526-535
Number of pages: 10
ISBN (Print): 9783540679783
State: Published - 2000
Event: 11th International Conference on Database and Expert Systems Applications, DEXA 2000 - London, United Kingdom
Duration: 4 Sep 2000 to 8 Sep 2000

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1873
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th International Conference on Database and Expert Systems Applications, DEXA 2000
Country/Territory: United Kingdom
City: London
Period: 4/09/00 to 8/09/00
