Lazy query enrichment: A method for indexing large specialized document bases with morphology and concept hierarchy

Alexander F. Gelbukh

doi:10.1007/3-540-44469-6_49

Centro de Investigación en Computación (CIC)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

A full-text information retrieval system has to deal with various phenomena of string equivalence: case-insensitive matching, morphological inflection, derivation, synonymy, and hyponymy or hyperonymy. Technically, this can be handled either at indexing time, by reducing equivalent strings to a common form, or at query processing time, by enriching the query with the whole set of equivalent forms. We argue that the latter approach allows for greater flexibility and easier maintenance, while being more affordable than is usually assumed. Our proposal consists in enriching the query only with those forms that actually appear in the document base. Our experiments with a thesaurus-based information retrieval system showed only an insignificant increase in query size on average with a 200-megabyte document base, even for the highly inflective Spanish language.
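
The core idea of the abstract, enriching a query only with equivalent forms that occur in the document base, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names (`build_index`, `enrich`, `search`), the toy documents, and the hand-written `equivalents` table are all assumptions made for illustration. In a real system the equivalent forms would come from a morphological generator and a thesaurus.

```python
from collections import defaultdict


def build_index(docs):
    """Index literal lowercased word forms; nothing is stemmed at indexing time."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token.strip(".,;:!?")].add(doc_id)
    return index


def enrich(term, equivalents, index):
    """Expand a query term, keeping only forms present in the index vocabulary."""
    candidates = {term} | equivalents.get(term, set())
    return {form for form in candidates if form in index}


def search(term, equivalents, index):
    """Return the ids of documents containing any retained form of the term."""
    result = set()
    for form in enrich(term, equivalents, index):
        result |= index[form]
    return result


# Toy Spanish document base (hypothetical data).
docs = {
    1: "El gato duerme.",
    2: "Los gatos comen pescado.",
    3: "Un felino grande.",
}

# Hypothetical equivalence table standing in for morphology plus thesaurus:
# inflected forms, a derived form, and a hypernym of "gato".
equivalents = {"gato": {"gatos", "gata", "gatas", "gatito", "felino", "felinos"}}

index = build_index(docs)
expanded = enrich("gato", equivalents, index)
hits = search("gato", equivalents, index)
print(sorted(expanded))
print(sorted(hits))
```

Of the seven candidate forms, only the three that occur in the toy document base survive, which is why the enriched query can stay small even when the full set of equivalent forms is large.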

Original language	English
Title of host publication	Database and Expert Systems Applications - 11th International Conference, DEXA 2000, Proceedings
Editors	Mohamed Ibrahim, Josef Kung, Norman Revell
Publisher	Springer Verlag
Pages	526-535
Number of pages	10
ISBN (Print)	9783540679783
DOIs	https://doi.org/10.1007/3-540-44469-6_49
State	Published - 2000
Event	11th International Conference on Database and Expert Systems Applications, DEXA 2000 - London, United Kingdom
Duration	4 Sep 2000 → 8 Sep 2000

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	1873
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	11th International Conference on Database and Expert Systems Applications, DEXA 2000
Country/Territory	United Kingdom
City	London
Period	4/09/00 → 8/09/00

Access to Document

10.1007/3-540-44469-6_49

Cite this

Gelbukh, A. F. (2000). Lazy query enrichment: A method for indexing large specialized document bases with morphology and concept hierarchy. In M. Ibrahim, J. Kung, & N. Revell (Eds.), Database and Expert Systems Applications - 11th International Conference, DEXA 2000, Proceedings (pp. 526-535). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 1873). Springer Verlag. https://doi.org/10.1007/3-540-44469-6_49

Gelbukh, Alexander F. / Lazy query enrichment : A method for indexing large specialized document bases with morphology and concept hierarchy. Database and Expert Systems Applications - 11th International Conference, DEXA 2000, Proceedings. editor / Mohamed Ibrahim ; Josef Kung ; Norman Revell. Springer Verlag, 2000. pp. 526-535 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{93d6102aff754248b7452c309aa9ec9f,

title = "Lazy query enrichment: A method for indexing large specialized document bases with morphology and concept hierarchy",

abstract = "A full-text information retrieval system has to deal with various phenomena of string equivalence: case-insensitive matching, morphological inflection, derivation, synonymy, and hyponymy or hyperonymy. Technically, this can be handled either at indexing time, by reducing equivalent strings to a common form, or at query processing time, by enriching the query with the whole set of equivalent forms. We argue that the latter approach allows for greater flexibility and easier maintenance, while being more affordable than is usually assumed. Our proposal consists in enriching the query only with those forms that actually appear in the document base. Our experiments with a thesaurus-based information retrieval system showed only an insignificant increase in query size on average with a 200-megabyte document base, even for the highly inflective Spanish language.",

author = "Gelbukh, {Alexander F.}",

note = "Publisher Copyright: {\textcopyright} Springer-Verlag Berlin Heidelberg 2000.; 11th International Conference on Database and Expert Systems Applications, DEXA 2000 ; Conference date: 04-09-2000 Through 08-09-2000",

year = "2000",

doi = "10.1007/3-540-44469-6_49",

language = "English",

isbn = "9783540679783",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Verlag",

pages = "526--535",

editor = "Mohamed Ibrahim and Josef Kung and Norman Revell",

booktitle = "Database and Expert Systems Applications - 11th International Conference, DEXA 2000, Proceedings",

address = "Germany",

}

Gelbukh, AF 2000, Lazy query enrichment: A method for indexing large specialized document bases with morphology and concept hierarchy. in M Ibrahim, J Kung & N Revell (eds), Database and Expert Systems Applications - 11th International Conference, DEXA 2000, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 1873, Springer Verlag, pp. 526-535, 11th International Conference on Database and Expert Systems Applications, DEXA 2000, London, United Kingdom, 4/09/00. https://doi.org/10.1007/3-540-44469-6_49

Lazy query enrichment: A method for indexing large specialized document bases with morphology and concept hierarchy. / Gelbukh, Alexander F.
Database and Expert Systems Applications - 11th International Conference, DEXA 2000, Proceedings. ed. / Mohamed Ibrahim; Josef Kung; Norman Revell. Springer Verlag, 2000. p. 526-535 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 1873).

TY - GEN

T1 - Lazy query enrichment

T2 - 11th International Conference on Database and Expert Systems Applications, DEXA 2000

AU - Gelbukh, Alexander F.

N1 - Publisher Copyright: © Springer-Verlag Berlin Heidelberg 2000.

PY - 2000

Y1 - 2000

N2 - A full-text information retrieval system has to deal with various phenomena of string equivalence: case-insensitive matching, morphological inflection, derivation, synonymy, and hyponymy or hyperonymy. Technically, this can be handled either at indexing time, by reducing equivalent strings to a common form, or at query processing time, by enriching the query with the whole set of equivalent forms. We argue that the latter approach allows for greater flexibility and easier maintenance, while being more affordable than is usually assumed. Our proposal consists in enriching the query only with those forms that actually appear in the document base. Our experiments with a thesaurus-based information retrieval system showed only an insignificant increase in query size on average with a 200-megabyte document base, even for the highly inflective Spanish language.

AB - A full-text information retrieval system has to deal with various phenomena of string equivalence: case-insensitive matching, morphological inflection, derivation, synonymy, and hyponymy or hyperonymy. Technically, this can be handled either at indexing time, by reducing equivalent strings to a common form, or at query processing time, by enriching the query with the whole set of equivalent forms. We argue that the latter approach allows for greater flexibility and easier maintenance, while being more affordable than is usually assumed. Our proposal consists in enriching the query only with those forms that actually appear in the document base. Our experiments with a thesaurus-based information retrieval system showed only an insignificant increase in query size on average with a 200-megabyte document base, even for the highly inflective Spanish language.

UR - http://www.scopus.com/inward/record.url?scp=77954228036&partnerID=8YFLogxK

U2 - 10.1007/3-540-44469-6_49

DO - 10.1007/3-540-44469-6_49

M3 - Conference contribution

SN - 9783540679783

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 526

EP - 535

BT - Database and Expert Systems Applications - 11th International Conference, DEXA 2000, Proceedings

A2 - Ibrahim, Mohamed

A2 - Kung, Josef

A2 - Revell, Norman

PB - Springer Verlag

Y2 - 4 September 2000 through 8 September 2000

ER -

Gelbukh AF. Lazy query enrichment: A method for indexing large specialized document bases with morphology and concept hierarchy. In Ibrahim M, Kung J, Revell N, editors, Database and Expert Systems Applications - 11th International Conference, DEXA 2000, Proceedings. Springer Verlag. 2000. p. 526-535. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/3-540-44469-6_49
