TY - JOUR
T1 - Inferences for enrichment of collocation databases by means of semantic relations
AU - Gelbukh, Alexander
N1 - Publisher Copyright:
© 2018 Instituto Politecnico Nacional. All rights reserved.
PY - 2018
Y1 - 2018
N2 - A text consists of words that are syntactically linked and semantically combinable, such as “political party,” “pay attention,” or “stone cold.” Such semantically plausible combinations of two content words, which we hereafter refer to as collocations, are important knowledge in many areas of computational linguistics. We present the structure of a lexical resource that provides such knowledge: a collocation database (CDB). Since such databases cannot be complete under any reasonable compilation procedure, we consider heuristic-based inference mechanisms that predict new plausible collocations from those present in the CDB, with the help of a WordNet-like thesaurus: if an available collocation combines the entries A and B, and B is ‘similar’ to C, then A and C are assumed to constitute a collocation of the same category. We also describe the semantically induced morphological categories suitable for such inference, as well as the heuristics for filtering out wrong hypotheses. We discuss our experience with inferences obtained using the CrossLexica CDB.
AB - A text consists of words that are syntactically linked and semantically combinable, such as “political party,” “pay attention,” or “stone cold.” Such semantically plausible combinations of two content words, which we hereafter refer to as collocations, are important knowledge in many areas of computational linguistics. We present the structure of a lexical resource that provides such knowledge: a collocation database (CDB). Since such databases cannot be complete under any reasonable compilation procedure, we consider heuristic-based inference mechanisms that predict new plausible collocations from those present in the CDB, with the help of a WordNet-like thesaurus: if an available collocation combines the entries A and B, and B is ‘similar’ to C, then A and C are assumed to constitute a collocation of the same category. We also describe the semantically induced morphological categories suitable for such inference, as well as the heuristics for filtering out wrong hypotheses. We discuss our experience with inferences obtained using the CrossLexica CDB.
KW - Collocations
KW - Enrichment
KW - Hypernyms
KW - Inference rules
KW - Meronyms
KW - Synonyms
UR - http://www.scopus.com/inward/record.url?scp=85045945219&partnerID=8YFLogxK
U2 - 10.13053/CyS-22-1-2923
DO - 10.13053/CyS-22-1-2923
M3 - Article
SN - 1405-5546
VL - 22
SP - 103
EP - 117
JO - Computacion y Sistemas
JF - Computacion y Sistemas
IS - 1
ER -