Abstract
Collocations are defined as syntactically linked and semantically plausible combinations of content words. Since collocations make up the bulk of ordinary text and are language-dependent, the creation of collocation databases (CBDs) is important. However, manual compilation of such databases is prohibitively expensive. We present heuristics for the automatic generation of new Spanish collocations from those already present in a CBD, with the help of a WordNet-like thesaurus: if a word A is semantically "similar" to a word B and a collocation B + C is known, then A + C is presumably a collocation of the same type, provided certain conditions are met.
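The inference rule stated in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the triple representation of a collocation, the `similar` mapping standing in for a WordNet-like thesaurus, the `accept` predicate standing in for the unspecified "certain conditions", and the toy Spanish data. It is not the authors' implementation.

```python
from typing import Callable, Dict, Set, Tuple

# Hypothetical representation: (word, collocation type, collocate).
Collocation = Tuple[str, str, str]


def infer_collocations(
    known: Set[Collocation],
    similar: Dict[str, Set[str]],
    accept: Callable[[str, Collocation], bool] = lambda a, source: True,
) -> Set[Collocation]:
    """Propose A + C for every known B + C where A is listed as similar to B.

    `similar` maps a word B to the words A that the thesaurus treats as
    similar to it. `accept` is a placeholder for the extra conditions the
    abstract mentions but does not spell out; by default it accepts all.
    """
    inferred: Set[Collocation] = set()
    for b, kind, c in known:
        for a in similar.get(b, set()):
            candidate = (a, kind, c)
            # Keep the collocation type of the source; skip known entries.
            if candidate not in known and accept(a, (b, kind, c)):
                inferred.add(candidate)
    return inferred


# Toy data (hypothetical): "beber vino" is known, "cerveza" is similar to "vino".
known = {("vino", "verb_object", "beber")}
similar = {"vino": {"cerveza"}}
candidates = infer_collocations(known, similar)
```

With this toy input, `candidates` holds the single proposed collocation `("cerveza", "verb_object", "beber")`. Passing a stricter `accept` function is where any filtering conditions would be plugged in.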
Original language | English |
---|---|
Title of host publication | Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) |
Pages | 25-32 |
Number of pages | 8 |
Volume | 2389 |
State | Published - 2002 |
Externally published | Yes |