Survey of Word co-occurrence measures for collocation detection

Research output: Contribution to journal, Review article, peer-reviewed

20 Scopus citations

Abstract

This paper presents a detailed survey of word co-occurrence measures used in natural language processing. Word co-occurrence information is vital for accurate computational text treatment: it is important to distinguish words that combine freely with other words from words whose ability to form phrases is restricted. The latter, together with their typical co-occurring companions, are called collocations. To detect collocations, many word co-occurrence measures, also called association measures, are used to identify the high degree of cohesion between words in collocations, as opposed to the low degree of cohesion in free word combinations. We describe these association measures, grouping them into classes according to the approaches and mathematical models used to formalize word co-occurrence.
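As an illustration of what an association measure computes, the sketch below scores adjacent word pairs with pointwise mutual information (PMI), one widely used statistical measure. It is not taken from the paper: the choice of PMI, the function name `pmi_scores`, and the toy sentence are all assumptions made for this example.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=1):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration only. Probabilities are plain
    relative frequencies, with no smoothing.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of adjacent pairs

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # optionally skip rare pairs, where PMI is unreliable
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        # PMI compares the observed pair probability with the probability
        # expected if the two words occurred independently.
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy corpus (made up for this example).
tokens = "strong tea and strong coffee but powerful computer and strong tea".split()
scores = pmi_scores(tokens)

# Print pairs from most to least cohesive according to PMI.
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

A higher score means the pair occurs together more often than independence would predict. PMI is known to overrate pairs of rare words, which is one reason a range of alternative measures exists; the `min_count` threshold here is a simple, assumed way to limit that effect.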

Original language: English
Pages (from-to): 327-344
Number of pages: 18
Journal: Computación y Sistemas
Volume: 20
Issue number: 3
State: Published - 2016

Keywords

  • Association measure
  • Collocation
  • Hybrid approach to model word co-occurrence
  • Rule-based language model
  • Statistical language model
  • Word co-occurrence measure
