Distribution-based semantic similarity of nouns

Igor A. Bolshakov, Alexander Gelbukh

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

1 Scopus citations

Abstract

In our previous work we proposed two methods for evaluating the semantic similarity or dissimilarity of nouns, based on their modifier sets as registered in the Oxford Collocations Dictionary for Students of English. In this paper we provide further experimental support for these methods and discuss them in more detail. Given two nouns, the first method measures their similarity by the relative size of the intersection of the two modifier sets, that is, by the share of modifiers applicable to both nouns. The second method measures their dissimilarity by the difference between two mean cohesion values: the mean cohesion between a noun and its own modifiers, and the mean cohesion between the same noun and the modifiers of the other noun. Here, the cohesion between words is measured via Web statistics on word co-occurrences. The two proposed measures prove to be approximately inversely related. Our experiments show that Web-based weighting (the second method) gives better results.
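The two measures described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so several details here are assumptions made for illustration only: the intersection is normalized by the geometric mean of the two set sizes, the cohesion function is a hypothetical lookup table standing in for Web co-occurrence statistics, and the modifier sets and scores are invented toy data, not values from the dictionary or from the paper.

```python
from math import sqrt
from typing import Callable, Dict, Set, Tuple


def intersection_similarity(mods_a: Set[str], mods_b: Set[str]) -> float:
    """First method: relative size of the shared modifier set.

    Assumption: normalize by the geometric mean of the two set sizes;
    the abstract only says "relative size of the intersection".
    """
    if not mods_a or not mods_b:
        return 0.0
    return len(mods_a & mods_b) / sqrt(len(mods_a) * len(mods_b))


def mean_cohesion(noun: str, modifiers: Set[str],
                  cohesion: Callable[[str, str], float]) -> float:
    """Mean cohesion between a noun and every modifier in a set."""
    return sum(cohesion(m, noun) for m in modifiers) / len(modifiers)


def cohesion_dissimilarity(noun: str, own_mods: Set[str], other_mods: Set[str],
                           cohesion: Callable[[str, str], float]) -> float:
    """Second method: mean cohesion with the noun's own modifiers minus
    mean cohesion with the other noun's modifiers."""
    return (mean_cohesion(noun, own_mods, cohesion)
            - mean_cohesion(noun, other_mods, cohesion))


# Toy data (hypothetical): modifier sets for two nouns.
TEA_MODS = {"strong", "hot", "green", "herbal"}
COFFEE_MODS = {"strong", "hot", "black", "instant"}

# Hypothetical cohesion scores standing in for Web co-occurrence statistics.
TOY_SCORES: Dict[Tuple[str, str], float] = {
    ("strong", "tea"): 0.8, ("hot", "tea"): 0.9, ("green", "tea"): 1.0,
    ("herbal", "tea"): 0.9, ("black", "tea"): 0.6, ("instant", "tea"): 0.1,
}


def toy_cohesion(modifier: str, noun: str) -> float:
    """Look up a toy score; unseen pairs count as zero cohesion."""
    return TOY_SCORES.get((modifier, noun), 0.0)


similarity = intersection_similarity(TEA_MODS, COFFEE_MODS)
dissimilarity = cohesion_dissimilarity("tea", TEA_MODS, COFFEE_MODS, toy_cohesion)
```

In this toy setting the two nouns share two of four modifiers each, so `similarity` is 0.5, while `dissimilarity` is positive because "tea" coheres more strongly with its own modifiers than with those of "coffee". A real experiment would replace `toy_cohesion` with a score derived from Web hit counts.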

Original language: English
Title of host publication: Progress in Pattern Recognition, Image Analysis and Applications - 12th Iberoamerican Congress on Pattern Recognition, CIARP 2007, Proceedings
Pages: 704-713
Number of pages: 10
State: Published - 2007
Event: 12th Iberoamerican Congress on Pattern Recognition, CIARP 2007 - Vina del Mar-Valparaiso, Chile
Duration: 13 Nov 2007 to 16 Nov 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4756 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 12th Iberoamerican Congress on Pattern Recognition, CIARP 2007
Country/Territory: Chile
City: Vina del Mar-Valparaiso
Period: 13/11/07 to 16/11/07

Keywords

  • Lexical resources
  • Natural language processing
  • Semantic relatedness
  • Web as corpus
  • Word space model
