Word2set: WordNet-Based Word Representation Rivaling Neural Word Embedding for Lexical Similarity and Sentiment Analysis

Sergio Jimenez, Fabio A. Gonzalez, Alexander Gelbukh, George Duenas

Research output: Contribution to journal › Article › peer-review

15 Scopus citations

Abstract

Measuring lexical similarity using WordNet has a long tradition. In the last decade, it has been challenged by distributional methods, and more recently by neural word embedding. In recent years, several larger lexical similarity benchmarks have been introduced, on which word embedding has achieved state-of-the-art results. The success of such methods has eclipsed the use of WordNet for predicting human judgments of lexical similarity. We propose a new set cardinality-based method for measuring lexical similarity, which exploits the WordNet graph, obtaining a word representation, which we called word2set, based on related neighboring words. We show that the features extracted from set cardinalities computed using this word representation, when fed into a support vector regression classifier trained on a dataset of common synonyms and antonyms, produce results competitive with those of word-embedding approaches. On the task of predicting the lexical sentiment polarity, our WordNet set-based representation significantly outperforms the classical measures and achieves the performance of neural embeddings. Although word embedding is still the best approach for these tasks, our method significantly reduces the gap between the results shown by knowledge-based approaches and by distributional representations, without requiring a large training corpus. It is also more effective for less-frequent words.
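The idea of representing each word as a set of neighboring words and deriving features from set cardinalities can be illustrated with a minimal sketch. This is not the authors' implementation. The toy graph, the `hops` parameter, the function names `word_to_set` and `cardinality_features`, and the particular choice of features are all assumptions made for illustration. A real system would build the sets from WordNet relations and pass the features to a regressor such as support vector regression.

```python
from typing import Dict, List, Set

# Hypothetical toy relation graph standing in for WordNet (not real WordNet data).
TOY_GRAPH: Dict[str, Set[str]] = {
    "car": {"automobile", "vehicle", "wheel"},
    "automobile": {"car", "vehicle", "engine"},
    "vehicle": {"car", "automobile", "truck"},
    "truck": {"vehicle", "wheel"},
    "banana": {"fruit", "yellow"},
}


def word_to_set(word: str, graph: Dict[str, Set[str]], hops: int = 1) -> Set[str]:
    """Return the word plus every word reachable within `hops` edges."""
    seen = {word}
    frontier = {word}
    for _ in range(hops):
        # Expand one step and keep only words not visited yet.
        frontier = {n for w in frontier for n in graph.get(w, set())} - seen
        seen |= frontier
    return seen


def cardinality_features(a: Set[str], b: Set[str]) -> List[float]:
    """Assumed feature vector: |A|, |B|, |A ∩ B|, |A ∪ B|, Jaccard index."""
    inter = len(a & b)
    union = len(a | b)
    jaccard = inter / union if union else 0.0
    return [float(len(a)), float(len(b)), float(inter), float(union), jaccard]


if __name__ == "__main__":
    car = word_to_set("car", TOY_GRAPH)
    automobile = word_to_set("automobile", TOY_GRAPH)
    banana = word_to_set("banana", TOY_GRAPH)
    print(cardinality_features(car, automobile))
    print(cardinality_features(car, banana))
```

In this sketch, related words share many neighbors and so yield a large intersection relative to the union, while unrelated words yield an empty intersection. A trained regressor would learn how to weight such cardinalities against human similarity judgments.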

Original language: English
Article number: 8686355
Pages (from-to): 41-53
Number of pages: 13
Journal: IEEE Computational Intelligence Magazine
Volume: 14
Issue number: 2
State: Published - May 2019
