Integrated concept blending with vector space models

Research output: Contribution to journal › Article › peer-review

21 Scopus citations

Abstract

Traditional concept retrieval relies on conventional dictionaries, which simply map words to their definitions. This approach mostly helps readers and language students, but writers sometimes need to find a word that encompasses a set of ideas they have in mind. Inverse dictionaries address this task; however, in some cases the sought word corresponds not to a single definition but to a composite meaning of several concepts. A language producer then needs a concept search that starts from a group of words or a series of related terms and looks for a target word. This paper aims to assist with this task by presenting a new approach to concept blending: a search-by-concept method based on vector space representation, using semantic analysis and statistical natural language processing techniques. Words are represented as numeric vectors based on different semantic similarity measures and probabilistic measures; the semantic properties of a word are captured in the vector elements, which are determined by a given linguistic context. Three different sources are used as context for word vector construction: WordNet, a distributional thesaurus, and the Latent Dirichlet Allocation algorithm; each source is used to build a different semantic vector space. The concept-blender input consists of a set of n nouns. All input members are read and replaced by their corresponding vectors. Then, a semantic space analysis, including a filtering and ranking process, is carried out to produce a list of target words. A test set of 50 concepts was created to evaluate the system's performance. A group of 30 evaluators found that our integrated concept blending model provided better results for finding an adequate word for the provided set of concepts.
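
The pipeline described in the abstract (replace each input noun by its vector, analyze the semantic space, then filter and rank candidates) can be illustrated with a minimal Python sketch. The abstract does not specify how the vectors are combined or ranked, so everything below is an assumption made for illustration only: the input vectors are averaged into a centroid, candidates are ranked by cosine similarity, and the only filter applied is dropping the input nouns themselves. The `TOY_VECTORS` table and the `blend` function are hypothetical; in the paper the vectors would come from WordNet, a distributional thesaurus, or Latent Dirichlet Allocation.

```python
import math

# Hypothetical 3-dimensional toy vectors, invented for this sketch.
# Real vectors would be built from one of the three context sources.
TOY_VECTORS = {
    "water":   [0.9, 0.1, 0.0],
    "vehicle": [0.1, 0.9, 0.1],
    "boat":    [0.7, 0.7, 0.0],
    "car":     [0.0, 0.9, 0.2],
    "river":   [0.9, 0.0, 0.1],
    "bread":   [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def blend(nouns, vectors, top_k=3):
    """Return up to top_k (word, score) pairs closest to the input nouns.

    Assumed strategy, not the paper's: average the input vectors,
    filter out the input nouns, rank the rest by cosine similarity.
    """
    known = [vectors[n] for n in nouns if n in vectors]
    if not known:
        return []
    # Component-wise mean of the input vectors.
    centroid = [sum(column) / len(known) for column in zip(*known)]
    candidates = [
        (word, cosine(centroid, vec))
        for word, vec in vectors.items()
        if word not in nouns  # filtering step: exclude the query words
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)  # ranking step
    return candidates[:top_k]


results = blend(["water", "vehicle"], TOY_VECTORS)
```

With these toy vectors, the query `["water", "vehicle"]` ranks `boat` first, since its vector lies closest to the centroid of the two inputs. Averaging is only one possible blending strategy; a different filter or ranking function could be substituted without changing the overall shape of the sketch.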

Original language: English
Pages (from-to): 79-96
Number of pages: 18
Journal: Computer Speech and Language
Volume: 40
State: Published - Nov 2016

Keywords

  • Computational linguistics
  • Concept-blending
  • Lexicography
  • Natural language processing
  • Reverse lookup dictionaries
  • Vector space models
