Integrated concept blending with vector space models

Producción científica: Contribución a una revistaArtículorevisión exhaustiva

21 Citas (Scopus)

Resumen

Traditional concept retrieval is based on usual word definition dictionaries with simple performance: they just map words to their definitions. This approach is mostly helpful for readers and language students, but writers sometimes need to find a word that encompasses a set of ideas that they have in mind. For this task, inverse dictionaries are ready to help; however, in some cases a sought word does not correspond to a single definition but to a composite meaning of several concepts. A language producer then tends to require a concept search that starts with a group of words or a series of related terms, looking for a target word. This paper aims to assist on this task by presenting a new approach for concept blending through the development of a search-by-concept method based on vector space representation using semantic analysis and statistical natural language processing techniques. Words are represented as numeric vectors based on different semantic similarity measures and probabilistic measures; the semantic properties of a word are captured in the vector elements determined by a given linguistic context. Three different sources are used as context for word vector construction: WordNet, a distributional thesaurus, and the Latent Dirichlet Allocation algorithm; each source is used for building a different semantic vector space. The concept-blender input is then conformed by a set of n-nouns. All input members are read and substituted by their corresponding vectors. Then, a semantic space analysis including a filtering and ranking process is carried out to deploy a list of target words. A test set of 50 concepts was created in order to evaluate the system's performance. A group of 30 evaluators found our integrated concept blending model to provide better results for finding an adequate word for the provided set of concepts.

Idioma originalInglés
Páginas (desde-hasta)79-96
Número de páginas18
PublicaciónComputer Speech and Language
Volumen40
DOI
EstadoPublicada - nov. 2016

Huella

Profundice en los temas de investigación de 'Integrated concept blending with vector space models'. En conjunto forman una huella única.

Citar esto