Computing text similarity using Tree Edit Distance

Grigori Sidorov, Helena Gomez-Adorno, Ilia Markov, David Pinto, Nahun Loya

Research output: Chapter in book/report/conference proceeding; Conference contribution; Peer-reviewed

25 Citations (Scopus)

Abstract

In this paper, we propose applying the Tree Edit Distance (TED) to calculate the similarity between syntactic n-grams, with the further goal of detecting soft similarity between texts. Computing text similarity is a basic task in many natural language processing problems, and it remains an open research field. Syntactic n-grams are text features extracted from dependency trees and used to construct a Vector Space Model. Soft similarity is the application of the Vector Space Model in a way that takes the similarity between features into account. First, we discuss the advantages of applying the TED to syntactic n-grams. Then, we present a procedure based on the TED and syntactic n-grams for calculating soft similarity between texts.
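
The abstract does not give formulas, so the following Python sketch is only an illustration of the idea, not the authors' procedure. It assumes that syntactic n-grams are small ordered labeled trees, that the TED uses unit costs for insertion, deletion and relabeling, that the distance is turned into a similarity by the hypothetical normalization 1 - TED / (|A| + |B|), and that soft similarity is computed as a soft cosine over non-negative feature weights. All function names and the example trees are invented for illustration.

```python
import math
from functools import lru_cache

# A tree is (label, children) where children is a tuple of trees.
# A forest is a tuple of trees. Everything is hashable, so it can be cached.

def size(forest):
    """Number of nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_dist(f, g):
    """Unit-cost edit distance between two ordered forests (simple recursion,
    adequate for small n-gram trees; not an optimized algorithm)."""
    if not f:
        return size(g)
    if not g:
        return size(f)
    v_label, v_children = f[-1]   # rightmost root of f
    w_label, w_children = g[-1]   # rightmost root of g
    delete = forest_dist(f[:-1] + v_children, g) + 1
    insert = forest_dist(f, g[:-1] + w_children) + 1
    match = (forest_dist(v_children, w_children)
             + forest_dist(f[:-1], g[:-1])
             + (v_label != w_label))
    return min(delete, insert, match)

def ted(a, b):
    """Tree edit distance between two trees."""
    return forest_dist((a,), (b,))

def tree_similarity(a, b):
    """Assumed normalization: 1.0 for identical trees, never below 0.0."""
    return 1.0 - ted(a, b) / (size((a,)) + size((b,)))

def soft_cosine(x, y, sim=tree_similarity):
    """Soft cosine between two feature->weight dicts (non-negative weights)."""
    def dot(p, q):
        return sum(wp * wq * sim(fp, fq)
                   for fp, wp in p.items() for fq, wq in q.items())
    denom = math.sqrt(dot(x, x)) * math.sqrt(dot(y, y))
    return dot(x, y) / denom if denom else 0.0

# Two hypothetical syntactic n-grams that differ in one dependent.
t1 = ("saw", (("I", ()), ("dog", ())))
t2 = ("saw", (("I", ()), ("cat", ())))

doc_a = {t1: 1.0}
doc_b = {t2: 1.0}
print(ted(t1, t2))                 # one relabeling is enough
print(soft_cosine(doc_a, doc_b))   # non-zero although no feature is shared
```

With an ordinary cosine measure the two example documents would have similarity 0, because they share no identical feature; the soft variant gives them credit for containing n-grams that are one edit apart.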

Original language: English
Title of host publication: 2015 Annual Meeting of the North American Fuzzy Information Processing Society, NAFIPS 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9781467372473
Status: Published - 29 Sep 2015
Event: Annual Meeting of the North American Fuzzy Information Processing Society, NAFIPS 2015 - Redmond, United States
Duration: 17 Aug 2015 to 19 Aug 2015

Publication series

Name: Annual Conference of the North American Fuzzy Information Processing Society - NAFIPS
Volume: 2015-September

Conference

Conference: Annual Meeting of the North American Fuzzy Information Processing Society, NAFIPS 2015
Country/Territory: United States
City: Redmond
Period: 17/08/15 to 19/08/15
