Computing text similarity using Tree Edit Distance

Grigori Sidorov, Helena Gomez-Adorno, Ilia Markov, David Pinto, Nahun Loya

Research output: Contribution to conference › Paper

16 Citations (Scopus)

Abstract

© 2015 IEEE. In this paper, we propose applying the Tree Edit Distance (TED) to calculate the similarity between syntactic n-grams, with the further aim of detecting soft similarity between texts. Computing text similarity is a basic task in many natural language processing problems, and it remains an open research field. Syntactic n-grams are text features extracted from dependency trees and used to construct a Vector Space Model. Soft similarity is an application of the Vector Space Model that takes the similarity between features into account. First, we discuss the advantages of applying the TED to syntactic n-grams. Then, we present a procedure based on the TED and syntactic n-grams for calculating soft similarity between texts.
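The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure, which is not given on this page. It assumes unit edit costs, a simple memoized recursion over ordered forests (rather than an optimized algorithm such as Zhang-Shasha), a hypothetical normalization of the distance by the combined tree sizes, and the soft cosine measure as the way of "taking into account similarity of features". The three toy trees and all function names are invented for illustration.

```python
from functools import lru_cache
import math

# A tree is (label, children), where children is a tuple of trees.
# A forest is a tuple of trees. Nested tuples are hashable, so they can be cached.

def size(forest):
    """Number of nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Edit distance between two ordered forests with unit costs."""
    if not f:
        return size(g)  # insert everything that remains in g
    if not g:
        return size(f)  # delete everything that remains in f
    (v_label, v_children), f_rest = f[-1], f[:-1]
    (w_label, w_children), g_rest = g[-1], g[:-1]
    return min(
        # delete the rightmost root v; its children take its place
        forest_distance(f_rest + v_children, g) + 1,
        # insert the rightmost root w
        forest_distance(f, g_rest + w_children) + 1,
        # map v to w (relabel if the labels differ)
        forest_distance(v_children, w_children)
        + forest_distance(f_rest, g_rest)
        + (0 if v_label == w_label else 1),
    )

def tree_edit_distance(t1, t2):
    return forest_distance((t1,), (t2,))

def tree_similarity(t1, t2):
    """Assumed normalization: 1 for identical trees, 0 when nothing can be mapped."""
    return 1.0 - tree_edit_distance(t1, t2) / (size((t1,)) + size((t2,)))

def soft_cosine(a, b, sim):
    """Soft cosine: cosine similarity weighted by a feature similarity matrix."""
    def form(x, y):
        return sum(sim[i][j] * x[i] * y[j]
                   for i in range(len(x)) for j in range(len(y)))
    return form(a, b) / math.sqrt(form(a, a) * form(b, b))

# Hypothetical syntactic n-grams, written as small dependency trees.
t1 = ("likes", (("John", ()), ("apples", ())))
t2 = ("likes", (("Mary", ()), ("apples", ())))
t3 = ("eats", (("John", ()),))
features = [t1, t2, t3]

# Feature-to-feature similarity matrix derived from the TED.
S = [[tree_similarity(p, q) for q in features] for p in features]

# Frequency vectors of two toy texts over the three features.
a = [1, 0, 1]
b = [0, 1, 1]

if __name__ == "__main__":
    print("TED(t1, t2) =", tree_edit_distance(t1, t2))
    print("soft cosine =", soft_cosine(a, b, S))
```

With an identity matrix in place of `S`, `soft_cosine` reduces to the ordinary cosine, so the two texts above would only be credited for the one feature they share exactly. The TED-based matrix also gives partial credit for `t1` and `t2`, which differ in a single label.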
Original language: American English
DOI: 10.1109/NAFIPS-WConSC.2015.7284129
Status: Published - 29 Sep 2015
Event: Annual Conference of the North American Fuzzy Information Processing Society - NAFIPS
Duration: 29 Sep 2015 → …

Conference

Conference: Annual Conference of the North American Fuzzy Information Processing Society - NAFIPS
Period: 29/09/15 → …

Fingerprint

Syntactics
Vector spaces
Trees (mathematics)
Processing

Cite this

Sidorov, G., Gomez-Adorno, H., Markov, I., Pinto, D., & Loya, N. (2015). Computing text similarity using Tree Edit Distance. Paper presented at Annual Conference of the North American Fuzzy Information Processing Society - NAFIPS. https://doi.org/10.1109/NAFIPS-WConSC.2015.7284129
@conference{472829f97ce34a16a67ade47d151ee69,
title = "Computing text similarity using Tree Edit Distance",
abstract = "{\textcopyright} 2015 IEEE. In this paper, we propose the application of the Tree Edit Distance (TED) for calculation of similarity between syntactic n-grams for further detection of soft similarity between texts. The computation of text similarity is the basic task for many natural language processing problems, and it is an open research field. Syntactic n-grams are text features for Vector Space Model construction extracted from dependency trees. Soft similarity is application of Vector Space Model taking into account similarity of features. First, we discuss the advantages of the application of the TED to syntactic n-grams. Then, we present a procedure based on the TED and syntactic n-grams for calculating soft similarity between texts.",
author = "Grigori Sidorov and Helena Gomez-Adorno and Ilia Markov and David Pinto and Nahun Loya",
year = "2015",
month = "9",
day = "29",
doi = "10.1109/NAFIPS-WConSC.2015.7284129",
language = "American English",
note = "Annual Conference of the North American Fuzzy Information Processing Society - NAFIPS ; Conference date: 29-09-2015",

}

TY - CONF

T1 - Computing text similarity using Tree Edit Distance

AU - Sidorov, Grigori

AU - Gomez-Adorno, Helena

AU - Markov, Ilia

AU - Pinto, David

AU - Loya, Nahun

PY - 2015/9/29

Y1 - 2015/9/29

AB - © 2015 IEEE. In this paper, we propose the application of the Tree Edit Distance (TED) for calculation of similarity between syntactic n-grams for further detection of soft similarity between texts. The computation of text similarity is the basic task for many natural language processing problems, and it is an open research field. Syntactic n-grams are text features for Vector Space Model construction extracted from dependency trees. Soft similarity is application of Vector Space Model taking into account similarity of features. First, we discuss the advantages of the application of the TED to syntactic n-grams. Then, we present a procedure based on the TED and syntactic n-grams for calculating soft similarity between texts.

UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84961888075&origin=inward

UR - https://www.scopus.com/inward/citedby.uri?partnerID=HzOxMe3b&scp=84961888075&origin=inward

U2 - 10.1109/NAFIPS-WConSC.2015.7284129

DO - 10.1109/NAFIPS-WConSC.2015.7284129

M3 - Paper

ER -
