Complete syntactic N-grams as style markers for authorship attribution

Research output: Contribution to journal › Article › peer-review

18 Citations (Scopus)

Abstract

In this paper, we present an authorship attribution method that uses complete (non-continuous, with bifurcations) syntactic n-grams as style markers. Syntactic n-grams are obtained by following paths in subtrees of a syntactic tree. We work with relatively short text fragments and build author profiles of various sizes using the tf-idf scheme. We train an SVM classifier to perform the task. We compare the method with character n-grams and show that accuracy increases when complete syntactic n-grams are used.
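To make the notion of a complete syntactic n-gram more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a dependency tree has already been produced by some parser and is given as a plain mapping from each head word to its ordered dependents. The function names (`subtrees`, `complete_sn_grams`), the toy sentence, and the bracket notation `head[dep1, dep2]` are illustrative assumptions. The sketch enumerates every connected subtree with exactly n nodes, so it covers both plain paths and n-grams that branch (bifurcate) at a node.

```python
from typing import Dict, Iterator, List

Tree = Dict[str, List[str]]  # head word -> ordered list of dependents


def _forests(kids: List[str], budget: int, tree: Tree) -> Iterator[List[str]]:
    """Yield lists of subtree strings that use exactly `budget` nodes,
    drawn from `kids` in order; each kid is either skipped or expanded."""
    if budget == 0:
        yield []
        return
    if not kids:
        return
    first, rest = kids[0], kids[1:]
    # Option 1: skip the first dependent entirely.
    yield from _forests(rest, budget, tree)
    # Option 2: give the first dependent k nodes, the rest go to its siblings.
    for k in range(1, budget + 1):
        for sub in subtrees(first, k, tree):
            for others in _forests(rest, budget - k, tree):
                yield [sub] + others


def subtrees(node: str, size: int, tree: Tree) -> Iterator[str]:
    """Yield every connected subtree rooted at `node` with exactly `size` nodes."""
    if size == 1:
        yield node
        return
    for forest in _forests(tree.get(node, []), size - 1, tree):
        yield f"{node}[{', '.join(forest)}]"


def complete_sn_grams(tree: Tree, n: int) -> List[str]:
    """Collect the size-n subtrees rooted at every node of the tree."""
    nodes = set(tree) | {d for deps in tree.values() for d in deps}
    return [g for node in sorted(nodes) for g in subtrees(node, n, tree)]


# Toy dependency tree for "the cat ate raw fish" (hand-built, not parser output).
# Assumption: every word is unique; real text would need tokens keyed by position.
toy_tree: Tree = {"ate": ["cat", "fish"], "cat": ["the"], "fish": ["raw"]}

bigrams = complete_sn_grams(toy_tree, 2)
trigrams = complete_sn_grams(toy_tree, 3)
```

In this toy tree, `ate[cat, fish]` is a bifurcating 3-gram that path-following alone would not produce, while `ate[cat[the]]` and `ate[fish[raw]]` are ordinary paths. In a pipeline like the one the abstract describes, strings of this kind could serve as the features that are weighted with tf-idf and passed to an SVM.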

Original language: English
Pages (from-to): 9-17
Number of pages: 9
Publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8856
Status: Published - 2014
