Author Clustering using Hierarchical Clustering Analysis: Notebook for PAN at CLEF 2017

Helena Gómez-Adorno, Yuridiana Aleman, Darnes Vilariño, Miguel A. Sanchez-Perez, David Pinto, Grigori Sidorov

Research output: Contribution to journal; Conference article; peer-reviewed

8 Citations (Scopus)

Abstract

This paper presents our approach to the Author Clustering task at PAN 2017. We performed a hierarchical clustering analysis of different document features: typed and untyped character n-grams, and word n-grams. We experimented with two feature representation methods, the log-entropy model and tf-idf, while tuning minimum frequency threshold values to reduce the dimensionality of the feature space. Our system was ranked 1st in both subtasks, author clustering and authorship-link ranking.

Original language: English
Publication: CEUR Workshop Proceedings
Volume: 1866
Status: Published - 2017
Event: 18th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2017 - Dublin, Ireland
Duration: 11 Sep 2017 to 14 Sep 2017
