CIC-GIL approach to cross-domain authorship attribution: Notebook for PAN at CLEF 2018

Carolina Martín-Del-Campo-Rodríguez, Helena Gómez-Adorno, Grigori Sidorov, Ildar Batyrshin

Research output: Contribution to journal, conference article, peer-reviewed

1 Citation (Scopus)

Abstract

We present the CIC-GIL approach to the cross-domain authorship attribution task at PAN 2018. This year's evaluation lab focuses on the closed-set attribution task applied to a fan fiction corpus in five languages: English, French, Italian, Polish, and Spanish. We followed a traditional machine learning approach and selected a different feature set for each language. We evaluated document features such as typed and untyped character n-grams, word n-grams, and function word n-grams. Our final system uses the log-entropy weighting scheme and an SVM as the classifier.
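As an illustration of the weighting step named in the abstract, the following minimal sketch applies log-entropy weighting to untyped character n-gram counts. It assumes the commonly used formulation (local weight log(1 + tf), global weight 1 + Σ p·log p / log N, where p is the share of an n-gram's occurrences that fall in one document and N is the number of documents). The abstract does not state the exact variant used, so this is not necessarily the authors' implementation. The function names `char_ngrams` and `log_entropy_weights` are hypothetical, and typed n-grams and the SVM classification step are omitted.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count untyped character n-grams in one document."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def log_entropy_weights(docs, n=3):
    """Return one {ngram: weight} dict per document (assumed formulation)."""
    counts = [char_ngrams(doc, n) for doc in docs]
    num_docs = len(counts)

    # Total occurrences of each n-gram over the whole collection.
    global_freq = Counter()
    for doc_counts in counts:
        global_freq.update(doc_counts)

    # Global weight: 1 + sum_j p_ij * log(p_ij) / log(N).
    # It is 1 for an n-gram confined to a single document and 0 for one
    # spread evenly over all documents.
    global_weight = {}
    for gram, gf in global_freq.items():
        if num_docs < 2:
            global_weight[gram] = 1.0  # entropy is undefined for N = 1
            continue
        entropy = 0.0
        for doc_counts in counts:
            tf = doc_counts.get(gram, 0)
            if tf:
                p = tf / gf
                entropy += p * math.log(p)
        global_weight[gram] = 1.0 + entropy / math.log(num_docs)

    # Local weight log(1 + tf), scaled by the global weight.
    return [
        {gram: math.log(1 + tf) * global_weight[gram]
         for gram, tf in doc_counts.items()}
        for doc_counts in counts
    ]
```

Under this formulation, n-grams that occur evenly across all documents receive a weight near zero, while n-grams concentrated in a few documents keep most of their logarithmic count. The resulting vectors could then be passed to any SVM implementation.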

Original language: English
Publication: CEUR Workshop Proceedings
Volume: 2125
Status: Published - 2018
Event: 19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018 - Avignon, France
Duration: 10 Sep 2018 to 14 Sep 2018
