CIC-GIL approach to cross-domain authorship attribution: Notebook for PAN at CLEF 2018

Carolina Martín-Del-Campo-Rodríguez, Helena Gómez-Adorno, Grigori Sidorov, Ildar Batyrshin

Research output: Contribution to journal › Conference article › peer-review

Abstract

We present the CIC-GIL approach to the cross-domain authorship attribution task at PAN 2018. This year's evaluation lab focuses on the closed-set attribution task applied to a fanfiction corpus in five languages: English, French, Italian, Polish, and Spanish. We followed a traditional machine learning approach and selected different feature sets depending on the language. We evaluated document features such as typed and untyped character n-grams, word n-grams, and function word n-grams. Our final system uses the log-entropy weighting scheme and an SVM as the classifier.
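
As an illustration of the weighting step named in the abstract, the following is a minimal sketch of log-entropy weighting applied to untyped character n-grams. It assumes the commonly used formulation (local weight log(1 + tf), global weight 1 + sum_j p_ij log p_ij / log N); the abstract does not state the exact variant, n-gram order, or preprocessing the authors used. The function names `char_ngrams` and `log_entropy_weights` and the default n = 3 are hypothetical choices made for this example.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping (untyped) character n-grams of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def log_entropy_weights(docs, n=3):
    """Weight character n-gram counts with a log-entropy scheme.

    Assumed formulation (not taken from the paper):
      local weight   = log(1 + tf_ij)
      global weight  = 1 + sum_j p_ij * log(p_ij) / log(N),
      where p_ij = tf_ij / gf_i and N is the number of documents.
    Returns one dict per document mapping n-gram -> weight.
    """
    counts = [Counter(char_ngrams(d, n)) for d in docs]
    num_docs = len(docs)

    # Total frequency of each n-gram over the whole collection.
    global_freq = Counter()
    for c in counts:
        global_freq.update(c)

    global_weight = {}
    for term, gf in global_freq.items():
        if num_docs < 2:
            # Entropy normalisation is undefined for a single document.
            global_weight[term] = 1.0
            continue
        entropy_sum = 0.0
        for c in counts:
            tf = c.get(term, 0)
            if tf:
                p = tf / gf
                entropy_sum += p * math.log(p)
        global_weight[term] = 1.0 + entropy_sum / math.log(num_docs)

    return [
        {term: math.log(1 + tf) * global_weight[term] for term, tf in c.items()}
        for c in counts
    ]
```

Under this formulation, an n-gram spread evenly across all documents receives a weight near zero, while one concentrated in a single document keeps its full local weight. The resulting vectors could then be passed to a classifier such as an SVM, which this sketch does not include.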

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 2125
State: Published - 2018
Event: 19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018 - Avignon, France
Duration: 10 Sep 2018 → 14 Sep 2018
