CIC-GIL approach to cross-domain authorship attribution: Notebook for PAN at CLEF 2018

Carolina Martín-Del-Campo-Rodríguez, Helena Gómez-Adorno, Grigori Sidorov, Ildar Batyrshin

Research output: Contribution to conference › Paper

1 Scopus citation

Abstract

We present the CIC-GIL approach to the cross-domain authorship attribution task at PAN 2018. This year's evaluation lab focuses on the closed-set attribution task applied to a fan fiction corpus in five languages: English, French, Italian, Polish, and Spanish. We followed a traditional machine learning approach and selected a different feature set for each language. We evaluated document features such as typed and untyped character n-grams, word n-grams, and function word n-grams. Our final system uses the log-entropy weighting scheme and an SVM as the classifier.
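The abstract does not spell out the weighting formula, so the following is only a minimal sketch of the commonly used log-entropy definition applied to untyped character n-gram counts. The function names `char_ngrams` and `log_entropy_weights`, the default n-gram length of 3, and the use of natural logarithms are assumptions made for illustration, not details taken from the paper. Typed n-grams, word and function word n-grams, and the SVM classifier are left out.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count untyped character n-grams in one document (hypothetical helper)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def log_entropy_weights(docs_counts):
    """Apply log-entropy weighting to a list of per-document term counts.

    Assumed definition (standard in the literature, not confirmed by the paper):
      local weight   l(t, d) = log(1 + tf(t, d))
      global weight  g(t)    = 1 + sum_d p(t, d) * log(p(t, d)) / log(N)
      with p(t, d) = tf(t, d) / gf(t), gf(t) the corpus frequency of t,
      and N the number of documents.
    """
    n_docs = len(docs_counts)

    # Corpus-wide frequency of every term.
    global_freq = Counter()
    for counts in docs_counts:
        global_freq.update(counts)

    global_weight = {}
    for term, gf in global_freq.items():
        entropy_sum = 0.0
        for counts in docs_counts:
            tf = counts.get(term, 0)
            if tf:
                p = tf / gf
                entropy_sum += p * math.log(p)
        # With a single document log(N) is 0, so fall back to a weight of 1.
        if n_docs > 1:
            global_weight[term] = 1.0 + entropy_sum / math.log(n_docs)
        else:
            global_weight[term] = 1.0

    # Final weight is the product of the local and global weights.
    return [
        {term: math.log(1 + tf) * global_weight[term] for term, tf in counts.items()}
        for counts in docs_counts
    ]


if __name__ == "__main__":
    corpus = ["abab", "abcd"]
    counts = [char_ngrams(doc, n=2) for doc in corpus]
    for doc, weights in zip(corpus, log_entropy_weights(counts)):
        print(doc, weights)
```

Under this definition, an n-gram that occurs in only one document keeps its full local weight, while one spread evenly over all documents is weighted down to zero. The resulting sparse vectors could then be passed to a linear SVM.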
Original language: American English
State: Published - 1 Jan 2018
Event: CEUR Workshop Proceedings -
Duration: 1 Jan 2018 → …

Conference

Conference: CEUR Workshop Proceedings
Period: 1/01/18 → …

Cite this

Martín-Del-Campo-Rodríguez, C., Gómez-Adorno, H., Sidorov, G., & Batyrshin, I. (2018). CIC-GIL approach to cross-domain authorship attribution: Notebook for PAN at CLEF 2018. Paper presented at CEUR Workshop Proceedings, .