The winning approach to cross-genre gender identification in Russian at RUSProfiling 2017

Ilia Markov, Helena Gómez-Adorno, Grigori Sidorov, Alexander Gelbukh

Research output: Contribution to journal › Conference article › peer-review

10 Scopus citations

Abstract

We present the CIC systems submitted to the 2017 PAN shared task on Cross-Genre Gender Identification in Russian texts (RUSProfiling). We submitted five systems. One was based on a statistical approach using only lexical features; the other four used machine-learning techniques with combinations of gender-specific Russian grammatical features, word and character n-grams, and suffix n-grams. Our systems achieved the highest weighted accuracy across all the test datasets, occupying the first four places in the ranking.

Original language: English
Pages (from-to): 20-24
Number of pages: 5
Journal: CEUR Workshop Proceedings
Volume: 2036
State: Published - 2017
Event: 2017 Working Notes of Forum for Information Retrieval Evaluation, FIRE 2017 - Bangalore, India
Duration: 8 Dec 2017 to 10 Dec 2017

Keywords

  • Author profiling
  • Computational linguistics
  • Cross-genre
  • Gender identification
  • Machine learning
  • Russian
  • Social media
