TY - JOUR
T1 - The winning approach to cross-genre gender identification in Russian at RUSProfiling 2017
AU - Markov, Ilia
AU - Gómez-Adorno, Helena
AU - Sidorov, Grigori
AU - Gelbukh, Alexander
N1 - Funding Information:
This work was partially supported by the Mexican Government (CONACYT projects 240844, SNI, COFAA-IPN, SIP-IPN 20171813, 20172008, and 20172044).
PY - 2017
Y1 - 2017
AB - We present the CIC systems submitted to the 2017 PAN shared task on Cross-Genre Gender Identification in Russian texts (RUSProfiling). We submitted five systems. One was based on a statistical approach using only lexical features, and the other four on machine-learning techniques using combinations of gender-specific Russian grammatical features, word and character n-grams, and suffix n-grams. Our systems achieved the highest weighted accuracy across all the test datasets, occupying the first four places in the ranking.
KW - Author profiling
KW - Computational linguistics
KW - Cross-genre
KW - Gender identification
KW - Machine learning
KW - Russian
KW - Social media
UR - http://www.scopus.com/inward/record.url?scp=85041441696&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85041441696
SN - 1613-0073
VL - 2036
SP - 20
EP - 24
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 2017 Working Notes of Forum for Information Retrieval Evaluation, FIRE 2017
Y2 - 8 December 2017 through 10 December 2017
ER -