CIC-GIL approach to author profiling in Spanish tweets: Location and occupation

Ilia Markov, Helena Gómez-Adorno, Mónica Jasso-Rosales, Grigori Sidorov

Research output: Contribution to journal › Conference article › peer-review

2 Scopus citations

Abstract

We present the CIC-GIL approach to the author profiling (AP) task at MEX-A3T 2018. The task consists of two subtasks: identification of authors’ location (6-way) and occupation (8-way) in a corpus of Mexican Spanish tweets. We used the logistic regression algorithm trained on typed character n-grams, function-word n-grams, and regionalisms for location identification, and typed character n-grams with several modifications for occupation identification. Our best run achieved an F1-macro score of 73.63% for location and 48.94% for occupation identification. The results are competitive with those of other participating teams; in particular, our best run was ranked fourth in the shared task.
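The F1-macro score reported above is the unweighted mean of the per-class F1 scores, so each of the location or occupation classes counts equally regardless of how many authors it contains. A minimal sketch of that standard definition follows; it is not the shared task's evaluation script, and the function name `macro_f1` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over all labels seen in gold or predicted."""
    labels = sorted(set(gold) | set(predicted))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1      # correct prediction for class g
        else:
            fp[p] += 1      # p was predicted but is wrong
            fn[g] += 1      # g was the true class but was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with three made-up location labels (not the task's real classes).
gold = ["north", "north", "center", "center", "south", "south"]
predicted = ["north", "center", "center", "center", "south", "north"]
print(round(macro_f1(gold, predicted), 4))  # per-class F1: 0.8, 0.5, 0.6667
```

Because the mean is unweighted, a system that does well only on the most frequent class is penalized, which matters for imbalanced class distributions. With scikit-learn installed, `sklearn.metrics.f1_score(gold, predicted, average="macro")` computes the same quantity.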

Original language: English
Pages (from-to): 97-101
Number of pages: 5
Journal: CEUR Workshop Proceedings
Volume: 2150
State: Published - 2018
Event: 3rd Workshop on Evaluation of Human Language Technologies for Iberian Languages, IberEval 2018 - Sevilla, Spain
Duration: 18 Sep 2018 → …

Keywords

  • Author profiling
  • Location identification
  • Machine learning
  • N-grams
  • Occupation identification
  • Social media
  • Spanish
