NLP-CIC at HASOC 2020: Multilingual offensive language detection using all-in-one model

Segun Taofeek Aroyehun, Alexander Gelbukh

Research output: Contribution to journal › Conference article › peer-review

3 Scopus citations

Abstract

We describe our deep learning model submitted to the HASOC 2020 shared task on detection of offensive language in social media in three Indo-European languages: English, German, and Hindi. We fine-tune a pre-trained multilingual encoder on the combination of data provided for the competition. Our submission received a competitive macro-average F1 score of 0.4980 on the English Subtask A as well as comparatively strong performance on the German data.
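
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a macro-average F1 score can be computed: F1 is calculated per class and the per-class scores are averaged with equal weight, so a minority class counts as much as the majority class. The function name `macro_f1`, the label strings, and the toy predictions are illustrative assumptions, not the authors' evaluation code or data.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # gold class g was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up labels for a binary offensive / not-offensive task.
gold = ["HOF", "HOF", "NOT", "NOT", "NOT", "NOT"]
pred = ["HOF", "NOT", "NOT", "NOT", "NOT", "HOF"]
score = macro_f1(gold, pred)
print(f"macro F1 = {score:.4f}")
```

On imbalanced data such as offensive-language corpora, this equal weighting is why macro F1 can be noticeably lower than plain accuracy.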

Original language: English
Pages (from-to): 331-335
Number of pages: 5
Journal: CEUR Workshop Proceedings
Volume: 2826
State: Published - 2020
Event: Working Notes of FIRE - 12th Forum for Information Retrieval Evaluation, FIRE-WN 2020 - Hyderabad, India
Duration: 16 Dec 2020 to 20 Dec 2020

Keywords

  • Deep learning
  • Multilingual
  • Offensive content identification
  • Text classification
