Arabic Misogyny Identification

Fazlourrahman Balouchzahi, Grigori Sidorov, Hosahalli Lakshmaiah Shashirekha

Research output: Contribution to journal, Conference article, peer-review

Abstract

Social media carries various forms of toxic content, such as Hate Speech (HS) and offensive or abusive language, alongside useful and relevant content. Offensive content on social media may target a religion, a community, an individual, or a group of people with specific thoughts and beliefs. One category of offensive content, termed Misogyny, targets women and is increasing day by day; a person or group who shares such content is called a Misogynist. Misogyny detection can be seen as a sub-category of the HS and Offensive Language Identification (OLI) tasks in which women, and issues concerning them such as their rights, are targeted. Despite the many works on HS and OLI, Misogyny detection has rarely been studied, even for resource-rich languages. To promote Misogyny detection in Arabic, Arabic Misogyny Identification (ArMI), a shared task at the Forum for Information Retrieval Evaluation (FIRE) 2021, provides a dataset and invites researchers to develop models for Misogyny detection in the given text. The shared task consists of two subtasks, which can be modeled as binary and multiclass Text Classification (TC) tasks. This paper describes the models submitted by our team, MUCIC, to the ArMI shared task. The proposed methodology uses a combination of the most frequent char and word n-grams as features to train Machine Learning (ML) classifiers and obtained an accuracy of 0.873 for Subtask A and an F1-score of 0.497 for Subtask B.
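
The feature idea named in the abstract, a combined vocabulary of the most frequent char and word n-grams, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names, the n-gram ranges (2 to 3 characters, 1 to 2 words), and the `top_k_char` / `top_k_word` cut-offs are all assumptions chosen for illustration, since the abstract does not state the actual settings or classifiers.

```python
from collections import Counter


def char_ngrams(text, n_min, n_max):
    """Yield every character n-gram of length n_min..n_max, tagged 'c:'."""
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            yield "c:" + text[i:i + n]


def word_ngrams(text, n_min, n_max):
    """Yield every whitespace-token n-gram of length n_min..n_max, tagged 'w:'."""
    tokens = text.split()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield "w:" + " ".join(tokens[i:i + n])


def build_vocabulary(texts, top_k_char, top_k_word,
                     char_range=(2, 3), word_range=(1, 2)):
    """Keep the most frequent char and word n-grams across the corpus.

    The ranges and cut-offs are illustrative assumptions, not values
    reported in the paper.
    """
    char_counts, word_counts = Counter(), Counter()
    for text in texts:
        char_counts.update(char_ngrams(text, *char_range))
        word_counts.update(word_ngrams(text, *word_range))
    vocab = [gram for gram, _ in char_counts.most_common(top_k_char)]
    vocab += [gram for gram, _ in word_counts.most_common(top_k_word)]
    return vocab


def vectorize(text, vocab, char_range=(2, 3), word_range=(1, 2)):
    """Map a text to a count vector over the combined vocabulary."""
    counts = Counter(char_ngrams(text, *char_range))
    counts.update(word_ngrams(text, *word_range))
    return [counts[gram] for gram in vocab]


if __name__ == "__main__":
    corpus = ["ab ab", "ab cd"]  # toy stand-in for tweets
    vocab = build_vocabulary(corpus, top_k_char=2, top_k_word=1)
    print(vocab)
    print(vectorize("ab", vocab))
```

The resulting count vectors could then be passed to any ML classifier for the binary or multiclass subtask. The `c:` and `w:` prefixes keep a character n-gram and a word n-gram with the same surface form as separate features.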

Original language: English
Pages (from-to): 839-846
Number of pages: 8
Journal: CEUR Workshop Proceedings
Volume: 3159
State: Published - 2021
Event: Working Notes of FIRE - 13th Forum for Information Retrieval Evaluation, FIRE-WN 2021 - Gandhinagar, India
Duration: 13 Dec 2021 to 17 Dec 2021

Keywords

  • Hate Speech
  • Machine Learning
  • Misogyny Detection
  • Offensive Language
  • Social Media
