Transfer Learning from Multilingual DeBERTa for Sexism Identification

Hoang Thang Ta; Abu Bakar Siddiqur Rahman; Lotfollah Najjar; Alexander Gelbukh

Centro de Investigación en Computación (CIC)

Research output: Contribution to journal › Conference article › peer-review

2 Scopus citations

Abstract

In this paper, we address Task 1 and Task 2 of EXIST 2022, which concern detecting sexism in a broad sense, from ideological inequality, sexual violence, and misogyny to other expressions that involve implicit sexist behaviours in social networks. We apply transfer learning from a pre-trained multilingual DeBERTa (mDeBERTa) model, together with its zero-shot classification, to obtain better performance than BERT-based approaches. Finally, we combine all three methods (mDeBERTa, zero-shot classification, and BERT) in a majority vote. For Task 1, mDeBERTa is the best method, with an accuracy of 76.09% and an F1 of 76.08%. For Task 2, the best results, an accuracy of 66.26% and an F1 of 47.06%, are obtained with the majority vote. Our main contribution is the use of DeBERTa and zero-shot classification while designing only one classifier for sexism identification.
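The majority vote over three systems mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the label strings, the function names `majority_vote` and `ensemble`, and the tie-breaking rule (fall back to the first system when all three disagree) are assumptions made only for illustration.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(votes: Sequence[str], fallback_index: int = 0) -> str:
    """Return the label most systems agree on.

    If several labels share the top count (e.g. three systems give three
    different labels), fall back to the vote of one chosen system.
    The fallback rule is an assumption, not taken from the paper.
    """
    counts = Counter(votes)
    top_count = max(counts.values())
    tied = [label for label, count in counts.items() if count == top_count]
    if len(tied) > 1:
        return votes[fallback_index]
    return tied[0]


def ensemble(predictions_per_system: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-system prediction lists into one list, example by example."""
    return [majority_vote(votes) for votes in zip(*predictions_per_system)]


# Hypothetical predictions from the three systems for three texts.
mdeberta = ["sexist", "non-sexist", "sexist"]
zero_shot = ["sexist", "sexist", "non-sexist"]
bert = ["non-sexist", "sexist", "non-sexist"]

combined = ensemble([mdeberta, zero_shot, bert])
print(combined)
```

With two labels and three voters a tie cannot occur; the fallback only matters for a multi-class setting, where all three systems may disagree.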

Original language	English
Journal	CEUR Workshop Proceedings
Volume	3202
State	Published - 2022
Event	2022 Iberian Languages Evaluation Forum, IberLEF 2022 - A Coruna, Spain
Duration	20 Sep 2022 → …

Keywords

DeBERTa
EXIST 2022
IberLEF
Offensive Language
Sexism Identification
Text Classification

Cite this

@article{78f22ce516064cd7b9b9c4b4d5aaebee,
title = "Transfer Learning from Multilingual DeBERTa for Sexism Identification",
abstract = "In this paper, we address Task 1 and Task 2 of EXIST 2022, which concern detecting sexism in a broad sense, from ideological inequality, sexual violence, and misogyny to other expressions that involve implicit sexist behaviours in social networks. We apply transfer learning from a pre-trained multilingual DeBERTa (mDeBERTa) model, together with its zero-shot classification, to obtain better performance than BERT-based approaches. Finally, we combine all three methods (mDeBERTa, zero-shot classification, and BERT) in a majority vote. For Task 1, mDeBERTa is the best method, with an accuracy of 76.09\% and an F1 of 76.08\%. For Task 2, the best results, an accuracy of 66.26\% and an F1 of 47.06\%, are obtained with the majority vote. Our main contribution is the use of DeBERTa and zero-shot classification while designing only one classifier for sexism identification.",
keywords = "DeBERTa, EXIST 2022, IberLEF, Offensive Language, Sexism Identification, Text Classification",
author = "Ta, {Hoang Thang} and Rahman, {Abu Bakar Siddiqur} and Lotfollah Najjar and Alexander Gelbukh",
note = "Publisher Copyright: {\textcopyright} 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).; 2022 Iberian Languages Evaluation Forum, IberLEF 2022 ; Conference date: 20-09-2022",
year = "2022",
language = "English",
volume = "3202",
journal = "CEUR Workshop Proceedings",
issn = "1613-0073",
publisher = "CEUR-WS",
}

TY - JOUR

T1 - Transfer Learning from Multilingual DeBERTa for Sexism Identification

AU - Ta, Hoang Thang

AU - Rahman, Abu Bakar Siddiqur

AU - Najjar, Lotfollah

AU - Gelbukh, Alexander

PY - 2022

Y1 - 2022

N2 - In this paper, we address Task 1 and Task 2 of EXIST 2022, which concern detecting sexism in a broad sense, from ideological inequality, sexual violence, and misogyny to other expressions that involve implicit sexist behaviours in social networks. We apply transfer learning from a pre-trained multilingual DeBERTa (mDeBERTa) model, together with its zero-shot classification, to obtain better performance than BERT-based approaches. Finally, we combine all three methods (mDeBERTa, zero-shot classification, and BERT) in a majority vote. For Task 1, mDeBERTa is the best method, with an accuracy of 76.09% and an F1 of 76.08%. For Task 2, the best results, an accuracy of 66.26% and an F1 of 47.06%, are obtained with the majority vote. Our main contribution is the use of DeBERTa and zero-shot classification while designing only one classifier for sexism identification.

KW - DeBERTa

KW - EXIST 2022

KW - IberLEF

KW - Offensive Language

KW - Sexism Identification

KW - Text Classification

UR - http://www.scopus.com/inward/record.url?scp=85137339044&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85137339044

SN - 1613-0073

VL - 3202

JO - CEUR Workshop Proceedings

JF - CEUR Workshop Proceedings

T2 - 2022 Iberian Languages Evaluation Forum, IberLEF 2022

Y2 - 20 September 2022

ER -
