TY - JOUR
T1 - MUCIC at CheckThat! 2021
T2 - 2021 Working Notes of CLEF - Conference and Labs of the Evaluation Forum, CLEF-WN 2021
AU - Balouchzahi, Fazlourrahman
AU - Shashirekha, Hosahalli Lakshmaiah
AU - Sidorov, Grigori
N1 - Publisher Copyright:
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
PY - 2021
Y1 - 2021
N2 - Since the beginning of the Covid-19 era in November 2019, the growth in patient numbers has been closely accompanied by a growth in fake news. Therefore, developing tools and models to distinguish fake news from real news in various domains has become more important than ever. To address the detection of fake news, in this paper we, team MUCIC, describe the models submitted to 'Fake News Detection', a shared task organized by the CLEF-2021-CheckThat! Lab. This shared task contains two subtasks, namely Fake News Detection of News Articles (Subtask 3A) and Topical Domain Classification of News Articles (Subtask 3B), both of which are multi-class text classification tasks. The proposed models were developed by fine-tuning three transformer-based language models, namely RoBERTa, DistilBERT, and BERT from HuggingFace, on the training data and then ensembling them as estimators with majority voting. Evaluated with the evaluation script provided by the organizers, the proposed models obtained F1-scores of 0.5309 and 0.8550 for Subtask 3A and Subtask 3B, respectively.
AB - Since the beginning of the Covid-19 era in November 2019, the growth in patient numbers has been closely accompanied by a growth in fake news. Therefore, developing tools and models to distinguish fake news from real news in various domains has become more important than ever. To address the detection of fake news, in this paper we, team MUCIC, describe the models submitted to 'Fake News Detection', a shared task organized by the CLEF-2021-CheckThat! Lab. This shared task contains two subtasks, namely Fake News Detection of News Articles (Subtask 3A) and Topical Domain Classification of News Articles (Subtask 3B), both of which are multi-class text classification tasks. The proposed models were developed by fine-tuning three transformer-based language models, namely RoBERTa, DistilBERT, and BERT from HuggingFace, on the training data and then ensembling them as estimators with majority voting. Evaluated with the evaluation script provided by the organizers, the proposed models obtained F1-scores of 0.5309 and 0.8550 for Subtask 3A and Subtask 3B, respectively.
KW - BERT
KW - DistilBERT
KW - Domain identification
KW - Fake news detection
KW - RoBERTa
KW - Transformers
UR - http://www.scopus.com/inward/record.url?scp=85113505995&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85113505995
SN - 1613-0073
VL - 2936
SP - 455
EP - 464
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
Y2 - 21 September 2021 through 24 September 2021
ER -