PolyHope: Two-level hope speech detection from tweets

Fazlourrahman Balouchzahi; Grigori Sidorov; Alexander Gelbukh

doi:10.1016/j.eswa.2023.120078

Centro de Investigación en Computación (CIC)

Research output: Contribution to journal › Article › peer-review

9 Citations (Scopus)

Abstract

Hope is characterized as openness of spirit towards the future, a desire, expectation, and wish for something to happen or to be true that remarkably affects a person's state of mind, emotions, behaviors, and decisions. Hope is usually associated with concepts of desired expectations and possibility/probability concerning the future. Despite its importance, hope has rarely been studied as a social media analysis task. This paper presents a hope speech dataset that classifies each tweet first into “Hope” and “Not Hope”, then into three fine-grained hope categories: “Generalized Hope”, “Realistic Hope”, and “Unrealistic Hope” (along with “Not Hope”). English tweets posted in the first half of 2022 were collected to build this dataset. Furthermore, we describe our annotation process and guidelines in detail and discuss the challenges of classifying hope and the limitations of the existing hope speech detection corpora. In addition, we report several baselines based on different learning approaches, such as traditional machine learning, deep learning, and transformers, to benchmark our dataset. We evaluate our baselines using weighted-average and macro-average F1-scores. Observations show that a strict process for annotator selection and detailed annotation guidelines enhanced the dataset's quality. This strict annotation process yielded promising performance for simple machine learning classifiers with only uni-grams; however, binary and multiclass hope speech detection results reveal that contextual embedding models perform better on this dataset.
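As an illustration only, the two-level label scheme and the two F1 averages named in the abstract can be sketched in plain Python. This is not the authors' evaluation code: the mapping from fine-grained to binary labels, the helper names (`per_class_f1`, `macro_f1`, `weighted_f1`), and the six toy labels are assumptions made for the example, and no real PolyHope data or results are used.

```python
from collections import Counter

# Fine-grained labels as named in the abstract. Collapsing the three hope
# categories into "Hope" is an assumption about how the two levels relate.
FINE_TO_BINARY = {
    "Generalized Hope": "Hope",
    "Realistic Hope": "Hope",
    "Unrealistic Hope": "Hope",
    "Not Hope": "Not Hope",
}


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's count in y_true."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[lab] * support[lab] for lab in scores) / len(y_true)


# Hypothetical toy labels, invented for this sketch.
true_fine = ["Generalized Hope", "Realistic Hope", "Unrealistic Hope",
             "Not Hope", "Not Hope", "Not Hope"]
pred_fine = ["Generalized Hope", "Generalized Hope", "Unrealistic Hope",
             "Not Hope", "Not Hope", "Realistic Hope"]

# Level 1 (binary) labels are derived from the level 2 (fine-grained) labels.
true_binary = [FINE_TO_BINARY[label] for label in true_fine]
pred_binary = [FINE_TO_BINARY[label] for label in pred_fine]

print("multiclass macro F1:   ", round(macro_f1(true_fine, pred_fine), 4))
print("multiclass weighted F1:", round(weighted_f1(true_fine, pred_fine), 4))
print("binary macro F1:       ", round(macro_f1(true_binary, pred_binary), 4))
```

The macro average treats a rare class such as “Realistic Hope” the same as the majority class, while the weighted average follows the class distribution, which is one reason to report both on an imbalanced dataset.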

Original language	English
Article number	120078
Journal	Expert Systems with Applications
Volume	225
DOI	https://doi.org/10.1016/j.eswa.2023.120078
Status	Published - 1 Sep 2023

Cite this

@article{31b1f97f0f1a458e82f0b587a76137fd,

title = "PolyHope: Two-level hope speech detection from tweets",

abstract = "Hope is characterized as openness of spirit towards the future, a desire, expectation, and wish for something to happen or to be true that remarkably affects human's state of mind, emotions, behaviors, and decisions. Hope is usually associated with concepts of desired expectations and possibility/probability concerning the future. Despite its importance, hope has rarely been studied as a social media analysis task. This paper presents a hope speech dataset that classifies each tweet first into “Hope” and “Not Hope”, then into three fine-grained hope categories: “Generalized Hope”, “Realistic Hope”, and “Unrealistic Hope” (along with “Not Hope”). English tweets in the first half of 2022 were collected to build this dataset. Furthermore, we describe our annotation process and guidelines in detail and discuss the challenges of classifying hope and the limitations of the existing hope speech detection corpora. In addition, we reported several baselines based on different learning approaches, such as traditional machine learning, deep learning, and transformers, to benchmark our dataset. We evaluated our baselines using averaged-weighted and averaged-macro F1-scores. Observations show that a strict process for annotator selection and detailed annotation guidelines enhanced the dataset's quality. This strict annotation process yielded promising performance for simple machine learning classifiers with only uni-grams; however, binary and multiclass hope speech detection results reveal that contextual embedding models have higher performance in this dataset.",

keywords = "Deep learning, Desire, Expectation, Hope, Machine learning, Natural Language Processing, Transformers, Wish",

author = "Fazlourrahman Balouchzahi and Grigori Sidorov and Alexander Gelbukh",

note = "Publisher Copyright: {\textcopyright} 2023 Elsevier Ltd",

year = "2023",

month = sep,

day = "1",

doi = "10.1016/j.eswa.2023.120078",

language = "English",

volume = "225",

journal = "Expert Systems with Applications",

issn = "0957-4174",

}

TY - JOUR

T1 - PolyHope

T2 - Two-level hope speech detection from tweets

AU - Balouchzahi, Fazlourrahman

AU - Sidorov, Grigori

AU - Gelbukh, Alexander

PY - 2023/9/1

Y1 - 2023/9/1

AB - Hope is characterized as openness of spirit towards the future, a desire, expectation, and wish for something to happen or to be true that remarkably affects human's state of mind, emotions, behaviors, and decisions. Hope is usually associated with concepts of desired expectations and possibility/probability concerning the future. Despite its importance, hope has rarely been studied as a social media analysis task. This paper presents a hope speech dataset that classifies each tweet first into “Hope” and “Not Hope”, then into three fine-grained hope categories: “Generalized Hope”, “Realistic Hope”, and “Unrealistic Hope” (along with “Not Hope”). English tweets in the first half of 2022 were collected to build this dataset. Furthermore, we describe our annotation process and guidelines in detail and discuss the challenges of classifying hope and the limitations of the existing hope speech detection corpora. In addition, we reported several baselines based on different learning approaches, such as traditional machine learning, deep learning, and transformers, to benchmark our dataset. We evaluated our baselines using averaged-weighted and averaged-macro F1-scores. Observations show that a strict process for annotator selection and detailed annotation guidelines enhanced the dataset's quality. This strict annotation process yielded promising performance for simple machine learning classifiers with only uni-grams; however, binary and multiclass hope speech detection results reveal that contextual embedding models have higher performance in this dataset.

KW - Deep learning

KW - Desire

KW - Expectation

KW - Hope

KW - Machine learning

KW - Natural Language Processing

KW - Transformers

KW - Wish

UR - http://www.scopus.com/inward/record.url?scp=85152436287&partnerID=8YFLogxK

U2 - 10.1016/j.eswa.2023.120078

DO - 10.1016/j.eswa.2023.120078

M3 - Article

AN - SCOPUS:85152436287

SN - 0957-4174

VL - 225

JO - Expert Systems with Applications

JF - Expert Systems with Applications

M1 - 120078

ER -
