TY - JOUR
T1 - Automatic Hate Speech Detection Using Deep Neural Networks and Word Embedding
AU - Ojo, Olumide Ebenezer
AU - Ta, Thang Hoang
AU - Gelbukh, Alexander
AU - Calvo, Hiram
AU - Sidorov, Grigori
AU - Adebanji, Olaronke Oluwayemisi
N1 - Publisher Copyright: © 2022 Instituto Politecnico Nacional. All rights reserved.
PY - 2022
Y1 - 2022
AB - Hate speech spread through language on social media platforms and in online groups is an increasingly well-known phenomenon. We treat hate speech detection as a binary classification task over user content and compare two text representations: bag of words (BoW) and pre-trained GloVe word embeddings. The proposed models are the Naive Bayes Algorithm (NBA), Logistic Regression Model (LRM), Support Vector Machines (SVM), Random Forest Classifier (RFC), and a one-dimensional Convolutional Neural Network (1D-CNN). The 1D-CNN with GloVe embeddings performed best among all the models, with a weighted macro-F1 score of 0.66 and an accuracy of 0.90.
KW - 1D-CNN
KW - Hate speech
KW - GloVe
UR - http://www.scopus.com/inward/record.url?scp=85129210092&partnerID=8YFLogxK
U2 - 10.13053/CyS-26-2-4107
DO - 10.13053/CyS-26-2-4107
M3 - Article
AN - SCOPUS:85129210092
SN - 1405-5546
VL - 26
SP - 1007
EP - 1013
JO - Computacion y Sistemas
JF - Computacion y Sistemas
IS - 2
ER -