Individual vs. Group Violent Threats Classification in Online Discussions

Noman Ashraf; Rabia Mustafa; Grigori Sidorov; Alexander Gelbukh

doi:10.1145/3366424.3385778

Centro de Investigación en Computación (CIC)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

26 Scopus citations

Abstract

Violent threats are a serious crime that affects the targeted individuals or groups, and it is essential for media providers to block users who post them. In this paper, we focus on detecting violent threat language in YouTube comments and categorize threatening comments by whether they target an individual or a group. We started from an existing dataset in which violent threat language was already identified but not categorized by target. We adopted a binary classification approach to predict individual- vs. group-targeting threats. We compared two text representations: bag of words (BOW) and pre-trained word embeddings such as GloVe and fastText. We used deep-learning classifiers such as 1D-CNN, LSTM, and bidirectional LSTM (BiLSTM). GloVe embeddings gave the worst results, fastText performed much better, and BiLSTM on BOW with term frequency-inverse document frequency (TF-IDF) weighting gave the best results, achieving a ROC-AUC of 0.94 and a macro-F1 score of 0.85.
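The abstract names a TF-IDF weighted bag-of-words representation and two evaluation metrics, ROC-AUC and macro-F1, without giving implementation details. The following is a minimal pure-Python sketch of those three pieces only, under stated assumptions: the function names (`tfidf_vectors`, `macro_f1`, `roc_auc`), the whitespace tokenizer, and the smoothed IDF formula are illustrative choices and are not taken from the paper. The BiLSTM classifier itself is not reproduced here.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, one TF-IDF weighted bag-of-words vector per document)."""
    # Assumption: lowercase + whitespace tokenization; the paper's preprocessing may differ.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Assumption: a smoothed IDF variant; the exact formula used by the authors is not stated.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1 for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty comments
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Labels are assumed to be 1 (positive) and 0 (negative); ties count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In this sketch the labels could encode, for example, 1 for group-targeting and 0 for individual-targeting comments; that encoding is also an assumption. Both metrics take values between 0 and 1, with macro-F1 weighting the two classes equally regardless of how many comments fall into each.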

Original language	English
Title of host publication	The Web Conference 2020 - Companion of the World Wide Web Conference, WWW 2020
Publisher	Association for Computing Machinery
Pages	629-633
Number of pages	5
ISBN (Electronic)	9781450370240
DOIs	https://doi.org/10.1145/3366424.3385778
State	Published - 20 Apr 2020
Event	29th International World Wide Web Conference, WWW 2020 - Taipei, Taiwan, Province of China
Duration	20 Apr 2020 → 24 Apr 2020

Publication series

Name	The Web Conference 2020 - Companion of the World Wide Web Conference, WWW 2020

Conference

Conference	29th International World Wide Web Conference, WWW 2020
Country/Territory	Taiwan, Province of China
City	Taipei
Period	20/04/20 → 24/04/20

Keywords

NLP
Violent threat
deep learning
individual and group threats
social media

Cite this

Ashraf, N., Mustafa, R., Sidorov, G., & Gelbukh, A. (2020). Individual vs. Group Violent Threats Classification in Online Discussions. In The Web Conference 2020 - Companion of the World Wide Web Conference, WWW 2020 (pp. 629-633). (The Web Conference 2020 - Companion of the World Wide Web Conference, WWW 2020). Association for Computing Machinery. https://doi.org/10.1145/3366424.3385778

@inproceedings{5281c6d9d1064786b05c68552754e4b5,

title = "Individual vs. Group Violent Threats Classification in Online Discussions",

abstract = "Violent threats are a serious crime that affects the targeted individuals or groups, and it is essential for media providers to block users who post them. In this paper, we focus on detecting violent threat language in YouTube comments and categorize threatening comments by whether they target an individual or a group. We started from an existing dataset in which violent threat language was already identified but not categorized by target. We adopted a binary classification approach to predict individual- vs. group-targeting threats. We compared two text representations: bag of words (BOW) and pre-trained word embeddings such as GloVe and fastText. We used deep-learning classifiers such as 1D-CNN, LSTM, and bidirectional LSTM (BiLSTM). GloVe embeddings gave the worst results, fastText performed much better, and BiLSTM on BOW with term frequency-inverse document frequency (TF-IDF) weighting gave the best results, achieving a ROC-AUC of 0.94 and a macro-F1 score of 0.85.",

keywords = "NLP, Violent threat, deep learning, individual and group threats, social media",

author = "Noman Ashraf and Rabia Mustafa and Grigori Sidorov and Alexander Gelbukh",

note = "Publisher Copyright: {\textcopyright} 2020 ACM.; 29th International World Wide Web Conference, WWW 2020 ; Conference date: 20-04-2020 Through 24-04-2020",

year = "2020",

month = apr,

day = "20",

doi = "10.1145/3366424.3385778",

language = "English",

series = "The Web Conference 2020 - Companion of the World Wide Web Conference, WWW 2020",

publisher = "Association for Computing Machinery",

pages = "629--633",

booktitle = "The Web Conference 2020 - Companion of the World Wide Web Conference, WWW 2020",

}

Ashraf, N, Mustafa, R, Sidorov, G & Gelbukh, A 2020, Individual vs. Group Violent Threats Classification in Online Discussions. in The Web Conference 2020 - Companion of the World Wide Web Conference, WWW 2020. The Web Conference 2020 - Companion of the World Wide Web Conference, WWW 2020, Association for Computing Machinery, pp. 629-633, 29th International World Wide Web Conference, WWW 2020, Taipei, Taiwan, Province of China, 20/04/20. https://doi.org/10.1145/3366424.3385778

Individual vs. Group Violent Threats Classification in Online Discussions. / Ashraf, Noman; Mustafa, Rabia; Sidorov, Grigori et al.
The Web Conference 2020 - Companion of the World Wide Web Conference, WWW 2020. Association for Computing Machinery, 2020. p. 629-633 (The Web Conference 2020 - Companion of the World Wide Web Conference, WWW 2020).

TY - GEN

T1 - Individual vs. Group Violent Threats Classification in Online Discussions

AU - Ashraf, Noman

AU - Mustafa, Rabia

AU - Sidorov, Grigori

AU - Gelbukh, Alexander

PY - 2020/4/20

Y1 - 2020/4/20

AB - Violent threats are a serious crime that affects the targeted individuals or groups, and it is essential for media providers to block users who post them. In this paper, we focus on detecting violent threat language in YouTube comments and categorize threatening comments by whether they target an individual or a group. We started from an existing dataset in which violent threat language was already identified but not categorized by target. We adopted a binary classification approach to predict individual- vs. group-targeting threats. We compared two text representations: bag of words (BOW) and pre-trained word embeddings such as GloVe and fastText. We used deep-learning classifiers such as 1D-CNN, LSTM, and bidirectional LSTM (BiLSTM). GloVe embeddings gave the worst results, fastText performed much better, and BiLSTM on BOW with term frequency-inverse document frequency (TF-IDF) weighting gave the best results, achieving a ROC-AUC of 0.94 and a macro-F1 score of 0.85.

KW - NLP

KW - Violent threat

KW - deep learning

KW - individual and group threats

KW - social media

UR - http://www.scopus.com/inward/record.url?scp=85091693888&partnerID=8YFLogxK

U2 - 10.1145/3366424.3385778

DO - 10.1145/3366424.3385778

M3 - Conference contribution

AN - SCOPUS:85091693888

T3 - The Web Conference 2020 - Companion of the World Wide Web Conference, WWW 2020

SP - 629

EP - 633

BT - The Web Conference 2020 - Companion of the World Wide Web Conference, WWW 2020

PB - Association for Computing Machinery

T2 - 29th International World Wide Web Conference, WWW 2020

Y2 - 20 April 2020 through 24 April 2020

ER -
