MUCIC@TamilNLP-ACL2022: Abusive Comment Detection in Tamil Language using 1D Conv-LSTM

F. Balouchzahi; M. D. Anusha; H. L. Shashirekha; G. Sidorov

Centro de Investigación en Computación (CIC)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

Abusive language such as hate speech, profanity, and cyberbullying is common on online platforms and creates many problems for users as well as policy makers. Hence, detecting such abusive language in user-generated online content has become increasingly important over the past few years. Online platforms strive to moderate abusive content to reduce societal harm, comply with laws, and create a more inclusive environment for their users. Despite the various methods for automatically detecting abusive language on online platforms, the problem persists. To address it, this paper describes the models submitted by our team, MUCIC, to the shared task on "Abusive Comment Detection in Tamil-ACL 2022". This shared task addresses abusive comment detection in native Tamil script texts and code-mixed Tamil texts. Two models were submitted: i) an n-gram-Multilayer Perceptron (n-gram-MLP) model, which uses an MLP classifier fed with character n-gram features, and ii) a 1D Convolutional Long Short-Term Memory (1D Conv-LSTM) model. Of the two, the n-gram-MLP model performed better, with weighted F1-scores of 0.560 and 0.430 for code-mixed Tamil and native Tamil script texts, respectively. This work may be reproduced using the code available on GitHub.
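As an illustration of two terms used in the abstract, the following minimal sketch shows how character n-gram features can be counted and how a weighted F1-score is computed. It is not the authors' implementation: the function names, the n-gram range (1 to 3), and the toy labels are assumptions made for this example only, and the abstract does not state which n-gram range or label set the submitted models used.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Count every character n-gram of length n_min..n_max in `text`.

    The default range 1..3 is an assumption for illustration only.
    """
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def weighted_f1(y_true, y_pred):
    """Average the per-class F1-scores, weighting each class by its support
    (the number of true instances of that class)."""
    total = len(y_true)
    score = 0.0
    for label in set(y_true):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        support = tp + fn
        score += f1 * support / total
    return score


if __name__ == "__main__":
    # Hypothetical toy labels, not the shared task's label set.
    gold = ["abusive", "abusive", "none", "none"]
    pred = ["abusive", "none", "none", "none"]
    print(char_ngrams("abab", 2, 2))
    print(round(weighted_f1(gold, pred), 4))
```

In a full pipeline, n-gram counts like these would typically be turned into fixed-length vectors (for example with TF-IDF weighting) before being passed to a classifier such as an MLP; that step is omitted here.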

Original language	English
Title of host publication	DravidianLangTech 2022 - 2nd Workshop on Speech and Language Technologies for Dravidian Languages, Proceedings of the Workshop
Editors	Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Parameswari Krishnamurthy, Elizabeth Sherly, Sinnathamby Mahesan
Publisher	Association for Computational Linguistics (ACL)
Pages	64-69
Number of pages	6
ISBN (Electronic)	9781955917346
Publication status	Published - 2022
Event	2nd Workshop on Speech and Language Technologies for Dravidian Languages, Proceedings of the Workshop, DravidianLangTech 2022 - Dublin, Ireland. Duration: 26 May 2022 → …

Publication series

Name	DravidianLangTech 2022 - 2nd Workshop on Speech and Language Technologies for Dravidian Languages, Proceedings of the Workshop

Conference

Conference	2nd Workshop on Speech and Language Technologies for Dravidian Languages, Proceedings of the Workshop, DravidianLangTech 2022
Country/Territory	Ireland
City	Dublin
Period	26/05/22 → …

Other files and links

Link to publication in Scopus

Cite this

Balouchzahi, F., Anusha, M. D., Shashirekha, H. L., & Sidorov, G. (2022). MUCIC@TamilNLP-ACL2022: Abusive Comment Detection in Tamil Language using 1D Conv-LSTM. In B. R. Chakravarthi, R. Priyadharshini, A. K. Madasamy, P. Krishnamurthy, E. Sherly, & S. Mahesan (Eds.), DravidianLangTech 2022 - 2nd Workshop on Speech and Language Technologies for Dravidian Languages, Proceedings of the Workshop (pp. 64-69). (DravidianLangTech 2022 - 2nd Workshop on Speech and Language Technologies for Dravidian Languages, Proceedings of the Workshop). Association for Computational Linguistics (ACL).

@inproceedings{e77af4b19411492aa807d8d153ed367f,
title = "MUCIC@TamilNLP-ACL2022: Abusive Comment Detection in Tamil Language using 1D Conv-LSTM",
author = "F. Balouchzahi and Anusha, {M. D.} and Shashirekha, {H. L.} and G. Sidorov",
note = "Publisher Copyright: {\textcopyright} 2022 Association for Computational Linguistics.; 2nd Workshop on Speech and Language Technologies for Dravidian Languages, Proceedings of the Workshop, DravidianLangTech 2022 ; Conference date: 26-05-2022",
year = "2022",
language = "English",
series = "DravidianLangTech 2022 - 2nd Workshop on Speech and Language Technologies for Dravidian Languages, Proceedings of the Workshop",
publisher = "Association for Computational Linguistics (ACL)",
pages = "64--69",
editor = "Chakravarthi, {Bharathi Raja} and Ruba Priyadharshini and Madasamy, {Anand Kumar} and Parameswari Krishnamurthy and Elizabeth Sherly and Sinnathamby Mahesan",
booktitle = "DravidianLangTech 2022 - 2nd Workshop on Speech and Language Technologies for Dravidian Languages, Proceedings of the Workshop",
}

TY - GEN
T1 - MUCIC@TamilNLP-ACL2022
T2 - 2nd Workshop on Speech and Language Technologies for Dravidian Languages, Proceedings of the Workshop, DravidianLangTech 2022
AU - Balouchzahi, F.
AU - Anusha, M. D.
AU - Shashirekha, H. L.
AU - Sidorov, G.
PY - 2022
Y1 - 2022
UR - http://www.scopus.com/inward/record.url?scp=85137172156&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85137172156
T3 - DravidianLangTech 2022 - 2nd Workshop on Speech and Language Technologies for Dravidian Languages, Proceedings of the Workshop
SP - 64
EP - 69
BT - DravidianLangTech 2022 - 2nd Workshop on Speech and Language Technologies for Dravidian Languages, Proceedings of the Workshop
A2 - Chakravarthi, Bharathi Raja
A2 - Priyadharshini, Ruba
A2 - Madasamy, Anand Kumar
A2 - Krishnamurthy, Parameswari
A2 - Sherly, Elizabeth
A2 - Mahesan, Sinnathamby
PB - Association for Computational Linguistics (ACL)
Y2 - 26 May 2022
ER -
