THANGCIC at PoliticEs 2022: Term-based BERT for Extracting Political Ideology from Spanish Author Profiling

Hoang Thang Ta, Abu Bakar Siddiqur Rahman, Lotfollah Najjar, Alexander Gelbukh

Research output: Contribution to journal › Conference article › Peer-reviewed

1 Citation (Scopus)

Abstract

This paper presents our participation in the task of detecting the gender, profession, and political ideology of Spanish users from their tweets, with ideology treated as both a binary and a multi-class problem. The task plays an important role in identifying the political ideology of parties and politicians, especially newly emerging ones. It may support related tasks such as predicting election outcomes or influencing citizens' decisions through propagation systems. For each user, we extracted the most popular terms from a set of their tweets as features, then used them as input for training in a transfer learning setup on pre-trained BERT models. We suggest this fast method as a baseline for the task; its best result was an average macro F1 of 72.72% across the four subtasks. In detail, we obtained an F1 of 69.14% for gender, 81.47% for profession, 75.76% for binary ideology, and 64.51% for multi-class ideology.
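The term-based feature step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the tokenization, stopword list, or number of terms kept, so the regular expressions, the tiny `STOPWORDS` set, the parameter `k`, and the function names `top_terms` and `build_input` below are all assumptions made for illustration, not the authors' implementation.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny Spanish stopword list (assumption).
STOPWORDS = {"de", "la", "el", "en", "y", "a", "que", "los", "las",
             "un", "una", "es", "por", "con", "para", "del"}

# Matches runs of Unicode letters (no digits or underscores).
TOKEN_RE = re.compile(r"[^\W\d_]+")
# Matches URLs and @mentions so they can be dropped before counting.
NOISE_RE = re.compile(r"https?://\S+|@\w+")


def top_terms(tweets, k=5):
    """Return the k most frequent non-stopword terms across a user's tweets."""
    counts = Counter()
    for tweet in tweets:
        text = NOISE_RE.sub(" ", tweet.lower())
        counts.update(
            tok for tok in TOKEN_RE.findall(text)
            if tok not in STOPWORDS and len(tok) > 2
        )
    # Sort by descending frequency, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def build_input(tweets, k=5):
    """Join a user's top terms into one string, e.g. as text for a BERT tokenizer."""
    return " ".join(top_terms(tweets, k))


if __name__ == "__main__":
    user_tweets = [
        "El gobierno sube los impuestos",
        "Impuestos y más impuestos del gobierno @user https://t.co/x",
        "La economía crece",
    ]
    print(build_input(user_tweets, k=2))
```

Under these assumptions, each user is reduced to one short string of frequent terms, which could then be tokenized and passed to a pre-trained BERT classifier for fine-tuning; that step is not shown here.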

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 3202
Status: Published - 2022
Event: 2022 Iberian Languages Evaluation Forum, IberLEF 2022 - A Coruña, Spain
Duration: 20 Sep 2022 → …
