GAN-BERT, an Adversarial Learning Architecture for Paraphrase Identification

Hoang Thang Ta, Abu Bakar Siddiqur Rahman, Lotfollah Najjar, Alexander Gelbukh

Research output: Contribution to a journal › Conference article › Peer review

2 Citations (Scopus)

Abstract

In this paper, we address the task of sentence-level Paraphrase Identification in Mexican Spanish (PAR-MEX). We introduce our method, which feeds text embeddings from pre-trained transformer models into GAN-BERT, an adversarial learning architecture. We modify the generator's noise vectors so that they have a random rate and the same size as the transformers' hidden layer. To improve model performance, a rule of thumb based on pair similarity is used to remove likely mislabelled sentence pairs from the positive examples, in parallel with the addition of unlabelled data from the same domain. Our best F1 score is 90.22%, ranking third in the final results table and outperforming the organizers' baseline.
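The two modifications described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names, the uniform noise-rate range, and the 0.5 similarity threshold are assumptions, and embeddings here are plain NumPy vectors standing in for transformer outputs.

```python
import numpy as np


def gan_bert_noise(batch_size, hidden_size, rng):
    """Generator noise sized to the transformer hidden layer, scaled by a
    random per-example rate (an assumption about the paper's "random rate")."""
    rate = rng.uniform(0.0, 1.0, size=(batch_size, 1))
    return rate * rng.standard_normal((batch_size, hidden_size))


def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def filter_positive_pairs(pairs, threshold=0.5):
    """Drop positive pairs whose embedding similarity falls below a
    threshold, on the assumption that such pairs are likely mislabelled.
    `pairs` is a list of (embedding_a, embedding_b) tuples; the 0.5
    threshold is an illustrative guess, not the paper's value."""
    return [(a, b) for a, b in pairs if cosine_similarity(a, b) >= threshold]


# Toy usage: noise matching a 768-dimensional hidden layer, and one
# near-identical pair kept while a near-orthogonal pair is dropped.
rng = np.random.default_rng(0)
noise = gan_bert_noise(batch_size=4, hidden_size=768, rng=rng)

good = (np.array([1.0, 0.0]), np.array([0.9, 0.1]))
bad = (np.array([1.0, 0.0]), np.array([0.0, 1.0]))
kept = filter_positive_pairs([good, bad])  # keeps only the similar pair
```

In the full GAN-BERT setup, the noise vector would be the generator's input and the discriminator would be built on top of the pre-trained transformer; the sketch only shows the two pre-processing/noise choices the abstract highlights.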

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 3202
Status: Published - 2022
Event: 2022 Iberian Languages Evaluation Forum, IberLEF 2022 - A Coruña, Spain
Duration: 20 Sep 2022 → …
