mBERT and Simple Post-Processing: A Baseline for Disease Mention Detection in Spanish

Antonio Tamayo, Diego A. Burgos, Alexander Gelbukh

Research output: Contribution to journal, Conference article, peer-reviewed

1 Citation (Scopus)

Abstract

Automatic disease mention extraction is a relevant task due to its various applications in the medical field. During the last decade, many related works have been published, which have accelerated the progress of this research area, but most of them have been carried out in English. In this work, we propose a deep-learning baseline for this task in Spanish. We report an approach based on transfer learning that uses multilingual BERT and a straightforward post-processing step to tackle the problem. Our system does not use any external resources and relies only on efficient fine-tuning, which makes it a fair baseline (Micro F1 = 0.5456) for disease mention identification in Spanish using transformer-based models.

Original language: English
Pages (from-to): 350-356
Number of pages: 7
Publication: CEUR Workshop Proceedings
Volume: 3180
Status: Published - 2022
Event: 2022 Conference and Labs of the Evaluation Forum, CLEF 2022 - Bologna, Italy
Duration: 5 Sep 2022 to 8 Sep 2022
