TY - GEN
T1 - Urdu Named Entity Recognition with Attention Bi-LSTM-CRF Model
AU - Ullah, Fida
AU - Ullah, Ihsan
AU - Kolesnikova, Olga
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Named entity recognition (NER) is a challenging problem in natural language processing (NLP), especially for languages with very few annotated corpora, such as Urdu. In this paper, we propose an Attention-Bi-LSTM-CRF method and apply it to the MK-PUCIT corpus, the latest NER dataset available for the Urdu language. In addition to word-level embeddings, we use an embedding-level focus mechanism. The output of the embedding layer is fed into a bidirectional LSTM encoder unit, accompanied by another self-attention layer to boost the system’s accuracy. Our Attention-Bi-LSTM-CRF model achieves an F1-score of 92%. The experimental results show that our approach outperforms existing methods, setting a new state of the art for Urdu named entity recognition (UNER).
KW - Attention mechanism
KW - Deep learning
KW - Named entity recognition
KW - Natural language processing
KW - Word embedding
UR - http://www.scopus.com/inward/record.url?scp=85142829581&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-19496-2_1
DO - 10.1007/978-3-031-19496-2_1
M3 - Conference contribution
AN - SCOPUS:85142829581
SN - 9783031194955
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 3
EP - 17
BT - Advances in Computational Intelligence - 21st Mexican International Conference on Artificial Intelligence, MICAI 2022, Proceedings
A2 - Pichardo Lagunas, Obdulia
A2 - Martínez Seis, Bella
A2 - Martínez-Miranda, Juan
PB - Springer Science and Business Media Deutschland GmbH
T2 - 21st Mexican International Conference on Artificial Intelligence, MICAI 2022
Y2 - 24 October 2022 through 29 October 2022
ER -