Urdu Named Entity Recognition with Attention Bi-LSTM-CRF Model

Fida Ullah, Ihsan Ullah, Olga Kolesnikova

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Named entity recognition (NER) is a challenging problem in natural language processing (NLP), especially for languages with very few annotated corpora, such as Urdu. In this paper, we propose an Attention-Bi-LSTM-CRF method and apply it to the MK-PUCIT corpus, the latest NER dataset available for the Urdu language. In addition to word-level embeddings, we use an embedding-level focus mechanism. The output of the embedding layer is fed into a bidirectional LSTM encoder, accompanied by another self-attention layer to boost the system’s accuracy. Our Attention-Bi-LSTM-CRF model achieves an F1-score of 92%. Taken together, the experimental results show that our approach outperforms existing methods, yielding new state-of-the-art performance for Urdu named entity recognition (UNER).
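
Two of the components named in the abstract, a self-attention layer over encoder states and a CRF output layer, can be illustrated in isolation. The sketch below is not the authors' implementation: it is plain Python with no learned parameters, the attention uses identity projections instead of trained query/key/value matrices, and the function names, tag set, and scores are all hypothetical. It only shows the general shape of scaled dot-product self-attention and of Viterbi decoding in a linear-chain CRF.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(states):
    """Scaled dot-product self-attention with identity projections
    (an assumption for brevity; a trained model would learn Q/K/V)."""
    d = len(states[0])
    out = []
    for q in states:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in states]
        weights = softmax(scores)
        # Each output vector is a weighted average of all state vectors.
        out.append([sum(w * v[i] for w, v in zip(weights, states)) for i in range(d)])
    return out

def viterbi(emissions, transitions):
    """Best tag path for a linear-chain CRF.
    emissions[t][j]: score of tag j at position t.
    transitions[i][j]: score of moving from tag i to tag j."""
    n_tags = len(transitions)
    score = list(emissions[0])
    backptrs = []
    for emit in emissions[1:]:
        new_score, ptr = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            ptr.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        score = new_score
        backptrs.append(ptr)
    # Follow back-pointers from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path

# Hypothetical tag set and scores, for illustration only.
TAGS = ["O", "B-PER", "I-PER"]
# Forbid the invalid transition O -> I-PER with a large negative score.
TRANSITIONS = [[0.0, 0.0, -1e4], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
EMISSIONS = [[0.1, 2.0, 0.3], [0.2, 0.1, 1.5], [2.0, 0.1, 0.4]]

best_path = [TAGS[i] for i in viterbi(EMISSIONS, TRANSITIONS)]
```

In a full model of this kind, the emission scores would come from a learned projection of the attended Bi-LSTM states, and the transition scores would be trained jointly with the rest of the network. The transition matrix is what lets the CRF rule out invalid tag sequences that per-token classification could produce.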

Original language: English
Title of host publication: Advances in Computational Intelligence - 21st Mexican International Conference on Artificial Intelligence, MICAI 2022, Proceedings
Editors: Obdulia Pichardo Lagunas, Bella Martínez Seis, Juan Martínez-Miranda
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 3-17
Number of pages: 15
ISBN (Print): 9783031194955
State: Published - 2022
Event: 21st Mexican International Conference on Artificial Intelligence, MICAI 2022 - Monterrey, Mexico
Duration: 24 Oct 2022 → 29 Oct 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13613 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st Mexican International Conference on Artificial Intelligence, MICAI 2022
Country/Territory: Mexico
City: Monterrey
Period: 24/10/22 → 29/10/22

Keywords

  • Attention mechanism
  • Deep learning
  • Named entity recognition
  • Natural language processing
  • Word embedding
