The Combination of BERT and Data Oversampling for Answer Type Prediction

Thang Ta Hoang, Olumide Ebenezer Ojo, Olaronke Oluwayemisi Adebanji, Hiram Calvo, Alexander Gelbukh

Research output: Contribution to journal › Conference article › peer-review

3 Scopus citations

Abstract

In this paper, we address Task 1 of the SMART Task 2021: predicting answer categories and types based on target ontologies, which could be useful in knowledge-based Question Answering (QA) systems. Our method combines BERT architectures with data oversampling, performed by replacing terms linked to Wikidata and dependent noun phrases, to attain state-of-the-art performance. On the DBpedia dataset, the accuracy is 98.5%, whereas NDCG@5 and NDCG@10 are 72.7% and 66.4%, respectively. On the Wikidata dataset, our model achieves the best performance among the participating teams, with an accuracy of 98% and a Mean Reciprocal Rank (MRR) of 70%.
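For readers unfamiliar with the ranking metrics quoted in the abstract, the following is a minimal sketch of the textbook definitions of reciprocal rank and NDCG@k with binary relevance. It is not the official SMART evaluation script, which may score type hierarchies differently; the function names and the example type labels are assumptions made for illustration only.

```python
import math

def reciprocal_rank(ranked, gold):
    """Return 1/rank of the first correct type in `ranked`, or 0.0 if none is correct."""
    for rank, item in enumerate(ranked, start=1):
        if item in gold:
            return 1.0 / rank
    return 0.0

def dcg_at_k(ranked, gold, k):
    """Discounted cumulative gain over the top-k items, with binary relevance."""
    return sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked[:k], start=1)
        if item in gold
    )

def ndcg_at_k(ranked, gold, k):
    """DCG@k divided by the DCG@k of an ideal ranking (all gold types first)."""
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, min(len(gold), k) + 1))
    return dcg_at_k(ranked, gold, k) / ideal if ideal > 0 else 0.0

# Hypothetical prediction for one question: the single gold type is ranked second.
predicted = ["dbo:Place", "dbo:City", "dbo:Person"]
gold_types = {"dbo:City"}
rr = reciprocal_rank(predicted, gold_types)
ndcg5 = ndcg_at_k(predicted, gold_types, 5)
```

MRR is the mean of `reciprocal_rank` over all questions, and a dataset-level NDCG@k is likewise the mean of the per-question values.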

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 3119
State: Published - 2022
Event: 2nd SeMantic Answer Type and Relation Prediction Task at ISWC Semantic Web Challenge, SMART 2021 - Virtual, Online
Duration: 26 Oct 2021 → …

Keywords

  • Answer Type Prediction
  • ISWC
  • Question Answering
  • Semantic Web Challenge
