Improving Neural Machine Translation for Low Resource Languages Using Mixed Training: The Case of Ethiopian Languages

Atnafu Lambebo Tonja, Olga Kolesnikova, Muhammad Arif, Alexander Gelbukh, Grigori Sidorov

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

4 Scopus citations

Abstract

Neural Machine Translation (NMT) has improved considerably for high-resource languages, but low-resource languages remain a problem because NMT performs well when trained on the large parallel corpora available for high-resource languages. Despite many proposals to address low-resource translation, it remains a difficult challenge. The issue becomes even more complicated when the few available resources cover only one domain. To address this issue, we propose a new approach to improve NMT for low-resource languages. Using the transformer model, the proposed approach improves BLEU scores by 5.3, 5.0, and 3.7 for the Gamo-English, Gofa-English, and Dawuro-English language pairs, respectively, where Gamo, Gofa, and Dawuro are related low-resource Ethiopian languages. We discuss our contributions and envisage future steps in this challenging research area.

Original language: English
Title of host publication: Advances in Computational Intelligence - 21st Mexican International Conference on Artificial Intelligence, MICAI 2022, Proceedings
Editors: Obdulia Pichardo Lagunas, Bella Martínez Seis, Juan Martínez-Miranda
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 30-40
Number of pages: 11
ISBN (Print): 9783031194955
State: Published - 2022
Event: 21st Mexican International Conference on Artificial Intelligence, MICAI 2022 - Monterrey, Mexico
Duration: 24 Oct 2022 to 29 Oct 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13613 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st Mexican International Conference on Artificial Intelligence, MICAI 2022
Country/Territory: Mexico
City: Monterrey
Period: 24/10/22 to 29/10/22

Keywords

  • Ethiopian languages
  • Low-resource machine translation
  • Machine translation
  • Mixed training
  • Neural machine translation
