Addressing the Issue of Unavailability of Parallel Corpus Incorporating Monolingual Corpus on PBSMT System for English-Manipuri Translation

Amika Achom, Partha Pakray, Alexander Gelbukh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper shows that a phrase-based statistical machine translation (PBSMT) system for English-to-Manipuri translation can be improved by incorporating a monolingual corpus on the target (Manipuri) side. No previous work has focused on translating Manipuri, a Tibeto-Burman minority language of India. The system was developed using the open-source Moses toolkit and evaluated with several automatic and human evaluation techniques. The PBSMT system that incorporates the monolingual corpus achieves a BLEU score of 10.15, compared with 9.89 for the baseline PBSMT system, using the same training, tuning, and testing datasets. The work thereby addresses the limited availability of English-Manipuri parallel text corpora.

Original language: English
Title of host publication: Computational Linguistics and Intelligent Text Processing - 19th International Conference, CICLing 2018, Revised Selected Papers
Editors: Alexander Gelbukh
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 299-319
Number of pages: 21
ISBN (Print): 9783031237928
State: Published - 2023
Event: 19th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2018 - Hanoi, Viet Nam
Duration: 18 Mar 2018 to 24 Mar 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13396 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2018
Country/Territory: Viet Nam
City: Hanoi
Period: 18/03/18 to 24/03/18

Keywords

  • Adequacy
  • Automatic-bilingual evaluation understudy score (BLEU)
  • English to Manipuri parallel corpora
  • Human evaluation-fluency
  • Overall rating
  • Phrase based statistical machine translation
