dc.contributor.advisor: Zhao, Yunxin
dc.contributor.author: Graham, Heather Mackenzie
dc.date.issued: 2020
dc.date.submitted: 2020 Spring
dc.description.abstract: [ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT REQUEST OF AUTHOR.] While machine translation has achieved impressive results on the world's most widely spoken languages, thousands of languages do not have the quantity of data necessary to train a state-of-the-art system. We propose here a technique to identify the best available datasets for augmentation in many-to-one multilingual neural machine translation systems by quantifying the factors that most affect translation performance: data set domain, relation between source side languages, translation quality, and data set size. Previous research has considered these factors qualitatively and in isolation from each other, but selecting an augmenting data set from various possibilities requires a quantitative synthesis of all these factors. We evaluate a number of techniques to measure each of these factors and learn a system combining them. The focus is on the Luyia languages of western Kenya as a case study for an extreme low resource scenario, but the application of these techniques to similar languages is also explored.
dc.description.bibref: Includes bibliographical references.
dc.format.extent: 1 online resource (x, 95 pages) ; Illustrations
dc.identifier.uri: https://hdl.handle.net/10355/78189
dc.language: English
dc.publisher: University of Missouri--Columbia
dc.relation.ispartofcommunity: University of Missouri--Columbia. Graduate School. Theses and Dissertations
dc.rights: Access to files is limited to the campuses of the University of Missouri
dc.rights.license: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License. Copyright held by author.
dc.subject.discipline: Computer science
dc.title: Selecting data for multilingual multi-domain neural machine translation on low resource languages
dc.type: Thesis
thesis.degree.discipline: Computer science (MU)
thesis.degree.grantor: University of Missouri--Columbia
thesis.degree.level: Masters
thesis.degree.name: M.S.