
dc.contributor.advisor: Zhao, Yunxin [eng]
dc.contributor.author: Sun, Xie [eng]
dc.date.issued: 2011 [eng]
dc.date.submitted: 2011 Fall [eng]
dc.description: Title from PDF of title page (University of Missouri--Columbia, viewed on May 30, 2012). [eng]
dc.description: The entire thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file; a non-technical public abstract appears in the public.pdf file. [eng]
dc.description: Dissertation advisor: Dr. Yunxin Zhao [eng]
dc.description: Vita. [eng]
dc.description: Ph. D. University of Missouri--Columbia 2011 [eng]
dc.description: "December 2011" [eng]
dc.description.abstract: In this dissertation, a novel approach that integrates template matching with statistical modeling is proposed to improve continuous speech recognition. The commonly used Hidden Markov Models (HMMs) are ineffective at modeling the details of speech temporal evolution, a weakness that template-based methods can overcome. However, template-based methods are difficult to extend to large vocabulary continuous speech recognition (LVCSR). Our proposed approach takes advantage of both statistical modeling and template matching to overcome the weaknesses of traditional HMMs and conventional template-based methods. We use multiple Gaussian Mixture Model indices to represent each frame of the speech templates. Local distances based on the log likelihood ratio and the Kullback-Leibler divergence are proposed for dynamic time warping based template matching. To reduce computational complexity and storage space, we propose minimum distance template selection and maximum log-likelihood template selection methods, and investigate a template compression method on top of template selection to further improve recognition performance. Experimental results on the TIMIT phone recognition task and an LVCSR task of telehealth captioning demonstrated that the proposed approach significantly improved recognition accuracy over the HMM baselines, and on the TIMIT task the proposed method showed consistent improvements over progressively enhanced HMM baselines. Moreover, the template selection methods largely reduced computation and storage complexity. Finally, an investigation was made into combining acoustic scores from triphone template matching with scores of prosodic features, which showed positive effects on vowels in LVCSR. [eng]
dc.description.bibref: Includes bibliographical references. [eng]
dc.format.extent: xii, 95 pages [eng]
dc.identifier.oclc: 872561466 [eng]
dc.identifier.uri: https://doi.org/10.32469/10355/14455 [eng]
dc.identifier.uri: https://hdl.handle.net/10355/14455
dc.language: English [eng]
dc.publisher: University of Missouri--Columbia [eng]
dc.relation.ispartofcommunity: University of Missouri--Columbia. Graduate School. Theses and Dissertations [eng]
dc.rights: OpenAccess. [eng]
dc.rights.license: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.
dc.subject: speech recognition [eng]
dc.subject: template matching [eng]
dc.subject: lattice rescoring [eng]
dc.subject: Gaussian mixture model [eng]
dc.title: Integrate template matching and statistical modeling for continuous speech recognition [eng]
dc.type: Thesis [eng]
thesis.degree.discipline: Computer science (MU) [eng]
thesis.degree.grantor: University of Missouri--Columbia [eng]
thesis.degree.level: Doctoral [eng]
thesis.degree.name: Ph. D. [eng]
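
Illustrative note on the abstract: the dynamic time warping (DTW) template matching with a Kullback-Leibler local distance that the abstract mentions can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the dissertation's implementation. It assumes each frame is a discrete probability vector (for example, weights over Gaussian mixture components), uses a symmetric form of the KL divergence, and applies the textbook three-step DTW recursion. The function names `sym_kl` and `dtw_distance` and the `eps` floor are hypothetical choices made for this illustration.

```python
import math


def sym_kl(p, q, eps=1e-10):
    """Symmetric KL divergence, KL(p||q) + KL(q||p), for discrete distributions.

    Assumption: p and q are equal-length probability vectors. Values are
    floored at eps to avoid log(0); they are not renormalized afterwards.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        pi = max(pi, eps)
        qi = max(qi, eps)
        # (p - q) * log(p / q) is the per-component sum of both KL directions.
        total += (pi - qi) * math.log(pi / qi)
    return total


def dtw_distance(template, test, local_dist=sym_kl):
    """Accumulated DTW cost between two frame sequences.

    Uses the standard recursion over the three predecessors
    (i-1, j), (i, j-1), (i-1, j-1) with no slope constraints.
    """
    n, m = len(template), len(test)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local_dist(template[i - 1], test[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical toy frames: two-component probability vectors.
frame_a = [0.9, 0.1]
frame_b = [0.1, 0.9]
template = [frame_a, frame_b]
stretched = [frame_a, frame_a, frame_b]  # same pattern, spoken more slowly
reversed_seq = [frame_b, frame_a]
```

In this toy setup, `dtw_distance(template, stretched)` is zero because the warping path absorbs the repeated frame, while `dtw_distance(template, reversed_seq)` is positive. In a recognizer, such a cost would be computed against many stored templates, which is why the abstract's template selection and compression steps matter for computation and storage.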


