dc.contributor.advisor: Zhao, Yunxin [eng]
dc.contributor.author: Xue, Jian, 1975- [eng]
dc.date.issued: 2007 [eng]
dc.date.submitted: 2007 Fall [eng]
dc.description: The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. [eng]
dc.description: Title from title screen of research.pdf file (viewed on March 4, 2008) [eng]
dc.description: Includes bibliographical references. [eng]
dc.description: Vita. [eng]
dc.description: Thesis (Ph. D.) University of Missouri-Columbia 2007. [eng]
dc.description: Dissertations, Academic -- University of Missouri--Columbia -- Computer science. [eng]
dc.description.abstract: In this work, new approaches are proposed for online large vocabulary conversational speech recognition, including a fast confusion network algorithm, novel features and a Random Forests based classifier for word confidence annotation, new improvements in speech decoding speed and latency, novel lookahead phonetic decision tree state tying, and Random Forests of phonetic decision tree state tying for acoustic modeling of speech sound units. The fast confusion network algorithm significantly improves the time complexity from O(T^3) to O(T), with T equaling the number of links in a word lattice. Several novel features, as well as a Random Forests based classification technique, are proposed to improve word annotation accuracy for automatic captioning. In order to improve the speed of the speech decoding engine, we propose to use complementary word confidence scores to prune uncompetitive search paths, and to use subspace distribution clustering hidden Markov modeling to speed up computation of acoustic scores and local confidence scores. We further integrate pre-backtrace in decoding search to significantly reduce captioning latency. In this work we also investigate novel approaches to improve the performance of phonetic decision tree state tying, including two lookahead methods and a Random Forests method. The constrained lookahead method finds an optimal question among n pre-selected questions for each split node to decrease the effects of outliers, and it also discounts the contributions of likelihood gains by deeper descendants. The stochastic full lookahead method uses sub-tree size instead of likelihood gain as a measure for phonetic question selection, in order to produce small trees that generalize better and are consistent with the training data. The Random Forests method uses an ensemble of phonetic decision trees to derive a single strong model for each speech unit. We investigate several methods of combining the acoustic scores from multiple models obtained from multiple phonetic decision trees in decoding search. We further propose clustering methods to compact the Random Forests generated acoustic models to speed up decoding search. [eng]
dc.identifier.merlin: b62414045 [eng]
dc.identifier.oclc: 212818230 [eng]
dc.identifier.uri: https://hdl.handle.net/10355/4821
dc.identifier.uri: https://doi.org/10.32469/10355/4821 [eng]
dc.language: English [eng]
dc.publisher: University of Missouri--Columbia [eng]
dc.relation.ispartofcollection: University of Missouri--Columbia. Graduate School. Theses and Dissertations [eng]
dc.subject.lcsh: Speech processing systems [eng]
dc.subject.lcsh: Coding theory [eng]
dc.subject.lcsh: Pattern recognition systems [eng]
dc.title: Improvement of decoding engine & phonetic decision tree in acoustic modeling for online large vocabulary conversational speech recognition [eng]
dc.type: Thesis [eng]
thesis.degree.discipline: Computer science (MU) [eng]
thesis.degree.grantor: University of Missouri--Columbia [eng]
thesis.degree.level: Doctoral [eng]
thesis.degree.name: Ph. D. [eng]

