    Language modeling for automatic speech recognition in telehealth

    Zhang, Xiaojia, 1977-
    View/Open
    [PDF] public.pdf (12.68Kb)
    [PDF] short.pdf (9.846Kb)
    [PDF] research.pdf (563.1Kb)
    Date
    2005
    Format
    Thesis
    Metadata
    Abstract
Standard statistical n-gram language models play a critical and indispensable role in automatic speech recognition (ASR) applications. Though helpful to ASR, they suffer from a practical problem when sufficient in-domain training data, that is, data from the same or similar sources as the task text, are lacking. To improve language model performance, various datasets need to be used to supplement the in-domain training data. This thesis investigates effective approaches to language modeling for telehealth, which consists of doctor-patient conversational speech in medical specialty domains. Efforts were made to collect and analyze various datasets for training, as well as to find a method for modeling the target language. By effectively defining word classes, and by combining class and word trigram language models trained separately on in-domain and out-of-domain datasets, large perplexity reductions were achieved over a baseline word trigram language model that simply interpolates word trigram models trained from different data sources.
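The baseline described in the abstract, linearly interpolating language models trained on different data sources and measuring the result by perplexity, can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: the probability values and the interpolation weight are hypothetical, and a real system would tune the weight on held-out data (e.g. with EM) and work over full trigram distributions rather than per-token scores.

```python
import math

def interpolate(p_in, p_out, lam):
    """Linearly mix probabilities from an in-domain and an
    out-of-domain model; lam weights the in-domain model."""
    return lam * p_in + (1.0 - lam) * p_out

def perplexity(token_probs):
    """Perplexity of a sequence given its per-token probabilities:
    2 ** (-(1/N) * sum(log2 p))."""
    log_sum = sum(math.log2(p) for p in token_probs)
    return 2.0 ** (-log_sum / len(token_probs))

# Hypothetical per-token probabilities for one test utterance:
# a sparse in-domain (telehealth) model vs. a broad out-of-domain model.
in_domain = [0.20, 0.05, 0.10]
out_domain = [0.10, 0.15, 0.08]

mixed = [interpolate(a, b, 0.6) for a, b in zip(in_domain, out_domain)]
print(round(perplexity(mixed), 2))
```

A lower perplexity on in-domain test text indicates a better-matched model; the thesis's contribution is that combining class-based and word trigram models beats this simple word-trigram interpolation baseline.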
    URI
    http://hdl.handle.net/10355/4245
    Degree
    M.S.
    Thesis Department
    Computer science (MU)
    Collections
    • 2005 MU theses - Freely available online
    • Computer Science electronic theses and dissertations (MU)

Send Feedback
hosted by University of Missouri Library Systems