Real-time speaker-independent large vocabulary continuous speech recognition

Li, Xiaolong, 1976-

Files

public.pdf (8.81 KB)

short.pdf (587.92 KB)

research.pdf (1.65 MB)

Authors

Li, Xiaolong, 1976-

Date

2005

Format

Thesis

Abstract

This dissertation presents a real-time decoding engine for speaker-independent large vocabulary continuous speech recognition (LVCSR). Three indispensable and correlated performance measures (accuracy, speed, and memory cost) are carefully considered in the system design. A novel algorithm, Order-Preserving Language Model Context Pre-computing (OPCP), is proposed for fast Language Model (LM) lookup, resulting in significant improvements in both overall decoding time and memory usage without any loss of recognition accuracy. The time and memory savings that OPCP brings to LM lookup become more pronounced as LM size increases. Using OPCP and other optimizations, our one-pass LVCSR decoding engine, named TigerEngine, reached real-time speed on both the Wall Street Journal 20K and Switchboard 33K tasks, running on a Dell workstation with one 3.2 GHz Xeon CPU. TigerEngine is to be used in automatic captioning for Telehealth.

URI

https://hdl.handle.net/10355/4119
https://doi.org/10.32469/10355/4119

Degree

Ph. D.

Thesis Department

Computer science (MU)

Collections

2005 MU dissertations - Freely available online
Computer Science electronic theses and dissertations (MU)
