Probabilistic Segmentation for Segment-Based Speech Recognition
Steven C. Lee
Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
Segment-based speech recognition systems must explicitly hypothesize segment start and end times. The purpose of a segmentation algorithm is to hypothesize those times and to compose a graph of segments from them. During recognition, this graph is an input to a search that finds the optimal sequence of sound units through it. The goal of this thesis is to create a high-quality, real-time phonetic segmentation algorithm for segment-based speech recognition. A high-quality segmentation algorithm produces a sparse network of segments that contains most of the actual segments in the speech utterance. A real-time algorithm is one that is fast and that can produce its output in a pipelined manner. The approach taken in this thesis is to adopt the framework of a state-of-the-art algorithm that does not operate in real time and to make the modifications necessary for it to do so. The algorithm adopted as the starting point for this work makes use of a for...