Discriminative Complexity Control and Linear Projections for Large Vocabulary Speech Recognition (2005)
BibTeX
@MISC{Liu05discriminativecomplexity,
author = {Xunying Liu},
title = {Discriminative Complexity Control and Linear Projections for Large Vocabulary Speech Recognition},
year = {2005}
}
Abstract
Selecting the optimal model structure with the “appropriate” complexity is a standard problem for training large vocabulary continuous speech recognition (LVCSR) systems, and machine learning in general. State-of-the-art LVCSR systems are highly complex. A wide variety of techniques may be used which alter the system complexity and word error rate (WER). Explicitly evaluating systems for all possible configurations is infeasible. Automatic model complexity control criteria are needed. Most existing complexity control schemes can be classified into two types: Bayesian learning techniques and information theory approaches. An implicit assumption is made in both that increasing the likelihood on held-out data decreases the WER. However, this correlation is found to be quite weak for current speech recognition systems. Hence it is preferable to employ discriminative methods for complexity control. In this thesis a novel discriminative model selection technique, the marginalization of a discriminative growth function, is presented. This is a closer approximation to the true WER than standard likelihood based approaches. The number of Gaussian components and feature dimensions of an HMM based LVCSR system is controlled. Experimental results on a wide range of LVCSR tasks showed that







