A Robust System for Natural Spoken Dialogue
Association for Computational Linguistics, 1996
Cited by 111 (10 self)
This paper describes a system that leads us to believe in the feasibility of constructing natural spoken dialogue systems in task-oriented domains. It specifically addresses the issue of robust interpretation of speech in the presence of recognition errors. Robustness is achieved by a combination of statistical error post-correction, syntactically- and semantically-driven robust parsing, and extensive use of the dialogue context. We present an evaluation of the system using time-to-completion and the quality of the final solution that suggests that most native speakers of English can use the system successfully with virtually no training.
Preliminaries to a Theory of Speech Disfluencies
1994
Cited by 97 (7 self)
This thesis examines disfluencies (e.g., "um", repeated words, and a variety of forms of self-repair) in the spontaneous speech of adult normal speakers of American English. Despite their prevalence, disfluencies have traditionally been viewed as irregular events and have received little attention. The goal of the thesis is to provide evidence that, on the contrary, disfluencies show remarkably regular trends in a number of dimensions. These regularities have consequences for models of human language production; they can also be exploited to improve performance in speech applications. The method includes analysis of over 5000 hand-annotated disfluencies from a database (250,000 words) containing three different styles of spontaneous speech: task-oriented human-computer dialog, task-oriented human-human dialog, and human-human conversation on a prescribed topic. The approach is theory-neutral and strongly data-driven. The annotations correspond to observable characteristics ("features") ...
Hidden Markov processes
IEEE Trans. Inform. Theory, 2002
Cited by 93 (2 self)
An overview of statistical and information-theoretic aspects of hidden Markov processes (HMPs) is presented. An HMP is a discrete-time finite-state homogeneous Markov chain observed through a discrete-time memoryless invariant channel. In recent years, the work of Baum and Petrie on finite-state finite-alphabet HMPs was expanded to HMPs with finite as well as continuous state spaces and a general alphabet. In particular, statistical properties and ergodic theorems for relative entropy densities of HMPs were developed. Consistency and asymptotic normality of the maximum-likelihood (ML) parameter estimator were proved under some mild conditions. Similar results were established for switching autoregressive processes. These processes generalize HMPs. New algorithms were developed for estimating the state, parameter, and order of an HMP, for universal coding and classification of HMPs, and for universal decoding of hidden Markov channels. These and other related topics are reviewed in this paper. Index Terms: Baum–Petrie algorithm, entropy ergodic theorems, finite-state channels, hidden Markov models, identifiability, Kalman filter, maximum-likelihood (ML) estimation, order estimation, recursive parameter estimation, switching autoregressive processes, Ziv inequality.
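As an illustration of the finite-state, finite-alphabet case described in this abstract, the sketch below evaluates the likelihood of an observation sequence with the standard scaled forward recursion and checks it against brute-force enumeration of all state paths. It is a minimal example, not code from the paper: the function names and the parameters `pi`, `A`, and `B` (initial distribution, transition matrix, and channel/emission matrix) are assumptions chosen for illustration.

```python
import math
from itertools import product


def forward_log_likelihood(pi, A, B, obs):
    """Log-likelihood of obs under a finite-state, finite-alphabet HMP.

    pi[i]   : probability of starting in state i (hypothetical name)
    A[i][j] : probability of moving from state i to state j
    B[i][y] : probability that the memoryless channel emits symbol y in state i
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t, y in enumerate(obs):
        if t > 0:
            # Propagate through the Markov chain, then weight by the channel.
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][y]
                     for j in range(n)]
        # Rescale at every step to avoid numerical underflow on long sequences.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def brute_force_likelihood(pi, A, B, obs):
    """Sum the joint probability over every state path (tiny examples only)."""
    n = len(pi)
    total = 0.0
    for states in product(range(n), repeat=len(obs)):
        p = pi[states[0]] * B[states[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= A[states[t - 1]][states[t]] * B[states[t]][obs[t]]
        total += p
    return total
```

The forward recursion costs time proportional to the sequence length times the square of the number of states, whereas the brute-force sum grows exponentially with the sequence length.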
Per-Survivor Processing: A General Approach to MLSE in Uncertain Environments
IEEE Trans. Commun, 1995
Cited by 90 (0 self)
Per-Survivor Processing (PSP) provides a general framework for the approximation of Maximum Likelihood Sequence Estimation (MLSE) algorithms whenever the presence of unknown quantities prevents the precise use of the classical Viterbi algorithm. This principle stems from the idea that data-aided estimation of unknown parameters may be embedded into the structure of the Viterbi algorithm itself. Among the numerous possible applications, we concentrate here on (a) adaptive MLSE, (b) simultaneous Trellis Coded Modulation (TCM) decoding and phase synchronization, (c) adaptive Reduced State Sequence Estimation (RSSE). As a matter of fact, PSP is interpretable as a generalization of decision feedback techniques of RSSE to decoding in the presence of unknown parameters. A number of algorithms for the simultaneous estimation of data sequence and unknown channel parameters are presented and compared with "conventional" techniques based on the use of tentative decisions. Results for uncoded modu...
A Maximum-Likelihood Approach to Stochastic Matching for Robust Speech Recognition
IEEE Transactions on Speech and Audio Processing, 1996
Cited by 86 (14 self)
Ananth Sankar and Chin-Hui Lee, Speech Research Department, AT&T Bell Laboratories, Murray Hill, NJ 07974. Recently there has been much interest in the problem of improving the performance of automatic speech recognition (ASR) systems in adverse environments. When there is a mismatch between the training and testing environments, ASR systems suffer a degradation in performance. The goal of robust speech recognition is to remove the effect of this mismatch so as to bring the recognition performance as close as possible to the matched conditions. In speech recognition, the speech is usually modeled by a set of hidden Markov models (HMM) X. During recognition the observed utterance Y is decoded using these models. Due to the mismatch between training and testing conditions, this often results in a degradation in performance compared to the matched conditions. The mismatch b...
Iterative decoding of compound codes by probability propagation in graphical models
IEEE Journal on Selected Areas in Communications, 1998
Cited by 85 (8 self)
We present a unified graphical model framework for describing compound codes and deriving iterative decoding algorithms. After reviewing a variety of graphical models (Markov random fields, Tanner graphs, and Bayesian networks), we derive a general distributed marginalization algorithm for functions described by factor graphs. From this general algorithm, Pearl’s belief propagation algorithm is easily derived as a special case. We point out that recently developed iterative decoding algorithms for various codes, including “turbo decoding” of parallel-concatenated convolutional codes, may be viewed as probability propagation in a graphical model of the code. We focus on Bayesian network descriptions of codes, which give a natural input/state/output/channel description of a code and channel, and we indicate how iterative decoders can be developed for parallel- and serially-concatenated coding systems, product codes, and low-density parity-check codes.
Optimal and Sub-Optimal Maximum A Posteriori Algorithms Suitable for Turbo Decoding
ETT, 1997
Cited by 83 (16 self)
For estimating the states or outputs of a Markov process, the symbol-by-symbol maximum a posteriori (MAP) algorithm is optimal. However, this algorithm, even in its recursive form, poses technical difficulties because of numerical representation problems, the necessity of non-linear functions and a high number of additions and multiplications. MAP-like algorithms operating in the logarithmic domain presented in the past solve the numerical problem and reduce the computational complexity, but are suboptimal, especially at low SNR (a common example is the Max-Log-MAP because of its use of the max function). A further simplification yields the soft-output Viterbi algorithm (SOVA). In this paper, we present a Log-MAP algorithm that avoids the approximations in the Max-Log-MAP algorithm and hence is equivalent to the true MAP, but without its major disadvantages. We compare the (Log-)MAP, Max-Log-MAP and SOVA from a theoretical point of view to illuminate their commonalities and differences. As a practical example forming the basis for simulations, we consider Turbo decoding, where recursive systematic convolutional component codes are decoded with the three algorithms, and we also demonstrate the practical suitability of the Log-MAP by including quantization effects. The SOVA is, at 10
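The abstract above does not spell out how the max approximation is avoided. One standard way to make a log-domain recursion exact is the Jacobian logarithm, which adds a correction term to the max. The sketch below contrasts the two operators; the function names `max_star` and `max_log` are assumptions for illustration and the code is not taken from the paper.

```python
import math


def max_star(a, b):
    """Exact log(exp(a) + exp(b)) via the Jacobian logarithm.

    The correction term log(1 + exp(-|a - b|)) depends only on |a - b|,
    so in practice it is often read from a small lookup table.
    """
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_log(a, b):
    """Max-Log approximation: drops the correction term entirely."""
    return max(a, b)


def approximation_error(a, b):
    """Gap between the exact operator and the max approximation."""
    return max_star(a, b) - max_log(a, b)
```

The gap is largest, log 2, when the two arguments are equal, and it decays toward zero as they move apart. This is consistent with the abstract's remark that the max-based variant is suboptimal especially at low SNR, where competing path metrics tend to be close.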
Multiresolution markov models for signal and image processing
Proceedings of the IEEE, 2002
Cited by 83 (11 self)
This paper reviews a significant component of the rich field of statistical multiresolution (MR) modeling and processing. These MR methods have found application and permeated the literature of a widely scattered set of disciplines, and one of our principal objectives is to present a single, coherent picture of this framework. A second goal is to describe how this topic fits into the even larger field of MR methods and concepts, in particular making ties to topics such as wavelets and multigrid methods. A third is to provide several alternate viewpoints for this body of work, as the methods and concepts we describe intersect with a number of other fields. The principal focus of our presentation is the class of MR Markov processes defined on pyramidally organized trees. The attractiveness of these models stems from both the very efficient algorithms they admit and their expressive power and broad applicability. We show how a variety of methods and models relate to this framework including models for self-similar and 1/f processes. We also illustrate how these methods have been used in practice. We discuss the construction of MR models on trees and show how questions that arise in this context make contact with wavelets, state space modeling of time series, system and parameter identification, and hidden
Rate-Distortion Optimized Mode Selection for Very Low Bit Rate Video Coding and the Emerging H.263 Standard
1995
Cited by 69 (12 self)
This paper addresses the problem of encoder optimization in a macroblock-based multi-mode video compression system. An efficient solution is proposed in which, for a given image region, the optimum combination of macroblock modes and the associated mode parameters are jointly selected so as to minimize the overall distortion for a given bit-rate budget. Conditions for optimizing the encoder operation are derived within a rate-constrained product code framework using a Lagrangian formulation. The instantaneous rate of the encoder is controlled by a single Lagrange multiplier that makes the method amenable to mobile wireless networks with time-varying capacity. When rate and distortion dependencies are introduced between adjacent blocks (as is the case when the motion vectors are differentially encoded and/or overlapped block motion compensation is employed), the ensuing encoder complexity is surmounted using dynamic programming. Due to the generic nature of the algorithm, it can be succ...
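The Lagrangian formulation mentioned in this abstract can be illustrated in its simplest form, where blocks are independent and each block picks the mode minimizing distortion plus a multiplier times rate. The sketch below is only that simplified case: it omits the dependencies between adjacent blocks that the abstract handles with dynamic programming, and the function name, mode labels, and numbers used with it are hypothetical.

```python
def select_modes(blocks, lam):
    """Pick, for each block, the mode minimizing D + lam * R.

    blocks : list of blocks; each block is a list of
             (mode_name, distortion, rate) candidates (hypothetical data layout)
    lam    : Lagrange multiplier trading distortion against rate
    Returns (chosen mode names, total distortion, total rate).
    """
    chosen = [min(options, key=lambda m: m[1] + lam * m[2]) for options in blocks]
    total_distortion = sum(c[1] for c in chosen)
    total_rate = sum(c[2] for c in chosen)
    return [c[0] for c in chosen], total_distortion, total_rate


# Made-up candidates for two blocks: (mode, distortion, rate in bits).
EXAMPLE_BLOCKS = [
    [("skip", 10.0, 1.0), ("inter", 4.0, 5.0), ("intra", 1.0, 12.0)],
    [("skip", 2.0, 1.0), ("inter", 1.5, 4.0), ("intra", 1.0, 10.0)],
]
```

Calling `select_modes(EXAMPLE_BLOCKS, lam)` with increasing `lam` shifts the choices toward cheaper modes, so the total rate falls while the total distortion rises. A rate budget can then be met by searching over the single multiplier, which matches the abstract's point that one Lagrange multiplier controls the encoder's rate.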
An Introduction to Factor Graphs
IEEE Signal Processing Mag., Jan. 2004
Cited by 67 (23 self)
A large variety of algorithms in coding, signal processing, and artificial intelligence may be viewed as instances of the summary-product algorithm (or belief/probability

