Results 1 to 10 of 151
Polynomial Splines and Their Tensor Products in Extended Linear Modeling
Ann. Statist., 1997
Cited by 217 (16 self)
Abstract: ANOVA-type models are considered for a regression function or for the logarithm of a probability function, conditional probability function, density function, conditional density function, hazard function, conditional hazard function, or spectral density function. Polynomial splines are used to model the main effects, and their tensor products are used to model any interaction components that are included. In the special context of survival analysis, the baseline hazard function is modeled and nonproportionality is allowed. In general, the theory involves the L2 rate of convergence for the fitted model and its components. The methodology involves least squares and maximum likelihood estimation, stepwise addition of basis functions using Rao statistics, stepwise deletion using Wald statistics, and model selection using BIC, cross-validation, or an independent test set. Publicly available software, written in C and interfaced to S/SPLUS, is used to apply this methodology to...
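To make the basis construction in this abstract concrete, the sketch below builds a spline basis for each coordinate and forms their tensor products to model a two-way interaction. The truncated-power basis family and the knot placement are illustrative assumptions, not the paper's methodology (which covers polynomial splines generally and ships as C code interfaced to S/SPLUS):

```python
import numpy as np

def spline_basis(x, knots):
    # Truncated-power linear spline basis: [1, x, (x - k)_+ for each knot].
    cols = [np.ones_like(x), x]
    cols += [np.clip(x - k, 0.0, None) for k in knots]
    return np.stack(cols, axis=1)

def tensor_product_basis(bx, by):
    # All pairwise products of the two marginal bases; these columns
    # span the interaction component of the ANOVA-type model.
    return np.einsum("ni,nj->nij", bx, by).reshape(len(bx), -1)

x = np.linspace(0.0, 1.0, 50)
y = np.linspace(0.0, 1.0, 50)
Bx = spline_basis(x, knots=[0.5])   # 50 x 3 main-effect basis in x
By = spline_basis(y, knots=[0.5])   # 50 x 3 main-effect basis in y
Bxy = tensor_product_basis(Bx, By)  # 50 x 9 interaction basis
```

The main effects are fit with the marginal bases, and the columns of `Bxy` are what the stepwise Rao/Wald machinery would add or delete for the interaction.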
Support vector machines for speech recognition
Proceedings of the International Conference on Spoken Language Processing, 1998
Cited by 114 (2 self)
Abstract: Statistical techniques based on hidden Markov models (HMMs) with Gaussian emission densities have dominated the signal processing and pattern recognition literature for the past 20 years. However, HMMs trained using maximum likelihood techniques suffer from an inability to learn discriminative information and are prone to overfitting and overparameterization. Recent work in machine learning has focused on models, such as the support vector machine (SVM), that automatically control generalization and parameterization as part of the overall optimization process. In this paper, we show that SVMs provide a significant improvement in performance on a static pattern classification task based on the Deterding vowel data. We also describe an application of SVMs to large vocabulary speech recognition, and demonstrate an improvement in error rate on a continuous alphadigit task (OGI Alphadigits) and a large vocabulary conversational speech task (Switchboard). Issues related to the development and optimization of an SVM/HMM hybrid system are discussed.
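The margin-maximizing optimization the abstract credits for controlling generalization can be illustrated with a minimal linear SVM trained by the Pegasos stochastic subgradient method on toy data. This is a generic sketch, not the paper's SVM/HMM hybrid; the regularization constant, toy points, and bias-feature trick are all assumptions made for the example:

```python
import numpy as np

def pegasos_svm(X, y, lam=0.01, epochs=200, seed=0):
    # Minimizes (lam/2)||w||^2 + average hinge loss over labels y in {-1, +1}
    # via stochastic subgradient steps with rate 1/(lam * t).
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * (X[i] @ w)
            w *= 1.0 - eta * lam          # shrink (regularizer subgradient)
            if margin < 1.0:              # hinge is active: push toward margin 1
                w += eta * y[i] * X[i]
    return w

# Toy linearly separable data; the last column is a constant bias feature.
X = np.array([[1.0, 2.0, 1.0], [2.0, 3.0, 1.0],
              [-1.0, -2.0, 1.0], [-2.0, -1.0, 1.0]])
y = np.array([1, 1, -1, -1])
w = pegasos_svm(X, y)  # separates the toy points with a large margin
```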
The Use of Classifiers in Sequential Inference
2001
Cited by 110 (40 self)
Abstract: We study the problem of combining the outcomes of several different classifiers in a way that provides a coherent inference that satisfies some constraints. In particular, we develop two general approaches for an important subproblem: identifying phrase structure. The first is a Markovian approach that extends standard HMMs to allow the use of a rich observation structure and of general classifiers to model state-observation dependencies. The second is an extension of constraint satisfaction formalisms. We develop efficient combination algorithms under both models and study them experimentally in the context of shallow parsing.
1 Introduction
In many situations it is necessary to make decisions that depend on the outcomes of several different classifiers in a way that provides a coherent inference that satisfies some constraints, such as the sequential nature of the data or other domain-specific constraints. Consider, for example, the problem of chunking natural language sentences...
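The core computational idea, decoding a globally coherent label sequence from per-token classifier scores under hard transition constraints, can be sketched with a constrained Viterbi pass. The boolean transition mask and toy scores below are illustrative assumptions, not the paper's specific Markovian or constraint-satisfaction formulations:

```python
import numpy as np

def constrained_viterbi(scores, allowed):
    # scores[t, s]: log-score of state s at token t from a local classifier.
    # allowed[s, s2]: boolean mask; False forbids the transition s -> s2
    # (e.g. a phrase-internal tag may not follow an outside tag).
    T, S = scores.shape
    trans = np.where(allowed, 0.0, -np.inf)
    back = np.zeros((T, S), dtype=int)
    delta = scores[0].copy()
    for t in range(1, T):
        cand = delta[:, None] + trans       # best score ending in each state
        back[t] = cand.argmax(axis=0)
        delta = cand.max(axis=0) + scores[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):           # follow back-pointers
        path.append(int(back[t, path[-1]]))
    return path[::-1]

scores = np.array([[0.0, 2.0], [1.0, 0.0]])
allowed = np.array([[True, True], [False, True]])  # forbid 1 -> 0
path = constrained_viterbi(scores, allowed)
```

Unconstrained, the greedy choice would be [1, 0]; the mask rules that path out, and the decoder returns the best sequence that satisfies the constraint.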
A framework for recognizing the simultaneous aspects of American Sign Language
Computer Vision and Image Understanding, 2001
Cited by 99 (8 self)
Abstract: The major challenge that faces American Sign Language (ASL) recognition now is developing methods that will scale well with increasing vocabulary size. Unlike in spoken languages, phonemes can occur simultaneously in ASL. The number of possible combinations of phonemes is approximately 1.5 × 10^9, which cannot be tackled by conventional hidden Markov model-based methods. Gesture recognition, which is less constrained than ASL recognition, suffers from the same problem. In this paper we present a novel framework for ASL recognition that aspires to be a solution to the scalability problems. It is based on breaking down the signs into their phonemes and modeling them with parallel hidden Markov models. These model the simultaneous aspects of ASL independently. Thus, they can be trained independently, and do not require consideration of the different combinations at training time. We show in experiments with a 22-sign vocabulary how to apply this framework in practice. We also show that parallel hidden Markov models outperform conventional hidden Markov models. © 2001 Academic Press. Key Words: sign language recognition; gesture recognition; human motion modeling; hidden Markov models.
Emotional speech recognition: resources, features, and methods
Speech Communication, 2006
Cited by 80 (2 self)
Abstract: In this paper we overview emotional speech recognition with three goals in mind. The first goal is to provide an up-to-date record of the available emotional speech data collections. The number of emotional states, the language, the number of speakers, and the kind of speech are briefly addressed. The second goal is to present the acoustic features most frequently used for emotional speech recognition and to assess how emotion affects them. Typical features are the pitch, the formants, the vocal-tract cross-section areas, the mel-frequency cepstral coefficients, the Teager energy operator-based features, the intensity of the speech signal, and the speech rate. The third goal is to review appropriate techniques for classifying speech into emotional states. We examine separately classification techniques that exploit timing information from those that ignore it. Classification techniques based...
Online cursive script recognition using Time Delay Neural Networks and Hidden Markov Models
In International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1994
Parallel hidden Markov models for American Sign Language recognition
In ICCV, 1999
Cited by 63 (4 self)
Abstract: The major challenge that faces American Sign Language (ASL) recognition now is to develop methods that will scale well with increasing vocabulary size. Unlike in spoken languages, phonemes can occur simultaneously in ASL. The number of possible combinations of phonemes after enforcing linguistic constraints is approximately 5.5 × 10^8. Gesture recognition, which is less constrained than ASL recognition, suffers from the same problem. Thus, it is not feasible to train conventional hidden Markov models (HMMs) for large-scale ASL applications. Factorial HMMs and coupled HMMs are two extensions to HMMs that explicitly attempt to model several processes occurring in parallel. Unfortunately, they still require consideration of the combinations at training time. In this paper we present a novel approach to ASL recognition that aspires to be a solution to the scalability problems. It is based on parallel HMMs (PaHMMs), which model the parallel processes independently. Thus, they can also be trained independently, and do not require consideration of the different combinations at training time. We develop the recognition algorithm for PaHMMs and show that it runs in time polynomial in the number of states, and in time linear in the number of parallel processes. We run several experiments with a 22-sign vocabulary and demonstrate that PaHMMs can improve the robustness of HMM-based recognition even on a small scale. Thus, PaHMMs are a very promising general recognition scheme with applications in both gesture and ASL recognition.
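The independence property the abstract relies on can be sketched directly: each parallel channel is scored with an ordinary log-space forward pass, and because the channels are modeled as independent, the joint PaHMM log-likelihood is just the sum of the per-channel scores, which is what makes the cost linear in the number of parallel processes. The parameter shapes below are toy assumptions, not the paper's ASL models:

```python
import numpy as np

def log_forward(log_A, log_pi, log_obs):
    # Log-space forward pass for one HMM channel.
    # log_A[i, j] = log P(state j | state i); log_obs[t, s] = log P(x_t | s).
    alpha = log_pi + log_obs[0]
    for t in range(1, len(log_obs)):
        alpha = log_obs[t] + np.logaddexp.reduce(alpha[:, None] + log_A, axis=0)
    return np.logaddexp.reduce(alpha)  # log P(whole observation sequence)

def pahmm_log_likelihood(channels):
    # channels: iterable of (log_A, log_pi, log_obs) tuples, one per
    # parallel process. Independence makes the joint score a plain sum.
    return sum(log_forward(A, pi, obs) for A, pi, obs in channels)
```

With conventional HMMs the state space would be the product over channels; here each channel is trained and scored on its own.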
Heterogeneous acoustic measurements and multiple classifiers for speech recognition
1998
What HMMs can do
2002
Cited by 47 (5 self)
Abstract: Since their inception over thirty years ago, hidden Markov models (HMMs) have become the predominant methodology for automatic speech recognition (ASR) systems — today, most state-of-the-art speech systems are HMM-based. There have been a number of ways to explain HMMs and to list their capabilities, each having both advantages and disadvantages. In an effort to better understand what HMMs can do, this tutorial analyzes HMMs by exploring a novel way in which an HMM can be defined, namely in terms of random variables and conditional independence assumptions. We prefer this definition as it allows us to reason more thoroughly about the capabilities of HMMs. In particular, it is possible to deduce that there are, in theory at least, no limitations to the class of probability distributions representable by HMMs. This paper concludes that, in search of a model to supersede the HMM for ASR, rather than trying to correct for HMM limitations in the general case, new models should be sought based on their potential for better parsimony, computational requirements, and noise insensitivity.
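The definition via random variables and conditional independence that the tutorial favors can be made concrete with a generative sketch: the hidden chain is first-order Markov, and each observation depends on the rest of the process only through the current state. The toy parameters below are assumptions for illustration only:

```python
import numpy as np

def sample_hmm(pi, A, B, T, rng):
    # The two conditional-independence assumptions that define the HMM:
    #   1. Q_t depends on Q_{1..t-1} only through Q_{t-1} (Markov chain).
    #   2. X_t depends on all other variables only through Q_t (emission).
    states, obs = [], []
    q = rng.choice(len(pi), p=pi)
    for _ in range(T):
        states.append(int(q))
        obs.append(int(rng.choice(B.shape[1], p=B[q])))
        q = rng.choice(A.shape[1], p=A[q])
    return states, obs

# Deterministic toy chain: starts in state 0, alternates states,
# and each emission simply copies the hidden state.
pi = np.array([1.0, 0.0])
A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.eye(2)
states, obs = sample_hmm(pi, A, B, 4, np.random.default_rng(0))
```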
Noise Adaptive Stream Weighting in Audio-Visual Speech Recognition
EURASIP J. Appl. Signal Processing, 2002
Cited by 46 (9 self)
Abstract: When trying to overcome the significant performance drops of ASR systems in the presence of noise, one road to follow is the integration of the information present in the lip movements of the speaker. Comparisons showed that integrating audio and video data at the decision level yields the best recognition results. This raises the question of how to weight the two modalities in different noise conditions. Throughout this article we develop a weighting process adaptive to various background noise situations. Firstly...
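Decision-level fusion of this kind is commonly sketched as a weighted sum of the per-stream log-likelihoods, with the audio weight driven by an estimate of the noise level. The linear SNR-to-weight map and its breakpoints below are illustrative assumptions, not the article's estimator:

```python
import numpy as np

def fuse_scores(audio_ll, video_ll, snr_db, snr_lo=-5.0, snr_hi=25.0):
    # Decision-level fusion: lam * audio + (1 - lam) * video in the
    # log-likelihood domain. lam rises with estimated SNR, so clean
    # audio dominates and noisy audio shifts weight to the lips.
    lam = np.clip((snr_db - snr_lo) / (snr_hi - snr_lo), 0.0, 1.0)
    return lam * audio_ll + (1.0 - lam) * video_ll
```

For example, at 30 dB the fused score equals the audio score alone, while at -10 dB it falls back entirely on the visual stream.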