Results 1-10 of 133
Polynomial Splines and Their Tensor Products in Extended Linear Modeling
 Ann. Statist
, 1997
Cited by 142 (14 self)
ANOVA-type models are considered for a regression function or for the logarithm of a probability function, conditional probability function, density function, conditional density function, hazard function, conditional hazard function, or spectral density function. Polynomial splines are used to model the main effects, and their tensor products are used to model any interaction components that are included. In the special context of survival analysis, the baseline hazard function is modeled and nonproportionality is allowed. In general, the theory involves the L2 rate of convergence for the fitted model and its components. The methodology involves least squares and maximum likelihood estimation, stepwise addition of basis functions using Rao statistics, stepwise deletion using Wald statistics, and model selection using BIC, cross-validation, or an independent test set. Publicly available software, written in C and interfaced to S/S-PLUS, is used to apply this methodology to...
The Use of Classifiers in Sequential Inference
, 2001
Cited by 91 (34 self)
We study the problem of combining the outcomes of several different classifiers in a way that provides a coherent inference that satisfies some constraints. In particular, we develop two general approaches for an important subproblem: identifying phrase structure. The first is a Markovian approach that extends standard HMMs to allow the use of a rich observation structure and of general classifiers to model state-observation dependencies. The second is an extension of constraint satisfaction formalisms. We develop efficient combination algorithms under both models and study them experimentally in the context of shallow parsing.
1 Introduction. In many situations it is necessary to make decisions that depend on the outcomes of several different classifiers in a way that provides a coherent inference that satisfies some constraints: the sequential nature of the data or other domain-specific constraints. Consider, for example, the problem of chunking natural language sentences ...
Support vector machines for speech recognition
 Proceedings of the International Conference on Spoken Language Processing
, 1998
Cited by 76 (2 self)
Statistical techniques based on hidden Markov models (HMMs) with Gaussian emission densities have dominated the signal processing and pattern recognition literature for the past 20 years. However, HMMs trained using maximum likelihood techniques suffer from an inability to learn discriminative information and are prone to overfitting and overparameterization. Recent work in machine learning has focused on models, such as the support vector machine (SVM), that automatically control generalization and parameterization as part of the overall optimization process. In this paper, we show that SVMs provide a significant improvement in performance on a static pattern classification task based on the Deterding vowel data. We also describe an application of SVMs to large vocabulary speech recognition, and demonstrate an improvement in error rate on a continuous alphadigit task (OGI Alphadigits) and a large vocabulary conversational speech task (Switchboard). Issues related to the development and optimization of an SVM/HMM hybrid system are discussed.
A framework for recognizing the simultaneous aspects of American Sign Language
 Computer Vision and Image Understanding
, 2001
Cited by 75 (6 self)
The major challenge that faces American Sign Language (ASL) recognition now is developing methods that will scale well with increasing vocabulary size. Unlike in spoken languages, phonemes can occur simultaneously in ASL. The number of possible combinations of phonemes is approximately 1.5 × 10^9, which cannot be tackled by conventional hidden Markov model-based methods. Gesture recognition, which is less constrained than ASL recognition, suffers from the same problem. In this paper we present a novel framework for ASL recognition that aspires to being a solution to the scalability problems. It is based on breaking down the signs into their phonemes and modeling them with parallel hidden Markov models. These model the simultaneous aspects of ASL independently. Thus, they can be trained independently, and do not require consideration of the different combinations at training time. We show in experiments with a 22-sign vocabulary how to apply this framework in practice. We also show that parallel hidden Markov models outperform conventional hidden Markov models. © 2001 Academic Press. Key Words: sign language recognition; gesture recognition; human motion modeling; hidden Markov models.
Online cursive script recognition using time delay neural networks and hidden Markov models
 In Proceedings of the IEEE International Conference on Acoustics Speech and Signal Processing
, 1994
Parallel hidden Markov models for American Sign Language recognition
 In ICCV
, 1999
Cited by 47 (4 self)
The major challenge that faces American Sign Language (ASL) recognition now is to develop methods that will scale well with increasing vocabulary size. Unlike in spoken languages, phonemes can occur simultaneously in ASL. The number of possible combinations of phonemes after enforcing linguistic constraints is approximately 5.5 × 10^8. Gesture recognition, which is less constrained than ASL recognition, suffers from the same problem. Thus, it is not feasible to train conventional hidden Markov models (HMMs) for large-scale ASL applications. Factorial HMMs and coupled HMMs are two extensions to HMMs that explicitly attempt to model several processes occurring in parallel. Unfortunately, they still require consideration of the combinations at training time. In this paper we present a novel approach to ASL recognition that aspires to being a solution to the scalability problems. It is based on parallel HMMs (PaHMMs), which model the parallel processes independently. Thus, they can also be trained independently, and do not require consideration of the different combinations at training time. We develop the recognition algorithm for PaHMMs and show that it runs in time polynomial in the number of states, and in time linear in the number of parallel processes. We run several experiments with a 22-sign vocabulary and demonstrate that PaHMMs can improve the robustness of HMM-based recognition even on a small scale. Thus, PaHMMs are a very promising general recognition scheme with applications in both gesture and ASL recognition.
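The independence assumption described in this abstract can be illustrated with a minimal sketch. It is not the paper's recognition algorithm: it assumes discrete observations, one small HMM per channel, and that a sign is scored by adding the per-channel log-likelihoods. All function names and the toy parameters are hypothetical.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-domain values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_likelihood(start, trans, emit, observations):
    """Log P(observations) for one discrete HMM, via the forward algorithm.

    start[s]    : initial probability of state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability that state s emits symbol o
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][observations[0]])
             for s in range(n)]
    for obs in observations[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][obs])
            for s in range(n)
        ]
    return log_sum_exp(alpha)

def parallel_log_likelihood(channel_models, channel_observations):
    """Score under independent channels: the sum of per-channel log-likelihoods.

    Assumption of this sketch: each channel has its own HMM and its own
    observation sequence, and the channels do not interact.
    """
    return sum(
        forward_log_likelihood(start, trans, emit, obs)
        for (start, trans, emit), obs in zip(channel_models, channel_observations)
    )

# Toy two-channel example (hypothetical parameters, e.g. one HMM per hand).
model_a = ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.2, 0.8]])
model_b = ([1.0], [[1.0]], [[0.5, 0.5]])
score = parallel_log_likelihood([model_a, model_b], [[0, 1], [0, 1, 1]])
print(score)
```

Because each channel is scored separately, the cost of this sketch grows linearly with the number of channels, and each channel's parameters could be estimated from that channel's data alone.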
Heterogeneous acoustic measurements and multiple classifiers for speech recognition
, 1998
What HMMs can do
, 2002
Cited by 33 (4 self)
Since their inception over thirty years ago, hidden Markov models (HMMs) have become the predominant methodology for automatic speech recognition (ASR) systems; today, most state-of-the-art speech systems are HMM-based. There have been a number of ways to explain HMMs and to list their capabilities, each of these ways having both advantages and disadvantages. In an effort to better understand what HMMs can do, this tutorial analyzes HMMs by exploring a novel way in which an HMM can be defined, namely in terms of random variables and conditional independence assumptions. We prefer this definition as it allows us to reason more thoroughly about the capabilities of HMMs. In particular, it is possible to deduce that there are, in theory at least, no limitations to the class of probability distributions representable by HMMs. This paper concludes that, in the search for a model to supersede the HMM for ASR, rather than trying to correct for HMM limitations in the general case, new models should be found based on their potential for better parsimony, computational requirements, and noise insensitivity.
Noise Adaptive Stream Weighting in Audio-Visual Speech Recognition
 EURASIP J. Appl. Signal Processing
, 2002
Cited by 31 (4 self)
When trying to overcome the significant performance drops of ASR systems in the presence of noise, one road to follow is the integration of the information present in the lip movements of the speaker. Comparisons showed that integration of audio and video data on the decision level yields the best recognition results. This raises the question of how to weight the two modalities in different noise conditions. Throughout this article we develop a weighting process adaptive to various background noise situations. Firstly
Pairwise Neural Network Classifiers with Probabilistic Outputs
 in Advances in Neural Information Processing Systems 7
, 1994
Cited by 29 (0 self)
Multiclass classification problems can be efficiently solved by partitioning the original problem into subproblems involving only two classes: for each pair of classes, a (potentially small) neural network is trained using only the data of these two classes. We show how to combine the outputs of the two-class neural networks in order to obtain posterior probabilities for the class decisions. The resulting probabilistic pairwise classifier is part of a handwriting recognition system which is currently applied to check reading. We present results on real-world databases and show that, from a practical point of view, these results compare favorably to other neural network approaches.
1 Introduction. Generally, a pattern classifier consists of two main parts: a feature extractor and a classification algorithm. Both parts have the same ultimate goal, namely to transform a given input pattern into a representation that is easily interpretable as a class decision. In the case of feedforwar...
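One way to combine two-class outputs into class posteriors can be sketched as follows. The abstract does not state the combination rule, so this is an illustration under an assumption: each pairwise output r[i][j] estimates P(class i | x, class is i or j) = p_i / (p_i + p_j). Under that assumption, summing 1 / r[i][j] over j ≠ i gives (K - 2) + 1 / p_i, which can be solved for p_i. The function name `couple_pairwise` is hypothetical.

```python
def couple_pairwise(r):
    """Recover class posteriors from pairwise two-class outputs.

    r[i][j] (for i != j) is assumed to estimate p_i / (p_i + p_j) and must be
    strictly positive; diagonal entries are ignored. With exact inputs the
    identity 1 / p_i = sum_{j != i} 1 / r[i][j] - (K - 2) holds. With noisy
    network outputs the results need not sum to one and may need renormalizing.
    """
    k = len(r)
    posteriors = []
    for i in range(k):
        inverse_sum = sum(1.0 / r[i][j] for j in range(k) if j != i)
        posteriors.append(1.0 / (inverse_sum - (k - 2)))
    return posteriors

# Build exact pairwise outputs from known posteriors, then recover them.
true_p = [0.5, 0.3, 0.2]
pairwise = [
    [true_p[i] / (true_p[i] + true_p[j]) if i != j else 0.0 for j in range(3)]
    for i in range(3)
]
recovered = couple_pairwise(pairwise)
print(recovered)
```

With K classes this needs K(K - 1)/2 two-class models, each trained only on the data of its two classes, which matches the partitioning described in the abstract.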