Speaker Independent Continuous Speech Recognition Using An Acoustic-Phonetic Italian Corpus
in Proc. of ICSLP, 1994
Cited by 40 (28 self)
The objective of this paper is to describe the activity that is being carried out at IRST laboratories for the development of an HMM-based speaker independent continuous speech recognition system for the Italian language. The recognition system is trained and tested using the acoustic-phonetic continuous speech portion of the APASCI corpus. Acoustic modeling is based on the use of Continuous Density HMMs with Gaussian mixture observation densities. As a baseline, a set of 38 Context Independent Units was evaluated using different numbers of mixture components. Then, two other classes of Context Dependent Unit sets were considered, which provide different performance and system complexity. Performance, expressed in terms of Phone loop recognition accuracy and Word loop recognition accuracy, shows an improvement over the baseline for both of these classes of unit sets. I. INTRODUCTION A baseline of a speaker independent continuous speech recognition system for the Italian ...
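The "Gaussian mixture observation densities" mentioned in this abstract can be illustrated with a minimal sketch of how one HMM state might score a single feature frame. The diagonal covariances, the function name `gmm_log_likelihood`, and its parameter layout are assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np

def gmm_log_likelihood(x, weights, means, variances):
    """Log-density of one frame under a Gaussian mixture.

    Assumes diagonal covariances (an illustrative choice, not taken
    from the abstract). Shapes: x (D,), weights (M,), means (M, D),
    variances (M, D).
    """
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    diff = x - means  # broadcast to (M, D)
    # Per-component log normalizer and log exponent of a diagonal Gaussian.
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
    log_expo = -0.5 * np.sum(diff * diff / variances, axis=1)
    log_comp = np.log(weights) + log_norm + log_expo

    # Log-sum-exp over components for numerical stability.
    m = np.max(log_comp)
    return float(m + np.log(np.sum(np.exp(log_comp - m))))
```

Increasing the number of mixture components, as the baseline experiments do, only adds rows to `weights`, `means`, and `variances`; the scoring rule itself stays the same.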
Genones: Generalized Mixture Tying in Continuous Hidden Markov Model-Based Speech Recognizers
IEEE Transactions on Speech and Audio Processing, 1996
Cited by 36 (7 self)
An algorithm is proposed that achieves a good trade-off between modeling resolution and robustness by using a new, general scheme for tying of mixture components in continuous mixture-density hidden Markov model (HMM)-based speech recognizers. The sets of HMM states that share the same mixture components are determined automatically using agglomerative clustering techniques. Experimental results on ARPA's Wall-Street Journal corpus show that this scheme reduces errors by 25% over typical tied-mixture systems. New fast algorithms for computing Gaussian likelihoods--the most time-consuming aspect of continuous-density HMM systems--are also presented. These new algorithms significantly reduce the number of Gaussian densities that are evaluated with little or no impact on speech recognition accuracy. Corresponding Author: Vassilios Digalakis Address: Electronic and Computer Engineering Department Technical University of Crete, Kounoupidiana Chania, 73100 GREECE Phone: +30-821...
A Comparative Study Of Linear Feature Transformation Techniques For Automatic Speech Recognition
in Proc. Int. Conf. on Spoken Language Processing, 1996
Cited by 10 (1 self)
Although Linear Discriminant Analysis (LDA) is widely used, there are still open questions concerning which of its properties account for its success in many speech recognition systems. In order to gain more insight into the nature of the transformation we compare LDA with mel-cepstral feature vectors with respect to the following criteria: decorrelation and ordering property, invariance under linear transforms, automatic learning of dynamical features, and data dependence of the transformation.
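The "decorrelation and ordering property" named in this abstract can be made concrete with a small sketch of textbook LDA. This is not the paper's implementation: the function names `scatter_matrices` and `lda_transform` are hypothetical, and the sketch assumes the within-class scatter matrix is non-singular.

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class (Sw) and between-class (Sb) scatter matrices."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        dm = (mc - mean_all)[:, None]
        Sb += len(Xc) * (dm @ dm.T)
    return Sw, Sb

def lda_transform(X, y):
    """Return (W, eigenvalues) with columns of W sorted by discriminative power.

    Assumes Sw is positive definite (illustrative assumption).
    """
    Sw, Sb = scatter_matrices(X, y)
    # Step 1: whiten the within-class scatter, Sw = U diag(s) U^T.
    s, U = np.linalg.eigh(Sw)
    whiten = U / np.sqrt(s)  # equals U @ diag(s ** -0.5)
    # Step 2: diagonalize the between-class scatter in the whitened space.
    evals, V = np.linalg.eigh(whiten.T @ Sb @ whiten)
    order = np.argsort(evals)[::-1]  # largest class separation first
    return whiten @ V[:, order], evals[order]
```

After projection with `W`, the within-class scatter becomes the identity (decorrelation) and the between-class scatter becomes diagonal with non-increasing entries (ordering), so truncating to the leading columns keeps the most discriminative directions.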
A NN/HMM Hybrid For Continuous Speech Recognition With A Discriminant Nonlinear Feature Extraction
Proc. ICASSP-98, 1998
Cited by 6 (1 self)
This paper deals with a hybrid NN/HMM architecture for continuous speech recognition. We present a novel approach to set up a neural linear or nonlinear feature transformation that is used as a preprocessor on top of the HMM system's RBF-network to produce discriminative feature vectors that are well suited for being modeled by mixtures of Gaussian distributions. In order to avoid the computational cost of discriminative training of a context-dependent system, we propose to train a discriminant neural feature transformation on a system of low complexity and reuse this transformation in the context-dependent system to output improved feature vectors. The resulting hybrid system is an extension of a state-of-the-art continuous HMM system, and in fact, it is the first hybrid system that really is capable of outperforming these standard systems with respect to the recognition accuracy, without the need for discriminative training of the entire system. In experiments carried out on the Reso...
Selective Training For Hidden Markov Models With Applications To Speech Classification
1997
Cited by 6 (0 self)
Traditional maximum likelihood estimation of hidden Markov model parameters aims at maximizing the overall probability across the training tokens of a given speech unit. Therefore, it disregards any interaction and biases across the models in the training procedure. Often the resulting model parameters do not result in minimum error classification in the training set. A new selective training method is proposed which controls the influence of outliers in the training data on the generated models. The resulting models are shown to possess feature statistics which are more clearly separated for confusable patterns. The proposed selective training procedure is used for hidden Markov model training, with application to foreign accent classification, language identification, and speech recognition using the E-set alphabet. The resulting error rates are measurably improved over traditional Forward-Backward training under open test conditions. The proposed method is similar in terms of its go...
Application of Discriminant Analysis to Speech Recognition with Auditory Features
1995
Linear discriminant analysis (LDA) has been applied to auditory features to reduce the feature dimension. Results indicate that it may be necessary to reduce the feature dimension of auditory features, for use with continuous output hidden Markov models (HMM). Bartlett's statistic has been used to obtain an upper bound on the final feature dimension. 1 Introduction The problem of robust speech recognition under variable environmental conditions such as noise and acoustic reverberations must be adequately addressed before such systems find widespread use in real world applications. Recently there has been an increased interest in using auditory model-based front-ends for speech recognition systems. Several researchers have experimentally indicated the robustness of auditory model-based processing of speech signals. However, despite encouraging results from small scale experiments, auditory model-based signal processing is not used in state-of-the-art speech recognition systems. Some...

