Adaptation To Environment And Speaker Using Maximum Likelihood Neural Networks
Cited by 1 (1 self)
When there is a mismatch between training and testing conditions, statistical speech recognition algorithms suffer from severe degradation in recognition accuracy. The mismatch could be due to the interference from acoustical environments where systems are actually used or from speakers themselves. In this paper, a neural network based transformation approach is studied to handle the data distribution mismatches between training and testing conditions. The conditional probability that comes from hidden Markov model (HMM) based recognizers is used for the objective function of a neural network. It maximizes the likelihood of the data from a testing environment, and allows global optimization of the network when used with HMM-based recognizers. The new objective function can be used to transform speech feature vectors, or the mean vectors and covariance matrices of a recognizer. The proposed algorithm is evaluated on a noisy distant-talking version of the Resource Management database.
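The idea of training a feature transformation by maximizing recognizer likelihood can be illustrated with a deliberately small sketch. It is not the paper's network: a single diagonal Gaussian stands in for the HMM, the transformation is a bias-only shift y = x + b instead of a neural network, and the names `fit_bias_ml`, `mu`, and `var` are hypothetical. The data are synthetic.

```python
import numpy as np

def log_likelihood(y, mu, var):
    """Total log-likelihood of frames y (N, D) under a diagonal Gaussian."""
    return float(np.sum(-0.5 * (np.log(2.0 * np.pi * var) + (y - mu) ** 2 / var)))

def fit_bias_ml(x, mu, var, lr=0.1, steps=500):
    """Gradient ascent on the mean log-likelihood of y = x + b with respect to b."""
    b = np.zeros(x.shape[1])
    for _ in range(steps):
        y = x + b
        # Derivative of the mean log-likelihood with respect to the shift b.
        grad = np.mean((mu - y) / var, axis=0)
        b += lr * grad
    return b

# Synthetic "training condition" model and mismatched "testing" data (assumed values).
rng = np.random.default_rng(0)
mu = np.array([0.0, 1.0])
var = np.array([1.0, 0.5])
clean = mu + np.sqrt(var) * rng.standard_normal((200, 2))
noisy = clean + np.array([2.0, -1.0])  # constant offset plays the role of the mismatch

b = fit_bias_ml(noisy, mu, var)
```

For this toy case the maximum likelihood shift has a closed form, `mu - noisy.mean(axis=0)`, which the gradient ascent should approach. A pure shift has unit Jacobian, so no determinant term is needed here; a more general transformation would require one.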
A Natural Human-Computer Interface for Controlling Wheeled Robotic Vehicles
, 2003
Cited by 1 (0 self)
Robots are used increasingly to execute dangerous tasks and military missions. Autonomous robots are the warriors of the future, executing missions without requiring continuous supervision.
Date
, 2003
Management) is entirely my own work and has not been submitted for assessment for
Neural Network based Regression for Robust Overlapping Speech Recognition using Microphone Arrays
, 2008
"... submitted for publication ..."
MLP-based Log Spectral Energy Mapping for Robust Overlapping Speech Recognition
, 2007
Submitted for publication. This paper investigates a multilayer perceptron (MLP) based acoustic feature mapping to extract robust features for automatic speech recognition (ASR) of overlapping speech. The MLP is trained to learn the mapping from log mel filter bank energies (MFBEs) extracted from the distant microphone recordings, including multiple overlapping speakers, to log MFBEs extracted from the clean speech signal. The outputs of the MLP are then used to generate mel-frequency cepstral coefficient (MFCC) acoustic features, which are subsequently used in acoustic model adaptation and system evaluation. The proposed approach is evaluated through extensive studies on the MONC corpus, which includes both non-overlapping single-speaker and overlapping multi-speaker conditions. We demonstrate that by learning the mapping between log MFBEs extracted from noisy and clean signals, the performance of the ASR system can be significantly improved in the overlapping multi-speaker condition compared to a conventional delay-sum beamforming approach, while keeping the performance of the system on the single non-overlapping speaker condition intact. (IDIAP–RR 07-54)
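The step from log MFBEs to cepstral features can be sketched as follows, assuming the common choice of an orthonormal DCT-II. The abstract does not state the number of filter bank bands or cepstral coefficients, so the values 24 and 13 below are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

def dct2_matrix(n_ceps, n_bands):
    """Orthonormal DCT-II basis, shape (n_ceps, n_bands)."""
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_bands)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_bands))
    basis *= np.sqrt(2.0 / n_bands)
    basis[0] *= np.sqrt(0.5)  # scale the constant row so all rows have unit norm
    return basis

def log_mfbe_to_mfcc(log_mfbe, n_ceps=13):
    """Map log mel filter bank energies (frames, bands) to cepstral coefficients."""
    return log_mfbe @ dct2_matrix(n_ceps, log_mfbe.shape[1]).T

# Stand-in for MLP outputs: 3 frames of 24 constant log energies (assumed sizes).
frames = np.full((3, 24), 2.0)
mfcc = log_mfbe_to_mfcc(frames)
```

A flat log spectrum carries no spectral shape, so only the zeroth coefficient is non-zero for these frames; real MLP outputs would populate the higher coefficients as well.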

