Telephone speech recognition using neural networks and hidden Markov models (1999)

by D Yuk, J Flanagan
Venue: In Proc. ICASSP

Results 1 - 5 of 5

Adaptation To Environment And Speaker Using Maximum Likelihood Neural Networks

by Dongsuk Yuk, James Flanagan, Mahesh Krishnamoorthy, Krishna Dayanidhi
Cited by 1 (1 self)
When there is a mismatch between training and testing conditions, statistical speech recognition algorithms suffer from severe degradation in recognition accuracy. The mismatch could be due to the interference from acoustical environments where systems are actually used or from speakers themselves. In this paper, a neural network based transformation approach is studied to handle the data distribution mismatches between training and testing conditions. The conditional probability that comes from hidden Markov model (HMM) based recognizers is used for the objective function of a neural network. It maximizes the likelihood of the data from a testing environment, and allows global optimization of the network when used with HMM-based recognizers. The new objective function can be used to transform speech feature vectors, or the mean vectors and covariance matrices of a recognizer. The proposed algorithm is evaluated on a noisy distant-talking version of the Resource Management database.
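
The objective described in this abstract can be illustrated with a deliberately reduced sketch. It is not the authors' algorithm: the neural network is replaced by a single learnable shift, the HMM by one fixed 1-D Gaussian, and the names `fit_shift`, `log_likelihood`, and the data values are hypothetical. It only shows the core idea of adjusting a feature transformation by gradient ascent so that test-condition features become more likely under a model trained on other conditions.

```python
import math

def log_likelihood(features, shift, mean, var):
    """Total log-likelihood of the shifted features under a fixed 1-D Gaussian."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * var) - (x + shift - mean) ** 2 / (2.0 * var)
        for x in features
    )

def fit_shift(features, mean, var, lr=0.1, steps=200):
    """Gradient ascent on the average log-likelihood with respect to the shift."""
    shift = 0.0
    n = len(features)
    for _ in range(steps):
        # Derivative of the average log-likelihood with respect to the shift.
        grad = sum((mean - (x + shift)) / var for x in features) / n
        shift += lr * grad
    return shift

# Assumed toy data: the recognizer's Gaussian is centred at 0.0, while the
# test environment offsets the features by roughly +2.
test_features = [1.8, 2.1, 2.4, 1.9, 2.3]
shift = fit_shift(test_features, mean=0.0, var=1.0)
before = log_likelihood(test_features, 0.0, mean=0.0, var=1.0)
after = log_likelihood(test_features, shift, mean=0.0, var=1.0)
```

The shift-only transformation is chosen because it cannot collapse all features onto the model mean; a richer transformation, such as the network the abstract describes, would need the HMM state alignment and some care to avoid that degenerate solution.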

A Natural Human-Computer Interface for Controlling Wheeled Robotic Vehicles

by Frans Flippo, 2003
Cited by 1 (0 self)
Robots are used increasingly to execute dangerous tasks and military missions. Autonomous robots are the warriors of the future, executing missions without requiring continuous supervision.

Date

by Bryan Duggan, 2003
Management) is entirely my own work and has not been submitted for assessment for

Neural Network based Regression for Robust Overlapping Speech Recognition using Microphone Arrays

by Weifeng Li, John Dines, Mathew Magimai.-Doss, Herve Bourlard, 2008
Submitted for publication.

MLP-based Log Spectral Energy Mapping for Robust Overlapping Speech Recognition

by Weifeng Li, John Dines, Mathew Magimai.-Doss, Herve Bourlard, 2007
Submitted for publication. This paper investigates a multilayer perceptron (MLP) based acoustic feature mapping to extract robust features for automatic speech recognition (ASR) of overlapping speech. The MLP is trained to learn the mapping from log mel filter bank energies (MFBEs) extracted from the distant microphone recordings, including multiple overlapping speakers, to log MFBEs extracted from the clean speech signal. The outputs of the MLP are then used to generate mel filterbank cepstral coefficient (MFCC) acoustic features, which are subsequently used in acoustic model adaptation and system evaluation. The proposed approach is evaluated through extensive studies on the MONC corpus, which includes both non-overlapping single-speaker and overlapping multi-speaker conditions. We demonstrate that by learning the mapping between log MFBEs extracted from noisy and clean signals, the performance of the ASR system can be significantly improved in the overlapping multi-speaker condition compared with a conventional delay-sum beamforming approach, while keeping the performance of the system on the single non-overlapping speaker condition intact.
IDIAP–RR 07-54
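
The processing chain in this abstract (log MFBEs in, mapped log MFBEs out, then cepstral coefficients) can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's system: the layer sizes, the `tanh` hidden layer, the random untrained weights, and the function names `mlp_forward` and `dct2` are all hypothetical, and the cepstral step is an unnormalized DCT-II, which is the usual way to turn log filter bank energies into cepstra.

```python
import math
import random

def mlp_forward(x, w1, b1, w2, b2):
    """One-hidden-layer MLP: tanh hidden units, linear outputs."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    return [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(w2, b2)]

def dct2(log_energies, num_coeffs):
    """Unnormalized DCT-II of log filter bank energies (cepstral coefficients)."""
    n = len(log_energies)
    return [
        sum(e * math.cos(math.pi * k * (i + 0.5) / n) for i, e in enumerate(log_energies))
        for k in range(num_coeffs)
    ]

# Assumed toy dimensions: 4 mel bands, 3 hidden units, 3 cepstral coefficients.
random.seed(0)
num_bands, num_hidden, num_ceps = 4, 3, 3
w1 = [[random.uniform(-0.5, 0.5) for _ in range(num_bands)] for _ in range(num_hidden)]
b1 = [0.0] * num_hidden
w2 = [[random.uniform(-0.5, 0.5) for _ in range(num_hidden)] for _ in range(num_bands)]
b2 = [0.0] * num_bands

noisy_log_mfbe = [2.0, 1.5, 1.0, 0.5]   # made-up frame from a distant microphone
mapped_log_mfbe = mlp_forward(noisy_log_mfbe, w1, b1, w2, b2)
cepstra = dct2(mapped_log_mfbe, num_ceps)
```

In a trained system the weights would be fitted so that `mapped_log_mfbe` approximates the log MFBEs of the clean signal for the same frame; here they are random, so only the shapes and the data flow are meaningful.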