Experiments of Speech Recognition in a Noisy and Reverberant Environment using a Microphone Array and HMM Adaptation (1996)

by D Giuliani, M Omologo, P Svaizer
Venue: Proc. of ICSLP
Results 1 - 5 of 5

Environmental Conditions and Acoustic Transduction in Hands-Free Speech Recognition

by M. Omologo, P. Svaizer, M. Matassoni - Speech Communication, 1998
Cited by 22 (4 self)
Hands-free interaction represents a key-point for increase of flexibility of present applications and for the development of new speech recognition applications, where the user can not be encumbered by either hand-held or head-mounted microphones. When the microphone is far from the speaker, the transduced signal is affected by degradation of different nature, that is often unpredictable. Special microphones and multimicrophone acquisition systems represent a way of reducing some environmental noise effects. Robust processing and adaptation techniques can be further used in order to compensate for different kinds of variability that may be present in the recognizer input. The purpose of this paper is to re-visit some of the assumptions about the different sources of this variability and to discuss both on special transducer systems and on compensation/adaptation techniques that can be adopted. In particular, the paper will refer to the use of multimicrophone systems to overc...

Training of HMM with Filtered Speech Material for Hands-Free Recognition

by D. Giuliani, M. Matassoni, M. Omologo, P. Svaizer - in Proc. ICASSP, 1999
Cited by 16 (5 self)
This paper addresses the problem of hands-free speech recognition in a noisy office environment. An array of six omnidirectional microphones and a corresponding time delay compensation module are used to provide a beamformed signal as input to a HMM-based recognizer. Training of HMMs is performed either using a clean speech database or using a filtered version of the same database. Filtering consists in a convolution with the acoustic impulse response between speaker and microphone, to reproduce the reverberation effect. Background noise is summed to provide the desired SNR. The paper shows that the new models trained on these data perform better than the baseline ones. Furthermore, the paper investigates on MLLR adaptation of the new models. It is shown that a further performance improvement is obtained, allowing to reach a 98.7% WRR in a connected digit recognition task, when the talker is at 1.5 m distance from the array.
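The data preparation described in this abstract (convolve clean speech with a speaker-to-microphone impulse response, then add background noise at a chosen SNR) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `convolve`, `power`, and `contaminate` are hypothetical, and the sketch assumes the SNR is defined as the ratio of reverberant-speech power to noise power, which the abstract does not state.

```python
import math


def convolve(signal, impulse_response):
    """Full linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def power(samples):
    """Mean squared amplitude of a sample sequence."""
    return sum(v * v for v in samples) / len(samples)


def contaminate(clean, impulse_response, noise, snr_db):
    """Reverberate clean speech, then add noise scaled to the target SNR.

    Assumption: SNR is measured against the reverberant signal, not the
    clean one. The noise must be at least as long as the reverberant signal.
    """
    reverberant = convolve(clean, impulse_response)
    if len(noise) < len(reverberant):
        raise ValueError("noise is shorter than the reverberant signal")
    noise = noise[:len(reverberant)]
    # Choose the gain so that power(reverberant) / power(gain * noise)
    # equals 10 ** (snr_db / 10).
    gain = math.sqrt(power(reverberant) / (power(noise) * 10 ** (snr_db / 10)))
    return [r + gain * n for r, n in zip(reverberant, noise)]
```

In practice the impulse response would be measured or simulated for the room, and the resulting signals would be used as HMM training material in place of the clean database.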

Microphone Array Based Speech Recognition With Different Talker-Array Positions

by Maurizio Omologo, Marco Matassoni, Piergiorgio Svaizer, Diego Giuliani - Proc. of ICASSP, 1997
Cited by 12 (4 self)
The use of a microphone array for hands-free continuous speech recognition in noisy and reverberant environment is investigated. An array of eight omnidirectional microphones was placed at different angles and distances from the talker. A time delay compensation module was used to provide a beamformed signal as input to a Hidden Markov Model (HMM) based recognizer. A phone HMM adaptation, based on a small amount of phonetically rich sentences, further improved the recognition rate obtained by applying only beamforming. These results were confirmed both by experiments conducted in a noisy and reverberant environment and by simulations. In the latter case, different conditions were recreated by using the image method to reproduce synthetic versions of the array microphone signals.
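A time delay compensation stage of the kind mentioned in this abstract is commonly realized as a delay-and-sum beamformer. The sketch below illustrates that general technique only; it is not taken from the paper. It assumes the per-microphone arrival delays are already known and are whole numbers of samples, whereas a real system would have to estimate them. The name `delay_and_sum` is hypothetical.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its arrival delay and average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   integer delay (in samples) with which the talker's signal
              reaches each microphone, relative to a common reference.
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    n = len(channels[0])
    out = [0.0] * n
    for channel, delay in zip(channels, delays):
        for t in range(n):
            src = t + delay  # advance the channel to undo its delay
            if 0 <= src < n:
                out[t] += channel[src]
    # Averaging keeps the aligned speech at its original level, while
    # components that are not aligned across channels tend to cancel.
    return [v / len(channels) for v in out]
```

With correct delays the talker's signal adds coherently across channels, which is why the beamformed output is a better recognizer input than any single distant microphone.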

Perceptually Inspired Signal-processing Strategies for Robust Speech Recognition in Reverberant Environments

by Brian E. D. Kingsbury, 1998
Cited by 12 (0 self)
Natural, hands-free interaction with computers is currently one of the great unfulfilled promises of automatic speech recognition (ASR), in part because ASR systems cannot reliably recognize speech under everyday, reverberant conditions that pose no problems for most human listeners. The specific properties of the auditory representation of speech likely contribute to reliable human speech recognition under such conditions. This dissertation explores the use of perceptually inspired signal-processing strategies -- critical-band-like frequency analysis, an emphasis of slow changes in the spectral structure of the speech signal, adaptation, integration of phonetic information over syllabic durations, and use of multiple signal representations for...

Acoustic Diversity For Improved Speech Recognition In Reverberant Environments

by Bradford W. Gillespie, Les E. Atlas - in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2002
We show that even moderate reverberation has a detrimental effect on the audible quality of speech and automatic speech recognition (ASR) accuracy. In the presence of room reverberation, we assess the performance of several important speech enhancement techniques, and show that little improvement is offered. We experimentally show that multiple microphones are necessary for complete equalization of the speaker-to-receiver impulse response. Furthermore, if complete equalization is not possible, long reverberation time (RT60) is shown to affect ASR accuracy far more negatively than a low signal-to-reverberation ratio (SRR). Using this knowledge we develop an equalizing strategy that improves ASR accuracy by reducing RT60.