Linear Gaussian Models for Speech Recognition (2004)

by A-V I Rosti

Results 1 - 6 of 6

Uncertainty decoding for noise robust speech recognition

by Hank Liao - in Proc. Interspeech, 2004
Abstract - Cited by 26 (8 self)
This dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration. It has not been submitted in whole or in part for a degree at any other university. Some of the work has been published previously in conference proceedings

Switching Linear Dynamical Systems for Noise Robust Speech Recognition

by David Barber - IEEE Trans. Audio, Speech and Language Processing, 2007
Abstract - Cited by 11 (5 self)
to appear in

Augmented Statistical Models for Classifying Sequence Data

by Martin Layton, 2006
Abstract - Cited by 7 (0 self)
Declaration: This dissertation is the result of my own work and includes nothing that is the outcome of work done in collaboration. It has not been submitted in whole or in part for a degree at any other university. Some of the work has been published previously in conference proceedings [66,69], two journal articles [36,68], two workshop papers [35,67] and a technical report [65]. The length of this thesis including appendices, bibliography, footnotes, tables and equations is approximately 60,000 words. This thesis contains 27 figures and 20 tables.

Modeling musical sounds with an interpolating state model

by Anssi Klapuri, Tuomas Virtanen, Marko Helén - In Proceedings of European Signal Processing Conference, 2005
Abstract - Cited by 1 (1 self)
A computationally efficient algorithm is proposed for modeling and coding the time-varying spectra of musical sounds. The aim is to encode individual data sets and not the statistical properties of the sounds. A given sequence of acoustic feature vectors is modeled by finding a set of “states” (anchor points in the feature space) such that the input data can be efficiently represented by interpolating between them. The achieved modeling accuracy for a database of musical sounds was approximately two times better than that of a conventional “vector quantization” model where the input data was k-means clustered and the input data vectors were then replaced by their corresponding cluster centroids. The computational complexity of the proposed algorithm as a function of the input sequence length T is O(T log T).

Development and Exploration of a Timbre Space Representation of Audio

by Craig Andrew Nicol, 2005
Abstract - Cited by 1 (0 self)
Sound is an important part of the human experience and provides valuable information about the world around us. Auditory human-computer interfaces do not have the same richness of expression and variety as audio in the world, and it has been said that this is primarily due to a lack of reasonable design tools for audio interfaces. There are a number of good guidelines for audio design and a strong psychoacoustic understanding of how sounds are interpreted. There are also a number of sound manipulation techniques developed for computer music. This research takes these ideas as the basis for an audio interface design system. A proof-of-concept of this system has been developed in order to explore the design possibilities allowed by the new system. The core of this novel audio design system is the timbre space. This provides a multi-dimensional representation of a sound. Each sound is represented as a path in the timbre space and this path can be manipulated geometrically. Several timbre spaces are compared to determine which amongst them is the best one for audio interface design. The various transformations available in the timbre space are discussed and the perceptual relevance of two novel transformations is explored by encoding “urgency” as a design parameter. This research demonstrates that the timbre space is a viable option for audio interface design and provides novel features that are not found in current audio design systems. A number of problems with the approach and some suggested solutions are discussed. The timbre space opens up new possibilities for audio designers to explore combinations of sounds and sound design based on perceptual cues rather than synthesiser parameters.

Generative factor analyzed HMM for automatic speech recognition

by Kaisheng Yao, Kuldip K. Paliwal, Te-Won Lee, 2005
Abstract
We present a generative factor analyzed hidden Markov model (GFA-HMM) for automatic speech recognition. In a standard HMM, observation vectors are represented by mixture of Gaussians (MoG) that are dependent on discrete-valued hidden state sequence. The GFA-HMM introduces a hierarchy of continuous-valued latent representation of observation vectors, where latent vectors in one level are acoustic-unit dependent and latent vectors in a higher level are acoustic-unit independent.