Lexical Modeling Of Non-Native Speech For Automatic Speech Recognition
, 2000
Cited by 24 (1 self)
This paper examines the recognition of non-native speech in JUPITER, a speaker-independent, spontaneous-speech conversational system. Because the non-native speech in this domain is limited and varied, speaker- and accent-specific methods are impractical. We therefore chose to model all of the non-native data with a single model. In particular, this paper describes an attempt to better model non-native lexical patterns. These patterns are incorporated by applying context-independent phonetic confusion rules, whose probabilities are estimated from training data. Using this approach, the word error rate on a non-native test set is reduced from 20.9% to 18.8%.
1. INTRODUCTION
Speech recognition accuracy has been observed to be drastically lower for non-native speakers of the target language than for native speakers [3, 13, 14]. Research on both non-native accent modeling and dialect-specific modeling shows that large gains in performance can be achieved when the acoustics [1, 9, 14] and ...
Statistical Dialect Classification Based On Mean Phonetic Features
- In Proc. International Conference on Spoken Language Processing
, 1996
Cited by 6 (0 self)
Our paper describes work done on a text-dependent method for automatic utterance classification and dialect model selection using mean cepstral and duration features on a per phoneme basis. From transcribed dialect data, we build a linear discriminant to separate the dialects in feature space. This method is potentially much faster than our previous selection algorithm. We have been able to achieve error rates of 8% for distinguishing Northern US speakers from Southern US speakers, and average error rates of 13% on a variety of finer pairwise dialect discriminations. We also present a description of the training and test corpora collected for this work.
A comparison of two unsupervised approaches to accent iden
- Proc. ICSLP
, 1998
Cited by 6 (0 self)
The ability to automatically identify a speaker’s accent would be very useful for a speech recognition system as it would enable the system to use both a pronunciation dictionary and speech models specific to the accent, techniques which have been shown to improve accuracy. Here, we describe some experiments in unsupervised accent classification. Two techniques have been investigated to classify British- and American-accented speech: an acoustic approach, in which we analyse the pattern of usage of the distributions in the recogniser by a speaker to decide on his most probable accent, and a high-level approach in which we use a phonotactic model for classification of the accent. Results show that both techniques give excellent performance on this task which is maintained when testing is done on data from an independent dataset.
Automatic Language Identification with Sequences of Language-Independent Phoneme Clusters
, 1996
Cited by 1 (0 self)
Automatic language identification involves analyzing language-specific features in speech to determine the language of an utterance without regard to topic, speaker or length of speech. Although much progress has been made in recent years, language identification systems have not been built on underlying theory or linguistically meaningful design criteria. This thesis is motivated by the belief that features used to discriminate between languages should be linguistically sound; the result is a unique combination of design, theory and implementation. In this thesis a "word-spotting" algorithm is introduced motivated by a perceptual study [82] reporting that human subjects use language-dependent phonemes and short sequences to identify languages. In order to find an optimal set of phoneme-like tokens to represent speech in a linguistically meaningful way, a mathematical model of the discrimination between two languages is developed. This model permits the automatic design of a token representation of speech by selecting a list of discriminating "words" in a data-driven manner. The resulting system has the flexibility to automatically take into account the inherent structure of the languages to be discriminated. A second mathematical model is developed to measure the impact of inaccurate automatic alignment of tokens on language discrimination. This model indicates why some algorithms aiming to compensate for these inaccuracies have not been successful. The theoretical models and the "word"-spotting algorithms have been implemented and validated on both generated and real-world speech data. This dissertation makes several significant contributions: the design of a simple and linguistically sound language-identification module; a flexible automatic feature extraction algorithm; a mathematical model to estimate the discriminability of two languages; and a mathematical model to capture the impact of inaccurate alignment on the discriminability of two languages.
AUTOMATIC SPEECH RECOGNITION AND INTRINSIC SPEECH VARIATION
This paper briefly reviews the state of the art related to the topic of speech variability sources in automatic speech recognition systems. It focuses on some variations within the speech signal that make the ASR task difficult. The variations detailed in the paper are intrinsic to the speech and affect the different levels of the ASR processing chain. For different sources of speech variation, the paper summarizes the current knowledge and highlights specific feature extraction or modeling weaknesses and current trends.

