Automatic Language Identification: Performance vs. Complexity
- In Proceedings of the Sixth Annual South Africa Workshop on Pattern Recognition, 1997
Automatic Language Identification is the process of classifying an utterance as belonging to one of a number of previously encountered languages. The field has been very active during the last couple of years and has progressed rapidly. We compare two approaches to the problem. The first approach, which uses hand-labelled speech data, has previously been shown to yield better results, but at the price of the vastly increased effort needed to label a training corpus. The second approach, which requires no human intervention, trades performance for reduced complexity and effort. Experiments are performed using the OGI Telephone Speech Corpus, which contains fine-phonetically labelled utterances in English, German, Japanese, Mandarin, Spanish and Hindi. We present a summary of our research over the last three years, together with insights that we have gained into spoken language processing in general and automatic language identification in particular.
Automatic Language Identification with Sequences of Language-Independent Phoneme Clusters
- 1996
Automatic language identification involves analyzing language-specific features in speech to determine the language of an utterance without regard to topic, speaker or length of speech. Although much progress has been made in recent years, language identification systems have not been built on underlying theory or linguistically meaningful design criteria. This thesis is motivated by the belief that features used to discriminate between languages should be linguistically sound; the result is a unique combination of design, theory and implementation. In this thesis a "word"-spotting algorithm is introduced, motivated by a perceptual study [82] reporting that human subjects use language-dependent phonemes and short sequences to identify languages. In order to find an optimal set of phoneme-like tokens to represent speech in a linguistically meaningful way, a mathematical model of the discrimination between two languages is developed. This model permits the automatic design of a token representation of speech by selecting a list of discriminating "words" in a data-driven manner. The resulting system has the flexibility to automatically take into account the inherent structure of the languages to be discriminated. A second mathematical model is developed to measure the impact of inaccurate automatic alignment of tokens on language discrimination. This model indicates why some algorithms aiming to compensate for these inaccuracies have not been successful. The theoretical models and the "word"-spotting algorithms have been implemented and validated on both generated and real-world speech data. This dissertation makes several significant contributions: the design of a simple and linguistically sound language-identification module; a flexible automatic feature extraction algorithm; a mathematical model to estimate the discriminability of two languages; and a mathematical model to capture the impact of inaccurate alignment on the discriminability of two languages.
Efficient high-order hidden Markov modelling
- In Proceedings of the International Conference on Spoken Language Processing, 1998
Currently, first-order hidden Markov models (HMMs) form the backbone around which most automatic speech processing applications are built. Their higher-order extensions are known to be more powerful, but, due to their complexity and computational demands, they are seldom used. It is the purpose of this work to advance their application. In this work we unify HMMs of all orders by deriving and proving the ORder rEDucing (ORED) algorithm. This algorithm reduces any higher-order HMM (including mixed-order ones) to an equivalent first-order representation. This makes it possible to process any higher-order HMM using known first-order algorithms, thereby
LANGUAGE IDENTIFICATION AND MULTILINGUAL SPEECH RECOGNITION USING DISCRIMINATIVELY TRAINED ACOUSTIC MODELS
We perform language identification experiments for four prominent South African languages using a multilingual speech recognition system. Specifically, we show how successfully Afrikaans, English, Xhosa and Zulu may be identified using a single set of HMMs and a single recognition pass. We further demonstrate the effect of language identification-specific discriminative acoustic model training on both the per-language recognition accuracy and the accuracy of the language identification process. Experiments indicate that discriminative training leads to a small overall improvement in language identification accuracy while not strongly affecting speech recognition performance. Furthermore, language identification is found to be more error-prone, and discriminative training less effective, for code-mixed utterances, indicating that these may require special treatment within a multilingual speech recognition system.

