Morphology-Based Language Modeling for Arabic Speech Recognition
In Proc. of ICSLP, 2004
Cited by 21 (5 self)
Language modeling is a difficult problem for languages with rich morphology. In this paper we investigate the use of morphology-based language models at different stages in a speech recognition system for conversational Arabic. Class-based and single-stream factored language models using morphological word representations are applied within an N-best list rescoring framework. In addition, we explore the use of factored language models in first-pass recognition, which is facilitated by two novel procedures: the data-driven optimization of a multi-stream language model structure, and the conversion of a factored language model to a standard word-based model. We evaluate these techniques on a large-vocabulary recognition task and demonstrate that they lead to perplexity and word error rate reductions.
Recent innovations in speech-to-text transcription at SRI-ICSI-UW
IEEE Transactions on Audio, Speech & Language Processing, 2006
Cited by 10 (2 self)
We summarize recent progress in automatic speech-to-text ...
Development of a conversational telephone speech recognizer for Levantine Arabic
In Proc. Interspeech, 2005
Cited by 2 (0 self)
Many languages, including Arabic, are characterized by a wide variety of different dialects that often differ strongly from each other. When developing speech technology for dialect-rich languages, the portability and reusability of data, algorithms, and system components become extremely important. In this paper, we describe the development of a large-vocabulary speech recognition system for Levantine Arabic, which was a new dialectal recognition task for our existing system. We discuss the dialect-specific modeling choices (grapheme- vs. phoneme-based acoustic models, automatic vowelization techniques, and morphological language models) and investigate to what extent techniques previously tested on other languages are portable to the present task. We present state-of-the-art ...
GA-FLM User’s Manual
2004
GA-FLM is a genetic algorithms program for automatically learning factored language model structures. It is used as an extension to the factored language model programs in the SRI Language Modeling toolkit. The program takes as input some training/development text files and some ...
Analysis of Dialectal Influence in Pan-Arabic ASR
INTERSPEECH, 2011
In this paper, we analyze the impact of five Arabic dialects on the front-end and pronunciation dictionary components of an Automatic Speech Recognition (ASR) system. We use the ASR system’s phonetic decision tree as a diagnostic tool to compare the robustness of MFCC and MLP front-ends to dialectal variations in the speech data and find that MLP bottleneck features are less robust to such variations. We also perform a rule-based analysis of the pronunciation dictionary, which enables us to identify dialectal words in the vocabulary and automatically generate pronunciations for unseen words. We show that our technique produces pronunciations with an average phone error rate of 9.2%.
Index Terms: automatic speech recognition, dialect analysis, front-end evaluation

