Results 1 - 4 of 4
Pronunciation Modeling for Improved Spelling Correction
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002
Cited by 30 (0 self)
This paper presents a method for incorporating word pronunciation information in a noisy channel model for spelling correction.
Bi-directional Conversion Between Graphemes and Phonemes Using a Joint N-gram Model
2001
Cited by 16 (1 self)
We present in this paper a statistical model for language-independent bi-directional conversion between spelling and pronunciation, based on joint grapheme/phoneme units extracted from automatically aligned data. The model is evaluated on spelling-to-pronunciation and pronunciation-to-spelling conversion on the NetTalk database and the CMU dictionary. We also study the effect of including lexical stress in the pronunciation. Although a direct comparison is difficult to make, our model's performance appears to be as good or better than that of other data-driven approaches that have been applied to the same tasks.
Improving Pronunciation Accuracy of Proper Names with Language Origin Classes
In Proc. of the Seventh ESSLLI Student Session, 2001
Cited by 6 (0 self)
I would like to thank my advisor Alan Black for all his support and dedication; without him this thesis would not have been possible. I also thank Kenji Sagae for the insightful discussions about this thesis and, most importantly, for his patience and support; Guy Lebanon and Christian Monson, LTI colleagues, for the discussion about unsupervised clustering; and Toni Badia for having introduced me to the field of Natural Language Processing and for his support during all these years. This work was supported by a “La Caixa” Fellowship.
Identifying Language Origin of Person Names with N-Grams of Different Units
Acoustics, Speech and Signal Processing, 2006
Cited by 2 (0 self)
Identifying the language origin of a name in English is important for generating its correct pronunciation. In this paper, N-grams of syllable-based letter clusters are proposed for the task. The performance of an N-gram model over a set of frequently used letter clusters (corresponding to syllables) is compared to that of a letter N-gram model in a four-language task: English, German, French, and Portuguese. On average, the letter cluster N-gram, which has a 26% error rate, is slightly better than the letter N-gram, which has a 27.2% error rate. Furthermore, the error distributions of the two N-grams are found to differ considerably. Therefore, AdaBoost is used to combine the results from N-grams of different units. After the combination, the error rate is reduced to 22.5%, a relative error reduction of 17.5%.

