Improving PinYin to Chinese Conversion With a Whole Sentence . . .
Abstract
We address the problem of statistical language modeling in the context of PinYin to Chinese (PTC) conversion, a problem similar to speech recognition but without an acoustic recognition step. Input phonetic syllables are first segmented and converted into a word lattice, which is then scored within a Source-Channel framework to find the most probable Chinese sentence. In particular, we discuss the use of a Whole Sentence Maximum Entropy (WSME) model, an expressive framework for constructing language models with diverse features. Experiments showed that a WSME model trained with d2-ngrams and word triggers achieved a 20% reduction in perplexity and an 11.05% reduction in character conversion error over a baseline trigram model.
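
For orientation, the two modeling ideas named in the abstract can be written in their usual textbook form. This is a sketch, not the paper's own notation: the symbols $S$ (input PinYin syllable sequence), $W$ (candidate Chinese word sequence), $P_0$ (a baseline model such as a trigram), $f_i$ (sentence-level features), $\lambda_i$ (feature weights), and $Z$ (normalization constant) are all assumed here for illustration.

```latex
% Source-Channel decoding: choose the sentence that best explains the input syllables
\hat{W} \;=\; \arg\max_{W} P(W \mid S) \;=\; \arg\max_{W} P(S \mid W)\, P(W)

% Whole Sentence Maximum Entropy language model in its general form
P(W) \;=\; \frac{1}{Z}\, P_0(W)\, \exp\!\Big( \sum_{i} \lambda_i f_i(W) \Big)
```

Under this reading, the word lattice supplies the candidate sentences $W$ and the WSME model supplies $P(W)$. Because the features $f_i$ are defined on the whole sentence, they can in principle encode patterns such as the d2-ngrams and word triggers mentioned above.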

