TreeTalk: Memory-based word phonemisation (2001)
| Venue: | In Data-Driven Techniques in Speech Synthesis, Kluwer |
| Citations: | 2 - 0 self |
BibTeX
@INPROCEEDINGS{Daelemans01treetalk:memory-based,
author = {Walter Daelemans and Antal van den Bosch},
title = {TreeTalk: Memory-based word phonemisation},
booktitle = {Data-Driven Techniques in Speech Synthesis},
publisher = {Kluwer},
year = {2001},
pages = {149--172}
}
Abstract
We propose a memory-based (similarity-based) approach to learning the mapping of words into phonetic representations for use in speech synthesis systems. The main advantage of memory-based data mining techniques is their high accuracy; the main disadvantage is processing speed. We introduce a hybrid between memory-based and decision-tree-based learning (TRIBL) which optimises the trade-off between efficiency and accuracy. TRIBL was used in TREETALK, a methodology for fast engineering of word-to-phonetics conversion systems. We also show that for English, a single TRIBL classifier trained on predicting phonetic transcription and word stress at the same time performs better than a 'modular' approach in which different classifiers corresponding to linguistically relevant representations such as morphological and syllable structure are separately trained and integrated.
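
The hybrid idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' TRIBL implementation: it assumes the features are already ordered by relevance, uses an exact-match lookup on the first `n_tree` features as a stand-in for the decision-tree part, and falls back to a global majority label where a real tree would use node-level defaults. The function names and the toy letter-window data (focus letter, right neighbour, left neighbour) are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def train(instances, n_tree):
    """Index instances by their first n_tree feature values.

    Assumption: each feature tuple is already ordered from most to least
    relevant; this sketch does not compute that ordering itself.
    """
    buckets = defaultdict(list)
    for features, label in instances:
        buckets[features[:n_tree]].append((features[n_tree:], label))
    # Simplification: one global majority label serves as the fallback.
    default = Counter(label for _, label in instances).most_common(1)[0][0]
    return dict(buckets), default


def overlap_distance(a, b):
    """Number of positions at which two feature tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(model, features, n_tree):
    """Tree-like exact match on the first n_tree features, then nearest
    neighbours (overlap metric) on the remaining ones."""
    buckets, default = model
    candidates = buckets.get(features[:n_tree])
    if not candidates:
        return default
    rest = features[n_tree:]
    best = min(overlap_distance(stored, rest) for stored, _ in candidates)
    votes = Counter(
        label for stored, label in candidates
        if overlap_distance(stored, rest) == best
    )
    return votes.most_common(1)[0][0]


# Invented toy data: (focus letter, right neighbour, left neighbour) -> phoneme.
TOY = [
    (("c", "a", "_"), "k"),
    (("c", "o", "_"), "k"),
    (("c", "e", "a"), "s"),
    (("c", "i", "a"), "s"),
    (("k", "i", "_"), "k"),
]

model = train(TOY, n_tree=1)
print(classify(model, ("c", "e", "i"), n_tree=1))
print(classify(model, ("x", "a", "_"), n_tree=1))
```

Raising `n_tree` moves more of the work into the fast exact-match lookup and leaves fewer features for the slower similarity search, which is the efficiency/accuracy trade-off the abstract refers to.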







