Parallel Networks that Learn to Pronounce English Text
- COMPLEX SYSTEMS
, 1987
Cited by 413 (5 self)
This paper describes NETtalk, a class of massively-parallel network systems that learn to convert English text to speech. The memory representations for pronunciations are learned by practice and are shared among many processing units. The performance of NETtalk has some similarities with observed human performance. (i) The learning follows a power law. (ii) The more words the network learns, the better it is at generalizing and correctly pronouncing new words. (iii) The performance of the network degrades very slowly as connections in the network are damaged: no single link or processing unit is essential. (iv) Relearning after damage is much faster than learning during the original training. (v) Distributed or spaced practice is more effective for long-term retention than massed practice. Network models can be constructed that have the same performance and learning characteristics on a particular task, but differ completely at the levels of synaptic strengths and single-unit responses. However, hierarchical clustering techniques applied to NETtalk reveal that these different networks have similar internal representations of letter-to-sound correspondences within groups of processing units. This suggests that invariant internal representations may be found in assemblies of neurons intermediate in size between highly localized and completely distributed representations.
Algorithms for Grapheme-Phoneme Translation for English and French: Applications
- COMPUTATIONAL LINGUISTICS
, 1997
Cited by 34 (0 self)
Letter-to-sound rules, also known as grapheme-to-phoneme rules, are important computational tools and have been used for a variety of purposes including word or name lookups for database searches and speech synthesis. These rules are especially useful when integrated into database searches on names and addresses, since they can complement orthographic search algorithms that make use of permutation, deletion, and insertion by allowing for a comparison with the phonetic equivalent. In databases, phonetics can help retrieve a word or a proper name without the user needing to know the correct spelling. A phonetic index is built with the vocabulary of the application. This could be an entire dictionary, or a list of proper names. The searched word is then converted into phonetics and retrieved with its information, if the word is in the phonetic index. This phonetic lookup can be used to retrieve a misspelled word in a dictionary or a database, or in a text editor to suggest corrections. Such rules are also necessary to formalize grapheme-phoneme correspondences in speech synthesis architecture. In text-to-speech systems, these rules are typically used to create phonemes
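The phonetic lookup described in this abstract can be illustrated with a minimal sketch. The rewrite rules in `TOY_RULES` and the function names are hypothetical and chosen only for illustration; they are not the letter-to-sound rules developed in the paper.

```python
from collections import defaultdict

# Hypothetical toy rewrite rules, applied in order.
# These are illustrative assumptions, NOT the paper's rule set.
TOY_RULES = [("ph", "f"), ("ck", "k"), ("c", "k"), ("z", "s")]


def to_phonetic_key(word):
    """Convert a spelling into a crude phonetic key using the toy rules."""
    key = word.lower()
    for grapheme, phoneme in TOY_RULES:
        key = key.replace(grapheme, phoneme)
    # Collapse doubled letters so "ll" and "l" map to the same key.
    collapsed = []
    for ch in key:
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    return "".join(collapsed)


def build_phonetic_index(vocabulary):
    """Map each phonetic key to the vocabulary entries that share it."""
    index = defaultdict(list)
    for word in vocabulary:
        index[to_phonetic_key(word)].append(word)
    return index


def lookup(index, query):
    """Return vocabulary entries whose phonetic key matches the query's key."""
    return index.get(to_phonetic_key(query), [])


index = build_phonetic_index(["Philip", "Catherine", "Jackson"])
matches = lookup(index, "Fillip")  # a misspelling that shares a key with "Philip"
```

The index is built once from the application vocabulary, and each query is converted with the same function before the lookup, so a misspelled query succeeds only when it reduces to the same key as a stored entry.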
A Multi-Strategy Approach to Improving Pronunciation by Analogy
Cited by 25 (3 self)
Pronunciation by analogy (PbA) is a data-driven method for relating letters to sound, with potential application to next-generation text-to-speech systems. This paper extends previous work on PbA in several directions. First, we have included 'full' pattern matching between input letter string and dictionary entries, as well as including lexical stress in letter-to-phoneme conversion. Second, we have extended the method to phoneme-to-letter conversion. Third, and most important, we have experimented with multiple, different strategies for scoring the candidate pronunciations. Individual scores for each strategy are obtained on the basis of rank and either multiplied or summed to produce a final, overall score. Five strategies have been studied and results obtained from all 31 possible combinations. The two combination methods perform comparably, with the product rule only very marginally superior to the sum rule. Nonparametric statistical analysis reveals that performance improves as more strategies are included in the combination: this trend is very highly significant ( p 0 0005). Accordingly, for letter-to-phoneme conversion, best results are obtained when all five strategies are combined: word accuracy is raised to 65.5% relative to 61.7% for our best previous result and 63.0% for the best-performing single strategy. These improvements are very highly significant ( p 0 and p 0 00011 respectively). Similar results were found for phoneme-to-letter and letter-to-stress conversion, although the former was an easier problem for PbA than letter-to-phoneme conversion and the latter was harder. The main sources of error for the multi-strategy approach are very similar to those for the best single strategy, and mostly involve vowel letters and phonemes.
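The rank-based combination described in this abstract can be sketched as follows. The abstract does not give the exact rank-to-points mapping, so the scheme in `rank_points` (points equal to the number of candidates scoring no higher) is an assumption, as are the strategy names and raw scores in the example.

```python
from itertools import combinations
from math import prod


def rank_points(raw_scores):
    """Turn raw strategy scores into rank-based points.

    Assumed scheme: a candidate's points equal the number of candidates
    whose raw score is less than or equal to its own, so the best of N
    candidates gets N points and the worst gets 1.
    """
    values = list(raw_scores.values())
    return {cand: sum(1 for v in values if v <= s)
            for cand, s in raw_scores.items()}


def combine(strategy_scores, chosen, rule="product"):
    """Combine the rank points of the chosen strategies by product or sum."""
    merge = prod if rule == "product" else sum
    per_strategy = [rank_points(strategy_scores[name]) for name in chosen]
    candidates = per_strategy[0].keys()
    return {cand: merge(points[cand] for points in per_strategy)
            for cand in candidates}


def all_combinations(strategy_names):
    """Enumerate every non-empty subset of the strategies."""
    return [subset
            for size in range(1, len(strategy_names) + 1)
            for subset in combinations(strategy_names, size)]


# Hypothetical raw scores for three candidate pronunciations under two strategies.
scores = {
    "s1": {"a": 0.9, "b": 0.5, "c": 0.1},
    "s2": {"a": 7, "b": 2, "c": 1},
}
by_product = combine(scores, ["s1", "s2"], rule="product")
by_sum = combine(scores, ["s1", "s2"], rule="sum")
subsets = all_combinations(["s1", "s2", "s3", "s4", "s5"])
```

Five strategies yield 2^5 - 1 = 31 non-empty subsets, which matches the 31 combinations reported in the abstract.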
Nonword Pronunciation and Models of Word Recognition
, 1994
Cited by 11 (8 self)
Nonword pronunciation is a form of generalization behavior that has been at the center of debates about models of word recognition, the role of rules in explaining behavior, and the adequacy of the parallel distributed processing approach. An experiment yielded data concerning the pronunciation of a large corpus of nonwords. The data were then used to assess 2 models of naming: a model developed by D. C. Plaut and J. L. McClelland (1993), which is similar to the one described by M. S. Seidenberg and J. L. McClelland (1989) but uses improved orthographic and phonological representations, and the grapheme-phoneme correspondence rules of M. Coltheart, B. Curtis, P. Atkins, and M. Haller's (1993) dual-route model. Both models generate plausible nonword pronunciations and match subjects' responses accurately. The dual-route model does so by using rules that generate correct output for most words but mispronounce a significant number of exceptions. The parallel distributed processing model does so by finding a set of weights that allow it to generate correct output for both "rule-governed" items and exceptions. Some ways in which the two approaches differ and other issues facing them are also discussed.
Age of Acquisition Effects in Word Reading and Other Tasks
, 2002
Cited by 10 (0 self)
this article were implemented using software developed by Michael Harm, whom we also thank
Aligning Letters And Phonemes For Speech Synthesis
Cited by 8 (0 self)
A common requirement in speech technology is to align two different symbolic representations of the same linguistic 'message'. For instance, we often need to align letters of words listed in a dictionary with the corresponding phonemes specifying their pronunciation. As dictionaries become ever bigger, manual alignment becomes less and less tenable, yet automatic alignment is a hard problem for a language like English. In this paper, we describe use of a form of the expectation-maximization (EM) algorithm to achieve automatic alignment of English text and phonemes. The quality of alignment is assessed by the performance of a pronunciation by analogy system using the aligned dictionary data. We find excellent performance (the best so far reported in the literature of letter-phoneme conversion) independent of the start point for alignment, indicating that the EM search space is strongly convex.
Phonological priming in the lexical decision task: Regularity effects are not necessary evidence for assembly
- Journal of Experimental Psychology: Human Perception & Performance
, 1997
Cited by 2 (1 self)
The contribution of assembled phonology in reading English was examined in the lexical decision task by comparing two markers: regularity effects and phonological priming. Strategic control was assessed by manipulating the phonological lexicality of the foils: Experiment 1 used legal nonwords, whereas Experiment 2 used pseudohomophones. Replicating existing findings, null regularity effects were obtained in the presence of legal nonwords. Modest regularity effects, in accuracy only, were observed with pseudohomophone foils. In contrast, phonological priming effects emerged in each of the experiments, regardless of the presence of regularity effects. Assembled phonology thus constrains reading under conditions that strongly discourage its use. However, regularity effects are not necessary evidence for its presence. The dissociation of regularity and phonological priming effects is discussed in terms of the two-cycles model. Dual-route models of visual word recognition (e.g., Baron
How predictable is spelling? Developing and
this paper, we refer to sound-spelling contingency as the number of times a grapheme occurs with a phoneme divided by the total number of times the phoneme occurs; we refer to sound-spelling consistency as the number of times an orthographic body occurs with a rime divided by the total number of times the rime occurs. Note that this contingency score is identical to the conditional probability scores (consistency ratios) used in previous database studies (Berndt, Reggia, & Mitchum, 1987; Ziegler & Ferrand, 1998; Ziegler et al., 1997)
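The contingency definition given in this excerpt (co-occurrences of a grapheme with a phoneme, divided by the total occurrences of the phoneme) can be computed as in the sketch below. The function name and the aligned pairs in the example are invented for illustration and are not data from the paper.

```python
from collections import Counter


def sound_spelling_contingency(pairs):
    """Compute contingency for each (grapheme, phoneme) pair.

    contingency(g, p) = count(g occurs with p) / count(p occurs)
    """
    pair_counts = Counter(pairs)
    phoneme_counts = Counter(phoneme for _, phoneme in pairs)
    return {(g, p): n / phoneme_counts[p]
            for (g, p), n in pair_counts.items()}


# Hypothetical aligned (grapheme, phoneme) observations.
observations = [("f", "/f/"), ("ph", "/f/"), ("f", "/f/"), ("k", "/k/")]
contingency = sound_spelling_contingency(observations)
```

Under the same definition, consistency would be obtained by passing (orthographic body, rime) pairs instead of (grapheme, phoneme) pairs.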
The grapho-phonological system of written French: Statistical
- In ACL’99: The 37th [207
, 1999
The processes through which readers evoke mental representations of phonological forms from print constitute a hotly debated and controversial issue in current psycholinguistics. In this paper we present a computational analysis of the grapho-phonological system of written French, and an empirical validation of some of the obtained descriptive statistics. The results provide direct evidence demonstrating that both grapheme frequency and grapheme entropy influence performance on pseudoword naming. We discuss the implications of those findings for current models of phonological coding in visual word recognition.
PREREADING: A DEVELOPMENTAL PERSPECTIVE
, 1981
"... of Education under Contract No. HEW-NIE-C-400-76-0116 and was prepared ..."

