Results 1 - 10 of 34
Phonology, reading acquisition, and dyslexia: insights from connectionist models
- PSYCHOL. REV., 1999
Cited by 52 (3 self)
The development of reading skill and bases of developmental dyslexia were explored using connectionist models. Four issues were examined: the acquisition of phonological knowledge prior to reading, how this knowledge facilitates learning to read, phonological and nonphonological bases of dyslexia, and effects of literacy on phonological representation. Compared with simple feedforward networks, representing phonological knowledge in an attractor network yielded improved learning and generalization. Phonological and surface forms of developmental dyslexia, which are usually attributed to impairments in distinct lexical and nonlexical processing “routes,” were derived from different types of damage to the network. The results provide a computationally explicit account of many aspects of reading acquisition using connectionist principles.
Knowledge-Free Induction of Morphology Using Latent Semantic Analysis
- 2000
Cited by 48 (3 self)
Morphology induction is a subproblem of important tasks like automatic learning of machine-readable dictionaries and grammar induction. Previous morphology induction approaches have relied solely on statistics of hypothesized stems and affixes to choose which affixes to consider legitimate. Relying on stem-and-affix statistics rather than semantic knowledge leads to a number of problems, such as the inappropriate use of valid affixes ("ally" stemming to "all"). We introduce a semantic-based algorithm for learning morphology which only proposes affixes when the stem and stem-plus-affix are sufficiently similar semantically. We implement our approach using Latent Semantic Analysis and show that our semantics-only approach provides morphology induction results that rival a current state-of-the-art system.
Unsupervised morpheme segmentation and morphology induction from text corpora using Morfessor 1.0
- Helsinki University of Technology, 2005
Cited by 35 (9 self)
In this work, we describe the first public version of the Morfessor software, which is a program that takes as input a corpus of unannotated text and produces a segmentation of the word forms observed in the text. The segmentation obtained often resembles a linguistic morpheme segmentation. Morfessor is not language-dependent. The number of segments per word is not restricted to two or three as in some other existing morphology learning models. The current version of the software essentially implements two morpheme segmentation models presented earlier by us (Creutz and Lagus, 2002; Creutz, 2003). The document contains user’s instructions, as well as the mathematical formulation of the model and a description of the search algorithm used. Additionally, a few experiments on Finnish and English text corpora are reported in order to give the user some ideas of how to apply the program to his own data sets and how to evaluate the results.
Unsupervised Language Acquisition: Theory and Practice
- 2001
Cited by 32 (0 self)
In this thesis I present various algorithms for the unsupervised machine learning of aspects of natural languages using a variety of statistical models. The scientific object of the work is to examine the validity of the so-called Argument from the Poverty of the Stimulus advanced in favour of the proposition that humans have language-specific innate knowledge. I start by examining an a priori argument based on Gold's theorem, that purports to prove that natural languages cannot be learned, and some formal issues related to the choice of statistical grammars rather than symbolic grammars. I present three novel algorithms for learning various parts of natural languages: first, an algorithm for the induction of syntactic categories from unlabelled text using distributional information, that can deal with ambiguous and rare words; secondly, a set of algorithms for learning morphological processes in a variety of languages, including languages such as Arabic with nonconcatenative morphology; thirdly, an algorithm for the unsupervised induction of a context-free grammar from tagged text. I carefully examine the interaction between the various components, and show how these algorithms can form the basis for an empiricist model of language acquisition. I therefore conclude that the Argument from the Poverty of the Stimulus is unsupported by the evidence.
Unsupervised models for morpheme segmentation and morphology learning
- ACM Trans. Speech Lang. Process., 2007
Cited by 32 (6 self)
We present a model family called Morfessor for the unsupervised induction of a simple morphology from raw text data. The model is formulated in a probabilistic maximum a posteriori framework. Morfessor can handle highly inflecting and compounding languages where words can consist of lengthy sequences of morphemes. A lexicon of word segments, called morphs, is induced from the data. The lexicon stores information about both the usage and form of the morphs. Several instances of the model are evaluated quantitatively in a morpheme segmentation task on different-sized sets of Finnish as well as English data. Morfessor is shown to perform very well compared to a widely known benchmark algorithm, in particular on Finnish data.
Serial and Strategic Effects in Reading Aloud
Cited by 20 (3 self)
Coltheart and Rastle (1994) reported that the size of the regularity effect on word naming latency decreases across position of irregularity, implicating a serial process in reading aloud. In response to criticism by Plaut, McClelland, Seidenberg, and Patterson (1996), we replicate these results here using monosyllabic words which have been controlled for consistency at each of five orthographic segments. A successful simulation of these data by the DRC model (Coltheart, Curtis, Atkins, & Haller, 1993) is presented. These findings were used in a second experiment to produce a strategy effect in reading aloud. Subjects named nonword or regular word targets mixed with either first position irregular fillers or third position irregular fillers. Target naming was slowed when first position irregular fillers were present compared with target naming when third position irregular fillers were present. These data suggest that the use of the nonlexical route is not fixed; subjects can slow its ...
Morphological and semantic effects in visual word recognition: A time-course study
- Language and Cognitive Processes, 15, 407-437, 2000
Cited by 17 (5 self)
Some theories of visual word recognition postulate that there is a level of processing or representation at which morphemes are treated differently from whole words. Support for these theories has been derived from priming experiments in which the recognition of a target word is facilitated by the prior presentation of a morphologically related prime (departure-DEPART). In English, such facilitation could be due to morphological relatedness, or to some combination of the orthographic and semantic relatedness characteristic of derivationally related words. We report two sets of visual priming experiments in which the morphological, semantic, and orthographic relationships between primes and targets are varied in three SOA conditions (43 ms, 72 ms, and 230 ms). Results showed that morphological structure plays a significant role in the early visual recognition of English words that is independent of both semantic and orthographic relatedness. Findings are discussed in terms of current approaches to morphological processing. Requests for reprints should be addressed to Kathleen Rastle, Department of Experimental
Lexique 2: A new French lexical database
- Behavior Research Methods, Instruments, & Computers, 2004
Cited by 15 (2 self)
In this paper, we present a new lexical database for French: Lexique. In addition to classical word information such as gender, number, and grammatical category, Lexique also includes a series of interesting new characteristics. First, word frequencies are based on two cues: a contemporary corpus of texts and the number of web pages containing the word. Second, the database is split into several tables: a graphemic table with all the relevant frequencies, a table structured around lemmas, which is particularly interesting for the study of the inflectional family, and a table about surface frequency cues. Third, Lexique is distributed under a GNU-like license allowing people to contribute to it. Finally, a meta search engine called Open Lexique has been developed so that new databases can be added very easily to the existing ones. Lexique can either be downloaded or interrogated freely from
Morphological influences on the recognition of monosyllabic monomorphemic words
- Journal of Memory and Language, 2006
Cited by 9 (3 self)
Balota, Cortese, Sergent-Marschall, Spieler, and Yap (2004) have cautioned researchers in the field about the drawbacks of factorial designs where variables are manipulated in a noncontinuous manner and effects are assessed in terms of the presence or absence of a significant effect. They have eloquently demonstrated for us the power of regression analyses based on hundreds or even thousands of data points and the potential
Acoustical Features As Predictors For Prominence In Read Aloud Dutch Sentences Used In ANN's
- In Proc. Eurospeech '99, Budapest, 1999
Cited by 8 (0 self)
In this paper we present several acoustical features, which are used as predictors for prominence. A set of 1244 sentences from 273 different speakers is selected from the Dutch Polyphone Corpus. Via listening experiments the subjective prominence markers are obtained. Several acoustical features concerning F0, energy and duration are derived and used as predictors for prominence. The sentences are divided into a test and a training set, to test and train neural networks with different topologies and different input features. The first results show that a classification of prominent and nonprominent words is possible with 82.1% correct for an independent test set.
1. INTRODUCTION
Knowing the relevant features for perceived prominence can be useful in several speech technology applications. For example in speech synthesis, where words can be realized with an accent-lending pitch movement, it should be the next step to introduce different degrees of prominence. Knowing more about the r...

