Model Selection based on Minimum Description Length
- Journal of Mathematical Psychology
, 1999
Cited by 32 (3 self)
This paper is, of necessity, quite technical. To get a first but much gentler glimpse, we advise reading just the following section (Section 2) and the last section (Section 7), which discusses in what sense we may expect Occam's razor to actually work.
Ambiguity Resolution in Sentence Processing: Evidence against Frequency-Based Accounts
- Journal of Memory and Language
, 2000
Cited by 28 (8 self)
This article addresses the question of how the processor decides on its initial strategy for syntactic ambiguity resolution. At a point of ambiguity, more than one analysis is possible. An effective strategy might be to adopt the analysis that has most frequently turned out to be correct in the past. Assuming that the world stays the same in most respects, the analysis that has most frequently been correct in the past should provide a good estimate of which analysis is most likely to be correct again. Hence, by adopting this analysis, the processor should make fewer errors than if it chose any other analysis
Perceptual Issues in Music Pattern Recognition: Complexity of Rhythm and Key Finding
- Computers and the Humanities
, 2001
Cited by 18 (5 self)
We consider several perceptual issues in the context of machine recognition of music patterns. It is argued that a successful implementation of a music recognition system must incorporate perceptual information and error criteria. We discuss several measures of rhythm complexity which are used for determining relative weights of pitch and rhythm errors. Then, a new method for determining a localized tonal context is proposed. This method is based on empirically derived key distances. The generated key assignments are then used to construct the perceptual pitch error criterion which is based on note relatedness ratings obtained from experiments with human listeners.
Language Evolution by Iterated Learning With Bayesian Agents
, 2007
Cited by 18 (6 self)
Languages are transmitted from person to person and generation to generation via a process of iterated learning: people learn a language from other people who once learned that language themselves. We analyze the consequences of iterated learning for learning algorithms based on the principles of Bayesian inference, assuming that learners compute a posterior distribution over languages by combining a prior (representing their inductive biases) with the evidence provided by linguistic data. We show that when learners sample languages from this posterior distribution, iterated learning converges to a distribution over languages that is determined entirely by the prior. Under these conditions, iterated learning is a form of Gibbs sampling, a widely-used Markov chain Monte Carlo algorithm. The consequences of iterated learning are more complicated when learners choose the language with maximum posterior probability, being affected by both the prior of the learners and the amount of information transmitted between generations. We show that in this case, iterated learning corresponds to another statistical inference algorithm, a variant of the expectation-maximization (EM) algorithm. These results clarify the role of iterated learning in explanations of linguistic universals and provide a formal connection between constraints on language acquisition and the languages that come to be spoken, suggesting that information transmitted via iterated learning will ultimately come to mirror the minds of the learners.
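The convergence claim for sampling learners can be checked on a toy case. The sketch below is illustrative only: the two-language prior, the two-symbol likelihoods, and the function names are hypothetical values chosen for the example, not taken from the paper. It builds the exact transition matrix of the chain over languages (teacher produces data, learner samples from the posterior) and iterates it from a starting point far from the prior.

```python
from itertools import product

def posterior(data, prior, likelihood):
    """Posterior over languages given a tuple of observed symbols."""
    joint = []
    for h, p_h in enumerate(prior):
        p = p_h
        for x in data:
            p *= likelihood[h][x]
        joint.append(p)
    total = sum(joint)
    return [j / total for j in joint]

def transition_matrix(prior, likelihood, n_obs):
    """T[h][h2]: chance that a learner taught by a speaker of h samples h2."""
    n_h, n_sym = len(prior), len(likelihood[0])
    T = [[0.0] * n_h for _ in range(n_h)]
    # Sum over every possible data set of n_obs symbols.
    for data in product(range(n_sym), repeat=n_obs):
        post = posterior(data, prior, likelihood)
        for h in range(n_h):
            p_data = 1.0
            for x in data:
                p_data *= likelihood[h][x]
            for h2 in range(n_h):
                T[h][h2] += p_data * post[h2]
    return T

def iterate(dist, T, generations):
    """Push a distribution over languages through the chain."""
    n = len(dist)
    for _ in range(generations):
        dist = [sum(dist[h] * T[h][h2] for h in range(n)) for h2 in range(n)]
    return dist

# Hypothetical toy setup: two languages, two observable symbols.
PRIOR = [0.8, 0.2]
LIKELIHOOD = [[0.9, 0.1], [0.3, 0.7]]

T = transition_matrix(PRIOR, LIKELIHOOD, n_obs=2)
final = iterate([0.0, 1.0], T, generations=200)
print(final)  # should be close to PRIOR
```

Changing `n_obs` alters how quickly the chain mixes but, for sampling learners, not where it ends up. A learner that picks the maximum-posterior language instead would need a different transition rule and is not covered by this sketch.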
The Exploitation of Regularities in the Environment by the Brain
- Behavioral and Brain Sciences
Cited by 15 (0 self)
Statistical regularities of the environment are important for learning, memory, intelligence,
inductive inference, and in fact for any area of cognitive science where an information-processing
brain promotes survival by exploiting them. This has been recognised by many
of those interested in cognitive function, starting with Helmholtz, Mach and Pearson, and
continuing through Craik, Tolman, Attneave, and Brunswik. In the current era many of us
have begun to show how neural mechanisms exploit the regular statistical properties of
natural images. Shepard proposed that the apparent trajectory of an object when seen
successively at two positions results from internalising the rules of kinematic geometry, and
although kinematic geometry is not statistical in nature, this is clearly a related idea. Here
it is argued that Shepard's term, "internalisation", is insufficient because it is also
necessary to derive an advantage from the process. Having mechanisms selectively sensitive
to the spatio-temporal patterns of excitation commonly experienced when viewing moving
objects would facilitate the detection, interpolation, and extrapolation of such motions, and
might explain the twisting motions that are experienced. Although Shepard's explanation
in terms of Chasles' rule seems doubtful, his theory and experiments illustrate that local
twisting motions are needed for the analysis of moving objects and provoke thoughts about
how they might be detected.
Back-off as Parameter Estimation for DOP models
, 2002
Cited by 15 (1 self)
Data-Oriented Parsing (DOP) is a probabilistic performance approach to parsing natural language. Several DOP models have been proposed since it was introduced by Scha (1990), achieving promising results. One important feature of these models is the probability estimation procedure. Two major estimators have been put forward: Bod (1993) uses a relative frequency estimator; Bonnema (1999) adds a rescaling factor to correct for tree size effects. Both estimators, however, present biases. Moreover, Bod's estimator has been shown to be inconsistent (Johnson, 2002), meaning that the probability estimates hypothesized by the model do not approach the true probabilities that generated the data as the sample size grows. In this thesis, we implement a new estimation procedure that tackles the shortcomings of the two previous methods. The main idea is to treat derivation events not as disjoint, but as interrelated in a hierarchical cascade of parse tree derivations. We show that this new estimator -- called the Back-Off DOP (BO-DOP) estimator -- outperforms both previous models. We tested it on the OVIS treebank, a Dutch language, speech-based system, and report error reductions of up to 11.4% and 15% when compared to, respectively, Bod's and Bonnema's estimators.
The Generalized Universal Law of Generalization
- Journal of Mathematical Psychology
, 2001
Cited by 14 (3 self)
It has been argued by Shepard that there is a robust psychological law that relates the distance between a pair of items in psychological space and the probability that they will be confused with each other. Specifically, the probability of confusion is a negative exponential function of the distance between the pair of items. In experimental contexts, distance is typically defined in terms of a multidimensional Euclidean space, but this assumption seems unlikely to hold for complex stimuli. We show that, nonetheless, the Universal Law of Generalization can be derived in the more complex setting of arbitrary stimuli, using a much more universal measure of distance. This universal distance is defined as the length of the shortest program that transforms the representations of the two items of interest into one another: the algorithmic information distance. It is universal in the sense that it minorizes every computable distance: it is the smallest computable distance. We show ...
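To make the negative-exponential relation concrete, here is a rough sketch. Because shortest-program length cannot be computed in practice, the example substitutes a zlib compression-based distance as a crude stand-in; that substitution, the function names, and the scale parameter `k` are assumptions of this example, not the construction used in the paper.

```python
import math
import zlib

def clen(data: bytes) -> int:
    """Compressed length in bytes, a rough proxy for description length."""
    return len(zlib.compress(data, 9))

def compression_distance(x: bytes, y: bytes) -> float:
    """Compression-based stand-in for the algorithmic information distance."""
    cx, cy = clen(x), clen(y)
    return (clen(x + y) - min(cx, cy)) / max(cx, cy)

def generalization(x: bytes, y: bytes, k: float = 1.0) -> float:
    """Negative exponential of the distance; k is a free scale parameter."""
    return math.exp(-k * compression_distance(x, y))

regular = b"abab" * 50          # highly regular item
irregular = bytes(range(256))   # item with no shared structure
print(generalization(regular, regular), generalization(regular, irregular))
```

Items that share structure compress well together, so their distance is small and the predicted confusability is high; unrelated items sit further apart and generalize less.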
Algorithmic Complexity
- M B
, 1993
Cited by 13 (8 self)
The theory of algorithmic complexity (commonly known as Kolmogorov complexity) or algorithmic information theory is a novel mathematical approach combining the theory of computation with information theory. It is the theory that finally formalizes the elusive notion of the amount of information in individual objects, in contrast to entropy, which is a statistical notion of the average code word length needed to transmit a message from a random source. This powerful new theory has successfully resolved ancient questions about the nature of randomness of individual objects, inductive reasoning and prediction, and has applications in mathematics, computer science, physics, biology, and other sciences, including social and behavioral sciences.
Simplicity versus likelihood in visual perception: from surprisals to precisals
- Psychological Bulletin
, 2000
Cited by 11 (2 self)
The likelihood principle states that the visual system prefers the most likely interpretation of a stimulus, whereas the simplicity principle states that it prefers the simplest interpretation. This study investigates how close these seemingly very different principles are by combining findings from classical, algorithmic, and structural information theory. It is argued that, in visual perception, the two principles are perhaps very different with respect to the viewpoint-independent aspects of perception but probably very close with respect to the viewpoint-dependent aspects which, moreover, seem decisive in everyday perception. This implies that either principle may have guided the evolution of visual systems and that the simplicity paradigm may provide perception models with the necessary quantitative specifications of the often plausible but also intuitive ideas provided by the likelihood paradigm. In visual perception research, an ongoing debate concerns the question of whether the likelihood principle (Von Helmholtz, 1909/1962) or the simplicity principle (Hochberg & McAlister, 1953) provides the best explanation of the human interpretation of visual stimuli. The phenomenon to be explained is, more specifically, that human subjects usually show a clear preference for only
Fast, frugal, and rational: How rational norms explain behavior
- Organizational Behavior and Human Decision Processes
, 2003
Cited by 9 (0 self)
Much research on judgment and decision making has focussed on the adequacy of classical rationality as a description of human reasoning. But more recently it has been argued that classical rationality should also be rejected even as a normative standard for human reasoning. For example, Gigerenzer and Goldstein (1996) and Gigerenzer and Todd (1999a) argue that reasoning involves "fast and frugal" algorithms which are not justified by rational norms, but which succeed in the environment. They provide three lines of argument for this view, based on: (A) the importance of the environment; (B) the existence of cognitive limitations; and (C) the fact that an algorithm with no apparent rational basis, Take-the-Best, succeeds in a judgment task (judging which of two cities is the larger, based on lists of features of each city). We reconsider (A)–(C), arguing that standard patterns of explanation in psychology and the social and biological sciences use rational norms to explain why simple cognitive algorithms can succeed. We also present new computer simulations that compare Take-the-Best with other cognitive models (which use connectionist, exemplar-based, and decision-tree algorithms). Although Take-the-Best still performs well, it does not perform noticeably better than the other models. We conclude that these results provide no strong reason to prefer Take-the-Best over alternative cognitive models.
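For readers unfamiliar with Take-the-Best, a minimal sketch of the decision rule on the city-size task follows. It is a simplification under stated assumptions: cues are binary and fully known, the cue order is supplied by hand rather than estimated from cue validities, and the cue names and city data are hypothetical.

```python
def take_the_best(a, b, cue_order):
    """Pick the larger of two cities using the first cue that discriminates.

    a, b: dicts mapping cue name -> 0 or 1 (assumed fully known here).
    cue_order: cue names sorted from most to least valid (assumed given).
    Returns "a", "b", or None when no cue discriminates (the caller guesses).
    """
    for cue in cue_order:
        if a[cue] != b[cue]:
            # Stop at the first discriminating cue and ignore all later ones.
            return "a" if a[cue] > b[cue] else "b"
    return None

# Hypothetical cues and cities, for illustration only.
CUE_ORDER = ["is_capital", "has_top_league_team", "has_university"]
city_x = {"is_capital": 0, "has_top_league_team": 1, "has_university": 1}
city_y = {"is_capital": 0, "has_top_league_team": 0, "has_university": 1}

choice = take_the_best(city_x, city_y, CUE_ORDER)
print(choice)
```

The rule never weighs or combines cues, which is what makes it frugal; the models it is compared against in the abstract integrate information across cues instead.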

