Results 1–10 of 110
Modeling Human Performance in Statistical Word Segmentation
Abstract

Cited by 47 (16 self)
What mechanisms support the ability of human infants, adults, and other primates to identify words from fluent speech using distributional regularities? In order to better characterize this ability, we collected data from adults in an artificial language segmentation task similar to Saffran, Newport, and Aslin (1996) in which the length of sentences was systematically varied between groups of participants. We then compared the fit of a variety of computational models, including simple statistical models of transitional probability and mutual information, a clustering model based on mutual information by Swingley (2005), PARSER (Perruchet & Vinter, 1998), and a Bayesian model. We found that while all models were able to successfully complete the task, fit to the human data varied considerably, with the Bayesian model achieving the highest correlation with our results.
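The simplest of the models compared here, pairwise transitional probability, is easy to sketch. The following is an illustrative implementation, not the authors' code; the three-word mini-lexicon and the 0.8 boundary threshold are invented for the example.

```python
from collections import Counter

def transitional_probs(syllables):
    """TP(y | x) = count(x, y) / count(x, *) over a syllable stream."""
    bigrams = Counter(zip(syllables, syllables[1:]))
    firsts = Counter(syllables[:-1])
    return {(x, y): c / firsts[x] for (x, y), c in bigrams.items()}

def segment(syllables, tps, threshold):
    """Posit a word boundary wherever TP drops below the threshold."""
    words, word = [], [syllables[0]]
    for x, y in zip(syllables, syllables[1:]):
        if tps[(x, y)] < threshold:
            words.append("".join(word))
            word = []
        word.append(y)
    words.append("".join(word))
    return words

# Hypothetical mini-language: three two-syllable words in varied order,
# so within-word TPs are 1.0 and between-word TPs are at most 2/3.
lexicon = {"A": ["tu", "pi"], "B": ["go", "la"], "C": ["bi", "da"]}
order = ["A", "B", "C", "A", "C", "B", "A", "B", "C"]
stream = [s for w in order for s in lexicon[w]]
print(segment(stream, transitional_probs(stream), threshold=0.8))
# → ['tupi', 'gola', 'bida', 'tupi', 'bida', 'gola', 'tupi', 'gola', 'bida']
```

Low transitional probability between word-final and word-initial syllables is exactly the boundary cue the Saffran-style experiments manipulate.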
Bayesian learning of a tree substitution grammar
 In Proceedings of the 47th Annual Meeting of the Association for Computational Linguistics (ACL-09), Suntec
, 2009
Abstract

Cited by 46 (10 self)
Tree substitution grammars (TSGs) offer many advantages over context-free grammars (CFGs), but are hard to learn. Past approaches have resorted to heuristics. In this paper, we learn a TSG using Gibbs sampling with a nonparametric prior to control subtree size. The learned grammars perform significantly better than heuristically extracted ones on parsing accuracy.
A tutorial on Bayesian nonparametric models
 Journal of Mathematical Psychology
Abstract

Cited by 39 (8 self)
A key problem in statistical modeling is model selection: how to choose a model at an appropriate level of complexity. This problem appears in many settings, most prominently in choosing the number of clusters in mixture models or the number of factors in factor analysis. In this tutorial we describe Bayesian nonparametric methods, a class of methods that sidesteps this issue by allowing the data to determine the complexity of the model. This tutorial is a high-level introduction to Bayesian nonparametric methods and contains several examples of their application.
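A minimal sketch of the idea, using the Chinese restaurant process that underlies Dirichlet process mixtures: the number of clusters is not fixed in advance but grows with the data. The concentration value and random seed below are arbitrary illustration choices.

```python
import random

def crp_partition(n, alpha, rng):
    """Chinese restaurant process: customer i sits at an existing table
    with probability proportional to its occupancy, or at a new table
    with probability proportional to the concentration alpha."""
    tables = []       # tables[t] = number of customers at table t
    assignments = []
    for _ in range(n):
        t = rng.choices(range(len(tables) + 1), weights=tables + [alpha])[0]
        if t == len(tables):
            tables.append(0)  # open a new table: the model grows with the data
        tables[t] += 1
        assignments.append(t)
    return assignments, tables

rng = random.Random(0)
assignments, tables = crp_partition(200, alpha=1.0, rng=rng)
print(f"{len(tables)} clusters induced for 200 observations")
```

The number of occupied tables, i.e. the effective model complexity, is a random quantity determined by the data size and alpha rather than a hyperparameter chosen up front.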
Latent variable models of selectional preference
 In ACL 2010
, 2010
Abstract

Cited by 33 (1 self)
This paper describes the application of so-called topic models to selectional preference induction. Three models related to Latent Dirichlet Allocation, a proven method for modelling document-word co-occurrences, are presented and evaluated on datasets of human plausibility judgements. Compared to previously proposed techniques, these models perform very competitively, especially for infrequent predicate-argument combinations, where they exceed the quality of Web-scale predictions while using relatively little data.
A Probabilistic Model of Syntactic and Semantic Acquisition from Child-Directed Utterances and their Meanings
Abstract

Cited by 28 (6 self)
This paper presents an incremental probabilistic learner that models the acquisition of syntax and semantics from a corpus of child-directed utterances paired with possible representations of their meanings. These meaning representations approximate the contextual input available to the child; they do not specify the meanings of individual words or syntactic derivations. The learner then has to infer the meanings and syntactic properties of the words in the input along with a parsing model. We use the CCG grammatical framework and train a nonparametric Bayesian model of parse structure with online variational Bayesian expectation maximization. When tested on utterances from the CHILDES corpus, our learner outperforms a state-of-the-art semantic parser. In addition, it models such aspects of child acquisition as “fast mapping,” while also countering previous criticisms of statistical syntactic learners.
A tutorial introduction to Bayesian models of cognitive development
Abstract

Cited by 17 (1 self)
We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the what, the how, and the why of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for developmentalists. We emphasize a qualitative understanding of Bayesian inference, but also include information about additional resources for those interested in the cognitive science applications, mathematical foundations, or machine learning details in more depth. In addition, we discuss some important interpretation issues that often arise when evaluating Bayesian models in cognitive science.
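The computation such tutorials build intuition for, Bayes' rule, can be shown on a toy hypothesis grid: inferring a coin's bias from an invented observation of 7 heads in 10 flips, with a uniform prior over eleven candidate biases.

```python
from math import comb

# Hypothesis grid: candidate coin biases from 0.0 to 1.0 in steps of 0.1.
thetas = [i / 10 for i in range(11)]
prior = [1 / len(thetas)] * len(thetas)          # uniform prior
heads, flips = 7, 10                             # made-up observation

# Binomial likelihood of the data under each hypothesis.
likelihood = [comb(flips, heads) * t ** heads * (1 - t) ** (flips - heads)
              for t in thetas]

# Bayes' rule: posterior is proportional to prior times likelihood.
unnorm = [p * l for p, l in zip(prior, likelihood)]
posterior = [u / sum(unnorm) for u in unnorm]

best = max(range(len(thetas)), key=lambda i: posterior[i])
print(f"posterior mode: theta = {thetas[best]}")
# → posterior mode: theta = 0.7
```

With a uniform prior the posterior simply reweights the hypotheses by how well each predicts the data; a non-uniform prior would shift the mode, which is the qualitative point such tutorials emphasize.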
Type-Based MCMC
Abstract

Cited by 16 (0 self)
Most existing algorithms for learning latent-variable models, such as EM and existing Gibbs samplers, are token-based, meaning that they update the variables associated with one sentence at a time. The incremental nature of these methods makes them susceptible to local optima and slow mixing. In this paper, we introduce a type-based sampler, which updates a block of variables, identified by a type, that spans multiple sentences. We show improvements on part-of-speech induction, word segmentation, and learning tree-substitution grammars.
Online learning mechanisms for Bayesian models of word segmentation
 Research on Language and Computation, 8(2), 107–132 (special issue on computational models of language acquisition)
, 2011
Abstract

Cited by 15 (7 self)
In recent years, Bayesian models have become increasingly popular as a way of understanding human cognition. Ideal learner Bayesian models assume that cognition can be usefully understood as optimal behavior under uncertainty, a hypothesis that has been supported by a number of modeling studies across various domains (e.g., Griffiths and Tenenbaum, Cognitive Psychology, 51, 354–384, 2005; Xu and Tenenbaum, Psychological Review, 114, 245–272, 2007). The models in these studies aim to explain why humans behave as they do given the task and data they encounter, but typically avoid some questions addressed by more traditional psychological models, such as how the observed behavior is produced given constraints on memory and processing. Here, we use the task of word segmentation as a case study for investigating these questions within a Bayesian framework. We consider some limitations of the infant learner, and develop several online learning algorithms that take these limitations into account. Each algorithm can be viewed as a different method of approximating the same ideal learner. When tested on corpora of English child-directed speech, we find that the constrained learner’s behavior depends nontrivially on how the learner’s limitations are implemented. Interestingly, sometimes biases that are helpful to an ideal learner hinder a constrained learner, and in a few cases constrained learners perform as well as or better than the ideal learner. This suggests that the transition from a computational-level solution for acquisition to an algorithmic-level one is not straightforward.
Synergies in learning words and their referents
Abstract

Cited by 12 (8 self)
This paper presents Bayesian nonparametric models that simultaneously learn to segment words from phoneme strings and learn the referents of some of those words, and shows that there is a synergistic interaction in the acquisition of these two kinds of linguistic information. The models themselves are novel kinds of Adaptor Grammars that extend an embedding of topic models into PCFGs. These models simultaneously segment phoneme sequences into words and learn the relationship between non-linguistic objects and the words that refer to them. We show (i) that modelling inter-word dependencies improves the accuracy not only of word segmentation but also of word-object relationships, and (ii) that a model that simultaneously learns word-object relationships and word segmentation segments more accurately than one that just learns word segmentation on its own. We argue that these results support an interactive view of language acquisition that can take advantage of synergies such as these.
Inference in Hidden Markov Models with Explicit State Duration Distributions
Abstract

Cited by 10 (1 self)
Explicit-state-duration hidden Markov models (EDHMMs) are HMMs whose latent states consist of both a discrete state-indicator and a discrete state-duration random variable. In contrast to the implicit geometric state duration distribution possessed by the standard HMM, EDHMMs allow the direct parameterisation and estimation of per-state duration distributions. As most duration distributions are defined over the positive integers, truncation or other approximations are usually required to perform EDHMM inference. In this letter we borrow from the inference techniques developed for unbounded state-cardinality (nonparametric) variants of the HMM and use them to develop a tuning-parameter-free, black-box inference procedure for EDHMMs.