Results 1  10
of
14
Hierarchical Dirichlet processes
 Journal of the American Statistical Association
, 2004
"... program. The authors wish to acknowledge helpful discussions with Lancelot James and Jim Pitman and the referees for useful comments. 1 We consider problems involving groups of data, where each observation within a group is a draw from a mixture model, and where it is desirable to share mixture comp ..."
Abstract

Cited by 535 (56 self)
 Add to MetaCart
program. The authors wish to acknowledge helpful discussions with Lancelot James and Jim Pitman and the referees for useful comments. 1 We consider problems involving groups of data, where each observation within a group is a draw from a mixture model, and where it is desirable to share mixture components between groups. We assume that the number of mixture components is unknown a priori and is to be inferred from the data. In this setting it is natural to consider sets of Dirichlet processes, one for each group, where the wellknown clustering property of the Dirichlet process provides a nonparametric prior for the number of mixture components within each group. Given our desire to tie the mixture models in the various groups, we consider a hierarchical model, specifically one in which the base measure for the child Dirichlet processes is itself distributed according to a Dirichlet process. Such a base measure being discrete, the child Dirichlet processes necessarily share atoms. Thus, as desired, the mixture models in the different groups necessarily share mixture components. We discuss representations of hierarchical Dirichlet processes in terms of
Hierarchical topic models and the nested Chinese restaurant process
 Advances in Neural Information Processing Systems
, 2004
"... We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting—which of the large collection of possible trees to use? We take a Bayesian approach, generating an appropriate prior via a distribution on partitions that we refer to as the nested ..."
Abstract

Cited by 188 (25 self)
 Add to MetaCart
We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting—which of the large collection of possible trees to use? We take a Bayesian approach, generating an appropriate prior via a distribution on partitions that we refer to as the nested Chinese restaurant process. This nonparametric prior allows arbitrarily large branching factors and readily accommodates growing data collections. We build a hierarchical topic model by combining this prior with a likelihood that is based on a hierarchical variant of latent Dirichlet allocation. We illustrate our approach on simulated data and with an application to the modeling of NIPS abstracts. 1
Infinite Latent Feature Models and the Indian Buffet Process
, 2005
"... We define a probability distribution over equivalence classes of binary matrices with a finite number of rows and an unbounded number of columns. This distribution ..."
Abstract

Cited by 180 (38 self)
 Add to MetaCart
We define a probability distribution over equivalence classes of binary matrices with a finite number of rows and an unbounded number of columns. This distribution
Theorybased causal induction
 In
, 2003
"... Inducing causal relationships from observations is a classic problem in scientific inference, statistics, and machine learning. It is also a central part of human learning, and a task that people perform remarkably well given its notorious difficulties. People can learn causal structure in various s ..."
Abstract

Cited by 33 (14 self)
 Add to MetaCart
Inducing causal relationships from observations is a classic problem in scientific inference, statistics, and machine learning. It is also a central part of human learning, and a task that people perform remarkably well given its notorious difficulties. People can learn causal structure in various settings, from diverse forms of data: observations of the cooccurrence frequencies between causes and effects, interactions between physical objects, or patterns of spatial or temporal coincidence. These different modes of learning are typically thought of as distinct psychological processes and are rarely studied together, but at heart they present the same inductive challenge—identifying the unobservable mechanisms that generate observable relations between variables, objects, or events, given only sparse and limited data. We present a computationallevel analysis of this inductive problem and a framework for its solution, which allows us to model all these forms of causal learning in a common language. In this framework, causal induction is the product of domaingeneral statistical inference guided by domainspecific prior knowledge, in the form of an abstract causal theory. We identify 3 key aspects of abstract prior knowledge—the ontology of entities, properties, and relations that organizes a domain; the plausibility of specific causal relationships; and the functional form of those relationships—and show how they provide the constraints that people need to induce useful causal models from sparse data.
Structured priors for structure learning
 In Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI
, 2006
"... Traditional approaches to Bayes net structure learning typically assume little regularity in graph structure other than sparseness. However, in many cases, we expect more systematicity: variables in realworld systems often group into classes that predict the kinds of probabilistic dependencies they ..."
Abstract

Cited by 19 (8 self)
 Add to MetaCart
Traditional approaches to Bayes net structure learning typically assume little regularity in graph structure other than sparseness. However, in many cases, we expect more systematicity: variables in realworld systems often group into classes that predict the kinds of probabilistic dependencies they participate in. Here we capture this form of prior knowledge in a hierarchical Bayesian framework, and exploit it to enable structure learning and type discovery from small datasets. Specifically, we present a nonparametric generative model for directed acyclic graphs as a prior for Bayes net structure learning. Our model assumes that variables come in one or more classes and that the prior probability of an edge existing between two variables is a function only of their classes. We derive an MCMC algorithm for simultaneous inference of the number of classes, the class assignments of variables, and the Bayes net structure over variables. For several realistic, sparse datasets, we show that the bias towards systematicity of connections provided by our model can yield more accurate learned networks than the traditional approach of using a uniform prior, and that the classes found by our model are appropriate. 1
Incremental Learning of Subtasks from Unsegmented Demonstration
"... Abstract — We propose to incrementally learn the segmentation of a demonstrated task into subtasks and the individual subtask policies themselves simultaneously. Previous robot learning from demonstration techniques have either learned the individual subtasks in isolation, combined known subtasks, o ..."
Abstract

Cited by 13 (0 self)
 Add to MetaCart
Abstract — We propose to incrementally learn the segmentation of a demonstrated task into subtasks and the individual subtask policies themselves simultaneously. Previous robot learning from demonstration techniques have either learned the individual subtasks in isolation, combined known subtasks, or used knowledge of the overall task structure to perform segmentation. Our infinite mixture of experts approach instead automatically infers an appropriate partitioning (number of subtasks and assignment of data points to each one) directly from the data. We illustrate the applicability of our technique by learning a suitable set of subtasks from the demonstration of a finitestate machine robot soccer goal scorer. I.
A nonparametric Bayesian alternative to spike sorting
 Journal of Neuroscience Methods
"... The analysis of extracellular neural recordings typically begins with careful spike sorting and all analysis of the data then rests on the correctness of the resulting spike trains. In many situations this is unproblematic as experimental and spike sorting procedures often focus on well isolated un ..."
Abstract

Cited by 11 (1 self)
 Add to MetaCart
The analysis of extracellular neural recordings typically begins with careful spike sorting and all analysis of the data then rests on the correctness of the resulting spike trains. In many situations this is unproblematic as experimental and spike sorting procedures often focus on well isolated units. There is evidence in the literature, however, that errors in spike sorting can occur even with carefully collected and selected data. Additionally, chronically implanted electrodes and arrays with fixed electrodes cannot be easily adjusted to provide well isolated units. In these situations, multiple units may be recorded and the assignment of waveforms to units may be ambiguous. At the same time, analysis of such data may be both scientifically important and clinically relevant. In this paper we address this issue using a novel probabilistic model that accounts for several important sources of uncertainty and error in spike sorting. In lieu of sorting neural data to produce a single best spike train, we estimate a probabilistic model of spike trains given the observed data. We show how such a distribution over spike sortings can support standard neuroscientific questions while providing a representation of uncertainty in the analysis. As a representative illustration of the approach, we analyzed primary motor cortical tuning with respect to hand movement in data recorded with a chronic multielectrode array in nonhuman primates. We found that the probabilistic analysis generally agrees with human sorters but suggests the presence of tuned units not detected by humans.
Categorization as nonparametric Bayesian density estimation
"... Rational models of cognition aim to explain the structure of human thought and behavior as an optimal solution to the computational problems that are posed by our environment ..."
Abstract

Cited by 7 (2 self)
 Add to MetaCart
Rational models of cognition aim to explain the structure of human thought and behavior as an optimal solution to the computational problems that are posed by our environment
Context, Learning, and Extinction
"... A. Redish et al. (2007) proposed a reinforcement learning model of contextdependent learning and extinction in conditioning experiments, using the idea of “state classification ” to categorize new observations into states. In the current article, the authors propose an interpretation of this idea i ..."
Abstract

Cited by 4 (3 self)
 Add to MetaCart
A. Redish et al. (2007) proposed a reinforcement learning model of contextdependent learning and extinction in conditioning experiments, using the idea of “state classification ” to categorize new observations into states. In the current article, the authors propose an interpretation of this idea in terms of normative statistical inference. They focus on renewal and latent inhibition, 2 conditioning paradigms in which contextual manipulations have been studied extensively, and show that online Bayesian inference within a model that assumes an unbounded number of latent causes can characterize a diverse set of behavioral results from such manipulations, some of which pose problems for the model of Redish et al. Moreover, in both paradigms, context dependence is absent in younger animals, or if hippocampal lesions are made prior to training. The authors suggest an explanation in terms of a restricted capacity to infer new causes.
Teaching Old Dogs New Tricks: Incremental Multimap Regression for Interactive Robot Learning from Demonstration
, 2010
"... We consider autonomous robots as having associated control policies that determine their actions in response to perceptions of the environment. Often, these controllers are explicitly transferred from a human via programmatic description or physical instantiation. Alternatively, Robot Learning from ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
We consider autonomous robots as having associated control policies that determine their actions in response to perceptions of the environment. Often, these controllers are explicitly transferred from a human via programmatic description or physical instantiation. Alternatively, Robot Learning from Demonstration (RLfD) can enable a robot to learn a policy from observing only demonstrations of the task itself. We focus on interactive, teleoperative teaching, where the user manually controls the robot and provides demonstrations while receiving learner feedback. With regression, the collected perceptionactuation pairs are used to directly estimate the underlying policy mapping. This dissertation contributes an RLfD methodology for interactive, mixedinitiative learning of unknown tasks. The goal of the technique is to enable users to implicitly instantiate autonomous robot controllers that perform desired tasks as well as the demonstrator, as measured by taskspecific metrics. With standard regression techniques, we show that such “onpar” learning is restricted to policies typified by a manytoone mapping (a unimap) from perception to actuation. Thus, controllers representable as multistate Finite State Machines (FSMs) and that exhibit a onetomany mapping (a multimap) cannot be learnt. To be able to do so we must address the three issues of model selection (how many subtasks or FSM states), policy learning (for each subtask),