Results 1  10
of
45
Policy gradient methods for robotics
 In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS
, 2006
"... Abstract — The aquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots should ever leave precisely prestructured environments. However, to date only few existing reinforcement learning methods have been scaled into the d ..."
Abstract

Cited by 79 (19 self)
 Add to MetaCart
Abstract — The aquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots should ever leave precisely prestructured environments. However, to date only few existing reinforcement learning methods have been scaled into the domains of highdimensional robots such as manipulator, legged or humanoid robots. Policy gradient methods remain one of the few exceptions and have found a variety of applications. Nevertheless, the application of such methods is not without peril if done in an uninformed manner. In this paper, we give an overview on learning with policy gradient methods for robotics with a strong focus on recent advances in the field. We outline previous applications to robotics and show how the most recently developed methods can significantly improve learning performance. Finally, we evaluate our most promising algorithm in the application of hitting a baseball with an anthropomorphic arm. I.
Toward a method of selecting among computational models of cognition
 Psychological Review
, 2002
"... The question of how one should decide among competing explanations of data is at the heart of the scientific enterprise. Computational models of cognition are increasingly being advanced as explanations of behavior. The success of this line of inquiry depends on the development of robust methods to ..."
Abstract

Cited by 74 (4 self)
 Add to MetaCart
The question of how one should decide among competing explanations of data is at the heart of the scientific enterprise. Computational models of cognition are increasingly being advanced as explanations of behavior. The success of this line of inquiry depends on the development of robust methods to guide the evaluation and selection of these models. This article introduces a method of selecting among mathematical models of cognition known as minimum description length, which provides an intuitive and theoretically wellgrounded understanding of why one model should be chosen. A central but elusive concept in model selection, complexity, can also be derived with the method. The adequacy of the method is demonstrated in 3 areas of cognitive modeling: psychophysics, information integration, and categorization. How should one choose among competing theoretical explanations of data? This question is at the heart of the scientific enterprise, regardless of whether verbal models are being tested in an experimental setting or computational models are being evaluated in simulations. A number of criteria have been proposed to assist in this endeavor, summarized nicely by Jacobs and Grainger
Structure Learning in Conditional Probability Models via an Entropic Prior and Parameter Extinction
, 1998
"... We introduce an entropic prior for multinomial parameter estimation problems and solve for its maximum... ..."
Abstract

Cited by 66 (0 self)
 Add to MetaCart
We introduce an entropic prior for multinomial parameter estimation problems and solve for its maximum...
A tutorial introduction to the minimum description length principle
 in Advances in Minimum Description Length: Theory and Applications. 2005
"... ..."
Predictability, Complexity, and Learning
, 2001
"... We define predictive information Ipred(T) as the mutual information between the past and the future of a time series. Three qualitatively different behaviors are found in the limit of large observation times T: Ipred(T) can remain finite, grow logarithmically, or grow as a fractional power law. If t ..."
Abstract

Cited by 30 (2 self)
 Add to MetaCart
We define predictive information Ipred(T) as the mutual information between the past and the future of a time series. Three qualitatively different behaviors are found in the limit of large observation times T: Ipred(T) can remain finite, grow logarithmically, or grow as a fractional power law. If the time series allows us to learn a model with a finite number of parameters, then Ipred(T) grows logarithmically with a coefficient that counts the dimensionality of the model space. In contrast, powerlaw growth is associated, for example, with the learning of infinite parameter (or nonparametric) models such as continuous functions with smoothness constraints. There are connections between the predictive information and measures of complexity that have been defined both in learning theory and the analysis of physical systems through statistical mechanics and dynamical systems theory. Furthermore, in the same way that entropy provides the unique measure of available information consistent with some simple and plausible conditions, we argue that the divergent part of Ipred(T) provides the unique measure for the complexity of dynamics underlying a time series. Finally, we discuss how these ideas may be useful in problems in physics, statistics, and biology.
Assessing the Distinguishability of Models and the Informativeness of Data
"... A difficulty in the development and testing of psychological models is that they are typically evaluated solely on their ability to fit experimental data, with little consideration given to their ability to fit other possible data patterns. By examining how well model A fits data generated by mod ..."
Abstract

Cited by 13 (2 self)
 Add to MetaCart
A difficulty in the development and testing of psychological models is that they are typically evaluated solely on their ability to fit experimental data, with little consideration given to their ability to fit other possible data patterns. By examining how well model A fits data generated by model B, and vice versa (a technique that we call landscaping), much safer inferences can be made about the meaning of a models fit to data. We demonstrate the landscaping technique using four models of retention and 77 historical data sets, and show how the method can be used to (1) evaluate the distinguishability of models, (2) evaluate the informativeness of data in distinguishing between models, and (3) suggest new ways to distinguish between models. The generality of the method is demonstrated in two other research areas (information integration and categorization), and its relationship to the important notion of model complexity is discussed.
Model Selection by Normalized Maximum Likelihood
, 2005
"... The Minimum Description Length (MDL) principle is an information theoretic approach to inductive inference that originated in algorithmic coding theory. In this approach, data are viewed as codes to be compressed by the model. From this perspective, models are compared on their ability to compress a ..."
Abstract

Cited by 12 (3 self)
 Add to MetaCart
The Minimum Description Length (MDL) principle is an information theoretic approach to inductive inference that originated in algorithmic coding theory. In this approach, data are viewed as codes to be compressed by the model. From this perspective, models are compared on their ability to compress a data set by extracting useful information in the data apart from random noise. The goal of model selection is to identify the model, from a set of candidate models, that permits the shortest description length (code) of the data. Since Rissanen originally formalized the problem using the crude ‘twopart code ’ MDL method in the 1970s, many significant strides have been made, especially in the 1990s, with the culmination of the development of the refined ‘universal code’ MDL method, dubbed Normalized Maximum Likelihood (NML). It represents an elegant solution to the model selection problem. The present paper provides a tutorial review on these latest developments with a special focus on NML. An application example of NML in cognitive modeling is also provided.
An empirical study of minimum description length model selection with infinite parametric complexity
 JOURNAL OF MATHEMATICAL PSYCHOLOGY
, 2006
"... Parametric complexity is a central concept in Minimum Description Length (MDL) model selection. In practice it often turns out to be infinite, even for quite simple models such as the Poisson and Geometric families. In such cases, MDL model selection as based on NML and Bayesian inference based on J ..."
Abstract

Cited by 10 (1 self)
 Add to MetaCart
Parametric complexity is a central concept in Minimum Description Length (MDL) model selection. In practice it often turns out to be infinite, even for quite simple models such as the Poisson and Geometric families. In such cases, MDL model selection as based on NML and Bayesian inference based on Jeffreys ’ prior can not be used. Several ways to resolve this problem have been proposed. We conduct experiments to compare and evaluate their behaviour on small sample sizes. We find interestingly poor behaviour for the plugin predictive code; a restricted NML model performs quite well but it is questionable if the results validate its theoretical motivation. A Bayesian marginal distribution with Jeffreys’ prior can still be used if one sacrifices the first observation to make a proper posterior; this approach turns out to be most dependable.
Process Pathway Inference via Time Series Analysis
 Experimental Mechanics
, 2003
"... Motivated by recent experimental developments in functional genomics, we construct and test a numerical technique for inferring process pathways, in which one process calls another process, from time series data. We validate using a case in which data are readily available and formulate an extension ..."
Abstract

Cited by 9 (3 self)
 Add to MetaCart
Motivated by recent experimental developments in functional genomics, we construct and test a numerical technique for inferring process pathways, in which one process calls another process, from time series data. We validate using a case in which data are readily available and formulate an extension, appropriate for genetic regulatory networks, which exploits Bayesian inference and in which the present–day undersampling is compensated for by prior understanding of genetic regulation. Preprint number: NSFITP0247 1