Results 1–10 of 21
Convergence of Stochastic Iterative Dynamic Programming Algorithms
Neural Computation, 1994
Cited by 207 (8 self)
Abstract: Increasing attention has recently been paid to algorithms based on dynamic programming (DP) due to the suitability of DP for learning problems involving control. In stochastic environments where the system being controlled is only incompletely known, however, a unifying theoretical account of the behavior of these methods has been missing. In this paper we relate DP-based learning algorithms to powerful techniques of stochastic approximation via a new convergence theorem, enabling us to establish a class of convergent algorithms to which both TD(λ) and Q-learning belong.
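The class of DP-based learning algorithms the abstract refers to can be illustrated with a minimal tabular Q-learning sketch. This is not code from the paper: the two-state MDP and the 1/n step-size schedule (which satisfies the usual stochastic-approximation conditions) are an invented toy example.

```python
import random

def q_learning(episodes=2000, gamma=0.9, seed=0):
    """Tabular Q-learning on a toy MDP: state 0 with actions {0, 1},
    both leading to a terminal state; action 1 pays reward 1, action 0
    pays 0, so the true action values are Q*(0,1)=1 and Q*(0,0)=0."""
    rng = random.Random(seed)
    Q = {(0, 0): 0.0, (0, 1): 0.0}
    counts = {(0, 0): 0, (0, 1): 0}
    for _ in range(episodes):
        a = rng.choice([0, 1])            # exploring behavior policy
        r = 1.0 if a == 1 else 0.0
        counts[(0, a)] += 1
        alpha = 1.0 / counts[(0, a)]      # step sizes with sum = inf, sum of squares < inf
        target = r + gamma * 0.0          # next state is terminal, so no continuation value
        Q[(0, a)] += alpha * (target - Q[(0, a)])
    return Q

Q = q_learning()
```

With deterministic rewards the estimates settle on the true values; the convergence theorem in the paper covers the harder stochastic case.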
Monetary policy with model uncertainty: distribution forecast targeting, unpublished manuscript, 2007
Cited by 40 (12 self)
Abstract: We examine optimal and other monetary policies in a linear-quadratic setup with a relatively general form of model uncertainty, so-called Markov jump-linear-quadratic systems extended to include forward-looking variables and unobservable "modes." The form of model uncertainty our framework encompasses includes: simple i.i.d. model deviations; serially correlated model deviations; estimable regime-switching models; more complex structural uncertainty about very different models, for instance, backward- and forward-looking models; time-varying central-bank judgment about the state of model uncertainty; and so forth. We provide an algorithm for finding the optimal policy as well as solutions for arbitrary policy functions. This allows us to compute and plot consistent distribution forecasts (fan charts) of target variables and instruments. Our methods hence extend certainty equivalence and "mean forecast targeting" to more general certainty non-equivalence and "distribution forecast targeting." JEL Classification: E42, E52, E58
Direct Updating of Intertemporal Criterion Functions for a Class of Adaptive Control Problems
IEEE Transactions on Systems, Man, and Cybernetics, 1979
Cited by 10 (6 self)
Abstract: Previous papers demonstrate the feasibility of direct criterion function updating for a class of adaptive control problems with sequential myopic expected cost objectives. In contrast, the present paper demonstrates the feasibility of direct criterion function updating for a class of adaptive control problems with intertemporal expected cost objectives. Specifically, the proposed criterion filter sequentially updates an initial estimate for expected cost plus cost-to-go in a dynamic programming framework. The control law generated by the filtered expected cost plus cost-to-go estimates is tested against the optimal control law for a linear-quadratic system with random state coefficients. Computer simulation results indicate that the total costs realized under the filter control law are approximately on par with the total costs realized under the optimal control law for the tested range of time horizons, cost function coefficients, and mean and standard deviation values for the random state coefficients.
The Extended Kalman Filter as a Local Asymptotic Observer for Nonlinear Discrete-Time Systems, 1995
Cited by 9 (0 self)
Abstract: The convergence aspects of the extended Kalman filter, when used as a deterministic observer for a nonlinear discrete-time system, are analyzed. To a certain extent, the results parallel those of [1] for continuous-time systems. However, in addition to the analysis done in [1], the case of systems with nonlinear output maps is treated and the conditions needed to ensure the uniform boundedness of certain Riccati equations are related to the observability properties of the underlying nonlinear system. Furthermore, we show the convergence of the filter without any a priori boundedness assumptions on the error covariances whenever the states stay within a convex compact domain. (Work supported by the National Science Foundation under contract NSF ECS-8896136 with matching funds provided by the Ford Motor Co. September 13, 1991.)
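As a rough illustration of the setting (a hedged sketch, not the paper's construction: the scalar system, the nonlinear output map, and the tuning weights q and r are all invented), an extended Kalman filter used as a deterministic observer linearizes the dynamics and the output map at the current estimate and runs a Riccati recursion in place of a true error covariance:

```python
import math

def ekf_observer(x0=2.0, xhat0=0.0, steps=30):
    """Scalar EKF observer for x+ = f(x) measured through y = h(x),
    with noise-free measurements; returns the final estimation error."""
    f = lambda x: 0.5 * x + 0.1 * math.sin(x)   # contracting dynamics
    h = lambda x: x + 0.05 * x ** 3             # nonlinear output map
    x, xhat, P = x0, xhat0, 1.0
    q, r = 1e-4, 1e-2     # design weights playing the role of noise covariances
    for _ in range(steps):
        y = h(x)                                # measurement of the true state
        C = 1.0 + 0.15 * xhat ** 2              # dh/dx at the estimate
        K = P * C / (C * P * C + r)             # Kalman gain
        xhat = xhat + K * (y - h(xhat))         # measurement update
        P = (1.0 - K * C) * P
        A = 0.5 + 0.1 * math.cos(xhat)          # df/dx at the updated estimate
        x, xhat = f(x), f(xhat)                 # propagate true state and estimate
        P = A * P * A + q                       # Riccati recursion
    return abs(x - xhat)

err = ekf_observer()
```

The paper's contribution is precisely the conditions (observability, bounded Riccati solutions, states in a convex compact set) under which this error provably converges for a general nonlinear system; the toy above merely exercises the recursion.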
Robust monetary policy with misspecified models: Does model uncertainty always call for attenuated policy?
Journal of Economic Dynamics and Control, 2001
Cited by 8 (0 self)
Abstract: This paper explores Knightian model uncertainty as a possible explanation of the considerable difference between estimated interest rate rules and optimal feedback descriptions of monetary policy. We focus on two types of uncertainty: (i) unstructured model uncertainty reflected in additive shock error processes that result from omitted-variable misspecifications, and (ii) structured model uncertainty, where one or more parameters are identified as the source of misspecification. For an estimated forward-looking model of the U.S. economy, we find that rules that are robust against uncertainty, the nature of which is unspecifiable, or against one-time parametric shifts, are more aggressive than the optimal linear-quadratic rule. However, policies designed to protect the economy against the worst-case consequences of misspecified dynamics are less aggressive and turn out to be good approximations of the estimated rule. A possible drawback of such policies is that the losses incurred from protecting against worst-case scenarios are concentrated among the same business cycle frequencies that normally occupy the attention of policymakers.
A Probabilistic Particle-Control Approximation of Chance-Constrained Stochastic Predictive Control
Cited by 4 (0 self)
Abstract: Robotic systems need to be able to plan control actions that are robust to the inherent uncertainty in the real world. This uncertainty arises due to uncertain state estimation, disturbances, and modeling errors, as well as stochastic mode transitions such as component failures. Chance-constrained control takes into account uncertainty to ensure that the probability of failure, due to collision with obstacles, for example, is below a given threshold. In this paper, we present a novel method for chance-constrained predictive stochastic control of dynamic systems. The method approximates the distribution of the system state using a finite number of particles. By expressing these particles in terms of the control variables, we are able to approximate the original stochastic control problem as a deterministic one; furthermore, the approximation becomes exact as the number of particles tends to infinity. This method applies to arbitrary noise distributions, and for systems with linear or jump Markov linear dynamics, we show that the approximate problem can be solved using efficient mixed-integer linear-programming techniques. We also introduce an importance weighting extension that enables the method to deal with low-probability mode transitions such as failures. We demonstrate in simulation that the new method is able to control an aircraft in turbulence and can control a ground vehicle while being robust to brake failures. Index Terms: chance constraints, hybrid discrete-continuous systems, nonholonomic motion planning, planning under stochastic uncertainty.
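The particle approximation described in the abstract can be sketched in a few lines. This is a hedged toy, not the authors' MILP formulation: the one-step scalar system, the Gaussian noise model, and the risk threshold are invented. Each sampled noise value becomes a deterministic constraint in the control variable, and the chance constraint becomes a bound on how many sampled constraints may be violated.

```python
import random

def safest_control(x0=0.0, n_particles=1000, delta=0.1, seed=1):
    """One-step system x1 = x0 + u + w; 'failure' is x1 > 1 (the obstacle).
    Returns the largest u whose particle-estimated failure probability
    does not exceed delta, plus that estimate."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 0.2) for _ in range(n_particles)]  # noise particles

    def p_fail(u):
        # fraction of particles that end up past the obstacle under control u
        return sum(1 for wi in w if x0 + u + wi > 1.0) / n_particles

    # At most k = floor(delta * N) particles may violate the constraint.
    # For this scalar toy the answer can be read straight off the sorted
    # particles instead of solving a mixed-integer program: push the
    # (k+1)-th largest particle exactly onto the obstacle boundary.
    w_sorted = sorted(w)
    k = int(delta * n_particles)
    u = 1.0 - x0 - w_sorted[n_particles - 1 - k]
    return u, p_fail(u)

u, p = safest_control()
```

In the paper's setting the same "count the violating particles" idea is encoded with binary variables inside a mixed-integer linear program, which is what makes multi-step problems with obstacles tractable.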
A proposed stochastic simulation framework for the Government of Canada's debt strategy problem. Bank of Canada Working Paper 2003-10, 2003
Cited by 4 (1 self)
Abstract: The views expressed in this paper are those of the author. No responsibility for them should be attributed to the Bank of Canada.
Should Macroeconomic Policy Makers Consider Parameter Covariances? Working Paper 9701, 1997
Cited by 3 (1 self)
Abstract: Many macroeconomic policy exercises consider the mean values of parameter estimates but do not use the variances and covariances. One can argue that the uncertainty of these parameter estimates is sufficiently small that it can safely be ignored. Or one can take the position that this kind of uncertainty cannot be avoided no matter what one does, so it is just as well to ignore it while making policy decisions. In this paper we address both of these positions in the presence of learning and find that they are lacking. To the contrary, we find evidence that the potential damage from ignoring the variances and covariances of the parameter estimates is substantial and that taking them into account can improve matters.
Caution in Macroeconomic Policy: Uncertainty and the Relative Intensity of Policy
Cited by 3 (1 self)
Abstract: Two lines of literature show that an increase in uncertainty will result in a decrease in the vigor of the control variable in the first time period. The first line uses static models and the second dynamic models. In this paper the results in the dynamic line are extended from one-state, one-control models to models with a pair of control variables. We confirm the result of Johansen from the static line that in this case one control will be used less intensely and the other more intensely when current uncertainty is increased. Then we extend this result to models where there are zero weights on the controls to obtain a linear complementarity outcome. The analysis from both lines of literature concerns essentially single-period results, since even the studies in the dynamic line have focused on the effects of current uncertainty. In this paper we follow a suggestion from Craine to extend the results to a multi-period framework. We study the effects of an increase in uncertainty in...
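The static attenuation result this abstract builds on has a one-line closed form. The quadratic loss and the numbers below are an illustration of that standard result, not taken from the paper: minimizing E[(θu − t)²] over u when the multiplier θ is uncertain gives u* = t·E[θ] / (E[θ]² + Var θ), which shrinks as the variance grows.

```python
def optimal_control(theta_mean, theta_var, target):
    # argmin_u E[(theta*u - target)^2]: expand the expectation to
    # u^2*(mean^2 + var) - 2*u*mean*target + target^2, set d/du = 0.
    return target * theta_mean / (theta_mean ** 2 + theta_var)

u_certain = optimal_control(1.0, 0.0, 1.0)    # no uncertainty: certainty equivalence, u = 1
u_uncertain = optimal_control(1.0, 0.5, 1.0)  # variance 0.5 attenuates the control to 2/3
```

With two controls, as in the paper, the analogous first-order conditions couple through the covariance of the multipliers, which is what produces the less-intense/more-intense split the abstract describes.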
Global and approximate global optimality of myopic economic decisions
Journal of Economic Dynamics and Control, 1980
Cited by 2 (1 self)
Abstract: A general discrete-time stochastic control model is developed which encompasses many well-known economic models. In the context of the general model, sufficient conditions are derived for the equivalence and approximate equivalence of myopic (sequential single-period) and global (simultaneous multi-period) expected return maximization. A bound provided for the global return loss resulting from myopic optimization is shown to vary directly with the degree of uncertainty and inversely with the degree of positive correlation in period-by-period returns. Characteristics of intermediate-period return functions which partially order them in terms of final-period expected return performance are clarified. Results are illustrated by portfolio and macro policy model examples.