## Adaptive Critic Based Approximate Dynamic Programming for Tuning Fuzzy Controllers (2000)

Venue: Proceedings of IEEE-FUZZ 2000, IEEE

Citations: 8 (6 self)

### BibTeX

@INPROCEEDINGS{Shannon00adaptivecritic,

author = {Thaddeus T. Shannon and George G. Lendaris},

title = {Adaptive Critic Based Approximate Dynamic Programming for Tuning Fuzzy Controllers},

booktitle = {Proceedings of IEEE-FUZZ 2000, IEEE},

year = {2000},

publisher = {IEEE Press}

}

### Abstract

In this paper we show the applicability of the Dual Heuristic Programming (DHP) method of Approximate Dynamic Programming to parameter tuning of a fuzzy control system. DHP and related techniques were developed in the neurocontrol context but can be equally productive when used with fuzzy controllers or neuro-fuzzy hybrids. We demonstrate this technique on a highly nonlinear second-order plant proposed by Sanner and Slotine. Throughout our example application, we take advantage of the Takagi-Sugeno (TS) model framework to initialize our tunable parameters with reasonable problem-specific values, a practice that is difficult when applying DHP to neurocontrol.
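To make the idea of "tunable parameters" in a TS framework concrete, the following is a minimal Python sketch of a zero-order Takagi-Sugeno controller whose rule consequents are initialized by hand and exposed for gradient-based tuning. It is not the controller from the paper: the single input, the Gaussian membership functions, the rule centers, the width, and the initial consequent values are all assumptions chosen for illustration, and the class and method names are hypothetical.

```python
import math


def gaussian(x, center, width):
    """Membership degree of x in a Gaussian fuzzy set (assumed shape)."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


class TSController:
    """Zero-order Takagi-Sugeno controller with a single input.

    Rule i reads: IF x is A_i THEN u = c_i.
    The consequents c_i are the tunable parameters.
    """

    def __init__(self, centers, width, consequents):
        self.centers = list(centers)
        self.width = width
        self.consequents = list(consequents)

    def firing_strengths(self, x):
        """Normalized rule activations; they sum to 1."""
        raw = [gaussian(x, c, self.width) for c in self.centers]
        total = sum(raw)
        return [r / total for r in raw]

    def control(self, x):
        """Controller output: activation-weighted average of consequents."""
        weights = self.firing_strengths(x)
        return sum(w * c for w, c in zip(weights, self.consequents))

    def grad_wrt_consequents(self, x):
        """du/dc_i, which for this structure is the normalized activation."""
        return self.firing_strengths(x)


# Hypothetical hand initialization: push the state back toward zero.
controller = TSController(
    centers=[-1.0, 0.0, 1.0],
    width=0.5,
    consequents=[1.0, 0.0, -1.0],
)
```

Because each consequent has a direct linguistic reading ("if the state is negative, push positive"), sensible starting values can be written down before any training. The derivative of the output with respect to each consequent is available in closed form, which is the kind of information a gradient-based tuning scheme would use.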

### Citations

2953 | Dynamic Programming - Bellman - 1957
Citation Context: ...ATE DYNAMIC PROGRAMMING Dynamic Programming is a general approach for sequential optimization applicable under very broad conditions. Fundamental to this approach is Bellman's Principle of Optimality [2]: that an optimal trajectory has the property that no matter how an intermediate point is reached, the rest of the trajectory must coincide with an optimal trajectory as calculated with the intermedia...

1079 | Fuzzy identification of systems and its applications to modeling and control - Takagi, Sugeno - 1985
Citation Context: ... by backpropagation through a neural network plant model. In [17] we showed that such derivative information could be explicitly estimated in the form of a Takagi-Sugeno (TS) fuzzy model of the plant [18]. Based on our experiences detailed in [17] we now suggest that the overall approximate dynamic programming technique can easily be adapted to the tuning of fuzzy controllers. As has been observed in ...

516 | Neuronlike adaptive elements that can solve difficult control problems - Barto, Sutton, et al. - 1983
Citation Context: ...iterature recently, falling into model-based methods such as Dual Heuristic Programming (DHP), and non-model-based methods such as Action Dependent Heuristic Dynamic Programming (ADHDP) or Q-learning [1][5][6][10] [11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schem...

167 | Gaussian networks for direct adaptive control - Sanner, Slotine - 1992
Citation Context: ...n non-model based temporal differencing schemes [3][4][7][8]. Equivalent neural network based techniques have been shown to be generally less effective than model based techniques such as DHP [10][11][14]. Model based methods utilize the Jacobian of the coupled plant-controller system to train both the controller and critic networks. These derivatives can be found explicitly from an analytic model, or...

149 | Learning and tuning fuzzy logic controllers through reinforcements - Berenji, Khedkar - 1992
Citation Context: ...[6][10] [11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schemes [3][4][7][8]. Equivalent neural network based techniques have been shown to be generally less effective than model based techniques such as DHP [10][11][14]. Model based methods utilize the Jacobian of t...

89 | Approximate dynamic programming for real-time control and neural modeling - Werbos - 1992
Citation Context: ...odel-based methods such as Dual Heuristic Programming (DHP), and non-model-based methods such as Action Dependent Heuristic Dynamic Programming (ADHDP) or Q-learning [1][5][6][10] [11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schemes [3][4][7][8]. Equivalent neural ...

85 | Adaptive critic designs - Prokhorov, Wunsch - 1997
Citation Context: ...ntly, falling into model-based methods such as Dual Heuristic Programming (DHP), and non-model-based methods such as Action Dependent Heuristic Dynamic Programming (ADHDP) or Q-learning [1][5][6][10] [11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schemes [3][4][7][8]...

67 | Fuzzy Logic - Yen, Langari - 1999

62 | A menu of designs for reinforcement learning over time, in Neural Networks for Control - Werbos - 1990
Citation Context: ...to model-based methods such as Dual Heuristic Programming (DHP), and non-model-based methods such as Action Dependent Heuristic Dynamic Programming (ADHDP) or Q-learning [1][5][6][10] [11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schemes [3][4][7][8]. Equivalent neu...

48 | Punish/reward: learning with a critic in adaptive threshold systems - Widrow, Gupta, et al. - 1973
Citation Context: ...was proposed by Werbos [19][20]. These networks are often called Adaptive Critics, though this term can be applied more generally to any network that provides learning reinforcement to another entity [21]. As a practical matter, any computational structure capable of acting as a universal function approximator can be used in this role (i.e. neural networks, fuzzy rule structures, etc.). The gradient o...

40 | Neuro Fuzzy Systems - Lin, Lee - 1996
Citation Context: ...] [11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schemes [3][4][7][8]. Equivalent neural network based techniques have been shown to be generally less effective than model based techniques such as DHP [10][11][14]. Model based methods utilize the Jacobian of the cou...

35 | Neural Fuzzy Systems - Lin, Lee - 1996
Citation Context: ...] [11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schemes [3][4][7][8]. Equivalent neural network based techniques have been shown to be generally less effective than model based techniques such as DHP [10][11][14]. Model based methods utilize the Jacobian of the cou...

21 | Adaptive Critic Designs: A Case Study For Neurocontrol - Prokhorov, Santiago, et al.
Citation Context: ..., falling into model-based methods such as Dual Heuristic Programming (DHP), and non-model-based methods such as Action Dependent Heuristic Dynamic Programming (ADHDP) or Q-learning [1][5][6][10] [11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schemes [3][4][7][8]. Eq...

17 | Adaptive Critic Designs and their Applications - Prokhorov - 1997
Citation Context: ... recently, falling into model-based methods such as Dual Heuristic Programming (DHP), and non-model-based methods such as Action Dependent Heuristic Dynamic Programming (ADHDP) or Q-learning [1][5][6][10] [11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schemes [3][4][...

15 | A Self-Learning Rule-Based Controller Employing Approximate Reasoning and Neural Net Concepts - Lee - 1991
Citation Context: ...[10] [11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schemes [3][4][7][8]. Equivalent neural network based techniques have been shown to be generally less effective than model based techniques such as DHP [10][11][14]. Model based methods utilize the Jacobian of the ...

15 | A neural fuzzy system with linguistic teaching signals - Lin, Lu - 1995
Citation Context: ...11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schemes [3][4][7][8]. Equivalent neural network based techniques have been shown to be generally less effective than model based techniques such as DHP [10][11][14]. Model based methods utilize the Jacobian of the couple...

11 | Variable neural networks for adaptive control of nonlinear systems - Liu, Kadirkamanathan, et al.
Citation Context: ...ussian radial basis functions (RBFs). They demonstrated very precise tracking of a bandwidth limited small amplitude signal using several thousand elements in their network. More recently, Liu et al. [9] demonstrated an adaptive control scheme based on an RBF network using a variable grid method. They showed that their method could meet a somewhat less stringent error bound for tracking a sinusoidal ...

11 | A New Progress Towards Truly Brain-Like Control - Santiago, Werbos - 1994
Citation Context: ...lling into model-based methods such as Dual Heuristic Programming (DHP), and non-model-based methods such as Action Dependent Heuristic Dynamic Programming (ADHDP) or Q-learning [1][5][6][10] [11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schemes [3][4][7][8]. Equiva...

9 | A Comparison of Training Algorithms for DHP Adaptive Critic Neuro-control - Lendaris, Shannon, et al. - 1999
Citation Context: ...ure recently, falling into model-based methods such as Dual Heuristic Programming (DHP), and non-model-based methods such as Action Dependent Heuristic Dynamic Programming (ADHDP) or Q-learning [1][5][6][10] [11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schemes [3]...

7 | Qualitative Models for Adaptive Critic Neurocontrol - Shannon, Lendaris - 1999
Citation Context: ...train both the controller and critic networks. These derivatives can be found explicitly from an analytic model, or implicitly, for example by backpropagation through a neural network plant model. In [17] we showed that such derivative information could be explicitly estimated in the form of a Takagi-Sugeno (TS) fuzzy model of the plant [18]. Based on our experiences detailed in [17] we now suggest th...

6 | Partial, Noisy and Qualitative Models for Adaptive Critic Based Neurocontrol - Shannon - 1999
Citation Context: ...g into model-based methods such as Dual Heuristic Programming (DHP), and non-model-based methods such as Action Dependent Heuristic Dynamic Programming (ADHDP) or Q-learning [1][5][6][10] [11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schemes [3][4][7][8]. Equivalent...

3 | Designing (Approximate) Optimal Controllers Via DHP Adaptive Critics & Neural Networks - Lendaris, Shannon - 1999
Citation Context: ...rature recently, falling into model-based methods such as Dual Heuristic Programming (DHP), and non-model-based methods such as Action Dependent Heuristic Dynamic Programming (ADHDP) or Q-learning [1][5][6][10] [11][12][15][16][19][20]. Previous applications of Adaptive Critic based reinforcement learning to the tuning of fuzzy controllers have relied on non-model based temporal differencing schemes ...

2 | Generalized Adaptive Critics and their Applications, presented at IJCNN'99, session 6.5 - Prokhorov, Feldkamp - 1999
Citation Context: ...entation and partitioning. Another feature of adaptive critic based approximate dynamic programming techniques is the potential to use the critic function as a guarantor of system stability, e.g. [10][13]. It is also important to notice the applicability of these techniques to adaptive control problems. Our current example illustrated DHP for tuning a controller for a time invariant plant. For non-sta...

1 | Qualitative Models for Adaptive Critic Neurocontrol - Shannon, Lendaris - 1999
Citation Context: ...train both the controller and critic networks. These derivatives can be found explicitly from an analytic model, or implicitly, for example by backpropagation through a neural network plant model. In [17] we showed that such derivative information could be explicitly estimated in the form of a Takagi-Sugeno (TS) fuzzy model of the plant [18]. Based on our experiences detailed in [17] we now suggest th...