Results 1 - 8 of 8
Multicriteria Reinforcement Learning, 1998
Abstract

Cited by 34 (0 self)
We consider multicriteria sequential decision making problems where the vector-valued evaluations are compared by a given, fixed total ordering. Conditions for the optimality of stationary policies and the Bellman optimality equation are given. The analysis requires special care as the topology introduced by pointwise convergence and the order topology introduced by the preference order are in general incompatible. Reinforcement learning algorithms are proposed and analyzed. Preliminary computer experiments confirm the validity of the derived algorithms. It is observed that in the medium term multicriteria RL algorithms often converge to better solutions (measured by the first criterion) than their single-criterion counterparts. These types of multicriteria problems are most useful when there are several optimal solutions to a problem and one wants to choose the one among these that is optimal according to another fixed criterion. Example applications include alternating games, when in addition...
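To make the idea of comparing vector-valued evaluations under a fixed total ordering concrete, here is a minimal Python sketch. It assumes a lexicographic ordering, which is only one possible total ordering and is not confirmed by the abstract as the one the paper uses. The name `lex_best` and the candidate values are hypothetical illustrations.

```python
from typing import Dict, Tuple

Vector = Tuple[float, ...]


def lex_best(values: Dict[str, Vector]) -> str:
    """Return the key whose value vector is largest in lexicographic order.

    Assumption: the total ordering is lexicographic. The first criterion
    decides; later criteria only break ties among equally good candidates.
    """
    # Python compares tuples lexicographically, component by component.
    return max(values, key=lambda name: values[name])


# Hypothetical evaluations: (first criterion, second criterion).
candidates = {
    "policy_a": (10.0, 2.0),
    "policy_b": (10.0, 5.0),   # ties with policy_a on the first criterion
    "policy_c": (9.0, 100.0),  # worse on the first criterion
}
best = lex_best(candidates)
```

In this sketch `policy_a` and `policy_b` are both optimal on the first criterion, so the second criterion selects between them; `policy_c` is never chosen, however large its second component is.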
Some Basic Facts Concerning Minimax Sequential Decision Processes, 1996
Abstract

Cited by 3 (2 self)
this report. The interested reader may find the proofs (in a more general form) in [3]. Definition 6.1 Let T : R
unknown title
Abstract
Abstract: For interacting agents in time-critical applications, learning whether a subtask can be scheduled reliably is an important issue. The identification of subproblems of this nature may promote, e.g., planning, scheduling and segmenting in Markov decision processes. We define a subtask to be schedulable if its execution time has a small variance. We present an algorithm for finding such subtasks.
Multicriteria Reinforcement Learning, 1998
Abstract
We consider multicriteria sequential decision making problems where the vector-valued evaluations are compared by a given, fixed total ordering. Conditions for the optimality of stationary policies and the Bellman optimality equation are given. The analysis requires special care as the topology introduced by pointwise convergence and the order topology introduced by the preference order are in general incompatible. Reinforcement learning algorithms are proposed and analyzed. Preliminary computer experiments confirm the validity of the derived algorithms. It is observed that in the medium term multicriteria RL algorithms often converge to better solutions (measured by the first criterion) than their single-criterion counterparts. These types of multicriteria problems are most useful when there are several optimal solutions to a problem and one wants to choose the one among these that is optimal according to another fixed criterion. Example applications include alternating games, when in addition...
unknown title
Abstract
An algorithm for finding schedulable plans. Abstract: For interacting agents in time-critical applications, learning whether subtasks can be scheduled is an important issue. The identification of subproblems of this nature may promote, e.g., planning, scheduling and segmenting Markov decision processes. We define a plan as being schedulable if its execution time has a small variance. We present an algorithm for finding such plans.
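The schedulability criterion stated in this abstract (small variance of execution time) can be illustrated with a short Python sketch. This is not the paper's algorithm, which the abstract does not describe; it only checks the stated definition against observed execution times. The function name `is_schedulable`, the threshold `max_variance`, and the sample timings are hypothetical.

```python
from statistics import pvariance
from typing import Sequence


def is_schedulable(execution_times: Sequence[float], max_variance: float) -> bool:
    """Check the 'small variance' criterion on observed execution times.

    Assumption: 'small' means the population variance does not exceed
    a caller-chosen threshold; the abstract gives no concrete bound.
    """
    return pvariance(execution_times) <= max_variance


# Hypothetical timings (in steps) of two plans with the same mean of 10.
steady_plan = [10, 10, 11, 9]    # population variance 0.5
erratic_plan = [2, 18, 10, 10]   # population variance 32

steady_ok = is_schedulable(steady_plan, max_variance=1.0)
erratic_ok = is_schedulable(erratic_plan, max_variance=1.0)
```

Both plans have the same mean execution time, so the mean alone cannot separate them; only the variance distinguishes the plan that finishes at a predictable time.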