Planning and acting in partially observable stochastic domains
- ARTIFICIAL INTELLIGENCE
, 1998
Cited by 629 (24 self)
In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov decision processes (mdps) and partially observable mdps (pomdps). We then outline a novel algorithm for solving pomdps offline and show how, in some cases, a finite-memory controller can be extracted from the solution to a pomdp. We conclude with a discussion of how our approach relates to previous work, the complexity of finding exact solutions to pomdps, and of some possibilities for finding approximate solutions.
Quantitative Stochastic Parity Games
Cited by 39 (15 self)
We study perfect-information stochastic parity games. These are two-player nonterminating games which are played on a graph with turn-based probabilistic transitions. A play results in an infinite path and the conflicting goals of the two players are ω-regular path properties, formalized as parity winning conditions. The qualitative solution of such a game amounts to computing the set of vertices from which a player has a strategy to win with probability 1 (or with positive probability). The quantitative solution amounts to computing the value of the game in every vertex, i.e., the highest probability with which a player can guarantee satisfaction of his own objective in a play that starts from the vertex. For the important special case of one-player stochastic parity games (parity Markov decision processes) we give polynomial-time algorithms both for the qualitative and the quantitative solution. The running time of the qualitative solution is O(d · m^{3/2}) for graphs with m edges and d priorities. The quantitative solution is based on a linear-programming formulation.
Concurrent Reachability Games
, 2008
Cited by 36 (18 self)
We consider concurrent two-player games with reachability objectives. In such games, at each round, player 1 and player 2 independently and simultaneously choose moves, and the two choices determine the next state of the game. The objective of player 1 is to reach a set of target states; the objective of player 2 is to prevent this. These are zero-sum games, and the reachability objective is one of the most basic objectives: determining the set of states from which player 1 can win the game is a fundamental problem in control theory and system verification. There are three types of winning states, according to the degree of certainty with which player 1 can reach the target. From type-1 states, player 1 has a deterministic strategy to always reach the target. From type-2 states, player 1 has a randomized strategy to reach the target with probability 1. From type-3 states, player 1 has for every real ε > 0 a randomized strategy to reach the target with probability greater than 1 − ε. We show that for finite state spaces, all three sets of winning states can be computed in polynomial time: type-1 states in linear time, and type-2 and type-3 states in quadratic time. The algorithms to compute the three sets of winning states also enable the construction of the winning and spoiling strategies.
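The type-1 (sure-win) set described in this abstract can be illustrated with a naive fixpoint iteration. The sketch below is not the paper's linear-time algorithm; it assumes a hypothetical encoding in which `delta[(s, a, b)]` gives the unique successor of state `s` when player 1 plays `a` and player 2 plays `b`, and it repeatedly adds states where some move of player 1 leads into the current winning set against every move of player 2.

```python
def sure_reach(states, moves1, moves2, delta, target):
    """Naive least-fixpoint computation of the type-1 (sure-win) states.

    All parameter names are illustrative assumptions, not taken from the paper.
    """
    win = set(target)
    changed = True
    while changed:
        changed = False
        for s in states:
            if s in win:
                continue
            # s is winning if player 1 has one move that works against all replies.
            if any(all(delta[(s, a, b)] in win for b in moves2[s])
                   for a in moves1[s]):
                win.add(s)
                changed = True
    return win


# Hypothetical example: in s0 the players play a matching-pennies round,
# so no single deterministic move of player 1 is safe; s1 always reaches goal.
states = ["s0", "s1", "goal"]
moves1 = {"s0": ["a", "b"], "s1": ["a"], "goal": ["a"]}
moves2 = {"s0": ["x", "y"], "s1": ["x", "y"], "goal": ["x"]}
delta = {
    ("s0", "a", "x"): "goal", ("s0", "a", "y"): "s0",
    ("s0", "b", "x"): "s0",   ("s0", "b", "y"): "goal",
    ("s1", "a", "x"): "goal", ("s1", "a", "y"): "goal",
    ("goal", "a", "x"): "goal",
}
sure_states = sure_reach(states, moves1, moves2, delta, {"goal"})
```

In this example `s0` is excluded from the sure-win set even though a randomized strategy reaches the target from it with probability 1, which is the distinction between type-1 and type-2 states drawn in the abstract.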
A discrete subexponential algorithm for parity games
- STACS’03
, 2003
Cited by 30 (8 self)
We suggest a new randomized algorithm for solving parity games with worst case time complexity roughly
A combinatorial strongly subexponential strategy improvement algorithm for mean payoff games
- DISCRETE APPLIED MATHEMATICS
, 2004
Cited by 29 (4 self)
We suggest the first strongly subexponential and purely combinatorial algorithm for solving the mean payoff games problem. It is based on iteratively improving the longest shortest distances to a sink in a possibly cyclic directed graph. We identify a new “controlled” version of the shortest paths problem. By selecting exactly one outgoing edge in each of the controlled vertices we want to make the shortest distances from all vertices to the unique sink as long as possible. Under reasonable assumptions the problem belongs to the complexity class NP∩coNP. Mean payoff games are easily reducible to this problem. We suggest an algorithm for computing longest shortest paths. Player Max selects a strategy (one edge in each controlled vertex) and player Min responds by evaluating shortest paths to the sink in the remaining graph. Then Max locally changes choices in controlled vertices looking at attractive switches that seem to increase shortest-path lengths (under the current evaluation). We show that this is a monotonic strategy improvement, and every locally optimal strategy is globally optimal. This allows us to construct a randomized algorithm of complexity min(poly · W, 2^{O(√(n log n))}), which is simultaneously pseudopolynomial (W is the maximal absolute edge weight) and subexponential in the number of vertices n. All previous algorithms for mean payoff games were either exponential or pseudopolynomial (which is purely exponential for exponentially large edge weights).
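The improvement loop outlined in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's randomized algorithm: the edge-list encoding, the function names, and the rule of applying one attractive switch per round are all choices made here, and the sketch assumes the input satisfies the conditions under which improvement is monotonic (in particular, no negative cycles).

```python
import math


def shortest_to_sink(vertices, edges, sink):
    """Bellman-Ford distances to the sink; unreachable vertices get math.inf."""
    dist = {v: math.inf for v in vertices}
    dist[sink] = 0
    for _ in range(len(vertices) - 1):
        changed = False
        for u, v, w in edges:
            if dist[v] + w < dist[u]:
                dist[u] = dist[v] + w
                changed = True
        if not changed:
            break
    return dist


def improve_strategy(vertices, edges, controlled, sink, strategy):
    """Iterate attractive switches until Max's strategy is locally optimal.

    strategy maps each controlled vertex u to its chosen edge (v, w).
    """
    strategy = dict(strategy)
    while True:
        # Min's response: shortest paths once Max's unchosen edges are removed.
        kept = [(u, v, w) for u, v, w in edges
                if u not in controlled or strategy[u] == (v, w)]
        dist = shortest_to_sink(vertices, kept, sink)
        # A switch is attractive if it looks longer under the current evaluation.
        switch = next(((u, v, w) for u, v, w in edges
                       if u in controlled and dist[v] + w > dist[u]), None)
        if switch is None:
            return strategy, dist
        u, v, w = switch
        strategy[u] = (v, w)


# Hypothetical example: 't' is the sink and 'a' is the only controlled vertex.
vertices = ["t", "a", "b", "c"]
edges = [("a", "t", 1), ("a", "b", 1), ("b", "t", 5),
         ("c", "a", 1), ("c", "t", 3)]
final_strategy, final_dist = improve_strategy(
    vertices, edges, {"a"}, "t", {"a": ("t", 1)})
```

In this example Max starts with the direct edge from `a` to the sink and switches to the detour through `b`, which lengthens the shortest distance from `a` from 1 to 6 and from `c` from 2 to 3.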
Simple stochastic parity games
- In CSL’03, volume 2803 of LNCS
, 2003
"... p m), compared with O(mn) best algorithm known ..."
A deterministic subexponential algorithm for solving parity games
- SODA
, 2006
Cited by 29 (2 self)
The existence of polynomial time algorithms for the solution of parity games is a major open problem. The fastest known algorithms for the problem are randomized algorithms that run in subexponential time. These algorithms are all ultimately based on the randomized subexponential simplex algorithms of Kalai and of Matousek, Sharir and Welzl. Randomness seems to play an essential role in these algorithms. We use a completely different, and elementary, approach to obtain a deterministic subexponential algorithm for the solution of parity games. The new algorithm, like the existing randomized subexponential algorithms, uses only polynomial space, and it is almost as fast as the randomized subexponential algorithms mentioned above.
Staying Alive As Cheaply As Possible
- In Proc. of 7th Intl. Workshop on Hybrid Systems: Computation and Control (HSCC), volume 2993 of Lect. Notes in Comp. Sci
, 2004
Cited by 26 (16 self)
This paper is concerned with the derivation of infinite schedules for timed automata that are in some sense optimal. To cover a wide class of optimality criteria we start out by introducing an extension of the (priced) timed automata model that includes both costs and rewards as separate modelling features. A precise definition is then given of what constitutes optimal infinite behaviours for this class of models. We subsequently show that the derivation of optimal non-terminating schedules for such double-priced timed automata is computable.
An exponential lower bound for the parity game strategy improvement algorithm as we know it
- In Proc. of 24th LICS
, 2009
Cited by 12 (6 self)
This paper presents a new lower bound for the discrete strategy improvement algorithm for solving parity games due to Vöge and Jurdziński. First, we informally show which structures are difficult to solve for the algorithm. Second, we outline a family of games on which the algorithm requires exponentially many strategy iterations, answering in the negative the long-standing question whether this algorithm runs in polynomial time. Additionally we note that the same family of games can be used to prove a similar result w.r.t. the strategy improvement variant by Schewe as well as the strategy iteration for solving discounted payoff games due to Puri.

