Results 1-10 of 78
Planning and acting in partially observable stochastic domains
Artificial Intelligence, 1998
Cited by 832 (30 self)
In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov decision processes (MDPs) and partially observable MDPs (POMDPs). We then outline a novel algorithm for solving POMDPs offline and show how, in some cases, a finite-memory controller can be extracted from the solution to a POMDP. We conclude with a discussion of how our approach relates to previous work, the complexity of finding exact solutions to POMDPs, and of some possibilities for finding approximate solutions.
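As background for the abstract above, the following is a minimal sketch of the standard Bayesian belief update that POMDP methods rely on: after taking an action and receiving an observation, the belief over hidden states is re-weighted and normalized. It is not the algorithm of the paper. The function name `belief_update`, the dictionary layout of `T_toy` and `O_toy`, and the toy two-state model with its numbers are all assumptions made for illustration.

```python
def belief_update(belief, action, observation, T, O):
    """Return the posterior belief after `action` and `observation`.

    belief: dict mapping state -> probability
    T[(s, a)]: dict mapping next state s2 -> P(s2 | s, a)   (assumed layout)
    O[(s2, a)]: dict mapping observation o -> P(o | s2, a)  (assumed layout)
    """
    # Prediction step: push the current belief through the transition model.
    predicted = {}
    for s, p in belief.items():
        for s2, pt in T[(s, action)].items():
            predicted[s2] = predicted.get(s2, 0.0) + p * pt

    # Correction step: weight each state by the observation likelihood.
    weighted = {
        s2: p * O[(s2, action)].get(observation, 0.0)
        for s2, p in predicted.items()
    }

    total = sum(weighted.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s2: w / total for s2, w in weighted.items()}


# Hypothetical toy model: the hidden state never changes, and the
# "listen" action reports the true state with probability 0.85.
T_toy = {
    ("left", "listen"): {"left": 1.0},
    ("right", "listen"): {"right": 1.0},
}
O_toy = {
    ("left", "listen"): {"hear-left": 0.85, "hear-right": 0.15},
    ("right", "listen"): {"hear-left": 0.15, "hear-right": 0.85},
}

posterior = belief_update(
    {"left": 0.5, "right": 0.5}, "listen", "hear-left", T_toy, O_toy
)
# posterior["left"] is about 0.85 and posterior["right"] about 0.15
```

Starting from a uniform belief, a single "hear-left" observation shifts the belief toward "left" in proportion to the assumed sensor accuracy.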
Quantitative Stochastic Parity Games
Cited by 52 (23 self)
We study perfect-information stochastic parity games. These are two-player nonterminating games which are played on a graph with turn-based probabilistic transitions. A play results in an infinite path and the conflicting goals of the two players are ω-regular path properties, formalized as parity winning conditions. The qualitative solution of such a game amounts to computing the set of vertices from which a player has a strategy to win with probability 1 (or with positive probability). The quantitative solution amounts to computing the value of the game in every vertex, i.e., the highest probability with which a player can guarantee satisfaction of his own objective in a play that starts from the vertex. For the important special case of one-player stochastic parity games (parity Markov decision processes) we give polynomial-time algorithms both for the qualitative and the quantitative solution. The running time of the qualitative solution is O(d · m^{3/2}) for graphs with m edges and d priorities. The quantitative solution is based on a linear-programming formulation.
A deterministic subexponential algorithm for solving parity games
SODA, 2006
Cited by 44 (2 self)
The existence of polynomial time algorithms for the solution of parity games is a major open problem. The fastest known algorithms for the problem are randomized algorithms that run in subexponential time. These algorithms are all ultimately based on the randomized subexponential simplex algorithms of Kalai and of Matousek, Sharir and Welzl. Randomness seems to play an essential role in these algorithms. We use a completely different, and elementary, approach to obtain a deterministic subexponential algorithm for the solution of parity games. The new algorithm, like the existing randomized subexponential algorithms, uses only polynomial space, and it is almost as fast as the randomized subexponential algorithms mentioned above.
Concurrent Reachability Games
2008
Cited by 43 (18 self)
We consider concurrent two-player games with reachability objectives. In such games, at each round, player 1 and player 2 independently and simultaneously choose moves, and the two choices determine the next state of the game. The objective of player 1 is to reach a set of target states; the objective of player 2 is to prevent this. These are zero-sum games, and the reachability objective is one of the most basic objectives: determining the set of states from which player 1 can win the game is a fundamental problem in control theory and system verification. There are three types of winning states, according to the degree of certainty with which player 1 can reach the target. From type-1 states, player 1 has a deterministic strategy to always reach the target. From type-2 states, player 1 has a randomized strategy to reach the target with probability 1. From type-3 states, player 1 has for every real ε > 0 a randomized strategy to reach the target with probability greater than 1 − ε. We show that for finite state spaces, all three sets of winning states can be computed in polynomial time: type-1 states in linear time, and type-2 and type-3 states in quadratic time. The algorithms to compute the three sets of winning states also enable the construction of the winning and spoiling strategies.
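To make the notion of type-1 states in the abstract above concrete, here is a minimal sketch that computes them by a naive fixed-point iteration: a state is added once player 1 has some move that leads into the current winning set whatever player 2 does. This is an illustration only, not the linear-time algorithm the abstract refers to. The function name `sure_winning_states`, the encoding of the game as dictionaries, the assumption of a deterministic successor per move pair, and the toy game are all hypothetical.

```python
def sure_winning_states(states, moves1, moves2, delta, targets):
    """Naive fixed point for states from which player 1 surely reaches a target.

    moves1[s], moves2[s]: moves available to each player in state s
    delta[(s, a, b)]: successor state when the players pick a and b
    (a deterministic successor is an assumption of this sketch)
    """
    win = set(targets)
    changed = True
    while changed:
        changed = False
        for s in states:
            if s in win:
                continue
            # Player 1 needs one move that is safe against every reply.
            if any(
                all(delta[(s, a, b)] in win for b in moves2[s])
                for a in moves1[s]
            ):
                win.add(s)
                changed = True
    return win


# Hypothetical toy game. In "s1" player 1 can move straight to "goal".
# In "s0" the players pick "h" or "t" simultaneously; player 1 reaches
# "goal" only when the picks match, otherwise the game stays in "s0".
states = ["s0", "s1", "goal"]
moves1 = {"s0": ["h", "t"], "s1": ["go"], "goal": ["stay"]}
moves2 = {"s0": ["h", "t"], "s1": ["x", "y"], "goal": ["stay"]}
delta = {
    ("s0", "h", "h"): "goal", ("s0", "h", "t"): "s0",
    ("s0", "t", "h"): "s0", ("s0", "t", "t"): "goal",
    ("s1", "go", "x"): "goal", ("s1", "go", "y"): "goal",
    ("goal", "stay", "stay"): "goal",
}

type1 = sure_winning_states(states, moves1, moves2, delta, {"goal"})
# type1 contains "goal" and "s1" but not "s0"
```

In this toy game "s0" is not a type-1 state, because every deterministic choice of player 1 can be mismatched by player 2. Choosing uniformly at random in "s0" reaches the goal with probability 1, which is the kind of state the abstract calls type-2.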
A combinatorial strongly subexponential strategy improvement algorithm for mean payoff games
Discrete Applied Mathematics, 2004
Cited by 41 (4 self)
We suggest the first strongly subexponential and purely combinatorial algorithm for solving the mean payoff games problem. It is based on iteratively improving the longest shortest distances to a sink in a possibly cyclic directed graph. We identify a new “controlled” version of the shortest paths problem. By selecting exactly one outgoing edge in each of the controlled vertices we want to make the shortest distances from all vertices to the unique sink as long as possible. Under reasonable assumptions the problem belongs to the complexity class NP∩coNP. Mean payoff games are easily reducible to this problem. We suggest an algorithm for computing longest shortest paths. Player Max selects a strategy (one edge in each controlled vertex) and player Min responds by evaluating shortest paths to the sink in the remaining graph. Then Max locally changes choices in controlled vertices looking at attractive switches that seem to increase shortest paths lengths (under the current evaluation). We show that this is a monotonic strategy improvement, and every locally optimal strategy is globally optimal. This allows us to construct a randomized algorithm of complexity min(poly · W, 2^{O(√(n log n))}), which is simultaneously pseudopolynomial (W is the maximal absolute edge weight) and subexponential in the number of vertices n. All previous algorithms for mean payoff games were either exponential or pseudopolynomial (which is purely exponential for exponentially large edge weights).
Simple stochastic parity games
In CSL’03, volume 2803 of LNCS, 2003
"... p m), compared with O(mn) best algorithm known ..."
A discrete subexponential algorithm for parity games
STACS’03, 2003
Cited by 33 (8 self)
We suggest a new randomized algorithm for solving parity games with worst case time complexity roughly
Staying Alive As Cheaply As Possible
In Proc. of 7th Intl. Workshop on Hybrid Systems: Computation and Control (HSCC), volume 2993 of Lect. Notes in Comp. Sci., 2004
Cited by 32 (18 self)
This paper is concerned with the derivation of infinite schedules for timed automata that are in some sense optimal. To cover a wide class of optimality criteria we start out by introducing an extension of the (priced) timed automata model that includes both costs and rewards as separate modelling features. A precise definition is then given of what constitutes optimal infinite behaviours for this class of models. We subsequently show that the derivation of optimal nonterminating schedules for such double-priced timed automata is computable.
Better quality in synthesis through quantitative objectives
In CoRR, abs/0904.2638, 2009
Cited by 22 (9 self)
Most specification languages express only qualitative constraints. However, among two implementations that satisfy a given specification, one may be preferred to another. For example, if a specification asks that every request is followed by a response, one may prefer an implementation that generates responses quickly but does not generate unnecessary responses. We use quantitative properties to measure the “goodness” of an implementation. Using games with corresponding quantitative objectives, we can synthesize “optimal” implementations, which are preferred among the set of possible implementations that satisfy a given specification. In particular, we show how automata with lexicographic mean-payoff conditions can be used to express many interesting quantitative properties for reactive systems. In this framework, the synthesis of optimal implementations requires the solution of lexicographic mean-payoff games (for safety requirements), and the solution of games with both lexicographic mean-payoff and parity objectives (for liveness requirements). We present algorithms for solving both kinds of novel graph games.
An exponential lower bound for the parity game strategy improvement algorithm as we know it
In Proc. of 24th LICS, 2009
Cited by 21 (8 self)
This paper presents a new lower bound for the discrete strategy improvement algorithm for solving parity games due to Vöge and Jurdziński. First, we informally show which structures are difficult to solve for the algorithm. Second, we outline a family of games on which the algorithm requires exponentially many strategy iterations, answering in the negative the longstanding question whether this algorithm runs in polynomial time. Additionally we note that the same family of games can be used to prove a similar result w.r.t. the strategy improvement variant by Schewe as well as the strategy iteration for solving discounted payoff games due to Puri.