Improved approximation of interactive dynamic influence diagrams using discriminative model updates (2009)

by P Doshi, Y Zeng
Venue: In AAMAS ’09

Teamwork with Limited Knowledge of Teammates

by Samuel Barrett, Peter Stone, Sarit Kraus, Avi Rosenfeld
Cited by 7 (4 self)
While great strides have been made in multiagent teamwork, existing approaches typically assume extensive information exists about teammates and how to coordinate actions. This paper addresses how robust teamwork can still be created even if limited or no information exists about a specific group of teammates, as in the ad hoc teamwork scenario. The main contribution of this paper is the first empirical evaluation of an agent cooperating with teammates not created by the authors, where the agent is not provided expert knowledge of its teammates. For this purpose, we develop a general-purpose teammate modeling method and test the resulting ad hoc team agent’s ability to collaborate with more than 40 unknown teams of agents to accomplish a benchmark task. These agents were designed by people other than the authors without these designers planning for the ad hoc teamwork setting. A secondary contribution of the paper is a new transfer learning algorithm, TwoStageTransfer, that can improve results when the ad hoc team agent does have some limited observations of its current teammates.

Learning teammate models for ad hoc teamwork

by Samuel Barrett, Peter Stone, Sarit Kraus, Avi Rosenfeld
Cited by 3 (1 self)

Citation Context

... algorithm [6] which achieves convergence and rationality in repeated games. Another approach is to explicitly model and reason about other agents’ beliefs such as the work on I-POMDPs [9] and I-DIDs [7]. However, modeling other agents’ beliefs greatly expands the space for planning, and these approaches do not currently scale to larger problems. 6. CONCLUSION Most existing research on ad hoc teamwor...

ε-Subjective Equivalence of Models for Interactive Dynamic Influence Diagrams

by Prashant Doshi, Muthukumaran Chandrasekaran, Yifeng Zeng
Cited by 3 (2 self)
Interactive dynamic influence diagrams (I-DID) are graphical models for sequential decision making in uncertain settings shared by other agents. Algorithms for solving I-DIDs face the challenge of an exponentially growing space of candidate models ascribed to other agents, over time. Pruning behaviorally equivalent models is one way toward minimizing the model set. We seek to further reduce the complexity by additionally pruning models that are approximately subjectively equivalent. Toward this, we define subjective equivalence in terms of the distribution over the subject agent’s future action-observation paths, and introduce the notion of ε-subjective equivalence. We present a new approximation technique that reduces the candidate model space by removing models that are ε-subjectively equivalent with representative ones.

Communicating with Unknown Teammates

by Samuel Barrett, Noa Agmon, Noam Hazon, Sarit Kraus, Peter Stone - Proceedings of the 13th Adaptive and Learning Agents workshop, 2013
Cited by 2 (0 self)
Past research has investigated a number of methods for coordinating teams of agents, but, with the growing number of sources of agents, it is likely that agents will encounter teammates that do not share their coordination methods. Therefore, it is desirable for agents to form an effective ad hoc team. This research tackles the problem of communication in ad hoc teams, introducing a minimal version of the multiagent, multi-armed bandit problem with limited communication between the agents. This abstract summarizes theoretical results that prove that this problem setting can be solved in polynomial time when the agent knows the set of possible teammates, and the empirical results that show that the problems can be solved in practice.

Approximate Solutions of Interactive Dynamic Influence Diagrams Using ε-Behavioral Equivalence

by Muthukumaran C, Prashant Doshi, Yifeng Zeng
Cited by 2 (2 self)
Interactive dynamic influence diagrams (I-DID) are graphical models for sequential decision making in uncertain settings shared by other agents. Algorithms for solving I-DIDs face the challenge of an exponentially growing space of candidate models ascribed to other agents, over time. Pruning the behaviorally equivalent models is one way toward identifying a minimal model set. We further reduce the complexity by pruning models that are approximately behaviorally equivalent. Toward this, we redefine behavioral equivalence in terms of the distribution over the subject agent’s future action-observation paths, and introduce the notion of ε-behavioral equivalence. We present a new approximation method that reduces the candidate models by pruning models that are ε-behaviorally equivalent with representative ones.

Delocalization and conductance quantization in one-dimensional systems

by Z. Y. Zeng, F. Claro, 2001
Cited by 1 (0 self)

Citation Context

...te space. Previous I-DID solutions, including both exact and approximate ones, mainly exploit the concept of BE to reduce the dimensionality of the state space. For example, the proposed technique in [5] updates only those models that lead to behaviorally distinct models at the next time step. It results in a minimal model space. A central component of this technique is the way of identifying equival...

Approximating Behavioral Equivalence of Models Using Top-K Policy Paths

by Yifeng Zeng, Yingke Chen, Prashant Doshi
Cited by 1 (1 self)
Decision making and game play in multiagent settings must often contend with behavioral models of other agents in order to predict their actions. One approach that reduces the complexity of the unconstrained model space is to group models that tend to be behaviorally equivalent. In this paper, we seek to further compress the model space by introducing an approximate measure of behavioral equivalence and using it to group models.
Categories and Subject Descriptors: I.2.11 [Distributed Artificial Intelligence]: Multiagent systems

Citation Context

...ssibly similar ways. Previous I-DID solutions, including both exact and approximate ones, mainly exploit the concept of BE to reduce the dimensionality of the state space. For example, Doshi and Zeng [4] minimize the model space by updating only those models that lead to behaviorally distinct models at the next time step. While this approach speeds up solutions of I-DIDs considerably and is the state...

Approximate Model Equivalence for . . .

by Muthukumaran Chandrasekaran

Cooperating with Unknown Teammates in Complex Domains: A Robot Soccer Case Study of Ad Hoc Teamwork

by Samuel Barrett, Peter Stone
Many scenarios require that robots work together as a team in order to effectively accomplish their tasks. However, pre-coordinating these teams may not always be possible given the growing number of companies and research labs creating these robots. Therefore, it is desirable for robots to be able to reason about ad hoc teamwork and adapt to new teammates on the fly. Past research on ad hoc teamwork has focused on relatively simple domains, but this paper demonstrates that agents can reason about ad hoc teamwork in complex scenarios. To handle these complex scenarios, we introduce a new algorithm, PLASTIC–Policy, that builds on an existing ad hoc teamwork approach. Specifically, PLASTIC–Policy learns policies to cooperate with past teammates and reuses these policies to quickly adapt to new teammates. This approach is tested in the 2D simulation soccer league of RoboCup using the half field offense task.

Cooperating with Unknown Teammates in Robot Soccer

by Samuel Barrett, Peter Stone, 2014
Many scenarios require that robots work together as a team in order to effectively accomplish their tasks. However, pre-coordinating these teams may not always be possible given the growing number of companies and research labs creating these robots. Therefore, it is desirable for robots to be able to reason about ad hoc teamwork and adapt to new teammates on the fly. This paper adopts an approach of learning policies to cooperate with past teammates and reusing these policies to quickly adapt to the new teammates. This approach is applied to the complex domain of robot soccer in the form of half field offense in the RoboCup simulated 2D league. This paper represents a preliminary investigation into this domain and presents a promising approach for tackling this problem.

Citation Context

...os and Ramamoorthy explore modeling adversaries in the RoboCup domain in [14]. Another approach is to explicitly model and reason about other agents’ beliefs such as the work on I-POMDPs [15], I-DIDs [16], and NIDs [17]. However, modeling other agents’ beliefs greatly expands the planning space, and these approaches do not currently scale to larger problems. 3 Problem Description In this paper, we con...
