Online planning for ad hoc autonomous agent teams
- In Proceedings of the 22nd International Joint Conference on Artificial Intelligence, 2011
Cited by 2 (0 self)
We propose a novel online planning algorithm for ad hoc team settings: challenging situations in which an agent must collaborate with unknown teammates without prior coordination. Our approach is based on constructing and solving a series of stage games, and then using biased adaptive play to choose actions. The utility function in each stage game is estimated via Monte-Carlo tree search using the UCT algorithm. We establish analytically the convergence of the algorithm and show that it performs well in a variety of ad hoc team domains.
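UCT, the tree-search method named in this abstract, chooses which action to sample next using the UCB1 rule: the mean return of an action plus an exploration bonus that shrinks as the action is tried more often. The following is a minimal sketch of that selection step only, not the paper's planner; the function name `ucb1_select`, the list-based statistics, and the default exploration constant `c` are assumptions made for illustration.

```python
import math


def ucb1_select(counts, values, c=math.sqrt(2)):
    """Return the index of the action with the highest UCB1 score.

    counts[a]: how many times action a has been tried at this node.
    values[a]: mean return observed so far for action a.
    c: exploration constant (assumed default, not taken from the paper).
    """
    # Try each untried action once before applying the formula,
    # which would otherwise divide by zero.
    for action, n in enumerate(counts):
        if n == 0:
            return action

    total = sum(counts)
    scores = [
        values[a] + c * math.sqrt(math.log(total) / counts[a])
        for a in range(len(counts))
    ]
    # max() over indices returns the first index on ties.
    return max(range(len(scores)), key=scores.__getitem__)
```

A larger `c` favors rarely tried actions; a smaller `c` favors the action with the best mean so far.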
Incorporating Reviewer and Product Information for Review Rating Prediction
- In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence
Traditional sentiment analysis mainly considers binary classification of reviews, but in many real-world sentiment classification problems, nonbinary review ratings are more useful. This is especially true when consumers wish to compare two products, neither of which is negative. Previous work has addressed this problem by extracting various features from the review text for learning a predictor. Since the same word may have different sentiment effects when used by different reviewers on different products, we argue that it is necessary to model such reviewer- and product-dependent effects in order to predict review ratings more accurately. In this paper, we propose a novel learning framework to incorporate reviewer and product information into the text-based learner for rating prediction. The reviewer, product and text features are modeled as a three-dimensional tensor. Tensor factorization techniques can then be employed to reduce the data sparsity problem. We perform extensive experiments to demonstrate the effectiveness of our model, which achieves a significant improvement over state-of-the-art methods, especially for reviews involving unpopular products and inactive reviewers.
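To make the tensor idea concrete: one common way to factorize a three-way tensor is the CP (CANDECOMP/PARAFAC) model, in which each reviewer, product, and text feature gets a short latent vector and a tensor cell is approximated by the sum of their elementwise products. The abstract does not say which factorization the authors use, so the sketch below is an assumption for illustration only, and the names `cp_entry`, `reviewers`, `products`, and `terms` are hypothetical.

```python
def cp_entry(reviewers, products, terms, i, j, k):
    """Approximate tensor cell (i, j, k) under a rank-R CP model.

    reviewers[i], products[j], terms[k] are latent vectors of equal
    length R (hypothetical factor matrices stored as lists of lists).
    """
    return sum(
        r * p * t
        for r, p, t in zip(reviewers[i], products[j], terms[k])
    )


# Tiny rank-2 example with one reviewer, one product, one text feature.
reviewers = [[1.0, 2.0]]
products = [[3.0, 4.0]]
terms = [[5.0, 6.0]]
value = cp_entry(reviewers, products, terms, 0, 0, 0)
```

Because every cell is rebuilt from shared latent vectors, a reviewer or product with few observed reviews can still borrow strength from the rest of the data, which is the sparsity benefit the abstract refers to.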
Representation, Planning, and Learning of Dynamic Ad Hoc Robot Teams
- 2011
Task allocation involves the division of tasks among a team of robots, such that each robot is responsible for a subset of the tasks. Similarly, in role assignment, roles are typically defined to be performed by a single robot whose performance is independent of the composition of its team. Complex tasks that cannot be subdivided and that require multiple cooperating robots call for the formation of a coalition of robots. We are interested in forming an effective ad hoc team to solve a task, through observations of the robots' performance in the task and modeling the synergistic effects among robots in the team. Ad hoc teams are common in sports such as soccer, where human players without prior interactions form a team and are capable of playing the game. Currently, while robots within a team can play soccer well, they are unable to form ad hoc teams with robots developed by multiple research groups. This general problem is also seen in urban search-and-rescue (USAR), where large groups of heterogeneous robots would be deployed to solve complex tasks such as putting out fires, rescuing people and clearing road blockages. This thesis represents team performance as a function of the individual capabilities of the
Leading Ad Hoc Agents in Joint Action Settings with Multiple Teammates
The growing use of autonomous agents in practice may require agents to cooperate as a team in situations where they have limited prior knowledge about one another, cannot communicate directly, or do not share the same world models. These situations raise the need to design ad hoc team members, i.e., agents that will be able to cooperate without coordination in order to reach an optimal team behavior. This paper considers the problem of an agent leading an N-agent team toward its optimal joint utility, where the agents compute their next actions based only on their most recent observations of their teammates' actions. We show that, compared to previous results for two-agent teams, in larger teams the agent might not be able to lead the team to the action with maximal joint utility; its optimal strategy is then to lead the team to the best reachable cycle of joint actions. We describe a graphical model of the problem and a polynomial-time algorithm for solving it. We then consider other variations of the problem, including leading teams of agents that base their actions on a longer history of past observations, leading a team with more than one ad hoc agent, and leading a teammate while the ad hoc agent is uncertain of its behavior.
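The teammate model described here (choosing the next action from the most recent observation only) can be illustrated with a best-response rule: each teammate assumes everyone else repeats their last action and picks the action that maximizes the joint utility. The abstract does not state that the teammates are best-response agents, so that is an assumption of this sketch, as are the names `best_response` and `payoff` and the toy payoff table.

```python
def best_response(payoff, my_index, last_joint_action, actions):
    """Pick the action maximizing joint utility, assuming every other
    agent repeats the action it took in last_joint_action.

    payoff: dict mapping a joint-action tuple to the team's utility.
    """
    def utility(action):
        joint = list(last_joint_action)
        joint[my_index] = action
        return payoff[tuple(joint)]

    # max() returns the first maximizing action on ties.
    return max(actions, key=utility)


# Toy two-agent game: index 0 is the ad hoc agent, index 1 the teammate.
payoff = {(0, 0): 5, (0, 1): 1, (1, 0): 0, (1, 1): 8}

# After joint action (0, 0) the teammate stays at action 0 ...
stay = best_response(payoff, 1, (0, 0), [0, 1])
# ... but once the ad hoc agent has played 1, the teammate follows to 1.
follow = best_response(payoff, 1, (1, 0), [0, 1])
```

In this toy game the ad hoc agent accepts a temporarily low payoff at (1, 0) in order to pull the teammate to the better joint action (1, 1), which is the kind of leading the abstract describes.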
U N I V E R S I
Multiagent Learning (MAL) is the algorithmic study of learning in a group of two or more agents. If the agents are based on different algorithms, and if there is no form of prior coordination between the agents, then this is called an ad hoc team problem [48]. Following a literature review [2] and a research proposal [1], the work at hand compares the performance of five MAL algorithms in ad hoc teams. These include the Joint Action Learner [12], the Conditional Joint Action Learner [3], Win or Learn Fast with Policy Hill Climbing [6], Modified Regret-Matching [21], and the Nash Q-Learner [24]. The algorithms are evaluated in a range of strategic games, including no-conflict games in which the players agree on what is most preferred, and conflict games in which the players disagree on what is most preferred [40]. In addition, we use an evaluation procedure proposed by Stone et al. [48]. Our performance criteria include the convergence rate, the final expected payoff, social welfare and fairness, and the rates of different solution types. From the results we conclude that (a) all algorithms perform well in some sense (i.e., there is no clear winner), and (b) the performance of an algorithm ultimately

