Reinforcement learning for RoboCup-soccer keepaway
- Adaptive Behavior
, 2005
Cited by 85 (31 self)
RoboCup simulated soccer presents many challenges to reinforcement learning methods, including a large state space, hidden and uncertain state, multiple independent agents learning simultaneously, and long and variable delays in the effects of actions. We describe our application of episodic SMDP Sarsa(λ) with linear tile-coding function approximation and variable λ to learning higher-level decisions in a keepaway subtask of RoboCup soccer. In keepaway, one team, “the keepers,” tries to keep control of the ball for as long as possible despite the efforts of “the takers.” The keepers learn individually when to hold the ball and when to pass to a teammate. Our agents learned policies that significantly outperform a range of benchmark policies. We demonstrate the generality of our approach by applying it to a number of task variations including different field sizes and different numbers of players on each team.
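As a rough illustration of the kind of learner the abstract above names, here is a minimal sketch of episodic Sarsa(λ) with linear tile-coding function approximation. It is not the authors' keepaway implementation: it omits the SMDP treatment of variable-duration actions and the variable λ, and it runs on a hypothetical one-dimensional corridor task invented for this example. The constants, the function names (`active_tiles`, `q`, `choose`, `train`) and all parameter values are assumptions.

```python
import random

N_TILINGS = 4                         # overlapping grids over the state variable
TILES = 8                             # tiles per tiling across [0, 1]
N_FEATURES = N_TILINGS * (TILES + 1)  # +1: shifted tilings spill past the edge
LEFT, RIGHT = 0, 1
STEP = 0.1                            # distance moved per action (toy value)


def active_tiles(x):
    """Return the index of the single active tile in each tiling for x in [0, 1]."""
    width = 1.0 / TILES
    tiles = []
    for t in range(N_TILINGS):
        offset = t * width / N_TILINGS              # each tiling is shifted a little
        idx = min(int((x + offset) / width), TILES)
        tiles.append(t * (TILES + 1) + idx)
    return tiles


def q(w, x, a):
    """Linear action value: the sum of the weights of the active tiles."""
    return sum(w[a][i] for i in active_tiles(x))


def choose(w, x, eps, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < eps:
        return rng.choice((LEFT, RIGHT))
    q_left, q_right = q(w, x, LEFT), q(w, x, RIGHT)
    if q_left == q_right:
        return rng.choice((LEFT, RIGHT))
    return LEFT if q_left > q_right else RIGHT


def train(episodes=500, alpha=0.1, gamma=1.0, lam=0.8, eps=0.1, seed=0):
    """Learn weights on the toy corridor: start at x=0, reward -1 per step, goal at x=1."""
    rng = random.Random(seed)
    w = [[0.0] * N_FEATURES for _ in range(2)]
    for _episode in range(episodes):
        z = [[0.0] * N_FEATURES for _ in range(2)]   # eligibility traces
        x = 0.0
        a = choose(w, x, eps, rng)
        for _step in range(1000):                     # safety cap on episode length
            x2 = min(1.0, max(0.0, x + (STEP if a == RIGHT else -STEP)))
            done = x2 >= 1.0
            for i in active_tiles(x):                 # replacing traces
                z[a][i] = 1.0
            delta = -1.0 - q(w, x, a)                 # TD error, reward is -1
            if not done:
                a2 = choose(w, x2, eps, rng)
                delta += gamma * q(w, x2, a2)
            for b in (LEFT, RIGHT):
                for i in range(N_FEATURES):
                    w[b][i] += (alpha / N_TILINGS) * delta * z[b][i]
                    z[b][i] *= gamma * lam            # decay every trace
            if done:
                break
            x, a = x2, a2
    return w
```

Calling `w = train()` and then comparing `q(w, 0.5, RIGHT)` with `q(w, 0.5, LEFT)` shows which action the learned weights prefer in the middle of the corridor.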
Keepaway soccer: From machine learning testbed to benchmark
- RoboCup-2005: Robot Soccer World Cup IX
, 2006
Cited by 39 (19 self)
Keepaway soccer has been previously put forth as a testbed for machine learning. Although multiple researchers have used it successfully for machine learning experiments, doing so has required a good deal of domain expertise. This paper introduces a set of programs, tools, and resources designed to make the domain easily usable for experimentation without any prior knowledge of RoboCup or the Soccer Server. In addition, we report on new experiments in the Keepaway domain, along with performance results designed to be directly comparable with future experimental results. Combined, the new infrastructure and our concrete demonstration of its use in comparative experiments elevate the domain to a machine learning benchmark, suitable for use by researchers across the field.
Transfer learning via inter-task mappings for temporal difference learning
- Journal of Machine Learning Research
Cited by 22 (9 self)
Temporal difference (TD) learning (Sutton and Barto, 1998) has become a popular reinforcement learning technique in recent years. TD methods, relying on function approximators to generalize learning to novel situations, have had some experimental successes and have been shown to exhibit some desirable properties in theory, but the most basic algorithms have often been found slow in practice. This empirical result has motivated the development of many methods that speed up reinforcement learning by modifying a task for the learner or helping the learner better generalize to novel situations. This article focuses on generalizing across tasks, thereby speeding up learning, via a novel form of transfer using handcoded task relationships. We compare learning on a complex task with three function approximators, a cerebellar model arithmetic computer (CMAC), an artificial neural network (ANN), and a radial basis function (RBF), and empirically demonstrate that directly transferring the action-value function can lead to a dramatic speedup in learning with all three. Using transfer via inter-task mapping (TVITM), agents are able to learn one task and then markedly reduce the time it takes to learn a more complex task. Our algorithms are fully implemented and tested in the RoboCup soccer Keepaway domain. This article contains and extends material published in two conference papers (Taylor and Stone, 2005; Taylor et al., 2005).
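The idea of transferring an action-value function through handcoded task relationships can be pictured with a small sketch. This is an illustration under stated assumptions, not the TVITM algorithm as published: it assumes weights are stored per (action, state variable) pair as a dictionary of tile weights, and the action names, state-variable names, and both mappings below are hypothetical placeholders. The target task's weights are initialised by copying the source task's weights through a handcoded action mapping and state-variable mapping.

```python
def transfer_weights(source_w, var_map, action_map):
    """Initialise target-task weights from source-task weights.

    source_w   : {(source_action, source_variable): {tile_index: weight}}
    var_map    : {target_variable: source_variable}   (handcoded, hypothetical)
    action_map : {target_action: source_action}       (handcoded, hypothetical)
    """
    target_w = {}
    for t_action, s_action in action_map.items():
        for t_var, s_var in var_map.items():
            src = source_w.get((s_action, s_var))
            if src is not None:
                # Copy so later learning in the target task leaves the source intact.
                target_w[(t_action, t_var)] = dict(src)
    return target_w


# Toy source-task weights (made-up numbers, for illustration only).
source_w = {
    ("hold", "dist_k1_t1"): {0: 0.5, 3: -0.2},
    ("pass_2", "dist_k1_k3"): {1: 0.7},
}

# The larger target task has one extra action and one extra state variable;
# each is mapped onto the most similar element of the source task.
action_map = {"hold": "hold", "pass_2": "pass_2", "pass_3": "pass_2"}
var_map = {
    "dist_k1_t1": "dist_k1_t1",
    "dist_k1_k3": "dist_k1_k3",
    "dist_k1_k4": "dist_k1_k3",
}

target_w = transfer_weights(source_w, var_map, action_map)
```

Learning in the target task would then start from `target_w` instead of from zero-initialised weights; pairs with no counterpart in the source task are simply absent and would start at their default value.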
Learning to sportscast: A test of grounded language acquisition
- In Proceedings of the 25th International Conference on Machine Learning (ICML-2008)
, 2008
Cited by 19 (5 self)
We present a novel commentator system that learns language from sportscasts of simulated soccer games. The system learns to parse and generate commentaries without any engineered knowledge about the English language. Training is done using only ambiguous supervision in the form of textual human commentaries and simulation states of the soccer games. The system simultaneously tries to establish correspondences between the commentaries and the simulation states as well as build a translation model. We also present a novel algorithm, Iterative Generation Strategy Learning (IGSL), for deciding which events to comment on. Human evaluations of the generated commentaries indicate they are of reasonable quality compared to human commentaries.
Know thine enemy: A champion RoboCup coach agent
- In Proceedings of the Twenty-First National Conference on Artificial Intelligence
, 2006
Cited by 10 (1 self)
In a team-based multiagent system, the ability to construct a model of an opponent team’s joint behavior can be useful for determining an agent’s expected distribution over future world states, and thus can inform its planning of future actions. This paper presents an approach to team opponent modeling in the context of the RoboCup simulation coach competition. Specifically, it introduces an autonomous coach agent capable of analyzing past games of the current opponent, advising its own team how to play against this opponent, and identifying patterns or weaknesses on the part of the opponent. Our approach is fully implemented and tested within the RoboCup soccer server, and was the champion of the RoboCup 2005 simulation coach competition.
The UT Austin Villa 2003 champion simulator coach: A machine learning approach
- RoboCup-2004: Robot Soccer World Cup VIII
, 2005
Cited by 10 (2 self)
The UT Austin Villa 2003 simulated online soccer coach was a first-time entry in the RoboCup Coach Competition. In developing the coach, the main research focus was placed on treating advice-giving as a machine learning problem. Competing against a field of mostly handcoded coaches, the UT Austin Villa coach earned first place in the competition. In this paper, we present the multi-faceted learning strategy that our coach used and examine which aspects contributed most to the coach’s success.
Training a Multilingual Sportscaster: Using Perceptual Context to Learn Language
- Journal of Artificial Intelligence Research
, 2010
Cited by 9 (3 self)
We present a novel framework for learning to interpret and generate language using only perceptual context as supervision. We demonstrate its capabilities by developing a system that learns to sportscast simulated robot soccer games in both English and Korean without any language-specific prior knowledge. Training employs only ambiguous supervision consisting of a stream of descriptive textual comments and a sequence of events extracted from the simulation trace. The system simultaneously establishes correspondences between individual comments and the events that they describe while building a translation model that supports both parsing and generation. We also present a novel algorithm for learning which events are worth describing. Human evaluations of the generated commentaries indicate they are of reasonable quality and in some cases even on par with those produced by humans for our limited domain.
Learning for semantic parsing using statistical machine translation techniques. Doctoral Dissertation Proposal
, 2005
Cited by 7 (1 self)
Semantic parsing is the construction of a complete, formal, symbolic meaning representation of a sentence. While it is crucial to natural language understanding, the problem of semantic parsing has received relatively little attention from the machine learning community. Recent work on natural language understanding has mainly focused on shallow semantic analysis, such as word-sense disambiguation and semantic role labeling. Semantic parsing, on the other hand, involves deep semantic analysis in which word senses, semantic roles and other components are combined to produce useful meaning representations for a particular application domain (e.g. database query). Prior research in machine learning for semantic parsing is mainly based on inductive logic programming or deterministic parsing, which lack some of the robustness that characterizes statistical learning. Existing statistical approaches to semantic parsing, however, are mostly concerned with relatively simple application domains in which a meaning representation is no more than a single semantic frame. In this proposal, we present a novel statistical approach to semantic parsing, WASP, which can handle meaning representations with a nested structure. The WASP algorithm learns a semantic parser given a set of sentences annotated with their correct meaning representations. The parsing model is based on the
Learning for semantic parsing
- In Computational Linguistics and Intelligent Text Processing: Proceedings of the 8th International Conference
, 2007
Cited by 7 (1 self)
Semantic parsing is the task of mapping a natural language sentence into a complete, formal meaning representation. Over the past decade, we have developed a number of machine learning methods for inducing semantic parsers by training on a corpus of sentences paired with their meaning representations in a specified formal language. We have demonstrated these methods on the automated construction of natural-language interfaces to databases and robot command languages. This paper reviews our prior work on this topic and discusses directions for future research.
The Champion UT Austin Villa 2003 Simulator Online Coach Team
- RoboCup-2003: Robot Soccer World Cup VII
, 2003
Cited by 6 (2 self)
The UT Austin Villa 2003 simulated online soccer coach was a first-time entry in the RoboCup Coach Competition. In developing the coach, the main research focus was placed on treating advice-giving as a learning problem. The coach learns to predict agent behavior from past observations and automatically generates advice to improve its team's performance. Using this approach, the UT Austin Villa coach earned first place in this year's competition.

