Forward models: Supervised learning with a distal teacher
Cognitive Science, 1992
Cited by 410 (8 self)
Internal models of the environment have an important role to play in adaptive systems in general and are of particular importance for the supervised learning paradigm. In this paper we demonstrate that certain classical problems associated with the notion of the "teacher" in supervised learning can be solved by judicious use of learned internal models as components of the adaptive system. In particular, we show how supervised learning algorithms can be utilized in cases in which an unknown dynamical system intervenes between actions and desired outcomes. Our approach applies to any supervised learning algorithm that is capable of learning in multilayer networks.
Efficient Exploration In Reinforcement Learning
1992
Cited by 149 (3 self)
Exploration plays a fundamental role in any active learning system. This study evaluates the role of exploration in active learning and describes several local techniques for exploration in finite, discrete domains, embedded in a reinforcement learning framework (delayed reinforcement). This paper distinguishes between two families of exploration schemes: undirected and directed exploration. While the former family is closely related to random walk exploration, directed exploration techniques memorize exploration-specific knowledge which is used for guiding the exploration search. In many finite deterministic domains, any learning technique based on undirected exploration is inefficient in terms of learning time, i.e. learning time is expected to scale exponentially with the size of the state space (Whitehead, 1991b). We prove that for all these domains, reinforcement learning using a directed technique can always be performed in polynomial time, demonstrating the important role of e...
Learning Maps for Indoor Mobile Robot Navigation
Artificial Intelligence (accepted for publication), 1997
Cited by 91 (10 self)
Autonomous robots must be able to learn and maintain models of their environments. Research on mobile robot navigation has produced two major paradigms for mapping indoor environments: grid-based and topological. While grid-based methods produce accurate metric maps, their complexity often prohibits efficient planning and problem solving in large-scale indoor environments. Topological maps, on the other hand, can be used much more efficiently, yet accurate and consistent topological maps are often difficult to learn and maintain in large-scale environments, particularly if momentary sensor data is highly ambiguous. This paper describes an approach that integrates both paradigms: grid-based and topological. Grid-based maps are learned using artificial neural networks and naive Bayesian integration. Topological maps are generated on top of the grid-based maps, by partitioning the latter into coherent regions. By combining both paradigms, the approach presented here gains advantages from both worlds: accuracy/consistency and efficiency. The paper gives results for autonomous exploration, mapping and operation of a mobile robot in populated multi-room environments.
Lifelong Robot Learning
Robotics and Autonomous Systems, 1993
Cited by 78 (4 self)
Learning provides a useful tool for the automatic design of autonomous robots. Recent research on learning robot control has predominantly focussed on learning single tasks that were studied in isolation. If robots encounter a multitude of control learning tasks over their entire lifetime, however, there is an opportunity to transfer knowledge between them. In order to do so, robots may learn the invariants of the individual tasks and environments. This task-independent knowledge can be employed to bias generalization when learning control, which reduces the need for real-world experimentation. We argue that knowledge transfer is essential if robots are to learn control with moderate learning times in complex scenarios. Two approaches to lifelong robot learning which both capture invariant knowledge about the robot and its environments are presented. Both approaches have been evaluated using a HERO-2000 mobile robot. Learning tasks included navigation in unknown indoor environments an...
Active Exploration in Dynamic Environments
1992
Cited by 71 (4 self)
Whenever an agent learns to control an unknown environment, two opposing principles have to be combined, namely: exploration (long-term optimization) and exploitation (short-term optimization). Many real-valued connectionist approaches to learning control realize exploration by randomness in action selection. This might be disadvantageous when costs are assigned to "negative experiences". The basic idea presented in this paper is to make an agent explore unknown regions in a more directed manner. This is achieved by a so-called competence map, which is trained to predict the controller's accuracy and is used for guiding exploration. Based on this, a bistable system enables smooth switching of attention between two behaviors, exploration and exploitation, depending on expected costs and knowledge gain. The appropriateness of this method is demonstrated by a simple robot navigation task.
Hidden State and Reinforcement Learning with Instance-Based State Identification
IEEE Transactions on Systems, Man, and Cybernetics
Cited by 40 (1 self)
Real robots with real sensors are not omniscient. When a robot's next course of action depends on information that is hidden from the sensors because of problems such as occlusion, restricted range, bounded field of view and limited attention, we say the robot suffers from the hidden state problem. State identification techniques use history information to uncover hidden state. Some previous approaches to encoding history include: finite state machines [12, 28], recurrent neural networks [25] and genetic programming with indexed memory [49]. A chief disadvantage of all these techniques is their long training time. This paper presents instance-based state identification, a new approach to reinforcement learning with state identification that learns with many fewer training steps. Noting that learning with history and learning in continuous spaces both share the property that they begin without knowing the granularity of the state space, the approach applies instance-based (or "memory-ba...
A Unified Gradient-Descent/Clustering Architecture for Finite State Machine Induction
NIPS, 1994
Cited by 32 (0 self)
Although recurrent neural nets have been moderately successful in learning to emulate finite-state machines (FSMs), the continuous internal state dynamics of a neural net are not well matched to the discrete behavior of an FSM. We describe an architecture, called DOLCE, that allows discrete states to evolve in a net as learning progresses. DOLCE consists of a standard recurrent neural net trained by gradient descent and an adaptive clustering technique that quantizes the state space. DOLCE is based on the assumption that a finite set of discrete internal states is required for the task, and that the actual network state belongs to this set but has been corrupted by noise due to inaccuracy in the weights. DOLCE learns to recover the discrete state with maximum a posteriori probability from the noisy state. Simulations show that DOLCE leads to a significant improvement in generalization performance over earlier neural net approaches to FSM induction.
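The idea of recovering a discrete state with maximum a posteriori probability from a noisy state can be illustrated with a minimal sketch. This is not the DOLCE implementation: it assumes fixed cluster centres, known prior probabilities and isotropic Gaussian noise with a shared standard deviation, and the names `map_state`, `centres`, `priors` and `sigma` are hypothetical. In the architecture described above, such quantities would be adapted during learning rather than given.

```python
import math

def map_state(noisy_state, centres, priors, sigma):
    """Return (index, posteriors) of the most probable discrete state.

    Assumes each discrete state k emits centres[k] plus isotropic
    Gaussian noise of standard deviation sigma (an illustrative model).
    """
    log_scores = []
    for centre, prior in zip(centres, priors):
        sq_dist = sum((h - c) ** 2 for h, c in zip(noisy_state, centre))
        # log prior + log likelihood, dropping terms shared by all states
        log_scores.append(math.log(prior) - sq_dist / (2.0 * sigma ** 2))
    # Normalise in log space for numerical stability.
    top = max(log_scores)
    weights = [math.exp(s - top) for s in log_scores]
    total = sum(weights)
    posteriors = [w / total for w in weights]
    best = max(range(len(posteriors)), key=posteriors.__getitem__)
    return best, posteriors

# Three hypothetical discrete states in a two-unit hidden layer.
centres = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
priors = [1 / 3, 1 / 3, 1 / 3]
best, post = map_state((0.9, 0.8), centres, priors, sigma=0.3)
```

Here the noisy state (0.9, 0.8) lies closest to the second centre, so with equal priors the MAP estimate is index 1. The posterior vector could also serve as a soft assignment if a differentiable quantization were wanted.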
Learning a Class of Large Finite State Machines with a Recurrent Neural Network
1995
Cited by 22 (11 self)
One of the issues in any learning model is how it scales with problem size. The problem of learning finite state machines (FSMs) from examples with recurrent neural networks has been extensively explored. However, these results are somewhat disappointing in the sense that the machines that can be learned are too small to be competitive with existing grammatical inference algorithms. We show that a type of recurrent neural network (Narendra & Parthasarathy, 1990, IEEE Trans. Neural Networks, 1, 427) which has feedback but no hidden state neurons can learn a special type of FSM called a finite memory machine (FMM) under certain constraints. These machines have a large number of states (simulations are for 256 and 512 state FMMs) but have minimal order, relatively small depth and little logic when the FMM is implemented as a sequential machine,
The Neural Network Pushdown Automaton: Model, Stack and Learning Simulations
1993
Cited by 18 (2 self)
In order for neural networks to learn complex languages or grammars, they must have sufficient computational power or resources to recognize or generate such languages. Though many approaches to effectively utilizing the computational power of neural networks have been discussed, an obvious one is to couple a recurrent neural network with an external stack memory, in effect creating a neural network pushdown automaton (NNPDA). This NNPDA generalizes the concept of a recurrent network so that the network becomes a more complex computing structure. This paper discusses an NNPDA in detail: its construction, how it can be trained, and how useful symbolic information can be extracted from the trained network. To effectively couple the external stack to the neural network, an optimization method is developed which uses an error function that connects the learning of the state automaton of the neural network to the learning of the operation of the external stack: push, pop, and no-operation. To minimize the error function using gradient descent learning, an analog stack is designed such that the action and storage of information in the stack are continuous. One interpretation of a continuous stack is the probabilistic storage of and action on data. After training on sample strings of an unknown source grammar, a quantization procedure extracts from the analog stack and neural network a discrete pushdown automaton (PDA). Simulations show that in learning deterministic context-free grammars (the balanced parenthesis language, 1^n 0^n, and the deterministic palindrome), the extracted PDA is correct in the sense that it can correctly recognize unseen strings of arbitrary length. In addition, the extracted PDAs can be shown to be identical or equivalent to the PDAs of the source grammars which were used to generate the training strings.
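To make the notion of a continuous stack more concrete, the following is a minimal sketch of one possible design, not the paper's implementation. It assumes that each stored symbol carries a real-valued thickness, that a single action value in [-1, 1] selects push (positive), pop (negative) or no-operation (zero), and that reading returns the symbol mix in the top 1.0 units of thickness. The class `ContinuousStack` and its method names are hypothetical.

```python
class ContinuousStack:
    """Stack whose entries carry a real-valued thickness (illustrative)."""

    def __init__(self):
        self.items = []  # list of [symbol, thickness], top of stack last

    def push(self, symbol, amount):
        if amount > 0.0:
            self.items.append([symbol, amount])

    def pop(self, amount):
        # Remove `amount` of thickness from the top, possibly spanning entries.
        while amount > 1e-12 and self.items:
            top = self.items[-1]
            if top[1] <= amount:
                amount -= top[1]
                self.items.pop()
            else:
                top[1] -= amount
                amount = 0.0

    def read(self, depth=1.0):
        # Return the amount of each symbol found in the top `depth` units.
        reading = {}
        remaining = depth
        for symbol, thickness in reversed(self.items):
            if remaining <= 0.0:
                break
            taken = min(thickness, remaining)
            reading[symbol] = reading.get(symbol, 0.0) + taken
            remaining -= taken
        return reading

    def act(self, symbol, action):
        # Positive action pushes, negative pops, zero is a no-operation.
        if action > 0.0:
            self.push(symbol, action)
        elif action < 0.0:
            self.pop(-action)

stack = ContinuousStack()
stack.act("a", 0.8)      # push 0.8 of "a"
stack.act("b", 0.6)      # push 0.6 of "b"
first = stack.read()     # top unit holds 0.6 of "b" and 0.4 of "a"
stack.act("a", -0.7)     # pop 0.7: all of "b" and 0.1 of "a"
second = stack.read()    # only about 0.7 of "a" remains
```

Because every operation varies smoothly with the action value, a design of this kind is compatible with gradient-based training; when actions are restricted to exactly -1, 0 and 1 it behaves like an ordinary discrete stack.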