Shortlist: a connectionist model of continuous speech recognition
Cognition, 1994
Cited by 259 (9 self)
Previous work has shown how a backpropagation network with recurrent connections can successfully model many aspects of human spoken word recognition (Norris, 1988, 1990, 1992, 1993). However, such networks are unable to revise their decisions in the light of subsequent context. TRACE (McClelland & Elman, 1986), on the other hand, manages to deal appropriately with following context, but only by using a highly implausible architecture that fails to account for some important experimental results. A new model is presented which displays the more desirable properties of each of these models. In contrast to TRACE, the new model is entirely bottom-up and can readily perform simulations with vocabularies of tens of thousands of words.
Collaborative Multi-Robot Exploration
2000
Cited by 254 (31 self)
In this paper we consider the problem of exploring an unknown environment by a team of robots. As in single-robot exploration, the goal is to minimize the overall exploration time. The key problem to be solved is therefore to choose appropriate target points for the individual robots so that they simultaneously explore different regions of their environment. We present a probabilistic approach for the coordination of multiple robots which, in contrast to previous approaches, simultaneously takes into account the costs of reaching a target point and the utility of target points. The utility of target points is given by the size of the unexplored area that a robot can cover with its sensors upon reaching a target position. Whenever a target point is assigned to a specific robot, the utility of the unexplored area visible from this target position is reduced for the other robots. This way, a team of multiple robots assigns different target points to the individual robots. The technique has...
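The coordination idea in this abstract (trade off travel cost against target utility, and discount the utility of targets near one that has already been assigned) can be illustrated with a minimal greedy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function name `assign_targets`, Euclidean distance as the travel cost, the weight `beta`, and the linear discount within a hypothetical `sensor_range`.

```python
import math

def assign_targets(robots, targets, sensor_range=2.0, beta=0.1):
    """Greedy utility-minus-cost assignment (illustrative assumptions only).

    robots:  dict mapping robot name -> (x, y) position
    targets: list of candidate (x, y) target points
    """
    utility = {t: 1.0 for t in targets}   # assumed uniform initial utility
    assignment = {}
    unassigned = set(robots)
    while unassigned:
        # Pick the robot/target pair with the best utility-minus-cost score.
        _, name, target = max(
            (utility[t] - beta * math.dist(robots[r], t), r, t)
            for r in sorted(unassigned)
            for t in targets
        )
        assignment[name] = target
        unassigned.remove(name)
        # Reduce the utility of targets close to the one just assigned,
        # since their surroundings are likely covered by the same sensors.
        for t in targets:
            d = math.dist(t, target)
            if d < sensor_range:
                utility[t] *= d / sensor_range
    return assignment

robots = {"a": (0.0, 0.0), "b": (0.0, 1.0)}
targets = [(5.0, 0.0), (5.0, 0.5), (0.0, 8.0)]
print(assign_targets(robots, targets))
```

In this toy setup the two nearby targets at x = 5 would both be attractive on their own, but once robot `a` takes one of them the discount pushes robot `b` toward the distant third target.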
The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces
Machine Learning, 1995
Cited by 249 (8 self)
Parti-game is a new algorithm for learning feasible trajectories to goal regions in high-dimensional continuous state-spaces. In high dimensions it is essential that learning does not plan uniformly over a state-space. Parti-game maintains a decision-tree partitioning of state-space and applies techniques from game theory and computational geometry to efficiently and adaptively concentrate high resolution only on critical areas. The current version of the algorithm is designed to find feasible paths or trajectories to goal regions in high-dimensional spaces. Future versions will be designed to find a solution that optimizes a real-valued criterion. Many simulated problems have been tested, ranging from two-dimensional to nine-dimensional state-spaces, including mazes, path planning, nonlinear dynamics, and planar snake robots in restricted spaces. In all cases, a good solution is found in fewer than ten trials and a few minutes.
Multiagent Learning Using a Variable Learning Rate
Artificial Intelligence, 2002
Cited by 218 (9 self)
Learning to act in a multiagent environment is a difficult problem since the normal definition of an optimal policy no longer applies. The optimal policy at any moment depends on the policies of the other agents and so creates a situation of learning a moving target. Previous learning algorithms have one of two shortcomings depending on their approach. They either converge to a policy that may not be optimal against the specific opponents' policies, or they may not converge at all. In this article we examine this learning problem in the framework of stochastic games. We look at a number of previous learning algorithms showing how they fail at one of the above criteria. We then contribute a new reinforcement learning technique using a variable learning rate to overcome these shortcomings. Specifically, we introduce the WoLF principle, "Win or Learn Fast", for varying the learning rate. We examine this technique theoretically, proving convergence in self-play on a restricted class of iterated matrix games. We also present empirical results on a variety of more general stochastic games, in situations of self-play and otherwise, demonstrating the wide applicability of this method.
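To make the "Win or Learn Fast" idea concrete, the following is a minimal sketch of one policy hill-climbing update for a single-state (matrix) game: the agent adjusts its mixed policy slowly when it appears to be winning and quickly when it appears to be losing. The abstract does not specify these details, so the function name `wolf_phc_update`, the "winning" test (current policy versus running-average policy under the current value estimates), and the step sizes `delta_win` and `delta_lose` are assumptions for illustration.

```python
def wolf_phc_update(q, policy, avg_policy, count, action, reward,
                    alpha=0.1, delta_win=0.01, delta_lose=0.04):
    """One WoLF-style update for a single-state game (illustrative sketch).

    q, policy and avg_policy are lists indexed by action and are modified
    in place. Returns the updated visit count.
    """
    n = len(q)
    # Value estimate update; a matrix game has no next-state term.
    q[action] += alpha * (reward - q[action])

    # Running average of the policies played so far.
    count += 1
    for a in range(n):
        avg_policy[a] += (policy[a] - avg_policy[a]) / count

    # "Winning" if the current policy looks better than the average policy.
    v_current = sum(p * v for p, v in zip(policy, q))
    v_average = sum(p * v for p, v in zip(avg_policy, q))
    delta = delta_win if v_current > v_average else delta_lose

    # Move probability mass toward the greedy action by at most delta.
    best = max(range(n), key=q.__getitem__)
    for a in range(n):
        if a != best:
            step = min(policy[a], delta / (n - 1))
            policy[a] -= step
            policy[best] += step
    return count

q, policy, avg_policy = [0.0, 0.0], [0.5, 0.5], [0.5, 0.5]
count = wolf_phc_update(q, policy, avg_policy, 0, action=1, reward=1.0)
print(policy)
```

The only thing that distinguishes this from plain policy hill-climbing is the choice between two step sizes, with the larger one used when the agent is not winning.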
Reinforcement Learning with Replacing Eligibility Traces
Machine Learning, 1996
Cited by 218 (13 self)
The eligibility trace is one of the basic mechanisms used in reinforcement learning to handle delayed reward. In this paper we introduce a new kind of eligibility trace, the replacing trace, analyze it theoretically, and show that it results in faster, more reliable learning than the conventional trace. Both kinds of trace assign credit to prior events according to how recently they occurred, but only the conventional trace gives greater credit to repeated events. Our analysis is for conventional and replace-trace versions of the offline TD(1) algorithm applied to undiscounted absorbing Markov chains. First, we show that these methods converge under repeated presentations of the training set to the same predictions as two well-known Monte Carlo methods. We then analyze the relative efficiency of the two Monte Carlo methods. We show that the method corresponding to conventional TD is biased, whereas the method corresponding to replace-trace TD is unbiased. In addition, we show that t...
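The difference between the two kinds of trace can be shown in a few lines. In this sketch, every trace decays by a factor of `gamma * lam` on each step; the conventional (accumulating) trace then adds 1 for the visited state, while the replacing trace resets it to 1. The function name `update_traces` and the dictionary representation are assumptions chosen for illustration.

```python
def update_traces(traces, visited, gamma=1.0, lam=0.9, replacing=True):
    """Decay all eligibility traces, then mark the visited state."""
    for state in traces:
        traces[state] *= gamma * lam
    if replacing:
        traces[visited] = 1.0                             # replacing trace
    else:
        traces[visited] = traces.get(visited, 0.0) + 1.0  # accumulating trace
    return traces

# Visit the same state twice in a row under each scheme.
replacing, accumulating = {}, {}
for _ in range(2):
    update_traces(replacing, "A", replacing=True)
    update_traces(accumulating, "A", replacing=False)
print(replacing["A"], accumulating["A"])
```

After a repeated visit the accumulating trace exceeds 1, which is the "greater credit to repeated events" mentioned in the abstract, whereas the replacing trace is capped at 1.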
Steps toward artificial intelligence
Computers and Thought, 1961
Cited by 216 (0 self)
Harvard University. The work toward attaining "artificial intelligence" is the center of considerable computer research, design, and application. The field is in its starting transient, characterized by many varied and independent efforts. Marvin Minsky has been requested to draw this work together into a coherent summary, supplement it with appropriate explanatory or theoretical non-computer information, and introduce his assessment of the state of the art. This paper emphasizes the class of activities in which a general-purpose computer, complete with a library of basic programs, is further programmed to perform operations leading to ever higher-level information processing functions such as learning and problem solving. This informative article will be of real interest to both the general Proceedings reader and the computer specialist. The Guest Editor.
SPUDD: Stochastic planning using decision diagrams
In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, 1999
Cited by 216 (20 self)
Structured methods for solving factored Markov decision processes (MDPs) with large state spaces have recently been proposed to allow dynamic programming to be applied without the need for complete state enumeration. We propose and examine a new value iteration algorithm for MDPs that uses algebraic decision diagrams (ADDs) to represent value functions and policies, assuming an ADD input representation of the MDP. Dynamic programming is implemented via ADD manipulation. We demonstrate our method on a class of large MDPs (up to 63 million states) and show that significant gains can be had when compared to tree-structured representations (with up to a thirty-fold reduction in the number of nodes required to represent optimal value functions).
Hidden Markov processes
IEEE Trans. Inform. Theory, 2002
Cited by 215 (5 self)
An overview of statistical and information-theoretic aspects of hidden Markov processes (HMPs) is presented. An HMP is a discrete-time finite-state homogeneous Markov chain observed through a discrete-time memoryless invariant channel. In recent years, the work of Baum and Petrie on finite-state finite-alphabet HMPs was expanded to HMPs with finite as well as continuous state spaces and a general alphabet. In particular, statistical properties and ergodic theorems for relative entropy densities of HMPs were developed. Consistency and asymptotic normality of the maximum-likelihood (ML) parameter estimator were proved under some mild conditions. Similar results were established for switching autoregressive processes. These processes generalize HMPs. New algorithms were developed for estimating the state, parameter, and order of an HMP, for universal coding and classification of HMPs, and for universal decoding of hidden Markov channels. These and other related topics are reviewed in this paper. Index Terms: Baum-Petrie algorithm, entropy ergodic theorems, finite-state channels, hidden Markov models, identifiability, Kalman filter, maximum-likelihood (ML) estimation, order estimation, recursive parameter estimation, switching autoregressive processes, Ziv inequality.
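For the finite-state, finite-alphabet case described above, the likelihood of an observation sequence can be computed with the standard forward recursion. The sketch below is a plain-Python illustration, not code from the paper; the function name `forward_likelihood` and the example parameters are assumptions. Here `init[i]` is the initial state probability, `trans[i][j]` the transition probability from state i to state j, and `emit[i][k]` the probability that the memoryless channel outputs symbol k in state i.

```python
def forward_likelihood(init, trans, emit, observations):
    """Probability of an observation sequence under a finite-state HMP."""
    n = len(init)
    # Joint probability of the first symbol and each hidden state.
    alpha = [init[i] * emit[i][observations[0]] for i in range(n)]
    for obs in observations[1:]:
        # Propagate through the Markov chain, then through the channel.
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs]
            for j in range(n)
        ]
    return sum(alpha)

init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
print(forward_likelihood(init, trans, emit, [0, 1, 1]))
```

The recursion costs on the order of T·n² operations for T observations and n states. For long sequences a practical implementation would rescale `alpha` at each step, or work in log space, to avoid numerical underflow.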
The interactive museum tour-guide robot
1998
Cited by 215 (33 self)
This paper describes the software architecture of an autonomous tour-guide/tutor robot. This robot was recently deployed in the “Deutsches Museum Bonn,” where it guided hundreds of visitors through the museum during a six-day deployment period. The robot’s control software integrates low-level probabilistic reasoning with high-level problem solving embedded in first-order logic. A collection of software innovations, described in this paper, enabled the robot to navigate at high speeds through dense crowds while reliably avoiding collisions with obstacles, some of which could not even be perceived. Also described in this paper is a user interface tailored towards non-expert users, which was essential for the robot’s success in the museum. Based on these experiences, this paper argues that the time is ripe for the development of AI-based commercial service robots that assist people in everyday life.
An Overview of Quality-of-Service Routing for the Next Generation High-Speed Networks: Problems and Solutions
Cited by 210 (21 self)
The upcoming Gbps high-speed networks are expected to support a wide range of communication-intensive, real-time multimedia applications. The requirement for timely delivery of digitized audio-visual information raises new challenges for the next generation integrated-service broadband networks. One of the key issues is Quality-of-Service (QoS) routing, which selects network routes with sufficient resources for the requested QoS parameters. The goal of routing solutions is twofold: (1) satisfying the QoS requirements for every admitted connection and (2) achieving global efficiency in resource utilization. Many unicast/multicast QoS routing algorithms have been published recently, and they work with a variety of QoS requirements and resource constraints. Overall, they can be partitioned into three broad classes: (1) source routing, (2) distributed routing and (3) hierarchical routing algorithms. In this paper we give an overview of the QoS routing problem as well as the existing solutions. We present the strengths and the weaknesses of different routing strategies and outline the challenges. We also discuss the basic algorithms in each class, classify and compare them, and point out possible future directions in the QoS routing area.
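As a small illustration of what "selecting a route with sufficient resources" can mean in the source-routing class, the sketch below finds the least-delay path among links that meet a bandwidth requirement: links with too little bandwidth are skipped, and Dijkstra's algorithm is run on delay over the rest. This is one common textbook formulation and is not claimed to be any specific algorithm from the survey; the function name `qos_route` and the graph layout are assumptions.

```python
import heapq

def qos_route(graph, src, dst, min_bandwidth):
    """Least-delay path using only links with enough bandwidth.

    graph: dict mapping node -> list of (neighbor, bandwidth, delay)
    Returns (path, total_delay), or None if no feasible path exists.
    """
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, bandwidth, delay in graph.get(u, []):
            if bandwidth < min_bandwidth:
                continue  # link cannot satisfy the QoS requirement
            nd = d + delay
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], dist[dst]

graph = {
    "A": [("B", 10, 1), ("C", 100, 5)],
    "B": [("D", 10, 1)],
    "C": [("D", 100, 5)],
    "D": [],
}
print(qos_route(graph, "A", "D", min_bandwidth=5))
print(qos_route(graph, "A", "D", min_bandwidth=50))
```

Raising the bandwidth requirement forces the route onto the slower but wider path through `C`, and a requirement no link can meet yields `None`, which corresponds to rejecting the connection.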