Recent advances in hierarchical reinforcement learning
2003
Cited by 161 (23 self)
A preliminary unedited version of this paper was incorrectly published as part of Volume
Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data
In ICML, 2004
Cited by 122 (11 self)
In sequence modeling, we often wish to represent complex interaction between labels, such as when performing multiple, cascaded labeling tasks on the same sequence, or when long-range dependencies exist. We present dynamic conditional random fields (DCRFs), a generalization of linear-chain conditional random fields (CRFs) in which each time slice contains a set of state variables and edges (a distributed state representation as in dynamic Bayesian networks, DBNs) and parameters are tied across slices. Since exact
Linear Time Inference in Hierarchical HMMs
In Proceedings of Neural Information Processing Systems, 2001
Cited by 87 (4 self)
The hierarchical hidden Markov model (HHMM) is a generalization of the hidden Markov model (HMM) that models sequences with structure at many length/time scales [FST98]. Unfortunately, the original inference algorithm is rather complicated, and takes O(T^3) time, where T is the length of the sequence, making it impractical for many domains. In this paper, we show how HHMMs are a special kind of dynamic Bayesian network (DBN), and thereby derive a much simpler inference algorithm, which only takes O(T) time. Furthermore, by drawing the connection between HHMMs and DBNs, we enable the application of many standard approximation techniques to further speed up inference.
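For reference on what linear-time inference looks like in the simplest case, here is a minimal sketch of the standard forward pass for a flat HMM, whose cost grows linearly with the sequence length T. It is not the HHMM-to-DBN construction from the paper; the function name, argument layout, and example numbers are assumptions made for illustration.

```python
import math

def forward_log_likelihood(init, trans, emit, observations):
    """Log-likelihood of an observation sequence under a flat HMM.

    init[i]     : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability of emitting symbol o in state i
    One pass over the sequence, so the cost is O(T * n^2) for n states.
    """
    n = len(init)
    # Initialise with the first observation.
    alpha = [init[i] * emit[i][observations[0]] for i in range(n)]
    scale = sum(alpha)
    log_lik = math.log(scale)
    alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    # One constant-cost update per remaining time step.
    for obs in observations[1:]:
        alpha = [
            emit[j][obs] * sum(alpha[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

# Illustrative two-state model (made-up numbers).
example = forward_log_likelihood(
    init=[0.6, 0.4],
    trans=[[0.7, 0.3], [0.4, 0.6]],
    emit=[[0.9, 0.1], [0.2, 0.8]],
    observations=[0, 1],
)
```

Each time step does a fixed amount of work that depends only on the number of states, which is why the total cost scales with T rather than with a higher power of T.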
Map-based navigation in mobile robots: I. A review of localization strategies
2003
Cited by 32 (10 self)
For a robot, an animal, and even for man, to be able to use an internal representation of the spatial layout of its environment to position itself is a very complex task, which raises numerous issues of perception, categorization and motor control that must all be solved in an integrated manner to promote survival. This point is illustrated here, within the framework of a review of localization strategies in mobile robots. The allothetic and idiothetic sensors that may be used by these robots to build internal representations of their environment, and the maps in which these representations may be instantiated, are first described. Then map-based navigation systems are categorized according to a 3-level hierarchy of localization strategies, which respectively call upon direct position inference, single-hypothesis tracking, and multiple-hypothesis tracking. The advantages and drawbacks of these strategies, notably with respect to the limitations of the sensors on which they rely, are discussed throughout the text.
Map-based navigation in mobile robots: II. A review of map-learning and path-planning strategies
2003
Cited by 29 (8 self)
This article reviews map-learning and path-planning strategies within the context of map-based navigation in mobile robots. Concerning map-learning, it distinguishes metric maps from topological maps and describes procedures that help maintain the coherency of these maps. Concerning path-planning, it distinguishes continuous from discretized spaces and describes procedures applicable when the execution of a plan fails. It insists on the need for an integrated conception of such procedures, that must be tightly tailored to the specific robot that is used (notably to the capacities and limitations of its sensory-motor equipment) and to the specific environment that is experienced. A hierarchy of navigation strategies is outlined in the discussion, together with the sort of adaptive capacities each affords to cope with unexpected obstacles or dangers encountered on an animat or robot's way to its goal.
Representing hierarchical POMDPs as DBNs for multiscale robot localization
2004
Cited by 29 (2 self)
We explore the advantages of representing hierarchical partially observable Markov decision processes (HPOMDPs) as dynamic Bayesian networks (DBNs). In particular, we focus on the special case of using HPOMDPs to represent multi-resolution spatial maps for indoor robot navigation. Our results show that a DBN representation of HPOMDPs can train significantly faster than the original learning algorithm for HPOMDPs or the equivalent flat POMDP, and requires much less data. In addition, the DBN formulation can easily be extended to parameter tying and factoring of variables, which further reduces the time and sample complexity. This enables us to apply HPOMDP methods to much larger problems than previously possible.
An Integrated Approach to Hierarchy and Abstraction for POMDPs
2002
Cited by 22 (1 self)
This paper presents an algorithm for planning in structured partially observable Markov Decision Processes (POMDPs). The new algorithm, named PolCA (for Policy-Contingent Abstraction), uses an action-based decomposition to partition complex POMDP problems into a hierarchy of smaller subproblems. Low-level subtasks are solved first, and their partial policies are used to model abstract actions in the context of higher-level subtasks. At all levels of the hierarchy, subtasks need only consider a reduced action, state and observation space. The reduced action set is provided by a designer, whereas the reduced state and observation sets are discovered automatically on a subtask-per-subtask basis. This typically results in lower-level subtasks having few, but high-resolution, state/observation features, whereas high-level subtasks tend to have many, but low-resolution, state/observation features. This paper presents a detailed overview of PolCA in the context of a POMDP hierarchical planning and execution algorithm. It also includes theoretical results demonstrating that in the special case of fully observable MDPs, the algorithm converges to a recursively optimal solution. Experimental results included in the paper demonstrate the usefulness of the approach on a range of problems, and show favorable performance compared to competing function-approximation POMDP algorithms. Finally, the paper presents a real-world implementation and deployment of a robotic system which uses PolCA in the context of a high-level robot behavior control task.
Robot introspection through learned hidden Markov models
2006
Cited by 19 (4 self)
In this paper we describe a machine learning approach for acquiring a model of a robot behaviour from raw sensor data. We are interested in automating the acquisition of behavioural models to provide a robot with an introspective capability. We assume that the behaviour of a robot in achieving a task can be modelled as a finite stochastic state transition system. Beginning with data recorded by a robot in the execution of a task, we use unsupervised learning techniques to estimate a hidden Markov model (HMM) that can be used both for predicting and explaining the behaviour of the robot in subsequent executions of the task. We demonstrate that it is feasible to automate the entire process of learning a high quality HMM from the data recorded by the robot during execution of its task. The learned HMM can be used both for monitoring and controlling the behaviour of the robot. The ultimate purpose of our work is to learn models for the full set of tasks associated with a given problem domain, and to integrate these models with a generative task planner. We want to show that these models can be used successfully in controlling the execution of a plan. However, this paper does not develop the planning and control aspects of our work, focussing instead on the learning methodology and the evaluation of a learned model. The essential property of the models we seek to construct is that the most probable trajectory through a model, given the observations made by the robot, accurately diagnoses, or explains, the behaviour that the robot actually performed when making these observations. In the work reported here we consider a navigation task. We explain
Mining temporal sequences to discover interesting patterns
In: Proceedings of the 2004 International Conference on Knowledge Discovery and Data Mining, 2004
Cited by 12 (3 self)
When mining temporal sequences, knowledge discovery techniques can be applied that discover interesting patterns of interactions. Existing approaches use frequency, and sometimes length, as measurements for interestingness. Because these are temporal sequences, additional characteristics, such as periodicity, may also be interesting. We propose that information theoretic principles can be used to evaluate interesting characteristics of time-ordered input sequences. In this paper, we present a novel data mining technique based on the Minimum Description Length principle that discovers interesting features in a time-ordered sequence. We discuss features of our real-time mining approach, show applications of the knowledge mined by the approach, and present a technique to bootstrap a decision maker from the mined patterns. Categories and Subject Descriptors: Mining data streams, novel data mining algorithms, preprocessing and post processing for data mining, spatial and temporal data mining.
A Multisine Approach for Trajectory Optimization Based on Information Gain
2002
Cited by 10 (5 self)
This paper presents a multisine approach for trajectory optimization based on information gain, with distance and orientation sensing to known beacons. It addresses the problem of active sensing, i.e. the selection of a robot motion or sequence of motions that makes the robot arrive in its desired goal configuration (position and orientation) with maximum accuracy, given the available sensor information. The optimal trajectory is parameterized as a linear combination of sine functions. An appropriate optimality criterion is selected which takes into account various requirements (e.g. maximum accuracy and minimum time). Several constraints can be formulated, e.g. with respect to collision avoidance. The optimal trajectory is then determined by numerical optimization techniques. The approach is applicable to both nonholonomic and holonomic robots. Its effectiveness is illustrated here for a nonholonomic wheeled mobile robot (WMR) in an environment with and without obstacles.
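To make the idea of a trajectory written as a linear combination of sine functions concrete, here is a minimal sketch. The abstract does not give the exact basis, so the harmonics sin(k*pi*s) on s in [0, 1], the lateral-offset formulation, and all names below are assumptions for illustration only; the optimality criterion and the numerical optimization over the coefficients are not shown.

```python
import math

def multisine_offset(coeffs, s):
    """Lateral offset at path parameter s in [0, 1].

    Assumed basis: sin(k * pi * s) for k = 1, 2, ...; every term vanishes
    at s = 0 and s = 1, so the start and goal positions stay fixed.
    """
    return sum(a * math.sin((k + 1) * math.pi * s) for k, a in enumerate(coeffs))

def sample_path(start, goal, coeffs, n_points=5):
    """Sample points on a path that deviates from the straight line
    start -> goal by the multisine offset, measured along the line's normal."""
    (x0, y0), (x1, y1) = start, goal
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    nx, ny = -dy / length, dx / length  # unit normal to the straight line
    path = []
    for i in range(n_points):
        s = i / (n_points - 1)
        d = multisine_offset(coeffs, s)
        path.append((x0 + s * dx + d * nx, y0 + s * dy + d * ny))
    return path

# One sine term with a hypothetical amplitude of 0.5.
path = sample_path((0.0, 0.0), (2.0, 0.0), [0.5], n_points=5)
```

Under this parameterization the optimizer only has to search over the short coefficient list `coeffs`, rather than over every point of the path.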