Results 1-10 of 62
Learning policies for partially observable environments: Scaling up
1995
Cited by 236 (11 self)
Partially observable Markov decision processes (POMDPs) model decision problems in which an agent tries to maximize its reward in the face of limited and/or noisy sensor feedback. While the study of POMDPs is motivated by a need to address realistic problems, existing techniques for finding optimal behavior do not appear to scale well and have been unable to find satisfactory policies for problems with more than a dozen states. After a brief review of POMDPs, this paper discusses several simple solution methods and shows that all are capable of finding near-optimal policies for a selection of extremely small POMDPs taken from the learning literature. In contrast, we show that none are able to solve a slightly larger and noisier problem based on robot navigation. We find that a combination of two novel approaches performs well on these problems and suggest methods for scaling to even larger and more complicated domains.
1 Introduction
Mobile robots must act on the basis of thei...
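As an illustration of the "noisy sensor feedback" setting in the abstract above, here is a minimal sketch of the standard Bayesian belief-state update used with POMDPs. It is not one of the solution methods from the paper. The function name, the table layout, and the two-state "listen" model are all hypothetical and chosen only for this example.

```python
def belief_update(belief, transition, observation, action, obs):
    """Return the posterior belief after taking `action` and observing `obs`.

    belief[s]                   : prior probability of state s
    transition[action][s][s2]   : P(s2 | s, action)
    observation[action][s2][o]  : P(o | s2, action)
    """
    n = len(belief)
    unnormalized = []
    for s2 in range(n):
        # Prediction step: probability of landing in s2 before seeing obs.
        predicted = sum(transition[action][s][s2] * belief[s] for s in range(n))
        # Correction step: weight by the likelihood of the observation.
        unnormalized.append(observation[action][s2][obs] * predicted)
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]


# Hypothetical two-state model: "listen" leaves the state unchanged and
# reports the true state with probability 0.85 (values are assumptions).
T = {"listen": [[1.0, 0.0], [0.0, 1.0]]}
O = {"listen": [[0.85, 0.15], [0.15, 0.85]]}

posterior = belief_update([0.5, 0.5], T, O, "listen", 0)
```

Starting from a uniform belief, one observation of `0` shifts the belief toward state 0 in proportion to the sensor's reliability; a policy over such beliefs is what the solution methods discussed in the paper aim to find.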
Algorithms for Sequential Decision Making
1996
Cited by 179 (8 self)
Sequential decision making is a fundamental task faced by any intelligent agent in an extended interaction with its environment; it is the act of answering the question "What should I do now?" In this thesis, I show how to answer this question when "now" is one of a finite set of states, "do" is one of a finite set of actions, "should" is maximize a long-run measure of reward, and "I" is an automated planning or learning system (agent). In particular,
NeuroAnimator: Fast Neural Network Emulation and Control of Physics-Based Models
1998
Cited by 85 (3 self)
Animation through the numerical simulation of physics-based graphics models offers unsurpassed realism, but it can be computationally demanding. Likewise, finding controllers that enable physics-based models to produce desired animations usually entails formidable computational cost. This paper demonstrates the possibility of replacing the numerical simulation and control of model dynamics with a dramatically more efficient alternative. In particular, we propose the NeuroAnimator, a novel approach to creating physically realistic animation that exploits neural networks. NeuroAnimators are automatically trained offline to emulate physical dynamics through the observation of physics-based models in action. Depending on the model, its neural network emulator can yield physically realistic animation one or two orders of magnitude faster than conventional numerical simulation. Furthermore, by exploiting the network structure of the NeuroAnimator, we introduce a fast algorithm for learning controllers that enables either physics-based models or their neural network emulators to synthesize motions satisfying prescribed animation goals. We demonstrate NeuroAnimators for passive and active (actuated) rigid body, articulated, and deformable physics-based models.
Improving the Rprop Learning Algorithm
Proceedings of the Second International Symposium on Neural Computation (NC 2000), 2000
Cited by 44 (7 self)
The Rprop algorithm proposed by Riedmiller and Braun is one of the best-performing first-order learning methods for neural networks. We introduce modifications of the algorithm that improve its learning speed. The resulting speedup is experimentally shown for a set of neural network learning tasks as well as for artificial error surfaces.
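For readers unfamiliar with Rprop, the sketch below shows the basic sign-based update in its simplest form (no weight-backtracking). It is not the modified algorithm the abstract refers to, whose details are not given here. The function names, the test objective, and the constants (growth 1.2, shrink 0.5, and the step bounds) are assumptions chosen for illustration.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def rprop_minimize(grad_fn, w, steps=100, eta_plus=1.2, eta_minus=0.5,
                   step_init=0.1, step_min=1e-6, step_max=50.0):
    """Minimize a function given only its gradient, Rprop-style.

    Each weight keeps its own step size. The step grows while the partial
    derivative keeps its sign and shrinks when the sign flips; only the
    sign of the gradient, not its magnitude, sets the update direction.
    """
    w = list(w)
    step = [step_init] * len(w)
    prev_grad = [0.0] * len(w)
    for _ in range(steps):
        grad = grad_fn(w)
        for i, g in enumerate(grad):
            if g * prev_grad[i] > 0:       # same sign: accelerate
                step[i] = min(step[i] * eta_plus, step_max)
            elif g * prev_grad[i] < 0:     # sign flip: we overshot, slow down
                step[i] = max(step[i] * eta_minus, step_min)
            w[i] -= sign(g) * step[i]
            prev_grad[i] = g
    return w


# Hypothetical test objective: f(w) = (w0 - 3)^2 + 10 * (w1 + 1)^2
def quad_grad(w):
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]


w_final = rprop_minimize(quad_grad, [0.0, 0.0], steps=200)
```

Because only gradient signs are used, the badly scaled second coordinate (curvature ten times larger) converges about as fast as the first, which is the property that makes this family of methods attractive for neural network training.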
Bidirectional recurrent neural networks
IEEE Transactions on Signal Processing, 1997
Cited by 39 (2 self)
In the first part of this paper, a regular recurrent neural network (RNN) is extended to a bidirectional recurrent neural network (BRNN). The BRNN can be trained without the limitation of using input information just up to a preset future frame. This is accomplished by training it simultaneously in positive and negative time direction. Structure and training procedure of the proposed network are explained. In regression and classification experiments on artificial data, the proposed structure gives better results than other approaches. For real data, classification experiments for phonemes from the TIMIT database show the same tendency. In the second part of this paper, it is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution. For this part, experiments on real data are reported.
Index Terms: Recurrent neural networks.
Pattern analysis for machine olfaction: a review
IEEE Sens. J., 2002
Cited by 33 (7 self)
Pattern analysis constitutes a critical building block in the development of gas sensor array instruments capable of detecting, identifying, and measuring volatile compounds, a technology that has been proposed as an artificial substitute of the human olfactory system. The successful design of a pattern analysis system for machine olfaction requires a careful consideration of the various issues involved in processing multivariate data: signal preprocessing, feature extraction, feature selection, classification, regression, clustering, and validation. A considerable number of methods from statistical pattern recognition, neural networks, chemometrics, machine learning, and biological cybernetics have been used to process electronic nose data. The objective of this review paper is to provide a summary and guidelines for using the most widely used pattern analysis techniques, as well as to identify research directions that are at the frontier of sensor-based machine olfaction.
Index Terms: Classification, clustering, dimensionality reduction, electronic nose, multicomponent analysis, pattern analysis, preprocessing, validation.
Generalized information potential criterion for adaptive system training
IEEE Trans. Neural Networks, 2002
Cited by 32 (17 self)
We have recently proposed the quadratic Renyi’s error entropy as an alternative cost function for supervised adaptive system training. An entropy criterion instructs the minimization of the average information content of the error signal rather than merely trying to minimize its energy. In this paper, we propose a generalization of the error entropy criterion that enables the use of any order of Renyi’s entropy and any suitable kernel function in density estimation. It is shown that the proposed entropy estimator preserves the global minimum of actual entropy. The equivalence between global optimization by convolution smoothing and the convolution by the kernel in Parzen windowing is also discussed. Simulation results are presented for time-series prediction and classification where experimental demonstration of all the theoretical concepts is presented.
Index Terms: Minimum error entropy, Parzen windowing, Renyi’s entropy, supervised training.
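To make the idea of an entropy-based cost concrete, the sketch below computes a plug-in estimate of Renyi's entropy of order alpha from error samples, using a Parzen-window density estimate with a Gaussian kernel. The abstract does not give the paper's exact estimator or normalization, so the formula here, H_alpha = log(mean_j f(e_j)^(alpha-1)) / (1 - alpha) with f the Parzen estimate, is an assumption based on the textbook definition. All function names, the kernel width, and the sample values are hypothetical.

```python
import math


def gaussian_kernel(x, sigma):
    """Gaussian density with standard deviation sigma, evaluated at x."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def information_potential(errors, alpha=2.0, sigma=1.0):
    """Sample-mean estimate of E[f(e)^(alpha - 1)].

    f is the Parzen-window density estimate built from the same samples.
    """
    n = len(errors)
    total = 0.0
    for ej in errors:
        density = sum(gaussian_kernel(ej - ei, sigma) for ei in errors) / n
        total += density ** (alpha - 1.0)
    return total / n


def renyi_entropy(errors, alpha=2.0, sigma=1.0):
    """Plug-in estimate of Renyi's entropy of order alpha (alpha != 1)."""
    if alpha == 1.0:
        raise ValueError("alpha = 1 is the Shannon limit; not handled here")
    return math.log(information_potential(errors, alpha, sigma)) / (1.0 - alpha)


# Hypothetical error samples: tightly clustered versus widely spread.
tight = [0.0, 0.01, -0.01, 0.02]
spread = [0.0, 1.0, -1.0, 2.0]
h_tight = renyi_entropy(tight)
h_spread = renyi_entropy(spread)
```

Concentrated errors give a lower entropy estimate than spread-out errors, so minimizing this quantity during training pushes the error distribution toward a narrow peak rather than only reducing its energy.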
Automatic Recognition of Cortical Sulci of the Human Brain Using a Congregation of Neural Networks
Elsevier, Medical Image Analysis, 2002
Cited by 24 (7 self)
This paper describes a complete system allowing automatic recognition of the main sulci of the human cortex. This system relies on a preprocessing of magnetic resonance images leading to abstract structural representations of the cortical folding patterns. The representation nodes are cortical folds, which are given a sulcus name by a contextual pattern recognition method. This method can be interpreted as a graph matching approach, which is driven by the minimization of a global function made up of local potentials. Each potential is a measure of the likelihood of the labelling of a restricted area. This potential is given by a multilayer perceptron trained on a learning database. A base of 26 brains manually labelled by a neuroanatomist is used to validate our approach. The whole system developed for the right hemisphere is made up of 265 neural networks. The mean recognition rate is 86% for the learning base and 76% for a generalization base, which is very satisfying considering the current weak understanding of the variability of the cortical folding patterns.
Ant colony optimization and stochastic gradient descent
Artificial Life, 2002
Cited by 19 (6 self)
In this paper, we study the relationship between the two techniques known as ant colony optimization (ACO) and stochastic gradient descent. More precisely, we show that some empirical ACO algorithms approximate stochastic gradient descent in the space of pheromones, and we propose an implementation of stochastic gradient descent that belongs to the family of ACO algorithms. We then use this insight to explore the mutual contributions of the two techniques.
Using Kohonen’s self-organizing feature map to uncover automobile bodily injury claims fraud
The Journal of Risk and Insurance, 1998
Cited by 19 (4 self)
Claims fraud is an increasingly vexing problem confronting the insurance industry. In this empirical study, we apply Kohonen's Self-Organizing Feature Map to classify automobile bodily injury (BI) claims by the degree of fraud suspicion. Feed forward neural networks and a back propagation algorithm are used to investigate the validity of the Feature Map approach. Comparative experiments illustrate the potential usefulness of the proposed methodology. We show that this technique performs better than both an insurance adjuster's fraud assessment and an insurance investigator's fraud assessment with respect to consistency and reliability.
INTRODUCTION AND BACKGROUND
One vexing problem confronting the property-casualty insurance industry is claims fraud. Individuals and conspiratorial rings of claimants and providers unfortunately can and do manipulate the claim processing system for their own undeserved benefit (Derrig and Ostaszewski, 1994; Cummins and Tennyson, 1992). The