Results 11 - 20 of 221
Online Feature Learning for Reinforcement Learning
, 2014
"... I hereby declare that I have written this Bachelor's thesis without the help of third parties and using only the stated sources and aids. All passages taken from sources are marked as such. This work has not previously been submitted in the same or a similar form to any examination b ..."
is not available. Initially, to acquire the needed knowledge, many experiments have to be run to explore the input data space before training. Preferably, the process of learning useful features should be done on the fly, during the training of the agent. One major challenge for this online feature
Efficient Exploration for Reinforcement Learning Based Distributed Spectrum Sharing in Cognitive Radio System
"... ABSTRACT: In this paper, we investigate how distributed reinforcement learning-based resource assignment algorithms can be used to improve the performance of a cognitive radio system. Today's decision making in most wireless systems, including cognitive radio systems in development, depends purel ..."
purely on instantaneous measurement. Two system architectures have been investigated in this paper. A point-to-point architecture is examined first in an open spectrum scenario. Then, the distributed reinforcement learning-based algorithms are developed by modifying the traditional reinforcement learning
Exploration and Model Building in Mobile Robot Domains
- In Proceedings of the ICNN-93
, 1993
"... I present first results on COLUMBUS, an autonomous mobile robot. COLUMBUS operates in initially unknown, structured environments. Its task is to explore and model the environment efficiently while avoiding collisions with obstacles. COLUMBUS uses an instance-based learning technique for modeling its ..."
Cited by 64 (17 self)
Exploring the Predictable
, 2002
"... Details of complex event sequences are often not predictable, but their reduced abstract representations are. I study an embedded active learner that can limit its predictions to almost arbitrary computable aspects of spatio-temporal events. It constructs probabilistic algorithms that (1) control in ..."
Cited by 33 (13 self)
If their opinions differ, then the system checks who's right, punishes the loser (the surprised one), and rewards the winner. An evolutionary or reinforcement learning algorithm forces each module to maximize reward. This motivates both modules to lure each other into agreeing upon experiments involving
Reinforcement learning for autonomic network repair
- In ICAC
, 2004
"... We report on our efforts to formulate autonomic network repair as a reinforcement-learning problem. Our implemented system is able to learn to efficiently restore network connectivity after a failure. Our research explores a reinforcement-learning (Sutton & Barto 1998) formulation we call cost-s ..."
Cited by 12 (0 self)
Decentralized Bayesian reinforcement learning for online agent collaboration
- In AAMAS
, 2012
"... Solving complex but structured problems in a decentralized manner via multiagent collaboration has received much attention in recent years. This is natural, as on one hand, multiagent systems usually possess a structure that determines the allowable interactions among the agents; and on the other ha ..."
Cited by 9 (2 self)
the unknown environment parameters while forming (and following) local policies in an online fashion. In this paper, we provide the first Bayesian reinforcement learning (BRL) approach for distributed coordination and learning in a cooperative multiagent system by devising two solutions to this type
REINFORCEMENT LEARNING-BASED DYNAMIC SCHEDULING FOR THREAT EVALUATION
"... A novel reinforcement learning-based sensor scan optimisation scheme is presented for the purpose of multi-target tracking and threat evaluation from helicopter platforms. Reinforcement learning is an unsupervised learning technique that has been shown to be effective in highly dynamic and noisy en ..."
environments. The problem is made suitable for the use of reinforcement learning by its casting into a “sensor scheduling” framework. An innovative action exploration policy utilising a Gibbs distribution is shown to improve agent performance over a more conventional random action selection policy
Optimizing microstimulation using a reinforcement learning framework
- IEEE Engineering in Medicine and Biology Magazine
, 2011
"... The ability to provide sensory feedback is desired to enhance the functionality of neuroprosthetics. Somatosensory feedback provides closed-loop control to the motor system, which is lacking in feedforward neuroprosthetics. In the case of existing somatosensory function, a template of the ..."
Cited by 2 (0 self)
data recorded for natural touch and thalamic microstimulation, and we examine the method's efficiency in exploring the parameter space while concentrating on promising parameter forms. The best matching stimulation parameters, from k = 68 different forms, are selected by the reinforcement learning
Predicting the Labels of an Unknown Graph via Adaptive Exploration
, 2010
"... Motivated by a problem of targeted advertising in social networks, we introduce a new model of online learning on labeled graphs where the graph is initially unknown and the algorithm is free to choose which vertex to predict next. For this learning model, we define an appropriate measure of regular ..."
Cited by 1 (1 self)
Adaptive Execution: Exploration and Learning of Price Impact
, 2012
"... We consider a model in which a trader aims to maximize expected risk-adjusted profit while trading a single security. In our model, each price change is a linear combination of observed factors, impact resulting from the trader’s current and prior activity, and unpredictable random effects. The trad ..."
Cited by 2 (0 self)
learning algorithm that is designed to explore efficiently in linear-quadratic control problems. Key words: adaptive execution, price impact, reinforcement learning, regret bound