Results 11 - 20 of 221

Online Feature Learning for Reinforcement Learning

by Melvin Laux, Darmstadt , 2014
"... I hereby declare that I have produced this Bachelor thesis without the help of third parties, using only the sources and aids stated. All passages taken from sources are marked as such. This work has not previously been submitted in the same or a similar form to any examination ..."
Abstract
is not available. Initially, to acquire the needed knowledge, many experiments have to be run to explore the input data space before training. Preferably, the process of learning useful features should be done on the fly, during the training of the agent. One major challenge for this online feature

Efficient Exploration for Reinforcement Learning Based Distributed Spectrum Sharing in Cognitive Radio System

by U Kiran , D Praveen Kumar , K Rajesh Reddy , M Ranjith
"... ABSTRACT: In this paper, we investigate how distributed reinforcement learning-based resource assignment algorithms can be used to improve the performance of a cognitive radio system. Decision making in most wireless systems today, including cognitive radio systems in development, depends purel ..."
Abstract
purely on instantaneous measurement. Two system architectures have been investigated in this paper. A point-to-point architecture is examined first in an open spectrum scenario. Then, the distributed reinforcement learning-based algorithms are developed by modifying the traditional reinforcement learning

Exploration and Model Building in Mobile Robot Domains

by Sebastian B. Thrun - In Proceedings of the ICNN-93 , 1993
"... I present first results on COLUMBUS, an autonomous mobile robot. COLUMBUS operates in initially unknown, structured environments. Its task is to explore and model the environment efficiently while avoiding collisions with obstacles. COLUMBUS uses an instance-based learning technique for modeling its ..."
Abstract - Cited by 64 (17 self)

Exploring the Predictable

by Jürgen Schmidhuber , 2002
"... Details of complex event sequences are often not predictable, but their reduced abstract representations are. I study an embedded active learner that can limit its predictions to almost arbitrary computable aspects of spatio-temporal events. It constructs probabilistic algorithms that (1) control in ..."
Abstract - Cited by 33 (13 self)
If their opinions differ, the system checks who is right, punishes the loser (the surprised one), and rewards the winner. An evolutionary or reinforcement learning algorithm forces each module to maximize reward. This motivates both modules to lure each other into agreeing upon experiments involving
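The surprise-based bet described in this snippet can be sketched in a few lines: two predictor modules commit to an outcome, and only when they disagree is the experiment run and reward transferred from the wrong module to the right one. The function name and interface below are illustrative, not taken from the paper:

```python
def adversarial_surprise_step(module_a, module_b, world, reward):
    """One round of the two-module betting game from the snippet.

    module_a, module_b: callables returning each module's predicted outcome.
    world: callable that runs the experiment and returns the true outcome.
    reward: dict accumulating reward per module, mutated in place.
    """
    pred_a, pred_b = module_a(), module_b()
    if pred_a != pred_b:          # opinions differ: settle the bet
        truth = world()
        if pred_a == truth:       # module b is the surprised loser
            reward["a"] += 1
            reward["b"] -= 1
        elif pred_b == truth:     # module a is the surprised loser
            reward["b"] += 1
            reward["a"] -= 1
    return reward
```

When both modules agree, nothing happens; maximizing reward therefore pushes each module to propose experiments whose outcome it can predict but its opponent cannot, which is the exploratory pressure the abstract alludes to.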

Reinforcement learning for autonomic network repair

by Michael L. Littman, Nishkam Ravi, Eitan Fenson, Rich Howard - In ICAC , 2004
"... We report on our efforts to formulate autonomic network repair as a reinforcement-learning problem. Our implemented system is able to learn to efficiently restore network connectivity after a failure. Our research explores a reinforcement-learning (Sutton & Barto 1998) formulation we call cost-s ..."
Abstract - Cited by 12 (0 self)

Decentralized Bayesian reinforcement learning for online agent collaboration

by W. T. L. Teacy, G. Chalkiadakis, A. Farinelli - In AAMAS , 2012
"... Solving complex but structured problems in a decentralized manner via multiagent collaboration has received much attention in recent years. This is natural, as on one hand, multiagent systems usually possess a structure that determines the allowable interactions among the agents; and on the other ha ..."
Abstract - Cited by 9 (2 self)
the unknown environment parameters while forming (and following) local policies in an online fashion. In this paper, we provide the first Bayesian reinforcement learning (BRL) approach for distributed coordination and learning in a cooperative multiagent system by devising two solutions to this type

Reinforcement Learning-Based Dynamic Scheduling for Threat Evaluation

by Nimrod Lilith
"... A novel reinforcement learning-based sensor scan optimisation scheme is presented for the purpose of multi-target tracking and threat evaluation from helicopter platforms. Reinforcement learning is an unsupervised learning technique that has been shown to be effective in highly dynamic and noisy en ..."
Abstract
environments. The problem is made suitable for the use of reinforcement learning by casting it into a “sensor scheduling” framework. An innovative action exploration policy utilising a Gibbs distribution is shown to improve agent performance over a more conventional random action selection policy
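A Gibbs-distribution exploration policy of the kind this snippet mentions is commonly implemented as Boltzmann (softmax) action selection over the agent's action values; a minimal sketch under that assumption (the function name, temperature value, and Q-values are illustrative, not taken from the paper):

```python
import numpy as np

def gibbs_action(q_values, temperature=1.0, rng=None):
    """Sample an action from a Gibbs (Boltzmann) distribution over Q-values.

    Higher-valued actions are chosen more often, but every action retains a
    nonzero probability, unlike purely greedy or uniform-random selection.
    """
    rng = rng or np.random.default_rng()
    prefs = np.asarray(q_values, dtype=float) / temperature
    prefs -= prefs.max()              # shift for numerical stability
    probs = np.exp(prefs)
    probs /= probs.sum()              # normalize to a distribution
    return rng.choice(len(probs), p=probs), probs
```

Lowering the temperature sharpens the distribution toward greedy selection, while raising it approaches the uniform-random policy the snippet uses as a baseline.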

Optimizing microstimulation using a reinforcement learning framework

by Austin J. Brockmeier, Student Member, John S. Choi, Marcello M. Distasio, Joseph T. Francis, José C. Príncipe - IEEE Engineering in Medicine and Biology Magazine , 2011
"... Abstract — The ability to provide sensory feedback is desired to enhance the functionality of neuroprosthetics. Somatosensory feedback provides closed-loop control to the motor system, which is lacking in feedforward neuroprosthetics. In the case of existing somatosensory function, a template of the ..."
Abstract - Cited by 2 (0 self)
data recorded for natural touch and thalamic microstimulation, and we examine the method's efficiency in exploring the parameter space while concentrating on promising parameter forms. The best-matching stimulation parameters, from k = 68 different forms, are selected by the reinforcement learning

Predicting the Labels of an Unknown Graph via Adaptive Exploration

by Nicolò Cesa-bianchi, Claudio Gentile, Fabio Vitale , 2010
"... Motivated by a problem of targeted advertising in social networks, we introduce a new model of online learning on labeled graphs where the graph is initially unknown and the algorithm is free to choose which vertex to predict next. For this learning model, we define an appropriate measure of regular ..."
Abstract - Cited by 1 (1 self)

Adaptive Execution: Exploration and Learning of Price Impact

by Beomsoo Park, Benjamin Van Roy , 2012
"... We consider a model in which a trader aims to maximize expected risk-adjusted profit while trading a single security. In our model, each price change is a linear combination of observed factors, impact resulting from the trader’s current and prior activity, and unpredictable random effects. The trad ..."
Abstract - Cited by 2 (0 self)
learning algorithm that is designed to explore efficiently in linear-quadratic control problems. Key words: adaptive execution, price impact, reinforcement learning, regret bound

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University