Results 1 - 7 of 7
Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems
Abstract

Cited by 124 (5 self)
In cellular telephone systems, an important problem is to dynamically allocate the communication resource (channels) so as to maximize service in a stochastic caller environment. This problem is naturally formulated as a dynamic programming problem and we use a reinforcement learning (RL) method to find dynamic channel allocation policies that are better than previous heuristic solutions. The policies obtained perform well for a broad variety of call traffic patterns. We present results on a large cellular system with approximately 49^49 states.
Capacity of the Trapdoor Channel with Feedback
Abstract

Cited by 16 (7 self)
We establish that the feedback capacity of the trapdoor channel is the logarithm of the golden ratio and provide a simple communication scheme that achieves capacity. As part of the analysis, we formulate a class of dynamic programs that characterize capacities of unifilar finite-state channels. The trapdoor channel is an instance that admits a simple analytic solution.
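The headline result above is a closed-form number: the feedback capacity equals the base-2 logarithm of the golden ratio, about 0.694 bits per channel use. A one-line numerical check (a sketch, not code from the paper):

```python
import math

# Golden ratio: phi = (1 + sqrt(5)) / 2
phi = (1 + math.sqrt(5)) / 2

# Feedback capacity of the trapdoor channel, per the abstract above:
# C = log2(phi), roughly 0.6942 bits per channel use
capacity = math.log2(phi)
print(round(capacity, 4))
```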
A Comparison of Discrete and Parametric Approximation Methods for Continuous-State Dynamic Programming Problems
, 2000
Abstract

Cited by 13 (8 self)
We compare alternative numerical methods for approximating solutions to continuous-state dynamic programming (DP) problems. We distinguish two approaches: discrete approximation and parametric approximation. In the former, the continuous state space is discretized into a finite number of points N, and the resulting finite-state DP problem is solved numerically. In the latter, a function associated with the DP problem, such as the value function, the policy function, or some other related function, is approximated by a smooth function of K unknown parameters. Values of the parameters are chosen so that the parametric function approximates the true function as closely as possible. We focus on approximations that are linear in parameters, i.e., where the parametric approximation is a linear combination of K basis functions. We also focus on methods that approximate the value function V as the solution to the Bellman equation associated with the DP problem. In finite state DP problems...
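As a concrete illustration of the parametric approach (a minimal sketch, not code from the paper), here is fitted value iteration on a hypothetical cake-eating problem. The value function V is represented by piecewise-linear interpolation over a grid, i.e., a linear combination of "hat" basis functions, which keeps the approximation linear in parameters and makes the approximate Bellman operator a contraction:

```python
import numpy as np

# Toy cake-eating problem (illustrative assumption, not from the paper):
# state s = cake remaining in [0, 1], action c = consumption in [0, s],
# per-period reward sqrt(c), next state s - c, discount factor beta.
beta = 0.95
grid = np.linspace(0.0, 1.0, 41)      # discretized state grid
V = np.zeros_like(grid)               # value estimates at the grid points

def bellman_backup(V):
    """One application of the Bellman operator, with V(s) evaluated
    off-grid by piecewise-linear interpolation (hat-function basis)."""
    V_new = np.empty_like(V)
    for i, s in enumerate(grid):
        c = np.linspace(0.0, s, 50)   # feasible consumption choices
        V_new[i] = np.max(np.sqrt(c) + beta * np.interp(s - c, grid, V))
    return V_new

done = False
for it in range(500):
    V_next = bellman_backup(V)
    done = np.max(np.abs(V_next - V)) < 1e-6
    V = V_next
    if done:
        break
```

Because linear interpolation is nonexpansive, the composed operator inherits the discount factor beta as its contraction modulus, so the iteration converges; the converged V is increasing in the cake size, with V(0) = 0.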
How to Make Software Agents Do the Right Thing: An Introduction to Reinforcement Learning
 Adaptive Systems Group, Harlequin Inc
, 1996
Abstract

Cited by 2 (0 self)
This article explains why programming agents is not just business-as-usual; rather, it requires a new way of looking at problems and their solutions. When you hire a human agent to do something for you, you rarely spell out a detailed plan of action. Instead, you define the state of the environment that you want to achieve (e.g., you tell a contractor that you want a new front porch with comfortable seating, for under $2000). In more complex and uncertain situations, you specify your preferences rather than stating outright goals, as when you tell a stockbroker agent that the more money you make the better, but count capital gains as, say, 30% better than dividend income. Your hired agent then takes actions on your behalf, even negotiates with other agents, all to help you achieve your preferences. We would like our software agents to behave the same way. That means we will need a way to describe our preferences to software agents, and a methodology for building agents that best satisfy our preferences. The pleasant surprise is that for many problems, once we know the preferences, we're almost done! Given the preferences, a list of possible actions, and enough time to practice taking actions, we can apply the formalism of Reinforcement Learning (RL) to build an agent that acts according to the preferences in a near-optimal way. This article shows how.
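The "preferences, possible actions, and practice" recipe the abstract describes is exactly what tabular Q-learning operationalizes. A minimal sketch on a hypothetical four-state corridor (an illustrative toy, not the article's elevator example): the reward encodes the preference, and repeated practice turns it into a near-optimal policy.

```python
import random

random.seed(0)

# Hypothetical corridor MDP: states 0..3; reaching state 3 pays reward 1.
# Action 0 moves left, action 1 moves right. The reward IS the preference.
n_states, n_actions = 4, 2
alpha, gamma, eps = 0.5, 0.9, 0.1
Q = [[0.0] * n_actions for _ in range(n_states)]

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    return s2, (1.0 if s2 == n_states - 1 else 0.0)

for episode in range(2000):
    s = random.randrange(n_states - 1)          # exploring starts
    for _ in range(20):
        # epsilon-greedy: mostly exploit current Q, occasionally explore
        if random.random() < eps:
            a = random.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda i: Q[s][i])
        s2, r = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
        if s == n_states - 1:                   # episode ends at the goal
            break
```

After training, the greedy policy moves right from every interior state: Q[s][1] > Q[s][0] for s < 3, and Q[2][1] approaches the true value 1.0.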
RL-MAC: A QoS-Aware Reinforcement Learning based MAC Protocol for Wireless Sensor Networks
Abstract

Cited by 2 (0 self)
Abstract — This paper introduces RL-MAC, a novel adaptive media access control (MAC) protocol for wireless sensor networks (WSN) that employs a reinforcement learning framework. Existing schemes center around scheduling the nodes' sleep and active periods as a means of minimizing energy consumption. Recent protocols employ adaptive duty cycles as a means of further optimizing energy utilization [1][2]. However, in most cases each node determines its duty cycle as a function of its own traffic load. In this paper, nodes actively infer the state of other nodes using a reinforcement learning based control mechanism, thereby achieving high throughput and low power consumption for a wide range of traffic conditions. Moreover, the computational complexity of the proposed scheme is moderate, rendering it pragmatic for practical deployments. We further demonstrate how Quality of Service (QoS) provisioning can be directly incorporated in the proposed framework. Index Terms — media access control, energy-efficient protocol, wireless sensor networks, reinforcement learning, QoS.
To AaiBaba, BB, and Paulami
, 2011
Abstract
This work would not have been possible without the encouragement, help, and guidance that I received over the years from many individuals. First and foremost is my advisor, Prof. Michael J. Neely. I remember attending a seminar talk that he gave at USC as a new faculty member sometime in early 2004. At that time, I was a Master’s student, barely contemplating the idea of pursuing a Ph.D. I still remember the sense of awe that I felt listening to him talk about his doctoral work. I thought that if I were ever to pursue a Ph.D., this is the kind of research I would want to do. I am forever grateful to Prof. Neely for taking me on as his student and patiently helping me through thick and thin, from being a role model and a guide to sharing his candid assessments of my work and providing valuable feedback. I hope I have been able to achieve at least some of what I set out to do in my work. I would like to express my sincere thanks to Prof. Bhaskar Krishnamachari, who was my research advisor during my Master’s studies and who served on my committee. It was with Bhaskar that I got my first opportunity to do research in the area of Wireless Networks. His enthusiasm for research, learning, and mentoring students is truly inspiring, and I feel lucky to have benefited from that. I am also thankful to my other committee...