Results 1 - 10
of
103,199
Policy Gradient Methods for Reinforcement Learning with Function Approximation
, 1999
"... Function approximation is essential to reinforcement learning, but the standard approach of approximating a value function and determining a policy from it has so far proven theoretically intractable. In this paper we explore an alternative approach in which the policy is explicitly represented by i ..."
Abstract
-
Cited by 433 (20 self)
- Add to MetaCart
by its own function approximator, independent of the value function, and is updated according to the gradient of expected reward with respect to the policy parameters. Williams’s REINFORCE method and actor–critic methods are examples of this approach. Our main new result is to show that the gradient can
Coverage Control for Mobile Sensing Networks
, 2002
"... This paper presents control and coordination algorithms for groups of vehicles. The focus is on autonomous vehicle networks performing distributed sensing tasks where each vehicle plays the role of a mobile tunable sensor. The paper proposes gradient descent algorithms for a class of utility functio ..."
Abstract
-
Cited by 572 (47 self)
- Add to MetaCart
This paper presents control and coordination algorithms for groups of vehicles. The focus is on autonomous vehicle networks performing distributed sensing tasks where each vehicle plays the role of a mobile tunable sensor. The paper proposes gradient descent algorithms for a class of utility
Least-Squares Policy Iteration
- JOURNAL OF MACHINE LEARNING RESEARCH
, 2003
"... We propose a new approach to reinforcement learning for control problems which combines value-function approximation with linear architectures and approximate policy iteration. This new approach ..."
Abstract
-
Cited by 461 (12 self)
- Add to MetaCart
We propose a new approach to reinforcement learning for control problems which combines value-function approximation with linear architectures and approximate policy iteration. This new approach
Optimization Flow Control, I: Basic Algorithm and Convergence
- IEEE/ACM TRANSACTIONS ON NETWORKING
, 1999
"... We propose an optimization approach to flow control where the objective is to maximize the aggregate source utility over their transmission rates. We view network links and sources as processors of a distributed computation system to solve the dual problem using gradient projection algorithm. In thi ..."
Abstract
-
Cited by 690 (64 self)
- Add to MetaCart
We propose an optimization approach to flow control where the objective is to maximize the aggregate source utility over their transmission rates. We view network links and sources as processors of a distributed computation system to solve the dual problem using gradient projection algorithm
Data Security
, 1979
"... The rising abuse of computers and increasing threat to personal privacy through data banks have stimulated much interest m the techmcal safeguards for data. There are four kinds of safeguards, each related to but distract from the others. Access controls regulate which users may enter the system and ..."
Abstract
-
Cited by 611 (3 self)
- Add to MetaCart
The rising abuse of computers and increasing threat to personal privacy through data banks have stimulated much interest m the techmcal safeguards for data. There are four kinds of safeguards, each related to but distract from the others. Access controls regulate which users may enter the system and subsequently whmh data sets an active user may read or wrote. Flow controls regulate the dissemination of values among the data sets accessible to a user. Inference controls protect statistical databases by preventing questioners from deducing confidential information by posing carefully designed sequences of statistical queries and correlating the responses. Statlstmal data banks are much less secure than most people beheve. Data encryption attempts to prevent unauthorized disclosure of confidential information in transit or m storage. This paper describes the general nature of controls of each type, the kinds of problems they can and cannot solve, and their inherent limitations and weaknesses. The paper is intended for a general audience with little background in the area.
Detection and Tracking of Point Features
- International Journal of Computer Vision
, 1991
"... The factorization method described in this series of reports requires an algorithm to track the motion of features in an image stream. Given the small inter-frame displacement made possible by the factorization approach, the best tracking method turns out to be the one proposed by Lucas and Kanade i ..."
Abstract
-
Cited by 622 (2 self)
- Add to MetaCart
The factorization method described in this series of reports requires an algorithm to track the motion of features in an image stream. Given the small inter-frame displacement made possible by the factorization approach, the best tracking method turns out to be the one proposed by Lucas and Kanade in 1981. The method defines the measure of match between fixed-size feature windows in the past and current frame as the sum of squared intensity differences over the windows. The displacement is then defined as the one that minimizes this sum. For small motions, a linearization of the image intensities leads to a Newton-Raphson style minimization. In this report, after rederiving the method in a physically intuitive way, we answer the crucial question of how to choose the feature windows that are best suited for tracking. Our selection criterion is based directly on the definition of the tracking algorithm, and expresses how well a feature can be tracked. As a result, the criterion is optima...
Good Error-Correcting Codes based on Very Sparse Matrices
, 1999
"... We study two families of error-correcting codes defined in terms of very sparse matrices. "MN" (MacKay--Neal) codes are recently invented, and "Gallager codes" were first investigated in 1962, but appear to have been largely forgotten, in spite of their excellent properties. The ..."
Abstract
-
Cited by 741 (23 self)
- Add to MetaCart
We study two families of error-correcting codes defined in terms of very sparse matrices. "MN" (MacKay--Neal) codes are recently invented, and "Gallager codes" were first investigated in 1962, but appear to have been largely forgotten, in spite of their excellent properties. The decoding of both codes can be tackled with a practical sum-product algorithm. We prove that these codes are "very good," in that sequences of codes exist which, when optimally decoded, achieve information rates up to the Shannon limit. This result holds not only for the binary-symmetric channel but also for any channel with symmetric stationary ergodic noise. We give experimental results for binary-symmetric channels and Gaussian channels demonstrating that practical performance substantially better than that of standard convolutional and concatenated codes can be achieved; indeed, the performance of Gallager codes is almost as close to the Shannon limit as that of turbo codes.
Managing Energy and Server Resources in Hosting Centers
- In Proceedings of the 18th ACM Symposium on Operating System Principles (SOSP
, 2001
"... Interact hosting centers serve multiple service sites from a common hardware base. This paper presents the design and implementation of an architecture for resource management in a hosting center op-erating system, with an emphasis on energy as a driving resource management issue for large server cl ..."
Abstract
-
Cited by 558 (37 self)
- Add to MetaCart
Interact hosting centers serve multiple service sites from a common hardware base. This paper presents the design and implementation of an architecture for resource management in a hosting center op-erating system, with an emphasis on energy as a driving resource management issue for large server clusters. The goals are to provi-sion server resources for co-hosted services in a way that automati-cally adapts to offered load, improve the energy efficiency of server dusters by dynamically resizing the active server set, and respond to power supply disruptions or thermal events by degrading service in accordance with negotiated Service Level Agreements (SLAs). Our system is based on an economic approach to managing shared server resources, in which services "bid " for resources as a func-tion of delivered performance. The system continuously moni-tors load and plans resource allotments by estimating the value of their effects on service performance. A greedy resource allocation algorithm adjusts resource prices to balance supply and demand, allocating resources to their most efficient use. A reconfigurable server switching infrastructure directs request traffic to the servers assigned to each service. Experimental results from a prototype confirm that the system adapts to offered load and resource avail-ability, and can reduce server energy usage by 29 % or more for a typical Web workload. 1.
Taming the Underlying Challenges of Reliable Multihop Routing in Sensor Networks
- In SenSys
, 2003
"... The dynamic and lossy nature of wireless communication poses major challenges to reliable, self-organizing multihop networks. These non-ideal characteristics are more problematic with the primitive, low-power radio transceivers found in sensor networks, and raise new issues that routing protocols mu ..."
Abstract
-
Cited by 775 (21 self)
- Add to MetaCart
The dynamic and lossy nature of wireless communication poses major challenges to reliable, self-organizing multihop networks. These non-ideal characteristics are more problematic with the primitive, low-power radio transceivers found in sensor networks, and raise new issues that routing protocols must address. Link connectivity statistics should be captured dynamically through an efficient yet adaptive link estimator and routing decisions should exploit such connectivity statistics to achieve reliability. Link status and routing information must be maintained in a neighborhood table with constant space regardless of cell density. We study and evaluate link estimator, neighborhood table management, and reliable routing protocol techniques. We focus on a many-to-one, periodic data collection workload. We narrow the design space through evaluations on large-scale, high-level simulations to 50-node, in-depth empirical experiments. The most effective solution uses a simple time averaged EWMA estimator, frequency based table management, and cost-based routing.
Results 1 - 10
of
103,199