Results 1 - 10 of 17,264

Finite-time analysis of the multiarmed bandit problem

by Peter Auer, Nicolò Cesa-Bianchi, Paul Fischer - Machine Learning, 2002
"... Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to find profitable actions while taking the empirically best action as often as possible. A popular measure of a policy’s success in addressing ..."
Cited by 817 (15 self)

Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning

by Richard S. Sutton, Doina Precup, Satinder Singh, 1999
"... Learning, planning, and representing knowledge at multiple levels of temporal abstraction are key, longstanding challenges for AI. In this paper we consider how these challenges can be addressed within the mathematical framework of reinforcement learning and Markov decision processes (MDPs). We exte ..."
Cited by 569 (38 self)
extend the usual notion of action in this framework to include options: closed-loop policies for taking action over a period of time. Examples of options include picking up an object, going to lunch, and traveling to a distant city, as well as primitive actions such as muscle twitches and joint knowledge

Decentralized Trust Management

by Matt Blaze, Joan Feigenbaum, Jack Lacy - In Proceedings of the 1996 IEEE Symposium on Security and Privacy, 1996
"... We identify the trust management problem as a distinct and important component of security in network services. Aspects of the trust management problem include formulating security policies and security credentials, determining whether particular sets of credentials satisfy the relevant policies, an ..."
Cited by 1025 (24 self)
approach to trust management, based on a simple language for specifying trusted actions and trust relationships. It also describes a prototype implementation of a new trust management system, called PolicyMaker, that will facilitate the development of security features in a wide range of network services

Implications of rational inattention

by Christopher A. Sims - Journal of Monetary Economics, 2002
"... A constraint that actions can depend on observations only through a communication channel with finite Shannon capacity is shown to be able to play a role very similar to that of a signal extraction problem or an adjustment cost in standard control problems. The resulting theory looks enough like fa ..."
Cited by 525 (11 self)

Department of Economic and Social Affairs, Population Division

by United Nations, 1999
"... vital interface between global policies in the economic, social and environmental spheres and national action. The Department works in three main interlinked areas: (i) it compiles, generates and analyses a wide range of economic, social and environmental data and information on which Member States ..."
Cited by 505 (3 self)

What Can Economists Learn from Happiness Research?

by Bruno S. Frey, Alois Stutzer - Forthcoming in Journal of Economic Literature, 2002
"... Happiness is generally considered to be an ultimate goal in life; virtually everybody wants to be happy. The United States Declaration of Independence of 1776 takes it as a self-evident truth that the “pursuit of happiness” is an “unalienable right”, comparable to life and liberty. It follows that e ..."
Cited by 545 (24 self)
for economists to consider happiness. The first is economic policy. At the micro-level, it is often impossible to make a Pareto-optimal proposal, because a social action entails costs for some individuals. Hence an evaluation of the net effects, in terms of individual utilities, is needed. On an aggregate level

Investor psychology and security market under- and overreactions

by Kent Daniel, David Hirshleifer - Journal of Finance, 1998
"... We propose a theory of securities market under- and overreactions based on two well-known psychological biases: investor overconfidence about the precision of private information; and biased self-attribution, which causes asymmetric shifts in investors' confidence as a function of their investment ..."
Cited by 698 (43 self)
outcomes. We show that overconfidence implies negative long-lag autocorrelations, excess volatility, and, when managerial actions are correlated with stock mispricing, public-event-based return predictability. Biased self-attribution adds positive short-lag autocorrelations (“momentum”), short

Motivation through the Design of Work: Test of a Theory

by J. Richard Hackman, Greg R. Oldham - Organizational Behavior and Human Performance, 1976
"... A model is proposed that specifies the conditions under which individuals will become internally motivated to perform effectively on their jobs. The model focuses on the interaction among three classes of variables: (a) the psychological states of employees that must be present for internally motiv ..."
Cited by 622 (2 self)
under government sponsorship are encouraged to express their own judgment freely, this report does not necessarily represent the official opinion or policy of the government. redesign are not fully adequate to meet the problems encountered in their application. Especially troublesome is the paucity

Implementation intentions. Strong effects of simple plans

by Peter M. Gollwitzer - American Psychologist, 1999
"... When people encounter problems in translating their goals into action (e.g., failing to get started, becoming distracted, or falling into bad habits), they may strategically call on automatic processes in an attempt to secure goal attainment. This can be achieved by plans in the form of implementa ..."
Cited by 478 (52 self)

Policy gradient methods for reinforcement learning with function approximation.

by Richard S. Sutton, David McAllester, Satinder Singh, Yishay Mansour - In NIPS, 1999
"... Function approximation is essential to reinforcement learning, but the standard approach of approximating a value function and determining a policy from it has so far proven theoretically intractable. In this paper we explore an alternative approach in which the policy is explicitly repres ..."
Cited by 439 (20 self)
that the gradient can be written in a form suitable for estimation from experience aided by an approximate action-value or advantage function. Using this result, we prove for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal