Results 1-10 of 652
Cognitive Radio: Brain-Empowered Wireless Communications
 IEEE J. Selected Areas in Comm
, 2005
Cited by 543 (0 self)
Abstract: Cognitive radio is viewed as a novel approach for improving the utilization of a precious natural resource: the radio electromagnetic spectrum. The cognitive radio, built on a software-defined radio, is defined as an intelligent wireless communication system that is aware of its environment and uses the methodology of understanding-by-building to learn from the environment and adapt to statistical variations in the input stimuli, with two primary objectives in mind: • highly reliable communication whenever and wherever needed; • efficient utilization of the radio spectrum. Following the discussion of interference temperature as a new metric for the quantification and management of interference, the paper addresses three fundamental cognitive tasks: 1) radio-scene analysis; 2) channel-state estimation and predictive modeling; 3) transmit-power control and dynamic spectrum management. This paper also discusses the emergent behavior of cognitive radio.
Index Terms: Awareness, channel-state estimation and predictive modeling, cognition, competition and cooperation, emergent behavior, interference temperature, machine learning, radio-scene analysis, rate feedback, spectrum analysis, spectrum holes, spectrum management, stochastic games, transmit-power control, water-filling.
Consensus and Cooperation in Networked Multi-Agent Systems
 Proceedings of the IEEE
Cited by 279 (2 self)
Summary. This paper provides a theoretical framework for analysis of consensus algorithms for multi-agent networked systems with an emphasis on the role of directed information flow, robustness to changes in network topology due to link/node failures, time-delays, and performance guarantees. An overview of basic concepts of information consensus in networks and of methods of convergence and performance analysis for the algorithms is provided. Our analysis framework is based on tools from matrix theory, algebraic graph theory, and control theory. We discuss the connections between consensus problems in networked dynamic systems and diverse applications including synchronization of coupled oscillators, flocking, formation control, fast consensus in small-world networks, Markov processes and gossip-based algorithms, load balancing in networks, rendezvous in space, distributed sensor fusion in sensor networks, and belief propagation. We establish direct connections between spectral and structural properties of complex networks and the speed of information diffusion of consensus algorithms. A brief introduction is provided on networked systems with nonlocal information flow that are considerably faster than distributed systems with lattice-type nearest neighbor interactions. Simulation results are presented that demonstrate the role of small-world effects on the speed of consensus algorithms and cooperative control of multi-vehicle formations.
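The kind of consensus iteration this abstract analyzes can be illustrated with a minimal sketch. The update rule x_i <- x_i + eps * sum over neighbors j of (x_j - x_i), the undirected ring topology, the step size, and every function name below are assumptions chosen for illustration; they are not taken from the paper.

```python
def consensus_step(x, neighbors, eps):
    """One synchronous update: each agent moves toward its neighbors' values."""
    return [
        xi + eps * sum(x[j] - xi for j in neighbors[i])
        for i, xi in enumerate(x)
    ]


def run_consensus(x0, neighbors, eps, steps):
    """Iterate the update a fixed number of times and return the final states."""
    x = list(x0)
    for _ in range(steps):
        x = consensus_step(x, neighbors, eps)
    return x


def ring(n):
    """Hypothetical topology: an undirected ring in which every agent has two neighbors."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


if __name__ == "__main__":
    # eps is kept below 1 / (maximum degree) = 0.5 so the iteration is stable.
    final = run_consensus([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], ring(6), eps=0.25, steps=200)
    print(final)  # every entry should be close to the initial average
```

On an undirected graph this update preserves the sum of the states, so the agents agree on the average of their initial values; how fast they agree depends on the spectrum of the graph Laplacian.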
Robust Incentive Techniques for Peer-to-Peer Networks
, 2004
Cited by 198 (3 self)
Lack of cooperation (free riding) is one of the key problems that confronts today's P2P systems. What makes this problem particularly difficult is the unique set of challenges that P2P systems pose: large populations, high turnover, asymmetry of interest, collusion, zero-cost identities, and traitors. To tackle these challenges we model the P2P system using the Generalized Prisoner's Dilemma (GPD), and propose the Reciprocative decision function as the basis of a family of incentive techniques. These techniques are fully distributed and include: discriminating server selection, maxflow-based subjective reputation, and adaptive stranger policies. Through simulation, we show that these techniques can drive a system of strategic users to nearly optimal levels of cooperation.
Online Convex Programming and Generalized Infinitesimal Gradient Ascent
, 2003
Cited by 183 (4 self)
Convex programming involves a convex set F ⊆ R^n and a convex function c : F → R. The goal of convex programming is to find a point in F which minimizes c. In this paper, we introduce online convex programming. In online convex programming, the convex set is known in advance, but in each step of some repeated optimization problem, one must select a point in F before seeing the cost function for that step. This can be used to model factory production, farm production, and many other industrial optimization problems where one is unaware of the value of the items produced until they have already been constructed. We introduce an algorithm for this domain, apply it to repeated games, and show that it is really a generalization of infinitesimal gradient ascent; the results here imply that generalized infinitesimal gradient ascent (GIGA) is universally consistent.
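The online setting described here (commit to a point in F, then see the cost) can be sketched with projected gradient steps. This is an illustration under stated assumptions, not the paper's algorithm as published: the feasible set is a one-dimensional interval, the step size 1/sqrt(t) is an assumed schedule, and all names are hypothetical.

```python
import math


def project_interval(x, lo, hi):
    """Euclidean projection onto the feasible set F = [lo, hi]."""
    return min(max(x, lo), hi)


def online_gradient_descent(gradients, x0, lo, hi):
    """Play one point per round; gradients[t-1] is the gradient of the cost revealed in round t."""
    x = project_interval(x0, lo, hi)
    played = []
    for t, grad in enumerate(gradients, start=1):
        played.append(x)              # commit to a point before the cost is seen
        eta = 1.0 / math.sqrt(t)      # assumed decreasing step size
        x = project_interval(x - eta * grad(x), lo, hi)
    return played


if __name__ == "__main__":
    # Hypothetical cost sequence: c_t(x) = (x - 0.3)^2 in every round.
    grads = [lambda x: 2.0 * (x - 0.3)] * 50
    points = online_gradient_descent(grads, x0=1.0, lo=0.0, hi=1.0)
    print(points[0], points[-1])
```

The projection keeps every played point inside F, which is what allows the learner to act before the cost function for that step is known.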
Multiagent Learning Using a Variable Learning Rate
 Artificial Intelligence
, 2002
Cited by 180 (8 self)
Learning to act in a multiagent environment is a difficult problem since the normal definition of an optimal policy no longer applies. The optimal policy at any moment depends on the policies of the other agents and so creates a situation of learning a moving target. Previous learning algorithms have one of two shortcomings depending on their approach. They either converge to a policy that may not be optimal against the specific opponents' policies, or they may not converge at all. In this article we examine this learning problem in the framework of stochastic games. We look at a number of previous learning algorithms showing how they fail at one of the above criteria. We then contribute a new reinforcement learning technique using a variable learning rate to overcome these shortcomings. Specifically, we introduce the WoLF principle, "Win or Learn Fast", for varying the learning rate. We examine this technique theoretically, proving convergence in self-play on a restricted class of iterated matrix games. We also present empirical results on a variety of more general stochastic games, in situations of self-play and otherwise, demonstrating the wide applicability of this method.
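The "Win or Learn Fast" idea of switching between a small and a large learning rate might be sketched as follows. The abstract does not say how "winning" is decided; the test used here (the current policy's expected value exceeds that of the agent's own average policy) is an assumption, as are the function and parameter names.

```python
def wolf_learning_rate(policy, average_policy, action_values, rate_win, rate_lose):
    """Return the small rate when 'winning' and the large rate when 'losing'.

    Assumed criterion: the agent is winning if its current policy earns more,
    under its current action-value estimates, than its long-run average policy.
    """
    current_value = sum(p * q for p, q in zip(policy, action_values))
    average_value = sum(p * q for p, q in zip(average_policy, action_values))
    return rate_win if current_value > average_value else rate_lose


if __name__ == "__main__":
    # Hypothetical two-action example; rate_lose > rate_win, so the agent
    # adapts quickly when doing poorly and cautiously when doing well.
    rate = wolf_learning_rate([0.9, 0.1], [0.5, 0.5], [1.0, 0.0],
                              rate_win=0.01, rate_lose=0.04)
    print(rate)
```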
Sequential optimality and coordination in multiagent systems
 In International Joint Conference on Artificial Intelligence
, 1999
Cited by 144 (3 self)
Coordination of agent activities is a key problem in multiagent systems. Set in a larger decision-theoretic context, the existence of coordination problems leads to difficulty in evaluating the utility of a situation. This in turn makes defining optimal policies for sequential decision processes problematic. We propose a method for solving sequential multiagent decision problems by allowing agents to reason explicitly about specific coordination mechanisms. We define an extension of value iteration in which the system’s state space is augmented with the state of the coordination mechanism adopted, allowing agents to reason about the short- and long-term prospects for coordination and the long-term consequences of (mis)coordination, and to make decisions to engage or avoid coordination problems based on expected value. We also illustrate the benefits of mechanism generalization.
Cooperative Multi-Agent Learning: The State of the Art
 Autonomous Agents and Multi-Agent Systems
, 2005
Cited by 113 (6 self)
Cooperative multiagent systems are ones in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the agents, multiagent problem complexity can rise rapidly with the number of agents or their behavioral sophistication. The challenge this presents to the task of programming solutions to multiagent systems problems has spawned increasing interest in machine learning techniques to automate the search and optimization process. We provide a broad survey of the cooperative multiagent learning literature. Previous surveys of this area have largely focused on issues common to specific subareas (for example, reinforcement learning or robotics). In this survey we attempt to draw from multiagent learning work in a spectrum of areas, including reinforcement learning, evolutionary computation, game theory, complex systems, agent modeling, and robotics. We find that this broad view leads to a division of the work into two categories, each with its own special issues: applying a single learner to discover joint solutions to multiagent problems (team learning), or using multiple simultaneous learners, often one per agent (concurrent learning). Additionally, we discuss direct and indirect communication in connection with learning, plus open issues in task decomposition, scalability, and adaptive dynamics. We conclude with a presentation of multiagent learning problem domains, and a list of multiagent learning resources.
Nash Q-Learning for General-Sum Stochastic Games
 Journal of Machine Learning Research
, 2003
Cited by 108 (0 self)
We extend Q-learning to a noncooperative multiagent context, using the framework of general-sum stochastic games. A learning agent maintains Q-functions over joint actions, and performs updates based on assuming Nash equilibrium behavior over the current Q-values. This learning protocol provably converges given certain restrictions on the stage games (defined by Q-values) that arise during learning. Experiments with a pair of two-player grid games suggest that such restrictions on the game structure are not necessarily required. Stage games encountered during learning in both grid environments violate the conditions. However, learning consistently converges in the first grid game, which has a unique equilibrium Q-function, but sometimes fails to converge in the second, which has three different equilibrium Q-functions. In a comparison of offline learning performance in both games, we find agents are more likely to reach a joint optimal path with Nash Q-learning than with a single-agent Q-learning method. When at least one agent adopts Nash Q-learning, the performance of both agents is better than using single-agent Q-learning. We have also implemented an online version of Nash Q-learning that balances exploration with exploitation, yielding improved performance.
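A rough sketch of the update this abstract describes, for two players, is given below. It simplifies heavily and should be read as an assumption-laden illustration: only pure-strategy equilibria of the stage game are searched (a general-sum game may have only mixed ones), the equilibrium value of the next state is passed in by the caller, and all names are hypothetical.

```python
def pure_nash_equilibria(payoff1, payoff2):
    """List joint actions (a1, a2) where neither player gains by deviating alone.

    payoff1[a1][a2] and payoff2[a1][a2] are the two players' stage-game payoffs,
    for example their current Q-values in one state.
    """
    rows, cols = len(payoff1), len(payoff1[0])
    equilibria = []
    for a1 in range(rows):
        for a2 in range(cols):
            best_for_1 = all(payoff1[a1][a2] >= payoff1[b][a2] for b in range(rows))
            best_for_2 = all(payoff2[a1][a2] >= payoff2[a1][b] for b in range(cols))
            if best_for_1 and best_for_2:
                equilibria.append((a1, a2))
    return equilibria


def nash_q_update(q, state, joint_action, reward, next_nash_value, alpha, gamma):
    """Q-learning style update over joint actions for one agent.

    The usual max over the agent's own next actions is replaced by the agent's
    payoff in an equilibrium of the next state's stage game (next_nash_value).
    """
    key = (state, joint_action)
    old = q.get(key, 0.0)
    q[key] = (1.0 - alpha) * old + alpha * (reward + gamma * next_nash_value)
    return q[key]


if __name__ == "__main__":
    # Hypothetical stage game with prisoner's-dilemma payoffs.
    p1 = [[3, 0], [5, 1]]
    p2 = [[3, 5], [0, 1]]
    print(pure_nash_equilibria(p1, p2))
    q_table = {}
    print(nash_q_update(q_table, "s0", (1, 1), reward=1.0,
                        next_nash_value=1.0, alpha=0.5, gamma=0.9))
```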
Adversarial Classification
 In KDD
, 2004
Cited by 96 (0 self)
Essentially all data mining algorithms assume that the data-generating process is independent of the data miner's activities. However, in many domains, including spam detection, intrusion detection, fraud detection, surveillance and counter-terrorism, this is far from the case: the data is actively manipulated by an adversary seeking to make the classifier produce false negatives. In these domains, the performance of a classifier can degrade rapidly after it is deployed, as the adversary learns to defeat it. Currently the only solution to this is repeated, manual, ad hoc reconstruction of the classifier. In this paper we develop a formal framework and algorithms for this problem. We view classification as a game between the classifier and the adversary, and produce a classifier that is optimal given the adversary's optimal strategy. Experiments in a spam detection domain show that this approach can greatly outperform a classifier learned in the standard way, and (within the parameters of the problem) automatically adapt the classifier to the adversary's evolving manipulations.