Results 11 - 20 of 89
Indexability of Restless Bandit Problems and Optimality . . .
Cited by 11 (10 self)
We consider a class of restless multi-armed bandit problems (RMBP) that arises in dynamic multichannel access, user/server scheduling, and optimal activation in multi-agent systems. For this class of RMBP, we establish the indexability and obtain Whittle’s index in closed-form for both discounted and average reward criteria. These results lead to a direct implementation of Whittle’s index policy with remarkably low complexity. When arms are stochastically identical, we show that Whittle’s index policy is optimal under certain conditions. Furthermore, it has a semi-universal structure that obviates the need to know the Markov transition probabilities. The optimality and the semi-universal structure result from the equivalency between Whittle’s index policy and the myopic policy established in this work. For non-identical arms, we develop efficient algorithms for computing a performance upper bound given by Lagrangian relaxation. The tightness of the upper bound and the near-optimal performance of Whittle’s index policy are illustrated with simulation examples.
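The myopic policy mentioned in this abstract can be illustrated with a minimal sketch. It assumes two-state (good/bad) channels with perfect observation of the sensed channel, and it assumes p11 >= p01; the function names and the transition probabilities `p01` and `p11` are hypothetical and do not come from the paper.

```python
def propagate(belief, p01, p11):
    """One-step belief update for a channel that was not observed.

    belief is the probability that the channel is currently good;
    p01 and p11 are assumed bad->good and good->good probabilities.
    """
    return belief * p11 + (1.0 - belief) * p01


def myopic_step(beliefs, states, p01, p11):
    """Sense the channel with the largest belief and update all beliefs.

    states holds the true channel states (1 = good, 0 = bad) for this
    slot; only the state of the chosen channel is revealed.
    """
    chosen = max(range(len(beliefs)), key=lambda i: beliefs[i])
    reward = states[chosen]
    next_beliefs = [propagate(b, p01, p11) for b in beliefs]
    # The observed channel's belief resets to a known transition probability.
    next_beliefs[chosen] = p11 if states[chosen] == 1 else p01
    return chosen, reward, next_beliefs
```

Under these assumptions the policy only ever compares beliefs, which is one way to see why an ordering of the channels can suffice when the arms are stochastically identical.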
Revenue generation for truthful spectrum auction in dynamic spectrum access
- In Proc. ACM International Symposium on Mobile Ad Hoc Networking and Computing, 2009
Cited by 10 (1 self)
Spectrum is a critical yet scarce resource and it has been shown that dynamic spectrum access can significantly improve spectrum utilization. To achieve this, it is important to incentivize the primary license holders to open up their under-utilized spectrum for sharing. In this paper we present a secondary spectrum market where a primary license holder can sell access to its unused or under-used spectrum resources in the form of certain fine-grained spectrum-space-time units. Secondary wireless service providers can purchase such contracts to deploy new service, enhance their existing service, or deploy ad hoc service to meet flash-crowd demand. Within the context of this market, we investigate how to use auction mechanisms to allocate and price spectrum resources so that the primary license holder’s revenue is maximized. We begin by classifying a number of alternative auction formats in terms of spectrum demand. We then study a specific auction format where secondary wireless service providers have demands for fixed locations (cells). We propose an optimal auction based on the concept of virtual valuation. Assuming the knowledge of valuation distributions, the optimal auction uses the Vickrey-Clarke-Groves (VCG) mechanism to maximize the expected revenue while enforcing truthfulness. To reduce the computational complexity, we further design a truthful suboptimal auction with polynomial time complexity. It uses a monotone allocation and critical value payment to enforce truthfulness. Simulation results show that this suboptimal auction can generate stable expected revenue.
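The pairing of a monotone allocation with a critical value payment can be shown in its simplest form, a single-item second-price auction with a reserve. This is only a sketch of the general idea, not the multi-cell auction proposed in the paper; the function name, the bidder labels, and the `reserve` parameter are assumptions.

```python
def second_price_auction(bids, reserve=0.0):
    """Single-item auction: highest eligible bid wins, pays the critical value.

    bids maps a bidder label to its bid. The critical value is the lowest
    bid at which the winner would still win: the larger of the reserve and
    the best competing eligible bid.
    """
    eligible = {bidder: bid for bidder, bid in bids.items() if bid >= reserve}
    if not eligible:
        return None, 0.0
    winner = max(eligible, key=eligible.get)
    competing = [bid for bidder, bid in eligible.items() if bidder != winner]
    payment = max(competing + [reserve])
    return winner, payment
```

The allocation is monotone because raising a winning bid never turns it into a losing one, and the payment does not depend on the winner's own bid, which is what removes the incentive to misreport in this simple setting.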
Low-Complexity Approaches to Spectrum Opportunity Tracking
- in Proc. of CrownCom, 2007
Cited by 10 (9 self)
We consider opportunistic spectrum access under design constraints imposed at both node and link levels. First, hardware and energy limitations at node level may prevent a secondary user from sensing all the channels in the spectrum simultaneously. A channel selection strategy is thus necessary to track the time-varying spectrum opportunities. Second, sensing errors are inevitable. A secondary user needs to decide, based on imperfect sensing outcomes, whether to access the sensed channel and how to update its statistical knowledge of spectrum dynamics for better tracking in the future. Third, a secondary transmitter and its intended receiver need to hop synchronously in the spectrum in order to communicate. When a dynamic opportunity tracking strategy is used where the channel selection depends on the sensing history, achieving this synchrony is nontrivial in the absence of a dedicated control channel and in the presence of sensing errors. These practical constraints significantly complicate the design of opportunistic spectrum access, and the optimal performance requires the joint design of the spectrum sensor, opportunity tracking strategy, and spectrum access decisions. The focus of this paper is on developing low-complexity approaches for opportunistic spectrum access. We show that under certain conditions on the spectrum dynamics, simple myopic strategies can provide optimal performance for the joint design of spectrum sensor, opportunity tracking, and opportunity exploitation. We also propose an alternate low-complexity indexing strategy for other conditions that takes into account the expected time to channel availability. Index Terms: Opportunistic spectrum access, POMDP, myopic policy, spectrum opportunity tracking.
A measurement-based model for dynamic spectrum access
- in Proc. IEEE Conference on Military Communications (MILCOM), 2006
Cited by 10 (6 self)
In this paper we consider dynamically sharing the spectrum in the time-domain by exploiting whitespace between the bursty transmissions of a set of users, represented by an 802.11b-based wireless LAN (WLAN). Realizing that exploiting the under-utilization of the channel requires a good model of these users’ medium access, we propose a continuous-time semi-Markov model that captures the WLAN’s behavior yet remains tractable enough to be used for deriving optimal control strategies within a decision-theoretic framework. Our model is based on actual measurements in the 2.4 GHz ISM band using a vector signal analyzer to collect complex baseband data. We explore two different sensing strategies to identify spectrum opportunities depending on whether the primary user’s transmission scheme is known. The collected data is used to statistically characterize the idle and busy periods of the channel. Furthermore, we show that a continuous-time semi-Markov model is able to capture the data with good accuracy. The Kolmogorov-Smirnov test is used to validate the model and to assess the model’s goodness-of-fit quantitatively. A conclusion summarizes the main results of the paper.
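The goodness-of-fit check named here can be sketched as the one-sample Kolmogorov-Smirnov statistic, the largest gap between the empirical distribution of the measured durations and a candidate model distribution. The sketch below is generic: the function name and the candidate `cdf` argument are assumptions, and no distribution or data from the paper is reproduced.

```python
def ks_statistic(samples, cdf):
    """Largest distance between the empirical CDF of samples and cdf.

    cdf is any callable returning the candidate model's cumulative
    probability at a point, for example a fitted idle-period distribution.
    """
    ordered = sorted(samples)
    n = len(ordered)
    distance = 0.0
    for i, x in enumerate(ordered):
        model = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        distance = max(distance, (i + 1) / n - model, model - i / n)
    return distance
```

A smaller statistic indicates a closer fit; turning it into a p-value requires the Kolmogorov distribution, which is omitted from this sketch.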
Channel Probing for Opportunistic Access with Multi-channel Sensing
Cited by 8 (7 self)
We consider an opportunistic communication system consisting of multiple independent channels with time-varying states. We formulate the problem of optimal sequential channel selection as a restless multi-armed bandit process, for which a powerful policy, Whittle’s index policy, can be implemented based on the indexability of the system. We obtain Whittle’s index in closed-form under the average reward criterion, which leads to the direct implementation of Whittle’s index policy. To evaluate the performance of Whittle’s index policy, we provide simple algorithms to calculate an upper bound of the optimal performance. The tightness of the upper bound and the near-optimal performance of Whittle’s index policy are illustrated with simulation examples. When channels are stochastically identical, we show that Whittle’s index policy is equivalent to the myopic policy, which has a simple and robust structure. Based on this structure, we establish the approximation factors of the performance of Whittle’s index policy. Furthermore, we show that Whittle’s index policy is optimal under certain conditions. Index Terms: Multi-channel opportunistic access, restless multi-armed bandit, Whittle’s index, indexability
Cognitive Medium Access: Exploration, Exploitation and Competition
Cited by 7 (2 self)
This paper establishes the equivalence between cognitive medium access and the competitive multi-armed bandit problem. First, the scenario in which a single cognitive user wishes to opportunistically exploit the availability of empty frequency bands in the spectrum with multiple bands is considered. In this scenario, the availability probability of each channel is unknown to the cognitive user. Hence efficient medium access strategies must strike a balance between exploring the availability of other free channels and exploiting the opportunities identified thus far. By adopting a Bayesian approach for this classical bandit problem, the optimal medium access strategy is derived and its underlying recursive structure is illustrated via examples. To avoid the prohibitive computational complexity of the optimal strategy, a low complexity asymptotically optimal strategy is developed. The proposed strategy does not require any prior statistical knowledge about the traffic pattern on the different channels. Next, the multi-cognitive user scenario is considered and low complexity medium access protocols, which strike the optimal balance between exploration and exploitation in such competitive environments, are developed. Finally, this formalism is extended to the case in which each cognitive user is capable of sensing and using multiple channels simultaneously.
Optimal Cognitive Access of Markovian Channels under Tight Collision Constraints
Cited by 7 (6 self)
The problem of cognitive access of channels of primary users by a secondary user is considered. The transmissions of primary users are modeled as independent continuous-time Markovian on-off processes. A secondary cognitive user employs a slotted transmission format, and it senses one of the possible channels before transmission. The objective of the cognitive user is to maximize its throughput subject to collision constraints imposed by the primary users. The optimal access strategy is in general a solution of a constrained partially observable Markov decision process, which involves a constrained optimization in an infinite dimensional functional space. It is shown in this paper that, when the collision constraints are tight, the optimal access strategy can be implemented by a simple memoryless access policy with periodic channel sensing. Analytical expressions are given for the thresholds on collision probabilities for which memoryless access performs optimally. Extensions to multiple secondary users are also presented. Numerical and theoretical results are presented to validate and extend the analysis for different practical scenarios. Index Terms: Cognitive radio, Dynamic spectrum allocation, Cognitive medium access, Markov decision processes.
Distributed sensing and access in cognitive radio networks
- in Proc. 10th International Symposium on Spread Spectrum Techniques and Applications (ISSSTA), 2008
Cited by 6 (5 self)
We consider an ad hoc network of secondary users searching for idle frequency bands in a spectrum consisting of multiple channels. In each slot, a secondary user chooses one channel to sense and decides whether to access based on the sensing outcome. A sensing strategy for intelligent channel selection is crucial to track the rapidly varying spectrum opportunities. In an ad hoc network without a central controller or common control channels, a secondary user can only resort to its local observations in the decision making. The tradeoff is between choosing the channel most likely to be idle and avoiding other competing secondary users. We show that the problem can be formulated as a decentralized Partially Observable Markov Decision Process (POMDP). A suboptimal randomized sensing policy is then proposed. This policy effectively addresses this design tradeoff and offers significant improvement in network throughput over the optimal single-user design. Index Terms: Cognitive radio, spectrum opportunity tracking, multi-user diversity, decentralized POMDP, randomized policy.
Dynamic multichannel access with imperfect channel state detection
- IEEE Trans. Signal Process., 2010
Cited by 5 (3 self)
A restless multi-armed bandit problem that arises in multichannel opportunistic communications is considered, where channels are modeled as independent and identical Gilbert–Elliot channels and channel state detection is subject to errors. A simple structure of the myopic policy is established under a certain condition on the false alarm probability of the channel state detector. It is shown that myopic actions can be obtained by maintaining a simple channel ordering without knowing the underlying Markovian model. The optimality of the myopic policy is proved for the case of two channels and conjectured for general cases. Lower and upper bounds on the performance of the myopic policy are obtained in closed-form, which characterize the scaling behavior of the achievable throughput of the multichannel opportunistic system. The approximation factor of the myopic policy is also analyzed to bound its worst-case performance loss with respect to the optimal performance. Index Terms: Cognitive radio, dynamic multichannel access, myopic policy, restless multi-armed bandit.
A DECISION-THEORETIC FRAMEWORK FOR OPPORTUNISTIC SPECTRUM ACCESS
Cited by 4 (2 self)
The authors identify basic components, fundamental trade-offs, and practical constraints in opportunistic spectrum access. A decision-theoretic framework based on the theory of partially observable Markov decision processes is introduced.

