Results 1–10 of 168
Finite-time analysis of the multi-armed bandit problem
 Machine Learning
, 2002
Abstract

Cited by 426 (13 self)
Abstract. Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to find profitable actions and taking the empirically best action as often as possible. A popular measure of a policy's success in addressing this dilemma is the regret, that is, the loss due to the fact that the globally optimal policy is not followed at all times. One of the simplest examples of the exploration/exploitation dilemma is the multi-armed bandit problem. Lai and Robbins were the first to show that the regret for this problem has to grow at least logarithmically in the number of plays. Since then, policies which asymptotically achieve this regret have been devised by Lai and Robbins and many others. In this work we show that the optimal logarithmic regret is also achievable uniformly over time, with simple and efficient policies, and for all reward distributions with bounded support. Keywords: bandit problems, adaptive allocation rules, finite horizon regret
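A policy of the kind the abstract describes, an index rule whose exploration bonus yields logarithmic regret uniformly over time, can be sketched as follows. This is an illustrative UCB-style implementation, not necessarily the exact policy analyzed in the paper; the Bernoulli arms and horizon are made up for the demo.

```python
import math
import random

def ucb1(reward_fns, horizon, seed=0):
    """Play a multi-armed bandit with a UCB-style index policy.

    reward_fns: list of callables rng -> reward in [0, 1].
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(reward_fns)
    counts = [0] * k
    sums = [0.0] * k

    # Initialization: pull each arm once.
    for a in range(k):
        sums[a] += reward_fns[a](rng)
        counts[a] += 1

    for t in range(k, horizon):
        # Index: empirical mean plus an exploration bonus that shrinks
        # as an arm accumulates plays (this keeps total regret logarithmic).
        ucb = [sums[a] / counts[a] + math.sqrt(2 * math.log(t + 1) / counts[a])
               for a in range(k)]
        a = max(range(k), key=lambda i: ucb[i])
        sums[a] += reward_fns[a](rng)
        counts[a] += 1
    return counts

# Two Bernoulli arms with means 0.9 and 0.5: the better arm should
# dominate the play counts.
arms = [lambda rng: 1.0 if rng.random() < 0.9 else 0.0,
        lambda rng: 1.0 if rng.random() < 0.5 else 0.0]
pulls = ucb1(arms, horizon=2000)
```

The bonus term is the standard Chernoff-bound-style confidence radius; with bounded rewards it ensures each suboptimal arm is played only O(log n) times.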
QUEST: a Bayesian adaptive psychometric method
 Percept. Psychophys. 33
, 1983
Abstract

Cited by 170 (21 self)
An adaptive psychometric procedure that places each trial at the current most probable Bayesian estimate of threshold is described. The procedure takes advantage of the common finding that the human psychometric function is invariant in form when expressed as a function of log intensity. The procedure is simple, fast, and efficient, and may be easily implemented on any computer. A psychometric function describes the relation between some physical measure of a stimulus and the probability of a particular psychophysical response. The physical measure is usually stimulus strength, and the response is "yes" (in a yes/no experiment) or "correct" (in a forced-choice experiment). More generally, there are several possible responses, each with its own psychometric function (e.g., in a rating-scale experiment). Psychometric procedures are ways of testing the observer so as to gain information about the psychometric function. The advantages of adaptive procedures, which make use of previous responses to guide further testing, have been discussed by numerous authors (Cornsweet, 1962; Levitt, 1971; Taylor & Creelman, 1967; Wetherill & Levitt, 1965). Because trials are more effective when they are judiciously placed, an adaptive procedure will be more efficient the more it makes intelligent use of available information. This information is of two sorts: that derived from previous trials (data), and that drawn from the memory of the experimenter, published reports, and so on (prior knowledge). Prior knowledge may be further divided into information about the shape of the psychometric function and knowledge about threshold in the particular condition under study. Several recent methods make efficient use of the data and of knowledge about the shape of the psychometric function by calculating after each trial a maximum likelihood estimate of threshold.
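The trial-placement rule can be sketched as follows. The Weibull-style function shape, the slope, guess, and lapse values, the grid, and the simulated observer are illustrative assumptions, not the paper's settings.

```python
import math
import random

def psychometric(log_intensity, threshold, slope=3.5, guess=0.5, lapse=0.01):
    # Fixed-shape Weibull-type function of log intensity (illustrative
    # parameter values, not the paper's).
    p = 1.0 - math.exp(-(10.0 ** (slope * (log_intensity - threshold))))
    return guess + (1.0 - guess - lapse) * p

def quest_like(observer, grid, prior, trials=40):
    """Place each trial at the current most probable (posterior mode)
    Bayesian estimate of threshold, then update the posterior with the
    observer's response."""
    post = list(prior)
    for _ in range(trials):
        mode = max(range(len(grid)), key=lambda i: post[i])
        x = grid[mode]                      # test at the current estimate
        correct = observer(x)
        for i, th in enumerate(grid):
            p = psychometric(x, th)
            post[i] *= p if correct else (1.0 - p)
        z = sum(post)
        post = [v / z for v in post]
    return grid[max(range(len(grid)), key=lambda i: post[i])]

# Simulated observer with a true threshold of -0.3 log units.
rng = random.Random(1)
true_th = -0.3
observer = lambda x: rng.random() < psychometric(x, true_th)
grid = [i / 100.0 - 1.0 for i in range(201)]      # thresholds in [-1, 1]
prior = [1.0 / len(grid)] * len(grid)
estimate = quest_like(observer, grid, prior)
```

Because the function's form is assumed invariant in log intensity, only the threshold parameter is estimated; the posterior update is the Bayesian counterpart of the maximum-likelihood methods mentioned at the end of the abstract.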
Improving Regression Estimation: Averaging Methods for Variance Reduction with Extensions to General Convex Measure Optimization
, 1993
Population structure and eigenanalysis
 PLoS Genet 2(12): e190 DOI: 10.1371/journal.pgen.0020190
, 2006
Abstract

Cited by 75 (0 self)
Current methods for inferring population structure from genetic data do not provide formal significance tests for population differentiation. We discuss an approach to studying population structure (principal components analysis) that was first applied to genetic data by Cavalli-Sforza and colleagues. We place the method on a solid statistical footing, using results from modern statistics to develop formal significance tests. We also uncover a general "phase change" phenomenon about the ability to detect structure in genetic data, which emerges from the statistical theory we use and has an important implication for the ability to discover structure in genetic data: for a fixed but large dataset size, divergence between two populations (as measured, for example, by a statistic like F_ST) below a threshold is essentially undetectable, but a little above threshold, detection will be easy. This means that we can predict the dataset size needed to detect structure.
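A minimal numerical illustration of that detectability phenomenon is sketched below on hypothetical data: the sample sizes, frequency shift, and normalization constants are made-up choices, with columns scaled by the usual allele-frequency normalization before the eigenanalysis.

```python
import numpy as np

def leading_eigenvalue(genotypes):
    """Largest eigenvalue of the sample covariance of a (samples x SNPs)
    genotype matrix after per-SNP allele-frequency normalization.
    A leading eigenvalue well above the bulk signals population structure."""
    g = np.asarray(genotypes, dtype=float)
    p_hat = g.mean(axis=0) / 2.0                       # allele frequency per SNP
    g = (g - 2.0 * p_hat) / np.sqrt(2.0 * p_hat * (1.0 - p_hat))
    cov = g @ g.T / g.shape[1]
    return np.linalg.eigvalsh(cov)[-1]

rng = np.random.default_rng(0)
n, m = 100, 500
# Homogeneous sample: every individual draws from the same frequencies.
freqs = rng.uniform(0.2, 0.8, size=m)
homog = rng.binomial(2, freqs, size=(n, m))
# Two diverged populations: the second half uses shifted frequencies.
freqs2 = np.clip(freqs + 0.25, 0.05, 0.95)
half = n // 2
structured = np.vstack([rng.binomial(2, freqs, size=(half, m)),
                        rng.binomial(2, freqs2, size=(half, m))])

lam_h = leading_eigenvalue(homog)      # near the random-matrix bulk edge
lam_s = leading_eigenvalue(structured) # well separated from the bulk
```

For the homogeneous sample the top eigenvalue stays near the bulk edge predicted by random-matrix theory; once divergence crosses the threshold the abstract describes, a clearly separated eigenvalue appears.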
Heuristics for cardinality constrained portfolio optimisation
, 2000
Abstract

Cited by 67 (4 self)
In this paper we consider the problem of finding the efficient frontier associated with the standard mean-variance portfolio optimisation model. We extend the standard model to include cardinality constraints that limit a portfolio to have a specified number of assets, and to impose limits on the proportion of the portfolio held in a given asset (if any of the asset is held). We illustrate the differences that arise in the shape of this efficient frontier when such constraints are present. We present three heuristic algorithms based upon genetic algorithms, tabu search and simulated annealing for finding the cardinality constrained efficient frontier. Computational results are presented for five data sets involving up to 225 assets.
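One of the three heuristics, simulated annealing over asset subsets, can be sketched as below. This is a deliberate simplification: weights are held equal within the chosen subset, whereas the paper also optimises the proportion held in each asset; the objective trade-off, cooling schedule, and toy data are illustrative assumptions.

```python
import math
import random

def portfolio_cost(subset, mu, cov, lam=1.0):
    """Mean-variance objective for an equally weighted portfolio on the
    chosen subset (a simplification of the full weight optimisation)."""
    w = 1.0 / len(subset)
    var = sum(cov[i][j] * w * w for i in subset for j in subset)
    ret = sum(mu[i] * w for i in subset)
    return lam * var - ret

def anneal_subset(mu, cov, k, steps=2000, seed=0):
    """Simulated annealing over k-asset subsets (the cardinality constraint)."""
    rng = random.Random(seed)
    n = len(mu)
    current = rng.sample(range(n), k)
    cost = portfolio_cost(current, mu, cov)
    best, best_cost = list(current), cost
    for t in range(steps):
        temp = (1.0 - t / steps) + 1e-6          # linear cooling schedule
        # Neighbour move: swap one held asset for one currently not held.
        cand = list(current)
        cand[rng.randrange(k)] = rng.choice(
            [a for a in range(n) if a not in current])
        c = portfolio_cost(cand, mu, cov)
        if c < cost or rng.random() < math.exp((cost - c) / temp):
            current, cost = cand, c
            if c < best_cost:
                best, best_cost = list(cand), c
    return sorted(best), best_cost

# Toy instance: uncorrelated unit-variance assets, two with clearly
# higher expected return, cardinality k = 2.
mu = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9]
cov = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
subset, cost = anneal_subset(mu, cov, k=2)
```

Tracing the frontier would repeat this search across a grid of risk-aversion values `lam`; the cardinality constraint is what makes the discrete subset search, and hence a heuristic, necessary.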
A note on universality of the distribution of the largest eigenvalues in certain sample covariance matrices
 J. Statist. Phys
, 2002
Abstract

Cited by 63 (3 self)
Recently Johansson (21) and Johnstone (16) proved that the distribution of the (properly rescaled) largest principal component of the complex (real) Wishart matrix X*X (X^t X) converges to the Tracy–Widom law as n, p (the dimensions of X) tend to infinity in some ratio n/p → c > 0. We extend these results in two directions. First of all, we prove that the joint distribution of the first, second, third, etc. eigenvalues of a Wishart matrix converges (after a proper rescaling) to the Tracy–Widom distribution. Second of all, we explain how the combinatorial machinery developed for Wigner random matrices in refs. 27, 38, and 39 allows one to extend the results by Johansson and Johnstone to the case of X with non-Gaussian entries, provided n − p = O(p^{1/3}). We also prove that λ_max ≤ (n^{1/2} + p^{1/2})^2 + O(p^{1/2} log(p)) (a.e.) for general c > 0. KEY WORDS: Sample covariance matrices; principal component; Tracy–Widom distribution.
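The edge location appearing in that bound is easy to check numerically; a quick Monte Carlo illustration with made-up dimensions:

```python
import numpy as np

# Largest eigenvalue of a white (real Gaussian) Wishart matrix X^t X
# concentrates near the "soft edge" (n^{1/2} + p^{1/2})^2, with
# Tracy-Widom fluctuations of much smaller order, consistent with the
# bound lambda_max <= (n^{1/2} + p^{1/2})^2 + O(p^{1/2} log p).
rng = np.random.default_rng(0)
n, p = 400, 100
X = rng.standard_normal((n, p))
lam_max = np.linalg.eigvalsh(X.T @ X)[-1]
edge = (np.sqrt(n) + np.sqrt(p)) ** 2
```

Here `edge` is 900 while the fluctuation scale is only of order (n^{1/2} + p^{1/2}) (n^{-1/2} + p^{-1/2})^{1/3}, a few percent of the edge.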
A data distortion by probability distribution
 ACM TRANSACTIONS ON DATABASE SYSTEMS
, 1985
Abstract

Cited by 62 (0 self)
This paper introduces data distortion by probability distribution, a probability distortion that involves three steps. The first step is to identify the underlying density function of the original series and to estimate the parameters of this density function. The second step is to generate a series of data from the estimated density function. The final step is to map the generated series onto the original one and replace it. Because the original data set is replaced by the distorted one, probability distortion guards the privacy of an individual belonging to the original data set. At the same time, the probability-distorted series provides asymptotically the same statistical properties as those of the original series, since both are under the same distribution. Unlike conventional point distortion, probability distortion is difficult to compromise by repeated queries, and provides a maximum exposure for statistical analysis.
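The three steps can be sketched as follows, assuming for illustration that the identified density family is normal (the paper's first step is precisely to identify the family from the data):

```python
import random
import statistics

def probability_distort(series, seed=0):
    """Data distortion by probability distribution, assuming a normal
    underlying density for illustration.
    1) Estimate the density's parameters from the original series.
    2) Generate a fresh series from the estimated density.
    3) Release the generated series in place of the original.
    """
    mu = statistics.fmean(series)       # step 1: parameter estimation
    sigma = statistics.stdev(series)
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in series]   # steps 2-3

# Demo on a synthetic confidential series.
rng = random.Random(42)
original = [rng.gauss(50.0, 5.0) for _ in range(5000)]
released = probability_distort(original)
```

No individual value of `original` appears in `released`, yet aggregate statistics (mean, spread) agree asymptotically because both series follow the same estimated distribution; that is what makes repeated queries unhelpful to an attacker.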
Localized Fault-Tolerant Event Boundary Detection in Sensor Networks
 In Proc. of IEEE INFOCOM
, 2005
Abstract

Cited by 59 (9 self)
Abstract — This paper targets the identification of faulty sensors and detection of the reach of events in sensor networks with faulty sensors. Typical applications include the detection of the transportation front line of a contamination and the diagnosis of network health. We propose and analyze two novel algorithms for faulty sensor identification and fault-tolerant event boundary detection. These algorithms are purely localized and thus scale well to large sensor networks. Their computational overhead is low, since only simple numerical operations are involved. Simulation results indicate that these algorithms can clearly detect the event boundary and can identify faulty sensors with a high accuracy and a low false alarm rate even when as many as 20% of the sensors become faulty. Our work is exploratory in that the proposed algorithms can accept any kind of scalar values as inputs, a dramatic improvement over existing works that take only 0/1 decision predicates. Therefore, our algorithms are generic: they can be applied as long as the "events" can be modelled by numerical values. Though designed for sensor networks, our algorithms can be applied to outlier detection and regional data analysis in spatial data mining.
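A purely localized test in this spirit can be sketched as below. The median statistic, the threshold `tau`, and the line topology are illustrative assumptions, not the paper's exact algorithms; the point is that each sensor uses only its neighbours' scalar readings.

```python
import statistics

def flag_faulty(readings, neighbors, tau):
    """Flag a sensor whose reading deviates from the median of its
    neighbourhood by more than tau. Only local, simple numerical
    operations are involved, so the test scales with network size."""
    flagged = set()
    for i, x in enumerate(readings):
        med = statistics.median(readings[j] for j in neighbors[i])
        if abs(x - med) > tau:
            flagged.add(i)
    return flagged

# 30 sensors on a line; sensors 0-14 read ~10 (inside the event region),
# sensors 15-29 read ~1 (outside it). Two sensors are faulty.
readings = [10.0] * 15 + [1.0] * 15
readings[5] = 0.0     # faulty sensor inside the event region
readings[22] = 9.5    # faulty sensor outside the event region
neighbors = {i: [j for j in range(30) if j != i and abs(i - j) <= 2]
             for i in range(30)}
faulty = flag_faulty(readings, neighbors, tau=5.0)
```

With `tau=5.0` the two faulty sensors are flagged while the legitimate sensors straddling the event boundary (indices 14 and 15) are not, since their neighbourhood medians sit between the two regimes.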
Secure transmission with multiple antennas II: The MIMOME wiretap channel
 IEEE Trans. Inf. Theory
Abstract

Cited by 57 (7 self)
Abstract—The role of multiple antennas for secure communication is investigated within the framework of Wyner's wiretap channel. We characterize the secrecy capacity in terms of generalized eigenvalues when the sender and eavesdropper have multiple antennas, the intended receiver has a single antenna, and the channel matrices are fixed and known to all the terminals, and show that a beamforming strategy is capacity-achieving. In addition, we study a masked beamforming scheme that radiates power isotropically in all directions and show that it attains near-optimal performance in the high-SNR regime. Insights into the scaling behavior of the capacity in the large-antenna regime, as well as extensions to ergodic fading channels, are also provided. Index Terms—Artificial noise, broadcast channel, cryptography, generalized eigenvalues, masked beamforming, MIMO systems, multiple antennas, secrecy capacity, secure space-time codes, wiretap channel.
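The generalized-eigenvalue characterization for the single-antenna-receiver case can be evaluated numerically as below; the channel vectors, matrices, and power level are made-up toy values.

```python
import numpy as np

def secrecy_capacity_miso(h_r, H_e, power):
    """Secrecy capacity (bits per channel use) as the log of the largest
    generalized eigenvalue of the pencil (I + P h h^H, I + P H_e^H H_e),
    for a multi-antenna sender, single-antenna intended receiver, and
    multi-antenna eavesdropper with fixed, known channels."""
    nt = h_r.shape[0]
    A = np.eye(nt) + power * np.outer(h_r, h_r.conj())
    B = np.eye(nt) + power * (H_e.conj().T @ H_e)
    # Largest generalized eigenvalue of (A, B).
    lam_max = np.linalg.eigvals(np.linalg.solve(B, A)).real.max()
    return max(0.0, float(np.log2(lam_max)))

# Strong receiver channel, weak eavesdropper: positive secrecy rate.
h_r = np.array([1.0, 1.0])
cap = secrecy_capacity_miso(h_r, 0.1 * np.ones((2, 2)), power=10.0)

# Eavesdropper observing the same effective channel: zero secrecy rate.
cap_degraded = secrecy_capacity_miso(
    h_r, np.array([[1.0, 1.0], [0.0, 0.0]]), power=10.0)
```

The capacity-achieving transmit strategy is beamforming along the generalized eigenvector of that pencil, which trades off gain toward the receiver against leakage toward the eavesdropper.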
The Effect of Resource Limits and Task Complexity on Collaborative Planning in Dialogue
 Artificial Intelligence Journal
, 1996
Abstract

Cited by 53 (11 self)
This paper shows how agents' choice in communicative action can be designed to mitigate the effect of their resource limits in the context of particular features of a collaborative planning task. I first motivate a number of hypotheses about effective language behavior based on a statistical analysis of a corpus of natural collaborative planning dialogues. These hypotheses are then tested in a dialogue testbed whose design is motivated by the corpus analysis. Experiments in the testbed examine the interaction between (1) agents' resource limits in attentional capacity and inferential capacity; (2) agents' choice in communication; and (3) features of communicative tasks that affect task difficulty, such as inferential complexity, degree of belief coordination required, and tolerance for errors. The results show that good algorithms for communication must be defined relative to the agents' resource limits and the features of the task. Algorithms that are inefficient for inferentially simple, low-coordination or fault-tolerant tasks are effective when tasks require coordination or complex inferences, or are fault-intolerant. The results provide an explanation for the occurrence of utterances in human dialogues that, prima facie, appear inefficient, and provide the basis for the design of effective algorithms for communicative choice for resource-limited agents.