Results 1–10 of 49
Near-optimal sensor placements in Gaussian processes
In ICML, 2005
Abstract
Cited by 175 (26 self)
When monitoring spatial phenomena, which can often be modeled as Gaussian processes (GPs), choosing sensor locations is a fundamental task. There are several common strategies to address this task, for example, geometry or disk models, placing sensors at the points of highest entropy (variance) in the GP model, and A-, D-, or E-optimal design. In this paper, we tackle the combinatorial optimization problem of maximizing the mutual information between the chosen locations and the locations which are not selected. We prove that the problem of finding the configuration that maximizes mutual information is NP-complete. To address this issue, we describe a polynomial-time approximation that is within (1 − 1/e) of the optimum, by exploiting the submodularity of mutual information. We also show how submodularity can be used to obtain online bounds, and design branch-and-bound search procedures. We then extend our algorithm to exploit lazy evaluations and local structure in the GP, yielding significant speedups. We also extend our approach to find placements which are robust against node failures and uncertainties in the model. These extensions are again associated with rigorous theoretical approximation guarantees, exploiting the submodularity of the objective function. We demonstrate the advantages of our approach toward optimizing mutual information in a very extensive empirical study on two real-world data sets.
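The "lazy evaluations" mentioned in this abstract refer to Minoux-style accelerated greedy: submodularity guarantees marginal gains only shrink as the selected set grows, so stale gains can be kept in a priority queue and recomputed only when an element reaches the top. A minimal sketch for a generic monotone submodular objective `f` (function names are illustrative, not the paper's GP mutual-information code):

```python
import heapq

def lazy_greedy(ground_set, f, k):
    """Greedy maximization of a monotone submodular set function f with
    lazy evaluations. f maps a frozenset to a float; k is the budget.
    Achieves the (1 - 1/e) guarantee for monotone submodular f."""
    selected = []
    base = f(frozenset())
    # Max-heap (negated gains) of marginal-gain upper bounds.
    heap = [(-(f(frozenset([e])) - base), e) for e in ground_set]
    heapq.heapify(heap)
    while heap and len(selected) < k:
        neg_gain, e = heapq.heappop(heap)
        # Recompute the possibly stale gain; submodularity says it can only drop.
        gain = f(frozenset(selected) | {e}) - f(frozenset(selected))
        if not heap or gain >= -heap[0][0]:
            selected.append(e)   # still best: select without touching the rest
        else:
            heapq.heappush(heap, (-gain, e))
    return selected
```

On a toy coverage objective (coverage functions are monotone submodular), `lazy_greedy(['a', 'b', 'c'], f, 2)` picks the two sets with the largest joint coverage while evaluating `f` far fewer times than plain greedy would on larger instances.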
Near-optimal nonmyopic value of information in graphical models
In Annual Conference on Uncertainty in Artificial Intelligence
Abstract
Cited by 91 (17 self)
A fundamental issue in real-world systems, such as sensor networks, is the selection of observations which most effectively reduce uncertainty. More specifically, we address the long-standing problem of nonmyopically selecting the most informative subset of variables in a graphical model. We present the first efficient randomized algorithm providing a constant-factor (1 − 1/e − ε) approximation guarantee for any ε > 0 with high confidence. The algorithm leverages the theory of submodular functions, in combination with a polynomial bound on sample complexity. We furthermore prove that no polynomial-time algorithm can provide a constant-factor approximation better than (1 − 1/e) unless P = NP. Finally, we provide extensive evidence of the effectiveness of our method on two complex real-world datasets.
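When, as here, the objective can only be estimated by sampling, greedy selection can still be run on Monte Carlo estimates of the marginal gains. The sketch below is a plain sample-averaged greedy, not the paper's algorithm; its sample-complexity analysis is what makes this kind of procedure a rigorous (1 − 1/e − ε) approximation. `sample_f` and `n_samples` are hypothetical names:

```python
import random

def noisy_greedy(ground_set, sample_f, k, n_samples=50, seed=0):
    """Greedy subset selection when the objective can only be sampled.
    sample_f(S, rng) returns one noisy evaluation of f(S); marginal gains
    are estimated by averaging n_samples draws."""
    rng = random.Random(seed)
    est = lambda S: sum(sample_f(S, rng) for _ in range(n_samples)) / n_samples
    selected = set()
    for _ in range(k):
        best = max((e for e in ground_set if e not in selected),
                   key=lambda e: est(selected | {e}) - est(selected))
        selected.add(best)
    return selected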
Maximizing nonmonotone submodular functions
In Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2007
Abstract
Cited by 85 (12 self)
Submodular maximization generalizes many important problems including Max Cut in directed/undirected graphs and hypergraphs, certain constraint satisfaction problems, and maximum facility location problems. Unlike the problem of minimizing submodular functions, the problem of maximizing submodular functions is NP-hard. In this paper, we design the first constant-factor approximation algorithms for maximizing nonnegative submodular functions. In particular, we give a deterministic local search 1/2-approximation and a randomized approximation algorithm ...
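For nonmonotone objectives, plain greedy can get stuck; the local-search idea is to keep adding or deleting single elements while the value improves. A minimal sketch in that spirit (without the value-scaling refinements needed for the paper's stated guarantee); all names are illustrative:

```python
def local_search_submodular(ground_set, f, max_iter=1000):
    """Add/delete local search for maximizing a nonnegative (possibly
    nonmonotone) submodular set function f."""
    # Start from the best singleton.
    current = {max(ground_set, key=lambda e: f({e}))}
    for _ in range(max_iter):
        improved = False
        for e in ground_set:
            candidate = current ^ {e}   # add e if absent, delete it if present
            if candidate and f(candidate) > f(current):
                current, improved = candidate, True
        if not improved:
            break
    return current
```

Graph cut functions are nonnegative, symmetric, and submodular, which is why Max Cut serves as the running example: on a 4-cycle, this local search finds the optimal cut of value 4.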
Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design
Abstract
Cited by 46 (9 self)
Many applications require optimizing an unknown, noisy function that is expensive to evaluate. We formalize this task as a multi-armed bandit problem, where the payoff function is either sampled from a Gaussian process (GP) or has low RKHS norm. We resolve the important open problem of deriving regret bounds for this setting, which imply novel convergence rates for GP optimization. We analyze GP-UCB, an intuitive upper-confidence-based algorithm, and bound its cumulative regret in terms of maximal information gain, establishing a novel connection between GP optimization and experimental design. Moreover, by bounding the latter in terms of operator spectra, we obtain explicit sublinear regret bounds for many commonly used covariance functions. In some important cases, our bounds have surprisingly weak dependence on the dimensionality. In our experiments on real sensor data, GP-UCB compares favorably with other heuristic GP optimization approaches.
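The GP-UCB rule itself is simple: at each round, query the point maximizing μ(x) + √β · σ(x) under the GP posterior. A self-contained sketch of one acquisition step on a discrete candidate set, assuming an RBF kernel and a naive linear solver (hyperparameters `ls`, `beta`, `noise` are illustrative defaults, not the paper's choices):

```python
import math

def rbf(x, y, ls=0.5):
    """Squared-exponential (RBF) kernel on scalars."""
    return math.exp(-((x - y) ** 2) / (2 * ls * ls))

def solve(A, b):
    """Naive Gauss-Jordan elimination; adequate for these tiny systems."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c]:
                factor = M[r][c] / M[c][c]
                M[r] = [a - factor * b_ for a, b_ in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def gp_ucb_step(X_obs, y_obs, candidates, beta=4.0, noise=1e-2):
    """One GP-UCB acquisition: return the candidate maximizing
    mu(x) + sqrt(beta) * sigma(x) under the GP posterior."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X_obs)]
         for i, a in enumerate(X_obs)]
    alpha = solve(K, y_obs)                      # K^{-1} y
    def ucb(x):
        k = [rbf(x, xi) for xi in X_obs]
        mu = sum(ki * ai for ki, ai in zip(k, alpha))
        kinvk = solve(K, k)                      # K^{-1} k
        var = rbf(x, x) - sum(ki * vi for ki, vi in zip(k, kinvk))
        return mu + math.sqrt(beta) * math.sqrt(max(var, 0.0))
    return max(candidates, key=ucb)
```

With only the endpoints of [0, 1] observed, the rule picks an unexplored midpoint: the variance bonus dominates, which is exactly the exploration behavior the regret analysis controls.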
Spectral bounds for sparse PCA: Exact and greedy algorithms
In Advances in Neural Information Processing Systems 18, 2006
Abstract
Cited by 44 (4 self)
Sparse PCA seeks approximate sparse “eigenvectors” whose projections capture the maximal variance of data. As a cardinality-constrained and non-convex optimization problem, it is NP-hard, and yet it is encountered in a wide range of applied fields, from bioinformatics to finance. Recent progress has focused mainly on continuous approximation and convex relaxation of the hard cardinality constraint. In contrast, we consider an alternative discrete spectral formulation based on variational eigenvalue bounds, and provide an effective greedy strategy as well as provably optimal solutions using branch-and-bound search. Moreover, the exact methodology used reveals a simple renormalization step that improves approximate solutions obtained by any continuous method. The resulting performance gain of discrete algorithms is demonstrated on real-world benchmark data and in extensive Monte Carlo evaluation trials.
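The greedy strategy in the discrete formulation grows a support set one variable at a time, each time adding the index that most increases the largest eigenvalue of the corresponding principal submatrix of the covariance. A minimal sketch with a stdlib power iteration (function names are illustrative; the paper's version also uses the variational bounds to prune):

```python
import math

def max_eigval(A, iters=200):
    """Largest eigenvalue of a small symmetric PSD matrix via power iteration."""
    n = len(A)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
        lam = norm          # ||A v|| -> lambda_max as v converges
    return lam

def greedy_sparse_pca(C, k):
    """Greedy forward selection for sparse PCA on covariance matrix C:
    grow the support by the variable that most increases lambda_max of
    the chosen principal submatrix."""
    support = []
    for _ in range(k):
        def score(j):
            idx = support + [j]
            sub = [[C[a][b] for b in idx] for a in idx]
            return max_eigval(sub)
        j = max((j for j in range(len(C)) if j not in support), key=score)
        support.append(j)
    return support
```

On a covariance where two variables are strongly correlated, the greedy support picks that correlated pair over an isolated high-variance variable, since the pair's submatrix has the larger top eigenvalue.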
Algorithms for Subset Selection in Linear Regression
In STOC'08, 2008
Abstract
Cited by 30 (3 self)
We study the problem of selecting a subset of k random variables to observe that will yield the best linear prediction of another variable of interest, given the pairwise correlations between the observation variables and the predictor variable. Under approximation-preserving reductions, this problem is also equivalent to the “sparse approximation” problem of approximating signals concisely. We propose and analyze exact and approximation algorithms for several special cases of practical interest. We give an FPTAS when the covariance matrix has constant bandwidth, and exact algorithms when the associated covariance graph, consisting of edges for pairs of variables with nonzero correlation, forms a tree or has a large (known) independent set. Furthermore, we give an exact algorithm when the variables can be embedded into a line such that the covariance decreases exponentially in the distance, and a constant-factor approximation when the variables have no “conditional suppressor variables”. Much of our reasoning is based on perturbation results for the R² multiple correlation measure, frequently used as a measure for “goodness-of-fit statistics”. It lies at the core of our FPTAS, and also allows us to extend exact algorithms to approximation algorithms when the matrix “nearly” falls into one of the above classes. We also use perturbation analysis to prove approximation guarantees for the widely used “Forward Regression” heuristic when the observation variables are nearly independent.
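The Forward Regression heuristic analyzed in this abstract greedily adds the column that most increases R². A minimal sketch, assuming centered data (no intercept term) and computing R² by projecting y onto a Gram-Schmidt orthonormal basis of the chosen columns; names are illustrative:

```python
import math

def forward_regression(X_cols, y, k):
    """Forward Regression: greedily add the column that most increases
    the R^2 of the least-squares fit of y (centered data assumed)."""
    def dot(a, b):
        return sum(x * z for x, z in zip(a, b))
    def r2(idx):
        # R^2 = ||projection of y onto span of chosen columns||^2 / ||y||^2,
        # computed via Gram-Schmidt orthonormalization.
        basis = []
        for j in idx:
            v = list(X_cols[j])
            for q in basis:
                c = dot(v, q)
                v = [vi - c * qi for vi, qi in zip(v, q)]
            n = math.sqrt(dot(v, v))
            if n > 1e-12:               # skip columns already in the span
                basis.append([vi / n for vi in v])
        return sum(dot(y, q) ** 2 for q in basis) / dot(y, y)
    chosen = []
    for _ in range(k):
        j = max((j for j in range(len(X_cols)) if j not in chosen),
                key=lambda j: r2(chosen + [j]))
        chosen.append(j)
    return chosen
```

The paper's point is that this heuristic, despite suppressor-variable pathologies in general, carries provable guarantees when the observation variables are nearly independent.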
Nonmyopic active learning of Gaussian processes: An exploration-exploitation approach
In ICML, 2007
Abstract
Cited by 28 (3 self)
When monitoring spatial phenomena, such as the ecological condition of a river, deciding where to make observations is a challenging task. In these settings, a fundamental question is when an active learning (or sequential design) strategy, where locations are selected based on previous measurements, will perform significantly better than sensing at an a priori specified set of locations. For Gaussian processes (GPs), which often accurately model spatial phenomena, we present an analysis and efficient algorithms that address this question. Central to our analysis is a theoretical bound which quantifies the performance difference between active and a priori design strategies. We consider GPs with unknown kernel parameters and present a nonmyopic approach for trading off exploration, i.e., decreasing uncertainty about the model parameters, and exploitation, i.e., near-optimally selecting observations when the parameters are (approximately) known. We discuss several exploration strategies, and present logarithmic sample complexity bounds for the exploration phase. We then extend our algorithm to handle nonstationary GPs, exploiting local structure in the model. A variational approach allows us to perform efficient inference in this class of nonstationary models. We also present extensive empirical evaluation on several real-world problems.
Generalized Spectral Bounds for Sparse LDA
In International Conference on Machine Learning (ICML'06), 2006
Abstract
Cited by 27 (3 self)
We present a discrete spectral framework for the sparse or cardinality-constrained solution of a generalized Rayleigh quotient. This NP-hard combinatorial optimization problem is central to supervised learning tasks such as sparse LDA, feature selection, and relevance ranking for classification. We derive a new generalized form of the Inclusion Principle for variational eigenvalue bounds, leading to exact and optimal sparse linear discriminants using branch-and-bound search. An efficient greedy (approximate) technique is also presented. The generalization performance of our sparse LDA algorithms is demonstrated with real-world UCI ML benchmarks and compared to a leading SVM-based gene selection algorithm for cancer classification.
A Note on the Budgeted Maximization of Submodular Functions
2005
Abstract
Cited by 26 (6 self)
Many set functions F in combinatorial optimization satisfy the diminishing returns property F(A ∪ X) − F(A) ≥ F(A′ ∪ X) − F(A′) for A ⊂ A′. Such functions are called submodular. A result of Nemhauser et al. states that the problem of selecting k-element subsets maximizing a nondecreasing submodular function can be approximated with a constant-factor (1 − 1/e) performance guarantee. Khuller et al. showed that for the special submodular function involved in the MAX-COVER problem, this approximation result generalizes to a budgeted setting under linear nonnegative cost functions. In this note, we extend this result to general submodular functions. Motivated by the problem of maximizing entropy in discrete graphical models, where the submodular objective cannot be evaluated exactly, we generalize our result to account for absolute errors.
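The core of the budgeted setting is the cost-benefit greedy rule: repeatedly add the affordable element with the best marginal-gain-to-cost ratio. A minimal sketch of that rule alone; the constant-factor guarantee discussed in this line of work additionally requires partial enumeration over small seed sets, omitted here, and all names are illustrative:

```python
def budgeted_greedy(ground_set, f, cost, budget):
    """Cost-benefit greedy for budgeted submodular maximization:
    repeatedly add the affordable element maximizing marginal gain
    divided by cost, stopping when nothing affordable helps."""
    selected = set()
    spent = 0.0
    while True:
        affordable = [e for e in ground_set
                      if e not in selected and spent + cost[e] <= budget]
        if not affordable:
            return selected
        best = max(affordable,
                   key=lambda e: (f(selected | {e}) - f(selected)) / cost[e])
        if f(selected | {best}) - f(selected) <= 0:
            return selected          # no affordable element adds value
        selected.add(best)
        spent += cost[best]
```

On a toy coverage instance the rule buys the cheap high-coverage element first and then stops rather than spend remaining budget on an element with zero marginal gain.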
Generalized MaximumEntropy Sampling
1999
Abstract
Cited by 25 (9 self)
We introduce the Generalized Constrained Maximum-Entropy Sampling Problem (GCMESP) as a common generalization of the ordinary Constrained Maximum-Entropy Sampling Problem (CMESP) and the Constrained D-Optimality Problem (CDOPTP). Exact algorithms for both CMESP and CDOPTP are based on branch-and-bound methods. We extend a spectral upper-bounding method for the CMESP to the GCMESP. Introduction: The Constrained D-Optimality Problem (CDOPTP) and the Constrained Maximum-Entropy Sampling Problem (CMESP) are both fundamental problems in experimental design. The CDOPTP has application in any statistical setting where we wish to fit a linear model and sampling is costly. The CMESP arises in such areas as environmental, geological, and atmospheric monitoring, where establishing and maintaining monitoring devices is costly. In Section 1, we introduce the Generalized Constrained Maximum-Entropy Sampling Problem (GCMESP) as a common generalization of the CMESP and the CDOPTP. In Section 2, generali ...