Near-optimal sensor placements in Gaussian processes
- In ICML, 2005
Cited by 91 (24 self)
When monitoring spatial phenomena, which can often be modeled as Gaussian processes (GPs), choosing sensor locations is a fundamental task. There are several common strategies to address this task, for example, geometry or disk models, placing sensors at the points of highest entropy (variance) in the GP model, and A-, D-, or E-optimal design. In this paper, we tackle the combinatorial optimization problem of maximizing the mutual information between the chosen locations and the locations which are not selected. We prove that the problem of finding the configuration that maximizes mutual information is NP-complete. To address this issue, we describe a polynomial-time approximation that is within (1 − 1/e) of the optimum by exploiting the submodularity of mutual information. We also show how submodularity can be used to obtain online bounds, and design branch and bound search procedures. We then extend our algorithm to exploit lazy evaluations and local structure in the GP, yielding significant speedups. We also extend our approach to find placements which are robust against node failures and uncertainties in the model. These extensions are again associated with rigorous theoretical approximation guarantees, exploiting the submodularity of the objective function. We demonstrate the advantages of our approach towards optimizing mutual information in a very extensive empirical study on two real-world data sets.
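The greedy mutual-information selection described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the squared-exponential covariance, the noise level, and the function names (`rbf_covariance`, `conditional_variance`, `greedy_mi_placement`) are assumptions chosen for the example. At each step it adds the location whose variance given the already chosen sensors is largest relative to its variance given all remaining unchosen locations. The lazy evaluations and local-structure speedups mentioned in the abstract are not included.

```python
import numpy as np


def rbf_covariance(points, length_scale=1.0, noise=1e-2):
    """Assumed covariance model: squared-exponential kernel plus noise on the diagonal."""
    d = points[:, None] - points[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2) + noise * np.eye(len(points))


def conditional_variance(K, y, S):
    """Variance of location y after conditioning on the locations in index list S."""
    if len(S) == 0:
        return K[y, y]
    K_SS = K[np.ix_(S, S)]
    K_yS = K[y, S]
    return K[y, y] - K_yS @ np.linalg.solve(K_SS, K_yS)


def greedy_mi_placement(K, k):
    """Greedily pick k indices, each maximizing the mutual-information gain ratio."""
    n = K.shape[0]
    selected = []
    for _ in range(k):
        best, best_ratio = None, -np.inf
        for y in range(n):
            if y in selected:
                continue
            # Unselected locations other than the candidate itself.
            rest = [v for v in range(n) if v != y and v not in selected]
            ratio = conditional_variance(K, y, selected) / conditional_variance(K, y, rest)
            if ratio > best_ratio:
                best, best_ratio = y, ratio
        selected.append(best)
    return selected


if __name__ == "__main__":
    grid = np.linspace(0.0, 10.0, 15)  # hypothetical 1-D candidate locations
    print(greedy_mi_placement(rbf_covariance(grid), k=4))
```

Each candidate evaluation solves two linear systems, so this naive version is only practical for small candidate sets.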
Near-optimal nonmyopic value of information in graphical models
- In Annual Conference on Uncertainty in Artificial Intelligence
Cited by 50 (13 self)
A fundamental issue in real-world systems, such as sensor networks, is the selection of observations which most effectively reduce uncertainty. More specifically, we address the long-standing problem of nonmyopically selecting the most informative subset of variables in a graphical model. We present the first efficient randomized algorithm providing a constant factor (1 − 1/e − ε) approximation guarantee for any ε > 0 with high confidence. The algorithm leverages the theory of submodular functions, in combination with a polynomial bound on sample complexity. We furthermore prove that no polynomial time algorithm can provide a constant factor approximation better than (1 − 1/e) unless P = NP. Finally, we provide extensive evidence of the effectiveness of our method on two complex real-world datasets.
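For context, the (1 − 1/e) factor is the classical guarantee for greedy maximization of a submodular function. The sketch below assumes F is monotone, submodular, and normalized (F(∅) = 0), and that marginal gains are evaluated exactly. The symbols A_i (greedy set after i steps), A* (optimal set of size k), and δ_i are notation introduced here, not taken from the abstract. The additional ε in the abstract relates to the sampling-based evaluation and is not derived here.

```latex
\begin{align*}
F(A^*) &\le F(A_i \cup A^*)
   && \text{(monotonicity)} \\
 &\le F(A_i) + \sum_{a \in A^*} \bigl[ F(A_i \cup \{a\}) - F(A_i) \bigr]
   && \text{(submodularity)} \\
 &\le F(A_i) + k \bigl[ F(A_{i+1}) - F(A_i) \bigr]
   && \text{(greedy choice, } |A^*| = k\text{)} .
\end{align*}
Writing $\delta_i = F(A^*) - F(A_i)$, this gives
$\delta_{i+1} \le \left(1 - \tfrac{1}{k}\right) \delta_i$, hence
\[
\delta_k \le \left(1 - \tfrac{1}{k}\right)^{k} F(A^*) \le \tfrac{1}{e}\, F(A^*),
\qquad\text{so}\qquad
F(A_k) \ge \left(1 - \tfrac{1}{e}\right) F(A^*).
\]
```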
Active Learning For Identifying Function Threshold Boundaries
Cited by 6 (4 self)
We present an efficient algorithm to actively select queries for learning the boundaries separating a function domain into regions where the function is above and below a given threshold. We develop experiment selection methods based on entropy, misclassification rates, variance, and their combinations, and show how they perform on a number of data sets. We then show how these algorithms are used to determine simultaneously valid 1 − α confidence intervals for seven cosmological parameters. Experimentation shows that the algorithm reduces the computation necessary for the parameter estimation problem by an order of magnitude.
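The selection criteria named in this abstract can be illustrated with a small scoring sketch. It assumes a model that supplies a Gaussian predictive mean and standard deviation at each candidate query; the function names and the example numbers are hypothetical. The "straddle" score (1.96 times the standard deviation minus the distance to the threshold) is included as one possible combination of variance and closeness to the boundary; the abstract does not say which combinations the authors use.

```python
import math

import numpy as np


def normal_cdf(z):
    """Standard normal CDF, applied elementwise."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))


def selection_scores(mean, std, threshold):
    """Score candidate queries under several criteria (higher = more worth querying)."""
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    p_above = normal_cdf((mean - threshold) / std)
    # Probability that the most likely above/below label is wrong.
    p_mis = np.minimum(p_above, 1.0 - p_above)
    eps = 1e-12  # guards log(0)
    entropy = -(p_above * np.log(p_above + eps)
                + (1.0 - p_above) * np.log(1.0 - p_above + eps))
    return {
        "variance": std ** 2,
        "misclassification": p_mis,
        "entropy": entropy,
        # Assumed combination: uncertain AND close to the threshold.
        "straddle": 1.96 * std - np.abs(mean - threshold),
    }


if __name__ == "__main__":
    # Hypothetical predictions at three candidates, threshold t = 1.0.
    scores = selection_scores([0.0, 1.1, 5.0], [0.1, 0.3, 1.0], threshold=1.0)
    for name, values in scores.items():
        print(name, "-> candidate", int(np.argmax(values)))
```

In this example pure variance favors the third candidate, which is uncertain but far from the threshold, while the other three criteria favor the second candidate, which lies near the boundary.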
Adaptive design of supercomputer experiments
- 2006
Cited by 1 (1 self)
Computer experiments are often performed to allow modeling of a response surface of a physical experiment that can be too costly or difficult to run except using a simulator. Running the experiment over a dense grid can be prohibitively expensive, yet running over a sparse design chosen in advance can result in obtaining insufficient information in parts of the space, particularly when the surface is nonstationary. We propose an approach which automatically explores the space while simultaneously fitting the response surface, using predictive uncertainty to guide subsequent experimental runs. The newly developed Bayesian treed Gaussian process is used as the surrogate model, and a fully Bayesian approach allows explicit nonstationary measures of uncertainty. Our adaptive sequential design framework has been developed to cope with an asynchronous, random, agent-based supercomputing environment. We take a hybrid approach which melds optimal strategies from the statistics literature with flexible strategies from the active learning literature. The merits of this approach are borne out in several examples, including the motivating example of a computational fluid dynamics simulation of a rocket booster.
Key words: nonstationary spatial model, treed partitioning, sequential design, active learning
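The idea of letting predictive uncertainty guide the next run can be sketched with a much simpler surrogate than the one in the abstract. The example below assumes a plain stationary Gaussian process with a fixed squared-exponential kernel, not the Bayesian treed Gaussian process the authors use, and all names and parameter values are hypothetical. With fixed hyperparameters the predictive variance depends only on where runs were placed, so the simulator responses do not appear here; a faithful version would refit the surrogate to the responses after every run, which is what allows it to adapt to nonstationarity.

```python
import numpy as np


def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel with unit prior variance (assumed surrogate)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def predictive_variance(x_train, x_cand, noise=1e-6):
    """GP posterior variance at each candidate, given the design points so far."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf(x_train, x_cand)
    return 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)


def sequential_design(x_init, x_cand, n_new):
    """Repeatedly add the candidate where the surrogate is most uncertain."""
    x_train = np.array(x_init, dtype=float)
    for _ in range(n_new):
        var = predictive_variance(x_train, x_cand)
        x_train = np.append(x_train, x_cand[np.argmax(var)])
    return x_train


if __name__ == "__main__":
    candidates = np.linspace(0.0, 1.0, 11)  # hypothetical 1-D input grid
    print(sequential_design([0.5], candidates, n_new=4))
```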
Actively Learning Level-Sets of Composite Functions
Scientists frequently have multiple types of experiments and data sets on which they can test the validity of their parameterized models and locate plausible regions for the model parameters. By examining multiple data sets, scientists can obtain inferences which typically are much more informative than the deductions derived from each of the data sources independently. Several standard data combination techniques result in target functions which are a weighted sum of the observed data sources. Thus, computing constraints on the plausible regions of the model parameter space can be formulated as finding a level set of a target function which is the sum of observable functions. We propose an active learning algorithm for this problem which selects both a sample (from the parameter space) and an observable function upon which to compute the next sample. Empirical tests on synthetic functions and on real data for an eight-parameter cosmological model show that our algorithm significantly reduces the number of samples required to identify the desired level-set.
Actively Learning Level-Sets of Composite Functions
- CMU-ML-07-121, 2007
Scientists frequently have multiple types of experiments and data sets on which they can test the validity of their models and the plausible or optimal regions for the model parameters. Identifying these parameter regions reduces to finding a level set on a function defined as a composite of the evaluations of each experiment or data set for a parameter setting. An active learning algorithm for this problem must at each iteration select a parameter setting to be tested and decide which experiment type to use for the test. We propose an active learning algorithm for identifying level sets of composite functions. Empirical tests on synthetic functions and on real data for a 7D cosmological model show it significantly reduces the number of samples required to identify desired regions.

