Results 1  10
of
280
Bayesian Experimental Design: A Review
 Statistical Science
, 1995
"... This paper reviews the literature on Bayesian experimental design, both for linear and nonlinear models. A unified view of the topic is presented by putting experimental design in a decision theoretic framework. This framework justifies many optimality criteria, and opens new possibilities. Various ..."
Abstract

Cited by 258 (1 self)
 Add to MetaCart
(Show Context)
This paper reviews the literature on Bayesian experimental design, both for linear and nonlinear models. A unified view of the topic is presented by putting experimental design in a decision theoretic framework. This framework justifies many optimality criteria, and opens new possibilities. Various design criteria become part of a single, coherent approach.
DETERMINANT MAXIMIZATION WITH LINEAR MATRIX INEQUALITY CONSTRAINTS
"... The problem of maximizing the determinant of a matrix subject to linear matrix inequalities arises in many fields, including computational geometry, statistics, system identification, experiment design, and information and communication theory. It can also be considered as a generalization of the s ..."
Abstract

Cited by 200 (18 self)
 Add to MetaCart
The problem of maximizing the determinant of a matrix subject to linear matrix inequalities arises in many fields, including computational geometry, statistics, system identification, experiment design, and information and communication theory. It can also be considered as a generalization of the semidefinite programming problem. We give an overview of the applications of the determinant maximization problem, pointing out simple cases where specialized algorithms or analytical solutions are known. We then describe an interiorpoint method, with a simplified analysis of the worstcase complexity and numerical results that indicate that the method is very efficient, both in theory and in practice. Compared to existing specialized algorithms (where they are available), the interiorpoint method will generally be slower; the advantage is that it handles a much wider variety of problems.
Covariate shift adaptation by importance weighted cross validation
, 2000
"... A common assumption in supervised learning is that the input points in the training set follow the same probability distribution as the input points that will be given in the future test phase. However, this assumption is not satisfied, for example, when the outside of the training region is extrapo ..."
Abstract

Cited by 108 (51 self)
 Add to MetaCart
(Show Context)
A common assumption in supervised learning is that the input points in the training set follow the same probability distribution as the input points that will be given in the future test phase. However, this assumption is not satisfied, for example, when the outside of the training region is extrapolated. The situation where the training input points and test input points follow different distributions while the conditional distribution of output values given input points is unchanged is called the covariate shift. Under the covariate shift, standard model selection techniques such as cross validation do not work as desired since its unbiasedness is no longer maintained. In this paper, we propose a new method called importance weighted cross validation (IWCV), for which we prove its unbiasedness even under the covariate shift. The IWCV procedure is the only one that can be applied for unbiased classification under covariate shift, whereas alternatives to IWCV exist for regression. The usefulness of our proposed method is illustrated by simulations, and furthermore demonstrated in the braincomputer interface, where strong nonstationarity effects can be seen between training and test sessions. c2000 Masashi Sugiyama, Matthias Krauledat, and KlausRobert Müller.
Optimizing linear counting queries under differential privacy
 In PODS ’10: Proceedings of the twentyninth ACM SIGMODSIGACTSIGART symposium on Principles of database systems of data
, 2010
"... Differential privacy is a robust privacy standard that has been successfully applied to a range of data analysis tasks. But despite much recent work, optimal strategies for answering a collection of related queries are not known. We propose the matrix mechanism, a new algorithm for answering a workl ..."
Abstract

Cited by 68 (7 self)
 Add to MetaCart
(Show Context)
Differential privacy is a robust privacy standard that has been successfully applied to a range of data analysis tasks. But despite much recent work, optimal strategies for answering a collection of related queries are not known. We propose the matrix mechanism, a new algorithm for answering a workload of predicate counting queries. Given a workload, the mechanism requests answers to a different set of queries, called a query strategy, which are answered using the standard Laplace mechanism. Noisy answers to the workload queries are then derived from the noisy answers to the strategy queries. This two stage process can result in a more complex correlated noise distribution that preserves differential privacy but increases accuracy. We provide a formal analysis of the error of query answers produced by the mechanism and investigate the problem of computing the optimal query strategy in support of a given workload. We show this problem can be formulated as a rankconstrained semidefinite program. Finally, we analyze two seemingly distinct techniques, whose similar behavior is explained by viewing them as instances of the matrix mechanism.
Sensor selection via convex optimization
 IEEE Transactions on Signal Processing
, 2009
"... Abstract—We consider the problem of choosing a set of sensor measurements, from a set of possible or potential sensor measurements, that minimizes the error in estimating some parameters. Solving this problem by evaluating the performance for each of the possible choices of sensor measurements is no ..."
Abstract

Cited by 65 (2 self)
 Add to MetaCart
(Show Context)
Abstract—We consider the problem of choosing a set of sensor measurements, from a set of possible or potential sensor measurements, that minimizes the error in estimating some parameters. Solving this problem by evaluating the performance for each of the possible choices of sensor measurements is not practical unless and are small. In this paper, we describe a heuristic, based on convex optimization, for approximately solving this problem. Our heuristic gives a subset selection as well as a bound on the best performance that can be achieved by any selection of sensor measurements. There is no guarantee that the gap between the performance of the chosen subset and the performance bound is always small; but numerical experiments suggest that the gap is small in many cases. Our heuristic method requires on the order of operations; for 1000 possible sensors, we can carry out sensor selection in a few seconds on a 2GHz personal computer. Index Terms—Convex optimization, experiment design, sensor selection. I.
and response: experiments in sampling the environment
 in Proceedings of the 2nd international
"... Monitoring of environmental phenomena with embedded networked sensing confronts the challenges of both unpredictable variability in the spatial distribution of phenomena, coupled with demands for a high spatial sampling rate in three dimensions. For example, low distortion mapping of critical solar ..."
Abstract

Cited by 61 (12 self)
 Add to MetaCart
(Show Context)
Monitoring of environmental phenomena with embedded networked sensing confronts the challenges of both unpredictable variability in the spatial distribution of phenomena, coupled with demands for a high spatial sampling rate in three dimensions. For example, low distortion mapping of critical solar radiation properties in forest environments may require twodimensional spatial sampling rates of greater than 10 samples/m 2 over transects exceeding 1000 m 2. Clearly, adequate sampling coverage of such a transect requires an impractically large number of sensing nodes. This paper describes a new approach where the deployment of a combination of autonomousarticulated and static sensor nodes enables sufficient spatiotemporal sampling density over large transects to meet a general set of environmental mapping
Strong duality for semidefinite programming
 SIAM J. Optim
, 1997
"... Abstract. It is well known that the duality theory for linear programming (LP) is powerful and elegant and lies behind algorithms such as simplex and interiorpoint methods. However, the standard Lagrangian for nonlinear programs requires constraint qualifications to avoid duality gaps. Semidefinite ..."
Abstract

Cited by 60 (19 self)
 Add to MetaCart
Abstract. It is well known that the duality theory for linear programming (LP) is powerful and elegant and lies behind algorithms such as simplex and interiorpoint methods. However, the standard Lagrangian for nonlinear programs requires constraint qualifications to avoid duality gaps. Semidefinite linear programming (SDP) is a generalization of LP where the nonnegativity constraints are replaced by a semidefiniteness constraint on the matrix variables. There are many applications, e.g., in systems and control theory and combinatorial optimization. However, the Lagrangian dual for SDP can have a duality gap. We discuss the relationships among various duals and give a unified treatment for strong duality in semidefinite programming. These duals guarantee strong duality, i.e., a zero duality gap and dual attainment. This paper is motivated by the recent paper by Ramana where one of these duals is introduced.
InputDependent Estimation of Generalization Error under Covariate Shift
 STATISTICS & DECISIONS, VOL.23, NO.4, PP.249–279, 2005
, 2005
"... A common assumption in supervised learning is that the training and test input points follow the same probability distribution. However, this assumption is not fulfilled, e.g., in interpolation, extrapolation, active learning, or classification with imbalanced data. The violation of this assumption— ..."
Abstract

Cited by 56 (30 self)
 Add to MetaCart
A common assumption in supervised learning is that the training and test input points follow the same probability distribution. However, this assumption is not fulfilled, e.g., in interpolation, extrapolation, active learning, or classification with imbalanced data. The violation of this assumption—known as the covariate shift— causes a heavy bias in standard generalization error estimation schemes such as crossvalidation or Akaike’s information criterion, and thus they result in poor model selection. In this paper, we propose an alternative estimator of the generalization error for the squared loss function when training and test distributions are different. The proposed generalization error estimator is shown to be exactly unbiased for finite samples if the learning target function is realizable and asymptotically unbiased in general. We also show that, in addition to the unbiasedness, the proposed generalization error estimator can accurately estimate the difference of the generalization error among different models, which is a desirable property in model selection. Numerical studies show that the proposed method compares favorably with existing model selection methods in regression for extrapolation and in classification with imbalanced data.
Active Learning in Multilayer Perceptrons
, 1996
"... We propose an active learning method with hiddenunit reduction, which is devised specially for multilayer perceptrons (MLP). First, we review our active learning method, and point out that many Fisherinformationbased methods applied to MLP have a critical problem: the information matrix may be si ..."
Abstract

Cited by 52 (1 self)
 Add to MetaCart
We propose an active learning method with hiddenunit reduction, which is devised specially for multilayer perceptrons (MLP). First, we review our active learning method, and point out that many Fisherinformationbased methods applied to MLP have a critical problem: the information matrix may be singular. To solve this problem, we derive the singularity condition of an information matrix, and propose an active learning technique that is applicable to MLP. Its effectiveness is verified through experiments. 1 INTRODUCTION When one trains a learning machine using a set of data given by the true system, its ability can be improved if one selects the training data actively. In this paper, we consider the problem of active learning in multilayer perceptrons (MLP). First, we review our method of active learning (Fukumizu el al., 1994), in which we prepare a probability distribution and obtain training data as samples from the distribution. This methodology leads us to an informationmatrix...