Results 1–10 of 19
Robust submodular observation selection
, 2008
Cited by 46 (4 self)
In many applications, one has to actively select among a set of expensive observations before making an informed decision. For example, in environmental monitoring, we want to select locations to measure in order to most effectively predict spatial phenomena. Often, we want to select observations which are robust against a number of possible objective functions. Examples include minimizing the maximum posterior variance in Gaussian Process regression, robust experimental design, and sensor placement for outbreak detection. In this paper, we present the Submodular Saturation algorithm, a simple and efficient algorithm with strong theoretical approximation guarantees for cases where the possible objective functions exhibit submodularity, an intuitive diminishing-returns property. Moreover, we prove that better approximation algorithms do not exist unless NP-complete problems admit efficient algorithms. We show how our algorithm can be extended to handle complex cost functions (incorporating non-unit observation cost or communication and path costs). We also show how the algorithm can be used to near-optimally trade off expected-case (e.g., the Mean Square Prediction Error in Gaussian Process regression) and worst-case (e.g., maximum predictive variance) performance. We show that many important machine learning problems fit our robust submodular observation selection formalism, and provide extensive empirical evaluation on several real-world problems. For Gaussian Process regression, our algorithm compares favorably with state-of-the-art heuristics described in the geostatistics literature, while being simpler, faster and providing theoretical guarantees. For robust experimental design, our algorithm performs favorably compared to SDP-based algorithms.
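The core robust-selection idea — maximize the worst-case objective min_i F_i(A) over sets of size at most k by greedily covering the truncated average (1/m) Σ_i min(F_i(A), c) for candidate targets c — can be sketched as follows. This is a deliberately simplified toy version (integer targets only, no budget inflation, made-up coverage objectives S1 and S2), not the authors' implementation:

```python
def truncated_avg(selected, targets, c):
    """Average of min(F_i(A), c) for coverage objectives F_i(A) = |A ∩ S_i|."""
    return sum(min(len(selected & s), c) for s in targets) / len(targets)

def saturate(ground, targets, k):
    """Return a set of size <= k and the largest target c it saturates."""
    for c in range(k, 0, -1):                     # try targets from high to low
        chosen = set()
        while len(chosen) < k and truncated_avg(chosen, targets, c) < c:
            # greedy step: add the element with the largest marginal gain
            gain = lambda e: (truncated_avg(chosen | {e}, targets, c)
                              - truncated_avg(chosen, targets, c))
            chosen.add(max(ground - chosen, key=gain))
        if truncated_avg(chosen, targets, c) >= c:  # every F_i reached c
            return chosen, c
    return set(), 0

ground = {1, 2, 3, 4, 5, 6}
S1, S2 = {1, 2, 3}, {4, 5, 6}                     # two "regions" to monitor
picked, c = saturate(ground, [S1, S2], k=2)
print(picked, c)   # e.g. one element from each region, guaranteed min coverage 1
```

Because the truncated average equals c only when every F_i has reached c, the greedy run certifies the worst-case value; the real algorithm binary-searches c and allows a slightly inflated budget for its guarantee.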
Support points of locally optimal designs for nonlinear models with two parameters.
 Ann. Statist.
, 2009
Cited by 14 (3 self)
We propose a new approach for identifying the support points of a locally optimal design when the model is nonlinear. In contrast to the commonly used geometric approach, we use an approach based on algebraic tools. Considerations are restricted to models with two parameters, and the general results are applied to often-used special cases, including the logistic, probit, double exponential and double reciprocal models. (Research sponsored by NSF grants DMS-0304661 and DMS-0707013 (Yang) and DMS-0706917 (Stufken).)
Sequential experimental designs for generalized linear models
 Journal of the American Statistical Association
, 2008
Cited by 11 (0 self)
We consider the problem of experimental design when the response is modeled by a generalized linear model (GLM) and the experimental plan can be determined sequentially. Most previous research on this problem has been limited either to one-factor, binary response experiments or to augmenting the design when there are already sufficient data to compute parameter estimates. We suggest a new procedure for the sequential choice of observations that offers five important advantages: (1) It can be applied to multi-factor experiments and is not limited to the one-factor setting; (2) it can be used with any GLM, not just binary responses; (3) both fully sequential and group sequential settings are treated; (4) it enables efficient design from the outset of the experiment; and (5) the experimenter is not constrained to specify a single model and can use the prior to reflect uncertainty as to the link function and the form of the linear predictor. Our procedure is based on a D-optimality criterion and on a Bayesian analysis that exploits a discretization of the parameter space to efficiently represent the posterior distribution. In the one-factor setting, a simulation study shows that our method is superior in efficiency to commonly used procedures, such as the "Bruceton" test, Neyer's procedure, or Wu's improved Robbins-Monro method. We also present a comparison of results obtained with the new algorithm versus the "Bruceton" method on an actual sensitivity test conducted recently at an industrial plant. Source code for the algorithms and examples throughout the article is available at http://www.math.tau.ac.il/~dms/GLM_Design.
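The discretized-posterior idea can be sketched for a one-factor logistic model: keep a weight for every point of a parameter grid, and pick the next design point to maximize the posterior-expected log-determinant of the cumulative Fisher information. The grid, the candidate points, and every name below are illustrative assumptions, not the authors' code:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def info(x, a, b):
    """Entries (m11, m12, m22) of the 2x2 logistic Fisher information at x."""
    p = sigmoid(a + b * x)
    w = p * (1.0 - p)
    return w, w * x, w * x * x

def logdet(m11, m12, m22, ridge=1e-6):
    """log-det of the (ridge-regularised) symmetric 2x2 matrix."""
    return math.log(max((m11 + ridge) * (m22 + ridge) - m12 * m12, 1e-300))

grid = [(a / 2.0, b / 2.0) for a in range(-2, 3) for b in range(1, 5)]
post = {th: 1.0 / len(grid) for th in grid}     # discretized posterior weights
candidates = [x / 2.0 for x in range(-6, 7)]    # candidate design points
past = []                                       # design points chosen so far

def bayes_d_score(x):
    """Posterior-expected log-det of the information after adding point x."""
    total = 0.0
    for (a, b), w in post.items():
        m11 = m12 = m22 = 0.0
        for xp in past + [x]:
            i11, i12, i22 = info(xp, a, b)
            m11 += i11; m12 += i12; m22 += i22
        total += w * logdet(m11, m12, m22)
    return total

def next_point():
    return max(candidates, key=bayes_d_score)

def update(x, y):
    """Record a binary response y at x and reweight the posterior grid."""
    past.append(x)
    for (a, b) in post:
        p = sigmoid(a + b * x)
        post[(a, b)] *= p if y == 1 else (1.0 - p)
    z = sum(post.values())
    for th in post:
        post[th] /= z

x1 = next_point()
update(x1, 1)        # pretend we observed a positive response at x1
```

Because the posterior lives on a finite grid, the Bayesian update is a simple reweighting, which is what makes design from the very first observation cheap.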
From Semi-Supervised to Transfer Counting of Crowds
Cited by 3 (0 self)
Regression-based techniques have shown promising results for people counting in crowded scenes. However, most existing techniques require expensive and laborious data annotation for model training. In this study, we propose to address this problem from three perspectives: (1) Instead of exhaustively annotating every single frame, the most informative frames are selected for annotation automatically and actively. (2) Rather than learning from only labelled data, the abundant unlabelled data are exploited. (3) Labelled data from other scenes are employed to further alleviate the burden for data annotation. All three ideas are implemented in a unified active and semi-supervised regression framework with the ability to perform transfer learning, by exploiting the underlying geometric structure of crowd patterns via manifold analysis. Extensive experiments validate the effectiveness of our approach.
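One simple way to "actively select the most informative frames" is a diversity rule: pick frames that are maximally spread out in feature space via farthest-first traversal. The paper's actual criterion is manifold-based; this greedy stand-in and its 1-D toy features are illustrative assumptions only:

```python
def farthest_first(features, k, start=0):
    """Pick k frame indices, each maximizing its distance to the chosen set."""
    chosen = [start]
    while len(chosen) < k:
        def min_dist(i):
            # distance from candidate frame i to its nearest chosen frame
            return min(abs(features[i] - features[j]) for j in chosen)
        chosen.append(max((i for i in range(len(features)) if i not in chosen),
                          key=min_dist))
    return chosen

# toy per-frame feature (e.g. a rough crowd-density estimate per frame)
crowd_sizes = [0.0, 1.0, 2.0, 10.0, 11.0, 20.0]
picked = farthest_first(crowd_sizes, k=3)
print(picked)   # indices of three well-spread frames for annotation
```

The selected frames cover the small, medium and large crowd regimes, which is the intuition behind annotating informative rather than consecutive frames.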
Optimal designs for generalized linear models with multiple design variables
 Statist. Sinica
, 2011
Cited by 2 (0 self)
Binary response experiments are common in scientific studies. However, the study of optimal designs in this area is still underdeveloped. Sitter and Torsney (1995a) studied optimal designs for binary response experiments with two design variables. In this paper, we consider a general situation with multiple design variables. A novel approach is proposed to identify optimal designs for the commonly used multi-factor logistic and probit models. We give explicit formulas for a large class of optimal designs, including D-, A-, and E-optimal designs. In addition, we identify the general structure of optimal designs, which has a relatively simple format. This property makes it feasible to solve seemingly intractable problems. This result can also be applied in a multi-stage approach.
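A numeric check of the classical one-factor special case helps make "explicit formulas" concrete: for the logistic model p(x) = sigmoid(c) at the logit scale, the locally D-optimal design puts equal weight on two symmetric points whose logit c solves c(2p - 1) = 1, i.e. c ≈ 1.5434. The grid search below merely confirms that known value; it is not the paper's multi-factor method:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def d_criterion(c):
    """det of the information of the symmetric two-point design {-c, +c}.

    With equal weights 1/2, M = p(1-p) * [[1, 0], [0, c^2]], so
    det M = (p(1-p))^2 * c^2.
    """
    p = sigmoid(c)
    w = p * (1.0 - p)
    return (w * w) * (c * c)

best_c = max((c / 1000.0 for c in range(1, 3001)), key=d_criterion)
print(round(best_c, 3))   # should land near the theoretical value 1.5434
```

Setting the derivative of log d_criterion to zero gives 2(1 - 2p) + 2/c = 0, which is exactly the stationarity condition c(2p - 1) = 1 quoted above.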
Optimal designs for additional day effects in generalized linear models with gamma distributed response. Discussion Paper. http://www.statistik
, 2013
Optimal designs for twoparameter nonlinear models with application to survival models
, 2011
Cited by 1 (0 self)
Censoring occurs in many industrial or biomedical 'time to event' experiments. Finding efficient designs for such experiments can be problematic since the statistical models involved will usually be nonlinear, making the optimal choice of design dependent on the unknown parameters. We provide analytical characterisations of locally D- and c-optimal designs for a class of models, thus reducing the numerical effort for design search substantially. We also investigate standardised maximin D- and c-optimal designs. We illustrate our results using the natural proportional hazards parameterisation of the exponential regression model. Different censoring mechanisms are incorporated and the robustness of designs against parameter misspecification is assessed.
Application of General Linear Model to the Reduction of Defectives in Packaging Process of Soap Industry
Cited by 1 (0 self)
The objective of this study is to improve the efficiency of the flow-wrap packaging process in the soap industry through the reduction of defectives. At the 95% confidence level, the regression analysis finds the sealing temperature and the temperatures of the upper and lower crimpers to be the significant factors for the flow-wrap process with respect to the number/percentage of defectives. Twenty-seven experiments have been designed and performed according to three levels of each controllable factor. With the general linear model (GLM), the suggested values for the sealing temperature and the temperatures of the upper and lower crimpers are 185, 85 and 85 °C, respectively. Under the suggested process condition, the percentage of defectives is reduced from 12.47% to 5.51%, and at the 5% significance level the percentage of defectives is between 5.05% and 5.98%. Index Terms—Experimental design, General linear model, Regression analysis, Reduction of defectives
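The twenty-seven runs follow directly from a full 3^3 factorial: every combination of three levels for the three temperature factors. Only the suggested settings (185, 85, 85 °C) appear in the abstract; the other level values below are hypothetical placeholders for illustration:

```python
from itertools import product

# Three controllable factors, three levels each (values other than the
# suggested 185/85/85 °C settings are made up for this sketch).
levels = {
    "sealing_temp":  [175, 185, 195],
    "upper_crimper": [75, 85, 95],
    "lower_crimper": [75, 85, 95],
}

runs = list(product(*levels.values()))   # the full 3 x 3 x 3 factorial
print(len(runs))                          # 27 experimental runs
```

Each tuple in `runs` is one experiment; the GLM is then fitted to the defect percentages observed across all 27 conditions.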
ROBUST DESIGNS FOR POISSON REGRESSION MODELS
, 2012
We consider the problem of how to construct robust designs for Poisson regression models. An analytical expression is derived for robust designs for first-order Poisson regression models where uncertainty exists in the prior parameter estimates. Given certain constraints in the methodology, it may be necessary to extend the robust designs for implementation in practical experiments. With these extensions, our methodology constructs designs which perform similarly, in terms of estimation, to current techniques, and offers the solution in a more timely manner. We further apply this analytic result to cases where uncertainty exists in the linear predictor. The application of this methodology to practical design problems such as screening experiments is explored. Given the minimal prior knowledge that is usually available when conducting such experiments, it is recommended to derive designs robust across a variety of systems. However, incorporating such uncertainty into the design process can be a computationally intense exercise. Hence, our analytic approach is explored as an alternative.
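A hedged illustration of the analytic structure behind first-order Poisson regression designs: for mean exp(a + b*x) on a design region x <= x_max, the locally D-optimal equal-weight two-point design is known to sit at x_max and x_max - 2/b. The grid search below confirms this numerically for a toy setting; the paper's robust designs go further by averaging such criteria over prior uncertainty in (a, b):

```python
import math

def log_det(x1, x2, a=0.0, b=1.0):
    """log-det of the information of the equal-weight design {x1, x2}.

    For Poisson regression with mean exp(a + b*x),
    det M = (1/4) * exp(a + b*x1) * exp(a + b*x2) * (x1 - x2)^2,
    so up to a constant the log-det is as below.
    """
    return (a + b * x1) + (a + b * x2) + 2.0 * math.log(abs(x1 - x2) + 1e-12)

x_max = 0.0
candidates = [x_max - i / 100.0 for i in range(1, 501)]   # lower support point
best = max(candidates, key=lambda x1: log_det(x1, x_max))
print(best)   # -2.0, i.e. x_max - 2/b with b = 1
```

Setting the derivative b - 2/(x_max - x1) to zero recovers the analytic support point x1 = x_max - 2/b, which is why closed-form robust designs are feasible here at all.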
SPARSITY MODELING FOR HIGH DIMENSIONAL SYSTEMS: APPLICATIONS IN GENOMICS AND STRUCTURAL BIOLOGY
The availability of very high dimensional data has brought sparsity modeling to the forefront of statistical research in recent years. From complex physical models with hundreds of parameters to DNA microarrays which offer observations in tens to hundreds of thousands of dimensions, separating relevant and irrelevant parameters is becoming more and more important. This dissertation will focus on innovations in the area of variable and model selection as they pertain to these high dimensional systems. Chapter 1 will discuss work from the literature on the areas of variable and model selection. Chapter 2 will describe an innovation to hierarchical variable selection modeling that corrects errors that stem from assuming incorrectly that multiple thousands of observations are informing about the same distribution. In Chapter 3, we introduce a novel technique for applying variable selection priors to induce sparsity in variance modeling.