Results 11-20 of 67
Optimization Using Surrogate Objectives on a Helicopter Test Example
 RICE UNIVERSITY
, 1998
Cited by 21 (7 self)
This paper presents results for a 31-variable helicopter rotor design example. Results are given for several numerical methods. This is a brief description of a portion of the Boeing/IBM/Rice University collaboration whose purpose is to develop effective numerical methods for managing the use of approximation concepts or response surface methodology in design optimization.
Bayesian Semiparametric Inference for the Accelerated Failure Time Model
, 1997
Cited by 14 (0 self)
Bayesian semiparametric inference is considered for a log-linear model. This model consists of a parametric component for the regression coefficients and a nonparametric component for the unknown error distribution. Bayesian analysis is studied for the case of a parametric prior on the regression coefficients and a mixture-of-Dirichlet-processes prior on the unknown error distribution. A Markov chain Monte Carlo (MCMC) method is developed to compute the features of the posterior distribution. A model selection method for obtaining a more parsimonious set of predictors is studied. The method adds indicator variables to the regression equation. The set of indicator variables represents all the possible subsets to be considered. An MCMC method is developed to search stochastically for the best subset. These procedures are applied to two examples, one with censored data. Key words and phrases: Censored data; Log-linear model; Markov chain Monte Carlo algorithm; Metropolis algori...
Predictive Approaches For Choosing Hyperparameters in Gaussian Processes
 Neural Computation
, 1999
Cited by 12 (1 self)
Gaussian Processes are powerful regression models specified by parametrized mean and covariance functions. Standard approaches to estimate these parameters (known by the name Hyperparameters) are Maximum Likelihood (ML) and Maximum A-Posterior (MAP) approaches. In this paper, we propose and investigate predictive approaches, namely, maximization of Geisser's Surrogate Predictive Probability (GPP) and minimization of mean square error with respect to GPP (referred to as Geisser's Predictive mean square Error (GPE)) to estimate the hyperparameters. We also derive results for the standard Cross-Validation (CV) error and make a comparison. These approaches are tested on a number of problems and experimental results show that these approaches are strongly competitive with existing approaches.
Model Selection by Normalized Maximum Likelihood
, 2005
Cited by 12 (3 self)
The Minimum Description Length (MDL) principle is an information-theoretic approach to inductive inference that originated in algorithmic coding theory. In this approach, data are viewed as codes to be compressed by the model. From this perspective, models are compared on their ability to compress a data set by extracting useful information in the data apart from random noise. The goal of model selection is to identify the model, from a set of candidate models, that permits the shortest description length (code) of the data. Since Rissanen originally formalized the problem using the crude ‘two-part code’ MDL method in the 1970s, many significant strides have been made, especially in the 1990s, with the culmination of the development of the refined ‘universal code’ MDL method, dubbed Normalized Maximum Likelihood (NML). It represents an elegant solution to the model selection problem. The present paper provides a tutorial review on these latest developments with a special focus on NML. An application example of NML in cognitive modeling is also provided.
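As a rough illustration of the NML idea described in this abstract, the sketch below computes NML description lengths for the simplest possible case, a Bernoulli model on binary sequences, and compares them with a parameter-free fair-coin code. The Bernoulli setting and all function names are assumptions chosen for illustration; they are not taken from the paper.

```python
import math


def max_log_likelihood(k: int, n: int) -> float:
    """Bernoulli log-likelihood of k ones in n trials, evaluated at p = k/n."""
    log_lik = 0.0
    if k > 0:
        log_lik += k * math.log(k / n)
    if k < n:
        log_lik += (n - k) * math.log((n - k) / n)
    return log_lik


def log_complexity(n: int) -> float:
    """Log of the NML normalizer: the maximized likelihood summed over all
    2**n binary sequences, grouped by their number of ones."""
    total = sum(
        math.comb(n, j) * math.exp(max_log_likelihood(j, n)) for j in range(n + 1)
    )
    return math.log(total)


def nml_code_length(k: int, n: int) -> float:
    """NML description length (in nats) of a sequence with k ones out of n."""
    return -max_log_likelihood(k, n) + log_complexity(n)


def fair_coin_code_length(n: int) -> float:
    """Description length (in nats) under the parameter-free model p = 0.5."""
    return n * math.log(2)
```

Under these assumptions, a sequence of ten ones gets a shorter description from the Bernoulli NML code than from the fair-coin code, while a balanced sequence with five ones out of ten favors the fair coin, because the NML code always pays the complexity term.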
Fast Unsupervised Ego-Action Learning for First-Person Sports Videos
Cited by 12 (0 self)
Portable high-quality sports cameras (e.g. head- or helmet-mounted) built for recording dynamic first-person video footage are becoming a common item among many sports enthusiasts. We address the novel task of discovering first-person action categories (which we call ego-actions) which can be useful for such tasks as video indexing and retrieval. In order to learn ego-action categories, we investigate the use of motion-based histograms and unsupervised learning algorithms to quickly cluster video content. Our approach assumes a completely unsupervised scenario, where labeled training videos are not available, videos are not pre-segmented and the number of ego-action categories is unknown. In our proposed framework we show that a stacked Dirichlet process mixture model can be used to automatically learn a motion histogram codebook and the set of ego-action categories. We quantitatively evaluate our approach on both in-house and public YouTube videos and demonstrate robust ego-action categorization across several sports genres. Comparative analysis shows that our approach outperforms other state-of-the-art topic models with respect to both classification accuracy and computational speed. Preliminary results indicate that on average, the categorical content of a 10-minute video sequence can be indexed in under 5 seconds.
Multiple imputation for model checking: Completed-data plots with missing and latent data
 Biometrics
, 2005
Cited by 12 (3 self)
Summary. In problems with missing or latent data, a standard approach is to first impute the unobserved data, then perform all statistical analyses on the completed dataset (corresponding to the observed data and imputed unobserved data) using standard procedures for complete-data inference. Here, we extend this approach to model checking by demonstrating the advantages of the use of completed-data model diagnostics on imputed completed datasets. The approach is set in the theoretical framework of Bayesian posterior predictive checks (but, as with missing-data imputation, our methods of missing-data model checking can also be interpreted as “predictive inference” in a non-Bayesian context). We consider the graphical diagnostics within this framework. Advantages of the completed-data approach include: (1) One can often check model fit in terms of quantities that are of key substantive interest in a natural way, which is not always possible using observed data alone. (2) In problems with missing data, checks may be devised that do not require to model the missingness or inclusion mechanism; the latter is useful for the analysis of ignorable but unknown data collection mechanisms, such as are often assumed in the analysis of sample surveys and observational studies. (3) In many problems with latent data, it is possible to check qualitative features of the model (for example, independence of two variables) that can be naturally formalized with the help of the latent data. We illustrate with several applied examples.
Penalized loss functions for Bayesian model comparison
Cited by 10 (0 self)
The deviance information criterion (DIC) is widely used for Bayesian model comparison, despite the lack of a clear theoretical foundation. DIC is shown to be an approximation to a penalized loss function based on the deviance, with a penalty derived from a cross-validation argument. This approximation is valid only when the effective number of parameters in the model is much smaller than the number of independent observations. In disease mapping, a typical application of DIC, this assumption does not hold and DIC under-penalizes more complex models. Another deviance-based loss function, derived from the same decision-theoretic framework, is applied to mixture models, which have previously been considered an unsuitable application for DIC.
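For readers unfamiliar with DIC, the following minimal sketch computes the standard quantities (posterior mean deviance, effective number of parameters p_D, and DIC as their sum) from posterior draws. The toy model is an assumption made purely for illustration: a normal mean with known standard deviation and a flat prior, so that posterior draws can be simulated directly. It shows plain DIC only, not the alternative loss function proposed in the paper, and the data values are made up.

```python
import math
import random
import statistics


def deviance(mu, data, sigma=1.0):
    """Deviance, i.e. -2 * log-likelihood, of a Normal(mu, sigma^2) model."""
    n = len(data)
    sse = sum((y - mu) ** 2 for y in data)
    return n * math.log(2 * math.pi * sigma**2) + sse / sigma**2


def dic(posterior_samples, data, sigma=1.0):
    """Return (DIC, p_D) computed from posterior draws of mu."""
    mean_deviance = statistics.fmean(
        [deviance(mu, data, sigma) for mu in posterior_samples]
    )
    deviance_at_mean = deviance(statistics.fmean(posterior_samples), data, sigma)
    p_d = mean_deviance - deviance_at_mean  # effective number of parameters
    return mean_deviance + p_d, p_d


def simulate_posterior(data, sigma=1.0, draws=20000, seed=0):
    """Posterior draws of mu under a flat prior: Normal(mean(data), sigma^2 / n)."""
    rng = random.Random(seed)
    centre = statistics.fmean(data)
    scale = sigma / math.sqrt(len(data))
    return [rng.gauss(centre, scale) for _ in range(draws)]


# Hypothetical data, used only to exercise the functions above.
data = [0.3, -1.2, 0.8, 1.5, -0.4, 0.9, 0.1, -0.7]
dic_value, p_d = dic(simulate_posterior(data), data)
```

In this one-parameter toy model p_D comes out close to 1, which matches the reading of p_D as an effective parameter count.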
On Bayesian model assessment and choice using cross-validation predictive densities
, 2001
Cited by 7 (7 self)
We consider the problem of estimating the distribution of the expected utility of the Bayesian model (expected utility is also known as generalization error). We use the cross-validation predictive densities to compute the expected utilities. We demonstrate that in flexible nonlinear models having many parameters, the importance sampling approximated leave-one-out cross-validation (IS-LOO-CV) proposed in (Gelfand et al., 1992) may not work. We discuss how the reliability of the importance sampling can be evaluated and in case there is reason to suspect the reliability of the importance sampling, we suggest using predictive densities from the k-fold cross-validation (k-fold-CV). We also note that the k-fold-CV has to be used if data points have certain dependencies. As the k-fold-CV predictive densities are based on slightly smaller data sets than the full data set, we use a bias correction proposed in (Burman, 1989) when computing the expected utilities. In order to assess the reliability of the estimated expected utilities, we suggest a quick and generic approach based on the Bayesian bootstrap for obtaining samples from the distributions of the expected utilities. Our main goal is to estimate how good (in terms of application field) the predictive ability of the model is, but the distributions of the expected utilities can also be used for comparing different models. With the proposed method, it is easy to compute the probability that one method has better expected utility than some other method. If the predictive likelihood is used as a utility (instead ...
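The Bayesian bootstrap step mentioned in this abstract can be sketched roughly as follows. Each replicate draws Dirichlet(1, ..., 1) weights over the observations and computes a weighted mean of per-observation utilities; the fraction of replicates in which the mean utility difference is positive then estimates the probability that one model is better. This is a generic sketch of the Bayesian bootstrap, not the authors' exact procedure, and the function names are assumptions.

```python
import random


def bayesian_bootstrap_means(values, draws=4000, rng=None):
    """Draws from the Bayesian-bootstrap distribution of the mean of `values`.

    Dirichlet(1, ..., 1) weights are obtained by normalizing independent
    Exponential(1) variates. The default generator is seeded so that
    results are reproducible.
    """
    rng = rng or random.Random(0)
    n = len(values)
    means = []
    for _ in range(draws):
        gammas = [rng.expovariate(1.0) for _ in range(n)]
        total = sum(gammas)
        means.append(sum(g * v for g, v in zip(gammas, values)) / total)
    return means


def prob_better(util_a, util_b, draws=4000, rng=None):
    """Estimated probability that model A has higher expected utility than B.

    `util_a` and `util_b` hold per-observation utilities for the same data
    points, for example cross-validation log predictive densities.
    """
    diffs = [a - b for a, b in zip(util_a, util_b)]
    means = bayesian_bootstrap_means(diffs, draws, rng)
    return sum(m > 0 for m in means) / len(means)
```

Working on paired differences rather than on each model separately keeps the comparison matched observation by observation, which is a design choice of this sketch.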
Fast Bayesian Inference in Dirichlet Process Mixture Models
, 2008
Cited by 5 (0 self)
There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.
Predictive hidden Markov model selection for speech recognition
 IEEE Trans. Speech Audio Process
, 2005
Cited by 5 (0 self)
This paper surveys a series of model selection approaches and presents a novel predictive information criterion (PIC) for hidden Markov model (HMM) selection. The approximate Bayesian using Viterbi approach is applied for PIC selection of the best HMMs providing the largest prediction information for generalization of future data. When the perturbation of HMM parameters is expressed by a product of conjugate prior densities, the segmental prediction information is derived at the frame level without Laplacian integral approximation. In particular, a multivariate distribution is attained to characterize the prediction information corresponding to HMM mean vector and precision matrix. When performing model selection in tree structure HMMs, we develop a top-down prior/posterior propagation algorithm for estimation of structural hyperparameters. The prediction information is determined so as to choose the best HMM tree model. Different from maximum likelihood (ML) and minimum description length (MDL) selection criteria, the parameters of PIC-chosen HMMs are computed via maximum a posteriori estimation. In the evaluation of continuous speech recognition using decision tree HMMs, the PIC criterion outperforms ML and MDL criteria in building a compact tree structure with moderate tree size and higher recognition rate. Index Terms: Approximate Bayesian, decision tree state tying, model selection, multivariate distribution, predictive information criterion, prior/posterior propagation, speech recognition.