Results 11–20 of 120
Generalized Model Selection For Unsupervised Learning In High Dimensions
 Proceedings of Neural Information Processing Systems
, 1999
"... In this paper we describe an approach to model selection in unsupervised learning. This approach determines both the feature set and the number of clusters. To this end we first derive an objective function that explicitly incorporates this generalization. We then evaluate two schemes for model sele ..."
Abstract

Cited by 16 (2 self)
In this paper we describe an approach to model selection in unsupervised learning. This approach determines both the feature set and the number of clusters. To this end we first derive an objective function that explicitly incorporates this generalization. We then evaluate two schemes for model selection: one using this objective function (a Bayesian estimation scheme that selects the best model structure using the marginal or integrated likelihood) and the second based on a technique using a cross-validated likelihood criterion. In the first scheme, for a particular application in document clustering, we derive a closed-form solution of the integrated likelihood by assuming an appropriate form of the likelihood function and prior. Extensive experiments are carried out to ascertain the validity of both approaches and all results are verified by comparison against ground truth. In our experiments the Bayesian scheme using our objective function gave better results than cross-validation.
Segmented regression estimators for massive data sets
 In Second SIAM International Conference on Data Mining
, 2002
"... We describe two methodologies for obtaining segmented regression estimators from massive training data sets. The first methodology, called Linear Regression Tree (LRT), is used for continuous response variables, and the second and complementary methodology, called Naive Bayes Tree (NBT), is used for ..."
Abstract

Cited by 12 (6 self)
We describe two methodologies for obtaining segmented regression estimators from massive training data sets. The first methodology, called Linear Regression Tree (LRT), is used for continuous response variables, and the second and complementary methodology, called Naive Bayes Tree (NBT), is used for categorical response variables. These are implemented in the IBM ProbE™ (Probabilistic Estimation) data mining engine, which is an object-oriented framework for building classes of segmented predictive models from massive training data sets. Based on this methodology, an application called ATMSE™ for direct-mail targeted marketing has been developed jointly with Fingerhut Business Intelligence [1].
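A minimal sketch of the segmented-regression idea behind a linear regression tree (the names `linfit` and `best_split` are illustrative, not ProbE's API): fit separate lines on each side of each candidate split and keep the split that minimizes the total squared error:

```python
def linfit(xs, ys):
    """Least-squares line fit; returns (slope, intercept, sse)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
         if sxx else 0.0)
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return b, a, sse

def best_split(xs, ys, min_leaf=3):
    """One level of a linear regression tree: the split on x that
    minimizes the summed SSE of separate line fits on each side."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    xs = [xs[i] for i in order]
    ys = [ys[i] for i in order]
    best_sse, best_x = linfit(xs, ys)[2], None   # no-split baseline
    for s in range(min_leaf, len(xs) - min_leaf + 1):
        sse = linfit(xs[:s], ys[:s])[2] + linfit(xs[s:], ys[s:])[2]
        if sse < best_sse:
            best_sse, best_x = sse, (xs[s - 1] + xs[s]) / 2
    return best_sse, best_x

# Piecewise-linear toy data with a kink at x = 5:
xs = [0.5 * i for i in range(20)]
ys = [x if x < 5 else 12 - x for x in xs]
sse, split = best_split(xs, ys)
```

Applied recursively to each side, this one-step search grows a full tree; the massive-data setting in the paper additionally requires out-of-core and sufficient-statistics tricks not shown here.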
Fast state discovery for HMM model selection and learning
 In Proc. Int’l Conference on Artificial Intelligence and Statistics
, 2007
"... Choosing the number of hidden states and their topology (model selection) and estimating model parameters (learning) are important problems for Hidden Markov Models. This paper presents a new statesplitting algorithm that addresses both these problems. The algorithm models more information about th ..."
Abstract

Cited by 11 (3 self)
Choosing the number of hidden states and their topology (model selection) and estimating model parameters (learning) are important problems for Hidden Markov Models. This paper presents a new state-splitting algorithm that addresses both these problems. The algorithm models more information about the dynamic context of a state during a split, enabling it to discover underlying states more effectively. Compared to previous top-down methods, the algorithm also touches a smaller fraction of the data per split, leading to faster model search and selection. Because of its efficiency and ability to avoid local minima, the state-splitting approach is a good way to learn HMMs even if the desired number of states is known beforehand. We compare our approach to previous work on synthetic data as well as several real-world data sets from the literature, revealing significant improvements in efficiency and test-set likelihoods. We also compare to previous algorithms on a sign-language recognition task, with positive results.
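A state-splitting search ranks candidate models by data likelihood. As a sketch of that scoring step only (not of the splitting algorithm itself), here is the standard forward algorithm for a discrete-observation HMM, which shows why a sequence of sticky runs favors a two-state model over a one-state one:

```python
from math import log

def forward_loglik(obs, pi, A, B):
    """log P(obs) for a discrete HMM with initial probabilities pi,
    transition matrix A[i][j], and emission matrix B[i][symbol]."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    ll = 0.0
    for t in range(1, len(obs)):
        scale = sum(alpha)            # rescale to avoid underflow
        ll += log(scale)
        alpha = [a / scale for a in alpha]
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                 for j in range(n)]
    ll += log(sum(alpha))
    return ll

# Two "sticky" states explain a run-structured sequence far better
# than a single state with the pooled emission distribution.
obs = [0] * 10 + [1] * 10
two_state = forward_loglik(obs,
                           pi=[0.5, 0.5],
                           A=[[0.9, 0.1], [0.1, 0.9]],
                           B=[[0.95, 0.05], [0.05, 0.95]])
one_state = forward_loglik(obs, pi=[1.0], A=[[1.0]], B=[[0.5, 0.5]])
```

A splitting algorithm repeats evaluations like these after each candidate split, which is why touching only a fraction of the data per split matters for speed.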
Testing for Shifts in Trend with an Integrated or Stationary Noise Component
, 2007
"... This paper considers the problem of testing for structural changes in the trend function of a univariate time series without any prior knowledge as to whether the noise component is stationary or contains an autoregressive unit root. We propose a new approach that builds on the work of Perron and Ya ..."
Abstract

Cited by 11 (3 self)
This paper considers the problem of testing for structural changes in the trend function of a univariate time series without any prior knowledge as to whether the noise component is stationary or contains an autoregressive unit root. We propose a new approach that builds on the work of Perron and Yabu (2005), based on a Feasible Quasi Generalized Least Squares procedure that uses a super-efficient estimate of the sum of the autoregressive parameters α when α = 1. In the case of a known break date, the resulting Wald test has a chi-square limit distribution in both the I(0) and I(1) cases. When the break date is unknown, the Exp functional of Andrews and Ploberger (1994) yields a test with nearly identical limit distributions in the two cases so that a testing procedure with nearly the same size in the I(0) and I(1) cases can be obtained. To improve the finite sample properties of the tests, we use the bias corrected version of the OLS estimate of α proposed by Roy and Fuller (2001). We show our procedure to be substantially more powerful than currently available alternatives and also to have a power function that is close to that attainable if we knew the true value of α in many cases. The extension to the case of multiple breaks is also discussed.
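As a toy illustration of the known-break-date case only (ignoring the FGLS correction and all of the I(1) complications the paper actually addresses), an OLS-based Wald statistic for a level shift in a linear trend can be computed as follows; all data here are made up for illustration:

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    k = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c and M[r][c] != 0.0:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

def wald_level_shift(y, tb):
    """OLS Wald statistic for a level shift at known date tb:
    y_t = mu + beta*t + gamma*1(t > tb) + e_t, testing gamma = 0."""
    n = len(y)
    X = [[1.0, float(t), 1.0 if t > tb else 0.0] for t in range(n)]
    XtX = [[sum(X[i][a] * X[i][c] for i in range(n)) for c in range(3)]
           for a in range(3)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(3)]
    beta = solve(XtX, Xty)
    resid = [y[i] - sum(bc * xc for bc, xc in zip(beta, X[i]))
             for i in range(n)]
    s2 = sum(r * r for r in resid) / (n - 3)
    v = solve(XtX, [0.0, 0.0, 1.0])[2]   # (X'X)^{-1} entry for gamma
    return beta[2] ** 2 / (s2 * v)

# Illustrative series: linear trend plus a small deterministic wiggle,
# with and without a shift of size 3 after date tb.
n, tb = 60, 30
e = [0.1 * (-1) ** t for t in range(n)]
with_break = [1.0 + 0.5 * t + (3.0 if t > tb else 0.0) + e[t]
              for t in range(n)]
no_break = [1.0 + 0.5 * t + e[t] for t in range(n)]
```

The paper's contribution is precisely what this sketch omits: making such a statistic behave the same way whether e_t is I(0) or I(1).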
mixtools: An R package for analyzing finite mixture models
 Journal of Statistical Software
, 2009
"... The mixtools package for R provides a set of functions for analyzing a variety of finite mixture models. These functions include both traditional methods, such as EM algorithms for univariate and multivariate normal mixtures, and newer methods that reflect some recent research in finite mixture mode ..."
Abstract

Cited by 11 (8 self)
The mixtools package for R provides a set of functions for analyzing a variety of finite mixture models. These functions include both traditional methods, such as EM algorithms for univariate and multivariate normal mixtures, and newer methods that reflect some recent research in finite mixture models. In the latter category, mixtools provides algorithms for estimating parameters in a wide range of different mixture-of-regression contexts, in multinomial mixtures such as those arising from discretizing continuous multivariate data, in nonparametric situations where the multivariate component densities are completely unspecified, and in semiparametric situations such as a univariate location mixture of symmetric but otherwise unspecified densities. Many of the algorithms of the mixtools package are EM algorithms or are based on EM-like ideas, so this article includes an overview of EM algorithms for finite mixture models.
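A minimal pure-Python sketch of what a call like mixtools' `normalmixEM` computes for a two-component univariate normal mixture (the starting values and data here are illustrative, not the package's defaults):

```python
from math import exp, pi, sqrt

def normpdf(x, mu, sigma):
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def em_two_normals(xs, mu1, mu2, s1=1.0, s2=1.0, lam=0.5, iters=200):
    """EM for a two-component univariate normal mixture."""
    for _ in range(iters):
        # E-step: responsibility of component 1 for each point
        r = []
        for x in xs:
            p1 = lam * normpdf(x, mu1, s1)
            p2 = (1 - lam) * normpdf(x, mu2, s2)
            r.append(p1 / (p1 + p2))
        # M-step: weighted re-estimates of mixing weight, means, sds
        n1 = sum(r)
        n2 = len(xs) - n1
        lam = n1 / len(xs)
        mu1 = sum(ri * x for ri, x in zip(r, xs)) / n1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, xs)) / n2
        s1 = sqrt(sum(ri * (x - mu1) ** 2 for ri, x in zip(r, xs)) / n1)
        s2 = sqrt(sum((1 - ri) * (x - mu2) ** 2
                      for ri, x in zip(r, xs)) / n2)
    return lam, (mu1, mu2), (s1, s2)

# Two well-separated clusters around 0 and 5:
xs = [-0.2, -0.1, 0.0, 0.1, 0.2, 4.8, 4.9, 5.0, 5.1, 5.2]
lam, mus, sigmas = em_two_normals(xs, mu1=1.0, mu2=4.0)
```

The mixture-of-regression, multinomial, and semiparametric variants in mixtools replace the normal E- and M-steps above with the corresponding component models, but keep the same alternating structure.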
DNA segmentation as a model selection process
 In International Conference on Research in Computational Molecular Biology (RECOMB)
"... Previous divideandconquer segmentation analyses of DNA sequences do not provide a satisfactory stopping criterion for the recursion. This paper proposes that segmentation be considered as a model selection process. Using the tools in model selection, a limit for the stopping criterion on the relax ..."
Abstract

Cited by 10 (1 self)
Previous divide-and-conquer segmentation analyses of DNA sequences do not provide a satisfactory stopping criterion for the recursion. This paper proposes that segmentation be considered as a model selection process. Using the tools in model selection, a limit for the stopping criterion on the relaxed end can be determined. The Bayesian information criterion, in particular, provides a much more stringent stopping criterion than what is currently used. Such a stringent criterion can be used to delineate larger DNA domains. A relationship between the stopping criterion and the average domain size is empirically determined, which may aid in the determination of isochore borders.
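A hedged sketch of the proposal for a binary sequence (e.g. a G+C indicator) under a per-segment Bernoulli model; the penalty of 2 parameters per split is an illustrative choice, not the paper's exact criterion. The recursion splits where the likelihood gain is largest and stops when twice the gain falls below the BIC penalty:

```python
from math import log

def seg_loglik(ones, n):
    """Maximized Bernoulli log-likelihood of a segment with `ones` 1s of n."""
    if ones == 0 or ones == n:
        return 0.0
    p = ones / n
    return ones * log(p) + (n - ones) * log(1 - p)

def segment(seq, lo=0, hi=None, penalty_params=2):
    """Recursive binary segmentation with a BIC stopping rule."""
    if hi is None:
        hi = len(seq)
    n = hi - lo
    total_ones = sum(seq[lo:hi])
    base = seg_loglik(total_ones, n)
    best_gain, best_split = 0.0, None
    left_ones = 0
    for s in range(lo + 1, hi):
        left_ones += seq[s - 1]
        gain = (seg_loglik(left_ones, s - lo)
                + seg_loglik(total_ones - left_ones, hi - s) - base)
        if gain > best_gain:
            best_gain, best_split = gain, s
    # BIC: accept the split only if 2*gain exceeds the parameter penalty
    if best_split is not None and 2 * best_gain > penalty_params * log(len(seq)):
        return segment(seq, lo, best_split) + segment(seq, best_split, hi)
    return [(lo, hi)]
```

A cleanly bipartite sequence is cut exactly once, while a homogeneous noisy sequence never clears the BIC penalty, which is the stringency property the abstract emphasizes.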
Subspace constrained Gaussian mixture models for speech recognition
 IEEE Transactions on Speech and Audio Processing
, 2005
"... Abstract — A standard approach to automatic speech recognition uses Hidden Markov Models whose state dependent distributions are Gaussian mixture models. Each Gaussian can be viewed as an exponential model whose features are linear and quadratic monomials in the acoustic vector. We consider here mod ..."
Abstract

Cited by 10 (3 self)
A standard approach to automatic speech recognition uses Hidden Markov Models whose state-dependent distributions are Gaussian mixture models. Each Gaussian can be viewed as an exponential model whose features are linear and quadratic monomials in the acoustic vector. We consider here models in which the weight vectors of these exponential models are constrained to lie in an affine subspace shared by all the Gaussians. This class of models includes Gaussian models with linear constraints placed on the precision (inverse covariance) matrices (such as diagonal covariance, MLLT, or EMLLT) as well as the LDA/HLDA models used for feature selection which tie the part of the Gaussians in the directions not used for discrimination. In this paper we present algorithms for training these models using a maximum likelihood criterion. We present experiments on both small vocabulary, resource constrained, grammar based tasks as well as large vocabulary, unconstrained resource tasks to explore the rather large parameter space of models that fit within our framework. In particular, we demonstrate that significant improvements can be obtained in both word error rate and computational complexity.
Selecting Hidden Markov Model State Number with Cross-Validated Likelihood
 Computational Statistics
"... Abstract: The problem of estimating the number of hidden states in a hidden Markov model is considered. Emphasis is placed on crossvalidated likelihood criteria. Using crossvalidation to assess the number of hidden states allows to circumvent the well documented ..."
Abstract

Cited by 8 (1 self)
The problem of estimating the number of hidden states in a hidden Markov model is considered. Emphasis is placed on cross-validated likelihood criteria. Using cross-validation to assess the number of hidden states allows one to circumvent the well documented …
Automated Detection and Classification of Positive vs. Negative Robot Interactions With Children With Autism Using Distance-Based Features
"... Recent feasibility studies involving children with autism spectrum disorders (ASD) interacting with socially assistive robots have shown that some children have positive reactions to robots, while others may have negative reactions. It is unlikely that children with ASD will enjoy any robot 100 % of ..."
Abstract

Cited by 8 (4 self)
Recent feasibility studies involving children with autism spectrum disorders (ASD) interacting with socially assistive robots have shown that some children have positive reactions to robots, while others may have negative reactions. It is unlikely that children with ASD will enjoy any robot 100% of the time. It is therefore important to develop methods for detecting negative child behaviors in order to minimize distress and facilitate effective human-robot interaction. Our past work has shown that negative reactions can be readily identified and classified by a human observer from overhead video data alone, and that an automated position tracker combined with human-determined heuristics can differentiate between the two classes of reactions. This paper describes and validates an improved, non-heuristic method for determining if a child is interacting positively or negatively with a robot, based on Gaussian mixture models (GMM) and a naive-Bayes classifier of overhead camera observations. The approach achieves a 91.4% accuracy rate in classifying robot interaction, parent interaction, avoidance, and hiding against the wall behaviors and demonstrates that these classes are sufficient for distinguishing between positive and negative reactions of the child to the robot.
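A sketch of the naive-Bayes classification stage under strong simplifications (a single Gaussian per class per feature rather than a full GMM; the distance features and class labels below are hypothetical, not the study's actual feature set):

```python
from math import log, pi

def fit_gnb(X, y):
    """Gaussian naive Bayes: per-class feature means, variances, log-priors."""
    model = {}
    for c in set(y):
        rows = [x for x, yi in zip(X, y) if yi == c]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        vars_ = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-6
                 for col, m in zip(zip(*rows), means)]
        model[c] = (log(len(rows) / len(X)), means, vars_)
    return model

def predict_gnb(model, x):
    """Pick the class maximizing log-prior plus Gaussian log-likelihoods."""
    def score(c):
        lp, means, vars_ = model[c]
        return lp + sum(-0.5 * log(2 * pi * v) - (xi - m) ** 2 / (2 * v)
                        for xi, m, v in zip(x, means, vars_))
    return max(model, key=score)

# Hypothetical distance-based features: (child-robot distance,
# child-wall distance), with made-up class labels.
X = [(1.0, 3.0), (1.2, 2.8), (0.9, 3.1),
     (5.0, 0.5), (5.2, 0.4), (4.8, 0.6)]
y = ["interacting", "interacting", "interacting",
     "hiding", "hiding", "hiding"]
model = fit_gnb(X, y)
```

Replacing each single Gaussian with a fitted mixture, as in the paper, changes only the per-class likelihood term; the maximum-a-posteriori decision rule stays the same.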