Results 11 – 20 of 36
Automatic selection of reliability estimates for individual predictions
Knowledge Engineering Review (in press), 2008
"... regression predictions ..."
Severity of Local Maxima for the EM Algorithm: Experiences with Hierarchical Latent Class Models
Cited by 3 (2 self)
Abstract:
It is common knowledge that the EM algorithm can be trapped at local maxima and consequently fail to reach global maxima. We empirically investigate the severity of this problem in the context of hierarchical latent class (HLC) models. Our experiments were run on HLC models where the dependency between neighboring variables is strong. (The reason for focusing on this class of models will be made clear in the main text.) We first ran EM from randomly generated single starting points, and observed that (1) the probability of hitting global maxima is generally high, (2) it increases with the strength of dependency and with sample size, and (3) it decreases with the amount of extreme probability values. We also observed that, at high dependence-strength levels, local maxima are far apart from global ones in terms of likelihood. These findings imply that local maxima can be reliably avoided by running EM from a few starting points, and hence are not a serious issue. This is confirmed by our second set of experiments.
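The conclusion that a few random restarts suffice can be illustrated with a minimal multi-start EM sketch (a 1-D Gaussian mixture with fixed unit variances, not the paper's HLC models; all names and numbers here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: two well-separated unit-variance components.
data = np.concatenate([rng.normal(-3, 1, 300), rng.normal(3, 1, 300)])

def em_gmm_1d(x, k, mu0, n_iter=100):
    """EM for a 1-D Gaussian mixture with fixed unit variances."""
    mu = np.asarray(mu0, dtype=float)
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2) / np.sqrt(2 * np.pi)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixing weights and means.
        nk = resp.sum(axis=0)
        pi, mu = nk / len(x), (resp * x[:, None]).sum(axis=0) / nk
    dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2) / np.sqrt(2 * np.pi)
    return mu, pi, np.log(dens.sum(axis=1)).sum()

# Run EM from a few random starting points; keep the highest likelihood.
runs = [em_gmm_1d(data, 2, rng.uniform(-5, 5, 2)) for _ in range(5)]
best = max(runs, key=lambda r: r[2])
print("best means:", np.sort(best[0]))  # close to [-3, 3]
```

Comparing the five log-likelihoods in `runs` mirrors the paper's observation: restarts that reach the global maximum stand out clearly by likelihood.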
TRUST-TECH Based Expectation Maximization for Learning Finite Mixture Models
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007
Cited by 2 (1 self)
Abstract:
The Expectation Maximization (EM) algorithm is widely used for learning finite mixture models despite its greedy nature. Most popular model-based clustering techniques might yield poor clusters if the parameters are not initialized properly. To reduce this sensitivity to initial points, a novel algorithm for learning mixture models from multivariate data is introduced in this paper. The proposed algorithm takes advantage of TRUST-TECH (TRansformation Under STability-reTaining Equilibria CHaracterization) to compute neighborhood local maxima on the likelihood surface using stability regions. Essentially, our method combines the advantages of traditional EM with the dynamic and geometric characteristics of the stability regions of the nonlinear dynamical system corresponding to the log-likelihood function. Two phases, namely the EM phase and the stability-region phase, are applied alternately in the parameter space to reach local maxima with improved likelihood values. The EM phase obtains a local maximum of the likelihood function, and the stability-region phase helps escape that local maximum by moving towards neighboring stability regions. Though applied to Gaussian mixtures in this paper, our technique can be easily generalized to any other parametric finite mixture model. The algorithm has been tested on both synthetic and real datasets, and the improvements in performance compared to other approaches are demonstrated. Robustness with respect to initialization is also illustrated experimentally.
Towards Reliable Reliability Estimates for Individual Regression Predictions
Cited by 1 (1 self)
Abstract:
In machine learning, reliability estimates for individual predictions provide more information about individual prediction error than the average accuracy of a predictive model (such as relative mean squared error). Individual reliability estimates may represent decisive information in risk-sensitive applications of machine learning (e.g. medicine, engineering, business), where they enable users to distinguish between more and less reliable predictions. In the paper we compare the sensitivity-based reliability estimates, developed in our previous work, with four novel approaches proposed or inspired by ideas from related work. The results, obtained using 8 regression models and 28 domains, indicate the potential of a sensitivity-based estimate, as well as of an approach that locally models the prediction error with regression trees. By combining pairs of individual estimates, we further designed a new estimate which performs better with neural networks and bagging. With various normalizations we improved the interpretability and comparability of the estimates' values without affecting their performance. Key words: regression, reliability, reliability estimate, sensitivity analysis, prediction accuracy, prediction error, local modeling, combining of estimates
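The "local modeling of prediction error" idea can be sketched generically: take the model's mean absolute residual on the k training points nearest to the query as the reliability estimate (a hypothetical nearest-neighbour variant for illustration; the paper's estimators differ in detail):

```python
import numpy as np

def local_error_reliability(x_query, X_train, y_train, model, k=5):
    """Reliability of a prediction at x_query, estimated as the model's
    mean absolute residual on the k nearest training points."""
    resid = np.abs(y_train - model(X_train))
    idx = np.argsort(np.linalg.norm(X_train - x_query, axis=1))[:k]
    return resid[idx].mean()

# Toy data: the target is noisier on the right half of the input space.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, (200, 1))
noise = np.where(X[:, 0] > 0, rng.normal(0, 1.0, 200), rng.normal(0, 0.05, 200))
y = X[:, 0] + noise
predict = lambda X_: X_[:, 0]  # stand-in for a fitted regression model

r_left = local_error_reliability(np.array([-2.0]), X, y, predict)
r_right = local_error_reliability(np.array([2.0]), X, y, predict)
print(r_left, r_right)  # the noisy region yields the larger (worse) estimate
```

A larger estimate flags a less reliable individual prediction even though the model's average accuracy is a single global number.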
LEARNING FROM DATA WITH COMPLEX INTERACTIONS AND AMBIGUOUS LABELS
Cited by 1 (0 self)
Abstract:
In this thesis, we develop and evaluate machine learning algorithms that can learn effectively from data with complex interactions and ambiguous labels. The need for such algorithms is motivated by problems such as protein-protein binding and drug-activity prediction. In the first part of the thesis, we focus on the problem of myopia, which arises when greedy learning strategies are applied to data with complex interactions. We present skewing, our approach to alleviating myopia. We describe theoretical and empirical results on Boolean data showing that our approach can learn effectively from data with complex interactions. We investigate the effects of various parameter choices on our approach, and the effects of dimensionality and class-label noise. We then propose and evaluate a variant that scales better to high-dimensional data. Finally, we propose and evaluate an extension that is able to learn from non-Boolean data with complex interactions similar to those in the Boolean case.
Learning Bayesian network structure from correlation-immune data
In Proceedings of the International Conference on Uncertainty in Artificial Intelligence (UAI), 2007
Cited by 1 (1 self)
Abstract:
Searching the complete space of possible Bayesian networks is intractable for problems of interesting size, so Bayesian network structure learning algorithms, such as the commonly used Sparse Candidate algorithm, employ heuristics. However, these heuristics also restrict the types of relationships that can be learned exclusively from data. They are unable to learn relationships that exhibit “correlation-immunity”, such as exclusive-OR and parity. To learn Bayesian networks in the presence of correlation-immune relationships, we extend the Sparse Candidate algorithm with a technique called “skewing”. This technique uses the observation that relationships that are correlation-immune under a specific input distribution may not be correlation-immune under another, sufficiently different distribution. We show that by extending Sparse Candidate with this technique we are able to discover relationships between random variables that are approximately correlation-immune, with a significantly lower computational cost than the alternative of considering multiple parents of a node at a time.
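The observation at the heart of skewing is easy to verify for exclusive-OR: each input has zero covariance with the output under the uniform distribution, but not under a sufficiently skewed one (a small worked check, not the paper's algorithm):

```python
from itertools import product

def cov_with_output(f, var, p_b1):
    """Covariance of Boolean input `var` (0 for a, 1 for b) with f(a, b),
    where a and b are independent, P(a=1) = 0.5 and P(b=1) = p_b1."""
    e_x = e_y = e_xy = 0.0
    for a, b in product([0, 1], repeat=2):
        w = 0.5 * (p_b1 if b else 1 - p_b1)  # probability of this input row
        x, y = (a, b)[var], f(a, b)
        e_x, e_y, e_xy = e_x + w * x, e_y + w * y, e_xy + w * x * y
    return e_xy - e_x * e_y

xor = lambda a, b: a ^ b
print(cov_with_output(xor, 0, 0.5))   # 0.0: correlation-immune when uniform
print(cov_with_output(xor, 0, 0.75))  # -0.125: skewing exposes the dependency
```

A greedy, one-parent-at-a-time heuristic scoring on the uniform data sees no signal at all; under the skewed distribution the same statistic becomes nonzero.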
Stability Region based Expectation Maximization for Model-based Clustering
Cited by 1 (1 self)
Abstract:
In spite of the initialization problem, the Expectation-Maximization (EM) algorithm is widely used for estimating parameters in several data-mining tasks. Most popular model-based clustering techniques might yield poor clusters if the parameters are not initialized properly. To reduce this sensitivity to initial points, a novel algorithm for learning mixture models from multivariate data is introduced in this paper. The proposed algorithm takes advantage of TRUST-TECH (TRansformation Under STability-reTaining Equilibria CHaracterization) to compute neighborhood local maxima on the likelihood surface using stability regions. Essentially, our method combines the advantages of traditional EM with the dynamic and geometric characteristics of the stability regions of the nonlinear dynamical system corresponding to the log-likelihood function. Two phases, namely the EM phase and the stability-region phase, are applied alternately in the parameter space to achieve improvements in the maximum likelihood. Though applied to Gaussian mixtures in this paper, our technique can be easily generalized to any other parametric finite mixture model. The algorithm has been tested on both synthetic and real datasets, and the improvements in performance compared to other approaches are demonstrated. Robustness with respect to initialization is also illustrated experimentally.
Exploiting Qualitative Domain Knowledge for Learning Bayesian Network Parameters with Incomplete Data
Abstract:
When a large amount of data is missing, or when multiple hidden nodes exist, learning parameters in Bayesian networks (BNs) becomes extremely difficult. This paper presents a learning algorithm that incorporates qualitative domain knowledge to regularize the otherwise ill-posed problem, limit the search space, and avoid local optima. Specifically, the problem is formulated as a constrained optimization problem, where the objective function is defined as a combination of the likelihood function and penalty functions constructed from the qualitative domain knowledge. A gradient-descent procedure is then systematically integrated with the E-step and M-step of the EM algorithm to estimate the parameters iteratively until convergence. The experiments show that our algorithm significantly improves the accuracy of the learned BN parameters over the conventional EM algorithm.
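The penalty idea can be sketched on a toy problem: a two-coin mixture learned by EM, with the qualitative constraint "coin 0 favors heads more than coin 1" (p0 ≥ p1) enforced by a gradient-descent step on a penalty term after each M-step. This is a hypothetical minimal example, not the paper's BN algorithm:

```python
import math
import numpy as np

def binom_pmf(k, n, p):
    return math.comb(n, int(k)) * p**k * (1 - p)**(n - k)

# Heads counts from 10 tosses per trial; which coin was tossed is hidden.
rng = np.random.default_rng(0)
true_p = np.array([0.8, 0.3])
heads = rng.binomial(10, true_p[rng.integers(0, 2, 100)])

def penalty_grad(p, w=5.0):
    """Gradient of the penalty w * max(0, p1 - p0)^2, which punishes
    violations of the qualitative constraint p0 >= p1."""
    v = max(0.0, p[1] - p[0])
    return np.array([-2 * w * v, 2 * w * v])

p = np.array([0.4, 0.6])  # start that violates the qualitative knowledge
for _ in range(50):
    # E-step: responsibility of each coin for each trial.
    lik = np.array([[binom_pmf(h, 10, pi) for pi in p] for h in heads])
    resp = lik / lik.sum(axis=1, keepdims=True)
    # M-step, followed by a gradient-descent step on the penalty term.
    p = (resp * heads[:, None]).sum(axis=0) / (10 * resp.sum(axis=0))
    p = np.clip(p - 0.1 * penalty_grad(p), 1e-6, 1 - 1e-6)
print(p)  # the constraint steers EM to the labelling with p[0] > p[1]
```

Without the penalty step, this start converges to the symmetric, swapped labelling; the qualitative knowledge disambiguates the two equally likely maxima.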
Stability Region based Methods for Learning and Discovery
Abstract:
Many problems that arise in machine learning and data mining involve nonlinearity and quite often require global optimal solutions rather than local ones. Several algorithms have been proposed in the optimization literature and adopted by the machine learning community. In what is popularly known as the initialization problem, the quality of the final parameters depends significantly on the initial values supplied by the user. In this paper, we propose stability-region-based methods for systematically exploring the subspace of the parameters to obtain neighborhood local optimal solutions. The proposed algorithm takes advantage of TRUST-TECH (TRansformation Under STability-reTaining Equilibria CHaracterization) to compute neighborhood local optimal solutions on the nonlinear surface in a systematic manner using stability regions. Our method explores the dynamic and geometric characteristics of stability boundaries of a nonlinear dynamical system corresponding to the nonlinear function of interest. Basically, our method coalesces the advantages of the traditional local optimizers with that of the