Results 1–10 of 16
Convex Structure Learning for Bayesian Networks: Polynomial Feature Selection and Approximate Ordering
, 2006
Cited by 11 (2 self)
We present a new approach to learning the structure and parameters of a Bayesian network based on regularized estimation in an exponential family representation. Here we show that, given a fixed variable order, the optimal structure and parameters can be learned efficiently, even without restricting the size of the parent variable sets. We then consider the problem of optimizing the variable order for a given set of features. This is still a computationally hard problem, but we present a convex relaxation that yields an optimal “soft” ordering in polynomial time. One novel aspect of the approach is that we do not perform a discrete search over DAG structures, nor over variable orders, but instead solve a continuous convex relaxation that can then be rounded to obtain a valid network structure. We conduct an experimental comparison against standard structure search procedures over standard objectives, which must cope with local minima, and evaluate the advantage of a convex relaxation that reduces the effect of those local minima.
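A small illustration of one ingredient mentioned in the abstract above: when the variable order is fixed and the score decomposes over families, each node's parent set can be chosen independently among its predecessors. The sketch below is my own toy code, not the paper's convex method; it uses plain BIC and exhaustive subset search rather than the regularized exponential-family formulation, on synthetic binary data.

```python
# Toy sketch (assumption: BIC score, exhaustive subset search): with a fixed
# variable order, a decomposable score lets each node pick its parent set
# among its predecessors independently of the other nodes' choices.
import itertools
import math
import random

def bic_family(data, child, parents):
    """BIC contribution of one node given a candidate parent set (binary vars)."""
    n = len(data)
    counts = {}
    for row in data:
        key = tuple(row[p] for p in parents)
        c = counts.setdefault(key, [0, 0])
        c[row[child]] += 1
    ll = 0.0
    for c0, c1 in counts.values():
        tot = c0 + c1
        for c in (c0, c1):
            if c:
                ll += c * math.log(c / tot)
    n_params = 2 ** len(parents)  # one free parameter per parent configuration
    return ll - 0.5 * n_params * math.log(n)

def learn_with_order(data, order):
    """Optimal parent sets under BIC for a fixed variable order."""
    parents = {}
    for i, v in enumerate(order):
        preds = order[:i]
        best = max(
            (frozenset(s) for r in range(len(preds) + 1)
             for s in itertools.combinations(preds, r)),
            key=lambda s: bic_family(data, v, sorted(s)))
        parents[v] = set(best)
    return parents

random.seed(0)
# Synthetic data: X1 strongly depends on X0; X2 is independent noise.
data = []
for _ in range(500):
    x0 = random.random() < 0.5
    x1 = (random.random() < 0.9) == x0
    x2 = random.random() < 0.5
    data.append([int(x0), int(x1), int(x2)])

# X1 should select {0} as its parent set; X0 has no predecessors.
print(learn_with_order(data, [0, 1, 2]))
```

With a strong X0→X1 dependence in the data, the search attaches X1 to parent X0, while the BIC penalty typically keeps the independent X2 parentless.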
Discriminative model selection for belief net structures
In Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI-05), 2005
Cited by 10 (4 self)
Bayesian belief nets (BNs) are often used for classification tasks, typically to return the most likely class label for a specified instance. Many BN learners, however, attempt to find the BN that maximizes a different objective function — viz., likelihood, rather than classification accuracy — typically by first using some model selection criterion to identify an appropriate graphical structure, then finding good parameters for that structure. This paper considers a number of possible criteria for selecting the best structure, both generative (i.e., based on likelihood: BIC, BDe) and discriminative (i.e., Conditional BIC (CBIC), resubstitution Classification Error (CE), and Bias²+Variance (BV)). We empirically compare these criteria against a variety of different “correct BN structures”, both real-world and synthetic, over a range of complexities. We also explore different ways to set the parameters, dealing with two issues: (1) should we seek the parameters that maximize likelihood, or the ones that maximize conditional likelihood? (2) should we use (i) the entire training sample both to learn the best parameters and to evaluate the models, or (ii) one partition for parameter estimation and another for evaluation (cross-validation)? Our results show that the discriminative BV model selection criterion is one of the best measures for identifying the optimal structure, while the discriminative CBIC performs poorly; that we should use the parameters that maximize likelihood; and that it is typically better to use cross-validation here.
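To make one of the discriminative criteria above concrete, here is a toy computation of resubstitution classification error (CE) for a naive-Bayes-structured classifier. This is illustrative code of my own, not from the paper; the data generator and Laplace smoothing are assumptions.

```python
# Illustration (my own toy setup): resubstitution classification error (CE),
# a discriminative criterion, for a naive Bayes structure (class C is the
# sole parent of each feature). Parameters are Laplace-smoothed estimates.
import math
import random

def fit_naive_bayes(data):
    """data: list of (features, label) pairs with binary features and labels."""
    prior = [1.0, 1.0]  # Laplace pseudo-counts per class
    k = len(data[0][0])
    cond = [[[1.0, 1.0] for _ in range(k)] for _ in range(2)]  # cond[c][j][x]
    for x, y in data:
        prior[y] += 1
        for j, xj in enumerate(x):
            cond[y][j][xj] += 1
    return prior, cond

def predict(model, x):
    prior, cond = model
    scores = []
    for c in (0, 1):
        # Unnormalized prior count is fine for argmax (shared denominator).
        s = math.log(prior[c])
        for j, xj in enumerate(x):
            s += math.log(cond[c][j][xj] / (cond[c][j][0] + cond[c][j][1]))
        scores.append(s)
    return 0 if scores[0] >= scores[1] else 1

def resubstitution_ce(data):
    """Fraction of training instances the fitted model misclassifies."""
    model = fit_naive_bayes(data)
    errors = sum(predict(model, x) != y for x, y in data)
    return errors / len(data)

random.seed(1)
# Each of 3 binary features agrees with the label with probability 0.8.
data = []
for _ in range(400):
    y = int(random.random() < 0.5)
    x = [int((random.random() < 0.8) == bool(y)) for _ in range(3)]
    data.append((x, y))
print("resubstitution CE:", resubstitution_ce(data))
```

With this generator, the Bayes-optimal error is about 0.10 (two of three features must flip), so the resubstitution CE lands near that value; as the abstract notes, resubstitution estimates are optimistic, which is one motivation for the cross-validated variants they also study.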
Multi-View 3D Object Description with Uncertain Reasoning and Machine Learning
, 2001
A New Hybrid Method for Bayesian Network Learning With Dependency Constraints
Cited by 3 (2 self)
A Bayes net has qualitative and quantitative aspects: the qualitative aspect is its graphical structure, which corresponds to correlations among the variables in the Bayes net; the quantitative aspects are the net parameters. This paper develops a hybrid criterion for learning Bayes net structures that is based on both aspects. We combine model selection criteria measuring data fit with correlation information from statistical tests: given a sample d, search for a structure G that maximizes score(G, d) over the set of structures that satisfy the dependencies detected in d. We rely on the statistical test only to accept conditional dependencies, not conditional independencies. We show how to adapt local search algorithms to accommodate the observed dependencies. Simulation studies with GES search and the BDeu/BIC scores provide evidence that the additional dependency information leads to Bayes nets that better fit the target model in distribution and structure.
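A toy rendering of the hybrid idea above (my own sketch, not the paper's GES-based implementation): a G-test accepts pairwise dependencies, and the search then maximizes BIC only over structures that include every accepted dependency. Edge directions follow an assumed fixed order X0 < X1 < X2.

```python
# Hedged sketch of the hybrid criterion: statistical tests are used only to
# *accept* dependencies; a score (here BIC) is then maximized over the
# structures that keep every detected dependency. Three binary variables,
# edges restricted to the fixed order X0 < X1 < X2.
import itertools
import math
import random

CHI2_05_DF1 = 3.841  # 5% critical value of chi-squared, 1 degree of freedom

def g_test(data, i, j):
    """G-statistic for independence of binary columns i and j."""
    n = len(data)
    obs = [[0.0] * 2 for _ in range(2)]
    for row in data:
        obs[row[i]][row[j]] += 1
    g = 0.0
    for a in (0, 1):
        for b in (0, 1):
            e = sum(obs[a]) * (obs[0][b] + obs[1][b]) / n
            if obs[a][b] > 0:
                g += 2 * obs[a][b] * math.log(obs[a][b] / e)
    return g

def bic(data, structure):
    """structure: dict node -> tuple of parents. Binary variables, MLE fit."""
    n = len(data)
    score = 0.0
    for v, pa in structure.items():
        counts = {}
        for row in data:
            c = counts.setdefault(tuple(row[p] for p in pa), [0, 0])
            c[row[v]] += 1
        for c0, c1 in counts.values():
            for c in (c0, c1):
                if c:
                    score += c * math.log(c / (c0 + c1))
        score -= 0.5 * (2 ** len(pa)) * math.log(n)
    return score

def hybrid_search(data):
    required = {(i, j) for i, j in itertools.combinations(range(3), 2)
                if g_test(data, i, j) > CHI2_05_DF1}
    all_edges = [(0, 1), (0, 2), (1, 2)]  # consistent with the fixed order
    candidates = []
    for r in range(4):
        for edges in itertools.combinations(all_edges, r):
            if required <= set(edges):  # constraint: keep accepted dependencies
                structure = {v: tuple(i for i, j in edges if j == v)
                             for v in range(3)}
                candidates.append((bic(data, structure), edges))
    return max(candidates)

random.seed(3)
# X1 depends strongly on X0; X2 is independent.
data = []
for _ in range(400):
    x0 = int(random.random() < 0.5)
    x1 = int((random.random() < 0.85) == bool(x0))
    x2 = int(random.random() < 0.5)
    data.append([x0, x1, x2])
print(hybrid_search(data))  # the detected (0, 1) dependency must be kept
```

The constraint matters when the score alone would drop a weak but statistically significant dependency; here both the test and BIC agree on the X0→X1 edge.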
Learning Dynamic Bayesian Network Models Via Cross-Validation
Cited by 1 (0 self)
We study cross-validation as a scoring criterion for learning dynamic Bayesian network models that generalize well. We argue that cross-validation is more suitable than the Bayesian scoring criterion for one of the most common interpretations of generalization. We confirm this by carrying out an experimental comparison of cross-validation and the Bayesian scoring criterion, as implemented by the Bayesian Dirichlet metric and the Bayesian information criterion. The results show that cross-validation leads to models that generalize better for a wide range of sample sizes.
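A minimal sketch of cross-validated held-out log-likelihood as a scoring criterion, in the spirit of the abstract above, applied to a static two-variable toy problem rather than a dynamic BN; the data generator and Laplace smoothing are my own assumptions.

```python
# Toy sketch (static, not dynamic, BN): k-fold cross-validated held-out
# log-likelihood as a scoring criterion for choosing between two structures
# over binary variables X0, X1: "independent" versus "X0 -> X1".
import math
import random

def loglik(train, test, dependent):
    """Held-out log-likelihood with Laplace smoothing."""
    c0 = [1.0, 1.0]  # counts of X0
    for a, b in train:
        c0[a] += 1
    ll = 0.0
    if dependent:
        c1 = {0: [1.0, 1.0], 1: [1.0, 1.0]}  # counts of X1 given X0
        for a, b in train:
            c1[a][b] += 1
        for a, b in test:
            ll += math.log(c0[a] / sum(c0))
            ll += math.log(c1[a][b] / sum(c1[a]))
    else:
        cb = [1.0, 1.0]  # marginal counts of X1
        for a, b in train:
            cb[b] += 1
        for a, b in test:
            ll += math.log(c0[a] / sum(c0)) + math.log(cb[b] / sum(cb))
    return ll

def cv_score(data, dependent, k=5):
    """Sum of held-out log-likelihoods over k folds (deterministic shuffle)."""
    random.Random(0).shuffle(data)
    folds = [data[i::k] for i in range(k)]
    total = 0.0
    for i in range(k):
        test = folds[i]
        train = [r for j, f in enumerate(folds) if j != i for r in f]
        total += loglik(train, test, dependent)
    return total

random.seed(2)
# X1 agrees with X0 with probability 0.85, so the dependent model should win.
data = [(a, int((random.random() < 0.85) == bool(a)))
        for a in (int(random.random() < 0.5) for _ in range(300))]
print("CV score, independent:", cv_score(data, False))
print("CV score, X0 -> X1  :", cv_score(data, True))
```

Unlike BIC's fixed complexity penalty, the penalty here is implicit: an over-parameterized structure only loses if its extra parameters fail to improve held-out fit.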
Structure learning of probabilistic graphical models: a comprehensive survey, arXiv preprint arXiv:1111.6925
, 2011
Conditional Log-likelihood MDL and Evolutionary MCMC
PhD thesis
, 2006
In today's society there is increasing interest in intelligent techniques that can automatically process, analyze, and summarize the ever-growing amount of data. Artificial intelligence is a research field that studies intelligent algorithms to support people in making decisions. Algorithms that are able to induce knowledge from examples are researched in the field of machine learning. This thesis studies improvements of particular machine learning algorithms. In the first part of this thesis we describe methods that are able to select useful attributes (or features) that can be used as inputs by a classification algorithm. We focus on Bayesian network classifiers, which use Bayesian networks as their knowledge representation, and in particular on selecting relevant attributes to use as inputs for the Bayesian network classifier. Toward our goal of constructing selective Bayesian network classifiers, we propose and investigate a score function that can evaluate Bayesian network classifiers and that indicates the simplest and best-performing classifier. We show theoretically and experimentally that our proposed conditional log-likelihood minimum description length (MDL) score is well suited for constructing simple and well-performing Bayesian network classifiers. In the second part of this thesis we integrate some methods from evolutionary computation into a Markov chain Monte Carlo (MCMC) sampler. Sampling is related to optimization, but whereas in optimization we are interested only in the state with the highest fitness, in sampling we are interested in the overall probability distribution over states. To improve the MCMC methods that are often used for sampling, we investigate the Evolutionary MCMC (EMCMC) framework, in which population-based MCMCs exchange information between individual states such that they remain MCMCs at the population level. We investigate and propose various evolutionary techniques (e.g. recombination, selection), which we then integrate into the EMCMC framework. We show experimentally that our proposed EMCMCs can outperform standard MCMC algorithms.
LEARNING BAYESIAN NETWORKS FROM DATA: STRUCTURE OPTIMIZATION AND PARAMETER ESTIMATION
A Bayesian Belief Network Classifier for Predicting Victimization in National Crime Victimization Survey
This paper presents the development of a Bayes net classifier for prediction of a victimization attribute value for the National Crime Victimization Survey dataset. The National Crime Victimization Survey dataset has over 250 attributes and 216,000 data points, and as such poses a large-scale problem context for classifier development. The classifier was developed using the Weka machine learning software workbench. A set of structural and parameter learning algorithms for the Bayesian belief network were employed in the development effort while ensuring that the computational complexity in both time and space remained within affordable bounds. A number of structural learning algorithms, including local versions of hill-climbing and K2, provided a classification performance of 99% on the testing data. Simulation results indicate that it is feasible to develop a successful Bayesian belief network classifier for the victimization attribute of the National Crime Victimization Survey data.
Approved Dissertation
The dissertation was submitted to the Technische Universität München on 07.01.2008 and accepted by the Faculty of Informatics on 28.01.2008. The ever-increasing amount of information in every scientific and industrial domain has posed an exciting challenge for computer scientists: to handle vast amounts of data and to represent human understanding of a domain in a systematic and mathematical way. For decades, probabilistic modeling with probability theory and statistical learning algorithms has been popular for accomplishing this task, owing to the stochastic character of nature. Quantitative measurements are generated by various kinds of "sensors" in all types of science and industry, and we need to make sense of these data, i.e. to extract important patterns and trends and to understand "what the data says". This is often called learning from data, reverse engineering, or bottom-up modeling.