Results 1–10 of 25
Learning Optimal Bayesian Networks: A Shortest Path Perspective
, 2013
Abstract

Cited by 8 (4 self)
In this paper, learning a Bayesian network structure that optimizes a scoring function for a given dataset is viewed as a shortest path problem in an implicit state-space search graph. This perspective highlights the importance of two research issues: the development of search strategies for solving the shortest path problem, and the design of heuristic functions for guiding the search. This paper introduces several techniques for addressing these issues. One is an A* search algorithm that learns an optimal Bayesian network structure by searching only the most promising part of the solution space. The others are two heuristic functions. The first heuristic function represents a simple relaxation of the acyclicity constraint of a Bayesian network. Although admissible and consistent, this heuristic may introduce too much relaxation and result in a loose bound. The second heuristic function reduces the amount of relaxation by avoiding directed cycles within some groups of variables. Empirical results show that these methods constitute a promising approach to learning optimal Bayesian network structures.
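The search this abstract describes can be sketched in a few lines: states are the subsets of variables already placed in an ordering, a step adds one variable with its cheapest parent set drawn from the placed ones, and the heuristic drops the acyclicity constraint exactly as the "simple relaxation" above does. The local scores below are made-up toy numbers, not from any dataset, and the code is an illustrative sketch, not the paper's implementation.

```python
import heapq
from itertools import count

# Toy local scores (lower is better): cost[v][parent_set].
# These numbers are purely illustrative.
V = ('A', 'B', 'C')
cost = {
    'A': {frozenset(): 1.0, frozenset('B'): 0.4, frozenset('C'): 0.8, frozenset('BC'): 0.3},
    'B': {frozenset(): 0.9, frozenset('A'): 0.5, frozenset('C'): 0.7, frozenset('AC'): 0.4},
    'C': {frozenset(): 0.6, frozenset('A'): 0.5, frozenset('B'): 0.2, frozenset('AB'): 0.1},
}

def best_cost(v, allowed):
    """Cheapest parent set for v drawn only from `allowed`."""
    return min(c for ps, c in cost[v].items() if ps <= allowed)

def heuristic(done):
    """Relaxed cost-to-go: each remaining variable may take ANY other
    variable as a parent (acyclicity dropped), so the estimate never
    overshoots -- admissible, though possibly loose."""
    return sum(best_cost(v, frozenset(V) - {v}) for v in V if v not in done)

def astar():
    start, goal = frozenset(), frozenset(V)
    tie = count()                       # tie-breaker so the heap never compares sets
    pq = [(heuristic(start), 0.0, next(tie), start)]
    g_best = {start: 0.0}
    while pq:
        _, g, _, done = heapq.heappop(pq)
        if done == goal:
            return g                    # total score of an optimal ordering
        for v in V:
            if v in done:
                continue
            nxt, ng = done | {v}, g + best_cost(v, done)
            if ng < g_best.get(nxt, float('inf')):
                g_best[nxt] = ng
                heapq.heappush(pq, (ng + heuristic(nxt), ng, next(tie), nxt))
```

On these toy scores the optimal total is 1.4 (e.g., the order B, A, C), and A* finds it without expanding every one of the 2^3 subsets.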
Discriminative Learning of Bayesian Networks via Factorized Conditional Log-Likelihood
Abstract

Cited by 8 (0 self)
We propose an efficient and parameter-free scoring criterion, the factorized conditional log-likelihood (ˆfCLL), for learning Bayesian network classifiers. The proposed score is an approximation of the conditional log-likelihood criterion. The approximation is devised to guarantee decomposability over the network structure, as well as efficient estimation of the optimal parameters, achieving the same time and space complexity as the traditional log-likelihood scoring criterion. The resulting criterion has an information-theoretic interpretation based on interaction information, which exhibits its discriminative nature. To evaluate the performance of the proposed criterion, we present an empirical comparison with state-of-the-art classifiers. Results on a large suite of benchmark data sets from the UCI repository show that ˆfCLL-trained classifiers achieve at least as good accuracy as the best compared classifiers, using significantly less computational resources.
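The decomposability the abstract guarantees is the property that makes score-based search tractable: the total score of a structure is a sum of independent per-family terms, so changing one variable's parents re-evaluates only that family. The sketch below illustrates this with the plain (non-conditional) log-likelihood criterion, whose complexity ˆfCLL matches; the dataset is a made-up toy, and this is not the paper's fCLL formula.

```python
import math
from collections import Counter

# Tiny binary dataset over two variables (X = index 0, Y = index 1); illustrative only.
data = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0)]

def family_ll(data, child, parents):
    """Maximized log-likelihood contribution of one family (child given
    its parents). Decomposability means a structure's total score is
    simply the sum of such per-family terms."""
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    marg = Counter(tuple(row[p] for p in parents) for row in data)
    # sum over (parent-config, child-value) cells: n * log P(child | parents)
    return sum(n * math.log(n / marg[pa]) for (pa, _), n in joint.items())

# Score of the structure X -> Y: one independent term per family.
score = family_ll(data, 0, ()) + family_ll(data, 1, (0,))
```

Swapping Y's parent set from (0,) to () would touch only the second term, which is exactly why decomposable criteria pair well with local search.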
Learning Locally Minimax Optimal Bayesian Networks
Abstract

Cited by 7 (4 self)
We consider the problem of learning Bayesian network models in a non-informative setting, where the only available information is a set of observational data, and no background knowledge is available. The problem can be divided into two subtasks: learning the structure of the network (a set of independence relations), and learning the parameters of the model (which fix the probability distribution among all distributions consistent with the chosen structure). Few theoretical frameworks handle both of these problems together consistently, the Bayesian framework being an exception. In this paper we propose an alternative, information-theoretic framework which sidesteps some of the technical problems facing the Bayesian approach. The framework is based on the minimax-optimal Normalized Maximum Likelihood (NML) distribution, which is motivated by the Minimum Description Length (MDL) principle. The resulting model selection criterion is consistent, and it provides a way to construct highly predictive Bayesian network models. Our empirical tests show that the proposed method compares favorably with alternative approaches in both model selection and prediction tasks.
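The NML distribution mentioned above divides the maximized likelihood of the observed data by a normalizer summed over all possible datasets of the same size; that normalizer is the model's parametric complexity, and the resulting code length is its stochastic complexity. A minimal sketch for the one-parameter Bernoulli model (not the paper's full Bayesian-network construction) is:

```python
import math

def bernoulli_complexity(n):
    """Parametric complexity of the Bernoulli model: the sum, over all
    2^n binary sequences of length n (grouped by their count of ones),
    of each sequence's maximized likelihood. This is the normalizer
    that turns the maximized likelihood into a proper distribution."""
    # (k/n)**k * (1-k/n)**(n-k) relies on Python's 0.0**0 == 1.0 at the edges.
    return sum(math.comb(n, k) * (k / n) ** k * (1 - k / n) ** (n - k)
               for k in range(n + 1))

def stochastic_complexity(k, n):
    """Negative log NML code length of a binary sequence with k ones in
    n trials: -log [ P(data | ML parameters) / normalizer ]. Smaller is
    better; the normalizer is the data-independent flexibility penalty."""
    max_ll = (k * math.log(k / n) if k else 0.0) + \
             ((n - k) * math.log(1 - k / n) if k < n else 0.0)
    return math.log(bernoulli_complexity(n)) - max_ll
```

For example, bernoulli_complexity(2) is 2.5: the four length-2 sequences contribute 1 + 0.25 + 0.25 + 1 under their own maximum-likelihood parameters.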
Efficient Heuristics for Discriminative Structure Learning of Bayesian Network Classifiers
Abstract

Cited by 4 (0 self)
We introduce a simple order-based greedy heuristic for learning discriminative structure within generative Bayesian network classifiers. We propose two methods for establishing an order of N features, based on the conditional mutual information and the classification rate (i.e., risk), respectively. Given an ordering, we can find a discriminative structure with O(N^(k+1)) score evaluations (where the constant k is the treewidth of the subgraph over the attributes). We present results on 25 data sets from the UCI repository, for phonetic classification using the TIMIT database, for a visual surface inspection task, and for two handwritten digit recognition tasks. We provide classification performance for both discriminative and generative parameter learning on both discriminatively and generatively structured networks. The discriminative structure found by our new procedures significantly outperforms generatively produced structures, and achieves a classification accuracy on par with the best discriminative (greedy) Bayesian network learning approach, but does so with a speedup of a factor of ∼10–40. We also show that the advantages of generative, discriminatively structured Bayesian network classifiers still hold in the case of missing features, a case where generative classifiers have an advantage over discriminative classifiers.
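The key step once an ordering is fixed is simple: each variable's parents may only come from its predecessors, so the N family choices are independent and can each be made greedily over parent sets of size at most k. A sketch under that reading, with a made-up local score (the two ordering methods themselves are not implemented here):

```python
from itertools import combinations

def best_parents(v, predecessors, local_score, k):
    """Best parent set for v chosen only among its predecessors in the
    order, with at most k parents. Because the order fixes acyclicity,
    each family is optimized independently: O(N^(k+1)) score
    evaluations over the whole network."""
    candidates = [frozenset(c) for r in range(k + 1)
                  for c in combinations(predecessors, r)]
    return max(candidates, key=lambda ps: local_score(v, ps))

def structure_for_order(order, local_score, k=1):
    """Map each variable to its best parent set under the given order."""
    return {v: best_parents(v, order[:i], local_score, k)
            for i, v in enumerate(order)}

# Illustrative score: one rewarded family, a mild size penalty elsewhere.
bonus = {('B', frozenset('A')): 2.0}
local_score = lambda v, ps: bonus.get((v, ps), -len(ps))
net = structure_for_order(['A', 'B', 'C'], local_score, k=1)
```

Here only the edge A → B pays off, so the sketch returns {A: no parents, B: {A}, C: no parents}.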
A New Hybrid Method for Bayesian Network Learning With Dependency Constraints
Abstract

Cited by 3 (2 self)
A Bayes net has qualitative and quantitative aspects: the qualitative aspect is its graphical structure, which corresponds to correlations among the variables in the Bayes net; the quantitative aspects are the net parameters. This paper develops a hybrid criterion for learning Bayes net structures that is based on both aspects. We combine model selection criteria measuring data fit with correlation information from statistical tests: given a sample d, search for a structure G that maximizes score(G, d) over the set of structures G that satisfy the dependencies detected in d. We rely on the statistical test only to accept conditional dependencies, not conditional independencies. We show how to adapt local search algorithms to accommodate the observed dependencies. Simulation studies with GES search and the BDeu/BIC scores provide evidence that the additional dependency information leads to Bayes nets that better fit the target model in distribution and structure.
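The one-sided use of the test described above — accept dependencies, never independencies — can be sketched as follows. The chi-square test is standard; the constraint check is a deliberate simplification (the paper works with d-connection, while this sketch only demands an adjacency covering each accepted dependency), and the 3.84 critical value is the usual 95% cutoff for one degree of freedom.

```python
def chi2_statistic(table):
    """Pearson chi-square statistic for a 2x2 contingency table."""
    n = sum(map(sum, table))
    stat = 0.0
    for i in (0, 1):
        for j in (0, 1):
            expected = sum(table[i]) * (table[0][j] + table[1][j]) / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat

def accepted_dependency(table, critical=3.84):
    """Accept a dependency only when the statistic clears the critical
    value; a non-significant result is simply ignored, never read as
    evidence of independence, matching the hybrid criterion."""
    return chi2_statistic(table) > critical

def satisfies(edges, required_pairs):
    """Sketch-level constraint check: every accepted dependency must be
    covered by an adjacency in the candidate structure. (A stand-in
    for the paper's d-connection requirement.)"""
    adjacent = {frozenset(e) for e in edges}
    return all(frozenset(p) in adjacent for p in required_pairs)
```

A strongly skewed table such as [[40, 10], [10, 40]] is accepted (statistic 36), while a perfectly balanced one is not; the search would then score only structures for which satisfies(...) holds.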
Finding Optimal Bayesian Networks Using Precedence Constraints
, 2013
Abstract

Cited by 1 (0 self)
We consider the problem of finding a directed acyclic graph (DAG) that optimizes a decomposable Bayesian network score. While in a favorable case an optimal DAG can be found in polynomial time, in the worst case the fastest known algorithms rely on dynamic programming across the node subsets, taking time and space 2^n, to within a factor polynomial in the number of nodes n. In practice, these algorithms are feasible for networks of at most around 30 nodes, mainly due to the large space requirement. Here, we generalize the dynamic programming approach to enhance its feasibility in three dimensions: first, the user may trade space against time; second, the proposed algorithms easily and efficiently parallelize onto thousands of processors; third, the algorithms can exploit any prior knowledge about the precedence relation on the nodes. Underlying all these results is the key observation that, given a partial order P on the nodes, an optimal DAG compatible with P can be found in time and space roughly proportional to the number of ideals of P, which can be significantly less than 2^n. Considering sufficiently many carefully chosen partial orders guarantees that a globally optimal DAG will be found. Aside from the generic scheme, we present and analyze concrete tradeoff schemes based on parallel bucket orders.
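The quantity driving the complexity claim above is the number of ideals (down-closed sets) of the partial order P: the dynamic program only visits subsets that are ideals, instead of all 2^n subsets. A brute-force enumerator makes the gap concrete (a sketch for small n only; the paper's algorithms never enumerate this way):

```python
from itertools import combinations

def ideals(n, precedes):
    """All ideals (down-closed sets) of a partial order on {0,...,n-1};
    `precedes` holds pairs (u, v) meaning u must precede v. A subset is
    an ideal when it contains every stated predecessor of each member."""
    found = []
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            s = set(subset)
            if all(u in s for (u, v) in precedes if v in s):
                found.append(frozenset(s))
    return found

# A chain 0 < 1 < 2 < 3 has only its 5 prefixes as ideals,
# while the empty order admits all 2**4 = 16 subsets.
chain = {(0, 1), (1, 2), (2, 3)}
```

For a chain on n nodes the count drops from 2^n to n + 1, which is why constraining the DP to orders compatible with P can shrink both time and space so dramatically.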
SparsityBoost: A New Scoring Function for Learning Bayesian Network Structure
Abstract

Cited by 1 (0 self)
We give a new consistent scoring function for structure learning of Bayesian networks. In contrast to traditional approaches to score-based structure learning, such as BDeu or MDL, the complexity penalty that we propose is data-dependent and is given by the probability that a conditional independence test correctly shows that an edge cannot exist. What really distinguishes this new scoring function from earlier work is that it becomes computationally easier to maximize as the amount of data increases. We prove a polynomial sample complexity result, showing that maximizing this score is guaranteed to correctly learn a structure with no false edges and a distribution close to the generating distribution, whenever there exists a Bayesian network which is a perfect map for the data-generating distribution. Although the new score can be used with any search algorithm, we give empirical results showing that it is particularly effective when used together with a linear programming relaxation approach to Bayesian network structure learning.
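The shape of such a data-dependent score can be sketched as a fit term plus a per-absent-edge bonus that grows with the statistical confidence behind each absence. The log-odds weighting below is an illustrative stand-in, not the paper's exact formula; `p_correct[e]` is assumed to come from a conditional independence test on the data.

```python
import math

def absence_evidence(p_correct):
    """Log-odds that the independence test correctly rules an edge out;
    an illustrative stand-in for the paper's exact bonus term."""
    return math.log(p_correct / (1.0 - p_correct))

def sparsity_boost_score(loglik, absent_edges, p_correct, beta=1.0):
    """Sketch of a data-dependent score: the usual fit term plus a
    bonus for every ABSENT edge, weighted by how strongly the data
    supports that absence. More data sharpens p_correct, so the score
    increasingly rewards well-supported sparse structures."""
    return loglik + beta * sum(absence_evidence(p_correct[e])
                               for e in absent_edges)
```

With near-certain evidence (p_correct = 0.99) dropping an edge can overcome a small loss of fit, whereas at p_correct = 0.5 the bonus vanishes and fit alone decides.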
A Comparison of Models for Gene Regulatory Networks Inference
Abstract
Gene regulatory networks are complex networks whose nodes represent genes, transcription factors, microRNAs and other components or modules, and whose edges represent their mutual interactions. These networks can reveal and depict the fundamental gene regulatory mechanisms in the cell. In this paper we compare results of gene regulatory network inference from gene expression microarray data. We used dynamic Bayesian networks, Boolean networks and graphical Gaussian models as models for network inference, applied to three gene expression datasets of different sizes simulated using a simple autoregressive process. After network inference, we compared the values of the area under the ROC curve (AUC) as a validation measure. Some directions for a further improved approach to GRN reconstruction, which will include prior knowledge, are proposed at the end of this paper.
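The AUC validation used above treats network inference as ranking candidate edges: each edge gets a confidence score, and AUC is the probability that a randomly chosen true edge outranks a randomly chosen absent one. A minimal rank-sum implementation (toy scores, not the paper's data):

```python
def auc(scores, labels):
    """Area under the ROC curve via its rank-sum (Mann-Whitney) form:
    the fraction of (true edge, absent edge) pairs where the true edge
    is scored higher, with ties counted as half a win."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A perfect ranking gives 1.0, a perfectly inverted one 0.0, and uninformative scores hover around 0.5, which is what makes AUC a convenient single-number comparison across the three model classes.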
Spatiotemporal Information Fusion for Fault Detection in Shipboard Auxiliary Systems
Abstract
This paper addresses the issues of data analysis and sensor fusion that are critical for information management leading to (real-time) fault detection and classification in distributed physical processes (e.g., shipboard auxiliary systems). The proposed technique utilizes a semantic framework for multi-sensor data modeling, where the complexity is reduced by pruning the sensor network through an information-theoretic (e.g., mutual-information-based) approach. The underlying algorithms are developed to achieve high reliability and computational efficiency while retaining the essential spatiotemporal characteristics of the physical system. The concept is validated on a simulation test bed of shipboard auxiliary systems.
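The mutual-information pruning mentioned above can be sketched directly: estimate the empirical mutual information between each discretized sensor stream and a reference (e.g., target) stream, and keep only sensors above a threshold. The streams, names, and threshold below are assumptions for illustration, not the paper's setup.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discretized
    sensor streams of equal length: sum of p(x,y) * log(p(x,y) / (p(x) p(y)))."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(c / n * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def prune_sensors(streams, target, threshold):
    """Keep only the sensors sharing at least `threshold` nats of
    information with the target stream -- the kind of information-
    theoretic cut used to shrink the sensor network."""
    return [name for name, xs in streams.items()
            if mutual_information(xs, target) >= threshold]

# Hypothetical streams: 'a' copies the target, 'b' is independent of it.
streams = {'a': [0, 0, 1, 1], 'b': [0, 1, 0, 1]}
kept = prune_sensors(streams, [0, 0, 1, 1], threshold=0.5)
```

With these toy streams only sensor 'a' survives the cut, since an identical binary stream carries log 2 ≈ 0.69 nats while an independent one carries none.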
Chapter X DATA-ENABLED HEALTH MANAGEMENT OF COMPLEX INDUSTRIAL SYSTEMS∗
Abstract
Complex industrial systems, such as gas turbine engines, have many heterogeneous subsystems (e.g., thermal, mechanical, hydraulic and electrical) with complex thermal-hydraulic, electromechanical, and electronic interactions. Schedule-based policies are largely followed in industry for the maintenance of complex systems. Since the degradation profiles of individual systems usually differ due to manufacturing and usage variations, a scheduled maintenance policy in most cases becomes either overly conservative or a source of serious safety concern. This calls for the development of reliable and cost-effective health management for complex industrial systems. However, reliable first-principles modeling of such systems may often be very expensive (e.g., in terms of computational memory, execution time, and cost) and hence may become inappropriate for condition monitoring, fault detection, diagnostics and prognostics. This chapter describes a data-driven framework for obtaining abstract models of complex systems from multiple sensor observations to enable condition monitoring, diagnostics and supervisory control. The algorithms are formulated in the setting of