The max-min hill-climbing Bayesian network structure learning algorithm
 Machine Learning
, 2006
Cited by 148 (8 self)
We present a new algorithm for Bayesian network structure learning, called Max-Min Hill-Climbing (MMHC). The algorithm combines ideas from local learning, constraint-based, and search-and-score techniques in a principled and effective way. It first reconstructs the skeleton of a Bayesian network and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges. In our extensive empirical evaluation, MMHC outperforms, on average and in terms of various metrics, several prototypical and state-of-the-art algorithms, namely the PC, Sparse Candidate, Three Phase Dependency Analysis, Optimal Reinsertion, Greedy Equivalence Search, and Greedy Search. These are the first empirical results simultaneously comparing most of the major Bayesian network algorithms against each other. MMHC offers certain theoretical advantages, specifically over the Sparse Candidate algorithm, corroborated by our experiments. MMHC and detailed results of our study are publicly available at
Injecting utility into anonymized datasets
 In SIGMOD
, 2006
Cited by 117 (5 self)
Limiting disclosure in data publishing requires a careful balance between privacy and utility. Information about individuals must not be revealed, but a dataset should still be useful for studying the characteristics of a population. Privacy requirements such as k-anonymity and ℓ-diversity are designed to thwart attacks that attempt to identify individuals in the data and to discover their sensitive information. On the other hand, the utility of such data has not been well studied. In this paper we will discuss the shortcomings of current heuristic approaches to measuring utility and we will introduce a formal approach to measuring utility. Armed with this utility metric, we will show how to inject additional information into k-anonymous and ℓ-diverse tables. This information has an intuitive semantic meaning, it increases the utility beyond what is possible in the original k-anonymity and ℓ-diversity frameworks, and it maintains the privacy guarantees of k-anonymity and ℓ-diversity.
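As a rough illustration of the two privacy requirements named in this abstract, the following sketch checks them on a toy table. It is not the paper's utility-injection method. It assumes the usual textbook definitions: a table is k-anonymous if every combination of quasi-identifier values is shared by at least k rows, and it satisfies the simplest ("distinct") form of ℓ-diversity if every such group contains at least ℓ different sensitive values. The function names, column names, and data are hypothetical.

```python
from collections import defaultdict


def group_by_quasi_identifiers(rows, quasi_identifiers):
    """Group rows that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].append(row)
    return groups


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier group has at least k rows."""
    groups = group_by_quasi_identifiers(rows, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_distinct_l_diverse(rows, quasi_identifiers, sensitive, l):
    """True if every group holds at least l distinct sensitive values."""
    groups = group_by_quasi_identifiers(rows, quasi_identifiers)
    return all(
        len({row[sensitive] for row in group}) >= l
        for group in groups.values()
    )


# Hypothetical, already-generalized table (illustrative data only).
table = [
    {"zip": "130**", "age": "<30", "disease": "flu"},
    {"zip": "130**", "age": "<30", "disease": "cancer"},
    {"zip": "148**", "age": ">=40", "disease": "flu"},
    {"zip": "148**", "age": ">=40", "disease": "flu"},
]

k_ok = is_k_anonymous(table, ["zip", "age"], k=2)
l_ok = is_distinct_l_diverse(table, ["zip", "age"], "disease", l=2)
print(k_ok, l_ok)
```

In this toy table both groups contain two rows, so the table is 2-anonymous, but the second group contains only one disease value, so it is not 2-diverse in the distinct sense.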
Estimating high-dimensional directed acyclic graphs with the PC-algorithm
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2007
Cited by 112 (6 self)
We consider the PC-algorithm (Spirtes et al., 2000) for estimating the skeleton and equivalence class of a very high-dimensional directed acyclic graph (DAG) with corresponding Gaussian distribution. The PC-algorithm is computationally feasible and often very fast for sparse problems with many nodes (variables), and it has the attractive property of automatically achieving high computational efficiency as a function of the sparseness of the true underlying DAG. We prove uniform consistency of the algorithm for very high-dimensional, sparse DAGs where the number of nodes is allowed to grow quickly with sample size n, as fast as O(n^a) for any 0 < a < ∞. The sparseness assumption is rather minimal, requiring only that the neighborhoods in the DAG are of lower order than sample size n. We also demonstrate the PC-algorithm for simulated data.
2003a). Bayesian Epistemology
Cited by 82 (12 self)
Bayesian epistemology addresses epistemological problems with the help of the mathematical theory of probability. It turns out that the probability calculus is especially suited to represent degrees of belief (credences) and to deal with questions of belief change, confirmation, evidence, justification, and coherence.
MEBN: A Language for First-Order Bayesian Knowledge Bases
Cited by 63 (23 self)
Although classical first-order logic is the de facto standard logical foundation for artificial intelligence, the lack of a built-in, semantically grounded capability for reasoning under uncertainty renders it inadequate for many important classes of problems. Probability is the best-understood and most widely applied formalism for computational scientific reasoning under uncertainty. Increasingly expressive languages are emerging for which the fundamental logical basis is probability. This paper presents Multi-Entity Bayesian Networks (MEBN), a first-order language for specifying probabilistic knowledge bases as parameterized fragments of Bayesian networks. MEBN fragments (MFrags) can be instantiated and combined to form arbitrarily complex graphical probability models. An MFrag represents probabilistic relationships among a conceptually meaningful group of uncertain hypotheses. Thus, MEBN facilitates representation of knowledge at a natural level of granularity. The semantics of MEBN assigns a probability distribution over interpretations of an associated classical first-order theory on a finite or countably infinite domain. Bayesian inference provides both a proof theory for combining prior knowledge with observations, and a learning theory for refining a representation as evidence accrues. A proof is given that MEBN can represent a probability distribution on interpretations of any finitely axiomatizable first-order theory.
Beyond covariation: Cues to causal structure
 IN A. GOPNIK & L. SCHULZ (EDS.), CAUSAL LEARNING: PSYCHOLOGY, PHILOSOPHY, AND COMPUTATION
, 2006
Logical Bayesian Networks and their relation to other probabilistic logical models
 In Proceedings of 15th International Conference on Inductive Logic Programming (ILP05), volume 3625 of Lecture Notes in Artificial Intelligence
, 2005
Cited by 31 (9 self)
We review Logical Bayesian Networks, a language for probabilistic logical modelling, and discuss its relation to Probabilistic Relational Models and Bayesian Logic Programs.
1 Probabilistic Logical Models
Probabilistic logical models are models combining aspects of probability theory with aspects of Logic Programming, first-order logic or relational languages. Recently a variety of languages to describe such models has been introduced. For some languages techniques exist to learn such models from data. Two examples are Probabilistic Relational Models (PRMs) [4] and Bayesian Logic Programs (BLPs) [5]. These two languages are probably the most popular and well-known in the Relational Data Mining community. We introduce a new language, Logical Bayesian Networks (LBNs) [2], that is strongly related to PRMs and BLPs yet solves some of their problems with respect to knowledge representation (related to expressiveness and intuitiveness). PRMs, BLPs and LBNs all follow the principle of Knowledge Based Model Construction: they offer a language that can be used to specify general probabilistic logical knowledge and they provide a methodology to construct a propositional model based on this knowledge when given a specific
A scalable method for integration and functional analysis of multiple microarray datasets
 Bioinformatics
, 2006
The probabilistic program dependence graph and its application to fault diagnosis
 IEEE TSE
, 2010
Cited by 25 (3 self)
This paper presents an innovative model of a program’s internal behavior over a set of test inputs, called the probabilistic program dependence graph (PPDG), that facilitates probabilistic analysis and reasoning about uncertain program behavior, particularly that associated with faults. The PPDG is an augmentation of the structural dependences represented by a program dependence graph with estimates of statistical dependences between node states, which are computed from the test set. The PPDG is based on the established framework of probabilistic graphical models, which are widely used in applications such as medical diagnosis. This paper presents algorithms for constructing PPDGs and applying the PPDG to fault diagnosis. This paper also presents preliminary evidence indicating that PPDGs can facilitate fault localization and fault comprehension.
YouTubeCat: Learning to categorize wild web videos
 In Proc. IEEE Conf. Computer Vision and Pattern Recognition
, 2010
Cited by 25 (2 self)
Automatic categorization of videos in a Web-scale unconstrained collection such as YouTube is a challenging task. A key issue is how to build an effective training set in the presence of missing, sparse or noisy labels. We propose to achieve this by first manually creating a small labeled set and then extending it using additional sources such as related videos, searched videos, and text-based webpages. The data from such disparate sources has different properties and labeling quality, and thus fusing them in a coherent fashion is another practical challenge. We propose a fusion framework in which each data source is first combined with the manually-labeled set independently. Then, using the hierarchical taxonomy of the categories, a Conditional Random Field (CRF) based fusion strategy is designed. Based on the final fused classifier, category labels are predicted for the new videos. Extensive experiments on about 80K videos from the 29 most frequent categories in YouTube show the effectiveness of the proposed method for categorizing large-scale wild Web videos.