Results 1  10
of
69
Representing and querying correlated tuples in probabilistic databases
 In ICDE
, 2007
"... Probabilistic databases have received considerable attention recently due to the need for storing uncertain data produced by many real world applications. The widespread use of probabilistic databases is hampered by two limitations: (1) current probabilistic databases make simplistic assumptions abo ..."
Abstract

Cited by 117 (11 self)
 Add to MetaCart
Probabilistic databases have received considerable attention recently due to the need for storing uncertain data produced by many real world applications. The widespread use of probabilistic databases is hampered by two limitations: (1) current probabilistic databases make simplistic assumptions about the data (e.g., complete independence among tuples) that make it difficult to use them in applications that naturally produce correlated data, and (2) most probabilistic databases can only answer a restricted subset of the queries that can be expressed using traditional query languages. We address both these limitations by proposing a framework that can represent not only probabilistic tuples, but also correlations that may be present among them. Our proposed framework naturally lends itself to the possible world semantics thus preserving the precise query semantics extant in current probabilistic databases. We develop an efficient strategy for query evaluation over such probabilistic databases by casting the query processing problem as an inference problem in an appropriately constructed probabilistic graphical model. We present several optimizations specific to probabilistic databases that enable efficient query evaluation. We validate our approach by presenting an experimental evaluation that illustrates the effectiveness of our techniques at answering various queries using real and synthetic datasets. 1
Collective classification in network data
, 2008
"... Numerous realworld applications produce networked data such as web data (hypertext documents connected via hyperlinks) and communication networks (people connected via communication links). A recent focus in machine learning research has been to extend traditional machine learning classification te ..."
Abstract

Cited by 98 (27 self)
 Add to MetaCart
Numerous realworld applications produce networked data such as web data (hypertext documents connected via hyperlinks) and communication networks (people connected via communication links). A recent focus in machine learning research has been to extend traditional machine learning classification techniques to classify nodes in such data. In this report, we attempt to provide a brief introduction to this area of research and how it has progressed during the past decade. We introduce four of the most widely used inference algorithms for classifying networked data and empirically compare them on both synthetic and realworld data. 1
Lifted firstorder probabilistic inference
 In Proceedings of IJCAI05, 19th International Joint Conference on Artificial Intelligence
, 2005
"... Most probabilistic inference algorithms are specified and processed on a propositional level. In the last decade, many proposals for algorithms accepting firstorder specifications have been presented, but in the inference stage they still operate on a mostly propositional representation level. [Poo ..."
Abstract

Cited by 88 (7 self)
 Add to MetaCart
Most probabilistic inference algorithms are specified and processed on a propositional level. In the last decade, many proposals for algorithms accepting firstorder specifications have been presented, but in the inference stage they still operate on a mostly propositional representation level. [Poole, 2003] presented a method to perform inference directly on the firstorder level, but this method is limited to special cases. In this paper we present the first exact inference algorithm that operates directly on a firstorder level, and that can be applied to any firstorder model (specified in a language that generalizes undirected graphical models). Our experiments show superior performance in comparison with propositional exact inference. 1
Hybrid Bayesian Networks for Reasoning about Complex Systems
, 2002
"... Many realworld systems are naturally modeled as hybrid stochastic processes, i.e., stochastic processes that contain both discrete and continuous variables. Examples include speech recognition, target tracking, and monitoring of physical systems. The task is usually to perform probabilistic inferen ..."
Abstract

Cited by 48 (0 self)
 Add to MetaCart
Many realworld systems are naturally modeled as hybrid stochastic processes, i.e., stochastic processes that contain both discrete and continuous variables. Examples include speech recognition, target tracking, and monitoring of physical systems. The task is usually to perform probabilistic inference, i.e., infer the hidden state of the system given some noisy observations. For example, we can ask what is the probability that a certain word was pronounced given the readings of our microphone, what is the probability that a submarine is trying to surface given our sonar data, and what is the probability of a valve being open given our pressure and flow readings. Bayesian networks are
A Survey of Algorithms for RealTime Bayesian Network Inference
 In In the joint AAAI02/KDD02/UAI02 workshop on RealTime Decision Support and Diagnosis Systems
, 2002
"... As Bayesian networks are applied to more complex and realistic realworld applications, the development of more efficient inference algorithms working under realtime constraints is becoming more and more important. This paper presents a survey of various exact and approximate Bayesian network ..."
Abstract

Cited by 32 (2 self)
 Add to MetaCart
As Bayesian networks are applied to more complex and realistic realworld applications, the development of more efficient inference algorithms working under realtime constraints is becoming more and more important. This paper presents a survey of various exact and approximate Bayesian network inference algorithms. In particular, previous research on realtime inference is reviewed. It provides a framework for understanding these algorithms and the relationships between them. Some important issues in realtime Bayesian networks inference are also discussed.
Probabilistic Inference in Influence Diagrams
 Computational Intelligence
, 1998
"... This paper is about reducing influence diagram (ID) evaluation into Bayesian network (BN) inference problems. Such reduction is interesting because it enables one to readily use one's favorite BN inference algorithm to efficiently evaluate IDs. Two such reduction methods have been proposed pre ..."
Abstract

Cited by 32 (0 self)
 Add to MetaCart
This paper is about reducing influence diagram (ID) evaluation into Bayesian network (BN) inference problems. Such reduction is interesting because it enables one to readily use one's favorite BN inference algorithm to efficiently evaluate IDs. Two such reduction methods have been proposed previously (Cooper 1988, Shachter and Peot 1992). This paper proposes a new method. The BN inference problems induced by the mew method are much easier to solve than those induced by the two previous methods.
Exploiting Shared Correlations in Probabilistic Databases
, 2008
"... There has been a recent surge in work in probabilistic databases, propelled in large part by the huge increase in noisy data sources — from sensor data, experimental data, data from uncurated sources, and many others. There is a growing need for database management systems that can efficiently repre ..."
Abstract

Cited by 26 (6 self)
 Add to MetaCart
There has been a recent surge in work in probabilistic databases, propelled in large part by the huge increase in noisy data sources — from sensor data, experimental data, data from uncurated sources, and many others. There is a growing need for database management systems that can efficiently represent and query such data. In this work, we show how data characteristics can be leveraged to make the query evaluation process more efficient. In particular, we exploit what we refer to as shared correlations where the same uncertainties and correlations occur repeatedly in the data. Shared correlations occur mainly due to two reasons: (1) Uncertainty and correlations usually come from general statistics and rarely vary on a tupletotuple basis; (2) The query evaluation procedure itself tends to reintroduce the same correlations. Prior work has shown that the query evaluation problem on probabilistic databases is equivalent to a probabilistic inference problem on an appropriately constructed probabilistic graphical model (PGM). We leverage this by introducing a new data structure, called the random variable elimination graph (rvelim graph) that can be built from the PGM obtained from query evaluation. We develop techniques based on bisimulation that can be used to compress the rvelim graph exploiting the presence of shared correlations in the PGM, the compressed rvelim graph can then be used to run inference. We validate our methods by evaluating them empirically and show that even with a few shared correlations significant speedups are possible.
New advances in inference by recursive conditioning
 In Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence
"... Recursive Conditioning (RC) was introduced recently as an any–space algorithm for inference in Bayesian networks which can trade time for space by varying the size of its cache at the increment needed to store a floating point number. Under full caching, RC has an asymptotic time and space complexit ..."
Abstract

Cited by 21 (1 self)
 Add to MetaCart
Recursive Conditioning (RC) was introduced recently as an any–space algorithm for inference in Bayesian networks which can trade time for space by varying the size of its cache at the increment needed to store a floating point number. Under full caching, RC has an asymptotic time and space complexity which is comparable to mainstream algorithms based on variable elimination and clustering (exponential in the network treewidth and linear in its size). We show two main results about RC in this paper. First, we show that its actual space requirements under full caching are much more modest than those needed by mainstream methods and study the implications of this finding. Second, we show that RC can effectively deal with determinism in Bayesian networks by employing standard logical techniques, such as unit resolution, allowing a significant reduction in its time requirements in certain cases. We illustrate our results using a number of benchmark networks, including the very challenging ones that arise in genetic linkage analysis. 1
Bisimulationbased approximate lifted inference
"... There has been a great deal of recent interest in methods for performing lifted inference; however, most of this work assumes that the firstorder model is given as input to the system. Here, we describe lifted inference algorithms that determine symmetries and automatically lift the probabilistic m ..."
Abstract

Cited by 21 (2 self)
 Add to MetaCart
There has been a great deal of recent interest in methods for performing lifted inference; however, most of this work assumes that the firstorder model is given as input to the system. Here, we describe lifted inference algorithms that determine symmetries and automatically lift the probabilistic model to speedup inference. In particular, we describe approximate lifted inference techniques that allow the user to trade off inference accuracy for computational efficiency by using a handful of tunable parameters, while keeping the error bounded. Our algorithms are closely related to the graphtheoretic concept of bisimulation. We report experiments on both synthetic and real data to show that in the presence of symmetries, runtimes for inference can be improved significantly, with approximate lifted inference providing orders of magnitude speedup over ground inference.
Practical solution techniques for firstorder mdps
 Artificial Intelligence
"... Many traditional solution approaches to relationally specified decisiontheoretic planning problems (e.g., those stated in the probabilistic planning domain description language, or PPDDL) ground the specification with respect to a specific instantiation of domain objects and apply a solution approa ..."
Abstract

Cited by 18 (1 self)
 Add to MetaCart
Many traditional solution approaches to relationally specified decisiontheoretic planning problems (e.g., those stated in the probabilistic planning domain description language, or PPDDL) ground the specification with respect to a specific instantiation of domain objects and apply a solution approach directly to the resulting ground Markov decision process (MDP). Unfortunately, the space and time complexity of these grounded solution approaches are polynomial in the number of domain objects and exponential in the predicate arity and the number of nested quantifiers in the relational problem specification. An alternative to grounding a relational planning problem is to tackle the problem directly at the relational level. In this article, we propose one such approach that translates an expressive subset of the PPDDL representation to a firstorder MDP (FOMDP) specification and then derives a domainindependent policy without grounding at any intermediate step. However, such generality does not come without its own set of challenges—the purpose of this article is to explore practical solution techniques for solving FOMDPs. To demonstrate the applicability of our techniques, we present proofofconcept results of our firstorder approximate linear programming (FOALP) planner on problems from the probabilistic track