libDAI: A free/open source C++ library for discrete approximate inference methods
2008
Cited by 19 (1 self)
This paper describes the software package libDAI, a free & open source C++ library that provides implementations of various exact and approximate inference methods for graphical models with discrete-valued variables. libDAI supports directed graphical models (Bayesian networks) as well as undirected ones (Markov random fields and factor graphs). It offers various approximations of the partition sum, marginal probability distributions and maximum probability states. Parameter learning is also supported. A feature comparison with other open source software packages for approximate inference is given. libDAI is licensed under the GPL v2+ license and is available at
Transportability of causal and statistical relations: A formal approach
In Proceedings of the Twenty-Fifth National Conference on Artificial Intelligence. AAAI Press, Menlo Park, CA, 2011
Cited by 7 (5 self)
We address the problem of transferring information learned from experiments to a different environment, in which only passive observations can be collected. We introduce a formal representation called “selection diagrams” for expressing knowledge about differences and commonalities between environments and, using this representation, we derive procedures for deciding whether effects in the target environment can be inferred from experiments conducted elsewhere. When the answer is affirmative, the procedures identify the set of experiments and observations that need be conducted to license the transport. We further discuss how transportability analysis can guide the transfer of knowledge in non-experimental learning to minimize re-measurement cost and improve prediction power.
USHER: Improving Data Quality with Dynamic Forms
Cited by 5 (3 self)
Data quality is a critical problem in modern databases. Data entry forms present the first and arguably best opportunity for detecting and mitigating errors, but there has been little research into automatic methods for improving data quality at entry time. In this paper, we propose USHER, an end-to-end system for form design, entry, and data quality assurance. Using previous form submissions, USHER learns a probabilistic model over the questions of the form. USHER then applies this model at every step of the data entry process to improve data quality. Before entry, it induces a form layout that captures the most important data values of a form instance as quickly as possible. During entry, it dynamically adapts the form to the values being entered, and enables real-time feedback to guide the data enterer toward their intended values. After entry, it re-asks questions that it deems likely to have been entered incorrectly. We evaluate all three components of USHER using two real-world data sets. Our results demonstrate that each component has the potential to improve data quality considerably, at a reduced cost when compared to current practice.
Learning Tree Conditional Random Fields
Cited by 5 (1 self)
We examine maximum spanning tree-based methods for learning the structure of tree Conditional Random Fields (CRFs) P(Y|X). We use edge weights which take advantage of local inputs X and thus scale to large problems. For a general class of edge weights, we give a negative learnability result. However, we demonstrate that two members of the class, local Conditional Mutual Information and Decomposable Conditional Influence, have reasonable theoretical bases and perform very well in practice. On synthetic data and a large-scale fMRI application, our methods outperform existing techniques.
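The maximum spanning tree step mentioned in this abstract can be illustrated with a minimal sketch. It uses Kruskal's algorithm with the heaviest edges taken first; the function name `maximum_spanning_tree` and the edge scores are hypothetical placeholders, not the paper's Conditional Mutual Information or Decomposable Conditional Influence weights, and nothing here is taken from the authors' implementation.

```python
def maximum_spanning_tree(num_nodes, weighted_edges):
    """Kruskal's algorithm, taking the heaviest edges first.

    weighted_edges: iterable of (weight, u, v) with 0 <= u, v < num_nodes.
    Returns the list of (u, v) edges kept in the spanning tree.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(weighted_edges, reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # The edge joins two separate components, so it cannot form a cycle.
            parent[root_u] = root_v
            tree.append((u, v))
    return tree


# Hypothetical pairwise scores between four output variables (assumed values).
edges = [(0.9, 0, 1), (0.2, 0, 2), (0.7, 1, 2), (0.4, 2, 3), (0.1, 1, 3)]
tree = maximum_spanning_tree(4, edges)
```

With these assumed scores the kept edges form a chain over the four variables; any edge-scoring function could be substituted for the hard-coded weights.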
Structured Determinantal Point Processes
Cited by 5 (2 self)
We present a novel probabilistic model for distributions over sets of structures (for example, sets of sequences, trees, or graphs). The critical characteristic of our model is a preference for diversity: sets containing dissimilar structures are more likely. Our model is a marriage of structured probabilistic models, like Markov random fields and context-free grammars, with determinantal point processes, which arise in quantum physics as models of particles with repulsive interactions. We extend the determinantal point process model to handle an exponentially-sized set of particles (structures) via a natural factorization of the model into parts. We show how this factorization leads to tractable algorithms for exact inference, including computing marginals, computing conditional probabilities, and sampling. Our algorithms exploit a novel polynomially-sized dual representation of determinantal point processes, and use message passing over a special semiring to compute relevant quantities. We illustrate the advantages of the model on tracking and articulated pose estimation problems.
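The preference for diversity can be seen in a small sketch of an ordinary (unstructured) determinantal point process, where the unnormalized probability of a set is the determinant of the kernel submatrix indexed by that set. The feature vectors and the helper name `pair_score` are assumptions made for illustration; the structured model and dual representation described in the abstract are not reproduced here.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def pair_score(phi_i, phi_j):
    """Unnormalized DPP score of the two-item set {i, j}.

    This is the determinant of the 2x2 Gram matrix built from the
    two feature vectors: L_ii * L_jj - L_ij ** 2.
    """
    return dot(phi_i, phi_i) * dot(phi_j, phi_j) - dot(phi_i, phi_j) ** 2


# Hypothetical unit-length feature vectors (assumed for this example).
a = (1.0, 0.0)
b = (0.0, 1.0)                       # orthogonal to a: maximally dissimilar
c = (math.cos(0.1), math.sin(0.1))   # nearly parallel to a: very similar

diverse = pair_score(a, b)
similar = pair_score(a, c)
```

The orthogonal pair receives a much larger score than the nearly parallel pair, which is the repulsion effect the abstract refers to.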
Annotating and Searching Web Tables Using Entities, Types and Relationships
Cited by 4 (0 self)
Tables are a universal idiom to present relational data. Billions of tables on Web pages express entity references, attributes and relationships. This representation of relational world knowledge is usually considerably better than completely unstructured, free-format text. At the same time, unlike manually-created knowledge bases, relational information mined from “organic” Web tables need not be constrained by availability of precious editorial time. Unfortunately, in the absence of any formal, uniform schema imposed on Web tables, Web search cannot take advantage of these high-quality sources of relational information. In this paper we propose new machine learning techniques to annotate table cells with entities that they likely mention, table columns with types from which entities are drawn for cells in the column, and relations that pairs of table columns seek to express. We propose a new graphical model for making all these labeling decisions for each table simultaneously, rather than making separate local decisions for entities, types and relations. Experiments using the YAGO catalog, DB-Pedia, tables from Wikipedia, and over 25 million HTML tables from a 500 million page Web crawl uniformly show the superiority of our approach. We also evaluate the impact of better annotations on a prototype relational Web search tool. We demonstrate clear benefits of our annotations beyond indexing tables in a purely textual manner.
Spectral dimensionality reduction via maximum entropy
In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, 2011
Cited by 4 (0 self)
We introduce a new perspective on spectral dimensionality reduction which views these methods as Gaussian random fields (GRFs). Our unifying perspective is based on the maximum entropy principle which is in turn inspired by maximum variance unfolding. The resulting probabilistic models are based on GRFs. The resulting model is a nonlinear generalization of principal component analysis. We show that parameter fitting in the locally linear embedding is approximate maximum likelihood in these models. We directly maximize the likelihood and show results that are competitive with the leading spectral approaches on a robot navigation visualization and a human motion capture data set.
Designing Adaptive Feedback for Improving Data Entry Accuracy
Cited by 4 (3 self)
Data quality is critical for many information-intensive applications. One of the best opportunities to improve data quality is during entry. USHER provides a theoretical, data-driven foundation for improving data quality during entry. Based on prior data, USHER learns a probabilistic model of the dependencies between form questions and values. Using this information, USHER maximizes information gain. By asking the most unpredictable questions first, USHER is better able to predict answers for the remaining questions. In this paper, we use USHER’s predictive ability to design a number of intelligent user interface adaptations that improve data entry accuracy and efficiency. Based on an underlying cognitive model of data entry, we apply these modifications before, during and after committing an answer. We evaluated these mechanisms with professional data entry clerks working with real patient data from six clinics in rural Uganda. The results show that our adaptations have the potential to reduce error (by up to 78%), with limited effect on entry time (varying between -14% and +6%). We believe this approach has wide applicability for improving the quality and availability of data, which is increasingly important for decision-making and resource allocation. ACM Classification: H5.2 [Information interfaces and presentation]:

