Results 1 - 10
of
24
Factorie: Probabilistic programming via imperatively defined factor graphs
- In Advances in Neural Information Processing Systems 22
, 2009
"... Discriminatively trained undirected graphical models have had wide empirical success, and there has been increasing interest in toolkits that ease their application to complex relational data. The power in relational models is in their repeated structure and tied parameters; at issue is how to defin ..."
Abstract
-
Cited by 15 (2 self)
- Add to MetaCart
Discriminatively trained undirected graphical models have had wide empirical success, and there has been increasing interest in toolkits that ease their application to complex relational data. The power in relational models is in their repeated structure and tied parameters; at issue is how to define these structures in a powerful and flexible way. Rather than using a declarative language, such as SQL or first-order logic, we advocate using an imperative language to express various aspects of model structure, inference, and learning. By combining the traditional, declarative, statistical semantics of factor graphs with imperative definitions of their construction and operation, we allow the user to mix declarative and procedural domain knowledge, and also gain significant efficiencies. We have implemented such imperatively defined factor graphs in a system we call FACTORIE, a software library for an object-oriented, strongly-typed, functional language. In experimental comparisons to Markov Logic Networks on joint segmentation and coreference, we find our approach to be 3-15 times faster while reducing error by 20-25%—achieving a new state of the art. 1
Embedded probabilistic programming
- In Working conf. on domain specific lang
, 2009
"... Abstract. Two general techniques for implementing a domain-specific language (DSL) with less overhead are the finally-tagless embedding of object programs and the direct-style representation of side effects. We use these techniques to build a DSL for probabilistic programming, for expressing countab ..."
Abstract
-
Cited by 12 (3 self)
- Add to MetaCart
Abstract. Two general techniques for implementing a domain-specific language (DSL) with less overhead are the finally-tagless embedding of object programs and the direct-style representation of side effects. We use these techniques to build a DSL for probabilistic programming, for expressing countable probabilistic models and performing exact inference and importance sampling on them. Our language is embedded as an ordinary OCaml library and represents probability distributions as ordinary OCaml programs. We use delimited continuations to reify probabilistic programs as lazy search trees, which inference algorithms may traverse without imposing any interpretive overhead on deterministic parts of a model. We thus take advantage of the existing OCaml implementation to achieve competitive performance and ease of use. Inference algorithms can easily be embedded in probabilistic programs themselves.
Purely Functional Lazy Non-deterministic Programming
"... Functional logic programming and probabilistic programming have demonstrated the broad benefits of combining laziness (non-strict evaluation with sharing of the results) with non-determinism. Yet these benefits are seldom enjoyed in functional programming, because the existing features for non-stric ..."
Abstract
-
Cited by 8 (2 self)
- Add to MetaCart
Functional logic programming and probabilistic programming have demonstrated the broad benefits of combining laziness (non-strict evaluation with sharing of the results) with non-determinism. Yet these benefits are seldom enjoyed in functional programming, because the existing features for non-strictness, sharing, and nondeterminism in functional languages are tricky to combine. We present a practical way to write purely functional lazy non-deterministic programs that are efficient and perspicuous. We achieve this goal by embedding the programs into existing languages (such as Haskell, SML, and OCaml) with high-quality implementations, by making choices lazily and representing data with non-deterministic components, by working with custom monadic data types and search strategies, and by providing equational laws for the programmer to reason about their code.
A Stochastic Memoizer for Sequence Data
"... We propose an unbounded-depth, hierarchical, Bayesian nonparametric model for discrete sequence data. This model can be estimated from a single training sequence, yet shares statistical strength between subsequent symbol predictive distributions in such a way that predictive performance generalizes ..."
Abstract
-
Cited by 7 (4 self)
- Add to MetaCart
We propose an unbounded-depth, hierarchical, Bayesian nonparametric model for discrete sequence data. This model can be estimated from a single training sequence, yet shares statistical strength between subsequent symbol predictive distributions in such a way that predictive performance generalizes well. The model builds on a specific parameterization of an unbounded-depth hierarchical Pitman-Yor process. We introduce analytic marginalization steps (using coagulation operators) to reduce this model to one that can be represented in time and space linear in the length of the training sequence. We show how to perform inference in such a model without truncation approximation and introduce fragmentation operators necessary to do predictive inference. We demonstrate the sequence memoizer by using it as a language model, achieving state-of-the-art results. 1.
Samplerank: Learning preference from atomic gradients
- In NIPS WS on Advances in Ranking
, 2009
"... Large templated factor graphs with complex structure that changes during inference have been shown to provide state-of-the-art experimental results on tasks such as identity uncertainty and information integration. However, learning parameters in these models is difficult because computing the gradi ..."
Abstract
-
Cited by 5 (3 self)
- Add to MetaCart
Large templated factor graphs with complex structure that changes during inference have been shown to provide state-of-the-art experimental results on tasks such as identity uncertainty and information integration. However, learning parameters in these models is difficult because computing the gradients require expensive inference routines. In this paper we propose an online algorithm that instead learns preferences over hypotheses from the gradients between the atomic steps of inference. Although there are a combinatorial number of ranking constraints over the entire hypothesis space, a connection to the frameworks of sampled convex programs reveals a polynomial bound on the number of rankings that need to be satisfied in practice. We further apply ideas of passive aggressive algorithms to our update rules, enabling us to extend recent work in confidenceweighted classification to structured prediction problems. We compare our algorithm to structured perceptron, contrastive divergence, and persistent contrastive divergence, demonstrating substantial error reductions on two real-world problems (20 % over contrastive divergence).
Productivity and Reuse in Language
, 2011
"... We present a Bayesian model of the mirror image problems of linguistic productivity and reuse. The model, known as Fragment Grammar, is evaluated against several morphological datasets; its performance is compared to competing theoretical accounts including full–parsing, full–listing, and exemplar–b ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
We present a Bayesian model of the mirror image problems of linguistic productivity and reuse. The model, known as Fragment Grammar, is evaluated against several morphological datasets; its performance is compared to competing theoretical accounts including full–parsing, full–listing, and exemplar–based models. The model is able to learn the correct patterns of productivity and reuse for two very different systems: the English past tense which is characterized by a sharp dichotomy in productivity between regular and irregular forms and English derivational morphology which is characterized by a graded cline from very productive (-ness) to very unproductive (-th). Keywords:Productivity;Reuse;Storage;Computation; Bayesian Model;Past Tense;Derivational Morphology
Probabilistic Modelling, Inference and Learning using Logical Theories
"... This paper provides a study of probabilistic modelling, inference and learning in a logic-based setting. We show how probability densities, being functions, can be represented and reasoned with naturally and directly in higher-order logic, an expressive formalism not unlike the (informal) everyday l ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
This paper provides a study of probabilistic modelling, inference and learning in a logic-based setting. We show how probability densities, being functions, can be represented and reasoned with naturally and directly in higher-order logic, an expressive formalism not unlike the (informal) everyday language of mathematics. We give efficient inference algorithms and illustrate the general approach with a diverse collection of applications. Some learning issues are also considered.
Monolingual Probabilistic Programming Using Generalized Coroutines
"... Probabilistic programming languages and modeling toolkits are two modular ways to build and reuse stochastic models and inference procedures. Combining strengths of both, we express models and inference as generalized coroutines in the same general-purpose language. We use existing facilities of the ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
Probabilistic programming languages and modeling toolkits are two modular ways to build and reuse stochastic models and inference procedures. Combining strengths of both, we express models and inference as generalized coroutines in the same general-purpose language. We use existing facilities of the language, such as rich libraries, optimizing compilers, and types, to develop concise, declarative, and realistic models with competitive performance on exact and approximate inference. In particular, a wide range of models can be expressed using memoization. Because deterministic parts of models run at full speed, custom inference procedures are trivial to incorporate, and inference procedures can reason about themselves without interpretive overhead. Within this framework, we introduce a new, general algorithm for importance sampling with look-ahead. 1
Gibbs Sampling in Open-Universe Stochastic Languages
"... Languages for open-universe probabilistic models (OUPMs) can represent situations with an unknown number of objects and identity uncertainty. While such cases arise in a wide range of important real-world applications, existing general purpose inference methods for OUPMs are far less efficient than ..."
Abstract
-
Cited by 2 (2 self)
- Add to MetaCart
Languages for open-universe probabilistic models (OUPMs) can represent situations with an unknown number of objects and identity uncertainty. While such cases arise in a wide range of important real-world applications, existing general purpose inference methods for OUPMs are far less efficient than those available for more restricted languages and model classes. This paper goes some way to remedying this deficit by introducing, and proving correct, a generalization of Gibbs sampling to partial worlds with possibly varying model structure. Our approach draws on and extends previous generic OUPM inference methods, as well as auxiliary variable samplers for nonparametric mixture models. It has been implemented for BLOG, a well-known OUPM language. Combined with compile-time optimizations, the resulting algorithm yields very substantial speedups over existing methods on several test cases, and substantially improves the practicality of OUPM languages generally. 1

