Estimators for Stochastic "Unification-Based" Grammars
, 1999
Cited by 125 (18 self)
Log-linear models provide a statistically sound framework for Stochastic "Unification-Based" Grammars (SUBGs) and stochastic versions of other kinds of grammars. We describe two computationally-tractable ways of estimating the parameters of such grammars from a training corpus of syntactic analyses, and apply these to estimate a stochastic version of Lexical-Functional Grammar.
Lifted first-order probabilistic inference
- In Proceedings of IJCAI-05, 19th International Joint Conference on Artificial Intelligence
, 2005
Cited by 56 (6 self)
Most probabilistic inference algorithms are specified and processed on a propositional level. In the last decade, many proposals for algorithms accepting first-order specifications have been presented, but in the inference stage they still operate on a mostly propositional representation level. [Poole, 2003] presented a method to perform inference directly on the first-order level, but this method is limited to special cases. In this paper we present the first exact inference algorithm that operates directly on a first-order level, and that can be applied to any first-order model (specified in a language that generalizes undirected graphical models). Our experiments show superior performance in comparison with propositional exact inference.
Discriminative training of Markov logic networks
- In Proc. of the Natl. Conf. on Artificial Intelligence
, 2005
Cited by 54 (13 self)
Many machine learning applications require a combination of probability and first-order logic. Markov logic networks (MLNs) accomplish this by attaching weights to first-order clauses, and viewing these as templates for features of Markov networks. Model parameters (i.e., clause weights) can be learned by maximizing the likelihood of a relational database, but this can be quite costly and lead to suboptimal results for any given prediction task. In this paper we propose a discriminative approach to training MLNs, one which optimizes the conditional likelihood of the query predicates given the evidence ones, rather than the joint likelihood of all predicates. We extend Collins’s (2002) voted perceptron algorithm for HMMs to MLNs by replacing the Viterbi algorithm with a weighted satisfiability solver. Experiments on entity resolution and link prediction tasks show the advantages of this approach compared to generative MLN training, as well as compared to purely probabilistic and purely logical approaches.
Unifying logical and statistical AI
- Proceedings of the Twenty-First National Conference on Artificial Intelligence
, 2006
Cited by 14 (4 self)
Intelligent agents must be able to handle the complexity and uncertainty of the real world. Logical AI has focused mainly on the former, and statistical AI on the latter. Markov logic combines the two by attaching weights to first-order formulas and viewing them as templates for features of Markov networks. Inference algorithms for Markov logic draw on ideas from satisfiability, Markov chain Monte Carlo and knowledge-based model construction. Learning algorithms are based on the voted perceptron, pseudo-likelihood and inductive logic programming. Markov logic has been successfully applied to problems in entity resolution, link prediction, information extraction and others, and is the basis of the open-source Alchemy system.
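For reference, the idea of "attaching weights to first-order formulas" is usually written as the log-linear distribution below. The symbols $w_i$, $n_i$, and $Z$ are notation assumed here for illustration; they do not appear in the abstract itself.

```latex
% Assumed notation: F_i is the i-th first-order formula, w_i its weight,
% n_i(x) the number of true groundings of F_i in the possible world x,
% and Z the normalizing constant (partition function).
P(X = x) \;=\; \frac{1}{Z} \exp\!\Big( \sum_{i} w_i \, n_i(x) \Big),
\qquad
Z \;=\; \sum_{x'} \exp\!\Big( \sum_{i} w_i \, n_i(x') \Big)
```

Each ground instance of a formula acts as one feature of the resulting Markov network, and all groundings of the same formula share its weight.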
Joint and Conditional Estimation of Tagging and Parsing Models
, 2001
Cited by 12 (0 self)
This paper compares two different ways of estimating statistical language models.
Probability and Statistics in Computational Linguistics, a brief review
- Mathematical foundations of speech and language processing
, 2003
Cited by 11 (0 self)
processes involved in language learning, production, and comprehension. Computational linguists believe that the essence of these processes (in humans and machines) is a computational manipulation of information. Computational psycholinguistics studies psychological aspects of human
Simulation-based Inference for Spatial Point Processes
, 2001
Cited by 7 (0 self)
Spatial point processes play a fundamental role in spatial statistics. In the simplest case they model "small" objects that may be identified by a map of points showing stores, towns, plants, nests, galaxies or cases of a disease observed in a two or three dimensional region. The points may be decorated with marks (such as sizes or types) whereby marked point processes are obtained. The areas of applications are manifold: astronomy, geography, ecology, forestry, spatial epidemiology, image analysis, and many more. Currently, spatial point processes are an active area of research, which probably will be of increasing importance for many new applications, as new technology such as geographical information systems makes huge amounts of spatial point process data available. Textbooks and review articles on different aspects of spatial point processes include Matheron (1975), Ripley (1977), Ripley (1981), Diggle (1983), Penttinen (1984), Daley & Vere-Jones (1988),
Convex structure learning in log-linear models: Beyond pairwise potentials
- In Proceedings of International Workshop on Artificial Intelligence and Statistics
, 2010
Cited by 7 (2 self)
Previous work has examined structure learning in log-linear models with ℓ1-regularization, largely focusing on the case of pairwise potentials. In this work we consider the case of models with potentials of arbitrary order, but that satisfy a hierarchical constraint. We enforce the hierarchical constraint using group ℓ1-regularization with overlapping groups. An active set method that enforces hierarchical inclusion allows us to tractably consider the exponential number of higher-order potentials. We use a spectral projected gradient method as a subroutine for solving the overlapping group ℓ1-regularization problem, and make use of a sparse version of Dykstra's algorithm to compute the projection. Our experiments indicate that this model gives equal or better test set likelihood compared to previous models.
Bayesian inference in hidden Markov random fields for binary data defined on large lattices
, 2005
Cited by 6 (1 self)
this paper is to introduce approximate methods to compute the likelihood for large lattices based on exact likelihood calculations for smaller lattices. We introduce approximate likelihood methods by relaxing some of the dependencies in the latent model, and also by approximating the likelihood by a partially ordered Markov model defined on a collection of sublattices. Results are presented based on simulated data as well as inference for the temporal-spatial structure of the interaction between up- and down-regulated states within the mitochondrial chromosome of the Plasmodium falciparum organism.
Consistency of pseudolikelihood estimation of fully visible Boltzmann machines
- Neural Computation (in press)
, 2006
Cited by 6 (1 self)
Boltzmann machine is a classic model of neural computation, and a number of methods have been proposed for its estimation. Most methods are plagued by either very slow convergence, or asymptotic bias in the resulting estimates. Here we consider estimation in the basic case of fully visible Boltzmann machines. We show that the old principle of pseudolikelihood estimation provides an estimator that is computationally very simple, yet statistically consistent.
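To illustrate why the pseudolikelihood objective is computationally simple, here is a minimal sketch of how it might be evaluated for a fully visible Boltzmann machine. It is not the paper's code. It assumes units take values in {-1, +1}, a symmetric weight matrix with the diagonal ignored, and the model P(x) ∝ exp(½ xᵀMx + bᵀx); the function names `log_sigmoid` and `log_pseudolikelihood` are hypothetical.

```python
import math


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def log_pseudolikelihood(samples, weights, biases):
    """Average log-pseudolikelihood of +/-1 samples.

    Assumed model (not taken from the abstract):
        P(x) proportional to exp(0.5 * x^T M x + b^T x)
    with M symmetric; diagonal entries are ignored. Under this model
        P(x_i | rest) = sigmoid(2 * x_i * (sum_{j != i} M[i][j] * x_j + b_i)),
    so no partition function is ever computed.
    """
    total = 0.0
    for x in samples:
        n = len(x)
        for i in range(n):
            # Local field acting on unit i from all other units plus its bias.
            field = biases[i] + sum(
                weights[i][j] * x[j] for j in range(n) if j != i
            )
            total += log_sigmoid(2.0 * x[i] * field)
    return total / len(samples)
```

Each term depends only on one unit and its neighbours, so the cost is quadratic in the number of units per sample, whereas the exact likelihood requires a sum over all 2^n states. A consistent estimator would maximize this quantity over `weights` and `biases`, for example by gradient ascent.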

