Learning and Inference in Weighted Logic with Application to Natural Language Processing
, 2008
CarpeDiem: Optimizing the Viterbi Algorithm and Applications to Supervised Sequential Learning
Cited by 1 (0 self)
The growth of information available to learning systems and the increasing complexity of learning tasks create a need for algorithms that scale well with respect to all learning parameters. In the context of supervised sequential learning, the Viterbi algorithm plays a fundamental role by allowing the evaluation of the best (most probable) sequence of labels with a time complexity linear in the number of time events and quadratic in the number of labels. In this paper we propose CarpeDiem, a novel algorithm allowing the evaluation of the best possible sequence of labels with a sub-quadratic time complexity. We provide theoretical grounding together with solid empirical results supporting two chief facts. CarpeDiem always finds the optimal solution, requiring in most cases only a small fraction of the time taken by the Viterbi algorithm; at the same time, CarpeDiem is never asymptotically worse than the Viterbi algorithm, which confirms it as a sound replacement.
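For reference, the complexity of the baseline mentioned in this abstract (linear in the number of time events, quadratic in the number of labels) can be seen in a minimal sketch of the standard Viterbi recursion. This is not CarpeDiem, whose details are not given in the abstract. The function name `viterbi` and the score tables `init`, `trans`, and `emit` are illustrative assumptions, with scores treated as additive (for example, log-probabilities).

```python
from typing import List, Sequence, Tuple


def viterbi(
    init: Sequence[float],
    trans: Sequence[Sequence[float]],
    emit: Sequence[Sequence[float]],
) -> Tuple[List[int], float]:
    """Return the best label sequence and its total additive score.

    init[k]     : score of starting in label k (assumed layout)
    trans[j][k] : score of moving from label j to label k
    emit[t][k]  : score of label k at time event t
    """
    num_labels = len(init)
    # Best score of any path ending in each label at time 0.
    score = [init[k] + emit[0][k] for k in range(num_labels)]
    back: List[List[int]] = []

    # Outer loop: linear in the number of time events.
    for t in range(1, len(emit)):
        new_score, pointers = [], []
        # Two nested passes over labels: quadratic in the number of labels.
        for k in range(num_labels):
            best_j = max(range(num_labels), key=lambda j: score[j] + trans[j][k])
            new_score.append(score[best_j] + trans[best_j][k] + emit[t][k])
            pointers.append(best_j)
        score = new_score
        back.append(pointers)

    # Follow back-pointers from the best final label.
    last = max(range(num_labels), key=lambda k: score[k])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


if __name__ == "__main__":
    # Toy tables with two labels and three time events (made-up values).
    init = [0.0, -2.0]
    trans = [[0.0, -1.0], [-1.0, 0.0]]
    emit = [[0.0, 0.0], [-3.0, 0.0], [-3.0, 0.0]]
    print(viterbi(init, trans, emit))
```

The inner `max` over predecessor labels, evaluated once per label, is the quadratic factor that a sub-quadratic method would need to avoid in the common case.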
Unsupervised Models for Spatial, Temporal and Relational Systems
, 2009
Social processes can be strongly influenced by their spatial and temporal environment, as well as relational structures specific to the process itself. While it has traditionally been expedient to study one or two of these dimensions at a time, it is increasingly feasible to collect the data necessary to investigate how, and in what combinations and proportions, spatial, temporal and relational (STR) factors govern a process. This proposal is concerned with enabling the early stages of such an analysis, in which the researcher has a hypothesis regarding what relationships exist between STR variables, but not the details and relative strengths of these relationships. Can we express this generalized hypothesis, and algorithmically use available data to recommend a more specific one? I adopt probabilistic graphical models (PGMs) as a flexible framework for representing structural hypotheses, and introduce a templating system for generating regular PGM structures appropriate for STR data. In fitting these models to data, I argue against both supervised training and Bayesian unsupervised methods, suggesting a focus on fast, useful inference over (even approximate) optimality. To this end, I introduce Expectation Maximizing belief propagation (EMBP) algorithms, which perform fast unsupervised learning in graphical models with spatial, temporal and relational structure, leading to a variety of
Approved as to style and content by:
, 2008
It is customary for doctoral theses to begin with embarrassingly fulsome thanks to one’s advisor, professors, family, and friends. This thesis will be no exception. I am therefore fortunate that I have had a wonderful network of mentors, family, and friends, so that I can express my thanks without risk of overstatement. I could not have completed this thesis without the help of my advisor Andrew McCallum. Andrew is a dynamo of enthusiasm, encouragement, and interesting suggestions—as anyone who has had him drop by their cubicle can attest. I have been especially influenced by his keen sense of how to find research areas that are theoretically interesting, practically important, and ripe for the attack. I am also grateful to have benefited from the feedback of the other members of my dissertation committee: Sridhar Mahadevan, Erik Learned-Miller, Jonathan Machta, and Tommi Jaakkola. I spent a delightful summer doing an internship at Microsoft Research Cambridge. I thank Tom Minka and Martin Szummer for hosting me, and for our many discussions that helped me to better understand approximate inference and training, and that helped many of the explanations in this thesis to become crisper and more insightful. At UMass, I have also learned much from discussions with Wei Li, Aron Culotta, Greg Druck, Gideon Mann, and Jerod Weinman. I have also had the opportunity to collaborate on several projects outside of my thesis, and for this I thank my
Slice Normalized Dynamic Markov Logic Networks
Markov logic is a widely used tool in statistical relational learning, which uses a weighted first-order logic knowledge base to specify a Markov random field (MRF) or a conditional random field (CRF). In many applications, a Markov logic network (MLN) is trained in one domain, but used in a different one. This paper focuses on dynamic Markov logic networks, where the size of the discretized time-domain typically varies between training and testing. It has been previously pointed out that the marginal probabilities of truth assignments to ground atoms can change if one extends or reduces the domains of predicates in an MLN. We show that in addition to this problem, the standard way of unrolling a Markov logic theory into an MRF may result in time-inhomogeneity of the underlying Markov chain. Furthermore, even if these representational problems are not significant for a given domain, we show that the more practical problem of generating samples in a sequential conditional random field for the next slice relying on the samples from the previous slice has high computational cost in the general case, due to the need to estimate a normalization factor for each sample. We propose a new discriminative model, slice normalized dynamic Markov logic networks (SN-DMLN), that suffers from none of these issues. It supports efficient online inference, and can directly model influences between variables within a time slice that do not have a causal direction, in contrast with fully directed models (e.g., DBNs). Experimental results show an improvement in accuracy over previous approaches to online inference in dynamic Markov logic networks.

