Results 1 - 10 of 21
Gradient-based boosting for Statistical Relational Learning: The Relational Dependency Network Case
, 2011
Cited by 16 (9 self)
Abstract. Dependency networks approximate a joint probability distribution over multiple random variables as a product of conditional distributions. Relational Dependency Networks (RDNs) are graphical models that extend dependency networks to relational domains. This higher expressivity, however, comes at the expense of a more complex model-selection problem: an unbounded number of relational abstraction levels might need to be explored. Whereas current learning approaches for RDNs learn a single probability tree per random variable, we propose to turn the problem into a series of relational function-approximation problems using gradient-based boosting. In doing so, one can easily induce highly complex features over several iterations and in turn quickly estimate a very expressive model. Our experimental results on several different data sets show that this boosting method results in efficient learning of RDNs when compared to state-of-the-art statistical relational learning approaches.
Online Max-Margin Weight Learning for Markov Logic Networks
Cited by 7 (1 self)
Most of the existing weight-learning algorithms for Markov Logic Networks (MLNs) use batch training, which becomes computationally expensive and even infeasible for very large datasets since the training examples may not fit in main memory. To overcome this problem, previous work has used online learning algorithms to learn weights for MLNs. However, this prior work has only applied existing online algorithms, and there is no comprehensive study of online weight learning for MLNs. In this paper, we derive a new online algorithm for structured prediction using the primal-dual framework, apply it to learn weights for MLNs, and compare against existing online algorithms on three large, real-world datasets. The experimental results show that our new algorithm generally achieves better accuracy than existing methods, especially on noisy datasets.
Online Structure Learning for Markov Logic Networks
Cited by 4 (0 self)
Abstract. Most existing learning methods for Markov Logic Networks (MLNs) use batch training, which becomes computationally expensive and eventually infeasible for large datasets with thousands of training examples, which may not even all fit in main memory. To address this issue, previous work has used online learning to train MLNs. However, these methods all assume that the model’s structure (set of logical clauses) is given, and learn only the model’s parameters. Because the input structure is usually incomplete, it should also be updated. In this work, we present OSL, the first algorithm that performs both online structure and parameter learning for MLNs. Experimental results on two real-world datasets for natural-language field segmentation show that OSL outperforms systems that cannot revise structure.
Learning compact Markov logic networks with decision trees
 Machine Learning
, 2012
Cited by 4 (2 self)
Abstract. Markov Logic Networks (MLNs) are a prominent model class that generalizes both first-order logic and undirected graphical models (Markov networks). The qualitative component of an MLN is a set of clauses and the quantitative component is a set of clause weights. Generative MLNs model the joint distribution of relationships and attributes. A state-of-the-art structure learning method is the moralization approach: learn a 1st-order Bayes net, then convert it to conjunctive MLN clauses. The moralization approach takes advantage of the high-quality inference algorithms for MLNs and their ability to handle cyclic dependencies. A weakness of the moralization approach is that it leads to an unnecessarily large number of clauses. In this paper we show that using decision trees to represent conditional probabilities in the Bayes net is an effective remedy that leads to much more compact MLN structures. The accuracy of predictions is competitive with the unpruned model and in many cases superior.
Learning the structure of probabilistic logic programs
 ILP 2011. LNCS
, 2012
Cited by 4 (3 self)
Abstract. There is a growing interest in the field of Probabilistic Inductive Logic Programming, which uses languages that integrate logic programming and probability. Many of these languages are based on the distribution semantics, and recently various authors have proposed systems for learning the parameters (PRISM, LeProbLog, LFI-ProbLog and EMBLEM) or both the structure and the parameters (SEM-CP-logic) of these languages. EMBLEM, for example, uses an Expectation Maximization approach in which the expectations are computed on Binary Decision Diagrams. In this paper we present the algorithm SLIPCASE for “Structure LearnIng of ProbabilistiC logic progrAmS with Em over bdds”. It performs a beam search in the space of the language of Logic Programs with Annotated Disjunctions (LPAD) using the log likelihood of the data as the guiding heuristic. To estimate the log likelihood of theory refinements it performs a limited number of Expectation Maximization iterations of EMBLEM. SLIPCASE has been tested on three real-world datasets and compared with SEM-CP-logic and Learning using Structural Motifs, an algorithm for Markov Logic Networks. The results show that SLIPCASE achieves higher areas under the precision-recall and ROC curves and is more scalable.
Transforming Graph Data for Statistical Relational Learning
Cited by 3 (2 self)
Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of Statistical Relational Learning (SRL) algorithms to these domains. In this article, we examine and categorize techniques for transforming graph-based relational data to improve SRL algorithms. In particular, appropriate transformations of the nodes, links, and/or features of the data can dramatically affect the capabilities and results of SRL algorithms. We introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. More specifically, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.
Logic-Based Event Recognition
Cited by 3 (2 self)
Today’s organisations require techniques for automated transformation of their large data volumes into operational knowledge. This requirement may be addressed by employing event recognition systems that detect events/activities of special significance within an organisation, given streams of ‘low-level’ information that is very difficult for humans to utilise. Consider, for example, the recognition of attacks on nodes of a computer network given the TCP/IP messages, the recognition of suspicious trader behaviour given the transactions in a financial market, and the recognition of whale songs given a symbolic representation of whale sounds. Various event recognition systems have been proposed in the literature. Recognition systems with a logic-based representation of event structures, in particular, have been attracting considerable attention because, among other things, they exhibit a formal, declarative semantics, they have proven to be efficient and scalable, and they are supported by machine learning tools automating the construction and refinement of event structures. In this paper we review representative approaches to logic-based event recognition and discuss open research issues in this field. We illustrate the reviewed approaches with a real-world case study: event recognition for city transport management.
Learning directed relational models with recursive dependencies
 IN: ILP
Cited by 2 (2 self)
Recently, there has been an increasing interest in generative relational models that represent probabilistic patterns over both links and attributes. A key characteristic of relational data is that the value of a predicate often depends on values of the same predicate for related entities. In this paper we present a new approach to learning directed relational models which utilizes two key concepts: a pseudo-likelihood measure that is well defined for recursive dependencies, and the notion of stratification from logic programming. An issue in modelling recursive dependencies with Bayes nets is that redundant edges increase the complexity of learning. We propose a new normal form for 1st-order Bayes nets that removes the redundancy, and prove that, assuming stratification, the normal form constraints involve no loss of modelling power. We incorporate these constraints in the learn-and-join algorithm of Khosravi et al., which is a state-of-the-art structure learning algorithm that upgrades propositional Bayes net learners for relational data. Empirical evaluation compares our approach to learning recursive dependencies with undirected models (Markov Logic Networks). The Bayes net approach is orders of magnitude faster, and learns more recursive dependencies, which lead to more accurate predictions.
Efficient Relational Learning with Hidden Variable Detection
Cited by 1 (1 self)
Markov networks (MNs) can incorporate arbitrarily complex features in modeling relational data. However, this flexibility comes at the steep price of training an exponentially complex model. To address this challenge, we propose a novel relational learning approach, which consists of a restricted class of relational MNs (RMNs) called relation tree-based RMN (treeRMN), and an efficient Hidden Variable Detection algorithm called Contrastive Variable Induction (CVI). On the one hand, the restricted treeRMN only considers simple (e.g., unary and pairwise) features in relational data and thus achieves computational efficiency; on the other hand, the CVI algorithm efficiently detects hidden variables which can capture long-range dependencies. Therefore, the resultant approach is highly efficient yet does not sacrifice its expressive power. Empirical results on four real datasets show that the proposed relational learning method can achieve prediction quality similar to that of the state-of-the-art approaches, but is significantly more efficient in training; and the induced hidden variables are semantically meaningful and crucial to improving the training speed and prediction quality of treeRMNs.
Online Inference-Rule Learning from Natural-Language Extractions
Cited by 1 (0 self)
In this paper, we consider the problem of learning commonsense knowledge in the form of first-order rules from incomplete and noisy natural-language extractions produced by an off-the-shelf information extraction (IE) system. Much of the information conveyed in text must be inferred from what is explicitly stated, since easily inferable facts are rarely mentioned. The proposed rule learner accounts for this phenomenon by learning rules in which the body of the rule contains relations that are usually explicitly stated, while the head employs a less frequently mentioned relation that is easily inferred. The rule learner processes training examples in an online manner to allow it to scale to large text corpora. Furthermore, we propose a novel approach to weighting rules using a curated lexical ontology such as WordNet. The learned rules along with their parameters are then used to infer implicit information using a Bayesian Logic Program. Experimental evaluation on a machine reading testbed demonstrates the efficacy of the proposed methods.