Learning Statistical Models from Relational Data, 2001
Cited by 33 (6 self)
This workshop is the second in a series of workshops held in conjunction with AAAI and IJCAI. The first workshop was held in July, 2000 at AAAI. Notes from that workshop are available at
Distribution-based aggregation for relational learning with identifier attributes
- Machine Learning, 2004
Cited by 22 (10 self)
Feature construction through aggregation plays an essential role in modeling relational domains with one-to-many relationships between tables. One-to-many relationships lead to bags (multisets) of related entities, from which predictive information must be captured. This paper focuses on aggregation from categorical attributes that can take many values (e.g., object identifiers). We present a novel aggregation method, part of the relational learning system ACORA, that combines the use of vector distance with meta-data about the class-conditional distributions of attribute values. We provide a theoretical foundation for this approach by deriving a “relational fixed-effect” model within a Bayesian framework, and discuss the implications of identifier aggregation for the expressive power of the induced model. One advantage of using identifier attributes is that they circumvent limitations caused either by missing/unobserved object properties or by independence assumptions. Finally, we show empirically that the novel aggregators can generalize in the presence of identifier (and other high-dimensional) attributes, and we explore the limits of the methods' applicability.
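To make the idea of distribution-based aggregation concrete, the following is a minimal sketch, not ACORA's actual implementation. It assumes each target entity comes with a bag of identifier values, builds one reference count vector per class from the training bags, and uses cosine similarity between a bag and each reference vector as the aggregated feature. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def class_reference_vectors(bags, labels):
    """Pool the identifier counts of all training bags, separately per class."""
    refs = {}
    for bag, label in zip(bags, labels):
        refs.setdefault(label, Counter()).update(bag)
    return refs


def aggregate(bag, refs):
    """Turn one bag of identifiers into one numeric feature per class."""
    counts = Counter(bag)
    return {label: cosine(counts, ref) for label, ref in refs.items()}


# Hypothetical toy data: each bag holds the identifiers linked to one entity.
train_bags = [["a", "b"], ["a", "a", "c"], ["x", "y"], ["y", "z"]]
train_labels = [1, 1, 0, 0]

refs = class_reference_vectors(train_bags, train_labels)
features = aggregate(["a", "c"], refs)
print(features)
```

The resulting per-class similarities can be appended to an entity's ordinary attributes and passed to any propositional learner, which is the general role aggregation features play in this setting.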
Cluster-based Concept Invention for Statistical Relational Learning
- Proceedings of the 10th SIGKDD, 2004
Cited by 19 (4 self)
We use clustering to derive new relations which augment database schema used in automatic generation of predictive features in statistical relational learning. Entities derived from clusters increase the expressivity of feature spaces by creating new first-class concepts which contribute to the creation of new features. For example, in CiteSeer, papers can be clustered based on words or citations giving "topics", and authors can be clustered based on documents they coauthor giving "communities". Such cluster-derived concepts become part of more complex feature expressions. Out of the large number of generated features, those which improve predictive accuracy are kept in the model, as decided by statistical feature selection criteria. We present results demonstrating improved accuracy on two tasks, venue prediction and link prediction, using CiteSeer data.
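As a rough illustration of a cluster-derived concept, the sketch below groups authors into "communities" and exposes community membership as a new relation usable in a link-prediction feature. It is not the paper's method: connected components of the co-authorship graph (via union-find) stand in for a real clustering algorithm, and the paper and author names are made up.

```python
def communities(papers):
    """Map each author to a community id using connected components.

    `papers` maps a paper id to its list of authors. Authors who share a
    paper, directly or through a chain of co-authors, get the same id.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for authors in papers.values():
        root = find(authors[0])
        for other in authors[1:]:
            parent[find(other)] = root
    return {author: find(author) for author in list(parent)}


def same_community(a, b, comm):
    """Derived binary feature for a candidate link between two authors."""
    return comm[a] == comm[b]


# Hypothetical toy data.
papers = {
    "p1": ["ann", "bob"],
    "p2": ["bob", "cat"],
    "p3": ["dan", "eve"],
}
comm = communities(papers)
print(same_community("ann", "cat", comm))
print(same_community("ann", "dan", comm))
```

In a full system, a feature like `same_community` would be one candidate among many, kept only if a statistical feature selection criterion finds that it improves the model.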
Nearest Prototype Classification for Relational Learning
Cited by 2 (2 self)
Instance-based methods for classification store the complete training dataset. When a query is received, it is compared with all the instances in the dataset, and the answer is a function of the labels of the most similar instances. In contrast, Nearest Prototype Classification (NPC) obtains, at training time, a reduced set of prototypes that generalize the complete dataset, reducing the time and memory demands of the lazy approaches. This paper presents an algorithm for NPC with relational data. The method is based on a successful approach for NPC with propositional data and on existing relational distance measures. Empirical results show the utility of the approach, both in classification accuracy and in the resources (time and memory) used.
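The contrast between lazy instance-based learning and nearest prototype classification can be sketched as follows. This is an illustrative toy, not the paper's algorithm: each example is a set of ground facts, Jaccard distance stands in for a proper relational distance measure, and one medoid per class serves as the prototype. All names and data are hypothetical.

```python
def jaccard_distance(a, b):
    """Distance between two sets of facts; 0.0 means identical sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def medoid_prototypes(instances, labels):
    """Pick, per class, the instance with the smallest total distance
    to the other instances of that class."""
    by_class = {}
    for inst, label in zip(instances, labels):
        by_class.setdefault(label, []).append(inst)
    return {
        label: min(
            members,
            key=lambda m: sum(jaccard_distance(m, o) for o in members),
        )
        for label, members in by_class.items()
    }


def classify(query, prototypes):
    """Return the label of the nearest prototype."""
    return min(prototypes, key=lambda lab: jaccard_distance(query, prototypes[lab]))


# Hypothetical training data: five examples reduced to two prototypes.
instances = [
    {"bond(c,o)", "ring"},
    {"bond(c,o)", "ring", "bond(c,n)"},
    {"ring", "bond(c,n)"},
    {"chain", "bond(c,h)"},
    {"chain"},
]
labels = ["active", "active", "active", "inactive", "inactive"]

prototypes = medoid_prototypes(instances, labels)
print(classify({"ring", "bond(c,o)"}, prototypes))
```

At query time only the prototypes are compared against the query, so the cost depends on the number of prototypes rather than on the size of the training set.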
Prototypes based relational learning
- In The 13th International Conference on Artificial Intelligence: Methodology, Systems, Applications, 2008
Cited by 1 (1 self)
Relational instance-based learning (RIBL) algorithms offer high prediction capabilities. However, they do not scale up well, especially in domains where there is a time bound for classification. Nearest prototype approaches can alleviate this problem by summarizing the data set in a reduced set of prototypes. In this paper we present an algorithm to build Relational Nearest Prototype Classifiers (rnpc). Compared with RIBL approaches, the algorithm dramatically reduces the number of instances by selecting the most relevant prototypes, while maintaining similar accuracy. The number of prototypes is determined automatically by the algorithm, although it can also be bounded by the user. Empirical results on benchmark data sets demonstrate the utility of this approach compared to other instance-based approaches.
Learning with Kernels and Logical Representations
Cited by 1 (0 self)
In this chapter, we describe a view of statistical learning in the inductive logic programming setting based on kernel methods. The relational representation of data and background knowledge is used to form a kernel function, enabling us to subsequently apply a number of kernel-based statistical learning algorithms. Different representational frameworks and associated algorithms are explored in this chapter. In kernels on Prolog proof trees, the representation of an example is obtained by recording the execution trace of a program expressing background knowledge. In declarative kernels, features are directly associated with mereotopological relations. Finally, in kFOIL, features correspond to the truth values of clauses dynamically generated by a greedy search algorithm guided by the empirical risk.
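The kFOIL idea that features are the truth values of clauses can be illustrated with a small sketch. This is only an assumption-laden toy: the clauses are hand-written Python predicates over sets of ground facts rather than clauses found by greedy search, and the kernel is a plain dot product of the resulting 0/1 feature vectors. All names are hypothetical.

```python
def clause_features(example, clauses):
    """Evaluate every clause on the example; 1 if it holds, else 0."""
    return [1 if clause(example) else 0 for clause in clauses]


def clause_kernel(x, y, clauses):
    """Dot product of the two truth-value vectors, i.e. the number of
    clauses that both examples satisfy."""
    fx = clause_features(x, clauses)
    fy = clause_features(y, clauses)
    return sum(a * b for a, b in zip(fx, fy))


# Hypothetical clauses standing in for the output of a clause search.
clauses = [
    lambda e: "ring" in e,
    lambda e: "bond(c,o)" in e,
    lambda e: len(e) >= 3,
]

x = {"ring", "bond(c,o)"}
y = {"ring", "bond(c,n)", "bond(c,h)"}

print(clause_kernel(x, y, clauses))
print(clause_kernel(x, x, clauses))
```

Because the kernel is an inner product of explicit feature vectors, it is positive semidefinite by construction and can be handed to a standard kernel method such as a support vector machine.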
Pattern-Based Transformation Approach to Relational Domain Learning Using Dynamic Aggregation for Relational Attributes
- Conference on Data Mining, DMIN'06
Due to the widespread use of relational databases (MySQL, Oracle, DB2, MS SQL), most data are stored as multiple tables in what can be a very large database. As a result, more efficient algorithms for mining data from multi-relational domains need to be implemented. Inductive Logic Programming (ILP) techniques are useful for analyzing data in multi-relational databases. Unfortunately, even though not complex in structure, such business data are often large and contain highly non-determinate components, making them difficult for ILP learners geared towards structurally complex tasks. In this paper, we build a novel transformation-based approach to relational domain learning and describe the transformation process, which is implemented through relational aggregation based on pattern distance. We present the prototype of “Dynamic Aggregation of Relational Attributes” (henceforth DARA), which maps a one-to-many relationship into a one-to-one relationship, while preventing loss of information, when handling classification tasks in relational domains. Experiments in a multi-relational domain show a higher percentage of correctly classified instances, and we illustrate the set of rules extracted using our approach.
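The general step of mapping a one-to-many relationship into a one-to-one relationship can be sketched as below. This is not DARA, which aggregates based on pattern distance; it is a deliberately simple stand-in that summarizes each parent's bag of child values with a count, the most frequent value, and the number of distinct values. The table layout and column names are hypothetical.

```python
from collections import Counter, defaultdict


def flatten_one_to_many(child_rows, key, attribute):
    """Collapse many child rows per parent key into one summary row."""
    bags = defaultdict(list)
    for row in child_rows:
        bags[row[key]].append(row[attribute])
    return {
        parent: {
            "count": len(values),
            "most_common": Counter(values).most_common(1)[0][0],
            "distinct": len(set(values)),
        }
        for parent, values in bags.items()
    }


# Hypothetical child table: several orders per customer.
orders = [
    {"customer": 1, "product": "tea"},
    {"customer": 1, "product": "tea"},
    {"customer": 1, "product": "milk"},
    {"customer": 2, "product": "milk"},
]

summary = flatten_one_to_many(orders, key="customer", attribute="product")
print(summary)
```

Each customer now has exactly one row, so the summary can be joined to the target table and used by a propositional classifier. Simple summaries like these do discard information about the bag, which is the kind of loss the abstract says DARA aims to prevent.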

