Results 1  10
of
85
A Survey of Kernels for Structured Data
"... Kernel methods in general and support vector machines in particular have been successful in various learning tasks on data represented in a single table. Much 'realworld ' data, however, is structured it has no natural representation in a single table. Usually, to apply kernel methods to 'realworl ..."
Abstract

Cited by 113 (3 self)
 Add to MetaCart
Kernel methods in general and support vector machines in particular have been successful in various learning tasks on data represented in a single table. Much 'realworld ' data, however, is structured it has no natural representation in a single table. Usually, to apply kernel methods to 'realworld' data, extensive preprocessing is performed toembed the data into areal vector space and thus in a single table. This survey describes several approaches ofdefining positive definite kernels on structured instances directly.
Linkage and autocorrelation cause feature selection bias in relational learning
 In Proc. of the 19th Intl Conference on Machine Learning
, 2002
"... Two common characteristics of relational data sets — concentrated linkage and relational autocorrelation — can cause learning algorithms to be strongly biased toward certain features, irrespective of their predictive power. We identify these characteristics, define quantitative measures of their sev ..."
Abstract

Cited by 95 (32 self)
 Add to MetaCart
Two common characteristics of relational data sets — concentrated linkage and relational autocorrelation — can cause learning algorithms to be strongly biased toward certain features, irrespective of their predictive power. We identify these characteristics, define quantitative measures of their severity, and explain how they produce this bias. We show how linkage and autocorrelation affect a representative algorithm for feature selection by applying the algorithm to synthetic data and to data drawn from the Internet Movie Database. 1.1 Relational Data and Statistical Dependence Figure 1 presents two simple relational data sets. In each
A Simple Relational Classifier
 Proceedings of the Second Workshop on MultiRelational Data Mining (MRDM2003) at KDD2003
, 2003
"... We analyze a Relational Neighbor (RN) classifier, a simple relational predictive model that predicts only based on class labels of related neighbors, using no learning and no inherent attributes. We show that it performs surprisingly well by comparing it to more complex models such as Probabilist ..."
Abstract

Cited by 82 (14 self)
 Add to MetaCart
We analyze a Relational Neighbor (RN) classifier, a simple relational predictive model that predicts only based on class labels of related neighbors, using no learning and no inherent attributes. We show that it performs surprisingly well by comparing it to more complex models such as Probabilistic Relational Models and Relational Probability Trees on three data sets from published work.
Graph mining: Laws, generators, and algorithms
 ACM COMPUTING SURVEYS
, 2006
"... How does the Web look? How could we tell an abnormal social network from a normal one? These and similar questions are important in many fields where the data can intuitively be cast as a graph; examples range from computer networks to sociology to biology and many more. Indeed, any M : N relation i ..."
Abstract

Cited by 70 (6 self)
 Add to MetaCart
How does the Web look? How could we tell an abnormal social network from a normal one? These and similar questions are important in many fields where the data can intuitively be cast as a graph; examples range from computer networks to sociology to biology and many more. Indeed, any M : N relation in database terminology can be represented as a graph. A lot of these questions boil down to the following: "How can we generate synthetic but realistic graphs?" To answer this, we must first understand what patterns are common in realworld graphs and can thus be considered a mark of normality/realism. This survey give an overview of the incredible variety of work that has been done on these problems. One of our main contributions is the integration of points of view from physics, mathematics, sociology, and computer science. Further, we briefly describe recent advances on some related and interesting graph problems.
Logical hidden markov models
 Journal of Artificial Intelligence Research
, 2006
"... Logical hidden Markov models (LOHMMs) upgrade traditional hidden Markov models to deal with sequences of structured symbols in the form of logical atoms, rather than flat characters. This note formally introduces LOHMMs and presents solutions to the three central inference problems for LOHMMs: evalu ..."
Abstract

Cited by 42 (10 self)
 Add to MetaCart
Logical hidden Markov models (LOHMMs) upgrade traditional hidden Markov models to deal with sequences of structured symbols in the form of logical atoms, rather than flat characters. This note formally introduces LOHMMs and presents solutions to the three central inference problems for LOHMMs: evaluation, most likely hidden state sequence and parameter estimation. The resulting representation and algorithms are experimentally evaluated on problems from the domain of bioinformatics. 1.
Statistical Relational Learning for Document Mining
, 2003
"... A major obstacle to fully integrated deployment of statistical learners is the assumption that data sits in a single table, even though most realworld databases have complex relational structures. In this paper, we introduce an integrated approach to building regression models from data stored ..."
Abstract

Cited by 36 (5 self)
 Add to MetaCart
A major obstacle to fully integrated deployment of statistical learners is the assumption that data sits in a single table, even though most realworld databases have complex relational structures. In this paper, we introduce an integrated approach to building regression models from data stored in relational databases. Potential features are generated by structured search of the space of queries to the database, and then tested for inclusion in a logistic regression. We present experimental results for the task of predicting where scientific papers will be published based on relational data taken from CiteSeer. This data includes word counts in the document, frequently cited authors or papers, cocitations, publication venues of cited papers, word cooccurrences, and word counts in cited or citing documents. Our approach results in classification accuracies superior to those achieved when using classical "flat" features. Our classification task also serves as a "where to publish?" conference/journal recommendation task.
Probabilistic Logic Learning
 ACMSIGKDD Explorations: Special issue on MultiRelational Data Mining
, 2004
"... The past few years have witnessed an significant interest in probabilistic logic learning, i.e. in research lying at the intersection of probabilistic reasoning, logical representations, and machine learning. A rich variety of di#erent formalisms and learning techniques have been developed. This pap ..."
Abstract

Cited by 34 (8 self)
 Add to MetaCart
The past few years have witnessed an significant interest in probabilistic logic learning, i.e. in research lying at the intersection of probabilistic reasoning, logical representations, and machine learning. A rich variety of di#erent formalisms and learning techniques have been developed. This paper provides an introductory survey and overview of the stateof theart in probabilistic logic learning through the identification of a number of important probabilistic, logical and learning concepts.
ExpertGuided Subgroup Discovery: Methodology and Application
 Journal of Artificial Intelligence Research
, 2002
"... This paper presents an approach to expertguided subgroup discovery. The main step of the subgroup discovery process, the induction of subgroup descriptions, is performed by a heuristic beam search algorithm, using a novel parametrized definition of rule quality which is analyzed in detail. The othe ..."
Abstract

Cited by 32 (7 self)
 Add to MetaCart
This paper presents an approach to expertguided subgroup discovery. The main step of the subgroup discovery process, the induction of subgroup descriptions, is performed by a heuristic beam search algorithm, using a novel parametrized definition of rule quality which is analyzed in detail. The other important steps of the proposed subgroup discovery process are the detection of statistically significant properties of selected subgroups and subgroup visualization: statistically significant properties are used to enrich the descriptions of induced subgroups, while the visualization shows subgroup properties in the form of distributions of the numbers of examples in the subgroups. The approach is illustrated by the results obtained for a medical problem of early detection of patient risk groups.
Data Mining in Social Networks
 In National Academy of Sciences Symposium on Dynamic Social Network Modeling and Analysis
, 2002
"... Abstract. Several techniques for learning statistical models have been developed recently by researchers in machine learning and data mining. All of these techniques must address a similar set of representational and algorithmic choices and must face a set of statistical challenges unique to learnin ..."
Abstract

Cited by 30 (1 self)
 Add to MetaCart
Abstract. Several techniques for learning statistical models have been developed recently by researchers in machine learning and data mining. All of these techniques must address a similar set of representational and algorithmic choices and must face a set of statistical challenges unique to learning from relational data.
MRDTL: A multirelational decision tree learning algorithm
 Proceedings of the 13th International Conference on Inductive Logic Programming (ILP 2003
, 2002
"... this paper, we have described an implementation of multirelational decision tree learning (MRDTL) algorithm based on the techniques proposed by Knobbe et el. (Knobbe et el., 1999a, Knobbe et el., 1999b) ..."
Abstract

Cited by 26 (2 self)
 Add to MetaCart
this paper, we have described an implementation of multirelational decision tree learning (MRDTL) algorithm based on the techniques proposed by Knobbe et el. (Knobbe et el., 1999a, Knobbe et el., 1999b)