Collective classification in network data
2008
Cited by 45 (17 self)
Numerous real-world applications produce networked data such as web data (hypertext documents connected via hyperlinks) and communication networks (people connected via communication links). A recent focus in machine learning research has been to extend traditional machine learning classification techniques to classify nodes in such data. In this report, we attempt to provide a brief introduction to this area of research and how it has progressed during the past decade. We introduce four of the most widely used inference algorithms for classifying networked data and empirically compare them on both synthetic and real-world data.
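The abstract does not name its four inference algorithms, so the following is only a minimal sketch of the general idea of classifying nodes from their neighbors' labels. It assumes a simplified neighbor-voting scheme (not a learned local classifier), and the function name, graph, and labels are all hypothetical.

```python
from collections import Counter

def iterative_classification(edges, known, max_iters=10):
    """Label unlabeled nodes by repeated majority vote over neighbors' labels.

    Simplified illustration: a real collective classifier would typically
    combine node attributes with neighbor labels in a learned model.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    labels = dict(known)  # observed labels are never overwritten
    for _ in range(max_iters):
        changed = False
        for node in sorted(neighbors):
            if node in known:
                continue
            votes = Counter(labels[n] for n in neighbors[node] if n in labels)
            if not votes:
                continue  # no labeled neighbor yet; retry in a later pass
            # Highest vote count wins; ties are broken by label order.
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if labels.get(node) != best:
                labels[node] = best
                changed = True
        if not changed:
            break  # labels have stabilized
    return labels

# Hypothetical graph: two triangles (a, b, c) and (d, e, f) joined by edge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"), ("d", "e"), ("d", "f"), ("e", "f")]
known = {"a": "x", "b": "x", "e": "y", "f": "y"}
result = iterative_classification(edges, known)
```

In this toy graph, node c ends up with the label of its triangle (x) and node d with the label of the other triangle (y), because each has two labeled neighbors on its own side of the bridge.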
Propositionalization-based relational subgroup discovery with RSD
- Machine Learning, 2006
Cited by 16 (3 self)
Relational rule learning algorithms are typically designed to construct classification and prediction rules. However, relational rule learning can also be adapted to subgroup discovery. This paper proposes a propositionalization approach to relational subgroup discovery, achieved by appropriately adapting rule learning and first-order feature construction. The proposed approach was successfully applied to standard ILP problems (East-West trains, King-Rook-King chess endgame and mutagenicity prediction) and two real-life problems (analysis of telephone calls and traffic accident analysis).
Multi-relational learning, text mining, and semi-supervised learning for functional genomics
- Machine Learning, 2004
Cited by 9 (1 self)
We focus on the problem of predicting functional properties of the proteins corresponding to genes in the yeast genome. Our goal is to study the effectiveness of approaches that utilize all data sources available in this problem setting, including relational data, abstracts of research papers, and unlabeled data. We investigate a propositionalization approach which uses relational gene interaction data. We study the benefit of text classification and information extraction for utilizing a collection of scientific abstracts. We study transduction and co-training for using unlabeled data. We report both positive and negative results for the investigated approaches. The studied tasks are KDD Cup tasks of 2001 and 2002. The solutions we describe achieved the highest score for task 2 in 2001, the fourth rank for task 3 in 2001, and, for task 2 in 2002, the highest score on one of the two subtasks and third place overall.
Spatial Associative Classification: Propositional vs. Structural approach
- Journal of Intelligent Information Systems, 2006
Cited by 9 (5 self)
Spatial associative classification employs association rules for spatial classification purposes. In this work, we investigate spatial associative classification in a multi-relational data mining setting to deal with spatial objects having different properties, which are modeled by as many data tables (relations) as there are spatial object types (layers). Spatial classification is based on two alternative approaches: a propositional approach and a structural approach. The propositional approach uses spatial association rules to construct an attribute-value representation (propositionalisation) of spatial data and performs spatial classification with well-known propositional classification methods. Since the attribute-value representation should capture relational properties of spatial data, multi-relational association rules are used in the propositionalisation step. The structural approach resorts to an extension of naïve Bayes classifiers to multi-relational data, where the classification is driven by multi-relational association rules modelling regularities in spatial data. In both cases the spatial associative classification is performed at different levels of granularity and takes advantage of domain knowledge expressed in the form of hierarchies and rules. Experiments on real-world geo-referenced census data analysis show the advantage of the structural approach over the propositional one.
A relational approach to probabilistic classification in a . . .
- Engineering Applications of Artificial Intelligence, 2009
Compact Propositional Encodings of First-Order Theories
A propositionalization of a theory in First-Order Logic (FOL) is a set of propositional sentences that is satisfiable iff the original theory is satisfiable. We cannot translate arbitrary
ALGORITHMS FOR NON-PARAMETRIC CLASSIFIERS IN MULTI-RELATIONAL DATA MINING
2006
Over the last decades, due to advances in information technologies, both the industrial and scientific communities have acquired large volumes of data in digital form. Most of these data sets are stored in relational databases consisting of multiple tables and associations. Moreover, the data used in bioinformatics and computational biology, as well as HTML and XML documents, are relational in nature. However, most existing approaches to knowledge discovery in databases assume that the data are stored in a single table. Therefore, new algorithms are needed to exploit the relational information provided in these data sets. This thesis proposes two novel solutions to the task of supervised classification in relational domains, based on traditional non-parametric classifiers and built upon relational algebra. The first approach is based on Kernel Density Estimation, and the second is based on Gaussian Mixture Models. Both techniques are evaluated on three real-world relational data sets, drawn from the fields of organic chemistry, medicine and genetics.
KYBERNETIKA, VOLUME xx (xxxx), NUMBER x, PAGES xxx-xxx
this paper. Note that for admissible features we could not prove a `monotonicity' lemma similar to Lemma 1 (see Part One), which we have shown for feature candidates, since not all prefixes of admissible features are admissible features. Obviously, a feature candidate such as hasCar(T,C) can be refined into an admissible feature hasCar(T,C), long(C). As a slightly less trivial example, a decomposable feature candidate hasCar(T,C1), long(C1), hasCar(T,C2), short(C2) can be refined into an admissible feature by adding the literal different(C1,C2).
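To illustrate how such first-order features turn relational data into an attribute-value table, here is a minimal sketch that evaluates the two features mentioned above on a few toy trains. The data, the function names, and the dictionary representation are hypothetical and are not taken from the paper.

```python
from itertools import product

# Hypothetical toy data: each train maps car ids to the set of car properties.
trains = {
    "t1": {"c1": {"long"}, "c2": {"short"}},
    "t2": {"c1": {"long"}},
    "t3": {"c1": {"short"}, "c2": {"short"}},
}

def has_long_car(cars):
    """hasCar(T,C), long(C): the train has at least one long car."""
    return any("long" in props for props in cars.values())

def has_long_and_short_car(cars):
    """hasCar(T,C1), long(C1), hasCar(T,C2), short(C2), different(C1,C2)."""
    return any(
        c1 != c2 and "long" in cars[c1] and "short" in cars[c2]
        for c1, c2 in product(cars, repeat=2)
    )

FEATURES = [has_long_car, has_long_and_short_car]

def propositionalize(trains):
    """Return one Boolean feature vector per train (one column per feature)."""
    return {t: [f(cars) for f in FEATURES] for t, cars in trains.items()}

table = propositionalize(trains)
```

Each train becomes one row of truth values, so a standard propositional learner could be run on `table` directly.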
Compact Propositionalizations of First-Order Theories
We present new insights and algorithms for converting reasoning problems in monadic First-Order Logic (which includes only 1-place predicates) into equivalent problems in propositional logic. Our algorithms improve over earlier approaches in two ways. First, they are applicable even without the unique-names and domain-closure assumptions, and for possibly infinite domains. Therefore, they apply to many problems that are outside the scope of previous techniques. Second, our algorithms produce propositional representations that are significantly more compact than those of earlier approaches, provided that some structure is available in the problem. We examined our approach on an example application and found that the number of propositional symbols we produced is smaller by a factor of f ≈ 50 than with traditional techniques, when those techniques can be applied. This translates to a factor of about 2^f increase in the speed of reasoning for such structured problems.
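For contrast, the following is a minimal sketch of naive grounding over a finite set of constants, which is assumed here to be the kind of traditional technique that relies on domain closure. It is not the paper's algorithm; the function names and the example formula are hypothetical.

```python
from itertools import product

def ground_symbols(predicates, constants):
    """Naive grounding: one propositional symbol per (predicate, constant) pair."""
    return [f"{p}_{c}" for p, c in product(predicates, constants)]

def ground_universal_implication(p, q, constants):
    """Ground 'forall x. p(x) -> q(x)' into one clause (not p_c or q_c) per constant."""
    return [(f"not {p}_{c}", f"{q}_{c}") for c in constants]

# Hypothetical monadic theory with predicates P, Q over constants a, b, c.
symbols = ground_symbols(["P", "Q"], ["a", "b", "c"])
clauses = ground_universal_implication("P", "Q", ["a", "b", "c"])
```

The symbol count grows as the number of predicates times the number of constants, which is the kind of growth a more compact encoding aims to reduce.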

