Results 1–10 of 64
Linear-Time Computation of Similarity Measures for Sequential Data
, 2008
Abstract

Cited by 38 (24 self)
Efficient and expressive comparison of sequences is an essential procedure for learning with sequential data. In this article we propose a generic framework for computation of similarity measures for sequences, covering various kernel, distance and non-metric similarity functions. The basis for comparison is the embedding of sequences using a formal language, such as a set of natural words, k-grams or all contiguous subsequences. As realizations of the framework we provide linear-time algorithms of different complexity and capabilities using sorted arrays, tries and suffix trees as underlying data structures. Experiments on data sets from bioinformatics, text processing and computer security illustrate the efficiency of the proposed algorithms, enabling peak performances of up to 10^6 pairwise comparisons per second. The utility of distances and non-metric similarity measures for sequences as alternatives to string kernels is demonstrated in applications of text categorization, network intrusion detection and transcription site recognition in DNA.
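As a hypothetical illustration of the embedding idea described in this abstract (function names and the choice of k are ours, not the paper's), a k-gram count embedding and the induced inner-product kernel can be sketched as:

```python
from collections import Counter

def kgram_embedding(seq, k=3):
    """Map a sequence to counts of its contiguous k-grams (one choice of
    embedding language; word sets or all subsequences also fit the framework)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def kgram_kernel(x, y, k=3):
    """Inner product of two k-gram embeddings, a basic string kernel.
    Runs in time linear in the sequence lengths for fixed k."""
    ex, ey = kgram_embedding(x, k), kgram_embedding(y, k)
    if len(ex) > len(ey):          # iterate over the sparser embedding
        ex, ey = ey, ex
    return sum(c * ey[g] for g, c in ex.items())
```

Distances (e.g. Euclidean) and non-metric coefficients can be computed over the same sparse embedding, which is what makes the framework generic.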
Naive Bayesian Classification of Structured Data
, 2003
Abstract

Cited by 33 (0 self)
In this paper we present 1BC and 1BC2, two systems that perform naive Bayesian classification of structured individuals. The approach of 1BC is to project the individuals along first-order features. These features are built from the individual using structural predicates referring to related objects (e.g. atoms within molecules), and properties applying to the individual or one or several of its related objects (e.g. a bond between two atoms). We describe an individual in terms of elementary features consisting of zero or more structural predicates and one property; these features are treated as conditionally independent in the spirit of the naive Bayes assumption. 1BC2 represents an alternative first-order upgrade to the naive Bayesian classifier by considering probability distributions over structured objects (e.g., a molecule as a set of atoms), and estimating those distributions from the probabilities of its elements (which are assumed to be independent). We present a unifying view on both systems in which 1BC works in language space, and 1BC2 works in individual space. We also present a new, efficient recursive algorithm improving upon the original propositionalisation approach of 1BC. Both systems have been implemented in the context of the first-order descriptive learner Tertius, and we investigate the differences between the two systems both in computational terms and on artificially generated data. Finally, we describe a range of experiments on ILP benchmark data sets demonstrating the viability of our approach.
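A minimal sketch of the naive Bayes treatment of elementary features described above (the data, feature encoding and add-one smoothing are our illustrative choices, not the 1BC implementation):

```python
import math
from collections import defaultdict

def train_nb(examples):
    """examples: (feature_set, label) pairs, where each feature set stands
    for the elementary features extracted from a structured individual."""
    class_counts = defaultdict(int)
    feat_counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for feats, label in examples:
        class_counts[label] += 1
        for f in feats:
            feat_counts[label][f] += 1
            vocab.add(f)
    return class_counts, feat_counts, vocab

def predict_nb(feats, class_counts, feat_counts, vocab):
    """Score each class by log prior plus log likelihood of the features,
    treated as conditionally independent (add-one smoothing)."""
    n = sum(class_counts.values())
    def score(label):
        total = sum(feat_counts[label].values())
        s = math.log(class_counts[label] / n)
        for f in feats:
            s += math.log((feat_counts[label][f] + 1) / (total + len(vocab)))
        return s
    return max(class_counts, key=score)
```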
Statistical learning for inductive query answering on OWL ontologies
 In Proceedings of the 7th International Semantic Web Conference (ISWC)
, 2008
Abstract

Cited by 13 (4 self)
Abstract. A novel family of parametric language-independent kernel functions defined for individuals within ontologies is presented. They are easily integrated with efficient statistical learning methods for inducing linear classifiers that offer an alternative way to perform classification w.r.t. deductive reasoning. A method for adapting the parameters of the kernel to the knowledge base through stochastic optimization is also proposed. This enables the exploitation of statistical learning in a variety of tasks where an inductive approach may bridge the gaps of the standard methods due to the inherent incompleteness of the knowledge bases. In this work, a system integrating the kernels has been tested in experiments on approximate query answering with real ontologies collected from standard repositories.
The graph neural network model
 IEEE Transactions on Neural Networks
, 2009
Abstract

Cited by 11 (5 self)
Many underlying relationships among data in several areas of science and engineering, e.g., computer vision, molecular chemistry, molecular biology, pattern recognition, and data mining, can be represented in terms of graphs. In this paper, we propose a new neural network model, called the graph neural network (GNN) model, that extends existing neural network methods for processing the data represented in graph domains. This GNN model, which can directly process most of the practically useful types of graphs, e.g., acyclic, cyclic, directed, and undirected, implements a function τ(G, n) ∈ ℝ^m that maps a graph G and one of its nodes n into an m-dimensional Euclidean space. A supervised learning algorithm is derived to estimate the parameters of the proposed GNN model. The computational cost of the proposed algorithm is also considered. Some experimental results are shown to validate the proposed learning algorithm, and to demonstrate its generalization capabilities.
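A toy sketch of the fixed-point state computation such a model rests on (the architecture, weight shapes and readout are our own simplification, not the paper's; small weights keep the update a contraction so iteration converges):

```python
import numpy as np

def gnn_states(adj, feats, W_nbr, W_feat, iters=100):
    """Iterate x_v <- tanh(sum of neighbor states @ W_nbr + feats_v @ W_feat)
    toward a fixed point; convergence requires the update to be a
    contraction, which small weight matrices ensure here."""
    n, d = adj.shape[0], W_nbr.shape[1]
    x = np.zeros((n, d))
    for _ in range(iters):
        x = np.tanh(adj @ x @ W_nbr + feats @ W_feat)
    return x

def gnn_output(adj, feats, W_nbr, W_feat, w_out):
    """Map each node's converged state to R^m via a linear readout,
    a crude stand-in for the function tau(G, n)."""
    return gnn_states(adj, feats, W_nbr, W_feat) @ w_out
```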
Bridging the gap between distance and generalisation: Symbolic learning in metric spaces
, 2008
Abstract

Cited by 10 (4 self)
Distance-based and generalisation-based methods are two families of artificial intelligence techniques that have been successfully used over a wide range of real-world problems. In the first case, general algorithms can be applied to any data representation by just changing the distance. The metric space sets the search and learning space, which is generally instance-oriented. In the second case, models can be obtained for a given pattern language, which can be comprehensible. The generality-ordered space sets the search and learning space, which is generally model-oriented. However, the concepts of distance and generalisation clash in many different ways, especially when knowledge representation is complex (e.g. structured data). This work establishes a framework where these two fields can be integrated in a consistent way. We introduce the concept of distance-based generalisation, which connects all the generalised examples in such a way that all of them are reachable inside the generalisation by using straight paths in the metric space. This makes the metric space and the generality-ordered space coherent (or even dual). Additionally, we also introduce a definition of minimal distance-based generalisation that can be seen as the first formulation of the Minimum Description Length (MDL)/Minimum Message Length (MML) principle in terms of a distance function. We instantiate and develop the framework for the most common data representations and distances, where we show that consistent instances can be found for numerical data, nominal data, sets, lists, tuples, graphs, first-order atoms and clauses. As a result, general learning methods that integrate the best from distance-based and generalisation-based methods can be defined and adapted to any specific problem by appropriately choosing the distance, the pattern language and the generalisation operator.
Kernels on Prolog Proof Trees: Statistical Learning in the ILP Setting
, 2006
Abstract

Cited by 8 (3 self)
We develop kernels for measuring the similarity between relational instances using background knowledge expressed in first-order logic. The method allows us to bridge the gap between traditional inductive logic programming (ILP) representations and statistical approaches to supervised learning. Logic programs are first used to generate proofs of given visitor programs that use predicates declared in the available background knowledge. A kernel is then defined over pairs of proof trees. The method can be used for supervised learning tasks and is suitable for classification as well as regression. We report positive empirical results on Bongard-like and M-of-N problems that are difficult or impossible to solve with traditional ILP techniques, as well as on real bioinformatics and chemoinformatics data sets.
Kernels over relational algebra structures
 In: The Ninth Pacific-Asia Conference on Knowledge Discovery and Data Mining
, 2005
Abstract

Cited by 8 (2 self)
Abstract. In this paper we present a novel and general framework based on concepts of relational algebra for kernel-based learning over relational schema. We exploit the notion of foreign keys to define a new attribute that we call instance-set and we use this type of attribute to define a tree-like structured representation of the learning instances. We define kernel functions over relational schemata which are instances of R-convolution kernels and use them as a basis for a relational instance-based learning algorithm. These kernels can be considered as being defined over typed and unordered trees where elementary kernels are used to compute the graded similarity between nodes. We investigate their formal properties and evaluate the performance of the relational instance-based algorithm on a number of relational benchmark datasets.
Fast Random Walk Graph Kernel
Abstract

Cited by 6 (2 self)
The random walk graph kernel has been used as an important tool for various data mining tasks including classification and similarity computation. Despite its usefulness, however, it suffers from an expensive computational cost, which is at least O(n^3) or O(m^2) for graphs with n nodes and m edges. In this paper, we propose Ark, a set of fast algorithms for random walk graph kernel computation. Ark is based on the observation that real graphs have much lower intrinsic ranks compared with the orders of the graphs. Ark exploits the low-rank structure to quickly compute random walk graph kernels in O(n^2) or O(m) time. Experimental results show that our method is up to 97,865× faster than the existing algorithms, while providing more than 91.3% of the accuracy.
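The low-rank idea can be sketched as follows (our own minimal version using a truncated eigendecomposition of each adjacency matrix; Ark's actual algorithms differ in detail). With A_i ≈ U_i Λ_i U_i^T, the Kronecker-product system (I − c A_1⊗A_2)^{−1} p collapses to an r²-dimensional computation:

```python
import numpy as np

def rw_kernel_exact(A1, A2, c=0.1):
    """Direct random walk kernel q^T (I - c A1 kron A2)^{-1} p with uniform
    starting/stopping vectors; the dense solve costs O((n1*n2)^3)."""
    Ax = np.kron(A1, A2)
    n = Ax.shape[0]
    p = np.ones(n) / n
    return np.ones(n) @ np.linalg.solve(np.eye(n) - c * Ax, p)

def rw_kernel_lowrank(A1, A2, c=0.1, r=2):
    """Same kernel via rank-r eigendecompositions of the (symmetric)
    adjacency matrices; the solve shrinks to r^2 diagonal terms."""
    w1, U1 = np.linalg.eigh(A1)
    w2, U2 = np.linalg.eigh(A2)
    i1 = np.argsort(-np.abs(w1))[:r]
    i2 = np.argsort(-np.abs(w2))[:r]
    lam = np.kron(w1[i1], w2[i2])        # eigenvalues of A1 kron A2
    U = np.kron(U1[:, i1], U2[:, i2])    # orthonormal columns
    n = A1.shape[0] * A2.shape[0]
    p = np.ones(n) / n
    q = np.ones(n)
    # (I - c U diag(lam) U^T)^{-1} = I + U diag(1/(1 - c*lam) - 1) U^T
    d = 1.0 / (1.0 - c * lam) - 1.0
    return q @ p + (q @ U) @ (d * (U.T @ p))
```

With r equal to the full rank the two routines agree exactly; the savings come from truncating r well below n.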
Learning with Kernels in Description Logics
Abstract

Cited by 6 (3 self)
Abstract. We tackle the problem of statistical learning in the standard knowledge base representations for the Semantic Web, which are ultimately expressed in description logics. Specifically, in our method a kernel function for the ALCN logic is integrated with a support vector machine, which enables the usage of statistical learning with these reference representations. Experiments were performed in which kernel classification is applied to the tasks of resource retrieval and query answering on OWL ontologies.
A scalable kernel approach to learning in semantic graphs with applications to linked data
 in: Proc. of the 1st Workshop on Mining the Future Internet
, 2010
Abstract

Cited by 5 (1 self)
Abstract. In this paper we discuss a kernel approach to learning in semantic graphs. To scale up the performance to large data sets, we employ the Nyström approximation. We derive a kernel from semantic relations in a local neighborhood of a node. One can apply our approach to problems in multi-relational domains with several thousand graph nodes and more than a million potential links. We apply the approach to DBpedia data extracted from the RDF graph of the Semantic Web's Linked Open Data (LOD).
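The Nyström approximation mentioned above can be sketched generically (the RBF kernel and uniform landmark sampling are our illustrative choices, not the paper's graph kernel):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian RBF kernel matrix between row-vector sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom(X, m, kernel=rbf_kernel, rng=None):
    """Approximate the n x n kernel matrix as C W^+ C^T using m sampled
    landmark rows: C is the n x m cross-kernel block, W its m x m
    landmark block. Avoids ever forming the full n x n matrix."""
    if rng is None:
        rng = np.random.default_rng(0)
    idx = rng.choice(X.shape[0], size=m, replace=False)
    C = kernel(X, X[idx])
    W = C[idx]
    return C @ np.linalg.pinv(W) @ C.T
```

With m = n the approximation reproduces the exact kernel matrix; with m ≪ n it yields a positive semidefinite rank-m surrogate at far lower cost, which is what makes kernel learning on graphs with thousands of nodes tractable.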