LIMES -- A Time-Efficient Approach for Large-Scale Link Discovery on the Web of Data
2011
Cited by 48 (8 self)
The Linked Data paradigm has evolved into a powerful enabler for the transition from the document-oriented Web into the Semantic Web. While the amount of data published as Linked Data grows steadily and has surpassed 25 billion triples, less than 5 % of these triples are links between knowledge bases. Link discovery frameworks provide the functionality necessary to discover missing links between knowledge bases in a semi-automatic fashion. Yet, the task of linking knowledge bases requires a significant amount of time, especially when it is carried out on large data sets. This paper presents and evaluates LIMES, a novel time-efficient approach for link discovery in metric spaces. Our approach utilizes the mathematical characteristics of metric spaces to compute estimates of the similarity between instances. These estimates are then used to filter out a large number of the instance pairs that do not satisfy the mapping conditions. Thus, LIMES can reduce the number of comparisons needed during the mapping process by several orders of magnitude. We present the mathematical foundation and the core algorithms employed in the implementation. We evaluate LIMES with synthetic data to elucidate its behavior on small and large data sets with different configurations and show that our approach can significantly reduce the time complexity of a mapping task. In addition, we compare the runtime of our framework with a state-of-the-art link discovery tool. We show that LIMES is more than 60 times faster when mapping large knowledge bases.
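The abstract does not spell out which property of metric spaces is used. A minimal sketch, assuming the triangle inequality and a hand-picked set of reference points ("exemplars"), shows how distance estimates can prune pairs before any exact comparison. The function name, the Euclidean distance, and the choice of exemplars are illustrative assumptions, not the LIMES implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance, a metric that satisfies the triangle inequality."""
    return math.dist(a, b)


def link_with_exemplar_filter(source, target, exemplars, threshold, dist=euclidean):
    """Return (links, comparisons) for all pairs with dist(s, t) <= threshold.

    Assumption: `exemplars` is any small list of reference points. For each
    exemplar e, the triangle inequality gives the lower bound
    dist(s, t) >= |dist(s, e) - dist(e, t)|, so a pair whose lower bound
    exceeds the threshold can be discarded without computing dist(s, t).
    """
    # Distances from every target point to every exemplar, computed once.
    target_to_exemplars = [[dist(t, e) for e in exemplars] for t in target]

    links = []
    comparisons = 0  # number of exact source-target distance computations
    for s in source:
        s_to_exemplars = [dist(s, e) for e in exemplars]
        for t, t_to_exemplars in zip(target, target_to_exemplars):
            # Tightest lower bound over all exemplars (0.0 if there are none).
            lower_bound = max(
                (abs(a - b) for a, b in zip(s_to_exemplars, t_to_exemplars)),
                default=0.0,
            )
            if lower_bound > threshold:
                continue  # cannot be a link; skip the exact comparison
            comparisons += 1
            if dist(s, t) <= threshold:
                links.append((s, t))
    return links, comparisons


if __name__ == "__main__":
    source = [(0, 0), (10, 10)]
    target = [(0, 1), (10, 11), (50, 50)]
    links, comparisons = link_with_exemplar_filter(source, target, [(0, 0)], 1.5)
    print(links, comparisons)
```

Because the bound never overestimates the true distance, the filter discards only pairs that cannot be links, so the result matches a brute-force comparison of all pairs.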
RAVEN – Active Learning of Link Specifications
Cited by 16 (5 self)
Abstract. With the growth of the Linked Data Web, time-efficient approaches for computing links between data sources have become indispensable. Yet, in many cases, determining the right specification for a link discovery problem is a tedious task that must still be carried out manually. We present RAVEN, an approach for the semi-automatic determination of link specifications. Our approach is based on the combination of stable solutions of matching problems and active learning with the time-efficient link discovery framework LIMES. RAVEN aims at requiring a small number of interactions with the user to generate classifiers of high accuracy. We focus on using RAVEN to compute and configure boolean and weighted classifiers, which we evaluate in three experiments against link specifications created manually. Our evaluation shows that we can compute linking configurations that achieve more than 90 % F-score by asking the user to verify at most twelve potential links.
A Time-Efficient Hybrid Approach to Link Discovery
Cited by 9 (2 self)
Abstract. With the growth of the Linked Data Web, time-efficient Link Discovery frameworks have become indispensable for implementing the fourth Linked Data principle, i.e., the provision of links between data sources. Due to the sheer size of the Data Web, detecting links even when using trivial specifications based on a single property can be very time-demanding. Moreover, non-trivial Link Discovery tasks require complex link specifications and are consequently even more challenging to optimize with respect to runtime. In this paper, we present a novel hybrid approach to link discovery that combines two very fast algorithms. Both algorithms are combined by using original insights on the translation of complex link specifications to combinations of atomic specifications via a series of operations on sets and filters. We show in three experiments that our approach outperforms SILK by more than six orders of magnitude while abiding by the restriction of not losing any link.
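The idea of translating a complex link specification into set operations over atomic results can be illustrated with a small sketch. Everything here is an assumption made for illustration: mappings are modeled as dictionaries from (source, target) pairs to similarity scores, and the operator semantics (minimum for intersection, maximum for union) are one plausible choice that the abstract does not confirm.

```python
def atomic(source, target, sim, threshold):
    """Run one atomic specification: keep pairs whose similarity reaches the threshold."""
    mapping = {}
    for s in source:
        for t in target:
            score = sim(s, t)
            if score >= threshold:
                mapping[(s, t)] = score
    return mapping


def intersect(m1, m2):
    """AND of two specifications: pairs found by both (score = minimum, an assumption)."""
    return {pair: min(m1[pair], m2[pair]) for pair in m1.keys() & m2.keys()}


def union(m1, m2):
    """OR of two specifications: pairs found by either (score = maximum, an assumption)."""
    return {
        pair: max(m1.get(pair, 0.0), m2.get(pair, 0.0))
        for pair in m1.keys() | m2.keys()
    }


def minus(m1, m2):
    """Pairs found by the first specification but not by the second."""
    return {pair: score for pair, score in m1.items() if pair not in m2}


def apply_filter(mapping, threshold):
    """Filter step: drop pairs whose combined score falls below the threshold."""
    return {pair: score for pair, score in mapping.items() if score >= threshold}


if __name__ == "__main__":
    # Hypothetical toy data: (name, founding year) records.
    source = [("Berlin", 1237), ("Paris", 250)]
    target = [("Berlin", 1237), ("Paris", 300), ("Rome", 1237)]

    same_name = lambda s, t: 1.0 if s[0] == t[0] else 0.0
    same_year = lambda s, t: 1.0 if s[1] == t[1] else 0.0

    by_name = atomic(source, target, same_name, 1.0)
    by_year = atomic(source, target, same_year, 1.0)

    # Complex specification "same name AND same year" as a set intersection.
    print(apply_filter(intersect(by_name, by_year), 1.0))
```

Each atomic specification can then be executed by whichever fast algorithm suits its measure, with the set operators combining the partial results.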
K.: EAGLE: Efficient Active Learning of Link Specifications Using Genetic Programming
In: ESWC, 2012
Cited by 8 (2 self)
Abstract. With the growth of the Linked Data Web, time-efficient approaches for computing links between data sources have become indispensable. Most Link Discovery frameworks implement approaches that require two main computational steps. First, a link specification has to be explicated by the user. Then, this specification must be executed. While several approaches for the time-efficient execution of link specifications have been developed over the last few years, the discovery of accurate link specifications remains a tedious problem. In this paper, we present EAGLE, an active learning approach based on genetic programming. EAGLE generates highly accurate link specifications while reducing the annotation burden for the user. We present EAGLE and the framework within which it is implemented. We evaluate EAGLE against batch learning on three different data sets and show that it can detect specifications with an F-measure above 90 % while requiring a small number of questions.
Link discovery with guaranteed reduction ratio in affine spaces with Minkowski measures
In Proceedings of ISWC, 2012
Raven: Towards zero-configuration link discovery
In Proceedings of OM@ISWC, 2011
Cited by 1 (1 self)
Abstract. With the growth of the Linked Data Web, time-efficient approaches for computing links between data sources have become indispensable. Yet, in many cases, determining the right specification for a link discovery problem is a tedious task that must still be carried out manually. In this article, we present RAVEN, an approach for the semi-automatic determination of link specifications. Our approach is based on the combination of stable solutions of matching problems and active learning, leveraging the time-efficient link discovery framework LIMES. RAVEN is designed to require a small number of interactions with the user in order to generate classifiers of high accuracy. With RAVEN, we focus on the computation and configuration of Boolean and weighted classifiers, which we evaluate in three experiments against link specifications created manually. Our evaluation shows that we can compute linking configurations that achieve more than 90 % F-score by asking the user to verify at most twelve potential links.