Learning to match the schemas of data sources: A multistrategy approach (0)

by P. Domingos, A. Doan
Venue: Machine Learning
Results 1 - 10 of 40

QOM – Quick ontology mapping

by Marc Ehrig, Steffen Staab - In Proc. 3rd International Semantic Web Conference (ISWC04), 2004
Cited by 84 (8 self)
(Semi-)automatic mapping, also called (semi-)automatic alignment, of ontologies is a core task to achieve interoperability when two agents or services use different ontologies. In the existing literature, the focus has so far been on improving the quality of mapping results. Here we consider QOM, Quick Ontology Mapping, as a way to trade off between effectiveness (i.e. quality) and efficiency of the mapping generation algorithms. We show that QOM has lower run-time complexity than existing prominent approaches. Then, we show in experiments that this theoretical investigation translates into practical benefits. While QOM gives up some of the possibilities for producing high-quality results in favor of efficiency, our experiments show that this loss of quality is marginal.

Bootstrapping Ontology Alignment Methods with APFEL

by Marc Ehrig, Steffen Staab, York Sure - In Proceedings of ISWC, 2005
Cited by 52 (0 self)
this paper requires training examples. The assistance in their creation is necessary as in a typical ontology alignment setting there are only a small number of really plausible alignments available compared to the large number of candidates, which might be possible a priori

Automatically Refining the Wikipedia Infobox Ontology

by Fei Wu, Daniel S. Weld, 2008
Cited by 43 (7 self)
The combined efforts of human volunteers have recently extracted numerous facts from Wikipedia, storing them as machine-harvestable object-attribute-value triples in Wikipedia infoboxes. Machine learning systems, such as Kylin, use these infoboxes as training data, accurately extracting even more semantic knowledge from natural language text. But in order to realize the full power of this information, it must be situated in a cleanly-structured ontology. This paper introduces KOG, an autonomous system for refining Wikipedia’s infobox-class ontology towards this end. We cast the problem of ontology refinement as a machine learning problem and solve it using both SVMs and a more powerful joint-inference approach expressed in Markov Logic Networks. We present experiments demonstrating the superiority of the joint-inference approach and evaluating other aspects of our system. Using these techniques, we build a rich ontology, integrating Wikipedia’s infobox-class schemata with WordNet. We demonstrate how the resulting ontology may be used to enhance Wikipedia with improved query processing and other features.

Link Mining: A Survey

by Lise Getoor, Christopher P. Diehl - SigKDD Explorations Special Issue on Link Mining, 2005
Cited by 31 (0 self)
Many datasets of interest today are best described as a linked collection of interrelated objects. These may represent homogeneous networks, in which there is a single-object type and link type, or richer, heterogeneous networks, in which there may be multiple object and link types (and possibly other semantic information). Examples of homogeneous networks include single-mode social networks, such as people connected by friendship links, or the WWW, a collection of linked web pages. Examples of heterogeneous networks include those in medical domains describing patients, diseases, treatments and contacts, or in bibliographic domains describing publications, authors, and venues. Link mining refers to data mining techniques that explicitly consider these links when building predictive or descriptive models of the linked data. Commonly addressed link mining tasks include object ranking, group detection, collective classification, link prediction and subgraph discovery. While network analysis has been studied in depth in particular areas such as social network analysis, hypertext mining, and web analysis, only recently has there been a cross-fertilization of ideas among these different communities. This is an exciting, rapidly expanding area. In this article, we review some of the common emerging themes.

ASSAM: A Tool for Semi-Automatically Annotating Semantic Web Services

by Andreas Heß, Eddie Johnston, Nicholas Kushmerick - In Intl. Semantic Web Conf. (ISWC), 2004
Cited by 31 (3 self)
The semantic Web Services vision requires that each service be annotated with semantic metadata. Manually creating such metadata is tedious and error-prone, and many software engineers, accustomed to tools that automatically generate WSDL, might not want to invest the additional effort. We therefore propose ASSAM, a tool that assists a user in creating semantic metadata for Web Services. ASSAM is intended for service consumers who want to integrate a number of services and therefore must annotate them according to some shared ontology. ASSAM is also relevant for service producers who have deployed a Web Service and want to make it compatible with an existing ontology. ASSAM's capabilities to automatically create semantic metadata are supported by two machine learning algorithms. First, we have developed an iterative relational classification algorithm for semantically classifying Web Services, their operations, and input and output messages. Second, to aggregate the data returned by multiple semantically related Web Services, we have developed a schema mapping algorithm that is based on an ensemble of string distance metrics.

Consistent Query Answers in Virtual Data Integration Systems

by Leopoldo Bertossi, Loreto Bravo - In Inconsistency Tolerance, Springer LNCS 3300, 2005
Cited by 30 (18 self)
When data sources are virtually integrated there is no common and centralized mechanism for maintaining global consistency. In consequence, it is likely that inconsistencies with respect to certain global integrity constraints (ICs) will occur. In this chapter we consider the problem of defining and computing those answers that are consistent wrt the global ICs when global queries are posed to virtual data integration systems whose sources are specified following the local-as-view approach.

A Large Scale Taxonomy Mapping Evaluation

by Paolo Avesani, Fausto Giunchiglia, Mikalai Yatskevich - In Proceedings of ISWC, 2005
Cited by 28 (11 self)
Matching hierarchical structures, like taxonomies or web directories, is the premise for enabling interoperability among heterogeneous data organizations. While the number of new matching solutions is increasing, the evaluation issue is still open. This work addresses the problem of comparison for pairwise matching solutions. A methodology is proposed to overcome the issue of scalability. A large-scale dataset is developed based on a real-world case study, namely the web directories of Google, Looksmart and Yahoo!. Finally, an empirical evaluation is performed which compares the most representative solutions for taxonomy matching. We argue that the proposed dataset can play a key role in supporting the empirical analysis for the research effort in the area of taxonomy matching.

Clustering Documents in a Web Directory

by Giordano Adami, Paolo Avesani, Diego Sona, 2003
Cited by 18 (6 self)
growing interest due to the widespread proliferation of topic hierarchies for text documents. The worst problem of hierarchical supervised classifiers is their high demand in terms of labeled examples, whose amount is related to the number of topics in the taxonomy. Hence, bootstrapping a huge hierarchy with a proper set of labeled examples is a critical issue. In this paper, we propose some solutions for the bootstrapping problem, implicitly or explicitly using a taxonomy definition: a baseline approach where documents are classified according to class labels, and two clustering approaches, where training is constrained by the a-priori knowledge of the taxonomy structure, both at terminological and topological level. In particular, we propose the TaxSOM model, that clusters a set of documents in a predefined hierarchy of classes, directly exploiting the knowledge of both their topological organization and their lexical description. Experimental evaluation was performed on a set of taxonomies taken from the Google Web directory.

Bootstrapping for Hierarchical Document Classification

by Giordano Adami, Paolo Avesani, Diego Sona - In Proceedings of the Twelfth ACM International Conference on Information and Knowledge Management (CIKM03), 2003
Cited by 12 (5 self)
Managing the hierarchical organization of data is starting to play a key role in the knowledge management community due to the great amount of human resources needed to create and maintain these organized repositories of information. The machine learning community has in part addressed this problem by developing hierarchical supervised classifiers that help maintainers categorize new resources within given hierarchies. Although such learning models succeed in exploiting relational knowledge, they are highly demanding in terms of labeled examples, because the number of categories is related to the dimension of the corresponding hierarchy. Hence, the creation of new directories or the modification of existing ones requires strong investments.

Semantic Alignment Of Business Processes

by Saartje Brockmans, Marc Ehrig, Agnes Koschmider, Andreas Oberweis, Rudi Studer - In: Proceedings of the Eighth International Conference on Enterprise Information Systems (ICEIS 2006), 2006
Cited by 10 (2 self)
This paper presents a method for semantically aligning business processes. We provide a representation of Petri nets in the ontology language OWL, to semantically enrich the business process models. On top of this, we propose a technique for semantically aligning business processes to support (semi)automatic interconnectivity of business processes. This semantic alignment is improved by a background ontology modeled with a specific UML Profile that allows it to be modeled visually. The different parts of our proposal, which reduces communication efforts and solves interconnectivity problems, are discussed.