Results 1 - 10
of
100
Formal Ontology and Information Systems
, 1998
"... Research on ontology is becoming increasingly widespread in the computer science community, and its importance is being recognized in a multiplicity of research fields and application areas, including knowledge engineering, database design and integration, information retrieval and extraction. We sh ..."
Abstract
-
Cited by 497 (9 self)
- Add to MetaCart
Research on ontology is becoming increasingly widespread in the computer science community, and its importance is being recognized in a multiplicity of research fields and application areas, including knowledge engineering, database design and integration, information retrieval and extraction. We shall use the generic term information systems, in its broadest sense, to collectively refer to these application perspectives. We argue in this paper that so-called ontologies present their own methodological and architectural peculiarities: on the methodological side, their main peculiarity is the adoption of a highly interdisciplinary approach, while on the architectural side the most interesting aspect is the centrality of the role they can play in an information system, leading to the perspective of ontology-driven information systems.
Object exchange across heterogeneous information sources
- INTERNATIONAL CONFERENCE ON DATA ENGINEERING
, 1995
"... We address the problem of providing integrated access to diverse and dynamic information sources. We explain how this problem differs from the traditional database integration problem and we focus on one aspect of the information integration problem, namely information exchange. We define an object- ..."
Abstract
-
Cited by 465 (56 self)
- Add to MetaCart
We address the problem of providing integrated access to diverse and dynamic information sources. We explain how this problem differs from the traditional database integration problem and we focus on one aspect of the information integration problem, namely information exchange. We define an object-based information exchange model and a corresponding query language that we believe are well suited for integration of diverse information sources. We describe how, the model and language have been used to integrate heterogeneous bibliographic information sources. We also describe two general-purpose libraries we have implemented for object exchange between clients and servers.
Optimizing Queries across Diverse Data Sources
- In Proc. of VLDB
, 1997
"... Businesses today need to interrelate data stored in diverse systems with differing capabilities, ideally via a single high-level query interface. We present the design of a query optimizer for Gar- lic [C+95], a middleware system designed to integrate data from a broad range of data sources with ver ..."
Abstract
-
Cited by 241 (15 self)
- Add to MetaCart
Businesses today need to interrelate data stored in diverse systems with differing capabilities, ideally via a single high-level query interface. We present the design of a query optimizer for Gar- lic [C+95], a middleware system designed to integrate data from a broad range of data sources with very different query capabilities. Garlic's optimizer extends the rule-based approach of [Loh88 ] to work in a heterogeneous environment, by defining generic rules for the middleware and using wrapper-provided rules to encapsulate the capabilities of each data source. This approach offers great advantages in terms of plan quality, extensibility to new sources, incremental implementation of rules for new sources, and the ability to express the capabilities of a diverse set of sources. We describe the design and implementation of this optimizer, and illustrate its actions through an example.
DBpedia: A Nucleus for a Web of Open Data
- In 6th Int’l Semantic Web Conference, Busan, Korea
, 2007
"... Abstract DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against datasets derived from Wikipedia and to link other datasets on the Web to Wikipedia data. We describe the ..."
Abstract
-
Cited by 203 (19 self)
- Add to MetaCart
Abstract DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against datasets derived from Wikipedia and to link other datasets on the Web to Wikipedia data. We describe the extraction of the DBpedia datasets, and how the resulting information is published on the Web for human- and machineconsumption. We describe some emerging applications from the DBpedia community and show how website authors can facilitate DBpedia content within their sites. Finally, we present the current status of interlinking DBpedia with other open datasets on the Web and outline how DBpedia could serve as a nucleus for an emerging Web of open data. 1
Using Schema Matching to Simplify Heterogeneous Data Translation
, 1998
"... A broad spectrum of data is available on the Web in distinct heterogeneous sources, and stored under different formats. As the number of systems that utilize this heterogeneous data grows, the importance of data translation and conversion mechanisms increases greatly. In this paper we present a n ..."
Abstract
-
Cited by 187 (5 self)
- Add to MetaCart
A broad spectrum of data is available on the Web in distinct heterogeneous sources, and stored under different formats. As the number of systems that utilize this heterogeneous data grows, the importance of data translation and conversion mechanisms increases greatly. In this paper we present a new translation system, based on schemamatching, aimed to simplify the intricate task of data conversion. We observe that in many cases the schema of the data in the source system is very similar to the that of the target system. In such cases, much of the translation work can be done automatically, based on the schemas similarity. This saves a lot of effort for the user, limiting the amount of programming needed. We define common schema and data models, in which schemas and data (resp.) from many common models can be represented. Using a rulebased method, the source schema is compared with the target one, and each component in the source schema is matched with a corresponding compone...
The state of the art in distributed query processing
- ACM Computing Surveys
, 2000
"... Distributed data processing is fast becoming a reality. Businesses want to have it for many reasons, and they often must have it in order to stay competitive. While much of the infrastructure for distributed data processing is already in place (e.g., modern network technology), there are a number of ..."
Abstract
-
Cited by 182 (2 self)
- Add to MetaCart
Distributed data processing is fast becoming a reality. Businesses want to have it for many reasons, and they often must have it in order to stay competitive. While much of the infrastructure for distributed data processing is already in place (e.g., modern network technology), there are a number of issues which still make distributed data processing a complex undertaking: (1) distributed systems can become very large involving thousands of heterogeneous sites including PCs and mainframe server machines � (2) the state of a distributed system changes rapidly because the load of sites varies over time and new sites are added to the system� (3) legacy systems need to be integrated|such legacy systems usually have not been designed for distributed data processing and now need to interact with other (modern) systems in a distributed environment. This paper presents the state of the art of query processing for distributed database and information systems. The paper presents the \textbook " architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems. These techniques include special join techniques, techniques to exploit intra-query parallelism, techniques to reduce communication costs, and techniques to exploit caching and replication of data. Furthermore, the paper discusses di erent kinds of distributed systems such as client-server, middleware (multi-tier), and heterogeneous database systems and shows how query processing works in these systems. Categories and subject descriptors: E.5 [Data]:Files � H.2.4 [Database Management Systems]: distributed databases, query processing � H.2.5 [Heterogeneous Databases]: data translation General terms: algorithms � performance Additional key words and phrases: query optimization � query execution � client-server databases � middleware � multi-tier architectures � database application systems � wrappers� replication � caching � economic models for query processing � dissemination-based information systems 1
SEMINT: A tool for identifying attribute correspondences in heterogeneous databases using neural networks
- DATA & KNOWLEDGE ENGINEERING
, 2000
"... ..."
Multimedia Database Systems
- Journal of the ACM
, 1993
"... Though there are now numerous examples of multimedia systems in the commercial market, these systems have been developed primarily on a case-by-case basis. The largescale development of such systems requires a principled characterization of multimedia systems which is independent of any single appli ..."
Abstract
-
Cited by 97 (10 self)
- Add to MetaCart
Though there are now numerous examples of multimedia systems in the commercial market, these systems have been developed primarily on a case-by-case basis. The largescale development of such systems requires a principled characterization of multimedia systems which is independent of any single application. It requires a unified query language framework to access these different structures in a variety of ways. It requires algorithms that are provably correct in processing such queries and whose efficiency can be appropriately evaluated. In this paper, we develop a framework for characterizing multimedia information systems which builds on top of the implementations of individual media, and provides a logical query language that integrates such diverse media. We develop indexing structures and algorithms to process such queries and show that these algorithms are sound and complete and relatively efficient (polynomial-time). We show that the generation of media-events (i.e. generating di...
Learning Object Identification Rules for Information Integration
- Information Systems
, 2001
"... When integrating information from multiple websites, the same data objects can exist in inconsistent text formats across sites, making it di#cult to identify matching objects using exact text match. We have developed an object identification system called Active Atlas, which compares the objects' ..."
Abstract
-
Cited by 77 (8 self)
- Add to MetaCart
When integrating information from multiple websites, the same data objects can exist in inconsistent text formats across sites, making it di#cult to identify matching objects using exact text match. We have developed an object identification system called Active Atlas, which compares the objects' shared attributes in order to identify matching objects. Certain attributes are more important for deciding if a mapping should exist between two objects. Previous methods of object identification have required manual construction of object identification rules or mapping rules for determining the mappings between objects. This manual process is time consuming and error-prone.
The Stanford Data Warehousing Project
- IEEE Data Engineering Bulletin
, 1995
"... The goal of the data warehousing project at Stanford (the WHIPS project) is to develop algorithms and tools for the efficient collection and integration of information from heterogeneous and autonomous sources, including legacy sources. In this paper we give a brief overview of the WHIPS project, an ..."
Abstract
-
Cited by 72 (8 self)
- Add to MetaCart
The goal of the data warehousing project at Stanford (the WHIPS project) is to develop algorithms and tools for the efficient collection and integration of information from heterogeneous and autonomous sources, including legacy sources. In this paper we give a brief overview of the WHIPS project, and we describe some of the research problems being addressed in the initial phase of the project. 1

