Querying Heterogeneous Information Sources Using Source Descriptions
1996
Cited by 638 (33 self)
We witness a rapid increase in the number of structured information sources that are available online, especially on the WWW. These sources include commercial databases on product information, stock market information, real estate, automobiles, and entertainment. We would like to use the data stored in these databases to answer complex queries that go beyond keyword searches. We face the following challenges: (1) Several information sources store interrelated data, and any query-answering system must understand the relationships between their contents. (2) Many sources are not full-featured database systems and can answer only a small set of queries over their data (for example, forms on the WWW restrict the set of queries one can ask). (3) Since the number of sources is very large, effective techniques are needed to prune the set of information sources accessed to answer a query. (4) The details of interacting with each source vary greatly. We describe the Information Manifold, an imp...
A Scalable Comparison-Shopping Agent for the World-Wide Web
In Proceedings of the First International Conference on Autonomous Agents, 1997
Cited by 279 (18 self)
The Web is less agent-friendly than we might hope. Most information on the Web is presented in loosely structured natural language text with no agent-readable semantics. HTML annotations structure the display of Web pages, but provide virtually no insight into their content. Thus, the designers of intelligent Web agents need to address the following questions: (1) To what extent can an agent understand information published at Web sites? (2) Is the agent's understanding sufficient to provide genuinely useful assistance to users? (3) Is site-specific hand-coding necessary, or can the agent automatically extract information from unfamiliar Web sites? (4) What aspects of the Web facilitate this competence? In this paper we investigate these issues with a case study using the ShopBot. ShopBot is a fully implemented, domain-independent comparison-shopping agent. Given the home pages of several on-line stores, ShopBot autonomously learns how to shop at those vendors. After its learning is com...
Query Reformulation for Dynamic Information Integration
Journal of Intelligent Information Systems, 1996
Cited by 227 (26 self)
The standard approach to integrating heterogeneous information sources is to build a global schema that relates all of the information in the different sources, and to pose queries directly against it. The problem is that schema integration is usually difficult, and as soon as any of the information sources change or a new source is added, the process may have to be repeated. The SIMS system uses an alternative approach. A domain model of the application domain is created, establishing a fixed vocabulary for describing data sets in the domain. Using this language, each available information source is described. Queries to SIMS against the collection of available information sources are posed using terms from the domain model, and reformulation operators are employed to dynamically select an appropriate set of information sources and to determine how to integrate the available information to satisfy a query. This approach results in a system that is more flexible than existing ones, more easily scalable, and able to respond dynamically to newly available or unexpectedly missing information sources.
Data Model and Query Evaluation in Global Information Systems
Journal of Intelligent Information Systems, 1991
Cited by 200 (14 self)
Global information systems involve a large number of information sources distributed over computer networks. The variety of information sources and disparity of interfaces make the task of easily locating and efficiently accessing information over the network very cumbersome. We describe an architecture for global information systems that is especially tailored to address the challenges raised in such an environment, and distinguish our architecture from architectures of multidatabase and distributed database systems. Our architecture is based on presenting a conceptually unified view of the information space to a user, specifying rich descriptions of the contents of the information sources, and using these descriptions for optimizing queries posed in the unified view. The contributions of this paper include: (1) we identify aspects of site descriptions that are useful in query optimization; (2) we describe query optimization techniques that minimize the number of information source...
Description Logics in Data Management
1995
Cited by 174 (12 self)
Description logics and reasoners, which are descendants of the KL-ONE language, have been studied in depth in Artificial Intelligence. After a brief introduction, we survey in this paper their application to the problems of information management, using the framework of an abstract information server equipped with several operations, each involving one or more languages. Specifically, we indicate how one can achieve enhanced access to data and knowledge by using descriptions in languages for schema design and integration, queries, answers, updates, rules, and constraints.
Ontology-Based Integration of Information - A Survey of Existing Approaches
2001
Cited by 171 (1 self)
We review the use of ontologies for the integration of heterogeneous information sources. Based on an in-depth evaluation of existing approaches to this problem, we discuss how ontologies are used to support the integration task. We evaluate and compare the languages used to represent the ontologies and the use of mappings between ontologies as well as to connect ontologies with information sources. We also enquire into ontology engineering methods and tools used to develop ontologies for information integration. Based on the results of our analysis, we summarize the state of the art in ontology-based information integration and name areas of further research activities.
Object fusion in mediator systems
International Conference on Very Large Data Bases, 1996
Cited by 155 (29 self)
One of the main tasks of mediators is to fuse information from heterogeneous information sources. This may involve, for example, removing redundancies, and resolving inconsistencies in favor of the most reliable source. The problem becomes harder when the sources are unstructured/semistructured and we do not have complete knowledge of their contents and structure. In this paper we show how many common fusion operations can be specified non-procedurally and succinctly. The key to our approach is to assign semantically meaningful object ids to objects as they are "imported" into the mediator.
The Information Manifold
In Proceedings of the AAAI 1995 Spring Symposium on Information Gathering from Heterogeneous, Distributed Environments
Cited by 148 (5 self)
We describe the Information Manifold (IM), a system for browsing and querying of multiple networked information sources. As a first contribution, the system demonstrates the viability of knowledge representation technology for retrieval and organization of information from disparate (structured and unstructured) information sources. Such an organization allows the user to pose high-level queries that use data from multiple information sources. As a second contribution, we describe novel query processing algorithms used to combine information from multiple sources. In particular, our algorithms are guaranteed to find exactly the set of information sources relevant to a query, and to completely exploit knowledge about local closed world information (Etzioni et al. 1994). Introduction We are currently witnessing an explosion in the amount of information that is available online. For example, the rapid rise in popularity of the World Wide Web (WWW) has increased the amount of information...
Semantic and schematic similarities between database objects: A context-based approach
VLDB Journal, 1996
Cited by 141 (12 self)
In a multidatabase system, schematic conflicts between two objects are usually of interest only when the objects have some semantic similarity. We use the concept of semantic proximity, which is essentially an abstraction/mapping between the domains of the two objects associated with the context of comparison. An explicit though partial context representation is proposed and the specificity relationship between contexts is defined. The contexts are organized as a meet semi-lattice and associated operations like the greatest lower bound (glb) are defined. The context of comparison and the type of abstractions used to relate the two objects form the basis of a semantic taxonomy. At the semantic level, the intensional description of database objects provided by the context is expressed in a description logic language. Schema correspondences are used to store mappings from the semantic level to the data level and are associated with the respective contexts. Inferences about database content at the federation level are modeled as changes in the context and the associated schema correspondences. We try to reconcile the dual (schematic and semantic) perspectives by: enumerating possible semantic similarities between objects having schema and data conflicts, and modeling schema correspondences as the projection of semantic proximity with respect to context.
Semantic Integration of Semistructured and Structured Data Sources
SIGMOD Record, 1999
Cited by 126 (17 self)
this paper is to describe the MOMIS [4, 5] (Mediator envirOnment for Multiple Information Sources) approach to the integration and query of multiple, heterogeneous information sources, containing structured and semistructured data. MOMIS has been conceived as a joint collaboration between University of Milano and Modena in the framework of the INTERDATA national research project, aiming at providing methods and tools for data management in Internet-based information systems. Like other integration projects [1, 10, 14], MOMIS follows a "semantic approach" to information integration based on the conceptual schema, or metadata, of the information sources, and on the following architectural elements: i) a common object-oriented data model, defined according to the ODL_I3 language, to describe source schemas for integration purposes. The data model and ODL_I3 have been defined in MOMIS as a subset of the ODMG-93 ones, following the proposal for a standard mediator language developed by the I

