Querying Semi-Structured Data, 1997
Cited by 467 (19 self)
The amount of data of all kinds available electronically has increased dramatically in recent years. The data resides in different forms, ranging from unstructured data in file systems to highly structured data in relational database systems. Data is accessible through a variety of interfaces including Web browsers, database query languages, application-specific interfaces, or data exchange formats. Some of this data is raw data, e.g. images or sound. Some of it has structure, even if that structure is often implicit and not as rigid or regular as that found in standard database systems. Sometimes the structure exists but has to be extracted from the data. Sometimes it exists but we prefer to ignore it for certain purposes, such as browsing. We call semi-structured data the data that is (from a particular viewpoint) neither raw data nor strictly typed, i.e., not table-oriented as in a relational model or sorted-graph as in object databases...
Object fusion in mediator systems - INTERNATIONAL CONFERENCE ON VERY LARGE DATA BASES, 1996
Cited by 155 (29 self)
One of the main tasks of mediators is to fuse information from heterogeneous information sources. This may involve, for example, removing redundancies and resolving inconsistencies in favor of the most reliable source. The problem becomes harder when the sources are unstructured/semistructured and we do not have complete knowledge of their contents and structure. In this paper we show how many common fusion operations can be specified non-procedurally and succinctly. The key to our approach is to assign semantically meaningful object ids to objects as they are "imported" into the mediator.
Querying Documents in Object Databases, 1997
Cited by 82 (13 self)
We consider the problem of storing and accessing documents (SGML and HTML, in particular) using database technology. To specify the database image of documents, we use structuring schemas that consist of grammars annotated with database programs. To query documents, we introduce an extension of OQL, the ODMG standard query language for object databases. Our extension (named OQL-doc) allows querying documents without precise knowledge of their structure, using in particular generalized path expressions and pattern matching. This allows us to introduce navigational and information retrieval styles of accessing data into a declarative language (in the style of SQL or OQL). Query processing in the context of documents and path expressions leads to challenging implementation issues. We extend an object algebra with new operators to deal with generalized path expressions. We then consider two essential complementary optimization techniques: 1. we show that almost standard database optim...
Incremental Maintenance for Materialized Views over Semistructured Data, 1998
Cited by 60 (6 self)
Semistructured data is not strictly typed like relational or object-oriented data and may be irregular or incomplete. It often arises in practice, e.g., when heterogeneous data sources are integrated or data is taken from the World Wide Web. Views over semistructured data can be used to filter the data and to restructure (or provide structure to) it. To achieve fast query response time, these views are often materialized. This paper studies incremental maintenance techniques for materialized views over semistructured data. We use the graph-based data model OEM and the query language Lorel, developed at Stanford, as the framework for our work. We propose a new algorithm that produces a set of queries that compute the changes to the view based upon a change to the source. We develop an analytic cost model and compare the cost of executing our incremental maintenance algorithm to that of recomputing the view. We show that for nearly all types of database updates, it is more efficient to a...
K2/Kleisli and GUS: Experiments in Integrated Access to Genomic Data Sources, 2000
Cited by 52 (4 self)
The integration of heterogeneous data sources and software systems is a major issue in the biomedical community and several approaches have been explored: linking databases, "on-the-fly" integration through views, and integration through warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear "winner". Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available to choose the best strategy for a particular application.
Active Views for Electronic Commerce, 1999
Cited by 47 (8 self)
Electronic commerce is emerging as a major Web-supported application. In this paper we argue that database technology can, and should, provide the backbone for a wide range of such applications. More precisely, we present here the ActiveViews system, which, relying on an extensive use of database features including views, active rules (triggers), and enhanced mechanisms for notification, access control and logging/tracing of users' activities, provides the needed basis for electronic commerce. Based on
A Transparent Object-Oriented Schema Change Approach Using View Evolution - In IEEE International Conference on Data Engineering, 1995
Cited by 46 (16 self)
When a database is shared by many users, updates to the database schema are almost always prohibited because there is a risk of making existing application programs obsolete when they run against the modified schema. This paper addresses the problem by integrating schema evolution with view facilities. Each user is assigned his or her own database view, and develops application programs against the view. When new requirements necessitate schema updates for a particular user, then the user specifies schema changes to his personal view rather than to the shared base schema. Our view schema evolution approach then computes a new view schema that reflects the semantics of the desired schema change, and replaces the old view with the new one. This approach provides the means for schema change without affecting other views (and thus without affecting existing application programs). The persistent data is shared by different views of the schema, i.e. by both old as well as newly developed app...
An Equational Chase for Path-Conjunctive Queries, Constraints, and Views - In ICDT, 1999
Cited by 41 (11 self)
We consider the class of path-conjunctive queries and constraints (dependencies) defined over complex values with dictionaries.
Viewing the Semantic Web through RVL Lenses, 2003
Cited by 37 (10 self)
Personalized access and content syndication involving diverse conceptual representations of information resources are two of the key challenges of real-scale Semantic Web (SW) applications, such as eCommerce, e-Learning or e-Science portals. RDF/S nowadays represents the core SW language for creating and exchanging resource descriptions worldwide. Unfortunately, full-fledged view definition languages for the RDF/S data model are still missing. We propose RVL, a view definition language capable of creating not only virtual resource descriptions, but also virtual RDF/S schemas from (meta)classes, properties, as well as resource descriptions available on the Semantic Web. RVL exploits the functional nature and type system of the RQL query language in order to navigate, filter and restructure complex RDF/S schema and resource description graphs.

