Algorithmics and Applications of Tree and Graph Searching
In Symposium on Principles of Database Systems, 2002
Cited by 89 (8 self)
Modern search engines answer keyword-based queries extremely efficiently. The impressive speed is due to clever inverted index structures, caching, a domain-independent knowledge of strings, and thousands of machines. Several research efforts have attempted to generalize keyword search to keytree and keygraph searching, because trees and graphs have many applications in next-generation database systems. This paper surveys both algorithms and applications, giving some emphasis to our own work.
Integrity Constraints for XML
1999
Cited by 79 (12 self)
In this paper, we extend XML DTDs with several classes of integrity constraints and investigate the complexity of reasoning about these constraints. The constraints range over keys, foreign keys, inverse constraints as well as ID constraints for capturing the semantics of object identities. They improve semantic specifications and provide a better reference mechanism for native XML applications. They are also useful in information exchange and data integration for preserving the semantics of data originating in relational and object-oriented databases. We establish complexity and axiomatization results for the (finite) implication problems associated with these constraints. In addition, we study implication of more general constraints, such as functional, inclusion and inverse constraints defined in terms of navigation paths.
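As a rough illustration of what it means for an XML document to satisfy a key or a foreign key, the sketch below checks both on a toy document. It is only an assumption-laden example: the element and attribute names (`book`, `isbn`, `ref`, `to`) and both helper functions are hypothetical, and the code checks satisfaction on one document rather than the implication problems the abstract is about.

```python
import xml.etree.ElementTree as ET

def key_violations(root, tag, attr):
    """Return attribute values shared by more than one <tag> element."""
    seen, dupes = set(), set()
    for elem in root.iter(tag):
        value = elem.get(attr)
        if value is None:
            continue  # elements without the attribute are ignored here
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return dupes

def foreign_key_violations(root, src_tag, src_attr, dst_tag, dst_attr):
    """Return src values that match no dst_attr value on any <dst_tag>."""
    targets = {e.get(dst_attr) for e in root.iter(dst_tag)}
    targets.discard(None)
    missing = set()
    for elem in root.iter(src_tag):
        value = elem.get(src_attr)
        if value is not None and value not in targets:
            missing.add(value)
    return missing

# Hypothetical toy document: isbn "1" is duplicated, ref "9" dangles.
doc = ET.fromstring(
    '<db><book isbn="1"/><book isbn="2"/><book isbn="1"/>'
    '<ref to="2"/><ref to="9"/></db>'
)
dup_keys = key_violations(doc, "book", "isbn")
dangling = foreign_key_violations(doc, "ref", "to", "book", "isbn")
```

Here `dup_keys` collects the values that break the key on `book.isbn`, and `dangling` collects the `ref.to` values that point at no book.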
Rewriting of Regular Expressions and Regular Path Queries
2002
Cited by 66 (23 self)
Recent work on semi-structured data has revitalized the interest in path queries, i.e., queries that ask for all pairs of objects in the database that are connected by a path conforming to a certain specification, in particular to a regular expression. Also, in semi-structured data, as well as in data integration, data warehousing, and query optimization, the problem of view-based query rewriting is receiving much attention: Given a query and a collection of views, generate a new query which uses the views and provides the answer to the original one. In this paper we address the problem of view-based query rewriting in the context of semi-structured data. We present a method for computing the rewriting of a regular expression E in terms of other regular expressions. The method computes the exact rewriting (the one that defines the same regular language as E) if it exists, or the rewriting that defines the maximal language contained in the one defined by E, otherwise. We present a complexity analysis of both the problem and the method, showing that the latter is essentially optimal. Finally, we illustrate how to exploit the method for view-based rewriting of regular path queries in semi-structured data. The complexity results established for the rewriting of regular expressions apply also to the case of regular path queries.
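To make the notion of a regular path query concrete, here is a minimal sketch that evaluates one over an edge-labeled graph by searching the product of the graph and an automaton for the expression. It illustrates query evaluation only, not the view-based rewriting method of the paper. The function name `eval_rpq`, the hand-built automaton for `a b*`, and the toy graph are all assumptions made for this example.

```python
from collections import deque

def eval_rpq(edges, nfa_start, nfa_accept, nfa_delta):
    """Return all (x, y) such that some path x -> y spells an accepted word.

    edges: iterable of (src, label, dst) triples.
    nfa_delta: dict mapping (state, label) -> set of next states.
    """
    adj, nodes = {}, set()
    for src, label, dst in edges:
        adj.setdefault(src, []).append((label, dst))
        nodes.update((src, dst))

    answers = set()
    for source in nodes:
        # Breadth-first search over (graph node, automaton state) pairs.
        start = (source, nfa_start)
        seen = {start}
        queue = deque([start])
        while queue:
            node, state = queue.popleft()
            if state in nfa_accept:
                answers.add((source, node))
            for label, nxt in adj.get(node, []):
                for nstate in nfa_delta.get((state, label), ()):
                    pair = (nxt, nstate)
                    if pair not in seen:
                        seen.add(pair)
                        queue.append(pair)
    return answers

# Hypothetical automaton for the expression  a b*  (state 1 is accepting).
delta = {(0, "a"): {1}, (1, "b"): {1}}
graph = [("x", "a", "y"), ("y", "b", "z"), ("z", "c", "w")]
pairs = eval_rpq(graph, 0, {1}, delta)
```

Each search visits a (node, state) pair at most once, so the sketch terminates even on cyclic graphs.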
Schemas for Integration and Translation of Structured and Semi-Structured Data
In Proceedings of the International Conference on Database Theory, 1999
Cited by 62 (5 self)
Introduction. The Web is emerging as a universal data repository, offering access to sources whose data organization varies from strictly structured databases to almost completely unstructured pages, and everything in between. Consequently, much research has recently focused on data integration and data translation systems [10, 6, 9, 8, 17, 13, 2, 19], whose goals are to allow applications to utilize data from many sources, with possibly widely varying formats. These research efforts have established a common data model of semistructured data, for uniformly representing data from any source. Recently, however, it is being realized that having a common schema model is also beneficial, and even necessary, in translation and integration systems to support tasks such as query formulation, decomposition and optimization, or declarative specification of data translation. As an example, which we use for motivation throughout the paper, recently suggested tools for data translation [2, 11, 19
Flexible Queries over Semistructured Data
In PODS, 2001
Cited by 36 (4 self)
Flexible queries facilitate, in a novel way, easy and concise querying of databases that have varying structures. Two different semantics, flexible and semiflexible, are introduced and investigated. The complexity of evaluating queries under the two semantics is analyzed. Query evaluation is polynomial in the size of the query, the database and the result in the following two cases. First, a semiflexible DAG query and a tree database. Second, a flexible tree query and a database that is any graph. Query containment and equivalence are also investigated. For the flexible semantics, query equivalence is always polynomial. For the semiflexible semantics, query equivalence is polynomial for DAG queries and exponential when the queries have cycles. Under the semiflexible and flexible semantics, two databases could be equivalent even when they are not isomorphic. Database equivalence is formally defined and characterized. The complexity of deciding equivalences among databases is analyzed. The implications of database equivalence on query evaluation are explained.
Interaction between Path and Type Constraints
In Proceedings of ACM Symposium on Principles of Database Systems (PODS), 1999
Cited by 34 (15 self)
This paper investigates that interaction. In particular it studies constraint implication problems, which are important both in understanding the semantics of type/constraint systems and in query optimization. It shows that path constraints interact with types in a highly intricate way. For that purpose a number of results on path constraint implication are established in the presence and absence of type systems. These results demonstrate that adding a type system may in some cases simplify reasoning about path constraints and in other cases make it harder. For example, it is shown that there is a path constraint implication problem that is decidable in PTIME in the untyped context, but that becomes undecidable when a type system is added. On the other hand, there is an implication problem that is undecidable in the untyped context, but becomes not only decidable in cubic time but also finitely axiomatizable when a type system is imposed.

