Rewriting of Regular Expressions and Regular Path Queries
, 2002
Cited by 66 (23 self)
Recent work on semi-structured data has revitalized the interest in path queries, i.e., queries that ask for all pairs of objects in the database that are connected by a path conforming to a certain specification, in particular to a regular expression. Also, in semi-structured data, as well as in data integration, data warehousing, and query optimization, the problem of view-based query rewriting is receiving much attention: Given a query and a collection of views, generate a new query which uses the views and provides the answer to the original one. In this paper we address the problem of view-based query rewriting in the context of semi-structured data. We present a method for computing the rewriting of a regular expression E in terms of other regular expressions. The method computes the exact rewriting (the one that defines the same regular language as E) if it exists, or the rewriting that defines the maximal language contained in the one defined by E, otherwise. We present a complexity analysis of both the problem and the method, showing that the latter is essentially optimal. Finally, we illustrate how to exploit the method for view-based rewriting of regular path queries in semi-structured data. The complexity results established for the rewriting of regular expressions apply also to the case of regular path queries.
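To make the notion of a regular path query concrete, here is a minimal sketch of how such a query can be evaluated over an edge-labelled graph. It illustrates only the query semantics described in the abstract, not the rewriting method of the paper. The function name `eval_rpq`, the explicit NFA encoding of the regular expression, and the sample graph are all assumptions introduced for illustration.

```python
from collections import deque


def eval_rpq(edges, nfa, start, accepting):
    """Return all pairs (x, y) joined by a path whose labels the NFA accepts.

    edges:     iterable of (source, label, target) triples (hypothetical data model)
    nfa:       dict mapping (state, label) -> set of successor states
    start:     initial NFA state
    accepting: set of accepting NFA states
    """
    adjacency = {}
    nodes = set()
    for source, label, target in edges:
        adjacency.setdefault(source, []).append((label, target))
        nodes.update((source, target))

    answers = set()
    for x in nodes:
        # Breadth-first search over (graph node, NFA state) pairs.
        seen = {(x, start)}
        queue = deque(seen)
        while queue:
            node, state = queue.popleft()
            if state in accepting:
                answers.add((x, node))
            for label, successor in adjacency.get(node, []):
                for next_state in nfa.get((state, label), ()):
                    if (successor, next_state) not in seen:
                        seen.add((successor, next_state))
                        queue.append((successor, next_state))
    return answers


# Hypothetical database and the query "a b*" encoded as a two-state NFA.
EDGES = [("n1", "a", "n2"), ("n2", "b", "n3"), ("n3", "b", "n4"), ("n1", "c", "n4")]
NFA = {(0, "a"): {1}, (1, "b"): {1}}
RESULT = eval_rpq(EDGES, NFA, start=0, accepting={1})
```

Searching the product of the graph and the automaton keeps the evaluation finite even when the graph has cycles, because each (node, state) pair is visited at most once per start node.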
Form-Based Proxy Caching for Database-Backed Web Sites: Keywords and Functions
, 2008
Cited by 48 (3 self)
Web caching proxy servers are essential for improving web performance and scalability, and recent research has focused on making proxy caching work for database-backed web sites. In this paper, we explore a new proxy caching framework that exploits the query semantics of HTML forms. We identify two common classes of form-based queries from real-world database-backed web sites, namely, keyword-based queries and function-embedded queries. Using typical examples of these queries, we study two representative caching schemes within our framework: (i) traditional passive query caching, and (ii) active query caching, in which the proxy cache can service a request by evaluating a query over the contents of the cache. Results from our experimental implementation show that our form-based proxy is a general and flexible approach that efficiently enables active caching schemes for database-backed web sites. Furthermore, handling query containment at the proxy yields significant performance advantages over passive query caching, but extending the power of the active cache to do full semantic caching appears to be less generally effective.
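The difference between passive and active query caching can be sketched for keyword-based queries as follows. This is a minimal illustration, not the proxy described in the paper: the class `KeywordCache`, its methods, and the document model are hypothetical. It further assumes conjunctive keyword semantics, documents that are plain strings matched on whitespace-separated words, and cached results that are complete rather than truncated.

```python
class KeywordCache:
    """Toy proxy cache for conjunctive keyword queries (illustrative only)."""

    def __init__(self):
        # Maps a frozenset of lower-cased keywords to the cached result list.
        self._entries = {}

    def store(self, keywords, documents):
        self._entries[frozenset(k.lower() for k in keywords)] = list(documents)

    def lookup(self, keywords):
        wanted = frozenset(k.lower() for k in keywords)

        # Passive caching: serve only an exact repeat of a cached query.
        if wanted in self._entries:
            return "passive", list(self._entries[wanted])

        # Active caching: a cached query with fewer keywords contains the new
        # query, so the answer can be computed by filtering the cached result.
        for cached, documents in self._entries.items():
            if cached <= wanted:
                extra = wanted - cached
                hits = [
                    doc for doc in documents
                    if all(k in doc.lower().split() for k in extra)
                ]
                return "active", hits

        return "miss", None


cache = KeywordCache()
cache.store(["xml"], ["XML query rewriting", "XML views and caching", "XML schema"])
exact = cache.lookup(["XML"])
contained = cache.lookup(["xml", "caching"])
missing = cache.lookup(["sql"])
```

Here the request for `xml caching` is answered from the cached result of `xml` without contacting the origin server, which is the query containment case the abstract reports as beneficial.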
Representing and querying XML with incomplete information
- ACM TODS
, 2006
Cited by 46 (5 self)
We study the representation and querying of XML with incomplete information. We consider a simple model for XML data and their DTDs, a very simple query language, and a representation system for incomplete information in the spirit of the representation systems developed by Imielinski and Lipski [1984] for relational databases. In the scenario we consider, the incomplete information about an XML document is continuously enriched by successive queries to the document. We show that our representation system can represent partial information about the source document acquired by successive queries, and that it can be used to intelligently answer new queries. We also consider the impact on complexity of enriching our representation system or query language with additional features. The results suggest that our approach achieves a practically appealing balance between expressiveness and tractability.
Information Integration Using Contextual Knowledge and Ontology Merging
, 2003
Cited by 39 (5 self)
With the advances in telecommunications and the introduction of the Internet, information systems have achieved physical connectivity but have yet to establish logical connectivity. Lack of logical connectivity often invites disaster, as in the case of the Mars Orbiter, which was lost because one team used metric units and the other English units while exchanging critical maneuver data. In this thesis, we focus on two intertwined subproblems of logical connectivity, namely data extraction and data interpretation, in the domain of heterogeneous information systems. The first challenge, data extraction, is about making it possible to easily exchange data among semi-structured and structured information systems. We describe the design and implementation of a general-purpose, regular-expression-based Caméléon wrapper engine with an integrated capabilities-aware planner/optimizer/executioner. The second challenge, data interpretation, deals with the existence of heterogeneous contexts, whereby each source of information and each potential receiver of that information may operate with a different context, leading to large-scale semantic heterogeneity. We extend the existing formalization of the COIN framework with new logical formalisms and features to handle larger
Filter Similarities in Content-Based Publish/Subscribe Systems
- In International Conference on Architecture of Computing Systems (ARCS
, 2002
Cited by 34 (8 self)
Matching notifications to subscriptions and routing notifications from producers to interested consumers are the main problems in large-scale publish/subscribe systems.
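The matching problem named in this abstract can be sketched in a few lines. This is an illustrative assumption about the data model, not the scheme from the paper: filters are taken to be conjunctions of closed numeric ranges over named attributes, and the names `matches`, `route`, and `SUBSCRIPTIONS` are hypothetical.

```python
def matches(filter_, notification):
    """True if every range constraint of the filter holds for the notification."""
    return all(
        attribute in notification and low <= notification[attribute] <= high
        for attribute, (low, high) in filter_.items()
    )


def route(subscriptions, notification):
    """Return the sorted names of all consumers whose filter matches."""
    return sorted(
        consumer
        for consumer, filter_ in subscriptions.items()
        if matches(filter_, notification)
    )


# Hypothetical subscriptions: attribute -> (lowest, highest) accepted value.
SUBSCRIPTIONS = {
    "alice": {"price": (0, 100)},
    "bob": {"price": (50, 200), "volume": (1000, 10**9)},
}
DELIVERED = route(SUBSCRIPTIONS, {"price": 75, "volume": 500})
```

This naive version tests every subscription against every notification; exploiting similarities between filters, as the title of the paper suggests, is one way to avoid that linear scan.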
Rewriting XPath Queries Using Materialized Views
- In VLDB
, 2005
Cited by 31 (0 self)
As a simple XML query language with sufficient expressive power, XPath has become very popular. To expedite the evaluation of XPath queries, we consider the problem of rewriting XPath queries using materialized XPath views. This problem is very important and arises not only from query optimization on the server side but also from semantic caching on the client side. We consider the problem of deciding whether there exists a rewriting of a query using XPath views and the problem of finding minimal rewritings. We first consider these two problems for a very practical XPath fragment containing the descendant, child, wildcard and branch features. We show that the rewriting existence problem is coNP-hard and the problem of finding minimal rewritings is $\Sigma_3^p$. We also consider these two rewriting problems for three subclasses of this XPath fragment, each of which contains the child feature and two of the descendant, wildcard and branch features, and show that both rewriting problems can be solved in polynomial time. Finally, we give an algorithm for finding minimal rewritings, which is sound for the XPath fragment, and is also complete and runs in polynomial time for its three subclasses.
Consistent Query Answers in Virtual Data Integration Systems
- In Inconsistency Tolerance, Springer LNCS 3300
, 2005
Cited by 30 (18 self)
When data sources are virtually integrated there is no common and centralized mechanism for maintaining global consistency. In consequence, it is likely that inconsistencies with respect to certain global integrity constraints (ICs) will occur. In this chapter we consider the problem of defining and computing those answers that are consistent with respect to the global ICs when global queries are posed to virtual data integration systems whose sources are specified following the local-as-view approach.
Ontology of Integration and Integration of Ontologies
- Procs. of the 2001 Description Logic Workshop (DL 2001
Cited by 29 (0 self)
One of the basic problems in the development of techniques for the semantic web is the integration of ontologies. In this paper we deal with a situation where we have various local ontologies, developed independently from each other, and we are required to build an integrated, global ontology as a means for extracting information from the local ones. In this context, the problem of how to specify the mapping between the global ontology and the local ontologies is a fundamental one, and its solution is essential for establishing an ontology of integration. Description Logics (DLs) are an ideal candidate to formalize ontologies, due to their ability to express complex relationships between concepts. We argue, however, that, for capturing the mapping between different ontologies, the direct use of a DL, even a very expressive one, is not sufficient, and it is necessary to resort to more flexible mechanisms based on the notion of query. Also, we elaborate on the observation that, in the semantic web, the case of mutually inconsistent local ontologies will be very common, and we present the basic ideas for extending the integration framework with suitable nonmonotonic features for dealing with this case.
IF-Map: An Ontology-Mapping Method Based on Information-Flow Theory
, 2003
Cited by 24 (8 self)
In order to tackle the need of sharing knowledge within and across organisational boundaries, the last decade has seen researchers both in academia and industry advocating for the use of ontologies as a means for providing a shared understanding of common domains. But with the generalised use of large distributed environments such as the World Wide Web came the proliferation of many different ontologies, even for the same or similar domain, hence setting forth a new need of sharing: that of sharing ontologies. In addition, if visions such as the Semantic Web are ever going to become a reality, it will be necessary to provide as much automated support as possible to the task of mapping different ontologies. Although many efforts in ontology mapping have already been carried out, we have noticed that few of them are based on strong theoretical grounds and on principled methodologies. Furthermore, many of them are based only on syntactical criteria. In this paper we present a theory and method for automated ontology mapping based on channel theory, a mathematical theory of semantic information flow.

