Results 1 -
6 of
6
Indexing and Querying XML Data for Regular Path Expressions
- IN VLDB
, 2001
"... With the advent of XML as a standard for data representation and exchange on the Internet, storing and querying XML data becomes more and more important. Several XML query languages have been proposed, and the common feature of the languages is the use of regular path expressions to query XML ..."
Abstract
-
Cited by 265 (9 self)
- Add to MetaCart
With the advent of XML as a standard for data representation and exchange on the Internet, storing and querying XML data becomes more and more important. Several XML query languages have been proposed, and the common feature of the languages is the use of regular path expressions to query XML data. This poses a new challenge concerning indexing and searching XML data, because conventional approaches based on tree traversals may not meet the processing requirements under heavy access requests. In this paper, we propose a new system for indexing and storing XML data based on a numbering scheme for elements. This numbering scheme quickly determines the ancestor-descendant relationship between elements in the hierarchy of XML data. We also propose several algorithms for processing regular path expressions, namely, (1) ##-Join for searching paths from an element to another, (2) ##-Join for scanning sorted elements and attributes to find element-attribute pairs, and (3) ##-Join for finding Kleene-Closure on repeated paths or elements. The ##-Join algorithm is highly effective particularly for searching paths that are very long or whose lengths are unknown. Experimental results from our prototype system implementation show that the proposed algorithms can process XML queries with regular path expressions by up to an or- # This work was sponsored in part by National Science Foundation CAREER Award (IIS-9876037) and Research Infrastructure program EIA-0080123. The authors assume all responsibility for the contents of the paper. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its...
Effective keyword search in XML documents based on MIU
- In Proc. of DASFAA Conference
, 2006
"... Abstract. Keyword search is an effective approach for most users to search for information because they do not need to learn complex query languages or the underlying structures of the data. This paper focuses on effective keyword search in XML documents which are modeled as labeled trees. We first ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
Abstract. Keyword search is an effective approach for most users to search for information because they do not need to learn complex query languages or the underlying structures of the data. This paper focuses on effective keyword search in XML documents which are modeled as labeled trees. We first analyze the problems caused by the refinement of result granularity during XML keyword search and then propose to partition an XML document into XML fragments with the granularity of Minimal Information Unit (MIU). Furthermore, we present efficient index structures and the corresponding search algorithms. Finally, our comprehensive experiments demonstrate the benefits of our method over previously proposed methods in terms of result quality, index size and execution time. 1
Top-k Keyword Search over Probabilistic XML Data
"... far more expensive to answer a keyword query over probabilistic XML data due to the consideration of possible world semantics. In this paper, we firstly define the new problem of studying top-k keyword search over probabilistic XML data, which is to retrieve k SLCA results with the k highest probabi ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
far more expensive to answer a keyword query over probabilistic XML data due to the consideration of possible world semantics. In this paper, we firstly define the new problem of studying top-k keyword search over probabilistic XML data, which is to retrieve k SLCA results with the k highest probabilities of existence. And then we propose two efficient algorithms. The first algorithm PrStack can find k SLCA results with the k highest probabilities by scanning the relevant keyword nodes only once. To further improve the efficiency, we propose a second algorithm EagerTopK based on a set of pruning properties which can quickly prune unsatisfied SLCA candidates. Finally, we implement the two algorithms and compare their performance with analysis of extensive experimental results. I.
X-Binder: Path Combining System of XML Documents Based on RDBMS
"... Abstract. With the increasing use of XML, considerable research is being conducted on the XML document management systems for more efficient storage and searching of XML documents. Depending on the base systems, these researches can be classified into object-oriented DBMS (OODBMS) and relational DBM ..."
Abstract
- Add to MetaCart
Abstract. With the increasing use of XML, considerable research is being conducted on the XML document management systems for more efficient storage and searching of XML documents. Depending on the base systems, these researches can be classified into object-oriented DBMS (OODBMS) and relational DBMS (RDBMS). OODBMS-based systems are better suited to reflect the structure of XML-documents than RDBMS-based ones. However, using an XML parser to map the contents of documents to relational tables is a better way to construct a stable and effective XML document management system. The proposed X-Binder system uses an RDBMS-based inverted index; this guarantees high searching speed but wastes considerable storage space. To avoid this, the proposed system incorporates a path combining module agent that combines paths with sibling relations, and stores them in a single row. Performance evaluation revealed that the proposed system reduces storage wastage and search time. 1
Keyword Search in Databases
- IEEE Data Engineering Bulletin
, 2001
"... Querying using keywords is easily the most widely used form of querying today. While keyword searching is widely used to search documents on the Web, querying of databases currently relies on complex query languages that are inappropriate for casual end-users, since they are complex and hard to lear ..."
Abstract
- Add to MetaCart
Querying using keywords is easily the most widely used form of querying today. While keyword searching is widely used to search documents on the Web, querying of databases currently relies on complex query languages that are inappropriate for casual end-users, since they are complex and hard to learn. Given the popularity of keyword search, and the increasing use of databases as the back end for data published on the Web, the need for querying databases using keywords is being increasingly felt. One key problem in applying document or web keyword search techniques to databases is that information related to a single answer to a keyword query may be split across multiple tuples in different relations.

