Storing semistructured data with STORED
Cited by 214 (8 self)
Systems for managing and querying semistructured-data sources often store data in proprietary object repositories or in a tagged-text format. We describe a technique that can use relational database management systems to store and manage semistructured data. Our technique relies on a mapping between the semistructured data model and the relational data model, expressed in a query language called STORED. When a semistructured data instance is given, a STORED mapping can be generated automatically using data-mining techniques. We are interested in applying STORED to XML data, which is an instance of semistructured data. We show how a document-type-descriptor (DTD), when present, can be exploited to further improve performance.
XRANK: Ranked Keyword Search over XML Documents
, 2003
Cited by 161 (1 self)
We consider the problem of efficiently producing ranked results for keyword search queries over hyperlinked XML documents. Evaluating
Anatomy of a Native XML Base Management System
- VLDB JOURNAL
, 2002
Cited by 74 (28 self)
Several alternatives to manage large XML document collections exist, ranging from file systems over relational or other database systems to specifically tailored XML repositories. In this paper we give a tour of Natix, a database management system designed from scratch for storing and processing XML data. Contrary
Semistructured Data and XML
, 1998
Cited by 59 (1 self)
This paper argues that the research on semistructured data is receiving a new set of challenges with the advent of XML (Extensible Mark-up Language [Bos97, Con98]). This is a new standard approved by the World Wide Web Consortium that many believe will become the de facto data exchange format for the Web. XML supports the electronic exchange of machine-readable data (while HTML is designed primarily for human-readable documents). XML data shares many features of semistructured data: its structure can be irregular, is not always known ahead of time, and may change frequently and without notice. On the other hand, it is easy to convert data from any source into XML, which will make it attractive for organizations to "publish" their information sources in XML, and thus make them available to other XML applications on the Web. For XML applications to reach their full potential, however, we need to build the right tools to process data in this new format. Existing Web tools (browsers, search engines) are oriented toward document operations. For XML we need database operations, like data extraction, data integration, data translation, and data storage. The research done so far on semistructured data may offer some solutions to the database problems posed by XML. For example, the recently proposed query language for XML, called XML-QL [DFF
XML Content Management based on Object-Relational Database Technology
, 2000
Cited by 17 (0 self)
XML (Extensible Markup Language) is a textual markup language designed for the creation of self-describing documents. Such documents contain textual data combined with structural information describing the structure of the textual data. Currently, products and approaches for document-oriented application domains focus mainly on the textual representation when processing and analyzing documents. Usually, they do not take advantage of the availability of structural information and only support some of the relevant aspects of content management. On the other hand, existing research approaches for structure-oriented application domains prefer very fine granularities and give less attention to operations revealing textual document contents.
Query optimization in XML structured-document databases
- THE VLDB JOURNAL
, 2006
Cited by 13 (0 self)
While the information published in the form of XML-compliant documents keeps fast mounting up, efficient and effective query processing and optimization for XML have now become more important than ever. This article reports our recent advances in XML structured-document query optimization. In this article, we elaborate on a novel approach and the techniques developed for XML query optimization. Our approach performs heuristic-based algebraic transformations on XPath queries, represented as PAT algebraic expressions, to achieve query optimization. This article first presents a comprehensive set of general equivalences with regard to XML documents and XML queries. Based on these equivalences, we developed a large set of deterministic algebraic transformation rules for XML query optimization. Our approach is unique, in that it performs exclusively deterministic transformations on queries for fast optimization. The deterministic nature of the proposed approach straightforwardly renders high optimization efficiency and simplicity in implementation. Our approach is a logical-level one, which is independent of any particular storage model. Therefore, the optimizers developed based on our approach can be easily adapted to a broad range of XML data/information servers to achieve fast query optimization. Experimental study confirms the validity and effectiveness of the proposed approach.
Storing Semistructured Data in Relations
- In Proceedings of the Workshop on Query Processing for Semistructured Data and Non-Standard Data Formats
Cited by 7 (0 self)
In this paper we argue that one can store semistructured data in relational format, by exploiting the regularities inherent in existing semistructured data instances. "Most" of the data will be stored in relational format: the outliers, and possible future insertions, will be still stored in a self-describing way. We propose to use data mining techniques to extract a "good" relational schema for a given semistructured data instance. Our algorithm accepts a variety of input parameters, such as maximum number of relations allowed, maximum number of attributes per relation, and, optionally, a collection of queries on the semistructured data for which the relational storage has to be optimized. Experimental results on the DBLP data show that around 90% of the data can be stored in relational format. The techniques described here are presented in more detail in [DFS98].
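The split between a relational part and a self-describing remainder can be illustrated with a minimal Python sketch. This is not the algorithm from the paper: a plain frequency threshold stands in for the data-mining step, only a single relation is produced, and the names `split_relational`, `max_attributes`, and `min_support` are hypothetical.

```python
from collections import Counter


def split_relational(objects, max_attributes=3, min_support=0.5):
    """Split dict-shaped objects into one relation plus a self-describing overflow.

    Assumption: a label becomes a column when it occurs in at least
    `min_support` of the objects; this replaces the paper's mining step.
    """
    counts = Counter(label for obj in objects for label in obj)
    threshold = min_support * len(objects)
    # Order labels by descending frequency, then by name, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    columns = [label for label, n in ranked if n >= threshold][:max_attributes]

    rows, overflow = [], []
    for oid, obj in enumerate(objects):
        # Regular part: one row per object, None where a column is missing.
        rows.append((oid,) + tuple(obj.get(c) for c in columns))
        # Outliers: kept as (object id, label, value) triples.
        for label, value in obj.items():
            if label not in columns:
                overflow.append((oid, label, value))
    return columns, rows, overflow


if __name__ == "__main__":
    records = [
        {"author": "A", "title": "T1", "year": 1998},
        {"author": "B", "title": "T2", "year": 1999},
        {"author": "C", "title": "T3", "note": "draft"},
    ]
    columns, rows, overflow = split_relational(records)
    print(columns)
    print(rows)
    print(overflow)
```

Lowering `max_attributes` pushes more label/value pairs into the overflow triples, which mirrors the trade-off the input parameters in the abstract control.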
An Open Electronic Marketplace through Agent-based Workflows: MOPPET
- MOPPET, International Journal on Digital Libraries
, 2000
Cited by 6 (1 self)
We propose an electronic marketplace architecture, called MOPPET, where the commerce processes in the marketplace are modeled as adaptable agent-based workflows. The higher level of abstraction provided by the workflow technology makes the customization of electronic commerce processes for different users possible. Agent-based implementation, on the other hand, provides for a highly reusable component-based workflow architecture as well as negotiation ability and the capability to adapt to dynamic changes in the environment. Agent communication is handled through Knowledge Query and Manipulation Language (KQML). A workflow-based architecture also makes it possible for complete modeling of electronic commerce processes by allowing involved parties to be able to invoke already existing applications or to define new tasks and to restructure the control and data flow among the tasks to create custom built process definitions. In the proposed architecture all data exchanges are realized thr...
Combining Pat-Trees and Signature Files for Query Evaluation in Document Databases
- PROC. 10TH INT. CONF. ON DB & EXPERT SYSTEMS APPLIC
, 1999
Cited by 6 (0 self)
In this paper, a new indexing technique to support the query evaluation in document databases is proposed. The key idea of the method is the combination of the technique of pat-trees with signature files. While the signature files are built to expedite the traversal of object hierarchies, the pat-trees are constructed to speed up both the signature file searching and the text scanning. In this way, high performance can be achieved.
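The signature-file half of this idea can be sketched in a few lines of Python. The sketch assumes superimposed coding over whole words with a 64-bit signature and three bits per word. Those parameters and all function names are illustrative and not taken from the paper. The pat-tree that the paper uses to speed up signature searching and text scanning is not modeled, so a plain linear scan stands in for it.

```python
import hashlib

SIGNATURE_BITS = 64  # assumed signature width
BITS_PER_WORD = 3    # assumed number of bit positions set per word


def word_signature(word):
    """Hash a word to a few bit positions inside a fixed-width signature."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    sig = 0
    for i in range(BITS_PER_WORD):
        sig |= 1 << (digest[i] % SIGNATURE_BITS)
    return sig


def text_signature(text):
    """Superimpose (bitwise OR) the signatures of all words in the text."""
    sig = 0
    for word in text.split():
        sig |= word_signature(word)
    return sig


def build_signatures(documents):
    """Precompute one signature per document."""
    return [text_signature(text) for text in documents]


def search(documents, signatures, query):
    """Return ids of documents that contain every query word."""
    query_sig = text_signature(query)
    query_words = set(query.lower().split())
    hits = []
    for doc_id, doc_sig in enumerate(signatures):
        # Cheap filter: a document can match only if it has all query bits set.
        if (doc_sig & query_sig) != query_sig:
            continue
        # Verification scan removes false drops caused by colliding bits.
        if query_words <= set(documents[doc_id].lower().split()):
            hits.append(doc_id)
    return hits
```

The filter never rejects a true match, because a document containing every query word has every query bit set. It can let non-matching documents through, which is why the verification scan is kept.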
Isolation in XML Bases
- Lehrstuhl für Praktische Informatik III, Universität
, 2001
Cited by 6 (2 self)
The eXtensible Markup Language (XML) is well accepted in many different application areas. As a consequence, there is an increasing need for persistently storing XML documents. As soon as many users and applications work concurrently on the same collection of XML documents --- i.e. an XML base --- isolating accesses and modifications of different transactions becomes an important issue.

