Taxonomy of XML Schema Languages using Formal Language Theory
- EXTREME MARKUP LANGUAGES
, 2001
Cited by 169 (5 self)
On the basis of regular tree grammars, we present a formal framework for XML schema languages. This framework helps to describe, compare, and implement such schema languages in a rigorous manner. Our main results are as follows: (1) a simple framework to study three classes of tree languages (local, single-type, and regular); (2) classification and comparison of schema languages (DTD, W3C XML Schema, and RELAX NG) based on these classes; (3) efficient document validation algorithms for these classes; and (4) other grammatical concepts and advanced validation algorithms relevant to an XML model (e.g., binarization, derivative-based validation).
Regular Expression Types for XML
, 2003
Cited by 157 (18 self)
We propose regular expression types as a foundation for statically typed XML processing languages. Regular expression types, like most schema languages for XML, introduce regular expression notations such as repetition (*), alternation (|), etc., to describe XML documents. The novelty of our type system is a semantic presentation of subtyping, as inclusion between the sets of documents denoted by two types. We give several examples illustrating the usefulness of this form of subtyping in XML processing. The decision problem for the subtype relation reduces to the inclusion problem between tree automata, which is known to be EXPTIME-complete. To avoid this high complexity in typical cases, we develop a practical algorithm that, unlike classical algorithms based on determinization of tree automata, checks the inclusion relation by a top-down traversal of the original type expressions. The main advantage of this algorithm is that it can exploit the property that type expressions being compared often share portions of their representations. Our algorithm is a variant of Aiken and Murphy’s set-inclusion constraint solver, to which are added several new implementation techniques, correctness proofs, and preliminary performance measurements on some small programs in the domain of typed XML processing.
Reasoning about XML Schema Languages using Formal Language Theory
- Technical Report, IBM Almaden Research Center, RJ# 10197, Log# 95071
, 2000
Cited by 35 (4 self)
A mathematical framework using formal language theory to describe and compare XML schema languages is presented. Our framework uses the work in two related areas -- regular tree languages [CDG+97] and ambiguity in regular expressions [BEGO71, BKW98]. Using this work as well as the content in two classical references [HU79, AU79], we present the following results: (1) a normal form representation for regular tree grammars, (2) a framework of marked regular expressions and model groups, and their ambiguities, (3) five subclasses of regular tree grammars and their corresponding languages to describe XML content models: regular tree languages, TD(1) (top-down input scan with 1-vertical lookahead), single-type constraint languages, TDLL(1) (top-down and left-right input scan with 1-vertical and 1-horizontal lookaheads), and local tree languages, (4) the closure properties of the five language classes under boolean set operations, (5) a classification and comparison of a few ...
CPI: Constraints-Preserving Inlining Algorithm for Mapping XML DTD to Relational Schema
- J. Data & Knowledge Engineering (DKE)
, 2001
Boolean operations and inclusion test for attribute-element constraints
- In Eighth International Conference on Implementation and Application of Automata
, 2003
Cited by 12 (3 self)
The history of schema languages for XML is an increase of expressiveness. While early schema languages mainly focused on the element structure, Clark first paid an equal attention to attributes by allowing both element and attribute constraints in a single regular expression. In this paper, we investigate an algorithmic aspect of Clark's mechanism (called "attribute-element constraints"), namely, intersection and difference operations and inclusion test, which have been proved to be important in static typechecking for XML processing programs. The contributions here are (1) proofs of closure under intersection and difference and decidability of inclusion test and (2) algorithm formulations incorporating a "divide-and-conquer" strategy for avoiding an exponential blow-up for typical inputs.
SchemaPath, a Minimal Extension to XML Schema for Conditional Constraints
, 2004
Cited by 9 (2 self)
In the past few years, a number of constraint languages for XML documents have been proposed. They are collectively called schema languages or validation languages and they comprise, among others, DTD, XML Schema, RELAX NG, Schematron, DSD, xlinkit. One major point of...
Preservation of Digital Data with Self-Validating, Self-Instantiating Knowledge-Based Archives
- ACM SIGMOD Record
, 2001
Cited by 7 (3 self)
Digital archives are dedicated to the long-term preservation of electronic information and have the mandate to enable sustained access despite rapid technology changes. Persistent archives are confronted with heterogeneous data formats, helper applications, and platforms being used over the lifetime of the archive. This is not unlike the interoperability challenges, for which mediators are devised. To prevent technological obsolescence over time and across platforms, a migration approach for persistent archives is proposed based on an XML infrastructure. We extend current archival approaches that build upon standardized data formats and simple metadata mechanisms for collection management, by involving high-level conceptual models and knowledge representations as an integral part of the archive and the ingestion/migration processes. Infrastructure independence is maximized by archiving generic, executable specifications of (i) archival constraints (i.e., “model validators”), and (ii) archival transformations that are part of the ingestion process. The proposed architecture facilitates construction of self-validating and self-instantiating knowledge-based archives. We illustrate our overall approach and report on first experiences using a sample collection from a collaboration
Query Relaxation for XML Model
, 2002
Cited by 6 (0 self)
This dissertation addresses mainly three issues needed to support query relaxation for XML model: framework formalization, extension of existing database techniques, and data conversion between XML and relational models.
Towards Self-Validating Knowledge-Based Archives
- 11th Workshop on Research Issues in Data Engineering (RIDE), Heidelberg, IEEE Computer Society, April 2001, SDSC
, 2001
Cited by 4 (2 self)
Digital archives are dedicated to the long-term preservation of electronic information and have the mandate to enable sustained access despite a rapidly changing information infrastructure. Current archival approaches build upon standardized data formats and simple metadata mechanisms for collection management, but do not involve high-level conceptual models and knowledge representations. This results in serious limitations, not only for expressing various kinds of information and knowledge about the archived data, but also for creating infrastructure-independent, self-validating and self-instantiating archives. To overcome these limitations, we first propose a scalable XML-based archival infrastructure, based on standard tools, and subsequently show how this architecture can be extended to a model-based framework, where higher-level knowledge representations become an integral part of the archive and the ingestion/migration processes. This allows us to maximize infrastructure indepen...
A Few Tips for Good XML Design
, 2000
Cited by 2 (0 self)
The design of an XML (eXtensible Markup Language) specification is one of the essential parts of XML application development. XML users, however, are writing their applications in their own way, and doubts about the quality of XML applications have been raised. In this paper, we list some criteria for good XML design. The quest is to find design techniques that satisfy these criteria. Attempts to recycle existing techniques fail because of the new challenges of XML design. To illustrate the uniqueness of XML design, we present two case studies. These case studies lead to a suggested categorization of XML applications. With the experience from prior research [5], we write down some guidelines for XML design.

