Regular Expression Types for XML
2003
Cited by 157 (18 self)
We propose regular expression types as a foundation for statically typed XML processing languages. Regular expression types, like most schema languages for XML, introduce regular expression notations such as repetition (*), alternation (|), etc., to describe XML documents. The novelty of our type system is a semantic presentation of subtyping, as inclusion between the sets of documents denoted by two types. We give several examples illustrating the usefulness of this form of subtyping in XML processing. The decision problem for the subtype relation reduces to the inclusion problem between tree automata, which is known to be EXPTIME-complete. To avoid this high complexity in typical cases, we develop a practical algorithm that, unlike classical algorithms based on determinization of tree automata, checks the inclusion relation by a top-down traversal of the original type expressions. The main advantage of this algorithm is that it can exploit the property that type expressions being compared often share portions of their representations. Our algorithm is a variant of Aiken and Murphy’s set-inclusion constraint solver, to which are added several new implementation techniques, correctness proofs, and preliminary performance measurements on some small programs in the domain of typed XML processing.
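The idea of subtyping as set inclusion can be illustrated with a small sketch. It is not the algorithm of the paper: it is simplified to flat sequences of element labels (word languages rather than trees) and uses Brzozowski derivatives as a stand-in technique, exploring pairs of type expressions top-down and remembering the pairs already visited. All names here (`sym`, `seq`, `alt`, `star`, `is_subtype`) are hypothetical.

```python
# Sketch only: regular expression "types" over flat label sequences.
# Assumption: trees are ignored, so this checks word-language inclusion.
EMPTY = ("empty",)  # denotes no sequence at all
EPS = ("eps",)      # denotes only the empty sequence

def sym(a):
    return ("sym", a)

def seq(r, s):
    if r == EMPTY or s == EMPTY:
        return EMPTY
    if r == EPS:
        return s
    if s == EPS:
        return r
    return ("seq", r, s)

def alt(r, s):
    # Flatten, deduplicate and sort so that equal unions have one form.
    parts = set()
    for t in (r, s):
        if t[0] == "alt":
            parts.update(t[1])
        elif t != EMPTY:
            parts.add(t)
    if not parts:
        return EMPTY
    if len(parts) == 1:
        return next(iter(parts))
    return ("alt", tuple(sorted(parts, key=repr)))

def star(r):
    if r in (EMPTY, EPS):
        return EPS
    return r if r[0] == "star" else ("star", r)

def nullable(r):
    """True if r accepts the empty sequence."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "sym"):
        return False
    if tag == "seq":
        return nullable(r[1]) and nullable(r[2])
    return any(nullable(t) for t in r[1])  # alt

def deriv(r, a):
    """Type of what may follow after reading label a."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "sym":
        return EPS if r[1] == a else EMPTY
    if tag == "seq":
        d = seq(deriv(r[1], a), r[2])
        return alt(d, deriv(r[2], a)) if nullable(r[1]) else d
    if tag == "star":
        return seq(deriv(r[1], a), r)
    out = EMPTY
    for t in r[1]:  # alt
        out = alt(out, deriv(t, a))
    return out

def labels(r):
    tag = r[0]
    if tag == "sym":
        return {r[1]}
    if tag == "seq":
        return labels(r[1]) | labels(r[2])
    if tag == "star":
        return labels(r[1])
    if tag == "alt":
        return set().union(*(labels(t) for t in r[1]))
    return set()

def is_subtype(r, s):
    """True if every sequence described by r is also described by s."""
    alphabet = labels(r) | labels(s)
    seen, todo = set(), [(r, s)]
    while todo:
        p, q = todo.pop()
        if p == EMPTY or (p, q) in seen:
            continue  # nothing left to check, or pair already visited
        seen.add((p, q))
        if nullable(p) and not nullable(q):
            return False
        for a in alphabet:
            todo.append((deriv(p, a), deriv(q, a)))
    return True

name, email, tel = sym("name"), sym("email"), sym("tel")
narrow = seq(name, star(email))           # name, email*
wide = seq(name, star(alt(email, tel)))   # name, (email | tel)*
```

Under these assumptions, `is_subtype(narrow, wide)` holds while `is_subtype(wide, narrow)` does not, because a sequence containing `tel` belongs only to the wider type.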
XDuce: A Statically Typed XML Processing Language
2002
Cited by 127 (5 self)
In this paper we describe a statically typed XML processing language called XDuce (officially pronounced "transduce"). XDuce is a functional language whose primitive data structures represent XML documents and whose types, called regular expression types, correspond to document schemas. The motivating principle behind its design is that a simple, clean, and powerful type system for XML processing can be based directly on the theory of regular tree automata.
XDuce: A Typed XML Processing Language
In Proc. of Workshop on the Web and Data Bases (WebDB), 2000
Cited by 122 (7 self)
In this paper, we present a preliminary design for a statically typed programming language, XDuce (pronounced "transduce"). XDuce is a tree transformation language, similar in spirit to mainstream functional languages but specialized to the domain of XML processing. Its novel features are regular expression types and a corresponding mechanism for regular expression pattern matching. Regular expression types are a natural generalization of DTDs, describing, as DTDs do, structures in XML documents using regular expression operators (*, ?, |, etc.). Moreover, regular expression types support a simple but powerful notion of subtyping, yielding a substantial degree of flexibility in programming. Regular expression pattern matching is similar to ML pattern matching except that regular expression types can be embedded in patterns, which allows even more flexible matching.
Type-Indexed Rows
2001
Cited by 32 (7 self)
Record calculi use labels to distinguish between the elements of products and sums. This paper presents a novel variation, type-indexed rows, in which labels are discarded and the types of the elements themselves serve as indices. The calculus, TIR, can express tuples, recursive datatypes, monomorphic records, polymorphic extensible records, and closed-world style type-based overloading. Our key application of TIR, however, is to encode the "choice" types of XML, and the "unordered sequence" types of SGML. Indeed, TIR is the kernel of the language XM, a lazy functional language extending XML with polymorphism and higher-order functions. The system is built from rows, equality constraints, membership constraints and constrained parametric polymorphism. The constraint domain enjoys decidable entailment and satisfaction (in EXP). We present a type checking algorithm, and show how TIR may be implemented by a type-directed translation which replaces type-indexing by conven...
XIRQL: An XML Query Language Based on Information Retrieval Concepts
2001
Cited by 31 (2 self)
Most proposals for XML query languages are based on the data-centric view of XML and do not support uncertainty and vagueness, thus being unsuitable for information retrieval (IR) of XML documents. Based on the document-centric view, we present the query language XIRQL which implements IR-related features such as weighting and ranking, relevance-oriented search, datatypes with vague predicates, and structural relativism. XIRQL integrates these features by using ideas from logic-based probabilistic IR models, in combination with concepts from the database area. For processing XIRQL queries, a path algebra is presented which also serves as a starting point for query optimization.
Discovering Frequent Substructures from Hierarchical Semi-structured Data
In Proc. of the 2nd SIAM International Conference on Data Mining (SDM-2002), 2002
Cited by 14 (0 self)
Frequent substructure discovery from a collection of semi-structured objects can serve for storage, browsing, querying, indexing and classification of semi-structured documents. This paper examines the problem of discovering frequent substructures from a collection of hierarchical semi-structured objects of the same type. The use of wildcards is an important aspect of substructure discovery from semi-structured data due to the irregularity and lack of fixed structure of such data. This paper proposes a more general and powerful wildcard mechanism, which allows us to find more complex and interesting substructures than existing techniques. Furthermore, the complexity of structural information of semi-structured data and the use of wildcards make the existing frequent set mining algorithms inapplicable for substructure discovery. In this work, we adopt a vertical format for the storage of semi-structured objects, and adapt a frequent set mining algorithm for our purpose. The application of our approach to real-life data shows that it is very effective.
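To make the notion of a vertical storage format concrete, here is a minimal sketch. It is not the method of the paper: it assumes each object is a nested dictionary, flattens it into root-to-node label paths, and omits wildcards entirely. The vertical index maps every path to the set of object ids containing it, so the support of a set of paths is the size of the intersection of their id sets. The function names and sample data are hypothetical.

```python
from itertools import combinations

def paths(obj, prefix=()):
    """Yield every root-to-node label path of a nested dict."""
    for label, child in obj.items():
        path = prefix + (label,)
        yield path
        if isinstance(child, dict):
            yield from paths(child, path)

def vertical_index(objects):
    """Vertical format: label path -> set of ids of objects containing it."""
    index = {}
    for oid, obj in enumerate(objects):
        for path in paths(obj):
            index.setdefault(path, set()).add(oid)
    return index

def frequent_path_sets(objects, min_support, max_size=2):
    """Sets of paths that co-occur in at least min_support objects."""
    index = vertical_index(objects)
    singles = sorted(p for p, ids in index.items() if len(ids) >= min_support)
    result = {frozenset([p]): index[p] for p in singles}
    for size in range(2, max_size + 1):
        for combo in combinations(singles, size):
            # Support is obtained by intersecting the id sets.
            ids = set.intersection(*(index[p] for p in combo))
            if len(ids) >= min_support:
                result[frozenset(combo)] = ids
    return result

# Hypothetical sample data: three objects of the same general shape.
objects = [
    {"person": {"name": "A", "email": "a@example.org"}},
    {"person": {"name": "B", "tel": "1"}},
    {"person": {"name": "C", "email": "c@example.org"}},
]
found = frequent_path_sets(objects, min_support=2)
```

With this data, the pair of paths `person/name` and `person/email` is supported by objects 0 and 2, while `person/tel` occurs in a single object and is discarded.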
Describing semistructured data
SIGMOD Record, Database Principles Column
Cited by 13 (0 self)
We introduce a rich language of descriptions for semistructured tree-like data, and we explain how such descriptions relate to the data they describe. Various query languages and data schemas can be based on such descriptions.
Semantic subtyping with an SMT solver
2010
Cited by 10 (0 self)
We study a first-order functional language with the novel combination of the ideas of refinement type (the subset of a type to satisfy a Boolean expression) and type-test (a Boolean expression testing whether a value belongs to a type). Our core calculus can express a rich variety of typing idioms; for example, intersection, union, negation, singleton, nullable, variant, and algebraic types are all derivable. We formulate a semantics in which expressions denote terms, and types are interpreted as first-order logic formulas. Subtyping is defined as valid implication between the semantics of types. The formulas are interpreted in a specific model that we axiomatize using standard first-order theories. On this basis, we present a novel type-checking algorithm able to eliminate many dynamic tests and to detect many errors statically. The key idea is to rely on an SMT solver to compute subtyping efficiently. Moreover, interpreting types as formulas allows us to call the SMT solver at run-time to compute instances of types.
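The reading of types as logical formulas can be sketched in a few lines. This is an illustration only, not the calculus of the paper: types are modeled as Python predicates, a type-test is a call to the predicate, and implication is checked by enumerating a small finite universe instead of asking an SMT solver, so a positive answer is meaningful only for that universe. All names are hypothetical.

```python
# Sketch only: types as predicates, subtyping as implication.
# Assumption: a finite universe replaces the SMT validity check.

def Int(v):
    return isinstance(v, int) and not isinstance(v, bool)

def Null(v):
    return v is None

def refine(base, pred):
    """Refinement type: values of base that satisfy pred."""
    return lambda v: base(v) and pred(v)

def union(s, t):
    return lambda v: s(v) or t(v)

def intersection(s, t):
    return lambda v: s(v) and t(v)

def negation(t):
    return lambda v: not t(v)

def is_subtype_on(s, t, universe):
    """s is a subtype of t if s(v) implies t(v) for every v considered."""
    return all(t(v) for v in universe if s(v))

universe = list(range(-50, 51)) + [None, "a", 2.5]

Pos = refine(Int, lambda x: x > 0)
Nat = refine(Int, lambda x: x >= 0)
NullableInt = union(Int, Null)
```

On this universe, `is_subtype_on(Pos, Nat, universe)` holds, whereas `is_subtype_on(Nat, Pos, universe)` fails on the value 0.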
Type System of an Object-Oriented Database Programming Language (Extended Abstract)
ACM Computing Surveys (CSUR), 1999
Cited by 7 (0 self)
In this paper we present the type system of the TIGUKAT database programming language. It is a highly parametric object-oriented type system that combines multiple dispatch with reflexivity, separation of interface and implementation, precise behavior typing, and union and intersection types. We demonstrate the inner workings of the type system by considering a concrete example of type specification in TIGUKAT. We also review type systems of several existing programming languages and conclude that the proposed type system has a unique combination of features particularly suited for object-oriented database programming. This is an expanded version of the extended abstract submitted to DBPL 97.
1 Introduction
In the past two decades, several new database application areas have emerged. These new areas include office automation systems, geographical information systems, CASE tools, medical systems, and CAD/CAM systems. These complicated applications demand new, more ...
Pattern Guards and Transformational Patterns
2000
Cited by 6 (0 self)
We propose three extensions to patterns and pattern matching in Haskell. The first, pattern guards, allows the guards of a guarded equation to match patterns and bind variables, as well as to test boolean conditions. For this we introduce a natural generalisation of guard expressions to guard qualifiers. A frequently occurring special case is that a function should be applied to a matched value, and the result of this is to be matched against another pattern. For this we introduce a syntactic abbreviation, transformational patterns, that is particularly useful when dealing with views. These proposals can be implemented with very modest syntactic and implementation cost. They are upward compatible with Haskell; all existing programs will continue to work. We also offer a third, much more speculative proposal, which provides the transformational-pattern construct with additional power to explicitly catch pattern match failure. We demonstrate the usefulness of the proposed extensions by several examples; in particular, we compare our proposal with views, and we also discuss the use of the new patterns in combination with equational reasoning.

