Results 1 - 10 of 292
The Lorel Query Language for Semistructured Data
- International Journal on Digital Libraries
, 1997
Cited by 631 (25 self)
We present the Lorel language, designed for querying semistructured data. Semistructured data is becoming more and more prevalent, e.g., in structured documents such as HTML and when performing simple integration of data from multiple sources. Traditional data models and query languages are inappropriate, since semistructured data often is irregular, some data is missing, similar concepts are represented using different types, heterogeneous sets are present, or object structure is not fully known. Lorel is a user-friendly language in the SQL/OQL style for querying such data effectively. For wide applicability, the simple object model underlying Lorel can be viewed as an extension of ODMG and the language as an extension of OQL. The main novelties of the Lorel language are: (i) extensive use of coercion to relieve the user from the strict typing of OQL, which is inappropriate for semistructured data
From Structured Documents to Novel Query Facilities
, 1994
Cited by 222 (34 self)
Structured documents (e.g., SGML) can benefit a lot from database support and more specifically from object-oriented database (OODB) management systems. This paper describes a natural mapping from SGML documents into OODBs and a formal extension of two OODB query languages (one SQL-like and the other calculus) in order to deal with SGML document retrieval. Although motivated by structured documents, the extensions of query languages that we present are general and useful for a variety of other OODB applications. A key element is the introduction of paths as first class citizens. The new features make it possible to query data (and to some extent schema) without exact knowledge of the schema in a simple and homogeneous fashion. 1 Introduction Structured documents are central to a wide class of applications such as software engineering, libraries, technical documentation, etc. They are often stored in file systems and document access tools are somewhat limited. We believe that (object-oriented) d...
Using Schema Matching to Simplify Heterogeneous Data Translation
, 1998
Cited by 187 (5 self)
A broad spectrum of data is available on the Web in distinct heterogeneous sources, and stored under different formats. As the number of systems that utilize this heterogeneous data grows, the importance of data translation and conversion mechanisms increases greatly. In this paper we present a new translation system, based on schema matching, aimed to simplify the intricate task of data conversion. We observe that in many cases the schema of the data in the source system is very similar to that of the target system. In such cases, much of the translation work can be done automatically, based on the schemas' similarity. This saves a lot of effort for the user, limiting the amount of programming needed. We define common schema and data models, in which schemas and data (resp.) from many common models can be represented. Using a rule-based method, the source schema is compared with the target one, and each component in the source schema is matched with a corresponding compone...
The Amsterdam Hypermedia Model: Adding Time, Structure and Context to Hypertext
- Communications of the ACM
, 1994
Cited by 143 (34 self)
this paper to describe hypermedia: a document is a complete collection of related components. Each component can be built recursively from other components or from primitive data elements of various types, also called entities. A presentation is the active form of a document. In normal use, the terms document and presentation are nearly interchangeable, as are (to a lesser extent) entity and component. Generally, context should clarify usage.
One-Unambiguous Regular Languages
- Information and Computation
, 1997
Cited by 91 (9 self)
The ISO standard for the Standard Generalized Markup Language (SGML) provides a syntactic meta-language for the definition of textual markup systems. In the standard, the right-hand sides of productions are based on regular expressions, although only regular expressions that denote words unambiguously, in the sense of the ISO standard, are allowed. In general, a word that is denoted by a regular expression is witnessed by a sequence of occurrences of symbols in the regular expression that match the word. In an unambiguous regular expression as defined by Book, Even, Greibach, and Ott, each word has at most one witness. But the SGML standard also requires that a witness be computed incrementally from the word with a one-symbol lookahead; we call such regular expressions 1-unambiguous. A regular language is a 1-unambiguous language if it is denoted by some 1-unambiguous regular expression. We give a Kleene theorem for 1-unambiguous languages and characterize 1-unambiguous regu...
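The definition of a 1-unambiguous expression in the abstract above can be illustrated with a short Python sketch. It relies on a known characterization that the abstract itself does not state: an expression is 1-unambiguous exactly when its position (Glushkov) automaton is deterministic. The tuple-based AST and the names `glushkov` and `is_one_unambiguous` are hypothetical, chosen only for this illustration, and the SGML-specific operators (`&`, `?`, `+`) are not covered. The check applies to an expression, not to the language it denotes.

```python
from itertools import count

# Hypothetical regex AST used only in this sketch:
#   ("sym", a), ("cat", r, s), ("alt", r, s), ("star", r)

def glushkov(node, counter, symbol_of, follow):
    """Return (nullable, first, last) for node, numbering each symbol occurrence."""
    kind = node[0]
    if kind == "sym":
        pos = next(counter)
        symbol_of[pos] = node[1]
        follow[pos] = set()
        return False, {pos}, {pos}
    if kind == "star":
        _, first, last = glushkov(node[1], counter, symbol_of, follow)
        for p in last:               # a repetition may restart at any first position
            follow[p] |= first
        return True, first, last
    assert kind in ("alt", "cat"), f"unknown node kind: {kind}"
    n1, f1, l1 = glushkov(node[1], counter, symbol_of, follow)
    n2, f2, l2 = glushkov(node[2], counter, symbol_of, follow)
    if kind == "alt":
        return n1 or n2, f1 | f2, l1 | l2
    for p in l1:                     # concatenation: left's last feeds right's first
        follow[p] |= f2
    first = (f1 | f2) if n1 else f1
    last = (l1 | l2) if n2 else l2
    return n1 and n2, first, last

def is_one_unambiguous(regex):
    """True if no position set offers two occurrences of the same symbol."""
    symbol_of, follow = {}, {}
    _, first, _ = glushkov(regex, count(), symbol_of, follow)
    for positions in [first, *follow.values()]:
        symbols = [symbol_of[p] for p in positions]
        if len(symbols) != len(set(symbols)):
            return False             # one-symbol lookahead cannot pick the witness
    return True

a, b = ("sym", "a"), ("sym", "b")
ambiguous = ("cat", ("star", ("alt", a, b)), a)          # (a|b)*a
b_star_a = ("cat", ("star", b), a)                       # b*a
deterministic = ("cat", b_star_a, ("star", b_star_a))    # b*a(b*a)*
print(is_one_unambiguous(ambiguous), is_one_unambiguous(deterministic))
```

The two sample expressions denote the same language, so the language is 1-unambiguous even though the first expression is not.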
JavaML: A Markup Language for Java Source Code
, 2000
Cited by 83 (4 self)
The classical plain-text representation of source code is convenient for programmers but requires parsing to uncover the deep structure of the program. While sophisticated software tools parse source code to gain access to the program's structure, many lightweight programming aids such as grep rely instead on only the lexical structure of source code. I describe a new XML application that provides an alternative representation of Java source code. This XML-based representation, called JavaML, is more natural for tools and permits easy specification of numerous software-engineering analyses by leveraging the abundance of XML tools and techniques. A robust converter built with the Jikes Java compiler framework translates from the classical Java source code representation to JavaML, and an XSLT stylesheet converts from JavaML back into the classical textual form. Keywords: Java, XML, abstract syntax tree representation, software-engineering analysis, Jikes compiler. 1 Introduction Since...
Correspondence and translation for heterogeneous data
, 2002
Cited by 73 (10 self)
Data integration often requires a clean abstraction of the different formats in which data are stored, and means for specifying the correspondences/relationships between data in different worlds and for translating data from one world to another. For that, we introduce in this paper a middleware data model that serves as a basis for the integration task, and a declarative rules language for specifying the integration. We show that using the language, correspondences between data elements can be computed in polynomial time in many cases, and may require exponential time only when insensitivity to order or duplicates are considered. Furthermore, we show that in most practical cases the correspondence rules can be automatically turned into translation rules to map data from one representation to another. Thus, a complete integration task (derivation of correspondences, transformation of data from one world to the other, incremental integration of a new bulk of data, etc.) can be specified using a single set of declarative rules.
The practitioner's guide to coloured Petri nets
- International Journal on Software Tools for Technology Transfer
, 1998
Cited by 68 (16 self)
Coloured Petri nets (CP-nets or CPNs) provide a framework for the design, specification, validation, and verification of systems. CP-nets have a wide range of application areas and many CPN projects have been carried out in industry, e.g., in the areas of communication protocols, operating systems, hardware designs, embedded systems, software system designs, and business process re-engineering. Design/CPN is a graphical computer tool supporting the practical use of CP-nets. The tool supports the construction, simulation, and functional and performance analysis of CPN models. The tool is used by more than four hundred organisations in forty different countries -- including one hundred commercial companies. It is available free of charge, also for commercial use. This paper provides a comprehensive road map to the practical use of CP-nets and the Design/CPN tool. We give an informal introduction to the basic concepts and ideas underlying CP-nets. The key components and facilities of the Design/CPN tool are presented and their use illustrated. The paper is self-contained and does not assume any prior knowledge of Petri nets and CP-nets nor any experience with the Design/CPN tool.
Schemas for Integration and Translation of Structured and Semi-Structured Data
- In Proceedings of the International Conference on Database Theory
, 1999
Cited by 62 (5 self)
Introduction The Web is emerging as a universal data repository, offering access to sources whose data organization varies from strictly structured databases to almost completely unstructured pages, and everything in between. Consequently, much research has recently focused on data integration and data translation systems [10, 6, 9, 8, 17, 13, 2, 19], whose goals are to allow applications to utilize data from many sources, with possibly widely varying formats. These research efforts have established a common data model of semistructured data, for uniformly representing data from any source. Recently, however, it is being realized that having a common schema model is also beneficial, and even necessary, in translation and integration systems to support tasks such as query formulation, decomposition and optimization, or declarative specification of data translation. As an example, which we use for motivation throughout the paper, recently suggested tools for data translation [2, 11, 19
Locating Matches of Tree Patterns in Forests
- Foundations of Software Technology and Theoretical Computer Science, Lecture Notes in Computer Science
, 1998
Cited by 52 (5 self)
We deal with matching and locating of patterns in forests of variable arity. A pattern consists of a structural and a contextual condition for subtrees of a forest, both of which are given as tree or forest regular languages. We use the notation of constraint systems to uniformly specify both kinds of conditions. In order to implement pattern matching we introduce the class of pushdown forest automata. We identify a special class of contexts such that not only pattern matching but also locating all of a forest's subtrees matching in context can be performed in a single traversal. We also give a method for computing the reachable states of an automaton in order to minimize the size of transition tables. 1 Introduction In Standard Generalized Markup Language (SGML) [Gol90] documents are represented as trees. A node in a document tree may have arbitrarily many children, independent of the symbol at that node. A sequence of documents or subdocuments is called a forest. A main task in do...

