Results 11-20 of 96
Standard Generalized Markup Language: Mathematical and Philosophical Issues
 Computer Science Today. Recent Trends and Developments
, 1995
Cited by 22 (2 self)
The Standard Generalized Markup Language (SGML), an ISO standard, has become the accepted method of defining markup conventions for text files. SGML is a metalanguage for defining grammars for textual markup in much the same way that Backus-Naur Form is a metalanguage for defining programming-language grammars. Indeed, HTML, the method of marking up hypertext documents for the World Wide Web, is an SGML grammar. The underlying assumptions of the SGML initiative are that a logical structure of a document can be identified and that it can be indicated by the insertion of labeled matching brackets (start and end tags). Moreover, it is assumed that the nesting relationships of these tags can be described with an extended context-free grammar (the right-hand sides of productions are regular expressions). In this survey of some of the issues raised by the SGML initiative, I reexamine the underlying assumptions and address some of the theoretical questions that SGML raises....
Canonical derivatives, partial derivatives and finite automaton constructions
 Theor. Comput. Sci
Cited by 22 (3 self)
Let $E$ be a regular expression. Our aim is to establish a theoretical relation between two well-known automata recognizing the language of $E$, namely the position automaton $P_E$ constructed by Glushkov or McNaughton and Yamada, and the equation automaton $E_E$ constructed by Mirkin or Antimirov. We define the notion of c-derivative (for canonical derivative) of a regular expression $E$ and show that if $E$ is linear then two Brzozowski derivatives of $E$ are aci-similar if and only if the corresponding c-derivatives are identical. It allows us to represent the Berry-Sethi set of continuations of a position by a unique c-derivative, called the c-continuation of the position. Hence the definition of $C_E$, the c-continuation automaton of $E$, whose states are pairs made of a position of $E$ and of the associated c-continuation. If states are viewed as positions, $C_E$ is isomorphic to $P_E$. On the other hand, a partial derivative, as defined by Antimirov, is a class of c-derivatives for some equivalence relation, thus $C_E$ reduces to $E_E$. Finally $C_E$ makes it possible to go from $P_E$ to $E_E$, while this cannot be achieved directly (from the state graphs). These theoretical results lead to an $O(|E|^2)$ space and time algorithm to compute the equation automaton, where $|E|$ is the size of the expression. This is the complexity of the most efficient constructions yielding the position automaton, while the size of the equation automaton is not greater and generally much smaller than the size of the position automaton.
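To make the size comparison in this abstract concrete, the sketch below enumerates the states of the equation automaton by naively closing an expression under Antimirov's partial derivatives. It is not the quadratic algorithm the abstract announces; the tuple encoding of expressions and the names `partial_derivatives` and `equation_automaton_states` are assumptions made for illustration only.

```python
# Expressions are nested tuples (an illustrative encoding, not from the paper):
#   ("eps",)  ("sym", "a")  ("alt", e, f)  ("cat", e, f)  ("star", e)
EPS = ("eps",)

def nullable(e):
    """True if the language of e contains the empty word."""
    kind = e[0]
    if kind in ("eps", "star"):
        return True
    if kind == "sym":
        return False
    if kind == "alt":
        return nullable(e[1]) or nullable(e[2])
    return nullable(e[1]) and nullable(e[2])  # "cat"

def cat(e, f):
    """Concatenation with the simplification eps . f = f."""
    return f if e == EPS else ("cat", e, f)

def partial_derivatives(a, e):
    """Antimirov's set of partial derivatives of e with respect to symbol a."""
    kind = e[0]
    if kind == "eps":
        return set()
    if kind == "sym":
        return {EPS} if e[1] == a else set()
    if kind == "alt":
        return partial_derivatives(a, e[1]) | partial_derivatives(a, e[2])
    if kind == "star":
        return {cat(d, e) for d in partial_derivatives(a, e[1])}
    # "cat": derive the left factor, and the right one if the left is nullable
    left = {cat(d, e[2]) for d in partial_derivatives(a, e[1])}
    right = partial_derivatives(a, e[2]) if nullable(e[1]) else set()
    return left | right

def equation_automaton_states(e, alphabet):
    """States reachable from e by repeatedly taking partial derivatives."""
    states, todo = {e}, [e]
    while todo:
        s = todo.pop()
        for a in alphabet:
            for t in partial_derivatives(a, s):
                if t not in states:
                    states.add(t)
                    todo.append(t)
    return states

# (a|b)*a has three symbol occurrences, so its position automaton has 4 states.
E = ("cat", ("star", ("alt", ("sym", "a"), ("sym", "b"))), ("sym", "a"))
print(len(equation_automaton_states(E, "ab")))
```

For this expression the closure contains only the expression itself and the empty-word expression, which illustrates how the equation automaton can be smaller than the position automaton.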
The AQUA Approach to Querying Lists and Trees in Object-Oriented Databases
 IN IEEE INTERNATIONAL CONFERENCE ON DATA ENGINEERING
, 1995
Cited by 21 (0 self)
Relational database systems and most object-oriented database systems provide support for queries. Usually these queries represent retrievals over sets or multisets. Many new applications for databases, such as multimedia systems and digital libraries, need support for queries on complex bulk types such as lists and trees. In this paper we describe an object-oriented query algebra for lists and trees. The operators in the algebra preserve the ordering between the elements of a list or tree, even when the result list or tree contains an arbitrary set of nodes from the original tree. We also present predicate languages for lists and trees which allow order-sensitive queries because they use pattern matching to examine groups of list or tree nodes rather than individual nodes. The ability to decompose predicate patterns enables optimizations that make use of indices.
Building survivable systems: An integrated approach based on intrusion detection and damage containment
 Proc. of the DARPA Information Survivability Conference and Exposition[C]. IEEE Computer
, 2000
Cited by 21 (5 self)
Reliance on networked information systems to support critical infrastructures prompts interest in making network information systems survivable, so that they continue functioning even when under attack. To build survivable systems, attacks must be detected and reacted to before they impact performance or functionality. Previous survivable systems research focussed primarily on detecting intrusions, rather than on preventing or containing damage due to intrusions. We have therefore developed a new approach that combines early attack detection with automated reaction for damage prevention and containment, as well as tracing and isolation of attack origination point(s). Our approach is based on specifying security-relevant behaviors using patterns over sequences of observable events, such as a process’s system calls and their arguments, and the contents of network packets. By intercepting actual events at runtime and comparing them to specifications, attacks can be detected and operations associated with the deviant events can be modified to thwart the attack. Being based on security-relevant behaviors rather than known attack signatures, our approach can protect against unknown attacks. At the same time, our approach produces few false positives – a property that is critical for automating reactions. Our host-based mechanisms for attack detection and isolation coordinate with network routers enhanced with active networking technology in order to trace the origin of the attack and isolate the attacker.
Minimum-Cost Spanning Tree as a Path-Finding Problem
 Information Processing Letters
, 1994
Cited by 18 (0 self)
In this paper we show that minimum-cost spanning tree is a special case of the closed semiring path-finding problem. This observation gives us a non-recursive algorithm for finding minimum-cost spanning trees on mesh-connected computers that has the same asymptotic running time but is much simpler than the previous recursive algorithms.
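One way to read this observation is sketched below as a sequential Python illustration, not as the mesh-connected algorithm of the paper. It assumes a connected graph with distinct edge weights and uses the (min, max) semiring: a Floyd-Warshall style closure gives, for every pair of vertices, the smallest possible value of the heaviest edge on a connecting path, and an edge belongs to the minimum-cost spanning tree exactly when its weight equals that value for its own endpoints. The function name `mst_by_minimax_closure` is hypothetical.

```python
INF = float("inf")

def mst_by_minimax_closure(n, edges):
    """Return the MST edges of a connected graph with distinct weights.

    n     -- number of vertices, labelled 0 .. n-1
    edges -- list of (u, v, weight) tuples for an undirected graph
    """
    # d[i][j] = smallest "heaviest edge" over all paths from i to j.
    # In the (min, max) semiring, +inf is the identity for min and
    # -inf is the identity for max, so the diagonal starts at -inf.
    d = [[INF] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = -INF
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)

    # Closure: the same triple loop as Floyd-Warshall, with (min, +)
    # replaced by (min, max).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(d[i][k], d[k][j])
                if via_k < d[i][j]:
                    d[i][j] = via_k

    # An edge is in the MST iff no path between its endpoints avoids
    # every edge at least as heavy as itself (cycle property).
    return sorted((u, v, w) for u, v, w in edges if d[u][v] == w)

example = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4), (1, 3, 5)]
print(mst_by_minimax_closure(4, example))
```

The closure costs cubic time sequentially, so this is only useful as an illustration of the reduction; the appeal stated in the abstract is that the closure computation maps onto parallel meshes without recursion.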
Recognizing Regular Expressions by means of Dataflow Networks
 In Proc. of the 23rd International Colloquium on Automata, Languages, and Programming (ICALP'96)
, 1996
Cited by 17 (2 self)
This paper addresses the problem of building a Boolean dataflow network (sequential circuit) recognizing the language described by a regular expression. The main result is that both the construction time and the size of the resulting network are linear with respect to the size of the regular expression. Introduction. "Grep" machine: Let $\Sigma$ be a vocabulary and $L$ a regular language on $\Sigma$. A "grep" machine is a machine receiving a sequence $s_0, s_1, \ldots, s_n, \ldots$ of symbols ($s_i \in \Sigma$) and computing a sequence $b_0, b_1, \ldots, b_n, \ldots$ of Booleans, such that $b_n$ is true if and only if the word $s_0 s_1 \ldots s_n$ belongs to $L$. This paper addresses the problem of building a "grep" machine for languages described by regular expressions. This problem is rather classical [4, 11, 10, 3, 1, 2]. We propose a solution which, to our knowledge, is new: Informally, it consists of building, from a regular expression E, a "circuit" (or Boolean dataflow network) explori...
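The input/output behavior of a "grep" machine as defined in this abstract can be written down as a naive reference, shown below. This is only a specification of the stream of Booleans, assuming Python's `re` syntax for the pattern; it rematches every prefix from scratch and has nothing to do with the linear-size dataflow network the paper constructs. The name `grep_machine` is hypothetical.

```python
import re
from typing import Iterable, Iterator

def grep_machine(pattern: str, symbols: Iterable[str]) -> Iterator[bool]:
    """Yield b_n for each input symbol s_n.

    b_n is True exactly when the whole prefix s_0 s_1 ... s_n matches
    the pattern (naive reference: every prefix is rematched in full).
    """
    compiled = re.compile(pattern)
    prefix = ""
    for s in symbols:
        prefix += s
        yield compiled.fullmatch(prefix) is not None

# One Boolean per input symbol, for the language (ab)* on input "abab".
print(list(grep_machine("(ab)*", "abab")))
```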
SUCCINCTNESS OF THE COMPLEMENT AND INTERSECTION OF REGULAR EXPRESSIONS
, 2008
Cited by 17 (5 self)
We study the succinctness of the complement and intersection of regular expressions. In particular, we show that when constructing a regular expression defining the complement of a given regular expression, a double exponential size increase cannot be avoided. Similarly, when constructing a regular expression defining the intersection of a fixed and an arbitrary number of regular expressions, an exponential and double exponential size increase, respectively, cannot be avoided in the worst case. All mentioned lower bounds improve the existing ones by one exponential and are tight in the sense that the target expression can be constructed in the corresponding time class, i.e., exponential or double exponential time. As a by-product, we generalize a theorem by Ehrenfeucht and Zeiger stating that there is a class of DFAs which are exponentially more succinct than regular expressions, to a fixed four-letter alphabet. When the given regular expressions are one-unambiguous, as for instance required by the XML Schema specification, the complement can be computed in polynomial time whereas the bounds concerning intersection continue to hold. For the subclass of single-occurrence regular expressions, we prove a tight exponential lower bound for intersection.
Efficient Regular Expression Evaluation: Theory to Practice
Cited by 16 (0 self)
Several algorithms and techniques have been proposed recently to accelerate regular expression matching and enable deep packet inspection at line rate. This work aims to provide a comprehensive practical evaluation of existing techniques, extending them and analyzing their compatibility. The study focuses on two hardware architectures: memory-based ASICs and FPGAs.
A Taxonomy of Finite Automata Construction Algorithms
 Computing Science
, 1993
Cited by 15 (4 self)
This paper presents a taxonomy of finite automata construction algorithms. Each algorithm is classified into one of two families: those based upon the structure of regular expressions, and those based upon the automata-theoretic work of Myhill and Nerode. Many of the algorithms appearing in the literature are based upon the structure of regular expressions. In this paper, we make this term precise by defining regular expressions as a $\Sigma$-term algebra, and automata constructions as various $\Sigma$-algebras of automata. Each construction algorithm is then presented as the unique natural homomorphism from the $\Sigma$-term algebra of regular expressions to the appropriate $\Sigma$-algebra of automata. The concept of duality is introduced and used to derive more practical construction algorithms. In this way, we successfully present (and relate) algorithms given by Thompson, Berry and Sethi, McNaughton and Yamada, Glushkov, and Aho, Sethi, and Ullman. Efficient implementations (including thos...
The Validation of SGML Content Models
 MATHEMATICAL AND COMPUTER MODELLING
, 1997
Cited by 14 (8 self)
The Standard Generalized Markup Language (SGML) is an ISO standard that provides a syntactic metalanguage for the definition of textual markup systems, which are used to indicate the structure of documents so that they can be electronically typeset, searched, and communicated. We address only one problem raised by the standard, namely: In SGML, the right-hand sides of context-free productions are regular expressions, called content models, that are restricted to be what the standard calls "unambiguous," but what is more appropriately called deterministic. We solve the problem of how to define determinism precisely, how to recognize deterministic regular expressions efficiently, and how to recognize deterministic regular languages. Any SGML parser must check that a given document grammar conforms to the standard; that is, it must validate it. Hence, our results are an important step in the clarification of the standard and in the efficient implementation of an SGML parser fo...
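A common way to test a content model for determinism is to check that its position (Glushkov) automaton is deterministic: no two distinct occurrences of the same symbol may compete at the start of the expression or after any given occurrence. The sketch below follows that characterization. It is a straightforward check under an illustrative tuple encoding, not necessarily the efficient procedure of the paper, and the names `analyse` and `is_deterministic` are hypothetical.

```python
from itertools import count

# Content models as nested tuples (an illustrative encoding):
#   ("sym", "a")  ("cat", e, f)  ("alt", e, f)  ("star", e)  ("opt", e)

def analyse(expr, counter, symbol_of, follow):
    """Number the symbol occurrences of expr and fill in follow sets.

    Returns (nullable, first, last), where first and last are sets of
    positions that can start or end a word of expr.
    """
    kind = expr[0]
    if kind == "sym":
        p = next(counter)
        symbol_of[p] = expr[1]
        follow[p] = set()
        return False, {p}, {p}
    if kind in ("star", "opt"):
        _, first, last = analyse(expr[1], counter, symbol_of, follow)
        if kind == "star":
            for p in last:            # iteration: last loops back to first
                follow[p] |= first
        return True, first, last
    if kind not in ("alt", "cat"):
        raise ValueError(f"unknown node kind: {kind!r}")
    n1, f1, l1 = analyse(expr[1], counter, symbol_of, follow)
    n2, f2, l2 = analyse(expr[2], counter, symbol_of, follow)
    if kind == "alt":
        return n1 or n2, f1 | f2, l1 | l2
    for p in l1:                      # concatenation: last(e) feeds first(f)
        follow[p] |= f2
    return n1 and n2, f1 | (f2 if n1 else set()), l2 | (l1 if n2 else set())

def is_deterministic(expr):
    """True if no two competing positions carry the same symbol."""
    symbol_of, follow = {}, {}
    _, first, _ = analyse(expr, count(), symbol_of, follow)

    def clash(positions):
        symbols = [symbol_of[p] for p in positions]
        return len(symbols) != len(set(symbols))

    return not clash(first) and not any(clash(f) for f in follow.values())

a, b, c = ("sym", "a"), ("sym", "b"), ("sym", "c")
print(is_deterministic(("alt", ("cat", a, b), ("cat", a, c))))  # (a,b)|(a,c)
print(is_deterministic(("cat", a, ("alt", b, c))))              # a,(b|c)
```

The two expressions in the usage lines describe the same language; only the second lets a parser decide which occurrence of `a` it is reading without lookahead.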