## Adding nesting structure to words (2006)

Venue: | In Developments in Language Theory, LNCS 4036 |

Citations: | 79 - 12 self |

### BibTeX

@INPROCEEDINGS{Alur06addingnesting,
    author = {Rajeev Alur and P. Madhusudan},
    title = {Adding nesting structure to words},
    booktitle = {Developments in Language Theory, LNCS 4036},
    year = {2006},
    pages = {1--13},
    publisher = {Springer}
}

### Abstract

We propose the model of nested words for representation of data with both a linear ordering and a hierarchically nested matching of items. Examples of data with such dual linear-hierarchical structure include executions of structured programs, annotated linguistic data, and HTML/XML documents. Nested words generalize both words and ordered trees, and allow both word and tree operations. We define nested word automata, which are finite-state acceptors for nested words, and show that the resulting class of regular languages of nested words has all the appealing theoretical properties that the classical regular word languages enjoy: deterministic nested word automata are as expressive as their nondeterministic counterparts; the class is closed under union, intersection, complementation, concatenation, Kleene-*, prefixes, and language homomorphisms; membership, emptiness, language inclusion, and language equivalence are all decidable; and definability in monadic second order logic corresponds exactly to finite-state recognizability. We also consider regular languages of infinite nested words and show that the closure properties, MSO-characterization, and decidability of decision problems carry over. The linear encodings of nested words give the class of visibly pushdown languages of words, and this class lies between balanced languages and deterministic context-free languages. We argue that for algorithmic verification of structured programs, instead of viewing the program as a context-free language over words, one should view it as a regular language of nested words (or equivalently, a visibly pushdown language), and this would allow model checking of many properties (such as stack inspection, pre-post conditions) that are not expressible in existing specification logics.
We also study the relationship between ordered trees and nested words, and the corresponding automata: while the analysis complexity of nested word automata is the same as that of classical tree automata, they combine both bottom-up and top-down traversals, and enjoy expressiveness and succinctness benefits over tree automata.
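
To make the idea in the abstract concrete, here is a minimal Python sketch of a nested word given as a tagged linear word, together with a toy automaton over it. It is an illustration only, not the construction from the paper: the `(kind, symbol)` encoding, the function names `nesting_edges` and `accepts_matching_tags`, and the three state names are all assumptions made for this example. The automaton carries a finite piece of information along each nesting edge (the call symbol and the state before the call), which is how it can check that every return matches its call, a property that no finite automaton over plain words can check at unbounded depth.

```python
from typing import List, Tuple

CALL, INTERNAL, RETURN = "call", "internal", "return"
# Assumed encoding: a nested word is a list of (kind, symbol) pairs.
TaggedWord = List[Tuple[str, str]]


def nesting_edges(word: TaggedWord):
    """Recover the nesting relation from the tagged linear encoding.

    Positions are 1-based. Returns (edges, pending_calls, pending_returns),
    where each edge (i, j) links a call position i to its matching return j.
    """
    open_calls: List[int] = []
    edges: List[Tuple[int, int]] = []
    pending_returns: List[int] = []
    for pos, (kind, _symbol) in enumerate(word, start=1):
        if kind == CALL:
            open_calls.append(pos)
        elif kind == RETURN:
            if open_calls:
                edges.append((open_calls.pop(), pos))
            else:
                pending_returns.append(pos)  # return with no matching call
    return edges, open_calls, pending_returns


def accepts_matching_tags(word: TaggedWord) -> bool:
    """Toy automaton: accept iff the word is well matched and every return
    carries the same symbol as its call.

    Linear states are "top", "inner", "bad" (hypothetical names). The list
    `hierarchy` only simulates passing a finite value along nesting edges.
    """
    state = "top"
    hierarchy: List[Tuple[str, str]] = []
    for kind, symbol in word:
        if kind == CALL:
            # Send (symbol, current state) along the nesting edge.
            hierarchy.append((symbol, state))
            if state != "bad":
                state = "inner"
        elif kind == RETURN:
            if not hierarchy:
                state = "bad"  # pending return
                continue
            call_symbol, state_before_call = hierarchy.pop()
            if state != "bad" and call_symbol == symbol:
                state = state_before_call
            else:
                state = "bad"
        # Internal positions leave the state unchanged in this toy example.
    return state == "top"


if __name__ == "__main__":
    doc = [(CALL, "a"), (INTERNAL, "x"), (CALL, "b"), (RETURN, "b"), (RETURN, "a")]
    print(nesting_edges(doc))
    print(accepts_matching_tags(doc))
```

Acceptance depends only on the final linear state, because the "was I at top level" bit is restored from the value carried along each nesting edge rather than read off the simulation stack at the end.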

### Citations

4091 | Introduction to Automata Theory, Languages, and Computation
- Hopcroft, Ullman
- 1979
Citation Context ...bles. The final clause allows processing of unmatched returns. ✷ Recall that a bracketed language consists of well-bracketed words of different types of parentheses (cf. [Ginsburg and Harrison 1967; =-=Hopcroft and Ullman 1979-=-]). A parenthesis language is a bracketed language with only one kind of parentheses. Bracketed languages are a special case of balanced grammars [Berstel and Boasson 2002; Brüggemann-Klein and Wood 20... |

1468 | An axiomatic basis for computer programming
- Hoare
- 1969
Citation Context ...program specifications as languages of nested words generalizes the linear-time semantics that allows integration of Pnueli-style temporal reasoning [Pnueli 1977] and Hoare-style structured reasoning [=-=Hoare 1969-=-]. We believe that the nested-word view will provide a unifying basis for the next generation of specification logics for program analysis, software verification, and runtime monitoring. Given a langu... |

1287 | The temporal logic of programs
- Pnueli
- 1977
Citation Context ...s. More broadly, modeling structured programs and program specifications as languages of nested words generalizes the linear-time semantics that allows integration of Pnueli-style temporal reasoning [=-=Pnueli 1977-=-] and Hoare-style structured reasoning [Hoare 1969]. We believe that the nested-word view will provide a unifying basis for the next generation of specification logics for program analysis, software ve... |

876 | Dynamic logic
- Harel
- 1984
Citation Context ...e augmented with some restricted class of context-free languages, and simple-minded pushdown automata, which may be viewed as a restricted class of VPAs, have been proposed to explain the phenomenon [=-=Harel et al. 2000-=-]. There is a logical characterization of context free languages using quantifications over matchings [Lautemann et al. 1994], and Theorem 5.2 follows from that result. Tree Automata. There is a rich ... |

502 | Automata on infinite objects
- Thomas
- 1990
Citation Context ...y a read operation” is formulated using automata over infinite words, and the theory of ω-regular languages is well developed with many of the counterparts of the results for regular languages (c.f. [=-=Thomas 1990-=-; Vardi and Wolper 1994]). Consequently, we also define nested ω-words and consider nested word automata augmented with acceptance conditions such as Büchi and Muller, that accept languages of nested ... |

420 | Automatic predicate abstraction of C programs
- Ball, Majumdar, et al.
- 2001
Citation Context ...of software verification, a popular paradigm relies on data abstraction, where the data in a program is abstracted using a finite set of boolean variables that stand for predicates on the data-space [=-=Ball et al. 2001-=-; Henzinger et al. 2002]. The resulting models hence have finite-state but stack-based control flow (see Boolean programs [Ball and Rajamani 2000] and recursive state machines [Alur et al. 2005] as co... |

386 | Precise Interprocedural Dataflow Analysis via Graph Reachability
- Reps, Horwitz, et al.
Citation Context ...the algorithms for inter-procedural program analysis and context-free reachability compute summary edges between control locations to capture the computation of the called procedure (see, for example =-=[18]-=-). 2 Nested Words. Definition: A nested relation ν of width k, for k ≥ 0, is a binary relation over {1, 2, ..., k} such that (1) if ν(i, j) then i < j; (2) if ν(i, j) and ν(i, j′) then j = j′, and if... |

319 | Reachability analysis of pushdown automata: Application to model-checking
- Bouajjani, Esparza, et al.
- 1997
Citation Context ...e problem of checking regular requirements of pushdown models has been extensively studied in recent years leading to efficient implementations and applications to program analysis [Reps et al. 1995; =-=Bouajjani et al. 1997-=-; Ball and Rajamani 2000; Alur et al. 2005; Henzinger et al. 2002; Esparza et al. 2003; Chen and Wagner 2002]. While many analysis problems such as identifying dead code and accesses to uninitialized ... |

314 | Two approaches to inter-procedural data-flow analysis
- Sharir, Pnueli
- 1981
Citation Context ...he algorithms for inter-procedural program analysis and context-free reachability compute summary edges between control locations to capture the computation of the called procedure (see, for example [=-=Sharir and Pnueli 1981-=-; Reps et al. 1995]). The problem of checking regular requirements of pushdown models has been extensively studied in recent years leading to efficient implementations and applications to program Jour... |

262 | Reasoning about infinite computations
- Vardi, Wolper
- 1994
Citation Context ...inspection properties, pre-post conditions of programs, local flows in programs, etc. Analogous to the theorem that a linear temporal formula can be compiled into an automaton that accepts its models =-=[21]-=-, any Caret formula can be compiled into a nested word automaton that accepts its models. Decidability of inclusion then yields a decidable model-checking problem for program models against Caret [6, ... |

234 | Bebop: A symbolic model checker for Boolean programs
- Ball, Rajamani
- 2000
Citation Context ... abstracted using a finite set of boolean variables that stand for predicates on the data-space [7,10]. The resulting models hence have finite-state but stack-based control flow (see Boolean programs =-=[8]-=- and recursive state machines [1] as concrete instances of pushdown models of programs). Given a program P modeled as a pushdown automaton, we can view P as a generator of nested words in the followin... |

212 | Mops: an infrastructure for examining security properties of software - Chen, Wagner - 2002 |

160 | Stream Processing of XPath Queries with Predicates - Gupta, Suciu - 2003 |

146 | Visibly pushdown languages
- Alur, Madhusudan
- 2004
Citation Context ...′2 nw1.nw′1 ∼L nw2.nw′2). We can now show that the finiteness of this congruence characterizes regularity for nested-word languages using the corresponding result for visibly pushdown languages =-=[5]-=-. Theorem 4. For a set L of nested words, L is regular iff ∼L has finitely many congruence classes. Proof. Let L be a regular language of nested words. Let A be a NWA that accepts L, and let its set o... |

145 | Processing XML streams with Deterministic Automata
- Green, Miklau, et al.
- 2003
Citation Context ... the constructions for nested word automata can be traced to the corresponding constructions for tree automata. Deterministic word automata have been also used for stream processing of XML documents [=-=Green et al. 2003-=-], where the authors argue, with experimental supporting data, that finite-state word automata may be good enough given that hierarchical depth of documents is small. Pushdown automata have been used ... |

117 | Analysis of recursive state machines
- Alur, Etessami, et al.
- 2000
Citation Context ... boolean variables that stand for predicates on the data-space [7,10]. The resulting models hence have finite-state but stack-based control flow (see Boolean programs [8] and recursive state machines =-=[1]-=- as concrete instances of pushdown models of programs). Given a program P modeled as a pushdown automaton, we can view P as a generator of nested words in the following manner. We choose an observatio... |

109 | Regular tree and regular hedge languages over unranked alphabets - Bruggemann-Klein, Murata, et al. - 2001 |

82 | Validating streaming XML documents
- Segoufin, Vianu
- 2002
Citation Context ...is one where the input must be read left-to-right, and can be read only once. Note that this result comes useful in type-checking streaming XML documents, as the depth of documents is often not large =-=[19,13]-=-. When A is fixed, the result in [22] exploits the visibly pushdown structure to solve the membership problem in logarithmic space, and [9] shows that membership can be checked using boolean circuits ... |

81 | Model checking for context-free processes
- Steffen, Burkart
- 1992
Citation Context ...NWAs is decidable in polynomial time since they can be interpreted as pushdown automata over infinite words over Σ̂ [=-=Burkart and Steffen 1992-=-]. From our results it also follows that the universality and inclusion problems are Exptime-complete for nondeterministic Büchi NWAs: the upper bounds follow from the complexity of complementation, a... |

76 | Temporal-safety proofs for systems code
- Henzinger, Jhala, et al.
- 2002
Citation Context ...fication, a popular paradigm to verification is through data abstraction, where the data in a program is abstracted using a finite set of boolean variables that stand for predicates on the data-space =-=[7,10]-=-. The resulting models hence have finite-state but stack-based control flow (see Boolean programs [8] and recursive state machines [1] as concrete instances of pushdown models of programs). Given a pr... |

74 | Verification of control flow based security properties
- Metayer, Thorn
- 1999
Citation Context ...interrupt-handlers in the call-stack currently is less than 5, then a property p holds” require inspection of the stack, and decision procedures for certain classes of stack properties already exist [=-=Jensen et al. 1999-=-; Chen and Wagner 2002; Esparza et al. 2003; Chatterjee et al. 2004]. A separate class of non-regular, but decidable, properties includes the temporal logic Caret that allows matching of calls and ret... |

69 | Model-checking LTL with regular valuations for pushdown systems
- Esparza, Kučera, et al.
Citation Context ...ied in recent years leading to efficient implementations and applications to program analysis [Reps et al. 1995; Bouajjani et al. 1997; Ball and Rajamani 2000; Alur et al. 2005; Henzinger et al. 2002; =-=Esparza et al. 2003-=-; Chen and Wagner 2002]. While many analysis problems such as identifying dead code and accesses to uninitialized variables can be captured as regular requirements, many others require inspection of t... |

67 | Context-free languages and pushdown automata
- Autebert, Berstel, et al.
- 1997
Citation Context ...2002; Bouquet et al. 2003]. Context-free Languages. There is an extensive literature on pushdown automata, context-free languages, deterministic pushdown automata, and context-free ω-languages (c.f. [=-=Autebert et al. 1997-=-]). The most related work is McNaughton’s parenthesis languages with a decidable equivalence problem [McNaughton 1967]. Knuth showed that parentheses languages are closed under union, intersection, an... |

60 | A temporal logic of nested calls and returns
- Alur, Etessami, et al.
- 2004
Citation Context ...ill go to state q3, and all transitions from q3 will go to q3.) Further, we can build specification logics for programs that exploit the nested structure. An example of such a temporal logic is Caret =-=[4]-=-, which extends linear temporal logic by local modalities such as 〈a〉ϕ, which holds at a call if the return-successor of the call satisfies ϕ. Caret can state many interesting properties of programs, ... |

39 | Parenthesis grammars
- McNaughton
- 1967
Citation Context ...ed to Dyck languages, which is the class of languages with well-bracketed structure. The class of parenthesis languages studied by McNaughton comes closest to our notion of visibly pushdown languages =-=[16]-=-. A parenthesis language is one generated by a context free grammar where every production introduces a pair of parentheses that delimit the scope of the production. Viewing the nesting relation as th... |

32 | Logics for unranked trees: an overview
- Libkin
Citation Context ...-terminals always stand for tags. Consequently, type definitions can be encoded using nested word automata. Though trees and automata on unranked trees are traditionally used in the study of XML (see =-=[17,14]-=- for recent surveys), nested word automata lend more naturally to describing the document especially when the document needs to be processed as a word being read from left to right (as in the case of ... |

31 | Automata for XML – a survey - Schwentick - 2004 |

26 | On logics, tilings, and automata
- Thomas
- 1991
Citation Context ... definition that allows combining of states is quite common. For example, bottom-up tree automata allow such a join. Various notions of automata on partial-orders and graphs are also defined this way =-=[20]-=-. In fact, one can define a more general notion of automata on nested words by giving tiling systems that tile the positions using a finite number of tiles with local constraints that restrict the til... |

25 | Logics for context-free languages
- Lautemann, Schwentick, et al.
- 1994
Citation Context ...ed as a restricted class of VPAs, have been proposed to explain the phenomenon [Harel et al. 2000]. There is a logical characterization of context free languages using quantifications over matchings [=-=Lautemann et al. 1994-=-], and Theorem 5.2 follows from that result. Tree Automata. There is a rich literature on tree automata, and we used [Schwentick 2007; Comon et al. 2002] for our research. Besides classical top-down a... |

23 | A characterization of parenthesis languages
- Knuth
- 1967
Citation Context ...a pair of parentheses that delimit the scope of the production. Viewing the nesting relation as that defined by the parentheses, parenthesis languages are a subclass of visibly pushdown languages. In =-=[16,11]-=-, it was shown that parenthesis languages are closed under union, intersection and difference, and that the equivalence problem for them is decidable. However, parenthesis languages are a strict subcl... |

23 | Stack size analysis for interrupt-driven programs
- Chatterjee, Ma, et al.
Citation Context ...and can express the classical correctness requirements of program modules with pre and post conditions, such as “if p holds when a module is invoked, the module must return, and q holds upon return” [=-=Alur et al. 2004-=-]. This suggests that the answer to the question “which class of properties are algorithmically checkable against pushdown models?” should be more general than “regular word languages.” Our results su... |

22 | Pushdown games with the unboundedness and regular conditions: full version with complete proofs. http://www.labri.fr/~igw/publications.html
- Bouquet, Serre, et al.
Citation Context ...alities for matching calls and returns [Alur et al. 2004]. Also, properties expressing boundedness of stack, and repeatedly boundedness, have received a lot of attention recently [Cachat et al. 2002; =-=Bouquet et al. 2003-=-]. Context-free Languages. There is an extensive literature on pushdown automata, context-free languages, deterministic pushdown automata, and context-free ω-languages (c.f. [Autebert et al. 1997]). T... |

21 | Designing and evaluating an XPath dialect for linguistic queries
- Bird, Chen, et al.
- 2006
Citation Context ...e the annotation adds a hierarchical structure. Traditionally, the result is represented as an ordered tree, but can equally be represented as a nested word. For illustration, we use an example from [=-=Bird et al. 2006-=-]. The sentence is I saw the old man with a dog today The linguistic categorization parses the sentence into following categories: S (sentence), VP (verb phrase), NP (noun phrase), PP (prepositional p... |

21 | Bracketed context-free languages
- Ginsburg, Harrison
- 1967
Citation Context ...FL). We give a grammar-based characterization of VPLs, which helps in understanding of VPLs as a generalization of parenthesis languages, bracketed languages, and balanced languages [McNaughton 1967; =-=Ginsburg and Harrison 1967-=-; Berstel and Boasson 2002]. Note that VPLs have better closure properties than CFLs, DCFLs, or parenthesis languages: CFLs are not closed under intersection and complement, DCFLs are not closed under... |

20 | Tree automata techniques and applications. Draft, Available at http://www.grappa.univ-lille3.fr/tata
- Comon, Dauchet, et al.
- 2002
Citation Context ...uages using quantifications over matchings [Lautemann et al. 1994], and Theorem 5.2 follows from that result. Tree Automata. There is a rich literature on tree automata, and we used [Schwentick 2007; =-=Comon et al. 2002-=-] for our research. Besides classical top-down and bottom-up automata over binary trees, stepwise bottom-up tree automata for processing unranked ordered trees [Martens and Niehren 2005; Brüggemann-Kle... |

20 | Minimizing Tree Automata for Unranked Trees - Martens, Niehren - 2005 |

19 | Visibly pushdown games
- Löding, Madhusudan, et al.
- 2004
Citation Context ...lready exist for visibly pushdown automata: visibly pushdown languages over infinite words have been studied in [6]; games on pushdown graphs against visibly pushdown winning conditions are decidable =-=[15]-=-; congruence based characterizations and minimization theorems for visibly pushdown automata exist [5]; and active learning, conformance testing, and black-box checking for visibly pushdown automata a... |

17 | Marrying words and trees - Alur |

15 | A fixpoint calculus for local and global program flows
- Alur, Chaudhuri, et al.
- 2006
Citation Context ... possible executions of a program, and branching corresponds to the choice within the program. It is possible to define nested trees in which each path encodes a structured execution as a nested word =-=[3]-=-. The Kleene-∗ operation is defined as usual: if L is a language of nested words over Σ, then L∗ is the set of nested words nw1.nw2 ... nwi, where i ∈ N, a... |

15 | Balanced grammars and their languages - Berstel, Boasson - 2002 |

15 | Solving pushdown games with a Σ3-winning condition - Cachat, Duparc, et al. - 2013 |

9 | Input-driven languages are in log n depth
- Dymond
- 1988
Citation Context ...ges are a strict subclass of visibly pushdown languages, and are not closed under Kleene-∗. The class of visibly pushdown languages, was considered in papers related to parsing input-driven languages =-=[22,9]-=-. Input-driven languages are precisely visibly pushdown languages (the stack operations are driven by the input). However, the papers considered only the membership problem for these languages (namely... |

7 | Minimization, learning, and conformance testing of Boolean programs
- Kumar, Madhusudan, et al.
- 2006
Citation Context ...ased characterizations and minimization theorems for visibly pushdown automata exist [5]; and active learning, conformance testing, and black-box checking for visibly pushdown automata are studied in =-=[12]-=-. The nested structure on words can be extended to trees, and automata on nested trees are studied in [3,2]. Finally, a version of the µ-calculus on nested structures has been defined in [3], and is s... |

6 | Balanced Context-Free Grammars, Hedge Grammar and Pushdown Caterpillar Automata
- Brüggemann-Klein, Wood
Citation Context ...967; Hopcroft and Ullman 1979]). A parenthesis language is a bracketed language with only one kind of parentheses. Bracketed languages are a special case of balanced grammars [Berstel and Boasson 2002; =-=Brüggemann-Klein and Wood 2004-=-]. The original definition of balanced grammars considers productions of the form X → 〈aLa〉, where L is a regular language over the nonterminals V. We present a simpler formulation that turns out to b... |

5 | Input-driven languages are recognized in log n space
- Braunmühl, Verbeek
- 1983
Citation Context ...ges are a strict subclass of visibly pushdown languages, and are not closed under Kleene-∗. The class of visibly pushdown languages, was considered in papers related to parsing input-driven languages =-=[22,9]-=-. Input-driven languages are precisely visibly pushdown languages (the stack operations are driven by the input). However, the papers considered only the membership problem for these languages (namely... |

4 | Visibly pushdown languages for XML
- Kumar, Madhusudan, et al.
- 2006
Citation Context ...g type-inclusion and in checking streaming XML documents against SDTDs. Further, minimization theorems for nested word automata can be exploited to construct minimal machines to process XML documents =-=[13]-=-. 4 Alternative Characterizations We now show alternate characterizations of the class of regular nested word languages. Monadic Second Order Logic of Nested Words Let us fix a countable set of first-... |

3 | Solving pushdown games with a 3 winning condition - Cachat, Duparc, et al. - 2002 |

2 | Input-driven languages are in log n depth, Information Process - DYMOND - 1988 |

1 | Automata on nested trees. Under submission
- Alur, Chaudhuri, et al.
- 2006
Citation Context ...ing, conformance testing, and black-box checking for visibly pushdown automata are studied in [12]. The nested structure on words can be extended to trees, and automata on nested trees are studied in =-=[3,2]-=-. Finally, a version of the µ-calculus on nested structures has been defined in [3], and is shown to be more powerful than the standard µ-calculus, while at the same time remaining robust and tractabl... |

1 | A characterization of parenthesis languages. Information and Control - Knuth - 1967 |