Monadic second-order evaluations on tree-decomposable graphs
 Theoret. Comput. Sci
, 1993
Cited by 80 (23 self)
Courcelle, B. and M. Mosbah, Monadic second-order evaluations on tree-decomposable graphs,
Monadic Datalog and the Expressive Power of Languages for Web Information Extraction
 J. ACM
, 2002
Cited by 75 (11 self)
Research on information extraction from Web pages (wrapping) has seen much activity in recent times (particularly systems implementations), but little work has been done on formally studying the expressiveness of the formalisms proposed or on the theoretical foundations of wrapping. In this paper, we first study monadic datalog as a wrapping language (over ranked or unranked tree structures). Using previous work by Neven and Schwentick, we show that this simple language is equivalent to full monadic second-order logic (MSO) in its ability to specify wrappers. We believe that MSO has the right expressiveness required for Web information extraction and thus propose MSO as a yardstick for evaluating and comparing wrappers. Using the above result, we study the kernel fragment Elog− of the Elog wrapping language used in the Lixto system (a visual wrapper generator). The striking fact here is that Elog− exactly captures MSO, yet is easier to use. Indeed, programs in this language can be entirely visually specified. We also formally compare Elog to other wrapping languages proposed in the literature.
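To make the idea of monadic datalog as a wrapping language concrete, here is a minimal sketch in Python. It is an illustration only, not Elog syntax and not the Lixto system's API. It assumes the tree is given by first-child and next-sibling relations plus node labels, and it evaluates a small hypothetical program whose intensional predicates are all unary:

    under_table(x) :- label_table(y), firstchild(y, x).
    under_table(x) :- under_table(y), firstchild(y, x).
    under_table(x) :- under_table(y), nextsibling(y, x).
    result(x)      :- under_table(x), label_td(x).

The predicate names, the example tree, and the naive fixpoint loop are assumptions made for this sketch.

```python
# Hypothetical document tree: html(1) -> table(2), div(6);
# table(2) -> tr(3); tr(3) -> td(4), td(5); div(6) -> td(7).
labels = {1: "html", 2: "table", 3: "tr", 4: "td", 5: "td", 6: "div", 7: "td"}
first_child = {(1, 2), (2, 3), (3, 4), (6, 7)}   # (parent, first child)
next_sibling = {(2, 6), (4, 5)}                  # (node, its right sibling)


def select_td_under_table():
    """Naive bottom-up evaluation of the four rules in the lead-in."""
    under_table = set()
    changed = True
    while changed:  # repeat until no rule derives a new fact
        changed = False
        # Rules 1 and 2: the first child of a table node, or of a node
        # already under a table, is under a table.
        for parent, child in first_child:
            if (labels[parent] == "table" or parent in under_table) \
                    and child not in under_table:
                under_table.add(child)
                changed = True
        # Rule 3: right siblings of nodes under a table are under it too.
        for node, sibling in next_sibling:
            if node in under_table and sibling not in under_table:
                under_table.add(sibling)
                changed = True
    # Rule 4: keep only the td-labeled nodes.
    return {x for x in under_table if labels[x] == "td"}


selected = select_td_under_table()
print(sorted(selected))
```

On this tree the program selects the two td cells inside the table and skips the td under the div, which is the kind of node-selecting query a wrapper expresses.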
Specifying Timed State Sequences in Powerful Decidable Logics and Timed Automata (Extended Abstract)
 LNCS 863
, 1994
Cited by 52 (0 self)
Thomas Wilke, Christian-Albrechts-Universität zu Kiel, Institut für Informatik und Praktische Mathematik, D-24098 Kiel, Germany. Abstract. A monadic second-order language, denoted by Ld, is introduced for the specification of sets of timed state sequences. A fragment of Ld, denoted by L↔d, is proved to be expressively complete for timed automata (Alur and Dill), i.e., every timed regular language is definable by an L↔d-formula and every L↔d-formula defines a timed regular language. As a consequence the satisfiability problem for L↔d is decidable. Timed temporal logics are shown to be effectively embeddable into L↔d and hence turn out to have a decidable theory. This applies to TLΓ (Manna and Pnueli) and EMITLp, which is obtained by extending the logic MITLp (Alur and Henzinger) by automata operators (Sistla, Vardi, and Wolper). For every positive natural number k the full monadic second-order logic Ld and L↔d are equally expressive modulo the set of timed...
A Descriptive Approach to Language-Theoretic Complexity
, 1996
Cited by 52 (3 self)
Contents: 1 Language Complexity in Generative Grammar. Part I, The Descriptive Complexity of Strongly Context-Free Languages: 2 Introduction to Part I; 3 Trees as Elementary Structures; 4 L^2_{K,P} and SnS; 5 Definability and Non-Definability in L^2_{K,P}; 6 Conclusion of Part I. Part II, The Generative Capacity of GB Theories: 7 Introduction to Part II; 8 The Fundamental Structures of GB Theories; 9 GB and Nondefinability in L^2_{K,P}; 10 Formalizing X-Bar Theory; 11 The Lexicon, Subcategorization, Theta-theory, and Case Theory; 12 Binding and Control; 13 Chains; 14 Reconstruction; 15 Limitations of the Interpretation; 16 Conclusion of Part II. A Index of Definitions. Bibliography.
Concurrent Reachability Games
, 2008
Cited by 49 (20 self)
We consider concurrent two-player games with reachability objectives. In such games, at each round, player 1 and player 2 independently and simultaneously choose moves, and the two choices determine the next state of the game. The objective of player 1 is to reach a set of target states; the objective of player 2 is to prevent this. These are zero-sum games, and the reachability objective is one of the most basic objectives: determining the set of states from which player 1 can win the game is a fundamental problem in control theory and system verification. There are three types of winning states, according to the degree of certainty with which player 1 can reach the target. From type-1 states, player 1 has a deterministic strategy to always reach the target. From type-2 states, player 1 has a randomized strategy to reach the target with probability 1. From type-3 states, player 1 has for every real ε > 0 a randomized strategy to reach the target with probability greater than 1 − ε. We show that for finite state spaces, all three sets of winning states can be computed in polynomial time: type-1 states in linear time, and type-2 and type-3 states in quadratic time. The algorithms to compute the three sets of winning states also enable the construction of the winning and spoiling strategies.
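The type-1 case can be illustrated with a short Python sketch. It is a naive fixpoint iteration, not the linear-time algorithm the abstract refers to: a state is added once player 1 has some move that leads into the already-winning set whatever player 2 chooses. The function name, the dictionary encoding of the game, and the four-state example are assumptions made for this illustration.

```python
def sure_winning_states(states, moves1, moves2, delta, targets):
    """Type-1 states: player 1 can force reaching `targets` deterministically.

    delta[(s, a, b)] is the set of possible successors when player 1 plays a
    and player 2 plays b in state s (hypothetical encoding).
    """
    win = set(targets)
    changed = True
    while changed:  # naive fixpoint; simple but not linear time
        changed = False
        for s in states:
            if s in win:
                continue
            # Player 1 needs one move that is safe against every reply.
            if any(all(delta[(s, a, b)] <= win for b in moves2[s])
                   for a in moves1[s]):
                win.add(s)
                changed = True
    return win


# Hypothetical game. In s2 the players "match pennies": equal choices reach
# the goal, unequal choices stay in s2.
states = ["s0", "s1", "s2", "sink", "goal"]
moves1 = {"s0": ["a", "b"], "s1": ["a"], "s2": ["h", "t"], "sink": ["a"]}
moves2 = {"s0": ["x", "y"], "s1": ["x"], "s2": ["h", "t"], "sink": ["x"]}
delta = {
    ("s0", "a", "x"): {"s1"}, ("s0", "a", "y"): {"s1"},
    ("s0", "b", "x"): {"goal"}, ("s0", "b", "y"): {"sink"},
    ("s1", "a", "x"): {"goal"},
    ("s2", "h", "h"): {"goal"}, ("s2", "h", "t"): {"s2"},
    ("s2", "t", "h"): {"s2"}, ("s2", "t", "t"): {"goal"},
    ("sink", "a", "x"): {"sink"},
}

win = sure_winning_states(states, moves1, moves2, delta, {"goal"})
print(sorted(win))
```

In this example s2 is left out: against any fixed move of player 1, player 2 can answer with the other one and stay in s2, so only a randomized strategy helps there, which is the distinction between type-1 and type-2 states.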
Stochastic Context-Free Grammars for Modeling RNA
, 1993
Cited by 39 (4 self)
Stochastic context-free grammars (SCFGs) are used to fold, align and model a family of homologous RNA sequences. SCFGs capture the sequences' common primary and secondary structure and generalize the hidden Markov models (HMMs) used in related work on protein and DNA. The novel aspect of this work is that SCFG parameters are learned automatically from unaligned, unfolded training sequences. A generalization of the HMM forward-backward algorithm is introduced. The new algorithm, based on tree grammars and faster than the previously proposed SCFG inside-outside algorithm, is tested on the transfer RNA (tRNA) family. Results show the model can discern tRNA from similar-length RNA sequences, can find secondary structure of new tRNA sequences, and can give multiple alignments of large sets of tRNA sequences. The model is extended to handle introns in tRNA. Keywords: Stochastic Context-Free Grammar, RNA, Transfer RNA, Multiple Sequence Alignments, Database Searching.
An Efficient Graph Algorithm for Dominance Constraints
 JOURNAL OF ALGORITHMS
, 2003
Cited by 37 (15 self)
Dominance constraints are logical descriptions of trees that are widely used in computational linguistics. Their general satisfiability problem is known to be NP-complete. Here we identify normal dominance constraints and present an efficient graph algorithm for testing their satisfiability in deterministic polynomial time. Previously, no polynomial time algorithm was known.
The Lixto Data Extraction Project - Back and Forth between Theory and Practice
 PODS 2004
, 2004
Cited by 37 (2 self)
We present the Lixto project, which is both a research project in database theory and a commercial enterprise that develops Web data extraction (wrapping) and Web service definition software. We discuss the project's main motivations and ideas, in particular the use of a logic-based framework for wrapping. Then we present theoretical results on monadic datalog over trees and on Elog, its close relative which is used as the internal wrapper language in the Lixto system. These results include both a characterization of the expressive power and the complexity of these languages. We describe the visual wrapper specification process in Lixto and various practical aspects of wrapping. We discuss work on the complexity of query languages for trees that was inseminated by our theoretical study of logic-based languages for wrapping. Then we return to the practice of wrapping and the Lixto Transformation Server, which allows for streaming integration of data extracted from Web pages. This is a natural requirement in complex services based on Web wrapping. Finally, we discuss industrial applications of Lixto and point to open problems for future study.
XPath leashed
 IN ACM COMPUTING SURVEYS
, 2007
Cited by 36 (3 self)
This survey gives an overview of formal results on the XML query language XPath. We identify several important fragments of XPath, focusing on subsets of XPath 1.0. We then give results on the expressiveness of XPath and its fragments compared to other formalisms for querying trees, algorithms and complexity bounds for evaluation of XPath queries, and static analysis of XPath queries.
Modular Data Structure Verification
 EECS DEPARTMENT, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
, 2007
Cited by 36 (21 self)
This dissertation describes an approach for automatically verifying data structures, focusing on techniques for automatically proving formulas that arise in such verification. I have implemented this approach with my colleagues in a verification system called Jahob. Jahob verifies properties of Java programs with dynamically allocated data structures. Developers write Jahob specifications in classical higher-order logic (HOL); Jahob reduces the verification problem to deciding the validity of HOL formulas. I present a new method for proving HOL formulas by combining automated reasoning techniques. My method consists of 1) splitting formulas into individual HOL conjuncts, 2) soundly approximating each HOL conjunct with a formula in a more tractable fragment and 3) proving the resulting approximation using a decision procedure or a theorem prover. I present three concrete logics; for each logic I show how to use it to approximate HOL formulas, and how to decide the validity of formulas in this logic. First, I present an approximation of HOL based on a translation to first-order logic, which enables the use of existing resolution-based theorem provers. Second, I present an approximation of HOL based on field constraint analysis, a new technique that enables