Provenance semirings
 In PODS’07, Beijing
, 2007
We show that relational algebra calculations for incomplete databases, probabilistic databases, bag semantics and why-provenance are particular cases of the same general algorithms involving semirings. This further suggests a comprehensive provenance representation that uses semirings of polynomials. We extend these considerations to datalog and semirings of formal power series. We give algorithms for datalog provenance calculation as well as datalog evaluation for incomplete and probabilistic databases. Finally, we show that for some semirings containment of conjunctive queries is the same as for standard set semantics.
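The idea that one algorithm covers several semantics can be illustrated with a small sketch. It assumes a simplified encoding that is not taken from the paper: a relation is a Python dict from tuples to annotations, join multiplies annotations, and projection adds them. The names `Semiring`, `join`, `project` and `two_step` are hypothetical helpers chosen for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring (K, plus, times, zero, one)."""
    zero: Any
    one: Any
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]


# Set semantics: annotations are truth values.
BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
# Bag semantics: annotations are tuple multiplicities.
NAT = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)

Relation = Dict[Tuple, Any]  # maps each tuple to its annotation in K


def join(r: Relation, s: Relation, r_col: int, s_col: int, K: Semiring) -> Relation:
    """Equi-join on r[r_col] == s[s_col]; annotations are multiplied."""
    out: Relation = {}
    for r_row, r_ann in r.items():
        for s_row, s_ann in s.items():
            if r_row[r_col] == s_row[s_col]:
                out[r_row + s_row] = K.times(r_ann, s_ann)
    return out


def project(rel: Relation, cols: Tuple[int, ...], K: Semiring) -> Relation:
    """Projection; annotations of tuples that collapse together are added."""
    out: Relation = {}
    for row, ann in rel.items():
        key = tuple(row[i] for i in cols)
        out[key] = K.plus(out.get(key, K.zero), ann)
    return out


def two_step(r: Relation, K: Semiring) -> Relation:
    """Evaluate Q(x, z) :- R(x, y), R(y, z) over the semiring K."""
    return project(join(r, r, 1, 0, K), (0, 3), K)


# Illustrative data (made up for this sketch): multiplicities of edges.
R_BAG = {("a", "b"): 2, ("b", "c"): 3, ("a", "e"): 1, ("e", "c"): 5}
R_SET = {row: True for row in R_BAG}

print(two_step(R_BAG, NAT))   # {('a', 'c'): 11}, i.e. 2*3 + 1*5
print(two_step(R_SET, BOOL))  # {('a', 'c'): True}
```

The query code is identical in both runs; only the semiring passed in changes. Replacing the annotations with polynomial variables, one per source tuple, would give the polynomial provenance the abstract mentions, but that is not implemented here.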
Parsing Inside-Out
, 1998
Probabilistic Context-Free Grammars (PCFGs) and variations on them have recently become some of the most common formalisms for parsing. It is common with PCFGs to compute the inside and outside probabilities. When these probabilities are multiplied together and normalized, they produce the probability that any given nonterminal covers any piece of the input sentence. The traditional use of these probabilities is to improve the probabilities of grammar rules. In this thesis we show that these values are useful for solving many other problems in Statistical Natural Language Processing. We give a framework for describing parsers. The framework generalizes the inside and outside values to semirings. It makes it easy to describe parsers that compute a wide variety of interesting quantities, including the inside and outside probabilities, as well as related quantities such as Viterbi probabilities and n-best lists. We also present three novel uses for the inside and outside probabilities. T...
Efficient context-sensitive intrusion detection
, 2004
Model-based intrusion detection compares a process’s execution against a program model to detect intrusion attempts. Models constructed from static program analysis have historically traded precision for efficiency. We address this problem with our Dyck model, the first efficient statically constructed context-sensitive model. This model specifies both the correct sequences of system calls that a program can generate and the stack changes occurring at function call sites. Experiments demonstrate that the Dyck model is an order of magnitude more precise than a context-insensitive finite state machine model. With null call squelching, a dynamic technique to bound cost, the Dyck model operates in time similar to the context-insensitive model. We also present two static analysis techniques designed to counter mimicry and evasion attacks. Our branch analysis identifies between 32% and 64% of our test programs’ system call sites as affecting control flow via their return values. Interprocedural argument capture of general values recovers 32% to 69% more arguments than previously reported techniques.
Two-Dimensional Languages
, 1997
this paper, much work has been done in studying properties of picture languages recognized by finite-state machines, and several other models have been designed. A survey on this subject can be found in [21]. An interesting model of two-dimensional tape acceptor is the two-dimensional on-line tessellation automaton introduced by K. Inoue and A. Nakamura in [18]. This is defined as an infinite two-dimensional array of identical conventional finite-state automata and it is a special type of cellular automaton. Although it is not evident that it is a generalization of a one-dimensional model, it can be easily identified with a conventional automaton when restricted to one-row (or one-column) pictures. Moreover, the family of picture languages recognized by this model of automaton satisfies many important properties. Different systems to generate pictures using grammars have also been explored (cf. [31, 32, 33, 35, 34, 36, 29, 30, 39]). However, in the finite-state case, this approach is shown to be less powerful than others. Another possible generalization is to describe picture languages by logic formulas. Recently, W. Thomas gave a general formalism to describe graphs (and, in particular, pictures) as model-theoretic structures and showed that "recognizability" corresponds to the notion of definability in existential monadic second-order logic (cf. [38]). This is consistent with the string language recognizability theory, where Büchi's Theorem holds. In a recent proposal (cf. [13, 14]) a notion of recognizability of a set of pictures in terms of tiling systems is introduced. The underlying idea is to define recognizability by "projection of local properties". Informally, recognition in a tiling system is defined in terms of a finite set of square pictures of side two which c...
Semiring Parsing
 Computational Linguistics
, 1999
this paper is that all five of these commonly computed quantities can be described as elements of complete semirings (Kuich 1997). The relationship between grammars and semirings was discovered by Chomsky and Schützenberger (1963), and for parsing with the CKY algorithm, dates back to Teitelbaum (1973). A complete semiring is a set of values over which a multiplicative operator and a commutative additive operator have been defined, and for which infinite summations are defined. For parsing algorithms satisfying certain conditions, the multiplicative and additive operations of any complete semiring can be used in place of ∧ and ∨, and correct values will be returned. We will give a simple normal form for describing parsers, then precisely define complete semirings, and the conditions for correctness
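As a rough illustration of swapping semiring operations into a parser, the sketch below runs one CKY loop under four different semirings. It is a minimal example under stated assumptions, not the paper's formulation: the grammar is in Chomsky normal form, the toy rules and their probabilities are invented for the example, and `cky` and its `lift` parameter (which maps a rule probability to a semiring value) are hypothetical names.

```python
from collections import defaultdict
import operator

# Toy ambiguous grammar (an assumption for this sketch): S -> S S | 'a'
BINARY = [("S", "S", "S", 0.5)]   # (parent, left child, right child, prob)
LEXICAL = [("S", "a", 0.5)]       # (parent, word, prob)


def cky(words, lexical, binary, start, plus, times, zero, lift):
    """CKY chart parsing where sums and products come from a semiring."""
    n = len(words)
    chart = defaultdict(lambda: zero)  # (i, k, nonterminal) -> value

    # Width-1 spans come from lexical rules.
    for i, word in enumerate(words):
        for parent, rule_word, prob in lexical:
            if rule_word == word:
                key = (i, i + 1, parent)
                chart[key] = plus(chart[key], lift(prob))

    # Wider spans combine two adjacent sub-spans via a binary rule.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for parent, left, right, prob in binary:
                    value = times(lift(prob),
                                  times(chart[(i, j, left)], chart[(j, k, right)]))
                    key = (i, k, parent)
                    chart[key] = plus(chart[key], value)

    return chart[(0, n, start)]


words = ["a", "a", "a"]

# Inside semiring: (+, *) over probabilities.
inside = cky(words, LEXICAL, BINARY, "S", operator.add, operator.mul, 0.0,
             lift=lambda p: p)
# Viterbi semiring: (max, *) gives the probability of the best parse.
viterbi = cky(words, LEXICAL, BINARY, "S", max, operator.mul, 0.0,
              lift=lambda p: p)
# Counting semiring: (+, *) over integers counts derivations.
count = cky(words, LEXICAL, BINARY, "S", operator.add, operator.mul, 0,
            lift=lambda p: 1)
# Boolean semiring: (or, and) performs plain recognition.
recognized = cky(words, LEXICAL, BINARY, "S",
                 lambda a, b: a or b, lambda a, b: a and b, False,
                 lift=lambda p: True)

print(inside, viterbi, count, recognized)  # 0.0625 0.03125 2 True
```

The sentence has two parses, each with probability 0.5^5, which accounts for the four printed values. The sketch only handles finite sums, so it does not exercise the infinite summations that make a semiring complete.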
Context-Free Languages and Pushdown Automata
 Handbook of Formal Languages
, 1997
Contents: 1. Introduction; 1.1 Grammars; 1.2 Examples; 2. Systems of equations; 2.1 Systems; 2.2 Resolution; 2.3 Linear systems; 2.4 Parikh's theorem ...
Motif Statistics
, 1999
We present a complete analysis of the statistics of the number of occurrences of a regular expression pattern in a random text. This covers "motifs" widely used in computational biology. Our approach is based on: (i) a constructive approach to classical results in theoretical computer science (automata and formal language theory), in particular, the rationality of generating functions of regular languages; (ii) analytic combinatorics that is used for deriving asymptotic properties from generating functions; (iii) computer algebra for determining generating functions explicitly, analysing generating functions and extracting coefficients efficiently. We provide constructions for overlapping or nonoverlapping matches of a regular expression. A companion implementation produces multivariate generating functions for the statistics under study. A fast computation of Taylor coefficients of the generating functions then yields exact values of the moments with typical application to random t...
Random walks on infinite graphs and groups - a survey on selected topics
 Bull. London Math. Soc
, 1994
2. Basic definitions and preliminaries; A. Adaptedness to the graph structure; B. Reversible Markov chains
The insertion encoding of permutations
, 2005
We introduce the insertion encoding, an encoding of finite permutations. Classes of permutations whose insertion encodings form a regular language are characterized. Some necessary conditions are provided for a class of permutations to have insertion encodings that form a context-free language. Applications of the insertion encoding to the evaluation of generating functions for classes of permutations, construction of polynomial time algorithms for enumerating such classes, and the illustration of bijective equivalence between classes are demonstrated.