Parsing Inside-Out
, 1998
Cited by 97 (2 self)
Probabilistic Context-Free Grammars (PCFGs) and variations on them have recently become some of the most common formalisms for parsing. It is common with PCFGs to compute the inside and outside probabilities. When these probabilities are multiplied together and normalized, they produce the probability that any given nonterminal covers any piece of the input sentence. The traditional use of these probabilities is to improve the probabilities of grammar rules. In this thesis we show that these values are useful for solving many other problems in Statistical Natural Language Processing. We give a framework for describing parsers. The framework generalizes the inside and outside values to semirings. It makes it easy to describe parsers that compute a wide variety of interesting quantities, including the inside and outside probabilities, as well as related quantities such as Viterbi probabilities and n-best lists. We also present three novel uses for the inside and outside probabilities. T...
Semiring Parsing
 Computational Linguistics
, 1999
Cited by 79 (1 self)
this paper is that all five of these commonly computed quantities can be described as elements of complete semirings (Kuich 1997). The relationship between grammars and semirings was discovered by Chomsky and Schützenberger (1963), and for parsing with the CKY algorithm, dates back to Teitelbaum (1973). A complete semiring is a set of values over which a multiplicative operator and a commutative additive operator have been defined, and for which infinite summations are defined. For parsing algorithms satisfying certain conditions, the multiplicative and additive operations of any complete semiring can be used in place of × and +, and correct values will be returned. We will give a simple normal form for describing parsers, then precisely define complete semirings, and the conditions for correctness
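The idea of swapping the semiring under a fixed parser can be illustrated with a minimal sketch. This is not the paper's own formulation: the `Semiring` tuple, the `cky` function, the `lift` parameter, and the toy grammar are hypothetical names chosen for illustration, and the grammar is assumed to be in Chomsky normal form.

```python
from collections import defaultdict
from typing import Any, Callable, NamedTuple


class Semiring(NamedTuple):
    """Additive and multiplicative operators with their identities."""
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]
    zero: Any
    one: Any  # multiplicative identity, listed for completeness


# Three instances: total probability, best-parse probability, recognition.
INSIDE = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)
VITERBI = Semiring(max, lambda a, b: a * b, 0.0, 1.0)
BOOLEAN = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)


def cky(words, lexical, binary, start, sr, lift=lambda p: p):
    """CKY over a CNF grammar; `lift` maps a rule probability into the semiring."""
    n = len(words)
    chart = defaultdict(lambda: sr.zero)  # (i, k, nonterminal) -> semiring value
    for i, word in enumerate(words):
        for (a, terminal), p in lexical.items():
            if terminal == word:
                chart[i, i + 1, a] = sr.plus(chart[i, i + 1, a], lift(p))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):  # split point
                for (a, b, c), p in binary.items():
                    value = sr.times(lift(p), sr.times(chart[i, j, b], chart[j, k, c]))
                    chart[i, k, a] = sr.plus(chart[i, k, a], value)
    return chart[0, n, start]


# Toy ambiguous grammar (an assumption): S -> S S with 0.4, S -> "a" with 0.6.
LEXICAL = {("S", "a"): 0.6}
BINARY = {("S", "S", "S"): 0.4}
sentence = ["a", "a", "a"]

inside = cky(sentence, LEXICAL, BINARY, "S", INSIDE)
viterbi = cky(sentence, LEXICAL, BINARY, "S", VITERBI)
recognized = cky(sentence, LEXICAL, BINARY, "S", BOOLEAN, lift=lambda p: p > 0)
```

The sentence has two parses of equal probability under this toy grammar, so the inside value is twice the Viterbi value, while the Boolean semiring only reports that a parse exists. Only the semiring argument changes between the three calls.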
Kleene Algebra with Domain
, 2003
Cited by 44 (30 self)
We propose Kleene algebra with domain (KAD), an extension of Kleene algebra with two equational axioms for a domain and a codomain operation, respectively. KAD considerably augments the expressibility of Kleene algebra, in particular for the specification and analysis of state transition systems. We develop the basic calculus, discuss some related theories and present the most important models of KAD. We demonstrate applicability by two examples: First, an algebraic reconstruction of Noethericity and well-foundedness. Second, an algebraic reconstruction of propositional Hoare logic.
A Kleene theorem for weighted tree automata
 Theory of Computing Systems
, 2002
Cited by 21 (8 self)
In this paper we prove Kleene's result for tree series over a commutative and idempotent semiring A (which is not necessarily complete or continuous), i.e., the class of recognizable tree series over A and the class of rational tree series over A are equal. We show the result by direct automata-theoretic constructions and prove their correctness.
Growth and ergodicity of context-free languages
 Trans. Amer. Math. Soc
Cited by 13 (7 self)
A language L over a finite alphabet Σ is called growth-sensitive if forbidding any set of subwords F yields a sublanguage L^F whose exponential growth rate is smaller than that of L. It is shown that every ergodic unambiguous, nonlinear context-free language is growth-sensitive. “Ergodic” means for a context-free grammar and language that its dependency digraph is strongly connected. The same result as above holds for the larger class of essentially ergodic context-free languages, and if growth is considered with respect to the ambiguity degrees, then the assumption of unambiguity may be dropped. The methods combine a construction of grammars for 2-block languages with a generating function technique regarding systems of algebraic equations.
Bounded Underapproximations
Cited by 12 (1 self)
We show a new and constructive proof of the following language-theoretic result: for every context-free language L, there is a bounded context-free language L′ ⊆ L which has the same Parikh (commutative) image as L. Bounded languages, introduced by Ginsburg and Spanier, are subsets of regular languages of the form w_1^* w_2^* ··· w_m^* for some w_1, ..., w_m ∈ Σ^*. In particular bounded context-free languages have nice structural and decidability properties. Our proof proceeds in two parts. First, we give a new construction that shows that each context-free language L has a subset L_N that has the same Parikh image as L and that can be represented as a sequence of substitutions on a linear language. Second, we inductively construct a Parikh-equivalent bounded context-free subset of L_N. We show two applications of this result in model checking: to underapproximate the reachable state space of multithreaded procedural programs and to underapproximate the reachable state space of recursive counter programs. The bounded language constructed above provides a decidable underapproximation for the original...
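The two notions used above, the Parikh image and membership in a bounded form w_1^* w_2^* ··· w_m^*, can be made concrete with a small sketch. It is only an illustration on a finite sample of words, not the paper's construction; the function names and the sample are assumptions.

```python
import re
from collections import Counter


def parikh_image(word):
    """Letter counts of a word, ignoring the order of letters."""
    return Counter(word)


def in_bounded_form(word, factors):
    """True if word lies in w_1^* w_2^* ... w_m^* for the given factors."""
    pattern = "".join(f"(?:{re.escape(w)})*" for w in factors)
    return re.fullmatch(pattern, word) is not None


def parikh_images(language):
    """Set of Parikh images of a finite set of words, in hashable form."""
    return {frozenset(parikh_image(w).items()) for w in language}


# Hypothetical finite sample and its subset inside the bounded form a^* b^*.
sample = {"ab", "ba", "aabb", "abab", "baba"}
bounded_subset = {w for w in sample if in_bounded_form(w, ["a", "b"])}

same_image = parikh_images(bounded_subset) == parikh_images(sample)
```

Here the subset of the sample that fits a^* b^* covers every letter-count vector occurring in the sample, which is the property the abstract states for context-free languages in general.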
Growth-sensitivity of context-free languages
, 2003
Cited by 5 (2 self)
A language L over a finite alphabet is called growth-sensitive if forbidding any set of subwords F yields a sublanguage L^F whose exponential growth rate is smaller than that of L. It is shown that every (essentially) ergodic nonlinear context-free language of convergent type is growth-sensitive. “Ergodic” means that the dependency digraph of the generating context-free grammar is strongly connected, and “essentially ergodic” means that there is only one nonregular strong component in that graph. The methods combine (1) an algorithm for constructing from a given grammar one that generates the associated 2-block language and (2) a generating function technique regarding systems of algebraic equations. Furthermore, the algorithm of (1) preserves unambiguity as well as the number of nonregular strong components of the dependency digraph.
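The ergodicity condition can be checked mechanically. The sketch below assumes the usual reading of the dependency digraph (vertices are nonterminals, with an edge from A to B whenever B occurs on the right-hand side of a production of A); that definition, the grammar encoding, and all function names are assumptions rather than details taken from the paper.

```python
def dependency_digraph(productions, nonterminals):
    """Map each nonterminal to the nonterminals occurring in its right-hand sides."""
    edges = {a: set() for a in nonterminals}
    for lhs, rhs in productions:
        edges[lhs].update(s for s in rhs if s in nonterminals)
    return edges


def reachable(edges, source):
    """Vertices reachable from source by depth-first search."""
    seen = {source}
    stack = [source]
    while stack:
        v = stack.pop()
        for w in edges[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def is_strongly_connected(edges):
    """Strongly connected iff one root reaches every vertex forwards and backwards."""
    vertices = set(edges)
    if not vertices:
        return True
    root = next(iter(vertices))
    reverse = {v: set() for v in vertices}
    for v, targets in edges.items():
        for w in targets:
            reverse[w].add(v)
    return reachable(edges, root) == vertices and reachable(reverse, root) == vertices


# Dyck-style grammar S -> a S b S | (empty): one nonterminal depending on itself.
dyck = [("S", ("a", "S", "b", "S")), ("S", ())]
# S -> A b, A -> a A | a: A never leads back to S.
chain = [("S", ("A", "b")), ("A", ("a", "A")), ("A", ("a",))]

dyck_ergodic = is_strongly_connected(dependency_digraph(dyck, {"S"}))
chain_ergodic = is_strongly_connected(dependency_digraph(chain, {"S", "A"}))
```

Note that a single vertex counts as strongly connected in this check even without a self-loop, so any extra conditions a precise definition imposes would need to be tested separately.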
Commutation Problems on Sets of Words and Formal Power Series
, 2002
Cited by 5 (3 self)
We study in this thesis several problems related to commutation on sets of words and on formal power series. We investigate the notion of semilinearity for formal power series in commuting variables, introducing two families of series, the semilinear and the bounded series, both natural generalizations of the semilinear languages, and we study their behaviour under rational operations, morphisms, Hadamard product, and difference. Turning to commutation on sets of words, we then study the notions of centralizer of a language (the largest set commuting with a language), of root, and of primitive root of a set of words. We answer a question raised by Conway more than thirty years ago (asking whether or not the centralizer of any rational language is rational) in the case of periodic, binary, and ternary sets of words, as well as for rational ω-codes, the most general results on this problem. We also prove that any code has a unique primitive root and that two codes commute if and only if they have the same primitive root, thus solving two conjectures of Ratoandromanana, 1989. Moreover, we prove that the commutation with an ω-code X can be characterized similarly as in free monoids: a language commutes with X if and only if it is a union of powers of the primitive root of X.
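Commutation of sets of words, XY = YX under elementwise concatenation, is easy to test for finite sets. The following is a minimal sketch with hypothetical helper names; it does not compute centralizers or primitive roots, which in general involve infinite sets.

```python
def concat(x, y):
    """Product of two finite languages: every word of x followed by every word of y."""
    return {u + v for u in x for v in y}


def power(x, n):
    """n-fold product of x with itself; the 0th power is the set holding the empty word."""
    result = {""}
    for _ in range(n):
        result = concat(result, x)
    return result


def commute(x, y):
    """True if the finite languages x and y satisfy xy = yx."""
    return concat(x, y) == concat(y, x)


X = {"a", "b"}                 # a code over the alphabet {a, b}
union_of_powers = X | power(X, 2)

powers_commute = commute(X, union_of_powers)
unrelated_commute = commute({"a", "ab"}, {"b"})
```

The first check is a finite instance of the "union of powers" direction in the characterization above; the second pair fails because {a, ab}{b} = {ab, abb} while {b}{a, ab} = {ba, bab}.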