Deterministic Part-of-Speech Tagging with Finite-State Transducers
 Computational Linguistics
, 1995
Cited by 82 (0 self)
Stochastic approaches to natural language processing have often been preferred to rule-based approaches because of their robustness and their automatic training capabilities. This was the case for part-of-speech tagging until Brill showed how state-of-the-art part-of-speech tagging can be achieved with a rule-based tagger by inferring rules from a training corpus. However, current implementations of the rule-based tagger run more slowly than previous approaches. In this paper, we present a finite-state tagger, inspired by the rule-based tagger, that operates in optimal time in the sense that the time to assign tags to a sentence corresponds to the time required to follow a single path in a deterministic finite-state machine. This result is achieved by encoding the application of the rules found in the tagger as a nondeterministic finite-state transducer and then turning it into a deterministic transducer. The resulting deterministic transducer yields a part-of-speech tagger whose speed is dominated by the access time of mass storage devices. We then generalize the techniques to the class of transformation-based systems.
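To illustrate what "following a single path in a deterministic finite-state machine" can look like, here is a minimal sketch of a hand-written deterministic transducer. It is not the paper's construction (which compiles learned rules into a nondeterministic transducer and then determinizes it). The rule "change vbn to vbd when the previous tag is np", the tag set, and the names build_toy_tagger and tag_sequence are all assumptions made for this example.

```python
# Toy deterministic transducer: one table lookup per input symbol.
# Assumed rule (illustrative only): rewrite "vbn" as "vbd" when the
# previous input tag is "np".

def build_toy_tagger(tags):
    """Return a table mapping (state, input_tag) -> (output_tag, next_state).

    State 0: the previous tag was not "np".
    State 1: the previous tag was "np".
    """
    table = {}
    for tag in tags:
        next_state = 1 if tag == "np" else 0
        # In state 0 the input tag is copied unchanged.
        table[(0, tag)] = (tag, next_state)
        # In state 1 the assumed rule fires on "vbn".
        output = "vbd" if tag == "vbn" else tag
        table[(1, tag)] = (output, next_state)
    return table


def tag_sequence(table, tags_in):
    """Apply the transducer in a single left-to-right pass."""
    state = 0
    tags_out = []
    for tag in tags_in:
        output, state = table[(state, tag)]
        tags_out.append(output)
    return tags_out


TOY_TAGS = ["np", "vbn", "vbd", "dt"]
toy_table = build_toy_tagger(TOY_TAGS)
result = tag_sequence(toy_table, ["np", "vbn", "dt", "vbn"])
print(result)  # ['np', 'vbd', 'dt', 'vbn']
```

Because the machine is deterministic, the work per input tag is a single table lookup, so the running time is linear in the sentence length regardless of how many rules were folded into the table.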
A conditional random field for discriminatively-trained finite-state string edit distance
 In Conference on Uncertainty in AI (UAI)
, 2005
Cited by 51 (7 self)
The need to measure sequence similarity arises in information extraction, object identity, data mining, biological sequence analysis, and other domains. This paper presents discriminative string-edit CRFs, a finite-state conditional random field model for edit sequences between strings. Conditional random fields have advantages over generative approaches to this problem, such as pair HMMs or the work of Ristad and Yianilos, because as conditionally-trained methods, they enable the use of complex, arbitrary actions and features of the input strings. As in generative models, the training data does not have to specify the edit sequences between the given string pairs. Unlike generative models, however, our model is trained on both positive and negative instances of string pairs. We present positive experimental results on several data sets.
The Equality Problem for Rational Series With Multiplicities in the Tropical Semiring is Undecidable
, 1994
Cited by 51 (2 self)
this paper that the equality problem for M-rational series over an alphabet with at least two letters is undecidable.
Detecting Deadlocks In Concurrent Systems
 In CONCUR’98: Concurrency Theory (Nice)
, 1998
Cited by 46 (11 self)
We study deadlocks using geometric methods based on generalized process graphs [11], i.e., cubical complexes or Higher-Dimensional Automata (HDA) [23, 24, 30, 35], describing the semantics of the concurrent system of interest. A new algorithm is described and fully assessed, both theoretically and practically, and compared with more well-known traversing techniques. An implementation is
On the Expressive Power of Temporal Logic
 J. COMPUT. SYSTEM SCI
, 1993
Cited by 40 (4 self)
We study the expressive power of linear propositional temporal logic interpreted on finite sequences or words. We first give a transparent proof of the fact that a formal language is expressible in this logic if and only if its syntactic semigroup is finite and aperiodic. This gives an effective algorithm to decide whether a given rational language is expressible. Our main result states a similar condition for the "restricted" temporal logic (RTL), obtained by discarding the "until" operator. A formal language is RTL-expressible if and only if its syntactic semigroup is finite and satisfies a certain simple algebraic condition. This leads
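The aperiodicity criterion above can be checked mechanically. The sketch below is not taken from the paper: it assumes the language is given as a complete minimal DFA, generates the transition monoid of that DFA (which, for a minimal DFA, is the syntactic monoid), and tests whether every element m satisfies m^n = m^(n+1) for some n. The function names and the two example automata, for (ab)* and (aa)*, are assumptions chosen for illustration.

```python
# Each letter is a tuple f where f[s] is the state reached from state s.

def compose(f, g):
    """Transformation obtained by applying f first, then g."""
    return tuple(g[s] for s in f)


def transition_monoid(generators, n_states):
    """Close the letter transformations under composition."""
    identity = tuple(range(n_states))
    seen = {identity}
    frontier = [identity]
    while frontier:
        m = frontier.pop()
        for g in generators:
            product = compose(m, g)
            if product not in seen:
                seen.add(product)
                frontier.append(product)
    return seen


def is_aperiodic(monoid):
    """True if the powers of every element eventually become constant."""
    for m in monoid:
        power = m
        visited = {power}
        while True:
            nxt = compose(power, m)
            if nxt == power:      # m^n == m^(n+1): this element is fine
                break
            if nxt in visited:    # powers entered a cycle of length > 1
                return False
            visited.add(nxt)
            power = nxt
    return True


# Minimal DFA for (ab)*: state 0 is initial and accepting, state 2 is a sink.
A_LETTER = (1, 2, 2)
B_LETTER = (2, 0, 2)
ab_star = transition_monoid([A_LETTER, B_LETTER], 3)

# Minimal DFA for (aa)* over the one-letter alphabet {a}.
aa_star = transition_monoid([(1, 0)], 2)

print(is_aperiodic(ab_star))  # True
print(is_aperiodic(aa_star))  # False
```

The second automaton fails because the letter a acts as a permutation of order 2, so its powers never stabilize; this matches the standard fact that counting modulo 2 is not expressible in this logic.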
Polynomial closure and unambiguous product
 Theory Comput. Systems
, 1997
Cited by 32 (5 self)
This paper is a contribution to the algebraic theory of recognizable languages. The main topic of this paper is the polynomial closure, an operation that mixes together the operations of union and concatenation. Formally, the polynomial closure of a class of languages L of A* is the set of languages
Finite semigroups and recognizable languages: an introduction
, 2002
Cited by 29 (9 self)
This paper is an attempt to share with a larger audience some modern developments in the theory of finite automata. It is written for the mathematician who has a background in semigroup theory but knows next to nothing about automata and languages. No proofs are given, but the main results are illustrated by several examples and counterexamples. What is the topic of this theory? It deals with languages, automata and semigroups, although recent developments have shown interesting connections with model theory in logic, symbolic dynamics and topology. Historically, in their attempt to formalize natural languages, linguists such as Chomsky gave a mathematical definition of natural concepts such as words, languages or grammars: given a finite set A, a word on A is simply an element of the free monoid on A, and a language is a set of words. But since scientists are fond of classifications of all sorts, language theory did not escape this mania. Chomsky established a first hierarchy, based on his formal grammars. In this paper, we are interested in the recognizable languages, which form the lower level of the
A conjecture on the Hall topology for the free group
, 1991
Cited by 25 (5 self)
The Hall topology for the free group is the coarsest topology such that every group morphism from the free group onto a finite discrete group is continuous. It was shown by M. Hall Jr. that every finitely generated subgroup of the free group is closed for this topology. We conjecture that if H1, H2, ..., Hn are finitely generated subgroups of the free group, then the product H1H2...Hn is closed. We discuss some consequences of this conjecture. First, it would give a nice and simple algorithm to compute the closure of a given rational subset of the free group. Next, it implies a similar conjecture for the free monoid, which, in turn, is equivalent to a deep conjecture on finite semigroups, for the solution of which J. Rhodes has offered $100. We hope that our new conjecture will shed some light on Rhodes's conjecture.
Like Me? Measures of Correspondence and Imitation
, 2001
Cited by 24 (7 self)
Imitation is a powerful mechanism for efficient learning of novel behaviors that both supports and takes advantage of sociality. A fundamental problem for imitation is to create an appropriate (partial) mapping between the body of the system being imitated and the imitator. By considering for each of these two systems an associated automaton (respectively, transformation semigroup) structure, attempts at such mapping can be considered (partial) relational homomorphisms. This article shows how mathematical techniques can be applied to characterize how far a behavior is from a successful imitation and how to evaluate attempts at imitation arising from a particular correspondence between the imitator and model. For the imitator and the imitated, affordances in the agent-environment structural coupling are likely to be different, all the more so in the case of dissimilar embodiment. We argue that the use of what is afforded to the imitator to attain corresponding effects or, as in dance, sequences of effects, is necessary and sufficient for successful imitation. However, the judged degree of success or failure of an attempted behavioral match depends on some externally imposed or, in the case of autonomous agents, internally determined criteria on effects of the attempted imitative behavior (including effects attained successively as well as final effects). These criteria correspond to metrics (measures of difference) which can guide the evaluation of a correspondence, the learning of a correspondence, or learning how to apply one. Metrics on states and sequences of action events in the system-environment coupling allow judgment of similarity for 'observer-dependent' purposes. This allows one to formally define successful
Tropical Semirings
Cited by 23 (0 self)
this paper is to present other semirings that occur in theoretical computer science. These semirings were baptized tropical semirings by Dominique Perrin in honour of the pioneering work of our Brazilian colleague and friend Imre Simon, but are also commonly known as (min, +)-semirings
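As a concrete illustration of the (min, +) structure named above, the following sketch treats min as the semiring "addition" (with infinity as its neutral element) and ordinary + as the semiring "multiplication" (with 0 as its neutral element), and uses them in a matrix product. The function names and the small weight matrix are assumptions made for this example, not material from the paper.

```python
INF = float("inf")  # neutral element for min, absorbing for +


def trop_add(x, y):
    """Semiring addition in the (min, +) semiring."""
    return min(x, y)


def trop_mul(x, y):
    """Semiring multiplication in the (min, +) semiring."""
    return x + y


def trop_matmul(a, b):
    """Matrix product where sums become min and products become +."""
    n, k, m = len(a), len(b), len(b[0])
    result = [[INF] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for t in range(k):
                result[i][j] = trop_add(result[i][j], trop_mul(a[i][t], b[t][j]))
    return result


# Edge weights of a small directed graph; INF means "no edge".
W = [
    [0, 4, INF],
    [INF, 0, 1],
    [INF, INF, 0],
]

W2 = trop_matmul(W, W)
print(W2[0][2])  # 5
```

With zeros on the diagonal, entry (i, j) of the tropical square W2 is the cheapest cost of going from i to j using at most two edges, which is one reason this semiring appears in shortest-path and automata-with-costs settings.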