Results 11–20 of 155
Regular Grammatical Inference from Positive and Negative Samples by Genetic Search: the GIG method
, 1994
Abstract

Cited by 38 (0 self)
We recall briefly in this paper the formal theory of regular grammatical inference from positive and negative samples of the language to be learned. We state this problem as a search for an optimal element in a Boolean lattice built from the positive information. We explain how a genetic search technique may be applied to this problem and we introduce a new set of genetic operators. To limit the growth in complexity as the sample size increases, we propose a semi-incremental procedure. Finally, an experimental protocol to assess the performance of a regular inference technique is detailed and comparative results are given.
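The search the abstract describes can be pictured as a genetic search over bit-vector encodings of candidate automata. The sketch below is generic and illustrative: the encoding, the one-point crossover and point mutation, and the toy fitness are stand-ins, not the paper's GIG operators or its lattice construction.

```python
import random

def genetic_search(fitness, n_bits, pop_size=30, generations=60, seed=0):
    """Minimal genetic search over bit-vectors.

    In regular inference, a bit-vector could encode which state merges of
    the prefix-tree acceptor to apply; `fitness` would then score the
    resulting automaton against the positive and negative samples.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]        # elitist truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_bits)      # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n_bits)] ^= 1   # point mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

# Toy fitness: prefer vectors with many 1s (a stand-in for "accepts all
# positive samples and rejects all negative ones").
best = genetic_search(sum, n_bits=16)
```

Because the survivors are carried over unchanged, the best fitness found never decreases across generations.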
Synchronous binarization for machine translation
 In Proc. HLT-NAACL
, 2006
Abstract

Cited by 35 (10 self)
Systems based on synchronous grammars and tree transducers promise to improve the quality of statistical machine translation output, but are often very computationally intensive. The complexity is exponential in the size of individual grammar rules due to arbitrary reorderings between the two languages, and rules extracted from parallel corpora can be quite large. We devise a linear-time algorithm for factoring syntactic reorderings by binarizing synchronous rules when possible and show that the resulting rule set significantly improves the speed and accuracy of a state-of-the-art syntax-based machine translation system.
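A rule's reordering can be written as the permutation mapping each source-side nonterminal to its target-side position, and a rule is binarizable when that permutation can be collapsed by repeatedly merging adjacent blocks whose target spans are also adjacent. A small illustrative test (not the paper's implementation):

```python
def binarizable(perm):
    """Check whether a synchronous rule's reordering is binarizable.

    `perm[i]` is the target-side position of the i-th source-side
    nonterminal.  The rule can be binarized iff the permutation reduces
    to a single block by merging source-adjacent blocks whose target
    spans are adjacent (a "separable" permutation).
    """
    blocks = [(p, p) for p in perm]  # (min, max) target span per block
    changed = True
    while changed:
        changed = False
        for i in range(len(blocks) - 1):
            (lo1, hi1), (lo2, hi2) = blocks[i], blocks[i + 1]
            if hi1 + 1 == lo2 or hi2 + 1 == lo1:  # target spans adjacent
                blocks[i : i + 2] = [(min(lo1, lo2), max(hi1, hi2))]
                changed = True
                break
    return len(blocks) == 1

# [1, 3, 2, 4] reduces fully; [2, 4, 1, 3] is the classic
# non-binarizable reordering.
```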
Statistical visual language models for ink parsing
 In AAAI Spring Symposium on Sketch Understanding
, 2002
Abstract

Cited by 30 (1 self)
In this paper we motivate a new technique for automatic recognition of hand-sketched digital ink. By viewing sketched drawings as utterances in a visual language, sketch recognition can be posed as an ambiguous parsing problem. On this premise we have developed an algorithm for ink parsing that uses a statistical model to disambiguate. Under this formulation, writing a new recognizer for a visual language is as simple as writing a declarative grammar for the language, generating a model from the grammar, and training the model on drawing examples. We evaluate the speed and accuracy of this approach for the sample domain of the SILK visual language and report positive initial results.
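The statistical disambiguation step can be illustrated by scoring each candidate derivation of the same strokes with the product of its rule probabilities and keeping the most likely one. The rule names and probabilities below are hypothetical, not SILK's actual grammar:

```python
import math

# Hypothetical rule probabilities, as if trained on drawing examples.
rule_prob = {
    ("Square",   ("Line", "Line", "Line", "Line")): 0.6,
    ("Arrow",    ("Line", "Line", "Line")):         0.3,
    ("Scribble", ("Line",)):                        0.1,
}

def score(derivation):
    """Log-probability of a derivation, i.e. a sequence of applied rules."""
    return sum(math.log(rule_prob[rule]) for rule in derivation)

def disambiguate(candidates):
    """Among ambiguous parses of the same strokes, keep the most likely."""
    return max(candidates, key=score)
```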
The Paradigms of Programming
 Communications of the ACM
, 1979
Abstract

Cited by 27 (0 self)
tee) cited Professor Floyd for "helping to found the following important subfields of computer science: the theory of parsing, the semantics of programming languages, automatic program verification, automatic program synthesis, and analysis of algorithms." Professor Floyd, who received both his A.B. and B.S. from the University of Chicago in 1953 and 1958, respectively, is a self-taught computer scientist. His study of computing began in 1956, when as a night operator for an IBM 650, he found the time to learn about programming between loads of card hoppers. Floyd implemented one of the first Algol 60 compilers, finishing his work on this project in 1962. In the process, he did some early work on compiler optimization. Subsequently, in the
Linear-Time Pointer-Machine Algorithms for Least Common Ancestors, MST Verification, and Dominators
 In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing
, 1998
Abstract

Cited by 27 (4 self)
We present two new data structure tools, disjoint set union with bottom-up linking and pointer-based radix sort, and combine them with bottom-level micro-trees to devise the first linear-time pointer-machine algorithms for offline least common ancestors, minimum spanning tree (MST) verification, randomized MST construction, and computing dominators in a flow graph.
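For reference, the operations the disjoint-set-union tool supports look like the textbook structure below. This is the standard RAM-model version with union by rank and path halving, not the paper's specialized pointer-machine variant with bottom-up linking:

```python
class DisjointSets:
    """Textbook union-find: union by rank, path halving on find."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra                  # attach shorter tree under taller
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
```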
Dominators in Linear Time
, 1997
Abstract

Cited by 25 (0 self)
A linear-time algorithm is presented for finding dominators in control flow graphs.
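To make the problem concrete: node d dominates node n when every path from the entry to n passes through d. The sketch below is the simple iterative fixed-point computation (worst-case quadratic), shown only as a baseline; it is not the paper's linear-time algorithm:

```python
def dominators(succ, entry):
    """Dominator sets of a control flow graph by fixed-point iteration.

    `succ` maps each node to its list of successors.  Returns, for each
    node n, the set of nodes that dominate n (including n itself).
    """
    nodes = sorted(succ)
    preds = {n: [] for n in nodes}
    for n, ss in succ.items():
        for s in ss:
            preds[s].append(n)
    dom = {n: set(nodes) for n in nodes}   # start from "everything dominates"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            # n's dominators: n itself plus what dominates every predecessor.
            new = ({n} | set.intersection(*(dom[p] for p in preds[n]))
                   if preds[n] else {n})
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom
```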
FSA: An Efficient and Flexible C++ Toolkit for Finite State Automata Using On-Demand Computation
 In Proc. ACL
, 2004
Abstract

Cited by 24 (12 self)
In this paper we present the RWTH FSA toolkit, an efficient implementation of algorithms for creating and manipulating weighted finite-state automata. The toolkit has been designed using the principle of on-demand computation and offers a large range of widely used algorithms. To prove the superior efficiency of the toolkit, we compare the implementation to that of other publicly available toolkits. We also show that on-demand computations help to reduce memory requirements significantly without any loss in speed. To increase its flexibility, the RWTH FSA toolkit supports high-level interfaces to the programming language Python as well as a command-line tool for interactive manipulation of FSAs. Furthermore, we show how to utilize the toolkit to rapidly build a fast and accurate statistical machine translation system. Future extensibility of the toolkit is ensured as it will be publicly available as open source software.
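The idea behind on-demand computation is that the states of a derived automaton are materialized only when reached, rather than built up front. A small illustrative sketch, not the RWTH FSA API, showing a lazy product (intersection) of two DFAs with a cache standing in for the states built so far:

```python
from functools import lru_cache

def lazy_intersection(dfa1, dfa2):
    """On-demand product construction of two DFAs.

    Each DFA is a triple (start, accept_states, delta), where
    delta(state, symbol) returns the next state or None.  Product states
    are created only when a query reaches them.
    """
    (s1, acc1, d1), (s2, acc2, d2) = dfa1, dfa2

    @lru_cache(maxsize=None)          # cache = product states built so far
    def delta(state, sym):
        q1, q2 = state
        r1, r2 = d1(q1, sym), d2(q2, sym)
        return None if r1 is None or r2 is None else (r1, r2)

    def accepts(word):
        state = (s1, s2)
        for sym in word:
            state = delta(state, sym)
            if state is None:
                return False
        return state[0] in acc1 and state[1] in acc2

    return accepts

# Example: strings with an even number of 'a's AND ending in 'b'.
even_a = (0, {0}, lambda q, s: 1 - q if s == "a" else q)
ends_b = (0, {1}, lambda q, s: 1 if s == "b" else 0)
accepts = lazy_intersection(even_a, ends_b)
```

Only product states actually visited by queries are ever cached, which is what keeps memory usage low for large composed automata.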
Binarization of Synchronous Context-Free Grammars
Abstract

Cited by 24 (5 self)
Systems based on synchronous grammars and tree transducers promise to improve the quality of statistical machine translation output, but are often very computationally intensive. The complexity is exponential in the size of individual grammar rules due to arbitrary reorderings between the two languages. We develop a theory of binarization for synchronous context-free grammars and present a linear-time algorithm for binarizing synchronous rules when possible. In our large-scale experiments, we found that almost all rules are binarizable and the resulting binarized rule set significantly improves the speed and accuracy of a state-of-the-art syntax-based machine translation system. We also discuss the more general, and computationally more difficult, problem of finding good parsing strategies for non-binarizable rules, and present an approximate polynomial-time algorithm for this problem.
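A linear-time binarization can be sketched as a left-to-right shift-reduce pass over the rule's reordering permutation: shift each nonterminal, then greedily merge the top two stack items whenever their target spans are adjacent, each merge corresponding to one virtual nonterminal. An illustrative sketch in this spirit, not the paper's implementation:

```python
def binarize(perm):
    """Shift-reduce binarization of a synchronous rule's reordering.

    `perm[i]` is the target-side position of source-side nonterminal i.
    Returns a nested binary tree over source indices if the rule is
    binarizable, else None.
    """
    stack = []  # items: (tree, lo, hi) with target span [lo, hi]
    for i, p in enumerate(perm):
        stack.append((i, p, p))                       # shift
        while len(stack) > 1:
            t2, lo2, hi2 = stack[-1]
            t1, lo1, hi1 = stack[-2]
            if hi1 + 1 == lo2 or hi2 + 1 == lo1:      # target spans adjacent
                # reduce: this pair becomes one virtual nonterminal
                stack[-2:] = [((t1, t2), min(lo1, lo2), max(hi1, hi2))]
            else:
                break
    return stack[0][0] if len(stack) == 1 else None
```

A rule that ends the pass with more than one stack item has a reordering, such as (2, 4, 1, 3), that no binary decomposition can express.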
Recent Methods for RNA Modeling Using Stochastic Context-Free Grammars
, 1994
Abstract

Cited by 24 (1 self)
Stochastic context-free grammars (SCFGs) can be applied to the problems of folding, aligning and modeling families of homologous RNA sequences. SCFGs capture the sequences' common primary and secondary structure and generalize the hidden Markov models (HMMs) used in related work on protein and DNA. This paper discusses our new algorithm, Tree-Grammar EM, for deducing SCFG parameters automatically from unaligned, unfolded training sequences. Tree-Grammar EM, a generalization of the HMM forward-backward algorithm, is based on tree grammars and is faster than the previously proposed inside-outside SCFG training algorithm. Independently, Sean Eddy and Richard Durbin have introduced a trainable "covariance model" (CM) to perform similar tasks. We compare and contrast our methods with theirs.
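The quantity that inside-outside-style training re-estimates is the inside probability: the total probability that a nonterminal derives a given span. A minimal CYK-style sketch for a PCFG in Chomsky normal form, with hypothetical rule tables (this is the generic algorithm, not Tree-Grammar EM itself):

```python
from collections import defaultdict

def inside_probability(words, lexical, binary, start="S"):
    """Inside (CYK) probability of `words` under a PCFG in CNF.

    `lexical[(A, w)]` is P(A -> w); `binary[(A, B, C)]` is P(A -> B C).
    Both tables are illustrative inputs supplied by the caller.
    """
    n = len(words)
    inside = defaultdict(float)  # inside[i, j, A] = P(A =>* words[i:j])
    for i, w in enumerate(words):
        for (A, word), p in lexical.items():
            if word == w:
                inside[i, i + 1, A] += p
    for span in range(2, n + 1):          # widen spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):     # split point
                for (A, B, C), p in binary.items():
                    inside[i, j, A] += p * inside[i, k, B] * inside[k, j, C]
    return inside[0, n, start]
```

Summing over split points (rather than maximizing) is what distinguishes the inside score from Viterbi parsing, and it is the value whose gradient-free re-estimation EM performs.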
Some computational complexity results for synchronous context-free grammars
 In Proceedings of HLT/EMNLP 2005
, 2005
Abstract

Cited by 23 (3 self)
This paper investigates some computational problems associated with probabilistic translation models that have recently been adopted in the literature on machine translation. These models can be viewed as pairs of probabilistic context-free grammars working in a ‘synchronous’ way. Two hardness results for the class NP are reported, along with an exponential-time lower bound for certain classes of algorithms that are currently used in the literature.