Results 1 - 6 of 6
A General Weighted Grammar Library
 In Proceedings of the Ninth International Conference on Automata (CIAA 2004)
, 2004
Cited by 8 (4 self)
We present a general weighted grammar software library, the GRM Library, that can be used in a variety of applications in text, speech, and biosequence processing. The underlying algorithms were designed to support a wide variety of semirings and the representation and use of very large grammars and automata of several hundred million rules or transitions. We describe several algorithms and utilities of this library and point out in each case their application to several text and speech processing tasks.
New techniques for regular expression searching
 Algorithmica
, 2005
Cited by 7 (0 self)
We present two new techniques for regular expression searching and use them to derive faster practical algorithms. Based on the specific properties of Glushkov’s nondeterministic finite automaton construction algorithm, we show how to encode a deterministic finite automaton (DFA) using $O(m2^m)$ bits, where $m$ is the number of characters, excluding operator symbols, in the regular expression. This compares favorably against the worst case of $O(m2^m|\Sigma|)$ bits needed by a classical DFA representation (where $\Sigma$ is the alphabet) and $O(m2^{2m})$ bits needed by the Wu and Manber approach implemented in Agrep. We also present a new way to search for regular expressions, which is able to skip text characters. The idea is to determine the minimum length ℓ of a string matching the regular expression, manipulate the original automaton so that it recognizes all the reverse prefixes of length up to ℓ of the strings originally accepted, and use it to skip text characters as done for exact string matching in previous work. We combine these techniques into two algorithms, one able and one unable to skip text characters. The algorithms are simple to implement, and our experiments show that they permit fast searching for regular expressions, normally faster than any existing algorithm.
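To give a feel for the three space bounds quoted in this abstract, the sketch below evaluates them for a sample pattern size. It is only an illustration: the function name `dfa_bits` and the sample values (m = 8, an alphabet of 256 symbols) are assumptions, and the asymptotic bounds are evaluated with all constant factors taken as 1.

```python
def dfa_bits(m: int, sigma: int) -> dict[str, int]:
    """Evaluate the three asymptotic space bounds with constant factors set to 1.

    m     -- number of characters in the regular expression (operators excluded)
    sigma -- alphabet size |Sigma|
    """
    return {
        "glushkov_based": m * 2**m,          # m * 2^m
        "classical_dfa": m * 2**m * sigma,   # m * 2^m * |Sigma|
        "wu_manber_agrep": m * 2**(2 * m),   # m * 2^(2m)
    }


# Hypothetical sample values, chosen only for illustration.
bounds = dfa_bits(m=8, sigma=256)
for name, bits in bounds.items():
    print(f"{name}: {bits} bits")
```

For these sample values the first bound is smaller than the other two by a factor of 256; the gap relative to the Wu and Manber bound grows as 2^m.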
Lattice Kernels for Spoken Dialog Classification
 In Proceedings ICASSP'03, Hong Kong
, 2003
Cited by 6 (1 self)
Classification is a key task in spoken-dialog systems. The response of a spoken-dialog system is often guided by the category assigned to the speaker’s utterance. Unfortunately, classifiers based on the one-best transcription of the speech utterances are not satisfactory because of the high word error rate of conversational speech recognition systems. Since the correct transcription may not be the highest ranking one but often will be represented in the word lattices output by the recognizer, the classification accuracy can be much higher if the full lattice is exploited both during training and classification. In this paper we present the first principled approach for classification based on full lattices. For this purpose, we use the Support Vector Machine (SVM) framework with kernels for lattices. The lattice kernel we define belongs to the general class of rational kernels. We give efficient algorithms for computing kernels for arbitrary lattices and report experiments using the algorithm in a difficult call-classification task with [number garbled in source] categories. Our experiments with a trigram lattice kernel show a [figure garbled in source] reduction in error rate at a [figure garbled in source] rejection level.
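As a rough illustration of the idea behind an n-gram kernel, the sketch below handles only the degenerate case in which each lattice is a single unweighted word sequence, so that n-gram counts replace the expected counts a full lattice would supply. This is not the paper's algorithm for arbitrary lattices; the function names and the sample utterances are assumptions made for the example.

```python
from collections import Counter


def ngram_counts(words, n=3):
    """Count the n-grams (as tuples of words) in a single word sequence."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def ngram_kernel(x, y, n=3):
    """Inner product of the n-gram count vectors of two word sequences.

    Assumption: each input is one transcription (a single-path lattice),
    not a weighted lattice with many competing paths.
    """
    cx, cy = ngram_counts(x, n), ngram_counts(y, n)
    # A Counter returns 0 for n-grams it has not seen.
    return sum(count * cy[gram] for gram, count in cx.items())


# Hypothetical utterances, for illustration only.
x = "i want to check my bill".split()
y = "i want to pay my bill".split()
print(ngram_kernel(x, y, n=3))
print(ngram_kernel(x, y, n=2))
```

A value computed this way can serve as a kernel for an SVM because it is an ordinary inner product of count vectors.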
The Design Principles and Algorithms of a General Weighted Grammar Library
We present the software design principles, algorithms, and utilities of a general weighted grammar library, the GRM Library, that can be used in a variety of applications in text, speech, and biosequence processing. Several of the algorithms and utilities of this library are described, including in some cases their pseudocode and pointers to their use in applications. The algorithms and the utilities were designed to support a wide variety of semirings and the representation and use of large grammars and automata of several hundred million rules or transitions.
Rhetorical Systems 4 Crichton’s Close, Edinburgh
, 2004
This paper describes a novel method of compiling ranked tagging rules into a deterministic finite-state device called a bimachine. The rules are formulated in the framework of regular rewrite operations and allow unrestricted regular expressions in both left and right rule contexts. The compiler is illustrated by an application within a speech synthesis system.
Random Generation of Deterministic Acyclic Automata Using Markov Chains
, 2013
In this article we propose an algorithm, based on Markov chain techniques, to generate random automata that are deterministic, accessible and acyclic. The distribution of the output approaches the uniform distribution over such automata with n states. We then show how to adapt this algorithm in order to generate minimal acyclic automata with n states almost uniformly.