Results 1–10 of 21
Discovering Models of Software Processes from Event-Based Data
 ACM Transactions on Software Engineering and Methodology
, 1998
"... this article we describe a Markov method that we developed specifically for process discovery, as well as describe two additional methods that we adopted from other domains and augmented for our purposes. The three methods range from the purely algorithmic to the purely statistical. We compare the m ..."
Abstract

Cited by 235 (7 self)
 Add to MetaCart
In this article, we describe a Markov method that we developed specifically for process discovery, as well as two additional methods that we adopted from other domains and augmented for our purposes. The three methods range from the purely algorithmic to the purely statistical. We compare the methods and discuss their application in an industrial case study.
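The abstract does not detail the Markov method itself, but the general idea behind Markov-based process discovery can be sketched: estimate transition probabilities between event types from logged traces, then keep only the transitions frequent enough to count as part of the process. The event names and the noise threshold below are illustrative, not taken from the paper:

```python
from collections import Counter, defaultdict

def discover_process(traces, threshold=0.1):
    """Infer a process graph from event traces: count pairwise
    transitions, estimate P(next | current), and keep only edges
    whose probability reaches the noise threshold."""
    counts = defaultdict(Counter)
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[a][b] += 1
    graph = {}
    for state, nexts in counts.items():
        total = sum(nexts.values())
        graph[state] = {b: n / total for b, n in nexts.items()
                        if n / total >= threshold}
    return graph

# Two hypothetical traces of a software process log:
traces = [
    ["design", "code", "test", "release"],
    ["design", "code", "test", "code", "test", "release"],
]
g = discover_process(traces)
# g["test"] now records both the release edge and the rework loop back to "code"
```

The threshold is what makes the method "statistical" rather than purely algorithmic: rare transitions are treated as noise and dropped from the discovered model.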
Finite-State Speech-to-Speech Translation
, 1997
"... A fully integrated approach to SpeechInput Language Translation in limiteddomain applications is presented. The mapping from the input to the output language is modeled in terms of a finite state translation model which is learned from examples of inputoutput sentences of the task considered. Thi ..."
Abstract

Cited by 64 (14 self)
 Add to MetaCart
A fully integrated approach to Speech-Input Language Translation in limited-domain applications is presented. The mapping from the input to the output language is modeled in terms of a finite-state translation model which is learned from examples of input-output sentences of the task considered. This model is tightly integrated with standard acoustic-phonetic models of the input language, and the resulting global model directly supplies, through Viterbi search, an optimal output-language sentence for each input-language utterance. Several extensions to this framework, recently developed to cope with the increasing difficulty of translation tasks, are reviewed. Finally, results for a task in the framework of hotel front-desk communication, with a vocabulary of about 700 words, are reported.
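The core mechanism here, a Viterbi search over a weighted finite-state translation model, can be sketched on a toy transducer. The states, vocabulary, and probabilities below are invented for illustration; the real model is learned from example sentence pairs and composed with acoustic models:

```python
import math

# Transitions: state -> list of (input_word, output_word, next_state, prob).
# A tiny hypothetical English-to-Spanish fragment:
FST = {
    0: [("good", "buenas", 1, 0.7), ("good", "buen", 1, 0.3)],
    1: [("morning", "tardes", 2, 0.1), ("morning", "dias", 2, 0.9)],
}
FINAL = {2}

def viterbi_translate(words):
    """Best-path (Viterbi) search: per reachable state, keep only the
    most probable partial translation of the input consumed so far."""
    beam = {0: (0.0, [])}                    # state -> (log-prob, output)
    for w in words:
        nxt = {}
        for state, (lp, out) in beam.items():
            for inp, o, s2, p in FST.get(state, []):
                if inp != w:
                    continue
                cand = (lp + math.log(p), out + [o])
                if s2 not in nxt or cand[0] > nxt[s2][0]:
                    nxt[s2] = cand           # Viterbi pruning per state
        beam = nxt
    best = max((v for s, v in beam.items() if s in FINAL),
               key=lambda v: v[0])
    return " ".join(best[1])
```

Because the model is finite-state, the same dynamic-programming search extends naturally to a lattice of acoustic hypotheses, which is what makes the "fully integrated" speech-input translation of the abstract possible.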
Incremental Regular Inference
 Proceedings of the Third ICGI96
, 1996
"... In this paper, we extend the characterization of the search space of regular inference [DMV94] to sequential presentations of learning data. We propose the RPNI2 algorithm, an incremental extension of the RPNI algorithm. We study the convergence and complexities of both algorithms from a theoretical ..."
Abstract

Cited by 31 (2 self)
 Add to MetaCart
In this paper, we extend the characterization of the search space of regular inference [DMV94] to sequential presentations of learning data. We propose the RPNI2 algorithm, an incremental extension of the RPNI algorithm. We study the convergence and complexities of both algorithms from a theoretical and practical point of view. These results are assessed on the Feldman task. Regular inference is the problem of learning a regular language from a positive sample, that is, a finite set of strings supposed to be drawn from a target language. Whenever a negative sample, that is, a finite set of strings not belonging to the target language, is also available, the problem may be solved by the RPNI algorithm proposed by Oncina and García [OG92] and, independently, by Lang [Lan92]. The RPNI algorithm has been shown to identify in the limit any regular language with polynomial complexity as a function of the positive and negative sample sizes. However, this algorithm requir...
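The starting point of RPNI and its incremental variant is the prefix tree acceptor (PTA): an automaton with one state per distinct prefix of the positive sample, which accepts exactly that sample and which the algorithm then generalizes by merging states consistent with the negative sample. A minimal sketch of the PTA construction (not the merging phase itself):

```python
def build_pta(positives):
    """Prefix Tree Acceptor: one state per distinct prefix of the
    positive sample. RPNI starts from this automaton and generalizes
    it by state merging, constrained by the negative sample."""
    trans, accept = {}, set()
    next_state = 1                      # state 0 is the root (empty prefix)
    for word in positives:
        state = 0
        for sym in word:
            if (state, sym) not in trans:
                trans[(state, sym)] = next_state
                next_state += 1
            state = trans[(state, sym)]
        accept.add(state)               # each full positive string accepts
    return trans, accept

def accepts(trans, accept, word):
    state = 0
    for sym in word:
        if (state, sym) not in trans:
            return False
        state = trans[(state, sym)]
    return state in accept

trans, accept = build_pta(["a", "ab", "abb"])
```

Before any merging, the PTA rejects everything outside the positive sample; merging states is what introduces loops and hence generalization to an infinite language.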
Using domain information during the learning of a subsequential transducer
, 1996
"... The recently proposed OSTI algorithm allows for the identification of subsequential functions from input/output pairs. However, if the target is a partial function the convergence is not guaranteed. In this work, we extend the algorithm in order to allow for the identification of any partial subsequ ..."
Abstract

Cited by 9 (0 self)
 Add to MetaCart
The recently proposed OSTIA algorithm allows for the identification of subsequential functions from input/output pairs. However, if the target is a partial function, convergence is not guaranteed. In this work, we extend the algorithm to allow for the identification of any partial subsequential function, provided that either a negative sample or a description of the domain by means of a deterministic finite automaton is available.
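The role of the domain DFA can be illustrated with a toy partial transduction: the deterministic automaton delimits where the function is defined, and the subsequential transducer is only applied inside that domain. The automaton, transducer, and alphabet below are hypothetical, chosen only to show the mechanism:

```python
# Hypothetical domain DFA: strings over {a, b} that start with 'a'.
DOM_TRANS = {(0, "a"): 1, (1, "a"): 1, (1, "b"): 1}
DOM_FINAL = {1}

# Hypothetical subsequential transducer: deterministic, one output
# string emitted per consumed input symbol.
T_TRANS = {(0, "a"): ("x", 0), (0, "b"): ("y", 0)}

def in_domain(word):
    state = 0
    for sym in word:
        if (state, sym) not in DOM_TRANS:
            return False
        state = DOM_TRANS[(state, sym)]
    return state in DOM_FINAL

def transduce(word):
    """Apply the transducer only inside the described domain; outside
    it the partial function is simply undefined (None)."""
    if not in_domain(word):
        return None
    out, state = [], 0
    for sym in word:
        o, state = T_TRANS[(state, sym)]
        out.append(o)
    return "".join(out)
```

During learning, the same domain information prevents the algorithm from being forced to generalize over inputs on which the target function is undefined, which is what restores convergence for partial functions.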
Using knowledge to improve N-Gram Language Modelling through the MGGI methodology
 In Grammatical Inference: Learning Syntax from Sentences, L. Miclet and C. de la Higuera, Eds., LNAI 1147
, 1996
"... The structural limitations of NGram models used for Language Modelling are illustrated through several examples. In most cases of interest, these limitations can be easily overcome using (general) regular or finitestate models, without having to resort to more complex, recursive devices. The p ..."
Abstract

Cited by 9 (3 self)
 Add to MetaCart
The structural limitations of N-Gram models used for Language Modelling are illustrated through several examples. In most cases of interest, these limitations can be easily overcome using (general) regular or finite-state models, without having to resort to more complex, recursive devices. The problem is how to obtain the required finite-state structures from reasonably small amounts of training (positive) sentences of the considered task. Here this problem is approached through a Grammatical Inference technique known as MGGI. This allows us to easily apply a priori knowledge about the type of syntactic constraints that are relevant to the considered task, significantly improving the performance of N-Grams using similar or smaller amounts of training data. Speech Recognition experiments are presented with results supporting the interest of the proposed approach.
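The structural limitation in question is easy to demonstrate: a bigram model only constrains adjacent words, so it cannot enforce a long-distance dependency that a task-specific finite-state model could. The training sentences below are invented purely to illustrate this point:

```python
from collections import Counter, defaultdict

# Hypothetical task: the object of "open" must match the object of "close".
train = [
    ["open", "the", "door", "close", "the", "door"],
    ["open", "the", "window", "close", "the", "window"],
]

# Bigram counts: the model only ever sees adjacent word pairs, so the
# open/close agreement constraint is invisible to it.
bigrams = defaultdict(Counter)
for sent in train:
    for a, b in zip(["<s>"] + sent, sent):
        bigrams[a][b] += 1

def bigram_allows(sent):
    """True iff every adjacent pair (including sentence start) was seen."""
    return all(bigrams[a][b] > 0
               for a, b in zip(["<s>"] + sent, sent))

# Mismatched objects: never seen in training, yet every bigram is attested.
bad = ["open", "the", "door", "close", "the", "window"]
```

A finite-state model with distinct states for "door opened" and "window opened" rejects `bad` outright; obtaining such state structure from small training samples is exactly what the MGGI methodology addresses.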
Language Understanding and Subsequential Transducer Learning
, 1998
"... Language Understanding can be considered as the realization of a mapping from sentences of a natural language into a description of their meaning in an appropriate formal language. Under this viewpoint, the application of the Onward Subsequential Transducer Inference Algorithm (OSTIA) to Language Un ..."
Abstract

Cited by 8 (3 self)
 Add to MetaCart
Language Understanding can be considered as the realization of a mapping from sentences of a natural language into a description of their meaning in an appropriate formal language. Under this viewpoint, the application of the Onward Subsequential Transducer Inference Algorithm (OSTIA) to Language Understanding is considered. The basic version of OSTIA is reviewed and a new version is presented in which syntactic restrictions of the domain and/or range of the target transduction can effectively be taken into account. For experimentation purposes, a task proposed by Feldman et al. for assessing the capabilities of Language Learning and Understanding systems has been adopted, and three increasingly difficult-to-learn semantic coding schemes have been defined for this task. In all cases the basic version of OSTIA has consistently proved able to learn very compact and accurate transducers from relatively small training sets of input-output examples of the task. Moreover, if the input sentences are corrupted with syntactic incorrectness or errors, the new version of OSTIA still provides understanding results that only degrade in a gradual and natural way.
Stone Soup Translation: The Linked Automata Model
, 2002
"... The automated translation of one natural language to another, known as machine translation (MT), typically requires successful modeling of the grammars of the languages and the relationship between them. Rather than handcoding these grammars and relationships, some machine translation e#orts employ ..."
Abstract

Cited by 5 (0 self)
 Add to MetaCart
The automated translation of one natural language to another, known as machine translation (MT), typically requires successful modeling of the grammars of the languages and the relationship between them. Rather than hand-coding these grammars and relationships, some machine translation efforts employ data-driven methods, where the goal is to learn from a large amount of training examples of accurate translations. One such data-driven approach is statistical MT, where language and alignment models are automatically induced from parallel corpora. This work has also been extended to probabilistic finite-state approaches, most often via transducers.
The data driven approach applied to the OSTIA algorithm
 Proceedings of the Fourth International Colloquium on Grammatical Inference; Lecture
, 1998
"... ..."
Spoken-Language Machine Translation in Limited-Domain Tasks
 In Proceedings in Artificial Intelligence: CRIM/FORWISS Workshop on Progress and Prospects of Speech Research and Technology
, 1994
"... Subsequential transducers constitute a formal model for translation that may be considered perhaps too simple to model translation between natural languages. However, their capability can suffice in limiteddomain translation tasks. The finitestate nature of subsequential transducers makes their int ..."
Abstract

Cited by 2 (2 self)
 Add to MetaCart
Subsequential transducers constitute a formal model for translation that may be considered perhaps too simple to model translation between natural languages. However, their capability can suffice in limited-domain translation tasks. The finite-state nature of subsequential transducers makes their integration with well-known Continuous Speech Recognition technology both easy and efficient. A recent algorithm allows the automatic learning of these transducers, given a sufficiently large set of examples of sentences and their corresponding translations, and it also allows the incorporation of syntactic restrictions of the input and/or output languages. In this paper, we describe an implementation of a Speech Translation System for limited domains which is fully trainable and capable of real-time translation from speech input.
Optimum Algorithm to Minimize Human Interactions in Sequential Computer Assisted Pattern Recognition
"... Given a Pattern Recognition task, Computer Assisted Pattern Recognition can be viewed as a series of solution proposals made by a computer system, followed by corrections made by a user, until an acceptable solution is found. For this kind of systems, the appropriate measure of performance is the ex ..."
Abstract

Cited by 1 (1 self)
 Add to MetaCart
Given a Pattern Recognition task, Computer Assisted Pattern Recognition can be viewed as a series of solution proposals made by a computer system, followed by corrections made by a user, until an acceptable solution is found. For this kind of system, the appropriate measure of performance is the expected number of corrections the user has to make. In the present work we study the special case in which the solution proposals have a sequential nature. Some examples of this type of task are language translation, speech transcription and handwritten text transcription. In all these cases the output (the solution proposal) is a sequence of symbols. In this framework it is assumed that the user always corrects the first error found in the proposed solution. As a consequence, the prefix of the proposed solution up to the last error correction can be assumed error-free in the next iteration.

Nowadays, all the techniques in the literature rely on proposing, at each step, the most probable suffix given that a prefix of the “correct” output is already known. Usually the computation of the conditional most probable output is an NP-hard or an undecidable problem (and then we have to apply some approximations) or, in some simple cases, complex dynamic programming techniques must be used (usually some variant of the Viterbi algorithm). In the present work we show that this strategy is not optimal when we are interested in minimizing the number of human interactions. Moreover, we describe the optimal strategy, which is simpler (and usually faster) to compute.
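The interaction protocol the abstract describes can be made concrete with a small simulation harness: the system proposes a completion of the validated prefix, the user corrects the first wrong symbol, and the corrected prefix is fed back. This sketch only implements the evaluation loop (the quantity being minimized), not the paper's optimal proposal strategy; the proposer and sentences are hypothetical:

```python
def count_interactions(propose, reference):
    """Simulate prefix-based computer-assisted transcription and return
    the number of corrections the user must make to reach `reference`."""
    prefix, corrections = [], 0
    while True:
        proposal = prefix + propose(prefix)
        if proposal == reference:
            break                            # user accepts the proposal
        # The user locates the first symbol where the proposal disagrees.
        i = len(prefix)
        while (i < min(len(reference), len(proposal))
               and proposal[i] == reference[i]):
            i += 1
        prefix = reference[:i + 1]           # position i is corrected
        corrections += 1
    return corrections

# A naive proposer that always completes from one stored hypothesis,
# ignoring the corrections it has received (purely illustrative):
hypothesis = ["the", "cat", "sat", "on", "a", "mat"]
reference = ["the", "cat", "sat", "on", "the", "mat"]

def propose(prefix):
    return hypothesis[len(prefix):]
```

Any proposal strategy, greedy most-probable-suffix or otherwise, can be plugged in as `propose` and compared on exactly the measure the paper argues should be optimized: the expected value of `count_interactions`.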