Results 11 – 20 of 42
Learnable Similarity Functions and Their Applications to Clustering and Record Linkage
, 2004
Abstract

Cited by 7 (0 self)
... rship (Xing et al. 2003), and relative comparisons (Schultz & Joachims 2004). These approaches have shown improvements over traditional similarity functions for different data types such as vectors in Euclidean space, strings, and database records composed of multiple text fields. While these initial results are encouraging, there still remains a large number of similarity functions that are currently unable to adapt to a particular domain. In our research, we attempt to bridge this gap by developing both new learnable similarity functions and methods for their application to particular problems in machine learning and data mining. In preliminary work, we proposed two learnable similarity functions for strings that adapt distance computations given training pairs of equivalent and nonequivalent strings (Bilenko & Mooney 2003a). The first function is based on a probabilistic model of edit distance with affine gaps (Gus ... Copyright © 2004, American Association for Artificial Intelligence.
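The abstract's first learnable function builds on edit distance with affine gaps. As a point of reference, here is a minimal non-learned sketch of the classic affine-gap alignment cost (Gotoh-style dynamic programming); the substitution and gap costs are hand-set illustrative values, not the learned parameters the paper describes.

```python
def affine_gap_distance(s, t, sub=lambda a, b: 0 if a == b else 2,
                        gap_open=3, gap_extend=1):
    """Minimum-cost alignment with affine gap penalties (Gotoh-style DP).

    Illustrative only: fixed hand-set costs, not the learned probabilistic
    model from the abstract. Opening a gap costs gap_open + gap_extend;
    each further gapped symbol costs only gap_extend.
    """
    INF = float("inf")
    m, n = len(s), len(t)
    # Three DP layers: M ends in a match/substitution, X ends in a gap
    # in t (deletion from s), Y ends in a gap in s (insertion).
    M = [[INF] * (n + 1) for _ in range(m + 1)]
    X = [[INF] * (n + 1) for _ in range(m + 1)]
    Y = [[INF] * (n + 1) for _ in range(m + 1)]
    M[0][0] = 0
    for i in range(1, m + 1):
        X[i][0] = gap_open + gap_extend * i
    for j in range(1, n + 1):
        Y[0][j] = gap_open + gap_extend * j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            c = sub(s[i - 1], t[j - 1])
            M[i][j] = c + min(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = min(M[i - 1][j] + gap_open + gap_extend,
                          X[i - 1][j] + gap_extend)
            Y[i][j] = min(M[i][j - 1] + gap_open + gap_extend,
                          Y[i][j - 1] + gap_extend)
    return min(M[m][n], X[m][n], Y[m][n])
```

With these costs, one long gap is penalized less than many separate short ones, which is the property that makes the affine model attractive for strings with contiguous insertions.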
Dyna: Extending Datalog for Modern AI
Abstract

Cited by 4 (0 self)
Abstract. Modern statistical AI systems are quite large and complex; this interferes with research, development, and education. We point out that most of the computation involves database-like queries and updates on complex views of the data. Specifically, recursive queries look up and aggregate relevant or potentially relevant values. If the results of these queries are memoized for reuse, the memos may need to be updated through change propagation. We propose a declarative language, which generalizes Datalog, to support this work in a generic way. Through examples, we show that a broad spectrum of AI algorithms can be concisely captured by writing down systems of equations in our notation. Many strategies could be used to actually solve those systems. Our examples motivate certain extensions to Datalog, which are connected to functional and object-oriented programming paradigms. 1 Why a New Data-Oriented Language for AI? Modern AI systems are frustratingly big, making them time-consuming to engineer
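The abstract's central claim — that much AI computation is recursive look-up-and-aggregate queries whose results are memoized for reuse — can be illustrated with a toy system of equations in plain Python (a stand-in, not Dyna syntax; the weighted graph is hypothetical):

```python
from functools import lru_cache

# Hypothetical acyclic weighted graph: edges[(u, v)] = cost of u -> v.
edges = {
    ("s", "a"): 1, ("s", "b"): 4,
    ("a", "b"): 2, ("a", "t"): 6,
    ("b", "t"): 3,
}

@lru_cache(maxsize=None)
def dist(v):
    """One equation per node, in the spirit of the paper:
         dist(s) = 0
         dist(v) = min over edges (u, v) of dist(u) + cost(u, v)
    Solved by memoized recursion (valid here because the graph is
    acyclic); lru_cache plays the role of the memo table.
    """
    if v == "s":
        return 0.0
    candidates = [dist(u) + c for (u, w), c in edges.items() if w == v]
    return min(candidates) if candidates else float("inf")
```

Change propagation — updating memos when `edges` changes — is exactly what this naive sketch lacks and what the paper's proposed language would supply generically.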
A phrase-level machine translation approach for disfluency detection using weighted finite state transducers
 In Proceedings of Interspeech
, 2006
Abstract

Cited by 3 (1 self)
We propose a novel algorithm to detect disfluency in speech by reformulating the problem as phrase-level statistical machine translation using weighted finite state transducers. We approach the task as translation of noisy speech to clean speech. We simplify our translation framework such that it does not require fertility and alignment models. We tested our model on the Switchboard disfluency-annotated corpus. Using an optimized decoder that is developed for phrase-based translation at IBM, we are able to detect repeats, repairs and filled pauses for more than a thousand sentences in less than a second, with encouraging results. Index Terms: disfluency detection, machine translation, speech-to-speech translation.
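The noisy-to-clean "translation" view can be sketched, very loosely, as greedy longest-match substitution against a phrase table. The table below is hypothetical, and the real system uses weighted finite-state transducers with a full phrase-based decoder rather than this greedy pass:

```python
# Toy noisy -> clean phrase table: disfluent spans map to their clean
# replacements (empty tuple = delete the span). Entries are invented
# examples, not from the paper.
PHRASE_TABLE = {
    ("uh",): (),                 # filled pause -> delete
    ("um",): (),
    ("i", "mean"): (),           # editing phrase -> delete
    ("the", "the"): ("the",),    # repeat -> single copy
}

def clean(tokens, table=PHRASE_TABLE, max_len=2):
    """Greedy left-to-right pass, preferring the longest matching phrase."""
    out, i = [], 0
    while i < len(tokens):
        for k in range(max_len, 0, -1):
            phrase = tuple(tokens[i:i + k])
            if phrase in table:
                out.extend(table[phrase])
                i += k
                break
        else:  # no phrase matched: copy the token through unchanged
            out.append(tokens[i])
            i += 1
    return out
```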
Propagating Multitrust within Trust Networks
Abstract

Cited by 2 (2 self)
We suggest the concept of multitrust, which is aimed at computing trust by collectively involving a group of trustees at the same time: the trustor needs the concurrent support of multiple individuals to accomplish its task. We propose Soft Constraint Logic Programming based on semirings as a means to quickly represent and evaluate trust propagation for this scenario. To attain this, we model the trust network by adapting it to a weighted and-or graph, where the weight on a connector corresponds to the trust feedback value among the connected nodes. Semirings are the parametric and flexible structures used to appropriately represent trust metrics.
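The semiring idea — one operation to combine trust along a path and another to aggregate across alternative paths — can be sketched with exhaustive path enumeration (fine for toy networks; the SCLP machinery in the paper is far more general, and the network below is invented):

```python
def best_trust(edges, src, dst, times, plus, one):
    """Aggregate trust over all simple paths from src to dst.

    `times` combines trust along a path, `plus` aggregates across
    alternative paths, and `one` is the identity for `times` -- i.e.
    a semiring parametrizes the trust metric, as in the abstract.
    Exhaustive search over simple paths: toy networks only.
    """
    def walk(v, seen, acc):
        if v == dst:
            return [acc]
        results = []
        for (u, w), t in edges.items():
            if u == v and w not in seen:
                results.extend(walk(w, seen | {w}, times(acc, t)))
        return results

    vals = walk(src, {src}, one)
    out = vals[0]
    for v in vals[1:]:
        out = plus(out, v)
    return out
```

Swapping the semiring changes the metric without touching the propagation code: `times=lambda a, b: a * b, plus=max` gives multiplicative best-path trust, while `times=min, plus=max` gives a weakest-link metric.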
M.: A discriminative model of stochastic edit distance in the form of a conditional transducer. Grammatical Inference: Algorithms and Applications 4201
, 2006
Abstract

Cited by 2 (0 self)
Abstract. Many real-world applications such as spell-checking or DNA analysis use the Levenshtein edit distance to compute similarities between strings. In practice, the costs of the primitive edit operations (insertion, deletion and substitution of symbols) are generally hand-tuned. In this paper, we propose an algorithm to learn these costs. The underlying model is a probabilistic transducer, computed by using grammatical inference techniques, that allows us to learn both the structure and the probabilities of the model. Beyond the fact that the learned transducers are neither deterministic nor stochastic in the standard terminology, they are conditional, thus independent of the distributions of the input strings. Finally, we show through experiments that our method allows us to design cost functions that depend on the string context where the edit operations are used. In other words, we obtain kinds of context-sensitive edit distances.
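The dynamic program that consumes such per-operation costs is the standard Levenshtein recurrence. A sketch with the costs passed in as plain functions marks where learned parameters would plug in (the costs here are hand-set, not learned, and context-free — the paper's contribution is precisely making them learned and context-sensitive):

```python
def weighted_edit_distance(s, t, ins, dele, sub):
    """Levenshtein DP with arbitrary per-symbol operation costs.

    ins(c), dele(c) and sub(a, b) are the hooks where a learned cost
    model would plug in; here they are supplied by the caller.
    """
    m, n = len(s), len(t)
    D = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = D[i - 1][0] + dele(s[i - 1])
    for j in range(1, n + 1):
        D[0][j] = D[0][j - 1] + ins(t[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            D[i][j] = min(
                D[i - 1][j] + dele(s[i - 1]),               # delete s[i-1]
                D[i][j - 1] + ins(t[j - 1]),                # insert t[j-1]
                D[i - 1][j - 1] + sub(s[i - 1], t[j - 1]),  # substitute
            )
    return D[m][n]
```

With unit insertion/deletion costs and a 0/1 substitution cost this reduces to the classic Levenshtein distance.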
A class of rational n-WFSM auto-intersections
 in Proc. Conf. Impl. and Appl. of Automata, Sophia Antipolis
, 2005
Abstract

Cited by 1 (0 self)
Abstract. Weighted finite-state machines with n tapes describe n-ary rational string relations. The join operation on n-ary relations is very important for applications. It is shown how to compute it via a simpler operation, the auto-intersection. Join and auto-intersection generally do not preserve rationality. We define a class of triples 〈A, i, j〉 such that the auto-intersection of the machine A w.r.t. tapes i and j can be computed by a delay-based algorithm. We point out how to extend this class and hope that it is sufficient for many practical applications.
Learning unbiased stochastic edit distance in the form of a memoryless finite-state transducer
 International Joint Conference on Machine Learning (2005). Workshop: Grammatical Inference Applications: Successes and Future Challenges
Abstract

Cited by 1 (0 self)
We aim at learning an unbiased stochastic edit distance in the form of a finite-state transducer from a corpus of (input, output) pairs of strings. Unlike other standard methods, which generally use the Expectation-Maximization algorithm, our algorithm learns a transducer independently of the marginal probability distribution of the input strings. Such an unbiased approach requires optimizing the parameters of a conditional transducer instead of a joint one. This transducer can be very useful in many domains of pattern recognition and machine learning, such as noise management or DNA alignment. Several experiments carried out with our algorithm show that it is able to correctly assess theoretical target distributions.
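The key distinction here is conditional versus joint normalization: a joint model p(x, y) bakes in the input distribution p(x), while the conditional model p(y | x) = p(x, y) / p(x) does not. At the level of a finite probability table (a drastic simplification of the paper's transducer), the conversion is a few lines:

```python
def conditionalize(joint):
    """Turn a joint table {(x, y): p(x, y)} into {(x, y): p(y | x)}.

    Divides each entry by the marginal p(x), so the result no longer
    depends on how frequent each input x was in the corpus -- the
    "unbiased" property the abstract argues for.
    """
    marginal = {}
    for (x, _y), p in joint.items():
        marginal[x] = marginal.get(x, 0.0) + p
    return {(x, y): p / marginal[x] for (x, y), p in joint.items()}
```

After conversion, the entries for each fixed x sum to 1, regardless of how skewed the original input distribution was.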
Finite-state transducer inference for a speech-input Portuguese-to-English machine translation system
Finite-State Dirichlet Allocation: Learned Priors on Finite-State Models
Abstract

Cited by 1 (0 self)
To model a collection of documents, suppose that each document was generated by a different hidden Markov model or probabilistic finite-state automaton (PFSA). Further suppose that all these PFSAs are similar because they are drawn from a single (but unknown) prior distribution over PFSAs. We wish to infer the prior, obtain smoothed estimates of the individual PFSAs, and reconstruct the hidden paths by which the unknown PFSAs generated their documents. As an initial application, particularly hard for our model because of its sparse data, we derive an FSA topology from WordNet. For each verb, we construct the "document" of all nouns that have appeared as its object. Our method then learns a better estimate of p(object | verb), as well as which paths in WordNet, and hence which senses of ambiguous objects, tend to be favored. Our method improves 14.6% over Witten-Bell smoothing on the conditional perplexity of objects given the verb, and 27.5% over random on detecting the most common senses of nouns in the SemCor corpus.
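The Witten-Bell baseline mentioned in this abstract interpolates the maximum-likelihood estimate with a backoff distribution, weighting the backoff by the number of distinct objects seen with each verb. A minimal sketch (toy counts; the backoff distribution is supplied by the caller):

```python
def witten_bell(counts, backoff):
    """Witten-Bell smoothed p(object | verb).

    counts: {(verb, obj): count}; backoff: obj -> backoff probability
    (e.g. a unigram or uniform distribution). Standard interpolation:
        p(o | v) = (c(v, o) + T(v) * backoff(o)) / (c(v) + T(v))
    where T(v) is the number of distinct objects seen with verb v.
    """
    total, types = {}, {}
    for (v, _o), c in counts.items():
        total[v] = total.get(v, 0) + c
        types[v] = types.get(v, 0) + 1

    def p(o, v):
        c = counts.get((v, o), 0)
        return (c + types[v] * backoff(o)) / (total[v] + types[v])

    return p
```

With counts {(eat, apple): 2, (eat, bread): 1} and a uniform backoff of 0.25 over four objects, p(apple | eat) = (2 + 2·0.25) / (3 + 2) = 0.5, and each unseen object receives 0.5 / 5 = 0.1.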
Automatic Broadcast News Speech Summarization
Abstract

Cited by 1 (0 self)
As the numbers of speech and video documents available on the web and on handheld devices soar to new levels, it becomes increasingly important to enable users to find relevant, significant and interesting parts of the documents automatically. In this dissertation, we present a system for summarizing Broadcast News (BN), ConciseSpeech, that identifies important segments of speech using lexical, acoustic/prosodic, and structural information, and combines them, optimizing significance, length and redundancy of the summary. There are many obstacles particular to speech, such as word errors, disfluencies and the lack of segmentation, that make speech summarization challenging. We present methods to address these problems. We show the use of Automatic Speech Recognition (ASR) confidence scores to compensate for word errors; present a phrase-level machine translation approach using weighted finite state transducers for detecting disfluency; and present the possibility of using intonational phrase segments for summarization. We also describe structural properties of BN used in determining which segments should be selected for a summary, including speaker roles, soundbites and commercials. We present Information Extraction (IE)