Results 11–20 of 38
Parse Disambiguation for a Rich HPSG Grammar
In First Workshop on Treebanks and Linguistic Theories, Sozopol, 2002
Cited by 20 (4 self)
Abstract: In this paper, we describe experiments on HPSG parse disambiguation using the Redwoods HPSG treebank (Oepen et al. 2002a,b,c). HPSG is a constraint-based lexicalist ("unification") grammar formalism.
Gaussian Mixture Models for Online Signature Verification
2003
Cited by 16 (7 self)
Abstract: This paper introduces and motivates the use of Gaussian Mixture Models (GMMs) for online signature verification. The individual Gaussian components are shown to represent some local, signer-dependent features that characterise spatial and temporal aspects of a signature, and are effective for modelling its specificity. The focus of this work is on automated order selection for signature models, based on the Minimum Description Length (MDL) principle. A complete experimental evaluation of the Gaussian Mixture signature models is conducted on a 50-user subset of the MCYT multimodal database. Algorithmic issues are explored, and comparisons are made to other commonly used online signature modelling techniques based on Hidden Markov Models (HMMs).
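Automated order selection of the kind this abstract describes can be illustrated with a small sketch. This is not the paper's procedure: it uses scikit-learn's `GaussianMixture` and the BIC, an MDL-style criterion, on synthetic stand-in data rather than signature features.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-in for feature vectors (e.g. position/pressure samples):
# two well-separated clusters, so a small order should win.
data = np.vstack([
    rng.normal(0.0, 0.5, size=(200, 3)),
    rng.normal(4.0, 0.5, size=(200, 3)),
])

def select_order(data, max_components=6):
    """Pick the GMM order minimizing BIC, an MDL-style criterion
    (negative log-likelihood plus a parameter-count penalty)."""
    scores = {}
    for k in range(1, max_components + 1):
        gmm = GaussianMixture(n_components=k, random_state=0).fit(data)
        scores[k] = gmm.bic(data)
    return min(scores, key=scores.get)

best_k = select_order(data)
print(best_k)
```

The criterion trades fit against model cost: adding components always raises the likelihood, but the penalty term stops the order from growing without bound.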
An Efficient MDL-Based Construction of RBF Networks
1998
Cited by 11 (2 self)
Abstract: We propose a method for optimizing the complexity of Radial Basis Function (RBF) networks. The method involves two procedures: adaptation (training) and selection. The first procedure adaptively changes the locations and widths of the basis functions and trains the linear weights. The selection procedure eliminates redundant basis functions using an objective function based on the Minimum Description Length (MDL) principle. By iteratively combining these two procedures we achieve a controlled way of training and modifying RBF networks, which balances accuracy, training time, and complexity of the resulting network. We test the proposed method on function approximation and classification tasks, and compare it to other recently proposed methods.
Keywords: radial basis functions, optimizing radial basis function networks, Minimum Description Length principle, function approximation, heart disease classification
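The adaptation/selection loop described above can be sketched in miniature. This is an assumption-laden toy, not the paper's algorithm: fixed-width Gaussian bases fitted by least squares, with greedy backward elimination driven by a generic two-part MDL-style score (data cost plus parameter cost).

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 120)
y = np.sin(x) + rng.normal(0, 0.05, size=x.shape)

def design(x, centers, width=0.8):
    # Gaussian basis functions plus a bias column
    phi = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width**2))
    return np.hstack([phi, np.ones((len(x), 1))])

def mdl_score(x, y, centers):
    """Two-part MDL-style score: (n/2) log(RSS/n) + (k/2) log n."""
    phi = design(x, centers)
    w, *_ = np.linalg.lstsq(phi, y, rcond=None)
    rss = np.sum((phi @ w - y) ** 2)
    n, k = len(x), phi.shape[1]
    return 0.5 * n * np.log(rss / n) + 0.5 * k * np.log(n)

# Start from a deliberately redundant grid of centers, then greedily
# drop any basis function whose removal lowers the MDL score.
centers = list(np.linspace(-3, 3, 15))
improved = True
while improved and len(centers) > 1:
    improved = False
    base = mdl_score(x, y, np.array(centers))
    for i in range(len(centers)):
        trial = centers[:i] + centers[i + 1:]
        if mdl_score(x, y, np.array(trial)) < base:
            centers = trial
            improved = True
            break

print(len(centers))  # fewer centers than the initial 15
```

The point of the score is the same balance the abstract names: the residual term rewards accuracy while the (k/2) log n term charges for network complexity.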
Tensor Decompositions, Alternating Least Squares and Other Tales
In Journal of Chemometrics, 2009
Cited by 10 (3 self)
Abstract: This work was originally motivated by a classification of tensors proposed by Richard Harshman. In particular, we focus on simple and multiple “bottlenecks”, and on “swamps”. Existing theoretical results are surveyed, some numerical algorithms are described in detail, and their numerical complexity is calculated. In particular, the interest of using the ELS enhancement in these algorithms is discussed. Computer simulations feed this discussion.
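For readers unfamiliar with the workhorse algorithm the abstract discusses, here is a bare-bones alternating least squares (ALS) routine for a rank-R CP decomposition of a 3-way tensor in NumPy. It is a plain sketch without the ELS enhancement or any safeguards against the swamps the paper analyzes.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding of a 3-way tensor (C-order columns)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def khatri_rao(A, B):
    """Column-wise Kronecker (Khatri-Rao) product."""
    r = A.shape[1]
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, r)

def cp_als(T, rank, iters=200, seed=0):
    """Plain ALS: cycle through the three factor matrices, solving a
    linear least-squares problem for each with the others fixed."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((s, rank)) for s in T.shape)
    for _ in range(iters):
        A = unfold(T, 0) @ np.linalg.pinv(khatri_rao(B, C).T)
        B = unfold(T, 1) @ np.linalg.pinv(khatri_rao(A, C).T)
        C = unfold(T, 2) @ np.linalg.pinv(khatri_rao(A, B).T)
    return A, B, C

# Recover an exact rank-2 tensor built from random factors.
rng = np.random.default_rng(1)
A0, B0, C0 = (rng.standard_normal((s, 2)) for s in (4, 5, 6))
T = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = cp_als(T, rank=2)
err = np.linalg.norm(T - np.einsum('ir,jr,kr->ijk', A, B, C)) / np.linalg.norm(T)
print(err)
```

On an easy exact-rank problem like this the error shrinks quickly; the bottleneck and swamp phenomena the paper studies arise when factor columns are nearly collinear, where this naive iteration stalls.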
Learning Probabilistic Subcategorization Preference by Identifying Case Dependencies and Optimal Noun Class Generalization Level
In Proceedings of the 5th ANLP, 1997
Cited by 9 (4 self)
Abstract: This paper proposes a novel method of learning probabilistic subcategorization preference. In the method, to cope with the ambiguities of case dependencies and noun class generalization of argument/adjunct nouns, we introduce a data structure which represents a tuple of independent partial subcategorization frames.
Morfessor in the Morpho Challenge
In Proceedings of the PASCAL Challenge Workshop on Unsupervised Segmentation of Words into Morphemes, 2006
Cited by 7 (2 self)
Abstract: In this work, Morfessor, a morpheme segmentation model and algorithm developed by the organizers of the Morpho Challenge, is outlined, and references are made to earlier work. Although Morfessor does not take part in the official Challenge competition, we report experimental results for the morpheme segmentation of English, Finnish and Turkish words. The obtained results are very good: Morfessor outperforms the other algorithms in the Finnish and Turkish tasks and comes second in the English task. In the Finnish speech recognition task, Morfessor achieves the lowest letter error rate.
Asymptotic Log-Loss of Prequential Maximum Likelihood Codes
In Conference on Learning Theory (COLT 2005), 2005
Cited by 6 (4 self)
Abstract: We analyze the Dawid-Rissanen prequential maximum likelihood codes relative to one-parameter exponential family models M. If data are i.i.d. according to an (essentially) arbitrary P, then the redundancy grows at rate (c/2) ln n. We show that c = σ₁²/σ₂², where σ₁² is the variance of P, and σ₂² is the variance of the distribution M* ∈ M that is closest to P in KL divergence. This shows that prequential codes behave quite differently from other important universal codes such as the two-part MDL, Shtarkov and Bayes codes, for which c = 1. This behavior is undesirable in an MDL model selection setting.
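Restated in display form, the redundancy result this abstract announces reads (this is only a transcription of the claim above, not a derivation):

```latex
\[
  \mathrm{RED}_n \;=\; \frac{c}{2}\,\ln n \;+\; O(1),
  \qquad
  c \;=\; \frac{\sigma_1^{2}}{\sigma_2^{2}},
\]
```

where $\sigma_1^2$ is the variance of the data-generating distribution $P$ and $\sigma_2^2$ is the variance of the KL-closest model element $M^{*} \in \mathcal{M}$. When the model is well specified, $\sigma_1^2 = \sigma_2^2$ and $c = 1$, matching the two-part MDL, Shtarkov and Bayes codes; under misspecification the prequential constant can differ arbitrarily from 1.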
Maximum Entropy Model Learning of Subcategorization Preference
In Proceedings of the 5th Workshop on Very Large Corpora, 1997
Cited by 5 (0 self)
Abstract: This paper proposes a novel method for learning probabilistic models of the subcategorization preference of verbs. In particular, we propose to consider the issues of case dependencies and noun class generalization in a uniform way. We adopt the maximum entropy model learning method and apply it to the task of model learning of subcategorization preference. Case dependencies and noun class generalization are represented as features in the maximum entropy approach. The feature selection facility of maximum entropy model learning makes it possible to find optimal case dependencies and optimal noun class generalization levels. We describe the results of an experiment on learning probabilistic models of subcategorization preference from the EDR Japanese bracketed corpus. We also evaluate the performance of the selected features and their estimated parameters in the subcategorization preference task.
MDL Principle for Robust Vector Quantization
In Pattern Analysis and Applications, 1999
Cited by 5 (0 self)
Abstract: We address the problem of finding the optimal number of reference vectors for vector quantization from the point of view of the Minimum Description Length (MDL) principle. We formulate vector quantization in terms of the MDL principle, and then derive different instantiations of the algorithm, depending on the coding procedure. Moreover, we develop an efficient algorithm (similar to EM-type algorithms) for optimizing the MDL criterion. In addition, we use the MDL principle to increase the robustness of the training algorithm: the MDL principle provides a criterion to decide which data points are outliers. We illustrate our approach on 2D clustering problems (in order to visualize the behavior of the algorithm) and present applications to image coding. Finally, we outline various ways to extend the algorithm.
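The description-length view of outliers mentioned above has a simple intuition that can be sketched independently of the paper's actual coding schemes: a point is an outlier when encoding it verbatim is cheaper than encoding a codeword index plus the residual. Everything below (the fixed raw-cost budget, the Gaussian residual code, the hand-placed codebook) is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(2)
# Two tight clusters plus three gross outliers.
data = np.vstack([
    rng.normal(0.0, 0.3, size=(100, 2)),
    rng.normal(5.0, 0.3, size=(100, 2)),
    np.array([[20.0, -20.0], [-15.0, 25.0], [30.0, 30.0]]),
])

def outlier_mask(data, codebook, sigma, raw_cost_bits=64.0):
    """Flag points cheaper to encode verbatim than via a codeword.

    Quantized cost = index bits + Gaussian residual bits around the
    nearest reference vector; raw cost = a fixed per-point budget.
    """
    d2 = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(-1).min(1)
    index_bits = np.log2(len(codebook))
    residual_bits = d2 / (2 * sigma**2 * np.log(2))  # -log2 of Gaussian density, up to a constant
    return index_bits + residual_bits > raw_cost_bits

codebook = np.array([[0.0, 0.0], [5.0, 5.0]])
mask = outlier_mask(data, codebook, sigma=0.3)
print(mask.sum())
```

Because the flagged points are excluded from codebook updates, the reference vectors are not dragged toward the gross outliers, which is the robustness effect the abstract refers to.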
A Scaling Law for the Validation-Set Training-Set Size Ratio
AT&T Bell Laboratories, 1997
Cited by 4 (0 self)
Abstract: We address the problem of determining what fraction of the training set should be reserved as a development test set, or validation set. We determine that the ratio of the validation set size to the training set size scales like the square root of the ratio of two complexity parameters: the complexity of the second level of inference (minimizing the validation error) over the complexity of the first level of inference (minimizing the error rate on the training set).
Keywords: cross-validation, learning theory, statistics, machine learning, pattern recognition, training set, validation set, test set, experiment design
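The scaling law stated in the abstract is simple enough to compute directly. The complexity numbers below are hypothetical effective-parameter counts chosen only to make the arithmetic concrete.

```python
import math

def validation_ratio(second_level_complexity, first_level_complexity):
    """Validation/training size ratio per the square-root scaling law:
    r = sqrt(C2 / C1), where C2 is the complexity of the second level
    of inference (model selection on the validation set) and C1 that
    of the first level (fitting on the training set)."""
    return math.sqrt(second_level_complexity / first_level_complexity)

# E.g. tuning 4 hyperparameters on top of a 400-parameter model
# suggests a validation set about 10% the size of the training set.
r = validation_ratio(4, 400)
print(r)  # ≈ 0.1
```

The qualitative message is that the validation set can be much smaller than the training set whenever the hyperparameter search is much simpler than the underlying fit.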