On the Estimation of α-Divergences
Abstract

Cited by 8 (4 self)
We propose new nonparametric, consistent Rényi-α and Tsallis-α divergence estimators for continuous distributions. Given two independent and identically distributed samples, a “naïve” approach would be to simply estimate the underlying densities and plug the estimated densities into the corresponding formulas. Our proposed estimators, in contrast, avoid density estimation completely, estimating the divergences directly using only simple k-nearest-neighbor statistics. We are nonetheless able to prove that the estimators are consistent under certain conditions. We also describe how to apply these estimators to mutual information and demonstrate their efficiency via numerical experiments.
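For reference, the Rényi-α and Tsallis-α divergences of densities p and q (α ≠ 1) have the standard forms:

```latex
D^{\mathrm{R}}_{\alpha}(p\|q) \;=\; \frac{1}{\alpha-1}\,\log \int p^{\alpha}(x)\, q^{1-\alpha}(x)\, \mathrm{d}x ,
\qquad
D^{\mathrm{T}}_{\alpha}(p\|q) \;=\; \frac{1}{\alpha-1}\left( \int p^{\alpha}(x)\, q^{1-\alpha}(x)\, \mathrm{d}x \;-\; 1 \right).
```

Both reduce to the Kullback-Leibler divergence in the limit α → 1.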
LETTER Communicated by Simon Giszter: On Nonnegative Matrix Factorization Algorithms for Signal-Dependent Noise with Application to Electromyography Data
Abstract
Nonnegative matrix factorization (NMF) by the multiplicative updates algorithm is a powerful machine learning method for decomposing a high-dimensional nonnegative matrix V into two nonnegative matrices, W and H, where V ∼ WH. It has been successfully applied in the analysis and interpretation of large-scale data arising in neuroscience, computational biology, and natural language processing, among other areas. A distinctive feature of NMF is its nonnegativity constraints, which allow only additive linear combinations of the data, thus enabling it to learn parts that have distinct physical representations in reality. In this letter, we describe an information-theoretic approach to NMF for signal-dependent noise based on the generalized inverse Gaussian model. Specifically, we propose three novel algorithms in this setting, each based on multiplicative updates, and prove monotonicity of the updates using the EM algorithm. In addition, we develop algorithm-specific measures to evaluate their goodness of fit on data. Our methods are demonstrated using experimental data from electromyography studies, as well as simulated data in the extraction of muscle synergies, and compared with existing algorithms for signal-dependent noise.
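The multiplicative-update scheme this letter builds on can be illustrated with the classical Euclidean-cost rules of Lee and Seung; the sketch below shows that baseline, not the letter's generalized-inverse-Gaussian algorithms (the function name and iteration count are illustrative):

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Classical Lee-Seung multiplicative updates for V ~ W @ H under
    the Frobenius (Euclidean) cost; the letter's algorithms replace
    this cost with a generalized inverse Gaussian likelihood."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps   # positive random init
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # each update multiplies by a nonnegative ratio, preserving signs
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

The multiplicative form is what guarantees the factors stay nonnegative without any explicit projection step.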
Nonparametric Divergence Estimation and its Applications to Machine Learning
Abstract
Low-dimensional embedding, manifold learning, clustering, classification, and anomaly detection are among the most important problems in machine learning. Here we consider the setting where each instance of the inputs corresponds to a continuous probability distribution. These distributions are unknown to us, but we are given some i.i.d. samples from each of them. While most of the existing machine learning methods operate on points, i.e. finite-dimensional feature vectors, in our setting we study algorithms that operate on groups, i.e. sets of feature vectors. For this purpose, we propose new nonparametric, consistent estimators for a large family of divergences and describe how to apply them to machine learning problems. As important special cases, the estimators can be used to estimate the Rényi, Tsallis, Kullback-Leibler, Hellinger, Bhattacharyya, and L2 divergences, as well as mutual information. We present empirical results on synthetic data, real-world images, and astronomical data sets.
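As an illustration of such direct, density-free estimation, here is a sketch of a k-nearest-neighbor Kullback-Leibler estimator (one of the special cases listed). The function name and the brute-force distance computation are our own choices for clarity, not the paper's implementation:

```python
import numpy as np

def knn_kl_estimate(X, Y, k=1):
    """Direct k-NN estimate of KL(p||q) from samples X ~ p (n x d) and
    Y ~ q (m x d), avoiding explicit density estimation:
        D ~ (d/n) * sum_i log(nu_k(i) / rho_k(i)) + log(m / (n - 1)),
    where rho_k is the k-th NN distance within X and nu_k within Y."""
    X, Y = np.atleast_2d(X), np.atleast_2d(Y)
    n, d = X.shape
    m = Y.shape[0]
    dxx = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dxx, np.inf)          # exclude the self-distance
    rho = np.sort(dxx, axis=1)[:, k - 1]
    dxy = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    nu = np.sort(dxy, axis=1)[:, k - 1]
    return d * float(np.mean(np.log(nu / rho))) + np.log(m / (n - 1))
```

In practice a k-d tree replaces the O(n²) distance matrices, but the estimator itself needs nothing beyond these nearest-neighbor radii.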
Adaptive multiplicative updates for quadratic nonnegative matrix factorization
Abstract
In Nonnegative Matrix Factorization (NMF), a nonnegative matrix is approximated by a product of lower-rank factorizing matrices. Quadratic Nonnegative Matrix Factorization (QNMF) is a new class of NMF methods where some factorizing matrices occur twice in the approximation. QNMF finds applications in graph partitioning, biclustering, graph matching, etc. However, the original QNMF algorithms employ constant multiplicative update rules and thus have mediocre convergence speed. Here we propose an adaptive multiplicative algorithm for QNMF which is not only theoretically convergent but also significantly faster than the original implementation. An adaptive exponent scheme is adopted in place of the old constant one, enabling larger learning steps for improved efficiency. The proposed method is general and can thus be applied to QNMF with a variety of factorization forms and with the most commonly used approximation error measures. We have performed extensive experiments, whose results demonstrate that the new method is effective in various QNMF applications on both synthetic and real-world datasets.
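The adaptive-exponent idea can be illustrated on plain Euclidean NMF rather than the quadratic factorizations the paper targets: raise the multiplicative ratio to a power η > 1 while the cost keeps dropping, and fall back to the provably safe η = 1 when a step fails. A minimal sketch under those simplifying assumptions (schedule constants are illustrative):

```python
import numpy as np

def adaptive_mu_nmf(V, rank, n_iter=100, eps=1e-9, seed=0):
    """Adaptive-exponent multiplicative updates, shown on ordinary
    Euclidean NMF: accepted steps grow the exponent eta for larger
    learning steps; a rejected step resets eta to 1."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    cost = np.linalg.norm(V - W @ H)
    eta = 1.0
    for _ in range(n_iter):
        Hn = H * ((W.T @ V) / (W.T @ W @ H + eps)) ** eta
        Wn = W * ((V @ Hn.T) / (W @ Hn @ Hn.T + eps)) ** eta
        new_cost = np.linalg.norm(V - Wn @ Hn)
        if new_cost <= cost:              # accept and grow the step
            W, H, cost = Wn, Hn, new_cost
            eta = min(eta * 1.1, 2.0)
        else:                             # reject, fall back to eta = 1
            eta = 1.0
    return W, H
```

At η = 1 this is exactly the constant rule, whose monotonicity is known, so the fallback keeps the scheme convergent.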
LEARNING α-INTEGRATION WITH PARTIALLY-LABELED DATA
Abstract
Sensory data integration is an important task in the human brain for multimodal processing, as well as in machine learning for multi-sensor processing. α-integration was proposed by Amari as a principled way of blending multiple positive measures (e.g., stochastic models in the form of probability distributions), providing an optimal integration in the sense of minimizing the α-divergence. It also encompasses existing integration methods as special cases, e.g., the weighted average and the exponential mixture. In α-integration, the value of α determines the characteristics of the integration, and the weight vector w assigns the degree of importance to each measure. In most of the existing work, however, α and w are given in advance rather than learned. In this paper we present two algorithms for learning α and w from data when only a few integrated target values are available. Numerical experiments on synthetic as well as real-world data confirm the proposed method’s effectiveness. Index Terms — α-integration, parameter estimation
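Concretely, the α-integration of positive values x with weights w is the weighted α-mean, computed through the representation function f_α(z) = z^((1−α)/2) (with f_1 = log). A minimal sketch, assuming scalar positive inputs:

```python
import numpy as np

def alpha_integration(x, w, alpha):
    """Amari's alpha-integration: the weighted alpha-mean of positive
    measures x under weights w (w >= 0, sum w = 1), which minimizes the
    weighted alpha-divergence to the inputs."""
    x, w = np.asarray(x, float), np.asarray(w, float)
    if np.isclose(alpha, 1.0):    # alpha = 1: exponential (geometric) mixture
        return float(np.exp(np.sum(w * np.log(x))))
    p = (1.0 - alpha) / 2.0
    return float(np.sum(w * x ** p) ** (1.0 / p))
```

Special cases: α = −1 gives the weighted arithmetic mean, α = 1 the exponential mixture, and α = 3 the harmonic mean, which is how α-integration subsumes the familiar integration rules.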
Submitted: 10.11.2009 Published: 11.11.2009
Abstract
Supervised and unsupervised vector quantization methods for classification and clustering traditionally use dissimilarities, frequently taken as Euclidean distances. In this article we investigate the applicability of divergences instead. We deduce the mathematical fundamentals for their utilization in derivative-based vector quantization algorithms, which rests on the generalized derivatives known as Fréchet derivatives. We show, by way of example, the application of this methodology to widely used supervised and unsupervised vector quantization schemes, including self-organizing maps, neural gas, and learning vector quantization. Further, we show principles of hyperparameter optimization for parametrized divergences in the case of supervised vector quantization, to achieve improved classification accuracy. Machine Learning Reports, Research Group on Computational Intelligence
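To make the idea concrete, here is a small winner-take-all vector quantization loop that replaces the Euclidean distance with the generalized Kullback-Leibler divergence and moves the winning prototype along the derivative ∂D(v‖w)/∂w = 1 − v/w. This is our own minimal sketch of a derivative-based scheme, not the article's SOM, neural gas, or LVQ algorithms:

```python
import numpy as np

def gkl(v, w):
    """Generalized Kullback-Leibler divergence between positive vectors."""
    return float(np.sum(v * np.log(v / w) - v + w))

def vq_train(data, n_proto=2, lr=0.05, n_epochs=50):
    """Winner-take-all vector quantization with the generalized KL
    divergence as dissimilarity; the winner takes a gradient step
    along dD(v||w)/dw = 1 - v/w."""
    protos = data[:n_proto].copy()        # deterministic init: first points
    for _ in range(n_epochs):
        for v in data:
            j = int(np.argmin([gkl(v, w) for w in protos]))
            protos[j] -= lr * (1.0 - v / protos[j])
            protos[j] = np.maximum(protos[j], 1e-9)  # keep positivity
    return protos
```

Swapping `gkl` for another divergence only changes the derivative in the update line, which is exactly the modularity the Fréchet-derivative treatment provides.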
LETTER Communicated by Shun-ichi Amari: Parameter Learning for Alpha Integration
Abstract
In pattern recognition, data integration is an important issue, and when properly done, it can lead to improved performance. Data integration can also be used to help model and understand multimodal processing in the brain. Amari proposed α-integration as a principled way of blending multiple positive measures (e.g., stochastic models in the form of probability distributions), enabling an optimal integration in the sense of minimizing the α-divergence. It also encompasses existing integration methods as special cases, for example, the weighted average and the exponential mixture. The parameter α determines the integration characteristics, and the weight vector w assigns the degree of importance to each measure. In most work, however, α and w are given in advance rather than learned. In this letter, we present a parameter learning algorithm for learning α and w from data when multiple integrated target values are available. Numerical experiments on synthetic as well as real-world data demonstrate the effectiveness of the proposed method.
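The supervised setting can be sketched as follows: given positive inputs X, a fixed weight vector w, and integrated target values t, choose the α whose α-mean best reproduces t. We use a brute-force one-dimensional search where the letter derives a gradient-based learning rule (and also learns w, fixed here); all names are illustrative:

```python
import numpy as np

def alpha_mean(X, w, alpha):
    """Weighted alpha-mean of positive measures, rowwise over X."""
    p = (1.0 - alpha) / 2.0
    if abs(p) < 1e-8:                 # alpha = 1: geometric mixture
        return np.exp(np.sum(w * np.log(X), axis=-1))
    return np.sum(w * X ** p, axis=-1) ** (1.0 / p)

def learn_alpha(X, t, w, grid=np.linspace(-5.0, 5.0, 201)):
    """Pick the alpha whose alpha-integrated values best match the
    targets t in squared error; a brute-force stand-in for the
    letter's gradient-based learning of alpha."""
    losses = [float(np.mean((alpha_mean(X, w, a) - t) ** 2)) for a in grid]
    return float(grid[int(np.argmin(losses))])
```

Because the fit is one-dimensional in α once w is fixed, even this crude search recovers the generating α on clean targets.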
Incremental Multi-Source Recognition with Non-Negative Matrix Factorization
, 2009
Abstract
This master’s thesis is dedicated to incremental multi-source recognition using nonnegative matrix factorization. Particular attention is paid to providing a mathematical framework for sparse coding schemes in this context. Applications of nonnegative matrix factorization to sound recognition are discussed to situate the outline, position, and contributions of the present work with respect to the literature. The problem of incremental recognition is addressed within the framework of nonnegative decomposition, a modified nonnegative matrix factorization scheme where the incoming signal is projected onto a basis of templates learned offline prior to the decomposition. As sparsity appears to be one of the main issues in this context, a theoretical approach is followed to overcome the problem. The main contribution of the present work is the formulation of a sparse nonnegative matrix factorization framework. This formulation is motivated and illustrated with a synthetic experiment, and then addressed with convex optimization techniques such as gradient optimization, convex quadratic programming, and second-order cone programming. Several algorithms are proposed to address the question of sparsity. To provide results and validation, some of these algorithms are applied to preliminary evaluations, notably incremental multiple-pitch and multiple-instrument recognition, and incremental analysis of complex auditory scenes.
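The nonnegative-decomposition step (templates W learned offline, sparse activations h inferred online for the incoming signal v) can be sketched with a Hoyer-style L1-penalized multiplicative update; this stands in for, and is much simpler than, the thesis's convex quadratic and second-order cone programs:

```python
import numpy as np

def sparse_nn_decomposition(v, W, lam=0.1, n_iter=300, eps=1e-9):
    """Nonnegative decomposition of signal v onto fixed templates W,
    with an L1 penalty lam encouraging sparse activations h, via the
    multiplicative update  h <- h * (W^T v) / (W^T W h + lam)."""
    h = np.full(W.shape[1], 1.0)      # positive init over all templates
    for _ in range(n_iter):
        h = h * (W.T @ v) / (W.T @ W @ h + lam + eps)
    return h
```

The constant lam in the denominator shrinks small activations toward zero, which is what makes the decomposition sparse rather than spreading energy over all templates.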