Batch and on-line parameter estimation of Gaussian mixtures based on the joint entropy (1998)

by Y Singer, M K Warmuth

Results 1 - 10 of 10

Matrix exponentiated gradient updates for on-line learning and Bregman projections

by Koji Tsuda, Gunnar Rätsch, Manfred K. Warmuth - Journal of Machine Learning Research, 2005
Cited by 32 (8 self)
We address the problem of learning a symmetric positive definite matrix. The central issue is to design parameter updates that preserve positive definiteness. Our updates are motivated with the von Neumann divergence. Rather than treating the most general case, we focus on two key applications that exemplify our methods: On-line learning with a simple square loss and finding a symmetric positive definite matrix subject to symmetric linear constraints. The updates generalize the Exponentiated Gradient (EG) update and AdaBoost, respectively: the parameter is now a symmetric positive definite matrix of trace one instead of a probability vector (which in this context is a diagonal positive definite matrix with trace one). The generalized updates use matrix logarithms and exponentials to preserve positive definiteness. Most importantly, we show how the analysis of each algorithm generalizes to the non-diagonal case. We apply both new algorithms, called the Matrix Exponentiated Gradient (MEG) update and DefiniteBoost, to learn a kernel matrix from distance measurements.
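
The abstract above describes an update that uses matrix logarithms and exponentials to keep the parameter symmetric positive definite with trace one. A minimal Python sketch of one such step for the square loss follows. It assumes the prediction is tr(W X) and the update is W <- exp(log W - eta * sym(grad)) rescaled to trace one; the function names, the learning rate `eta`, and the eigendecomposition-based helpers are illustrative assumptions, not code from the paper.

```python
import numpy as np

def sym_logm(W):
    """Matrix logarithm of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(W)
    return (vecs * np.log(vals)) @ vecs.T

def sym_expm(A):
    """Matrix exponential of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(A)
    return (vecs * np.exp(vals)) @ vecs.T

def meg_step(W, X, y, eta=0.1):
    """One exponentiated-gradient-style step on a trace-one SPD matrix W.

    Assumed setup (hypothetical): prediction tr(W X), loss (tr(W X) - y)^2.
    """
    y_hat = np.trace(W @ X)
    grad = 2.0 * (y_hat - y) * X          # gradient of the square loss w.r.t. W
    sym_grad = 0.5 * (grad + grad.T)      # symmetrize so the exponent stays symmetric
    M = sym_expm(sym_logm(W) - eta * sym_grad)
    return M / np.trace(M)                # renormalize to trace one
```

Because the new matrix is the exponential of a symmetric matrix, all of its eigenvalues are positive by construction, which is the property the abstract emphasizes; a diagonal W reduces this to the ordinary EG update on a probability vector.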

Rotation Invariant Texture Characterization and Retrieval using Steerable Wavelet-domain Hidden Markov Models

by Minh N. Do, Martin Vetterli
Cited by 28 (4 self)
A new statistical model for characterizing texture images based on wavelet-domain hidden Markov models and steerable pyramids is presented. The new model is shown to capture well both the subband marginal distributions and the dependencies across scales and orientations of the wavelet descriptors. Once it is trained for an input texture image, the model can be easily steered to characterize that texture at any other orientation. After a diagonalization operation, one obtains a rotation-invariant model of the texture image. The effectiveness of the new texture models is demonstrated in retrieval experiments with large image databases, where significant performance gains are shown. Keywords: texture characterization, image retrieval, rotation invariance, wavelets, hidden Markov models, steerable pyramids. Draft of April 23, 2001.

Differential Entropic Clustering of Multivariate Gaussians

by Jason V. Davis, Inderjit Dhillon - Adv. in Neural Inf. Proc. Sys. (NIPS), 2006
Cited by 16 (1 self)
Gaussian data is pervasive and many learning algorithms (e.g., k-means) model their inputs as a single sample drawn from a multivariate Gaussian. However, in many real-life settings, each input object is best described by multiple samples drawn from a multivariate Gaussian. Such data can arise, for example, in a movie review database where each movie is rated by several users, or in time-series domains such as sensor networks. Here, each input can be naturally described by both a mean vector and covariance matrix which parameterize the Gaussian distribution. In this paper, we consider the problem of clustering such input objects, each represented as a multivariate Gaussian. We formulate the problem using an information theoretic approach and draw several interesting theoretical connections to Bregman divergences and also Bregman matrix divergences. We evaluate our method across several domains, including synthetic data, sensor network data, and a statistical debugging application.

A Distance Measure Between GMMs Based on the Unscented Transform and its Application to Speaker Recognition

by Jacob Goldberger - in Proc. of Interspeech, 2005
Cited by 4 (0 self)
This paper proposes a dissimilarity measure between two Gaussian mixture models (GMM). Computing a distance measure between two GMMs that were learned from speech segments is a key element in speaker verification, speaker segmentation and many other related applications. A natural measure between two distributions is the Kullback-Leibler divergence. However, it cannot be analytically computed in the case of GMMs. We propose an accurate and efficiently computed approximation of the KL-divergence. The method is based on the unscented transform, which is usually used to obtain a better alternative to the extended Kalman filter. The suggested distance is evaluated in an experimental setup on a speaker data set. The experimental results indicate that our proposed approximations outperform previously suggested methods.
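
To make the idea in the abstract above concrete, here is a hedged Python sketch of an unscented-transform approximation to KL(f || g) between two GMMs. It assumes the common choice of 2d sigma points per component, placed at the component mean plus or minus sqrt(d * lambda_k) times the k-th eigenvector of the component covariance, and averages log f - log g over those points with the mixture weights. The function names and the (weights, means, covs) tuple layout are illustrative assumptions, not the paper's code.

```python
import numpy as np

def gmm_logpdf(x, weights, means, covs):
    """Log density of a Gaussian mixture at a single point x."""
    d = x.shape[0]
    log_terms = []
    for w, mu, cov in zip(weights, means, covs):
        diff = x - mu
        _, logdet = np.linalg.slogdet(cov)
        maha = diff @ np.linalg.solve(cov, diff)
        log_terms.append(np.log(w) - 0.5 * (d * np.log(2 * np.pi) + logdet + maha))
    return np.logaddexp.reduce(log_terms)   # stable log-sum-exp over components

def sigma_points(mu, cov):
    """2d sigma points: mu +/- sqrt(d * lambda_k) * u_k for each eigenpair."""
    d = mu.shape[0]
    vals, vecs = np.linalg.eigh(cov)
    offsets = (vecs * np.sqrt(d * vals)).T  # row k is the k-th scaled eigenvector
    return np.concatenate([mu + offsets, mu - offsets])

def kl_unscented(f, g):
    """Approximate KL(f || g); f and g are (weights, means, covs) tuples."""
    weights, means, covs = f
    total = 0.0
    for w, mu, cov in zip(weights, means, covs):
        pts = sigma_points(np.asarray(mu, float), np.asarray(cov, float))
        vals = [gmm_logpdf(x, *f) - gmm_logpdf(x, *g) for x in pts]
        total += w * np.mean(vals)
    return total
```

For two single Gaussians the log-density ratio is quadratic, so this sigma-point average matches the closed-form KL divergence; for genuine mixtures it is only an approximation and is not guaranteed to be non-negative.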

Optimal Power Allocation for Distributed Detection in Wireless Sensor Networks

by Xin Zhang, H. Vincent Poor, Mung Chiang
In distributed detection systems with wireless sensor networks, communication between sensors and a fusion center is not perfect due to interference and limited communication power of the sensors to combat noise. The problem of optimizing detection performance with imperfect communication between the sensors and the fusion center over wireless channels brings a new challenge to distributed detection. In this paper, a distributed detection system infrastructure is provided, and a multiaccess channel model is included to account for imperfect communication between the sensors and the fusion center. The J-divergence between the distributions of the detection statistic under different hypotheses is used as a performance criterion in order to provide a tractable analysis. Optimizing the performance (in terms of the J-divergence) under a total communication power constraint on the sensors is studied, and the corresponding optimal power allocation scheme is provided. It is interesting to see that, for the case with orthogonal channels, the power allocation can be solved by a weighted water-filling algorithm. Numerical results are used to illustrate the solution. Index Terms: Distributed detection, wireless sensor networks, multiaccess channel, power allocation.

Information Theoretic Novelty Detection

by Maurizio Filippone, Guido Sanguinetti, 2009
We present a novel approach to online change detection problems when the training sample size is small. The proposed approach is based on estimating the expected information content of a new data point and allows an accurate control of the false positive rate even for small data sets. In the case of the Gaussian distribution, our approach is analytically tractable and closely related to classical statistical tests. We then propose an approximation scheme to extend our approach to the case of the mixture of Gaussians. We extensively evaluate our approach on synthetic data and on three real benchmark data sets. The experimental validation shows that our method maintains a good overall accuracy, but significantly improves the control over the false positive rate.

Project Summary

by unknown authors
Complete and accurate collection of clinical data in the course of health care is a long-standing goal that has not been achieved either by manual record-keeping or through electronic record systems. This proposed project addresses the problem from the beginning of the clinical process, by aiming to improve the capture of relevant medical facts during the face-to-face interaction between a patient and provider. Instead of relying on the provider’s fallible memory to record facts after the visit, the proposed system will “listen” to the conversation, use automatic speech recognition to produce an (imperfect) record of what was said, and apply a variety of text analysis and extraction methods to create a draft record of the encounter. Further, it will provide an interface that should permit patients and providers to examine the facts that were recorded and to correct and complete them, also using speech as the primary interface. The project’s aims are to develop and integrate the components needed to accomplish this goal, to create a testbed in collaboration with researchers at the environmental health clinic of a children’s hospital in which experiments can guide system development and assess progress, and to conduct a series of evaluations that assess a series of objectives. First, the research will characterize the ability of the speech recognition, information extraction and information organization components to process the target conversations. Second, it will evaluate the hypothesis that this system can collect a more complete and

Fast Approximation of Kullback-Leibler Distance for Dependence Trees and Hidden Markov Models

by Minh N. Do - IEEE Signal Processing Letters, submitted
We present a fast algorithm to approximate the Kullback-Leibler distance (KLD) between two dependence tree models. The algorithm uses the “upward” (or “forward”) procedure to compute an upper bound for the KLD. For hidden Markov models, this algorithm is reduced to a simple expression. Numerical experiments show that for a similar accuracy, the proposed algorithm offers a saving of hundreds of times in computational complexity compared to the commonly used Monte-Carlo method. This makes the proposed algorithm important for real-time applications, like image retrieval. Keywords: Kullback-Leibler distance, models, hidden Markov models

Image Classification With Kernelized Spatial-Context

by Guo-jun Qi, Xian-sheng Hua, Yong Rui, Jinhui Tang, Hong-jiang Zhang
The goal of image classification is to classify a collection of unlabeled images into a set of semantic classes. Many methods have been proposed to approach this goal by leveraging visual appearances of local patches in images. However, the spatial context between these local patches also provides significant information to improve the classification accuracy. Traditional spatial contextual models, such as the two-dimensional hidden Markov model, attempt to construct one common model for each image category to depict the spatial structures of the images in this class. However, due to large intra-class variances in an image category, one single model has difficulties in representing various spatial contexts in different images. In contrast, we propose to construct a prototype set of spatial contextual models by leveraging the kernel methods rather than only one model. Such an algorithm combines the advantages of rich representation ability of spatial contextual models as well as the powerful classification ability of kernel methods. In particular, we propose a new distance measure between different spatial contextual models by integrating joint appearance-spatial image features. Such a distance measure can be efficiently computed in a recursive formulation that scales well to image size. Extensive experiments demonstrate that the proposed approach significantly outperforms the state-of-the-art approaches. Index Terms: 2-D hidden Markov model, image classification, kernel method, spatial context.

Auto-Regressive HMM Inference with Incomplete Data for Short-Horizon Wind Forecasting

by Chris Barber, Joseph Bockhorst, Paul Roebber
Accurate short-term wind forecasts (STWFs), with time horizons from 0.5 to 6 hours, are essential for efficient integration of wind power to the electrical power grid. Physical models based on numerical weather predictions are currently not competitive, and research on machine learning approaches is ongoing. Two major challenges confronting these efforts are missing observations and weather-regime induced dependency shifts among wind variables. In this paper we introduce approaches that address both of these challenges. We describe a new regime-aware approach to STWF that uses auto-regressive hidden Markov models (AR-HMM), a subclass of conditional linear Gaussian (CLG) models. Although AR-HMMs are a natural representation for weather regimes, as with CLG models in general, exact inference is NP-hard when observations are missing (Lerner and Parr, 2001). We introduce a simple approximate inference method for AR-HMMs, which we believe has applications in other problem domains. In an empirical evaluation on publicly available wind data from two geographically distinct regions, our approach makes significantly more accurate predictions than baseline models, and uncovers meteorologically relevant regimes.