Results 1–10 of 45
Clustering Based on Conditional Distributions in an Auxiliary Space
 Neural Computation
, 2001
"... We study the problem of learning groups or categories that are local ..."
Abstract

Cited by 85 (23 self)
 Add to MetaCart
We study the problem of learning groups or categories that are local
Learning Joint Statistical Models for Audio-Visual Fusion and Segregation
, 2001
"... People can understand complex auditory and visual information, often using one to disambiguate the other. Automated analysis, even at a lowlevel, faces severe challenges, including the lack of accurate statistical models for the signals, and their highdimensionality and varied sampling rates. Previ ..."
Abstract

Cited by 70 (1 self)
 Add to MetaCart
(Show Context)
People can understand complex auditory and visual information, often using one to disambiguate the other. Automated analysis, even at a low level, faces severe challenges, including the lack of accurate statistical models for the signals, and their high dimensionality and varied sampling rates. Previous approaches [6] assumed simple parametric models for the joint distribution which, while tractable, cannot capture the complex signal relationships. We learn the joint distribution of the visual and auditory signals using a nonparametric approach. First, we project the data into a maximally informative, low-dimensional subspace, suitable for density estimation. We then model the complicated stochastic relationships between the signals using a nonparametric density estimator. These learned densities allow processing across signal modalities. We demonstrate, on synthetic and real signals, localization in video of the face that is speaking in audio, and, conversely, audio enhan...
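The second step the abstract describes, fitting a nonparametric joint density to paired low-dimensional audio and video features, can be sketched with SciPy's kernel density estimator. This is an illustrative toy, not the paper's method; the synthetic "audio" and "video" features and their nonlinear coupling are assumptions chosen so that a simple parametric (jointly Gaussian) model would miss the relationship.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Synthetic stand-ins for projected features: the "video" feature is a
# noisy nonlinear function of the "audio" feature, so the joint density
# is not well captured by a simple parametric (jointly Gaussian) model.
audio = rng.normal(size=2000)
video = np.tanh(2.0 * audio) + 0.1 * rng.normal(size=2000)

# Nonparametric (Parzen/KDE) estimate of the joint density p(audio, video).
joint = gaussian_kde(np.vstack([audio, video]))

# Evaluate the learned density: a consistent audio/video pair scores
# much higher than an inconsistent one.
consistent = joint([[0.5], [np.tanh(1.0)]])[0]
inconsistent = joint([[0.5], [-np.tanh(1.0)]])[0]
print(consistent > inconsistent)  # True
```

Once such a density is learned, cross-modal queries (e.g. which image region is most probable given the current audio) reduce to density evaluations.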
Speaker association with signal-level audio-visual fusion
 IEEE Transactions on Multimedia
, 2004
"... Abstract—Audio and visual signals arriving from a common source are detected using a signallevel fusion technique. A probabilistic multimodal generation model is introduced and used to derive an information theoretic measure of crossmodal correspondence. Nonparametric statistical density modeling ..."
Abstract

Cited by 66 (0 self)
 Add to MetaCart
(Show Context)
Audio and visual signals arriving from a common source are detected using a signal-level fusion technique. A probabilistic multimodal generation model is introduced and used to derive an information theoretic measure of cross-modal correspondence. Nonparametric statistical density modeling techniques can characterize the mutual information between signals from different domains. By comparing the mutual information between different pairs of signals, it is possible to identify which person is speaking a given utterance and discount errant motion or audio from other utterances or non-speech events. Index Terms—Audio-visual correspondence, multimodal data association, mutual information.
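The core selection rule the abstract describes, comparing mutual information between candidate signal pairs and keeping the highest-scoring one, can be sketched with a plug-in histogram MI estimate. This is a minimal illustration, not the paper's density model; the synthetic audio track and the two candidate motion signals are assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Plug-in MI estimate (in nats) from a 2-D histogram of two signals."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(1)
audio = rng.normal(size=5000)
speaker_motion = audio + 0.5 * rng.normal(size=5000)  # driven by the audio
bystander_motion = rng.normal(size=5000)              # unrelated motion

scores = {"speaker": mutual_information(audio, speaker_motion),
          "bystander": mutual_information(audio, bystander_motion)}
print(max(scores, key=scores.get))  # speaker
```

The motion signal that shares information with the audio wins the comparison; unrelated motion scores only the small positive bias of the plug-in estimator.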
Bankruptcy Analysis with Self-Organizing Maps in Learning Metrics
 IEEE Transactions on Neural Networks
, 2001
"... We introduce a method for deriving a metric, locally based on the Fisher information matrix, into the data space. A SelfOrganizing Map is computed in the new metric to explore financial statements of enterprises. The metric measures local distances in terms of changes in the distribution of an auxi ..."
Abstract

Cited by 52 (19 self)
 Add to MetaCart
(Show Context)
We introduce a method for deriving a metric, locally based on the Fisher information matrix, into the data space. A Self-Organizing Map is computed in the new metric to explore financial statements of enterprises. The metric measures local distances in terms of changes in the distribution of an auxiliary random variable that reflects what is important in the data. In this paper the variable indicates bankruptcy within the next few years. The conditional density of the auxiliary variable is first estimated, and the change in the estimate resulting from local displacements in the primary data space is measured using the Fisher information matrix. When a Self-Organizing Map is computed in the new metric it still visualizes the data space in a topology-preserving fashion, but represents the (local) directions in which the probability of bankruptcy changes the most.
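The Fisher-metric idea can be made concrete with a toy conditional model. The sketch below is illustrative, not the paper's estimator: it assumes a logistic model p(bankrupt | x) with a hypothetical weight vector, for which the Fisher information matrix of the Bernoulli auxiliary variable has the closed form J(x) = p(1-p) w wᵀ, so local squared distances dxᵀ J(x) dx count only displacements that change the predicted bankruptcy probability.

```python
import numpy as np

# Illustrative sketch of a Fisher metric (not the paper's estimator):
# assume a logistic auxiliary model p(bankrupt | x) = sigmoid(w . x).
# For a Bernoulli auxiliary variable the Fisher information matrix in
# x-space is J(x) = p(1-p) w w^T, so the local squared distance
# dx^T J(x) dx only counts displacements that change p(bankrupt | x).
w = np.array([2.0, 0.0])  # hypothetical weights: only the first
                          # financial indicator matters

def fisher_info(x):
    p = 1.0 / (1.0 + np.exp(-w @ x))
    return p * (1.0 - p) * np.outer(w, w)

x = np.array([0.1, 0.3])
J = fisher_info(x)
relevant = np.array([0.1, 0.0])    # along w: changes p(bankrupt | x)
irrelevant = np.array([0.0, 0.1])  # orthogonal to w: leaves p unchanged
d_rel = relevant @ J @ relevant
d_irr = irrelevant @ J @ irrelevant
print(d_rel > d_irr)  # True: the metric stretches informative directions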
On Entropy Approximation for Gaussian Mixture Random Vectors
"... Abstract — For many practical probability density representations such as for the widely used Gaussian mixture densities, an analytic evaluation of the differential entropy is not possible and thus, approximate calculations are inevitable. For this purpose, the first contribution of this paper deals ..."
Abstract

Cited by 33 (2 self)
 Add to MetaCart
(Show Context)
For many practical probability density representations, such as the widely used Gaussian mixture densities, an analytic evaluation of the differential entropy is not possible, and thus approximate calculations are inevitable. The first contribution of this paper is a novel entropy approximation method for Gaussian mixture random vectors, based on a component-wise Taylor-series expansion of the logarithm of a Gaussian mixture and on a splitting method for Gaussian mixture components. The order of the Taylor-series expansion and the number of components used for splitting allow balancing between accuracy and computational demand. The second contribution is the determination of meaningful and efficiently computable lower and upper bounds on the entropy, which can also be used for approximation purposes. In addition, a refinement method for the more important upper bound is proposed in order to approach the true entropy value.
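The kind of bounds the abstract refers to can be illustrated with the standard Jensen-type bounds for a 1-D Gaussian mixture; the exact forms below are the commonly used ones, not necessarily the refined bounds of the paper, and the two-component mixture is a made-up example.

```python
import numpy as np
from scipy.stats import norm

# Jensen-type bounds on the differential entropy of a 1-D Gaussian mixture:
#   upper: H <= H(weights) + sum_i w_i H(N_i)   (conditioning on the component)
#   lower: H >= -sum_i w_i log sum_j w_j N(mu_i; mu_j, var_i + var_j)
w = np.array([0.5, 0.5])
mu = np.array([0.0, 4.0])
var = np.array([1.0, 1.0])

upper = -(w * np.log(w)).sum() + (w * 0.5 * np.log(2 * np.pi * np.e * var)).sum()
z = norm.pdf(mu[:, None], loc=mu[None, :],
             scale=np.sqrt(var[:, None] + var[None, :]))
lower = -(w * np.log(z @ w)).sum()

# Monte Carlo reference: H = -E[log p(x)] under the mixture.
rng = np.random.default_rng(2)
comp = rng.choice(2, size=200_000, p=w)
x = rng.normal(mu[comp], np.sqrt(var[comp]))
logp = np.log(norm.pdf(x[:, None], loc=mu, scale=np.sqrt(var)) @ w)
h_mc = -logp.mean()
print(lower <= h_mc <= upper)  # True
```

Since no closed form for the true entropy exists, sandwiching a cheap approximation between such bounds is the practical way to control its error.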
Discriminative components of data
 IEEE Transactions on Neural Networks
, 2005
"... for publication. This material is posted here with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of Helsinki University’s products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish thi ..."
Abstract

Cited by 18 (5 self)
 Add to MetaCart
(Show Context)
for publication. This material is posted here with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of Helsinki University’s products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubspermissions@ieee.org. By choosing to view this document, you agree to all provisions of the copyright laws protecting it. Thank you.
Informative Discriminant Analysis
 In: Proceedings of the Twentieth International Conference on Machine Learning (ICML 2003). AAAI Press, Menlo Park, CA
, 2003
"... We introduce a probabilistic model that generalizes classical linear discriminant analysis and gives an interpretation for the components as informative or relevant components of data. The components maximize the predictability of class distribution which is asymptotically equivalent to (i) ma ..."
Abstract

Cited by 15 (7 self)
 Add to MetaCart
We introduce a probabilistic model that generalizes classical linear discriminant analysis and gives an interpretation for the components as informative or relevant components of data. The components maximize the predictability of the class distribution, which is asymptotically equivalent to (i) maximizing mutual information with the classes, and (ii) finding principal components in the so-called learning or Fisher metrics. The Fisher metric measures only distances that are relevant to the classes, that is, distances that cause changes in the class distribution. The components have applications in data exploration, visualization, and dimensionality reduction.
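The classical baseline this model generalizes can be sketched directly: Fisher's linear discriminant maximizes wᵀS_b w / wᵀS_w w and is solved by the top eigenvector of S_w⁻¹S_b. The two-class synthetic data below is an assumption for illustration, built so that the informative direction is not the high-variance one.

```python
import numpy as np

# Classical LDA: between-class scatter S_b over within-class scatter S_w,
# solved by the leading eigenvector of inv(S_w) S_b.
rng = np.random.default_rng(3)
X0 = rng.normal([0, 0], [1.0, 3.0], size=(300, 2))  # class 0
X1 = rng.normal([3, 0], [1.0, 3.0], size=(300, 2))  # class 1
X, y = np.vstack([X0, X1]), np.array([0] * 300 + [1] * 300)

mean = X.mean(axis=0)
Sw = np.zeros((2, 2))
Sb = np.zeros((2, 2))
for c in (0, 1):
    Xc = X[y == c]
    mc = Xc.mean(axis=0)
    Sw += (Xc - mc).T @ (Xc - mc)
    Sb += len(Xc) * np.outer(mc - mean, mc - mean)

evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
w = np.real(evecs[:, np.argmax(np.real(evals))])
# The discriminative component points along the class-mean difference
# (axis 0), not the high-variance but uninformative axis 1.
print(abs(w[0]) > abs(w[1]))  # True
```

The paper's contribution is to recover such components from a generative model, which also yields the mutual-information and Fisher-metric interpretations.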
Nonlinear Feature Transforms Using Maximum Mutual Information
 In Proc. IJCNN
, 2001
"... Finding the right features is an essential part of a pattern recognition system. This can be accomplished either by selection or by a transform from a larger number of "raw" features. In this work we learn nonlinear dimension reducing discriminative transforms that are implemented as neur ..."
Abstract

Cited by 14 (4 self)
 Add to MetaCart
(Show Context)
Finding the right features is an essential part of a pattern recognition system. This can be accomplished either by selection or by a transform from a larger number of "raw" features. In this work we learn nonlinear dimension-reducing discriminative transforms that are implemented as neural networks, either as radial basis function networks or as multilayer perceptrons. As the criterion, we use the joint mutual information (MI) between the class labels of training data and transformed features. Our measure of MI makes use of Rényi entropy as formulated by Principe et al. The resulting low-dimensional features enable a classifier to operate with less computational resources and memory without compromising accuracy.
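The Rényi-entropy formulation of Principe et al. that the abstract mentions has a convenient sample-based form: with a Gaussian Parzen window, the integral of the squared density estimate reduces to a sum of pairwise Gaussians (the "information potential"). A minimal 1-D sketch, with an assumed kernel width:

```python
import numpy as np

def renyi_quadratic_entropy(x, sigma=0.5):
    """Renyi's quadratic entropy of 1-D samples via a Parzen window:
    integral of the squared Parzen estimate reduces to the mean of
    pairwise Gaussians with variance 2*sigma^2 (information potential)."""
    diff = x[:, None] - x[None, :]
    var = 2.0 * sigma ** 2
    pairwise = np.exp(-diff ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    return -np.log(pairwise.mean())

rng = np.random.default_rng(4)
narrow = rng.normal(scale=1.0, size=1000)
wide = rng.normal(scale=3.0, size=1000)
print(renyi_quadratic_entropy(wide) > renyi_quadratic_entropy(narrow))  # True
```

Because this estimator is a smooth function of the samples, it can be differentiated with respect to network outputs, which is what makes it usable as a training criterion for the transforms.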
Extraction of audio features specific to speech production for multimodal speaker detection
 IEEE Trans. Multimedia
, 2008
"... Abstract—A method that exploits an information theoretic framework to extract optimized audio features using video information is presented. A simple measure of mutual information (MI) between the resulting audio and video features allows the detection of the active speaker among different candidate ..."
Abstract

Cited by 14 (1 self)
 Add to MetaCart
(Show Context)
A method that exploits an information theoretic framework to extract optimized audio features using video information is presented. A simple measure of mutual information (MI) between the resulting audio and video features allows the detection of the active speaker among different candidates. The method involves the optimization of an MI-based objective function. No approximation is needed to solve this optimization problem, neither for the estimation of the probability density functions (pdfs) of the features, nor for the cost function itself. The pdfs are estimated from the samples using a nonparametric approach. The challenging optimization problem is solved using a global method: the differential evolution algorithm. Two information theoretic optimization criteria are compared, and their ability to extract audio features specific to speech production is discussed. Using these specific audio features, candidate video features are then classified as members of the “speaker” or “non-speaker” class, resulting in a speaker detection scheme. As a result, our method achieves a speaker detection rate of 100% on in-house test sequences, and of 85% on the most commonly used sequences. Index Terms—Audio features, differential evolution, multimodal, mutual information, speaker detection, speech.
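The overall recipe, optimizing an MI-based objective over feature parameters with differential evolution, can be sketched with SciPy's global optimizer. Everything below is a toy stand-in for the paper's setup: the 2-D "audio features", the video proxy signal, and the single-angle parameterization of the extracted feature are all assumptions.

```python
import numpy as np
from scipy.optimize import differential_evolution

def neg_mutual_information(theta, feats, video):
    # Project the 2-D audio features onto direction theta, then score a
    # plug-in histogram MI (nats) against the video signal; negated
    # because differential_evolution minimizes.
    proj = feats @ np.array([np.cos(theta[0]), np.sin(theta[0])])
    pxy, _, _ = np.histogram2d(proj, video, bins=12)
    pxy = pxy / pxy.sum()
    px, py = pxy.sum(1, keepdims=True), pxy.sum(0, keepdims=True)
    nz = pxy > 0
    return -float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(5)
video = rng.normal(size=3000)
# Hypothetical 2-D audio features: only the first dimension carries
# speech-related information correlated with the video signal.
feats = np.column_stack([video + 0.3 * rng.normal(size=3000),
                         rng.normal(size=3000)])

res = differential_evolution(neg_mutual_information, bounds=[(0, np.pi)],
                             args=(feats, video), seed=0)
# The optimized direction should beat the uninformative orthogonal one.
print(-res.fun > -neg_mutual_information([np.pi / 2], feats, video))  # True
```

A population-based global method suits this objective because the sample-based MI surface is non-smooth and has no usable analytic gradient.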
Learning Discriminative Feature Transforms to Low Dimensions in Low Dimensions
 In Advances in neural information processing systems 14
, 2001
"... The marriage of Renyi entropy with Parzen density estimation has been shown to be a viable tool in learning discriminative feature transforms. ..."
Abstract

Cited by 13 (3 self)
 Add to MetaCart
(Show Context)
The marriage of Rényi entropy with Parzen density estimation has been shown to be a viable tool in learning discriminative feature transforms.