Results 1–10 of 23
Multi-view regression via canonical correlation analysis
In Proc. of Conference on Learning Theory, 2007
Cited by 50 (7 self)
Abstract. In the multi-view regression problem, we have a regression problem where the input variable (which is a real vector) can be partitioned into two different views, and it is assumed that either view of the input is sufficient to make accurate predictions — this is essentially (a significantly weaker version of) the co-training assumption for the regression problem. We provide a semi-supervised algorithm which first uses unlabeled data to learn a norm (or, equivalently, a kernel) and then uses labeled data in a ridge regression algorithm (with this induced norm) to provide the predictor. The unlabeled data is used via canonical correlation analysis (CCA, which is closely related to PCA for two random variables) to derive an appropriate norm over functions. We are able to characterize the intrinsic dimensionality of the subsequent ridge regression problem (which uses this norm) by a rather simple expression in the correlation coefficients provided by CCA. Interestingly, the norm used by the ridge regression algorithm is derived from CCA, unlike in standard kernel methods where a special a priori norm is assumed (i.e. a Hilbert space is assumed). We discuss how this result shows that unlabeled data can decrease the sample complexity.
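As a concrete illustration of the CCA step, the sketch below estimates canonical correlations between two views from unlabeled samples. The data, dimensions, and function names are illustrative assumptions, not the paper's setup, and the induced-norm construction for ridge regression is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-view data: a shared 2-d latent signal observed through
# different linear maps, plus independent noise in each view.
n = 500
z = rng.normal(size=(n, 2))
view1 = z @ rng.normal(size=(2, 4)) + 0.5 * rng.normal(size=(n, 4))
view2 = z @ rng.normal(size=(2, 3)) + 0.5 * rng.normal(size=(n, 3))

def canonical_correlations(x, y):
    """Canonical correlations via the SVD of the whitened cross-covariance."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    m = len(x) - 1
    cxx, cyy, cxy = x.T @ x / m, y.T @ y / m, x.T @ y / m
    # Whiten each view with the inverse Cholesky factor of its covariance.
    lx = np.linalg.inv(np.linalg.cholesky(cxx))
    ly = np.linalg.inv(np.linalg.cholesky(cyy))
    rho = np.linalg.svd(lx @ cxy @ ly.T, compute_uv=False)
    return np.clip(rho, 0.0, 1.0)  # descending order, in [0, 1]

rho = canonical_correlations(view1, view2)
```

With a 2-d shared latent, the first two correlations come out large and the third near zero; in the paper these coefficients then parameterize the norm (equivalently, the kernel) used by the subsequent ridge regression.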
The Information Bottleneck Revisited or How to Choose a Good Distortion Measure
Cited by 25 (0 self)
Abstract. It is well-known that the information bottleneck method and rate distortion theory are related. Here it is described how the information bottleneck can be considered as rate distortion theory for a family of probability measures where information divergence is used as the distortion measure. It is shown that the information bottleneck method has some properties that are not shared with rate distortion theory based on any other divergence measure; in this sense the information bottleneck method is unique.
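The divergence-as-distortion view can be made concrete in a few lines; the toy conditional distributions below are made up purely for illustration.

```python
import numpy as np

def kl(p, q):
    """Information divergence D(p || q) in nats (assumes q > 0 wherever p > 0)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = p > 0
    return float(np.sum(p[m] * np.log(p[m] / q[m])))

# In the IB reading of rate distortion, the distortion between a source symbol x
# and a cluster t is d(x, t) = D(p(y|x) || p(y|t)): the predictive information
# about Y lost by coding x as t. These toy conditionals are illustrative.
p_y_given_x = np.array([0.7, 0.2, 0.1])
p_y_given_t = np.array([0.5, 0.3, 0.2])
d = kl(p_y_given_x, p_y_given_t)
```

The distortion vanishes exactly when the cluster's conditional matches the symbol's, which is the property that ties the rate-distortion view back to preserving information about Y.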
Unsupervised image-set clustering using an information-theoretic framework
IEEE Transactions on Image Processing, 2006
Cited by 19 (0 self)
Abstract. In this paper, we combine discrete and continuous image models with information-theoretic criteria for unsupervised hierarchical image-set clustering. The continuous image modeling is based on mixtures of Gaussian densities. The unsupervised image-set clustering is based on a generalized version of a recently introduced information-theoretic principle, the information bottleneck principle. Images are clustered such that the mutual information between the clusters and the image content is maximally preserved. Experimental results demonstrate the performance of the proposed framework for image clustering on a large image set. Information-theoretic tools are used to evaluate cluster quality. Particular emphasis is placed on the application of the clustering for efficient image search and retrieval. Index Terms — hierarchical database analysis, image clustering, image database management, image modeling, information bottleneck (IB), Kullback–Leibler divergence, mixture of Gaussians, mutual information, retrieval.
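A generalized (agglomerative) IB clustering of this kind greedily merges the pair of clusters whose merge loses the least mutual information about the relevance variable. A sketch of that merge cost, with illustrative names rather than the paper's code, is:

```python
import numpy as np

def merge_cost(w1, w2, py_t1, py_t2):
    """Agglomerative-IB merge cost: the mutual information about Y lost by
    merging clusters t1, t2 with weights p(t1)=w1, p(t2)=w2. Equals the
    weighted Jensen-Shannon divergence of their conditionals p(y|t)."""
    pi1, pi2 = w1 / (w1 + w2), w2 / (w1 + w2)
    pbar = pi1 * py_t1 + pi2 * py_t2  # conditional of the merged cluster
    def kl(p, q):
        m = p > 0
        return float(np.sum(p[m] * np.log(p[m] / q[m])))
    return (w1 + w2) * (pi1 * kl(py_t1, pbar) + pi2 * kl(py_t2, pbar))

# Clusters with similar conditionals are cheap to merge; dissimilar ones are not.
a = np.array([0.8, 0.1, 0.1])
b = np.array([0.1, 0.1, 0.8])
cheap = merge_cost(0.5, 0.5, a, a)
costly = merge_cost(0.5, 0.5, a, b)
```

Repeatedly applying the cheapest merge yields the hierarchy of image clusters the abstract describes, with mutual information about image content maximally preserved at each step.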
Learning and generalization with the information bottleneck method
2008
Cited by 9 (2 self)
The Information Bottleneck (IB) method, introduced in [22], is an information-theoretic framework for extracting relevant components of an ‘input’ random variable X, with respect to an ‘output’ random variable Y. This is performed by finding a compressed, non-parametric and model-independent representation …
Predictive Coding and the Slowness Principle: An InformationTheoretic Approach
2008
Cited by 9 (0 self)
Understanding the guiding principles of sensory coding strategies is a main goal in computational neuroscience. Among others, the principles of predictive coding and slowness appear to capture aspects of sensory processing. Predictive coding postulates that sensory systems are adapted to the structure of their input signals such that information about future inputs is encoded. Slow feature analysis (SFA) is a method for extracting slowly varying components from quickly varying input signals, thereby learning temporally invariant features. Here, we use the information bottleneck method to state an information-theoretic objective function for temporally local predictive coding. We then show that the linear case of SFA can be interpreted as a variant of predictive coding that maximizes the mutual information between the current output of the system and the input signal in the next time step. This demonstrates that the slowness principle and predictive coding are intimately related.
Nonparametric dependent components
In Proceedings of ICASSP’05, IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005
Reprinted with permission.
Receptive fields without spike-triggering
In Advances in Neural Information Processing Systems 20, 2008
Cited by 7 (3 self)
Stimulus selectivity of sensory neurons is often characterized by estimating their receptive field properties such as orientation selectivity. Receptive fields are usually derived from the mean (or covariance) of the spike-triggered stimulus ensemble. This approach treats each spike as an independent message but does not take into account that information might be conveyed through patterns of neural activity that are distributed across space or time. Can we find a concise description for the processing of a whole population of neurons analogous to the receptive field for single neurons? Here, we present a generalization of the linear receptive field which is not bound to be triggered on individual spikes but can be meaningfully linked to distributed response patterns. More precisely, we seek to identify those stimulus features and the corresponding patterns of neural activity that are most reliably coupled. We use an extension of reverse-correlation methods based on canonical correlation analysis. The resulting population receptive fields span the subspace of stimuli that is most informative about the population response. We evaluate our approach using both neuronal models and multi-electrode recordings from rabbit retinal ganglion cells. We show how the model can be extended to capture nonlinear stimulus-response relationships using kernel canonical correlation analysis, which makes it possible to test different coding mechanisms. Our technique can also be used to calculate receptive fields from multidimensional neural measurements such as those obtained from dynamic imaging methods.
Speaker Recognition by Gaussian Information Bottleneck
Cited by 2 (0 self)
This paper explores a novel approach for the extraction of relevant information in speaker recognition tasks. This approach uses a principled information-theoretic framework, the Information Bottleneck method (IB). In our application, the method compresses the acoustic data while mostly preserving the information relevant for speaker identification. This paper focuses on a continuous version of the IB method known as the Gaussian Information Bottleneck (GIB), which assumes that both the source and target variables are high-dimensional multivariate Gaussian variables. In our work, the GIB was applied to the Super Vector (SV) dimension reduction problem. Experiments were conducted on the male part of the NIST SRE 2005 corpora. The GIB representation was compared to other dimension reduction techniques and to a baseline system. In our experiments, the GIB outperformed the baseline, achieving a 6.1% Equal Error Rate (EER) compared to 15.1% for the baseline system.
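For the GIB itself there is a closed-form characterization: the optimal projections are eigenvectors of Sigma_{x|y} Sigma_x^{-1}, and the eigenvalues equal 1 - rho_i^2 for the canonical correlations rho_i, so small eigenvalues mark the directions most informative about the target. The sketch below checks this on made-up jointly Gaussian data; all names, dimensions, and noise levels are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Made-up jointly Gaussian source/target: the first x-coordinate carries y
# almost noiselessly, the second noisily, the third not at all.
n = 5000
y = rng.normal(size=(n, 2))
x = np.column_stack([
    y[:, 0] + 0.1 * rng.normal(size=n),  # strongly informative about y
    y[:, 1] + 1.0 * rng.normal(size=n),  # weakly informative
    rng.normal(size=n),                  # pure noise
])

xc, yc = x - x.mean(axis=0), y - y.mean(axis=0)
sx = xc.T @ xc / (n - 1)
sy = yc.T @ yc / (n - 1)
sxy = xc.T @ yc / (n - 1)
sx_given_y = sx - sxy @ np.linalg.inv(sy) @ sxy.T

# Eigenvalues of Sigma_{x|y} Sigma_x^{-1} = 1 - rho_i^2; small values flag
# the directions the GIB projects onto first.
eigvals = np.sort(np.linalg.eigvals(sx_given_y @ np.linalg.inv(sx)).real)
```

The three eigenvalues land near 0.01, 0.5, and 1.0, matching the construction: the nearly noiseless coordinate is kept first, the pure-noise coordinate is discarded.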
From the information bottleneck to the privacy funnel
ArXiv e-prints, 2014
Cited by 2 (0 self)
Abstract. We focus on the privacy-utility tradeoff encountered by users who wish to disclose some information, correlated with their private data, to an analyst, in the hope of receiving some utility. We rely on a general statistical inference framework for privacy, under which data is transformed before it is disclosed, according to a probabilistic privacy mapping. We show that when the log-loss is introduced in this framework in both the privacy metric and the distortion metric, the privacy leakage and the utility constraint reduce to the mutual information between private data and disclosed data, and between non-private data and disclosed data, respectively. We justify the relevance and generality of the privacy metric under the log-loss by proving that the inference threat under any bounded cost function can be upper-bounded by an explicit function of the mutual information between private data and disclosed data. We then show that the privacy-utility tradeoff under the log-loss can be cast as the non-convex Privacy Funnel optimization, and we leverage its connection to the Information Bottleneck to provide a greedy algorithm that is locally optimal. We evaluate its performance on the US Census dataset. Finally, we characterize the optimal privacy mapping for the Gaussian Privacy Funnel.
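Under the log-loss, the leakage is exactly the mutual information I(private; disclosed), so a candidate privacy mapping can be audited directly. Below is a toy binary example; the uniform prior and the flip-probability mapping are illustrative assumptions, not the paper's mechanism.

```python
import numpy as np

def mutual_info(pxy):
    """I(X;Y) in nats from a joint probability table p(x, y)."""
    pxy = np.asarray(pxy, dtype=float)
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    m = pxy > 0
    return float(np.sum(pxy[m] * np.log((pxy / (px * py))[m])))

def leakage(eps):
    """Leakage I(S; Z) when the disclosed bit Z flips the private bit S
    with probability eps (a toy probabilistic privacy mapping)."""
    ps = np.array([0.5, 0.5])
    pz_given_s = np.array([[1.0 - eps, eps], [eps, 1.0 - eps]])
    return mutual_info(ps[:, None] * pz_given_s)
```

Flipping more often moves along the privacy side of the tradeoff: leakage(0.0) is one full bit (ln 2 nats) and leakage(0.5) is zero, with utility degrading in the same direction, which is the tension the Privacy Funnel optimizes.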