Gaussian process latent variable models for visualisation of high dimensional data
Adv. in Neural Inf. Proc. Sys., 2004
Cited by 133 (5 self)
We introduce a variational inference framework for training the Gaussian process latent variable model and thus performing Bayesian nonlinear dimensionality reduction. This method allows us to variationally integrate out the input variables of the Gaussian process and compute a lower bound on the exact marginal likelihood of the nonlinear latent variable model. The maximization of the variational lower bound provides a Bayesian training procedure that is robust to overfitting and can automatically select the dimensionality of the nonlinear latent space. We demonstrate our method on real-world datasets. The focus in this paper is on dimensionality reduction problems, but the methodology is more general. For example, our algorithm is immediately applicable for training Gaussian process models in the presence of missing or uncertain inputs.
Constructing Internet Coordinate System Based on Delay Measurement
2003
Cited by 113 (3 self)
In this paper, we consider the problem of how to represent the locations of Internet hosts in a Cartesian coordinate system to facilitate estimation of the network distance between two arbitrary Internet hosts. We envision an infrastructure that consists of beacon nodes and provides the service of estimating network distance between two hosts without direct delay measurement. We show that the principal component analysis (PCA) technique can effectively extract topological information from delay measurements between beacon hosts. Based on PCA, we devise a transformation method that projects the distance data space into a new coordinate system of (much) smaller dimensions. The transformation retains as much topological information as possible and yet enables end hosts to easily determine their locations in the coordinate system. The resulting new coordinate system is termed the Internet Coordinate System (ICS). Compared to existing work (e.g., IDMaps [1] and GNP [2]), ICS incurs smaller computation overhead in calculating the coordinates of hosts and smaller measurement overhead (required for end hosts to measure their distances to beacon hosts). Finally, we show via experimentation with real-life data sets that ICS is robust and accurate, regardless of the number of beacon nodes (as long as it exceeds a certain threshold) and the complexity of network topology.
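The general idea of projecting beacon delay measurements onto a few principal components can be illustrated with a minimal sketch. It assumes NumPy is available; the delay values, the function names fit_pca and to_coordinates, and the choice of k = 2 are hypothetical, and the sketch shows plain PCA rather than the paper's exact transformation.

```python
import numpy as np

def fit_pca(beacon_delays, k):
    """Return (mean, components) holding the top-k principal directions."""
    data = np.asarray(beacon_delays, dtype=float)
    mean = data.mean(axis=0)
    # Rows of vt are the principal directions of the centered delay vectors.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:k]

def to_coordinates(delay_vector, mean, components):
    """Project one delay vector (or a matrix of them) into k dimensions."""
    return (np.asarray(delay_vector, dtype=float) - mean) @ components.T

# Hypothetical symmetric delay matrix (ms) between four beacon nodes.
beacon_delays = [
    [0.0, 10.0, 40.0, 45.0],
    [10.0, 0.0, 35.0, 42.0],
    [40.0, 35.0, 0.0, 12.0],
    [45.0, 42.0, 12.0, 0.0],
]
mean, components = fit_pca(beacon_delays, k=2)
beacon_coords = to_coordinates(beacon_delays, mean, components)

# An end host only needs its measured delays to the beacons (hypothetical values).
host_coords = to_coordinates([8.0, 6.0, 38.0, 44.0], mean, components)
```

In this sketch an end host never measures delay to another end host: its coordinates follow from its delays to the beacons alone, and distances between hosts would then be estimated from the low-dimensional coordinates.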
Segmenting Motion Capture Data into Distinct Behaviors
In Graphics Interface, 2004
Cited by 86 (5 self)
Much of the motion capture data used in animations, commercials, and video games is carefully segmented into distinct motions either at the time of capture or by hand after the capture session. As we move toward collecting more and longer motion sequences, however, automatic segmentation techniques will become important for processing the results in a reasonable time frame.
Probabilistic Independent Component Analysis
2003
Cited by 75 (12 self)
Independent Component Analysis is becoming a popular exploratory method for analysing complex data such as that from FMRI experiments. The application of such 'model-free' methods, however, has been somewhat restricted both by the view that results can be uninterpretable and by the lack of ability to quantify statistical significance. We present an integrated approach to Probabilistic ICA for FMRI data that allows for non-square mixing in the presence of Gaussian noise. We employ an objective estimation of the amount of Gaussian noise through Bayesian analysis of the true dimensionality of the data, i.e. the number of activation and non-Gaussian noise sources. Reduction of the data to this 'true' subspace before the ICA decomposition automatically results in an estimate of the noise, leading to the ability to assign significance to voxels in ICA spatial maps. Estimation of the number of intrinsic sources not only enables us to carry out probabilistic modelling, but also achieves an asymptotically unique decomposition of the data. This reduces problems of interpretation, as each final independent component is now much more likely to be due to only one physical or physiological process. We also describe other improvements to standard ICA, such as temporal pre-whitening and variance normalisation of time series, the latter being particularly useful in the context of dimensionality reduction when weak activation is present. We discuss the use of prior information about the spatiotemporal nature of the source processes, and an alternative-hypothesis testing approach for inference, using Gaussian mixture models. The performance of our approach is illustrated and evaluated on real and complex artificial FMRI data, and compared to the spatiotemporal accuracy of results obtaine...
Is there something out there? Inferring space from sensorimotor dependencies
Neural Computation, 2002
Cited by 53 (3 self)
This paper suggests that in biological organisms, the perceived structure of reality, in particular the notions of body, environment, space, object, and attribute, could be a consequence of an effort on the part of brains to account for the dependency between their inputs and their outputs in terms of a small number of parameters. To validate this idea, a procedure is demonstrated whereby the brain of an organism with arbitrary input and output connectivity can deduce the dimensionality of the rigid group of the space underlying its input-output relationship, that is, the dimension of what the organism will call physical space.
Minimum description length shape and appearance models
In Image Processing Medical Imaging, IPMI, 2003
Cited by 41 (1 self)
The Minimum Description Length (MDL) approach to shape modelling is reviewed. It solves the point correspondence problem of selecting points on shapes defined as curves so that the points correspond across a data set. An efficient numerical implementation is presented and made available as open source Matlab code. The problems with the early MDL approaches are discussed. Finally the MDL approach is extended to an MDL Appearance Model, which is proposed as a means to perform unsupervised image segmentation.
Nonlinear Matrix Factorization with Gaussian Processes
Cited by 40 (1 self)
A popular approach to collaborative filtering is matrix factorization. In this paper we develop a nonlinear probabilistic matrix factorization using Gaussian process latent variable models. We use stochastic gradient descent (SGD) to optimize the model. SGD allows us to apply Gaussian processes to data sets with millions of observations without approximate methods. We apply our approach to benchmark movie recommender data sets. The results show better than previous state-of-the-art performance.
On ranking the effectiveness of searches
In: Proc. of the 29th Annual Int’l ACM SIGIR Conf. on Research and Development in Information Retrieval, 2006
Cited by 20 (0 self)
There is a growing interest in estimating the effectiveness of search. Two approaches are typically considered: examining the search queries and examining the retrieved document sets. In this paper, we take the latter approach. We use four measures to characterize the retrieved document sets and estimate the quality of search. These measures are (i) the clustering tendency as measured by the Cox-Lewis statistic, (ii) the sensitivity to document perturbation, (iii) the sensitivity to query perturbation and (iv) the local intrinsic dimensionality. We present experimental results for the task of ranking 200 queries according to the search effectiveness over the TREC (discs 4 and 5) dataset. Our ranking of queries is compared with the ranking based on the average precision using the Kendall τ statistic. The best individual estimator is the sensitivity to document perturbation and yields a Kendall τ of 0.521. When combined with the clustering tendency based on the Cox-Lewis statistic and the query perturbation measure, it results in a Kendall τ of 0.562, which to our knowledge is the highest correlation with the average precision reported to date.
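As a reminder of how the Kendall τ comparison in this abstract works, the following minimal sketch computes τ between a predicted effectiveness score and the average precision of each query. The score values and the function name kendall_tau are hypothetical, and the sketch uses the simple tau-a form, which assumes no tied scores; the paper may use a different variant.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    if len(x) != len(y):
        raise ValueError("score lists must have the same length")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        # A pair is concordant when both scores order the two queries alike.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical per-query values: estimator output vs. measured average precision.
predicted = [0.9, 0.4, 0.7, 0.1, 0.5]
average_precision = [0.62, 0.30, 0.35, 0.05, 0.41]
tau = kendall_tau(predicted, average_precision)
```

A τ of 1 means the estimator ranks the queries exactly as average precision does, 0 means no agreement beyond chance, and -1 means the ranking is fully reversed.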
Signal Detection Using ICA: Application to Chat Room Topic Spotting
2001
Cited by 19 (3 self)
Signal detection and pattern recognition for online grouping of huge amounts of data and retrospective analysis are becoming increasingly important as knowledge-based standards, such as XML and advanced MPEG, gain popularity. Independent component analysis (ICA) can be used to both cluster and detect signals with weak a priori assumptions in multimedia contexts. ICA of real-world data is typically performed without knowledge of the number of nontrivial independent components; hence, it is of interest to test hypotheses concerning the number of components or simply to test whether a given set of components is significant relative to a "white noise" null hypothesis. It was recently proposed to use the so-called Bayesian information criterion (BIC) approximation for estimation of such probabilities of competing hypotheses. Here, we apply this approach to the understanding of chat. We show that ICA can detect meaningful context structures in a chat room log file.
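The BIC approximation mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard form BIC = k ln N - 2 ln L and, assuming equal prior probabilities, treats exp(-BIC/2) as an approximation to each hypothesis's evidence. The log-likelihoods, parameter counts, sample size, and function names are hypothetical and are not taken from the paper.

```python
import math

def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: k * ln(N) - 2 * ln(L)."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood

def hypothesis_probabilities(bic_scores):
    """Approximate posterior probabilities, assuming equal priors."""
    best = min(bic_scores)
    # Subtracting the minimum keeps the exponentials numerically stable.
    weights = [math.exp(-(score - best) / 2.0) for score in bic_scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical fits with an increasing number of independent components.
log_likelihoods = [-1250.0, -1190.0, -1188.0]
param_counts = [10, 20, 30]
n_samples = 500

scores = [bic(ll, k, n_samples) for ll, k in zip(log_likelihoods, param_counts)]
probabilities = hypothesis_probabilities(scores)
```

With these made-up numbers the middle model is favoured: the largest model fits only slightly better, and the k ln N penalty outweighs that gain.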
Practical Approaches to Principal Component Analysis in the Presence of Missing Values
"... Faculty of Information and Natural Sciences ..."