GTM: The generative topographic mapping
- Neural Computation, 1998
Cited by 234 (5 self)
Latent variable models represent the probability density of data in a space of several dimensions in terms of a smaller number of latent, or hidden, variables. A familiar example is factor analysis, which is based on a linear transformation between the latent space and the data space. In this paper we introduce a form of non-linear latent variable model called the Generative Topographic Mapping, for which the parameters of the model can be determined using the EM algorithm. GTM provides a principled alternative to the widely used Self-Organizing Map (SOM) of Kohonen (1982), and overcomes most of the significant limitations of the SOM. We demonstrate the performance of the GTM algorithm on a toy problem and on simulated data from flow diagnostics for a multi-phase oil pipeline. Copyright © MIT Press (1998).
A Review and Empirical Evaluation of Feature Weighting Methods for a Class of Lazy Learning Algorithms
- Artificial Intelligence Review, 1997
Cited by 94 (0 self)
Many lazy learning algorithms are derivatives of the k-nearest neighbor (k-NN) classifier, which uses a distance function to generate predictions from stored instances. Several studies have shown that k-NN's performance is highly sensitive to the definition of its distance function. Many k-NN variants have been proposed to reduce this sensitivity by parameterizing the distance function with feature weights. However, these variants have not been categorized nor empirically compared. This paper reviews a class of weight-setting methods for lazy learning algorithms. We introduce a framework for distinguishing these methods and empirically compare them. We observed four trends from our experiments and conducted further studies to highlight them. Our results suggest that methods which use performance feedback to assign weight settings demonstrated three advantages over other methods: they require less pre-processing, perform better in the presence of interacting features, and generally require less training data to learn good settings. We also found that continuous weighting methods tend to outperform feature selection algorithms for tasks where some features are useful but less important than others.
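The idea of parameterizing the k-NN distance function with feature weights can be sketched as follows. This is a minimal illustration, not one of the methods evaluated in the paper: the toy data, the weight values, and the function names `weighted_distance` and `knn_predict` are all assumptions made for the example, and the weights are set by hand rather than learned from performance feedback.

```python
import math
from collections import Counter

def weighted_distance(a, b, weights):
    """Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def knn_predict(query, instances, labels, weights, k=1):
    """Majority vote among the k stored instances closest to the query."""
    ranked = sorted(
        range(len(instances)),
        key=lambda i: weighted_distance(query, instances[i], weights),
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: feature 0 determines the class,
# feature 1 is irrelevant noise on a much larger scale.
instances = [(0.0, 5.0), (0.1, -4.0), (0.2, 9.0),
             (1.0, 4.5), (0.9, -5.0), (1.1, 8.0)]
labels = ["a", "a", "a", "b", "b", "b"]
query = (0.4, 4.5)  # closer to class "a" along the informative feature

# Equal weights let the noisy feature dominate the distance.
uniform = knn_predict(query, instances, labels, weights=(1.0, 1.0))
# A small continuous weight on the noisy feature restores the expected answer.
weighted = knn_predict(query, instances, labels, weights=(1.0, 0.01))
print(uniform, weighted)
```

Setting a weight to exactly zero corresponds to feature selection, while intermediate values such as the 0.01 used here correspond to continuous weighting.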
Local polynomial kernel regression for generalized linear models and quasi-likelihood functions
- Journal of the American Statistical Association, 90, 1995
Cited by 38 (4 self)
[Generalized linear models] were introduced as a means of extending the techniques of ordinary parametric regression to several commonly used regression models arising from non-normal likelihoods. Typically these models have a variance that depends on the mean function. However, in many cases the likelihood is unknown, but the relationship between mean and variance can be specified. This has led to the consideration of quasi-likelihood methods, where the conditional log-likelihood is replaced by a quasi-likelihood function. In this article we investigate the extension of the nonparametric regression technique of local polynomial fitting with a kernel weight to these more general contexts. In the ordinary regression case local polynomial fitting has been seen to possess several appealing features in terms of intuitive and mathematical simplicity. One noteworthy feature is the better performance near the boundaries compared to the traditional kernel regression estimators. These properties are shown to carry over to the generalized linear model and quasi-likelihood model. The end result is a class of kernel type estimators for smoothing in quasi-likelihood models. These estimators can be viewed as a straightforward generalization of the usual parametric estimators. In addition, their simple asymptotic distributions allow for simple interpretation
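The boundary behaviour mentioned in the abstract can be illustrated in the ordinary regression case only. The sketch below compares a traditional kernel (local constant, Nadaraya-Watson) estimate with a local linear fit at the left edge of the data. It does not implement the quasi-likelihood extension studied in the article; the Gaussian kernel, the bandwidth `h`, and the noiseless test line are assumptions chosen for the example.

```python
import math

def gaussian_kernel(u):
    """Unnormalized Gaussian kernel weight."""
    return math.exp(-0.5 * u * u)

def local_constant(x0, xs, ys, h):
    """Nadaraya-Watson estimate: kernel-weighted average of the responses."""
    w = [gaussian_kernel((x - x0) / h) for x in xs]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)

def local_linear(x0, xs, ys, h):
    """Fit a kernel-weighted line centred at x0 and return its value at x0."""
    w = [gaussian_kernel((x - x0) / h) for x in xs]
    d = [x - x0 for x in xs]
    s0 = sum(w)
    s1 = sum(wi * di for wi, di in zip(w, d))
    s2 = sum(wi * di * di for wi, di in zip(w, d))
    t0 = sum(wi * yi for wi, yi in zip(w, ys))
    t1 = sum(wi * di * yi for wi, di, yi in zip(w, d, ys))
    # Intercept of the weighted least-squares line, from the 2x2 normal equations.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)

# Noiseless line y = 2x + 1 sampled on [0, 1]; the true value at x = 0 is 1.
xs = [i / 20 for i in range(21)]
ys = [2.0 * x + 1.0 for x in xs]
h = 0.2  # bandwidth, chosen arbitrarily for the illustration

nw_at_boundary = local_constant(0.0, xs, ys, h)
ll_at_boundary = local_linear(0.0, xs, ys, h)
print(nw_at_boundary, ll_at_boundary)
```

At the boundary all neighbours lie on one side of the evaluation point, so the weighted average is pulled upward, while the local linear fit recovers the line up to floating-point error.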
Nonparametric function induction in semi-supervised learning
- In Proc. Artificial Intelligence and Statistics, 2005
Cited by 27 (4 self)
There has been an increase of interest in semi-supervised learning recently, because of the many datasets with large amounts of unlabeled examples and only a few labeled ones. This paper follows up on proposed nonparametric algorithms which provide an estimated continuous label for the given unlabeled examples. First, it extends them to function induction algorithms that minimize a regularization criterion applied to an out-of-sample example, and happen to have the form of Parzen windows regressors. This makes it possible to predict test labels without solving again a linear system of dimension n (the number of unlabeled and labeled training examples), which can cost O(n^3). Second, this function induction procedure gives rise to an efficient approximation of the training process, reducing the linear system to be solved to m ≪ n unknowns, using only a subset of m examples. An improvement of O(n^2/m^2) in time can thus be obtained. Comparative experiments are presented, showing the good performance of the induction formula and approximation algorithm.
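A Parzen windows regressor predicts the label of a new point as a kernel-weighted average of labels already attached to the training points, which costs O(n) per test point instead of a fresh linear solve. The sketch below shows only that general form. The Gaussian kernel, the width `sigma`, the function name `parzen_predict`, and the training labels are assumptions for illustration; in particular the continuous labels are hard-coded here rather than produced by the paper's training procedure.

```python
import math

def parzen_predict(x, train_points, train_labels, sigma=1.0):
    """Kernel-weighted average of training labels, evaluated at a new point x."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * sigma ** 2))
        for p in train_points
    ]
    return sum(w * y for w, y in zip(weights, train_labels)) / sum(weights)

# Hypothetical training set with continuous labels assumed to have been
# estimated beforehand (labeled and unlabeled examples alike).
train_points = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0), (3.0, 4.0)]
train_labels = [-1.0, -0.8, 0.9, 1.0]

# Out-of-sample predictions reuse the stored labels; nothing is re-solved.
pred_low = parzen_predict((0.2, 0.4), train_points, train_labels)
pred_high = parzen_predict((3.1, 3.4), train_points, train_labels)
print(pred_low, pred_high)
```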
Modeling and Integrating Background Knowledge in Data Anonymization
Cited by 6 (0 self)
Recent work has shown the importance of considering the adversary’s background knowledge when reasoning about privacy in data publishing. However, it is very difficult for the data publisher to know exactly the adversary’s background knowledge. Existing work cannot satisfactorily model background knowledge and reason about privacy in the presence of such knowledge. This paper presents a general framework for modeling the adversary’s background knowledge using kernel estimation methods. This framework subsumes different types of knowledge (e.g., negative association rules) that can be mined from the data. Under this framework, we reason about privacy using Bayesian inference techniques and propose the skyline (B, t)-privacy model, which allows the data publisher to enforce privacy requirements to protect the data against adversaries with different levels of background knowledge. Through an extensive set of experiments, we show the effects of probabilistic background knowledge in data anonymization and the effectiveness of our approach in both privacy protection and utility preservation.
Powering up with space-time wind forecasting
- Journal of the American Statistical Association, 2009
Cited by 5 (5 self)
The technology to harvest electricity from wind energy is now advanced enough to make entire cities powered by it a reality. High-quality short-term forecasts of wind speed are vital to making this a more reliable energy source. Gneiting et al. (2006) have introduced a model for the average wind speed two hours ahead based on both spatial and temporal information. The forecasts produced by this model are accurate, and subject to accuracy, the predictive distribution is sharp, i.e., highly concentrated around its center. However, this model is split into nonunique regimes based on the wind direction at an off-site location. This paper both generalizes and improves upon this model by treating wind direction as a circular variable and including it in the model. It is robust in many experiments, such as predicting at new locations. We compare this with the more common approach of modeling wind speeds and directions in the Cartesian space and use a skew-t distribution for the errors. The quality of the predictions from all of these models can be more realistically assessed with a loss measure that depends upon the power curve relating wind speed to power output. This proposed loss measure yields more insight into the true value of each model’s predictions. Some key words: Circular variable, power curve, skew-t distribution, wind direction, wind speed.
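Why wind direction is better treated as a circular variable than as an ordinary number can be seen with a small example: directions of 350° and 10° are only 20° apart, yet their arithmetic mean points the opposite way. The sketch below is a generic illustration of circular averaging and is not the forecasting model from the paper; the function name `circular_mean_deg` and the sample directions are assumptions.

```python
import math

def circular_mean_deg(angles_deg):
    """Mean direction in degrees, computed from summed unit vectors."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

directions = [350.0, 10.0]  # two nearly northerly wind directions

naive_mean = sum(directions) / len(directions)   # treats degrees as a line
circular_mean = circular_mean_deg(directions)    # respects the wrap-around

# Angular distance of the circular mean from north (0 degrees).
offset_from_north = min(circular_mean, 360.0 - circular_mean)
print(naive_mean, offset_from_north)
```

The naive mean is 180°, due south, while the circular mean lies at north up to rounding error, which may show up as a value just below 360° rather than exactly 0°.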
Cumulative distribution networks: Inference, estimation and applications of graphical models for cumulative distribution functions
- 2009
Visual exploration of high dimensional scalar functions
- IEEE Trans. Visualization and Computer Graphics, 2010
Cited by 2 (1 self)
An important goal of scientific data analysis is to understand the behavior of a system or process based on a sample of the system. In many instances it is possible to observe both input parameters and system outputs, and characterize the system as a high-dimensional function. Such data sets arise, for instance, in large numerical simulations, as energy landscapes in optimization problems, or in the analysis of image data relating to biological or medical parameters. This paper proposes an approach to analyzing and visualizing such data sets. The proposed method combines topological and geometric techniques to provide interactive visualizations of discretely sampled high-dimensional scalar fields. The method relies on a segmentation of the parameter space using an approximate Morse-Smale complex on the cloud of point samples. For each crystal of the Morse-Smale complex, a regression of the system parameters with respect to the output yields a curve in the parameter space. The result is a simplified geometric representation of the Morse-Smale complex in the high-dimensional input domain. Finally, the geometric representation is embedded in 2D, using dimension reduction, to provide a visualization platform. The geometric properties of the regression curves enable the visualization of additional information about each crystal such as local and global shape, width, length, and sampling densities. The method is illustrated on several synthetic examples of two-dimensional functions. Two use cases, using data sets from the UCI machine learning repository, demonstrate the utility of the proposed approach on real data. Finally, in collaboration with domain experts the proposed
Dimensionality Reduction and Principal Surfaces via Kernel Map Manifolds
Cited by 2 (1 self)
We present a manifold learning approach to dimensionality reduction that explicitly models the manifold as a mapping from low to high dimensional space. The manifold is represented as a parametrized surface whose parameters are defined on the input samples. The representation also provides a natural mapping from high to low dimensional space, and a concatenation of these two mappings induces a projection operator onto the manifold. The explicit projection operator allows for a clearly defined objective function in terms of projection distance and reconstruction error. A formulation of the mappings in terms of kernel regression permits a direct optimization of the objective function, and the extremal points converge to principal surfaces as the number of data points to learn from increases. Principal surfaces have the desirable property that they, informally speaking, pass through the middle of a distribution. We provide a proof of the convergence to principal surfaces and illustrate the effectiveness of the proposed approach on synthetic and real data sets.

