Results 11  20
of
1,612
Survey of clustering data mining techniques
, 2002
"... Accrue Software, Inc. Clustering is a division of data into groups of similar objects. Representing the data by fewer clusters necessarily loses certain fine details, but achieves simplification. It models data by its clusters. Data modeling puts clustering in a historical perspective rooted in math ..."
Abstract

Cited by 247 (0 self)
 Add to MetaCart
Accrue Software, Inc. Clustering is a division of data into groups of similar objects. Representing the data by fewer clusters necessarily loses certain fine details, but achieves simplification. It models data by its clusters. Data modeling puts clustering in a historical perspective rooted in mathematics, statistics, and numerical analysis. From a machine learning perspective clusters correspond to hidden patterns, the search for clusters is unsupervised learning, and the resulting system represents a data concept. From a practical perspective clustering plays an outstanding role in data mining applications such as scientific data exploration, information retrieval and text mining, spatial database applications, Web analysis, CRM, marketing, medical diagnostics, computational biology, and many others. Clustering is the subject of active research in several fields such as statistics, pattern recognition, and machine learning. This survey focuses on clustering in data mining. Data mining adds to clustering the complications of very large datasets with very many attributes of different types. This imposes unique
Probabilistic Visual Learning for Object Detection
, 1995
"... We present an unsupervised technique for visual learning which is based on density estimation in highdimensional spaces using an eigenspace decomposition. Two types of density estimates are derived for modeling the training data: a multivariate Gaussian (for a unimodal distribution) and a multivari ..."
Abstract

Cited by 210 (15 self)
 Add to MetaCart
We present an unsupervised technique for visual learning which is based on density estimation in highdimensional spaces using an eigenspace decomposition. Two types of density estimates are derived for modeling the training data: a multivariate Gaussian (for a unimodal distribution) and a multivariate MixtureofGaussians model (for multimodal distributions). These probability densities are then used to formulate a maximumlikelihood estimation framework for visual search and target detection for automatic object recognition. This learning technique is tested in experiments with modeling and subsequent detection of human faces and nonrigid objects such as hands.
Locality Preserving Projections
, 2002
"... Many problems in information processing involve some form of dimensionality reduction. In this paper, we introduce Locality Preserving Projections (LPP). These are linear projective maps that arise by solving a variational problem that optimally preserves the neighborhood structure of the data s ..."
Abstract

Cited by 209 (15 self)
 Add to MetaCart
Many problems in information processing involve some form of dimensionality reduction. In this paper, we introduce Locality Preserving Projections (LPP). These are linear projective maps that arise by solving a variational problem that optimally preserves the neighborhood structure of the data set. LPP should be seen as an alternative to Principal Component Analysis (PCA)  a classical linear technique that projects the data along the directions of maximal variance. When the high dimensional data lies on a low dimensional manifold embedded in the ambient space, the Locality Preserving Projections are obtained by finding the optimal linear approximations to the eigenfunctions of the Laplace Beltrami operator on the manifold. As a result, LPP shares many of the data representation properties of nonlinear techniques such as Laplacian Eigenmaps or Locally Linear Embedding. Yet LPP is linear and more crucially is defined everywhere in ambient space rather than just on the training data points. This is borne out by illustrative examples on some high dimensional data sets.
Minimax Entropy Principle and Its Application to Texture Modeling
, 1997
"... This article proposes a general theory and methodology, called the minimax entropy principle, for building statistical models for images (or signals) in a variety of applications. This principle consists of two parts. The first is the maximum entropy principle for feature binding (or fusion): for a ..."
Abstract

Cited by 193 (39 self)
 Add to MetaCart
This article proposes a general theory and methodology, called the minimax entropy principle, for building statistical models for images (or signals) in a variety of applications. This principle consists of two parts. The first is the maximum entropy principle for feature binding (or fusion): for a certain set of feature statistics, a distribution can be built to bind these feature statistics together by maximizing the entropy over all distributions that reproduce these feature statistics. The second part is the minimum entropy principle for feature selection: among all plausible sets of feature statistics, we choose the set whose maximum entropy distribution has the minimum entropy. Computational and inferential issues in both parts are addressed, in particular, a feature pursuit procedure is proposed for approximately selecting the optimal set of features. The model complexity is restricted because of the sample variation in the observed feature statistics. The minimax entropy principle is applied to texture modeling, where a novel Markov random field (MRF) model, called FRAME (Filter, Random field, And Minimax Entropy), is derived, and encouraging results are obtained in experiments on a variety of texture images. Relationship between our theory and the mechanisms of neural computation is also discussed.
Infinite Latent Feature Models and the Indian Buffet Process
, 2005
"... We define a probability distribution over equivalence classes of binary matrices with a finite number of rows and an unbounded number of columns. This distribution ..."
Abstract

Cited by 181 (38 self)
 Add to MetaCart
We define a probability distribution over equivalence classes of binary matrices with a finite number of rows and an unbounded number of columns. This distribution
Simplifying Surfaces with Color and Texture using Quadric Error Metrics
, 1998
"... There are a variety of application areas in which there is a need for simplifying complex polygonal surface models. These models often have material properties such as colors, textures, and surface normals. Our surface simplification algorithm, based on iterative edge contraction and quadric error m ..."
Abstract

Cited by 168 (2 self)
 Add to MetaCart
There are a variety of application areas in which there is a need for simplifying complex polygonal surface models. These models often have material properties such as colors, textures, and surface normals. Our surface simplification algorithm, based on iterative edge contraction and quadric error metrics, can rapidly produce high quality approximations of such models. We present a natural extension of our original error metric that can account for a wide range of vertex attributes. CR Categories: I.3.5 [Computer Graphics]: Computational Geometry and Object Modelingsurface and object representations Keywords: surface simplification, multiresolution modeling, level of detail, quadric error metric, edge contraction, surface properties, discontinuity preservation 1 INTRODUCTION Many applications in computer graphics and visualization can benefit from automatic simplification of complex polygonal models. Such models are usually not only geometrically complex, but they may also have ...
Unsupervised Learning of Image Manifolds by Semidefinite Programming
, 2004
"... Can we detect low dimensional structure in high dimensional data sets of images and video? The problem of dimensionality reduction arises often in computer vision and pattern recognition. In this paper, we propose a new solution to this problem based on semidefinite programming. Our algorithm can be ..."
Abstract

Cited by 162 (9 self)
 Add to MetaCart
Can we detect low dimensional structure in high dimensional data sets of images and video? The problem of dimensionality reduction arises often in computer vision and pattern recognition. In this paper, we propose a new solution to this problem based on semidefinite programming. Our algorithm can be used to analyze high dimensional data that lies on or near a low dimensional manifold. It overcomes certain limitations of previous work in manifold learning, such as Isomap and locally linear embedding. We illustrate the algorithm on easily visualized examples of curves and surfaces, as well as on actual images of faces, handwritten digits, and solid objects.
Efficient Simplification of PointSampled Surfaces
, 2002
"... In this paper we introduce, analyze and quantitatively compare a number of surface simplification methods for pointsampled geometry. We have implemented incremental and hierarchical clustering, iterative simplification, and particle simulation algorithms to create approximations of pointbased mode ..."
Abstract

Cited by 149 (15 self)
 Add to MetaCart
In this paper we introduce, analyze and quantitatively compare a number of surface simplification methods for pointsampled geometry. We have implemented incremental and hierarchical clustering, iterative simplification, and particle simulation algorithms to create approximations of pointbased models with lower sampling density. All these methods work directly on the point cloud, requiring no intermediate tesselation. We show how local variation estimation and quadric error metrics can be employed to diminish the approximation error and concentrate more samples in regions of high curvature. To compare the quality of the simplified surfaces, we have designed a new method for computing numerical and visual error estimates for pointsampled surfaces. Our algorithms are fast, easy to implement, and create highquality surface approximations, clearly demonstrating the effectiveness of pointbased surface simplification.
Dimensions of Meaning
, 1992
"... The representation of documents and queries as vectors in a highdimensional space is wellestablished in information retrieval [1]. This paper proposes to represent the semantics of words and contexts in a text as vectors. The dimensions of the space are words and the initial vectors are determined ..."
Abstract

Cited by 143 (5 self)
 Add to MetaCart
The representation of documents and queries as vectors in a highdimensional space is wellestablished in information retrieval [1]. This paper proposes to represent the semantics of words and contexts in a text as vectors. The dimensions of the space are words and the initial vectors are determined by the words occurring close to the entity to be represented which implies that the space has several thousand dimensions (words). This makes the vector representations (which are dense) too cumbersome to use directly. Therefore, dimensionality reduction by means of a singular value decomposition is employed. The paper analyzes the structure of the vector representations and applies them to word sense disambiguation and thesaurus induction.
A DataDriven Reflectance Model
 ACM TRANSACTIONS ON GRAPHICS
, 2003
"... We present a generative model for isotropic bidirectional reflectance distribution functions (BRDFs) based on acquired reflectance data. Instead of using analytical reflectance models, we represent each BRDF as a dense set of measurements. This allows us to interpolate and extrapolate in the space o ..."
Abstract

Cited by 143 (6 self)
 Add to MetaCart
We present a generative model for isotropic bidirectional reflectance distribution functions (BRDFs) based on acquired reflectance data. Instead of using analytical reflectance models, we represent each BRDF as a dense set of measurements. This allows us to interpolate and extrapolate in the space of acquired BRDFs to create new BRDFs. We treat each acquired BRDF as a single highdimensional vector taken from a space of all possible BRDFs. We apply both linear (subspace) and nonlinear (manifold) dimensionality reduction tools in an effort to discover a lowerdimensional representation that characterizes our measurements. We let users define perceptually meaningful parametrization directions to navigate in the reduceddimension BRDF space. On the lowdimensional manifold, movement along these directions produces novel but valid BRDFs.