Results 1 - 10 of 21
A Survey of Dimension Reduction Techniques
, 2002
"... this paper, we assume that we have n observations, each being a realization of the p dimensional random variable x = (x 1 , . . . , x p ) with mean E(x) = = ( 1 , . . . , p ) and covariance matrix E{(x )(x = # pp . We denote such an observation matrix by X = i,j : 1 p, 1 ..."
Abstract

Cited by 87 (0 self)
 Add to MetaCart
In this paper, we assume that we have $n$ observations, each being a realization of the $p$-dimensional random variable $x = (x_1, \dots, x_p)$ with mean $E(x) = \mu = (\mu_1, \dots, \mu_p)$ and covariance matrix $E\{(x-\mu)(x-\mu)^T\} = \Sigma_{p \times p}$. We denote such an observation matrix by $X = \{x_{i,j} : 1 \le i \le p,\ 1 \le j \le n\}$. If $\mu_i$ and $\sigma_i = \sqrt{\Sigma_{(i,i)}}$ denote the mean and the standard deviation of the $i$th random variable, respectively, then we will often standardize the observations $x_{i,j}$ by $(x_{i,j} - \mu_i)/\sigma_i$, where $\mu_i = \bar{x}_i = \frac{1}{n}\sum_{j=1}^{n} x_{i,j}$ and $\sigma_i = \sqrt{\frac{1}{n}\sum_{j=1}^{n} (x_{i,j} - \bar{x}_i)^2}$.
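The standardization the survey defines can be sketched in a few lines of NumPy. This is only an illustration of the formula above, not code from the survey; note that it stores observations as rows (the survey's matrix is transposed), and that NumPy's default standard deviation is the 1/n (population) form used in the text.

```python
import numpy as np

def standardize(X):
    """Column-wise z-scoring: (x_ij - mean_i) / sigma_i.

    X is an (n, p) observation matrix with one observation per row.
    np.std's default ddof=0 gives the 1/n population form from the text.
    """
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
Z = standardize(X)
# Each standardized column now has mean 0 and standard deviation 1
# (up to floating-point error).
```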
Concept-Oriented Indexing of Video Databases: Toward Semantic-Sensitive Retrieval and Browsing
 IEEE Trans. on Image Processing
, 2004
"... Digital video now plays an important role in medical education, health care, telemedicine and other medical applications. Several contentbased video retrieval (CBVR) systems have been proposed in the past, but they still suffer from the following challenging problems: semantic gap, semantic video ..."
Abstract

Cited by 24 (5 self)
 Add to MetaCart
Digital video now plays an important role in medical education, health care, telemedicine, and other medical applications. Several content-based video retrieval (CBVR) systems have been proposed in the past, but they still suffer from the following challenging problems: semantic gap, semantic video concept modeling, semantic video classification, and concept-oriented video database indexing and access. In this paper, we propose a novel framework to make some advances toward the final goal of solving these problems. Specifically, the framework includes: 1) a semantic-sensitive video content representation framework that uses principal video shots to enhance the quality of features; 2) semantic video concept interpretation using a flexible mixture model to bridge the semantic gap; 3) a novel semantic video-classifier training framework that integrates feature selection, parameter estimation, and model selection seamlessly in a single algorithm; and 4) a concept-oriented video database organization technique based on a domain-dependent concept hierarchy to enable semantic-sensitive video retrieval and browsing.
Learning nonlinear image manifolds by global alignment of local linear models
 IEEE Trans. Pattern Analysis and Machine Intelligence
"... Abstract—Appearancebased methods, based on statistical models of the pixel values in an image (region) rather than geometrical object models, are increasingly popular in computer vision. In many applications, the number of degrees of freedom (DOF) in the image generating process is much lower than ..."
Abstract

Cited by 12 (0 self)
 Add to MetaCart
Abstract—Appearance-based methods, based on statistical models of the pixel values in an image (region) rather than geometrical object models, are increasingly popular in computer vision. In many applications, the number of degrees of freedom (DOF) in the image-generating process is much lower than the number of pixels in the image. If there is a smooth function that maps the DOF to the pixel values, then the images are confined to a low-dimensional manifold embedded in the image space. We propose a method based on probabilistic mixtures of factor analyzers to 1) model the density of images sampled from such manifolds and 2) recover global parameterizations of the manifold. A globally nonlinear probabilistic two-way mapping between coordinates on the manifold and images is obtained by combining several locally valid linear mappings. We propose a parameter estimation scheme that improves upon an existing scheme and experimentally compare the presented approach to self-organizing maps, generative topographic mapping, and mixtures of factor analyzers. In addition, we show that the approach also applies to finding mappings between different embeddings of the same manifold. Index Terms—Feature extraction or construction, machine learning, statistical image representation.
Exploration of dimensionality reduction for text visualization
 In Proc. IEEE Third Intl. Conf. on Coordinated and Multiple Views in Exploratory Visualization
, 2005
"... In the text document visualization community, statistical analysis tools (e.g., principal component analysis and multidimensional scaling) and neurocomputation models (e.g., selforganizing feature maps) have been widely used for dimensionality reduction. Often the resulting dimensionality is set to ..."
Abstract

Cited by 10 (1 self)
 Add to MetaCart
In the text document visualization community, statistical analysis tools (e.g., principal component analysis and multidimensional scaling) and neurocomputation models (e.g., self-organizing feature maps) have been widely used for dimensionality reduction. Often the resulting dimensionality is set to two, as this facilitates plotting the results. The validity and effectiveness of these approaches largely depend on the specific data sets used and the semantics of the targeted applications. To date, there has been little evaluation to assess and compare dimensionality reduction methods and dimensionality reduction processes, either numerically or empirically. The focus of this paper is to propose a mechanism for comparing and evaluating the effectiveness of dimensionality reduction techniques in the visual exploration of text document archives. We use multivariate visualization techniques and interactive visual exploration to study three problems: (a) Which dimensionality reduction technique best preserves the interrelationships within a set of text documents; (b) What is the sensitivity of the results to the number of output dimensions; (c) Can we automatically remove redundant or unimportant words from the vectors extracted from the documents while still preserving the majority of the information, and thus make dimensionality reduction more efficient. To study each problem, we generate supplemental dimensions based on several dimensionality reduction algorithms and parameters controlling these algorithms. We then visually analyze and explore the characteristics of the reduced dimensional spaces as implemented within a linked, multi-view multidimensional visual exploration tool, XmdvTool. We compare the derived dimensions to features known to be present in the original data. Quantitative measures are also used in identifying the quality of results using different numbers of output dimensions.
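One simple quantitative measure of how well a reduction "preserves the interrelationships" within a document set is the correlation between pairwise distances before and after reduction. The sketch below is a generic proxy of that idea, not the paper's XmdvTool-based evaluation; it uses PCA via SVD as a stand-in for the reduction methods the paper compares, and all names are illustrative.

```python
import numpy as np

def pairwise_distances(X):
    """All pairwise Euclidean distances between rows of X (upper triangle)."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    iu = np.triu_indices(len(X), k=1)
    return D[iu]

def distance_preservation(X_high, X_low):
    """Pearson correlation between high- and low-dimensional pairwise
    distances; values near 1.0 mean interrelationships are well preserved."""
    return np.corrcoef(pairwise_distances(X_high),
                       pairwise_distances(X_low))[0, 1]

# PCA to 2 output dimensions via the reduced SVD of the centered data.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 10))          # stand-in for document feature vectors
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
X2 = Xc @ Vt[:2].T                     # 2-D embedding, as used for plotting
score = distance_preservation(Xc, X2)  # in [-1, 1]; higher is better
```

Repeating this with different numbers of output dimensions gives a crude version of the paper's sensitivity question (b).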
On the Efficiency of Nearest Neighbor Searching with Data Clustered in Lower Dimensions
, 2001
"... In nearest neighbor searching we are given a set of n data points in real ddimensional space, R d , and the problem is to preprocess these points into a data structure, so that given a query point, the nearest data point to the query point can be reported eciently. Because data sets can be quit ..."
Abstract

Cited by 6 (0 self)
 Add to MetaCart
In nearest neighbor searching we are given a set of n data points in real d-dimensional space, R^d, and the problem is to preprocess these points into a data structure, so that given a query point, the nearest data point to the query point can be reported efficiently. Because data sets can be quite large, we are interested in data structures that use optimal O(dn) storage. Given the limitation of linear storage, the best known data structures suffer from expected-case query times that grow exponentially in d. However, it is widely regarded in practice that data sets in high-dimensional spaces tend to consist of clusters residing in much lower-dimensional subspaces. This raises the question of whether data structures for nearest neighbor searching adapt to the presence of lower-dimensional clustering, and further how performance varies when the clusters are aligned with the coordinate axes. We analyze the popular kd-tree data structure in the form of two variants based on a modification of the splitting method, which produces cells satisfying the basic packing properties needed for efficiency without producing empty cells. We show that when data points are uniformly distributed on a k-dimensional hyperplane for k ≤ d, the expected number of leaves visited in such a kd-tree grows exponentially in k, but not in d. We show that the growth rate is smaller still if the hyperplane is aligned with the coordinate axes. We present empirical studies to support our theoretical results. Keywords: nearest neighbor searching, kd-trees, splitting methods, expected-case analysis, clustering.
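The setting the paper analyzes can be illustrated with a minimal kd-tree. This sketch uses the standard median split with cycling axes, not the paper's modified splitting rule (which the abstract only outlines), and the data are generated on a k = 2-dimensional plane inside d = 8 dimensions to mirror the clustered-in-a-subspace assumption; all names are illustrative.

```python
import numpy as np

def build_kdtree(points, idx=None, depth=0):
    """Standard median-split kd-tree, cycling through the axes."""
    if idx is None:
        idx = np.arange(len(points))
    if len(idx) == 0:
        return None
    axis = depth % points.shape[1]
    order = idx[np.argsort(points[idx, axis])]
    mid = len(order) // 2
    return {"point": order[mid], "axis": axis,
            "left":  build_kdtree(points, order[:mid], depth + 1),
            "right": build_kdtree(points, order[mid + 1:], depth + 1)}

def nearest(node, points, q, best=None):
    """Recursive NN query with the usual prune-by-splitting-plane test."""
    if node is None:
        return best
    p = points[node["point"]]
    d = np.linalg.norm(p - q)
    if best is None or d < best[0]:
        best = (d, node["point"])
    diff = q[node["axis"]] - p[node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, points, q, best)
    if abs(diff) < best[0]:  # the far side may still hold a closer point
        best = nearest(far, points, q, best)
    return best

# Points clustered on a 2-D plane embedded in 8-D space.
rng = np.random.default_rng(2)
coords = rng.normal(size=(200, 2))
basis = rng.normal(size=(2, 8))
pts = coords @ basis
tree = build_kdtree(pts)
dist, i = nearest(tree, pts, pts[17])
# Querying with a data point returns that point itself, at distance 0.
```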
Unsupervised Classification of High-Dimensional Data by Means of Self-Organizing Neural Networks
, 1998
"... Contents Introduction 1 1 Unsupervised classification of highdimensional data 4 1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.2 High dimensional data . . . . . . . . . . . . . . . . . . . . . . 4 1.2.1 Properties of high dimensional spaces . . . . . . . . . 4 1.2.2 I ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
Contents
Introduction 1
1 Unsupervised classification of high-dimensional data 4
1.1 Introduction 4
1.2 High-dimensional data 4
1.2.1 Properties of high-dimensional spaces 4
1.2.2 Intrinsic dimension 7
1.3 Unsupervised classification 11
1.3.1 Definition 12
1.3.2 Dimension reduction 12
1.3.3 Available techniques 15
1.4 Application: the Philips project 15
1.4.1 Presentation 16
1.4.2 Objectives 18
1.4.3 Data preprocessing 19
1.4.4 Intrinsic dimension ...
An Intuitive Visualization of the Pareto Frontier for Multi-Objective Optimization in n-Dimensional Performance Space
"... A visualization methodology is presented in which a Pareto Frontier can be visualized in an intuitive and straightforward manner for an ndimensional performance space. Based on this visualization, it is possible to quickly identify ‘good ’ regions of the performance and optimal design spaces for a ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
A visualization methodology is presented in which a Pareto frontier can be visualized in an intuitive and straightforward manner for an n-dimensional performance space. Based on this visualization, it is possible to quickly identify 'good' regions of the performance and optimal design spaces for a multi-objective optimization application, regardless of space complexity. Visualizing Pareto solutions for more than three objectives has long been a significant challenge to the multi-objective optimization community. The Hyperspace Diagonal Counting (HSDC) method described here enables lossless visualization to be implemented. The proposed method requires no dimension fixing. In this paper, we demonstrate the usefulness of visualizing nf space (i.e., for more than three objective functions in a multi-objective optimization problem). The visualization is shown to aid in the final decision of which potential optimal design point should be chosen among all possible Pareto solutions.
Principal Manifold Learning by Sparse Grids
, 2008
"... In this paper we deal with the construction of lowerdimensional manifolds from highdimensional data which is an important task in data mining, machine learning and statistics. Here, we consider principal manifolds as the minimum of a regularized, nonlinear empirical quantization error functional. ..."
Abstract

Cited by 2 (2 self)
 Add to MetaCart
In this paper we deal with the construction of lower-dimensional manifolds from high-dimensional data, which is an important task in data mining, machine learning, and statistics. Here, we consider principal manifolds as the minimum of a regularized, nonlinear empirical quantization error functional. For the discretization we use a sparse grid method in the latent parameter space. This approach avoids, to some extent, the curse of dimensionality of conventional grids such as those in the GTM approach. The arising nonlinear problem is solved by a descent method which resembles the expectation-maximization algorithm. We present our sparse grid principal manifold approach, discuss its properties, and report on the results of numerical experiments for one-, two-, and three-dimensional model problems.
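The data-fidelity part of the functional the abstract describes, the empirical quantization error, is easy to compute for a discretized manifold. The sketch below shows only that term (no smoothness regularizer, no sparse latent grid, no descent method) on a noisy circle, with all names illustrative rather than taken from the paper.

```python
import numpy as np

def quantization_error(X, M):
    """Empirical quantization error: mean squared distance from each data
    point to its nearest point on the discretized manifold M (rows of M are
    grid images in data space). The paper minimizes this plus a regularizer."""
    d2 = ((X[:, None, :] - M[None, :, :]) ** 2).sum(axis=-1)
    return d2.min(axis=1).mean()

rng = np.random.default_rng(4)
t = rng.uniform(0, 2 * np.pi, size=200)
X = np.c_[np.cos(t), np.sin(t)] + 0.05 * rng.normal(size=(200, 2))  # noisy circle

grid = np.linspace(0, 2 * np.pi, 32, endpoint=False)
M_good = np.c_[np.cos(grid), np.sin(grid)]   # 1-D manifold matching the data
M_bad = np.zeros((1, 2))                     # a degenerate 0-D "manifold"

err_good = quantization_error(X, M_good)     # small: the circle fits well
err_bad = quantization_error(X, M_bad)       # large: all mass at the origin
```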
PCA in Autocorrelation Space
 in International Conference on Pattern Recognition
, 2002
"... The use of higher order autocorrelations as features for pattern classification has been usually restricted to second or third orders due to high computational costs. Since the autocorrelation space is a high dimensional space we are interested in reducing the dimensionality of feature vectors for t ..."
Abstract

Cited by 1 (1 self)
 Add to MetaCart
The use of higher-order autocorrelations as features for pattern classification has usually been restricted to second or third orders due to high computational costs. Since the autocorrelation space is a high-dimensional space, we are interested in reducing the dimensionality of the feature vectors for the benefit of the pattern classification task.
Genetic Algorithms for Artificial Neural Net-based Condition Monitoring System Design for Rotating Mechanical Systems
 Journal of Applied Soft Computing, Elsevier, Submitted
"... Abstract. We present the results of our investigation into the use of Genetic Algorithms (GA) for identifying near optimal design parameters of Diagnostic Systems that are based on Artificial Neural Networks (ANNs) for condition monitoring of mechanical systems. ANNs have been widely used for health ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
Abstract. We present the results of our investigation into the use of Genetic Algorithms (GAs) for identifying near-optimal design parameters of diagnostic systems that are based on Artificial Neural Networks (ANNs) for condition monitoring of mechanical systems. ANNs have been widely used for health diagnosis of mechanical bearings using features extracted from vibration and acoustic emission signals. However, different sensors and the corresponding features exhibit varied responses to different faults. Moreover, a number of different features can be used as inputs to a classifier ANN. Identification of the most useful features is important for efficient classification; using all features from all channels leads to very high computational cost and is, consequently, not desirable. Furthermore, determining the ANN structure is a fundamental design issue that can be critical for the classification performance. We show that a GA can be used to select a smaller subset of features that together form a genetically fit family for successful fault identification and classification tasks. At the same time, an appropriate structure of the ANN, in terms of the number of nodes in the hidden layer, can be determined, resulting in improved performance.
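The GA-based feature selection the abstract describes can be sketched with a bitmask chromosome over candidate features. This toy version is not the authors' system: the fitness here is a cheap class-separability score minus a per-feature cost (a stand-in for training and evaluating the classifier ANN), the data are synthetic, and all names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy data: two classes in 12 features; only the first 4 are informative
# (a stand-in for multi-channel vibration / acoustic-emission features).
n, d = 80, 12
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, d))
X[y == 1, :4] += 2.0

def fitness(mask):
    """Separation of class means on the selected features, minus a cost per
    feature. A real system would train/evaluate the classifier ANN here."""
    if not mask.any():
        return -np.inf
    Xs = X[:, mask]
    sep = np.linalg.norm(Xs[y == 0].mean(axis=0) - Xs[y == 1].mean(axis=0))
    return sep - 0.1 * mask.sum()

def evolve(pop_size=30, generations=40, p_mut=0.05):
    pop = rng.random((pop_size, d)) < 0.5          # random boolean masks
    for _ in range(generations):
        scores = np.array([fitness(m) for m in pop])
        order = np.argsort(scores)[::-1]
        parents = pop[order[: pop_size // 2]]      # truncation selection
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, d)               # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(d) < p_mut         # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, children])       # elitist survivors
    scores = np.array([fitness(m) for m in pop])
    return pop[int(np.argmax(scores))]

best = evolve()
# `best` is a boolean mask over the 12 candidate features; the same encoding
# could carry extra genes, e.g. the number of hidden-layer nodes.
```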