### Citations

4166 | Regression shrinkage and selection via the lasso
- Tibshirani
- 1996
Citation Context: ...Fisher score, information gain, and relief [16]. Supervised wrapper methods select the most relevant features by optimizing the performance of the learning algorithm. The typical methods include Lasso [17], LARs [18], SVM-RFE [19], and RFS [20]. Due to the absence of class labels, unsupervised feature selection is a much harder problem. Unsupervised filter methods usually select features to best preserv...
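The lasso's feature-selection effect comes from its ℓ1 penalty, whose proximal operator is soft-thresholding. A minimal illustrative sketch (not code from the cited paper; the function name is ours):

```python
import numpy as np

def soft_threshold(a, lam):
    """Soft-thresholding: the exact minimizer of 0.5*(w - a)^2 + lam*|w|.

    This operator is what makes the lasso zero out small coefficients
    while shrinking large ones toward zero.
    """
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

# Entries with magnitude <= lam are set exactly to zero; the rest shrink by lam.
coeffs = soft_threshold(np.array([-2.0, -0.3, 0.1, 1.5]), 0.5)
print(coeffs)  # zeros for -0.3 and 0.1; the others shrink to -1.5 and 1.0
```

This sparsity-inducing behavior is why the lasso doubles as a supervised wrapper for feature selection: features whose coefficients are driven to zero are discarded.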

2443 | A global geometric framework for nonlinear dimensionality reduction
- TENENBAUM, SILVA, et al.
- 2000
Citation Context: ...of the clusters [23]. Recently, in order to characterize the underlying manifold structure, many manifold learning algorithms have been proposed, such as Local Linear Embedding (LLE) [24] and ISOMAP [25]. Many unsupervised feature selection algorithms [3], [6], [10] use various graphs to capture the manifold structure. However, most existing works construct graphs to approximate the manifold structur...
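Both manifold learners named in the context are available in scikit-learn; a minimal sketch on the classic Swiss-roll surface (assumes scikit-learn is installed; the neighborhood size of 10 is an arbitrary illustrative choice):

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap, LocallyLinearEmbedding

# 500 points sampled from a 2-D manifold embedded in 3-D space.
X, _ = make_swiss_roll(n_samples=500, random_state=0)

# Both methods unroll the manifold into 2 dimensions, but differ in what
# they preserve: geodesic distances (Isomap) vs. local linear weights (LLE).
iso_emb = Isomap(n_neighbors=10, n_components=2).fit_transform(X)
lle_emb = LocallyLinearEmbedding(n_neighbors=10, n_components=2,
                                 random_state=0).fit_transform(X)
print(iso_emb.shape, lle_emb.shape)  # each embedding is (500, 2)
```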

2400 | Nonlinear dimensionality reduction by locally linear embedding
- Roweis, Saul
- 2000
Citation Context: ...ing the goodness of the clusters [23]. Recently, in order to characterize the underlying manifold structure, many manifold learning algorithms have been proposed, such as Local Linear Embedding (LLE) [24] and ISOMAP [25]. Many unsupervised feature selection algorithms [3], [6], [10] use various graphs to capture the manifold structure. However, most existing works construct graphs to approximate the m...

1699 | On spectral clustering: Analysis and an algorithm
- Ng, Jordan, et al.
Citation Context: ...d from nonlinear manifold of the ambient Euclidean space into correct clusters [14]. That is to say, the intrinsic manifold structure should be considered while measuring the goodness of the clusters [23]. Recently, in order to characterize the underlying manifold structure, many manifold learning algorithms have been proposed, such as Local Linear Embedding (LLE) [24] and ISOMAP [25]. Many unsupervis...

1106 | Gene selection for cancer classification using support vector machines
- Guyon, Weston, et al.
- 2000
Citation Context: ...gain, and relief [16]. Supervised wrapper methods select the most relevant features by optimizing the performance of the learning algorithm. The typical methods include Lasso [17], LARs [18], SVM-RFE [19], and RFS [20]. Due to the absence of class labels, unsupervised feature selection is a much harder problem. Unsupervised filter methods usually select features to best preserve the structure of the da...

713 | Matching Theory
- Lovasz, Plummer
- 1986
Citation Context: ...1 if x = y and equals 0 otherwise, and map(·) is the permutation mapping function that maps each cluster index to a true class label. The best mapping can be found by using the Kuhn-Munkres algorithm [32]. Greater clustering accuracy indicates better clustering performance. Normalized mutual information (NMI). Another evaluation metric that we adopt here is the normalized mutual information, which...
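The best-map accuracy described in this context can be sketched as follows. For a handful of clusters a brute-force search over label permutations is enough; at scale one would use the Kuhn-Munkres (Hungarian) algorithm itself, e.g. scipy.optimize.linear_sum_assignment. The function name below is ours:

```python
from itertools import permutations

def clustering_accuracy(true_labels, cluster_ids):
    """Fraction of points whose cluster index, under the best
    cluster-to-label mapping, matches the true class label.

    Brute force over permutations for illustration; the Kuhn-Munkres
    algorithm finds the same optimal mapping in polynomial time.
    """
    classes = sorted(set(true_labels) | set(cluster_ids))
    best = 0
    for perm in permutations(classes):
        mapping = dict(zip(classes, perm))
        hits = sum(mapping[c] == t
                   for c, t in zip(cluster_ids, true_labels))
        best = max(best, hits)
    return best / len(true_labels)

# Clusters indexed 1/0 instead of 0/1 still score perfectly after mapping:
print(clustering_accuracy([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2]))  # 1.0
```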

237 | Multi-task feature learning
- Argyriou, Evgeniou, et al.
- 2006
Citation Context: ...er structure in a non-uniform weighted feature space, which alleviates the side effects of the irrelevant or noisy features. Finally, LGDFS draws connections to multi-task feature learning, e.g., [29], [30]. In fact, the estimation of feature weights in Eq. (9) can be regarded as n + 1 (n local and 1 global) regression tasks with weighted ℓ2-norm regularization associated with the simplex constrain...

159 | Least angle regression, Annals of Statistics
- Efron, Hastie, et al.
- 2004
Citation Context: ...information gain, and relief [16]. Supervised wrapper methods select the most relevant features by optimizing the performance of the learning algorithm. The typical methods include Lasso [17], LARs [18], SVM-RFE [19], and RFS [20]. Due to the absence of class labels, unsupervised feature selection is a much harder problem. Unsupervised filter methods usually select features to best preserve the struc...

133 | Theoretical and empirical analysis of ReliefF and RReliefF
- Robnik-Sikonja, Kononenko
- 2003
Citation Context: ...r methods. Supervised filter methods usually evaluate feature importance by the correlation between feature and class label. The typical methods include Fisher score, information gain, and relief [16]. Supervised wrapper methods select the most relevant features by optimizing the performance of the learning algorithm. The typical methods include Lasso [17], LARs [18], SVM-RFE [19], and RFS [20]. D...

103 | Laplacian score for feature selection
- He, Cai, et al.
- 2005
Citation Context: ...sionality, and even provide significant insights into the nature of the problem [1], [2]. In recent years, many methods have been proposed to address the problem of unsupervised feature selection [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. These methods usually use the graph Laplacian to characterize the structure of high-dimensional data. The selection of features is performed according...
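As an example of a graph-Laplacian criterion of this kind, the Laplacian score from the cited paper favors features that vary smoothly over an affinity graph (lower score = better). A minimal sketch; the affinity construction (e.g. kNN with a heat kernel) is left to the caller:

```python
import numpy as np

def laplacian_score(X, W):
    """Laplacian score per feature: L_j = f'^T L f' / f'^T D f',
    where f' is feature j centered by its degree-weighted mean.

    X: (n, d) data matrix; W: (n, n) symmetric affinity matrix.
    """
    d = W.sum(axis=1)                    # node degrees
    L = np.diag(d) - W                   # unnormalized graph Laplacian
    scores = []
    for j in range(X.shape[1]):
        f = X[:, j]
        f = f - (f @ d) / d.sum()        # remove degree-weighted mean
        denom = f @ (d * f)              # f'^T D f'
        scores.append((f @ L @ f) / denom if denom > 0 else np.inf)
    return np.array(scores)

# Two graph clusters {0,1} and {2,3}; the first feature respects them,
# the second cuts across them, so it gets the larger (worse) score.
W = np.array([[0, 1, 0, 0], [1, 0, 0, 0],
              [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print(laplacian_score(X, W))  # first score is 0, second is larger
```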

80 | Spectral Feature Selection for Supervised and Unsupervised Learning
- Zhao, Liu
- 2007
Citation Context: ...lity, and even provide significant insights into the nature of the problem [1], [2]. In recent years, many methods have been proposed to address the problem of unsupervised feature selection [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. These methods usually use the graph Laplacian to characterize the structure of high-dimensional data. The selection of features is performed according to e...

70 | Efficient and robust feature selection via joint ℓ2,1-norms minimization
- Nie, Huang, et al.
- 2010
Citation Context: ...ure selection has become increasingly important since it can speed up the learning process, alleviate the curse of dimensionality, and even provide significant insights into the nature of the problem [1], [2]. In recent years, many methods have been proposed to address the problem of unsupervised feature selection [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. These methods usually use the...

57 | Feature selection for unsupervised and supervised inference: the emergence of sparsity in a weight-based approach
- Wolf, Shashua
- 2005
Citation Context: ...correlation among features is neglected [6], [10]. For unsupervised wrapper methods, clustering is a commonly used learning algorithm to measure the quality of features. The typical ones include Q-α [21], MCFS [6], MRSF [7], FSSL [8], UDFS [9], and JELSR [10]. These algorithms apply the following two steps separately [6], [7], [8] or jointly [9], [10]: 1) estimating the cluster structure via spectral...

54 | DIFFRAC: a discriminative and flexible framework for clustering
- Bach, Harchaoui
- 2008
Citation Context: ...ce the Discriminative Feature Selection (DFS) cost function, a weighted ℓ2-norm regularized linear regression model that attaches a weight to each feature under the discriminative clustering framework [13], [14], [15], to evaluate the relevance of features. Our goal is to select those features that best respect the most linearly separable clusters. Moreover, due to the non-linearly separable nature of...

39 | Multi-task feature learning via efficient ℓ2,1-norm minimization
- Liu, Ji, et al.
- 2009
Citation Context: ...ucture in a non-uniform weighted feature space, which alleviates the side effects of the irrelevant or noisy features. Finally, LGDFS draws connections to multi-task feature learning, e.g., [29], [30]. In fact, the estimation of feature weights in Eq. (9) can be regarded as n + 1 (n local and 1 global) regression tasks with weighted ℓ2-norm regularization associated with the simplex constraint. In...

37 | A local learning approach for clustering.
- Wu, Scholkopf
- 2006
Citation Context: ...matrix with its diagonal elements being z. Due to the combinatorial nature of the cluster indicator matrix P and the feature indicator variable z, the optimization problem in Eq. (1) is NP-hard. As in [22], instead of computing the partition matrix P directly, we first substitute it with a scaled partition matrix Y = P(P^T P)^{-1/2}. It is easy to verify that Y^T Y = I. We also relax the integer constrai...
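The scaled partition matrix used in this relaxation is easy to check numerically. A small sketch, assuming a hard partition encoded as one-hot rows (numpy assumed):

```python
import numpy as np

# Hard partition of 5 points into 2 clusters: each row of P is one-hot.
P = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [0, 1],
              [0, 1]], dtype=float)

# P^T P is diagonal with the cluster sizes, so Y = P (P^T P)^(-1/2)
# just rescales each column of P by 1/sqrt(cluster size).
cluster_sizes = P.sum(axis=0)
Y = P / np.sqrt(cluster_sizes)

# The relaxation's orthogonality property Y^T Y = I holds exactly.
print(np.allclose(Y.T @ Y, np.eye(2)))  # True
```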

36 | Unsupervised feature selection for multi-cluster data
- Cai, Zhang, et al.
- 2010
Citation Context: ...cted features on the UMIST and ORL data sets: performance, in terms of accuracy and normalized mutual information, versus the number of selected features. In order to obtain stable results, similar to [6], we also report the clustering performance on subsets of UMIST and ORL with 5 clusters. For each data set, 20 tests are conducted on different randomly selected clusters, and the average performa...

29 | Efficient spectral feature selection with minimum redundancy
- Zhao, Wang, et al.
- 2010
Citation Context: ...provide significant insights into the nature of the problem [1], [2]. In recent years, many methods have been proposed to address the problem of unsupervised feature selection [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. These methods usually use the graph Laplacian to characterize the structure of high-dimensional data. The selection of features is performed according to either some spec...

22 | Image Clustering using Local Discriminant Models and Global Integration
- Yang, Xu, et al.
- 2010
Citation Context: ...Discriminative Feature Selection (DFS) cost function, a weighted ℓ2-norm regularized linear regression model that attaches a weight to each feature under the discriminative clustering framework [13], [14], [15], to evaluate the relevance of features. Our goal is to select those features that best respect the most linearly separable clusters. Moreover, due to the non-linearly separable nature of many...

13 | A variance minimization criterion to feature selection using Laplacian regularization
- He, Ji, et al.
Citation Context: ...a local optimum, we fix the neighborhood structure determined by the estimated z in the previous iteration. Due to the multiple matrix inversion operations, the sequential or the convex SDP solvers [15], [26] designed for a single matrix inversion cannot be easily adapted. In this paper, we resort to the equivalent multi-task regression formulation in Eq. (9). When Y is given, the optimal value of {W_i}_{i=1}^n...
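When Y is fixed, a weighted ℓ2-regularized regression subproblem of this general form has a closed-form solution via a single linear solve. The excerpt does not reproduce Eq. (9), so the objective and names below are an illustrative stand-in, not the paper's exact formulation:

```python
import numpy as np

def weighted_ridge(X, Y, z, alpha=1.0):
    """Minimize ||Y - X W||_F^2 + alpha * tr(W^T diag(z)^{-1} W).

    z holds the per-feature weights: a large z_j penalizes feature j
    weakly, a small z_j drives its coefficients toward zero.
    Setting the gradient to zero gives the normal equations
        (X^T X + alpha * diag(1/z)) W = X^T Y.
    """
    A = X.T @ X + alpha * np.diag(1.0 / z)
    return np.linalg.solve(A, X.T @ Y)

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
Y = rng.standard_normal((20, 3))
z = np.full(5, 0.2)                  # uniform weights on the simplex
W = weighted_ridge(X, Y, z, alpha=0.5)
print(W.shape)  # one coefficient per (feature, output) pair: (5, 3)
```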

12 | Local-learning-based feature selection for high-dimensional data analysis
- Sun, Todorovic, et al.
- 2010
Citation Context: ...election has become increasingly important since it can speed up the learning process, alleviate the curse of dimensionality, and even provide significant insights into the nature of the problem [1], [2]. In recent years, many methods have been proposed to address the problem of unsupervised feature selection [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. These methods usually use the graph...

12 | Clustering with local and global regularization
- Wang, Zhang, et al.
- 2007
Citation Context: ...ights for features. Another set of related approaches is spectral clustering, such as local learning based clustering, e.g. [14], [22], and clustering with local and global mixed Laplacians, e.g. [27], [28]. Obviously, these methods also construct various Laplacian matrices on a uniform feature space. Our approach estimates the cluster structure in a non-uniform weighted feature space, which allevia...

12 | Spectral embedded clustering
- Nie, Xu, et al.
- 2009
Citation Context: ...for features. Another set of related approaches is spectral clustering, such as local learning based clustering, e.g. [14], [22], and clustering with local and global mixed Laplacians, e.g. [27], [28]. Obviously, these methods also construct various Laplacian matrices on a uniform feature space. Our approach estimates the cluster structure in a non-uniform weighted feature space, which alleviates th...

11 | Joint feature selection and subspace learning
- Gu, Li, et al.
- 2011
Citation Context: ...de significant insights into the nature of the problem [1], [2]. In recent years, many methods have been proposed to address the problem of unsupervised feature selection [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. These methods usually use the graph Laplacian to characterize the structure of high-dimensional data. The selection of features is performed according to either some specified...

9 | Feature selection via joint embedding learning and sparse regression
- Hou, Nie, et al.
- 2011
Citation Context: ...cant insights into the nature of the problem [1], [2]. In recent years, many methods have been proposed to address the problem of unsupervised feature selection [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. These methods usually use the graph Laplacian to characterize the structure of high-dimensional data. The selection of features is performed according to either some specified criterion...

6 | Eigenvalue sensitive feature selection
- Jiang, Ren
Citation Context: ...and even provide significant insights into the nature of the problem [1], [2]. In recent years, many methods have been proposed to address the problem of unsupervised feature selection [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. These methods usually use the graph Laplacian to characterize the structure of high-dimensional data. The selection of features is performed according to either...

6 | ℓ2,1-norm regularized discriminative feature selection for unsupervised learning
- Yang, Shen, et al.
- 2011
Citation Context: ...gnificant insights into the nature of the problem [1], [2]. In recent years, many methods have been proposed to address the problem of unsupervised feature selection [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. These methods usually use the graph Laplacian to characterize the structure of high-dimensional data. The selection of features is performed according to either some specified crit...

6 | Robust unsupervised feature selection
- Qian, Zhai
- 2013
Citation Context: ...s into the nature of the problem [1], [2]. In recent years, many methods have been proposed to address the problem of unsupervised feature selection [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. These methods usually use the graph Laplacian to characterize the structure of high-dimensional data. The selection of features is performed according to either some specified criterion or sparse sp...

6 | Discriminative codeword selection for image representation
- Zhang, Chen, et al.
Citation Context: ...iminative Feature Selection (DFS) cost function, a weighted ℓ2-norm regularized linear regression model that attaches a weight to each feature under the discriminative clustering framework [13], [14], [15], to evaluate the relevance of features. Our goal is to select those features that best respect the most linearly separable clusters. Moreover, due to the non-linearly separable nature of many high-d...

5 | Feature selection and kernel learning for local learning-based clustering
- Zeng, Cheung
Citation Context: ...tr(W^T diag(z)^{-1} W) = Σ_{j=1}^d ‖W̄_j‖²/z_j ≥ (Σ_{j=1}^d ‖W̄_j‖)² (15), where ‖W̄_j‖ = √(Σ_{i=1}^n λ_i Σ_{c'=1}^c (W_i)²_{jc'} + αλ Σ_{c'=1}^c W²_{jc'}), with Σ_j z_j = 1 and z_j ≥ 0. Proof: This is a corollary of Theorem 1 in [31]. V. EXPERIMENTAL RESULTS. In this section, we evaluate the performance of our proposed algorithm LGDFS. Following [6], [10], we perform K-means clustering using the selected features and compare the...
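The lower bound quoted in this context follows from the Cauchy–Schwarz inequality under the simplex constraint on z; a one-line sketch:

```latex
% With \sum_{j} z_j = 1,\; z_j \ge 0, Cauchy--Schwarz gives
\Bigl( \sum_{j=1}^{d} \|\bar{W}_j\| \Bigr)^{2}
  = \Bigl( \sum_{j=1}^{d} \frac{\|\bar{W}_j\|}{\sqrt{z_j}}\,\sqrt{z_j} \Bigr)^{2}
  \le \Bigl( \sum_{j=1}^{d} \frac{\|\bar{W}_j\|^{2}}{z_j} \Bigr)
      \Bigl( \sum_{j=1}^{d} z_j \Bigr)
  = \operatorname{tr}\bigl( W^{\top}\operatorname{diag}(z)^{-1} W \bigr).
```

Equality holds when z_j is proportional to ‖W̄_j‖, which is why the optimal feature weights concentrate on features with large regression coefficients.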

2 | Joint clustering and feature selection, in Web-Age Information Management
- Du, Shen
- 2013