Comparison of discrimination methods for the classification of tumors using gene expression data
Journal of the American Statistical Association, 2002
Cited by 756 (6 self)
A reliable and precise classification of tumors is essential for successful diagnosis and treatment of cancer. cDNA microarrays and high-density oligonucleotide chips are novel biotechnologies increasingly used in cancer research. By allowing the monitoring of expression levels in cells for thousands of genes simultaneously, microarray experiments may lead to a more complete understanding of the molecular variations among tumors and hence to a finer and more informative classification. The ability to successfully distinguish between tumor classes (already known or yet to be discovered) using gene expression data is an important aspect of this novel approach to cancer classification. This article compares the performance of different discrimination methods for the classification of tumors based on gene expression data. The methods include nearest-neighbor classifiers, linear discriminant analysis, and classification trees. Recent machine learning approaches, such as bagging and boosting, are also considered. The discrimination methods are applied to datasets from three recently published cancer gene expression studies.
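As a rough illustration of two of the compared families (not the article's implementations or datasets), the sketch below contrasts a 1-nearest-neighbor rule with diagonal linear discriminant analysis on invented "expression" data; all shapes and separations are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class "expression" data; shapes and means are invented and are
# not the datasets used in the article.
n, p = 20, 50
X = np.vstack([rng.normal(0.0, 1.0, (n, p)),   # class 0
               rng.normal(1.0, 1.0, (n, p))])  # class 1
y = np.array([0] * n + [1] * n)

def nearest_neighbor(X_train, y_train, x):
    """1-nearest-neighbor rule: label of the closest training sample."""
    return y_train[np.argmin(np.linalg.norm(X_train - x, axis=1))]

def dlda(X_train, y_train, x):
    """Diagonal LDA: Gaussian classes sharing a diagonal covariance."""
    var = X_train.var(axis=0)
    scores = [-np.sum((x - X_train[y_train == c].mean(axis=0)) ** 2 / var)
              for c in (0, 1)]
    return int(np.argmax(scores))
```

Both rules take a training set and a new observation and return a class label; the article's comparison additionally covers trees, bagging, and boosting, which are omitted here.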
Variable Selection for Model-Based Clustering
Journal of the American Statistical Association, 2006
Cited by 96 (7 self)
We consider the problem of variable or feature selection for model-based clustering. We recast the problem of comparing two nested subsets of variables as a model comparison problem, and address it using approximate Bayes factors. We develop a greedy search algorithm for finding a local optimum in model space. The resulting method selects variables (or features), the number of clusters, and the clustering model simultaneously. We applied the method to several simulated and real examples, and found that removing irrelevant variables often improved performance. Compared to methods based on all the variables, our variable selection method consistently yielded more accurate estimates of the number of clusters, and lower classification error rates, as well as more parsimonious clustering models and easier visualization of results.
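A toy version of the core idea might look like the following, with a BIC difference standing in for the approximate Bayes factor and a crude median split standing in for an EM fit; the data, the split heuristic, and the decision rule are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two clusters separated on the first variable only; the second variable
# is pure noise (hypothetical data).
x_rel = np.concatenate([rng.normal(-3, 1, 100), rng.normal(3, 1, 100)])
x_irr = rng.normal(0, 1, 200)
X = np.column_stack([x_rel, x_irr])

def bic_one_gaussian(x):
    """BIC of a single-Gaussian model (2 parameters: mean, variance)."""
    n = len(x)
    sig = x.std()
    ll = -0.5 * n * np.log(2 * np.pi * sig ** 2) - 0.5 * n
    return -2 * ll + 2 * np.log(n)

def bic_two_gaussians(x):
    """BIC of a crude two-component model, split at the median
    (5 parameters: 2 means, 2 variances, 1 weight)."""
    n = len(x)
    ll = 0.0
    for part in (x[x <= np.median(x)], x[x > np.median(x)]):
        m = len(part)
        sig = max(part.std(), 1e-9)
        ll += (-0.5 * m * np.log(2 * np.pi * sig ** 2) - 0.5 * m
               + m * np.log(m / n))
    return -2 * ll + 5 * np.log(n)

def is_clustering_variable(x):
    # The BIC difference approximates twice the log Bayes factor.
    return bic_two_gaussians(x) < bic_one_gaussian(x)
```

The paper's greedy search would apply such comparisons over nested subsets of variables jointly with the choice of clustering model, which this single-variable check does not capture.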
Analysis of PCA-based and Fisher discriminant-based image recognition algorithms, 2000
Feature extraction for nonparametric discriminant analysis
Journal of Computational and Graphical Statistics, 2003
Cited by 21 (1 self)
In high-dimensional classification problems, one is often interested in finding a few important discriminant directions in order to reduce the dimensionality. Fisher's linear discriminant analysis (LDA) is a commonly used method. Although LDA is guaranteed to find the best directions when each class has a Gaussian density with a common covariance matrix, it can fail if the class densities are more general. Using a likelihood-based interpretation of Fisher's LDA criterion, we develop a general method for finding important discriminant directions without assuming the class densities belong to any particular parametric family. We also show that our method can be easily integrated with projection pursuit density estimation to produce a powerful procedure for (reduced-rank) nonparametric discriminant analysis.
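For reference, the classical two-class Fisher direction that the paper generalizes can be computed in a few lines; the toy data below are invented, and the paper's nonparametric extension is not shown:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two toy Gaussian classes separated along the first coordinate.
A = rng.normal([0.0, 0.0, 0.0], 1.0, (100, 3))
B = rng.normal([4.0, 0.0, 0.0], 1.0, (100, 3))

def fisher_direction(A, B):
    """Leading Fisher direction: maximizes between- over within-class
    scatter; for two classes it reduces to Sw^{-1} (mean_A - mean_B)."""
    Sw = np.cov(A.T) + np.cov(B.T)            # within-class scatter
    w = np.linalg.solve(Sw, A.mean(axis=0) - B.mean(axis=0))
    return w / np.linalg.norm(w)

w = fisher_direction(A, B)
```

With the classes separated along the first axis, the returned unit vector should point (up to sign) close to that axis.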
The Haar wavelet transform of a dendrogram
Journal of Classification, 2007
Cited by 20 (8 self)
We consider the wavelet transform of a finite, rooted, node-ranked, p-way tree, focusing on the case of binary (p = 2) trees. We study a Haar wavelet transform on this tree. Wavelet transforms allow for multiresolution analysis through translation and dilation of a wavelet function. We explore how this works in our tree context.
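A minimal sketch of the idea on a tiny binary dendrogram, with leaves as scalars and uniform weights rather than the node-dependent weights of the paper's transform:

```python
# Dendrogram encoded as nested pairs, leaves as scalars. At each internal
# node the "smooth" is the children's average and the "detail" half their
# difference -- a simplified, uniform-weight Haar transform on the tree.
def haar(tree):
    if not isinstance(tree, tuple):
        return tree, []                      # leaf: value, no details
    s_left, d_left = haar(tree[0])
    s_right, d_right = haar(tree[1])
    details = d_left + d_right + [0.5 * (s_left - s_right)]
    return 0.5 * (s_left + s_right), details

root_smooth, details = haar(((1.0, 3.0), (5.0, 7.0)))
# root_smooth is 4.0; details are [-1.0, -1.0, -2.0]
```

Each internal node contributes one detail coefficient, so the transform yields one smooth value at the root plus one coefficient per merge, mirroring the multiresolution structure of the dendrogram.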
A Two-Way Visualization Method for Clustered Data, 2003
Cited by 13 (3 self)
We describe a novel approach to the visualization of hierarchical clustering that superimposes the classical dendrogram over a fully synchronized low-dimensional embedding, thereby gaining the benefits of both approaches. In a single image one can view all the clusters, examine the relations between them and study many of their properties. The method is based on an algorithm for low-dimensional embedding of clustered data, with the property that separation between all clusters is guaranteed, regardless of their nature. In particular, the algorithm was designed to produce embeddings that strictly adhere to a given hierarchical clustering of the data, so that every two disjoint clusters in the hierarchy are drawn separately.
Supervised classification with conditional Gaussian networks: Increasing the structure complexity from naive Bayes
International Journal of Approximate Reasoning
Cited by 10 (0 self)
Most Bayesian network-based classifiers are only able to handle discrete variables, yet most real-world domains involve continuous variables. A common practice is to discretize them, with a subsequent loss of information. This work shows how discrete classifier induction algorithms can be adapted to the conditional Gaussian network paradigm to deal with continuous variables without discretizing them. In addition, three novel classifier induction algorithms and two new propositions about mutual information are introduced. The classifier induction algorithms presented are ordered and grouped according to their structural complexity: naive Bayes, tree augmented naive Bayes, k-dependence Bayesian classifiers, and semi-naive Bayes. All the classifier induction algorithms are empirically evaluated using predictive accuracy, and they are compared to linear discriminant analysis as a classic statistical benchmark for continuous data. The accuracies of a set of state-of-the-art classifiers are also included to justify the use of linear discriminant analysis as the benchmark algorithm. To better understand the behavior of the conditional Gaussian network-based classifiers, the results include a bias-variance decomposition of the expected misclassification rate. The study suggests that semi-naive Bayes structure-based classifiers, and especially the novel wrapper condensed semi-naive Bayes backward, outperform the rest of the presented classifiers and obtain quite competitive results compared to the state-of-the-art algorithms included. Key words: conditional Gaussian network, Bayesian network, naive Bayes, tree augmented naive Bayes, k-dependence Bayesian classifiers, semi-naive Bayes, filter, wrapper.
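The simplest member of the structural hierarchy, naive Bayes with a class-conditional Gaussian per continuous feature, can be sketched as follows; the two-class synthetic data and equal-prior rule are invented, and none of the paper's induction algorithms are reproduced here:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic two-class continuous data (purely for illustration).
X0 = rng.normal(0.0, 1.0, (50, 2))
X1 = rng.normal(3.0, 1.0, (50, 2))
params = [(Xc.mean(axis=0), Xc.var(axis=0)) for Xc in (X0, X1)]

def log_density(x, mu, var):
    """Log density of independent (naive) class-conditional Gaussians."""
    return float(np.sum(-0.5 * np.log(2 * np.pi * var)
                        - (x - mu) ** 2 / (2 * var)))

def predict(x):
    # Equal priors: pick the class with the larger Gaussian likelihood.
    return int(np.argmax([log_density(x, mu, var) for mu, var in params]))
```

No discretization step is needed: each feature keeps its continuous density, which is the point the abstract makes against the discretize-first practice.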
The remarkable simplicity of very high dimensional data: application to model-based clustering
Journal of Classification, 2009
Cited by 8 (2 self)
An ultrametric topology formalizes the notion of hierarchical structure. An ultrametric embedding, referred to here as ultrametricity, is implied by a hierarchical embedding. Such hierarchical structure can be global in the data set, or local. By quantifying the extent or degree of ultrametricity in a data set, we show that ultrametricity becomes pervasive as dimensionality and/or spatial sparsity increases. This leads us to assert that very high dimensional data are of simple structure. We exemplify this finding through a range of simulated data cases. We also discuss applications to very high frequency time series segmentation and modeling.
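One crude way to see the dimensionality effect is a triple-based isosceles check: in an ultrametric space every triangle has its two largest sides equal. The tolerance, sample sizes, and score below are invented and need not match the paper's actual coefficient:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(4)

def ultrametric_fraction(X, tol=0.05):
    """Fraction of point triples whose two largest pairwise distances are
    nearly equal -- a crude stand-in for a degree-of-ultrametricity score."""
    hits, total = 0, 0
    for i, j, k in combinations(range(len(X)), 3):
        d = sorted([np.linalg.norm(X[i] - X[j]),
                    np.linalg.norm(X[j] - X[k]),
                    np.linalg.norm(X[i] - X[k])])
        total += 1
        hits += (d[2] - d[1]) / d[2] < tol
    return hits / total

low = ultrametric_fraction(rng.normal(size=(12, 2)))      # low dimension
high = ultrametric_fraction(rng.normal(size=(12, 500)))   # high dimension
```

Because pairwise distances concentrate in high dimensions, nearly all triangles become approximately isosceles there, consistent with the abstract's claim.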
Comparison of Discrimination Methods for High Dimensional Data
Cited by 7 (3 self)
In microarray experiments, the dimension p of the data is very large but there are only a few observations N on the subjects/patients. In this article, the problem of classifying a subject into one of two groups when p is large is considered. Three procedures based on the Moore-Penrose inverse of the sample covariance matrix and an empirical Bayes estimate of the precision matrix are proposed and compared with the DLDA procedure. Key words and phrases: classification, discriminant analysis, minimum distance, Moore-Penrose inverse.
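A minimal sketch of the general device, a minimum-distance rule built on the pseudo-inverse of the singular pooled covariance; the dimensions and data are invented, and this is not one of the article's three proposed procedures:

```python
import numpy as np

rng = np.random.default_rng(5)

# p >> N: the pooled sample covariance is singular, so invert it with the
# Moore-Penrose pseudo-inverse (all sizes invented).
N, p = 10, 100
X0 = rng.normal(0.0, 1.0, (N, p))
X1 = rng.normal(1.5, 1.0, (N, p))
mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)

S = np.cov(np.vstack([X0 - mu0, X1 - mu1]).T)   # rank-deficient (rank <= 18)
S_pinv = np.linalg.pinv(S)

def classify(x):
    """Minimum-distance rule in the pseudo-inverse metric."""
    d0 = (x - mu0) @ S_pinv @ (x - mu0)
    d1 = (x - mu1) @ S_pinv @ (x - mu1)
    return 0 if d0 < d1 else 1
```

`np.linalg.pinv` zeroes the near-null singular values, so the rule effectively measures distance only within the subspace where the within-group scatter is observed.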
Batch-Learning Self-Organizing Map with false-neighbor degree between neurons
IEEE International Joint Conference on Neural Networks (IJCNN 2008, IEEE World Congress on Computational Intelligence), 2008
Cited by 5 (4 self)
This study proposes a Batch-Learning Self-Organizing Map with a False-Neighbor degree between neurons (BL-FNSOM). False-neighbor degrees are allocated between adjacent rows and adjacent columns of BL-FNSOM. The initial values of all false-neighbor degrees are set to zero; they increase during learning and penalize the distance between map nodes when the weight vectors of the neurons are updated. BL-FNSOM changes the neighborhood relationship flexibly according to the situation and the shape of the data, even though it uses batch learning. We apply BL-FNSOM to several input data sets and confirm that it can obtain a map that reflects the distribution of the input data more effectively than the conventional Batch-Learning SOM. Key words: self-organizing maps (SOM), clustering, unsupervised learning.
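For context, a bare-bones batch-SOM update on a one-dimensional chain of nodes looks like the following; the false-neighbor degrees, which are the paper's contribution, are omitted, and the grid size, kernel width, and data are invented:

```python
import numpy as np

rng = np.random.default_rng(6)

# One batch-learning update on a 1-D chain of 5 map nodes: every weight
# becomes a neighborhood-kernel-weighted mean of the data.
data = rng.uniform(0.0, 1.0, (200, 2))
weights = rng.uniform(0.0, 1.0, (5, 2))

def batch_step(weights, data, sigma=1.0):
    # Best-matching unit (closest node) for every sample.
    bmu = np.argmin(((data[:, None, :] - weights[None]) ** 2).sum(-1), axis=1)
    idx = np.arange(len(weights))
    # Gaussian neighborhood kernel over grid distance on the chain.
    h = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / sigma) ** 2)
    w = h[:, bmu]                                 # (node, sample) influence
    return (w @ data) / w.sum(axis=1, keepdims=True)

new_weights = batch_step(weights, data)
```

In the paper's variant, the kernel's grid distance would additionally be inflated by the learned false-neighbor degrees between adjacent rows and columns, loosening the neighborhood relation where the data do not support it.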