Comparison of discrimination methods for the classification of tumors using gene expression data
- JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 2002
Cited by 348 (2 self)
A reliable and precise classification of tumors is essential for successful diagnosis and treatment of cancer. cDNA microarrays and high-density oligonucleotide chips are novel biotechnologies increasingly used in cancer research. By allowing the monitoring of expression levels in cells for thousands of genes simultaneously, microarray experiments may lead to a more complete understanding of the molecular variations among tumors and hence to a finer and more informative classification. The ability to successfully distinguish between tumor classes (already known or yet to be discovered) using gene expression data is an important aspect of this novel approach to cancer classification. This article compares the performance of different discrimination methods for the classification of tumors based on gene expression data. The methods include nearest-neighbor classifiers, linear discriminant analysis, and classification trees. Recent machine learning approaches, such as bagging and boosting, are also considered. The discrimination methods are applied to datasets from three recently published cancer gene expression studies.
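As a rough illustration of the simplest method named in this abstract, the following is a minimal sketch of a nearest-neighbor classifier. It is not the authors' implementation: the function name `knn_predict`, the toy two-gene profiles, and the choice of Euclidean distance with k = 3 are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    # Rank training samples by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    # Count the class labels of the k closest samples.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical expression profiles (two genes per sample) and tumor classes.
profiles = [(0.1, 0.2), (0.0, 0.4), (0.3, 0.1), (2.1, 1.9), (1.8, 2.2), (2.0, 2.0)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(profiles, labels, (0.2, 0.3)))
print(knn_predict(profiles, labels, (1.9, 2.1)))
```

In real microarray data the number of genes far exceeds the number of samples, so a gene-selection step would normally precede a classifier like this one; that step is omitted here.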
Variable Selection for Model-Based Clustering
- Journal of the American Statistical Association
, 2006
Cited by 27 (4 self)
We consider the problem of variable or feature selection for model-based clustering. We recast the problem of comparing two nested subsets of variables as a model comparison problem, and address it using approximate Bayes factors. We develop a greedy search algorithm for finding a local optimum in model space. The resulting method selects variables (or features), the number of clusters, and the clustering model simultaneously. We applied the method to several simulated and real examples, and found that removing irrelevant variables often improved performance. Compared to methods based on all the variables, our variable selection method consistently yielded more accurate estimates of the number of clusters, and lower classification error rates, as well as more parsimonious clustering models and easier visualization of results.
Feature extraction for non-parametric discriminant analysis
- Journal of Computational and Graphical Statistics
, 2003
Cited by 13 (0 self)
In high-dimensional classification problems, one is often interested in finding a few important discriminant directions in order to reduce the dimensionality. Fisher’s linear discriminant analysis (LDA) is a commonly used method. Although LDA is guaranteed to find the best directions when each class has a Gaussian density with a common covariance matrix, it can fail if the class densities are more general. Using a likelihood-based interpretation of Fisher’s LDA criterion, we develop a general method for finding important discriminant directions without assuming the class densities belong to any particular parametric family. We also show that our method can be easily integrated with projection pursuit density estimation to produce a powerful procedure for (reduced-rank) nonparametric discriminant analysis.
The Haar wavelet transform of a dendrogram
- Journal of Classification
, 2007
Cited by 12 (5 self)
We consider the wavelet transform of a finite, rooted, node-ranked, p-way tree, focusing on the case of binary (p = 2) trees. We study a Haar wavelet transform on this tree. Wavelet transforms allow for multiresolution analysis through translation and dilation of a wavelet function. We explore how this works in our tree context.
A Two-Way Visualization Method for Clustered Data
, 2003
Cited by 8 (3 self)
We describe a novel approach to the visualization of hierarchical clustering that superimposes the classical dendrogram over a fully synchronized low-dimensional embedding, thereby gaining the benefits of both approaches. In a single image one can view all the clusters, examine the relations between them and study many of their properties. The method is based on an algorithm for low-dimensional embedding of clustered data, with the property that separation between all clusters is guaranteed, regardless of their nature. In particular, the algorithm was designed to produce embeddings that strictly adhere to a given hierarchical clustering of the data, so that every two disjoint clusters in the hierarchy are drawn separately.
Supervised classification with conditional Gaussian networks: Increasing the structure complexity from naive Bayes
- International Journal of Approximate Reasoning
Cited by 5 (0 self)
Most of the Bayesian network-based classifiers are usually only able to handle discrete variables. However, most real-world domains involve continuous variables. A common practice to deal with continuous variables is to discretize them, with a subsequent loss of information. This work shows how discrete classifier induction algorithms can be adapted to the conditional Gaussian network paradigm to deal with continuous variables without discretizing them. In addition, three novel classifier induction algorithms and two new propositions about mutual information are introduced. The classifier induction algorithms presented are ordered and grouped according to their structural complexity: naive Bayes, tree augmented naive Bayes, k-dependence Bayesian classifiers and semi naive Bayes. All the classifier induction algorithms are empirically evaluated using predictive accuracy, and they are compared to linear discriminant analysis, as a continuous classic statistical benchmark classifier. Besides, the accuracies for a set of state-of-the-art classifiers are included in order to justify the use of linear discriminant analysis as the benchmark algorithm. In order to understand the behavior of the conditional Gaussian network-based classifiers better, the results include bias-variance decomposition of the expected misclassification rate. The study suggests that semi naive Bayes structure based classifiers and, especially, the novel wrapper condensed semi naive Bayes backward, outperform the behavior of the rest of the presented classifiers. They also obtain quite competitive results compared to the state-of-the-art algorithms included. Key words: conditional Gaussian network, Bayesian network, naive Bayes, tree augmented naive Bayes, k-dependence Bayesian classifiers, semi naive Bayes, filter, wrapper.
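To make the idea of handling continuous variables without discretizing them concrete, here is a minimal sketch of a Gaussian naive Bayes classifier, the simplest structure in the family listed in this abstract. It is an illustration under stated assumptions, not the paper's algorithms: the function names `fit_gaussian_nb` and `predict_gaussian_nb`, the toy data, and the variance floor of 1e-9 are all hypothetical choices.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a class prior and per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # Maximum-likelihood variance, floored (assumed value) to avoid division by zero.
        variances = [max(sum((v - m) ** 2 for v in col) / n, 1e-9)
                     for col, m in zip(columns, means)]
        model[label] = (n / len(rows), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior under the naive Bayes assumption."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        # Features are treated as conditionally independent Gaussians given the class.
        for x, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


# Hypothetical continuous training data with two features per sample.
rows = [(1.0, 2.1), (1.2, 1.9), (0.8, 2.0), (3.0, 0.5), (3.2, 0.4), (2.9, 0.7)]
labels = ["low", "low", "low", "high", "high", "high"]
model = fit_gaussian_nb(rows, labels)
print(predict_gaussian_nb(model, (1.1, 2.0)))
```

The richer structures named in the abstract (tree augmented naive Bayes, k-dependence classifiers, semi naive Bayes) relax the independence assumption used here by allowing dependencies among the predictors.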
Case Base Adaptation Using Solution-Space Metrics, to appear in
- Proceedings of the 18th International Joint Conference on Artificial Intelligence, IJCAI-03
, 2003
Cited by 1 (1 self)
In this paper we propose a generalisation of the k-nearest neighbour (k-NN) retrieval method based on an error function using distance metrics in the solution and problem space. It is an interpolative method which is proposed to be effective for sparse case bases. The method applies equally to nominal, continuous and mixed domains, and does not depend upon an embedding n-dimensional space. In continuous Euclidean problem domains, the method is shown to be a generalisation of Shepard's interpolation method. We term the retrieval algorithm the
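For reference, the classical Shepard's interpolation that this abstract mentions can be sketched as inverse-distance weighting. This shows only the classical method, not the generalised retrieval algorithm the paper proposes; the function name `shepard_interpolate`, the default power of 2, and the sample points are assumptions for the example.

```python
import math


def shepard_interpolate(points, values, query, power=2.0):
    """Estimate the value at `query` as an inverse-distance-weighted average."""
    weighted_sum = 0.0
    weight_total = 0.0
    for point, value in zip(points, values):
        distance = math.dist(point, query)
        if distance == 0.0:
            # The query coincides with a known point, so return its value exactly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical cases: problem-space coordinates with a numeric solution each.
points = [(0.0, 0.0), (2.0, 0.0)]
values = [0.0, 10.0]
print(shepard_interpolate(points, values, (1.0, 0.0)))
```

A query midway between the two known points receives equal weights and therefore the average of their values; closer points dominate as the query moves toward them.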
Connectionist Learning Architecture Based on an Optical Thin-Film Multilayer Model
, 1997
Cited by 1 (1 self)
Connectionist models consist of large numbers of simple but highly interconnected "units". They represent an approach that is quite different from that of classical models based on the structure of Von Neumann machines. Although the term "connectionist models" often refers to artificial neural network models, which are inspired directly by the biological neurons, there are also other connectionist architectures that differ significantly from this biological exemplar. This thesis describes such a "novel" connectionist learning architecture based on the technology of optical thin-film multilayer. The proposed connectionist model consists of multiple thin-film layers (similar to simple processing units in a neural network model), each with different refractive index and thickness. A light beam incident perpendicular to the surface of the multilayer stack is used to carry out the required computation. The reflectance of the light incident can be used as the general measurement of the outputs. Inputs can be fed into the system by encoding them into some system parameters such as refractive indices, and individual layer thicknesses can be used as adjustable parameters that are equivalent to the connection weights of a neural network model. Since this approach involves optical signal processing, the proposed connectionist learning architecture has unique properties and could offer significant advantages. Much of the work has focused on developing this new connectionist learning architecture and investigating its capability to accomplish complex computational tasks which have been extensively studied for conventional connectionist models such as the widely used feed-forward neural network using the back-propagation learning by gradient descent. A prototype simulation model ha...
Rule Set Quality Measures For Inductive Learning Algorithms
, 1996
Cited by 1 (0 self)
Symbolic inductive learning systems that induce concept descriptions from examples are valuable tools in the task of knowledge acquisition for expert systems. Since inductive learning methods produce distinct concept descriptions when given identical training data, questions arise as to the quality of the different rule sets produced. This work provides several techniques for comparing and analyzing rule sets. These techniques measure the accuracy, generalization, time and space complexity, and domain coverage of rule sets. Based on these metrics, the performance of four different inductive learning systems is compared. These systems are Michalski et al.'s AQ15 (1986a; 1986b; Hong, Mozetic, and Michalski, 1986; Wnek et al., 1995), Quinlan's C4.5 (1993), Clark and Niblett's CN2 (Clark and Niblett, 1989; Clark and Boswell, 1991), and Janikow's Genetic-based Inductive Learning system (GIL) (1991; 1993). The comparison is based on rule sets generated by these algorithms for six real world data sets. ...

