Comparison of discrimination methods for the classification of tumors using gene expression data
- JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 2002
Cited by 348 (2 self)
A reliable and precise classification of tumors is essential for successful diagnosis and treatment of cancer. cDNA microarrays and high-density oligonucleotide chips are novel biotechnologies increasingly used in cancer research. By allowing the monitoring of expression levels in cells for thousands of genes simultaneously, microarray experiments may lead to a more complete understanding of the molecular variations among tumors and hence to a finer and more informative classification. The ability to successfully distinguish between tumor classes (already known or yet to be discovered) using gene expression data is an important aspect of this novel approach to cancer classification. This article compares the performance of different discrimination methods for the classification of tumors based on gene expression data. The methods include nearest-neighbor classifiers, linear discriminant analysis, and classification trees. Recent machine learning approaches, such as bagging and boosting, are also considered. The discrimination methods are applied to datasets from three recently published cancer gene expression studies.
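As an illustration of the simplest method family named in this abstract, the following is a minimal sketch of a k-nearest-neighbor classifier on expression profiles. The toy data, the gene count, the class labels "A" and "B", and the function name `knn_predict` are all assumptions made for this example; the abstract does not specify the distance measure or the value of k, so Euclidean distance and k = 3 are used here only as common defaults.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict the class of `query` by majority vote among its k nearest training samples."""
    # Pair each training profile's Euclidean distance to the query with its label,
    # then sort so the closest samples come first.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    # Count the labels of the k closest samples and return the most frequent one.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: rows are tumor samples, columns are expression
# levels of three made-up genes.
train_x = [[2.1, 0.3, 1.0], [1.9, 0.4, 1.2], [0.2, 2.5, 0.9], [0.1, 2.2, 1.1]]
train_y = ["A", "A", "B", "B"]

prediction = knn_predict(train_x, train_y, [2.0, 0.5, 1.1], k=3)
```

With real microarray data the number of genes far exceeds the number of samples, so a gene-selection step would normally precede a classifier like this one; that step is omitted from the sketch.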
Associative clustering for exploring dependencies between functional genomics data sets
- IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
, 2005
Cryptanalysis of the Cellular Message Encryption Algorithm By David Wagner Bruce Schneier John Kelsey i
- IEEE/ACM Trans. Comput. Biol. Bioinform
, 2005
Cited by 6 (0 self)
We construct a gene-to-gene regulatory network from time-series data of expression levels for the whole genome of the yeast Saccharomyces cerevisiae, in a case where the number of measurements is much smaller than the number of genes in the network. This network is analyzed with respect to present biological knowledge of all genes (according to the Gene Ontology database), and we find some of its large-scale properties to be in accordance with known facts about the organism. The linear modeling employed here has been explored several times, but due to the lack of any validation beyond investigating individual genes, its applicability to biological systems has been seriously questioned. Our results show the adequacy of the approach and make further investigations of the model meaningful. Index Terms: Biology and genetics, time series analysis, network problems, gene network, network inference, Lasso, yeast, validation, outdegree.
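The Index Terms mention the Lasso, which suits a setting with far fewer measurements than genes because the L1 penalty drives most coefficients to exactly zero. The sketch below assumes one plausible formulation that the abstract does not spell out: the next-step expression of one target gene is regressed on the current expression of all genes, and the nonzero coefficients are read as incoming network edges. The objective used is (1/(2n))·||y − Xw||² + λ·||w||₁, solved by plain coordinate descent; the function names and toy numbers are hypothetical.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            # A feature that is zero everywhere cannot explain anything.
            w[j] = 0.0 if norm == 0.0 else soft_threshold(rho / n, lam) / (norm / n)
    return w

# Hypothetical toy data: 3 time transitions (rows) but 5 genes (columns).
X = [
    [1.0, 0.0, 0.5, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.5, 0.0],
    [-1.0, 0.0, 0.0, 0.0, 0.5],
]
y = [2.0, 0.0, -2.0]  # next-step level of the target gene, driven by gene 0 only

w = lasso_cd(X, y, lam=0.1)
edges = [j for j, coef in enumerate(w) if coef != 0.0]  # inferred regulators
```

Repeating the fit with each gene in turn as the target would give one row of the network per gene; the number of nonzero entries in a column of the resulting coefficient matrix is then that gene's outdegree.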
Low-Rank Matrix Fitting Based on Subspace Perturbation Analysis with Applications to Structure from Motion
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Cited by 2 (2 self)
The task of finding a low-rank (r) matrix that best fits an original data matrix of higher rank is a recurring problem in science and engineering. The problem becomes especially difficult when the original data matrix has some missing entries and contains an unknown additive noise term in the remaining elements. The former problem can be solved by concatenating a set of r-column matrices which share a common, single r-dimensional solution space. Unfortunately, the number of possible submatrices is generally very large and, hence, the results obtained with one set of r-column matrices will generally be different from those obtained with a different set. Ideally, we would like to find the solution that is least affected by noise. This requires that we determine which of the r-column matrices (i.e., which of the original feature points) are less influenced by the unknown noise term. This paper presents a criterion to successfully carry out such a selection. Our key result is to formally prove that the more distinct the r vectors of the r-column matrices are, the less they are swayed by noise. This key result is then combined with the use of a noise model to derive an upper bound on the effect that noise and occlusions have on each of the r-column matrices. It is shown how this criterion can be effectively used to recover the noise-free matrix of rank r. Finally, we derive affine and projective structure from motion (SFM) algorithms using the proposed criterion. Extensive validation on synthetic and real data sets shows the superiority of the proposed approach over the state of the art.
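To make the problem statement concrete, here is a small sketch of low-rank fitting with a missing entry. It is not the submatrix-selection criterion proposed in this paper; it swaps in plain alternating least squares for the rank-1 case (r = 1) as a generic baseline, and it ignores noise. The function name `rank1_fit`, the use of `None` for a missing entry, and the toy matrix are assumptions made for the example.

```python
def rank1_fit(M, n_iter=100):
    """Fit M ~ u v^T using only the observed entries (None marks a missing entry)."""
    rows, cols = len(M), len(M[0])
    u = [1.0] * rows
    v = [1.0] * cols
    for _ in range(n_iter):
        # Fix v and solve a one-dimensional least-squares problem for each u[i].
        for i in range(rows):
            obs = [j for j in range(cols) if M[i][j] is not None]
            den = sum(v[j] ** 2 for j in obs)
            if den > 0:
                u[i] = sum(M[i][j] * v[j] for j in obs) / den
        # Fix u and solve for each v[j] in the same way.
        for j in range(cols):
            obs = [i for i in range(rows) if M[i][j] is not None]
            den = sum(u[i] ** 2 for i in obs)
            if den > 0:
                v[j] = sum(M[i][j] * u[i] for i in obs) / den
    return u, v

# Hypothetical data: the outer product of [1, 2, 3] and [1, 2, 4],
# with the top-right entry (true value 4) hidden.
M = [
    [1.0, 2.0, None],
    [2.0, 4.0, 8.0],
    [3.0, 6.0, 12.0],
]
u, v = rank1_fit(M)
filled = u[0] * v[2]  # estimate of the hidden entry
```

On noisy data with many missing entries a baseline like this can settle on a poor solution, which is the kind of sensitivity the criterion described in the abstract is meant to address.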
A Survey of Evolutionary Algorithms for Clustering
This paper presents a survey of evolutionary algorithms designed for clustering tasks. It tries to reflect the profile of this area by focusing more on those subjects that have been given more importance in the literature. In this context, most of the paper is devoted to partitional algorithms that look for hard clusterings of data, though overlapping (i.e., soft and fuzzy) approaches are also covered in the manuscript. The paper is original in two main respects. First, it provides an up-to-date overview that is fully devoted to evolutionary algorithms for clustering, is not limited to any particular kind of evolutionary approach, and comprises advanced topics, like multi-objective and ensemble-based evolutionary clustering. Second, it provides a taxonomy that highlights some very important aspects in the context of evolutionary data clustering, namely, fixed or variable number of clusters, cluster-oriented or non-oriented operators, context-sensitive or context-insensitive operators, guided or unguided operators, binary, integer or real encodings, centroid-based, medoid-based, label-based, tree-based or graph-based representations, among others. A number of references are provided that describe applications of evolutionary algorithms for clustering in different domains, such as image processing, computer security, and bioinformatics. The paper ends by addressing some important issues and open questions that can be the subject of future research. Index Terms: evolutionary algorithms, clustering, applications.
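Several of the taxonomy terms above can be seen in one small example. The sketch below is an assumption-laden illustration, not an algorithm taken from the survey: it uses a fixed number of clusters, an integer label-based encoding (one cluster label per data point), an unguided mutation operator that reassigns a random point to a random cluster, and elitist selection on the within-cluster sum of squared errors. All names and parameter values are hypothetical.

```python
import random

def sse(points, labels, k):
    """Within-cluster sum of squared distances to each centroid (lower is better)."""
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue  # an empty cluster contributes nothing
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total

def evolve_clustering(points, k, pop_size=20, generations=200, seed=0):
    """Mutation-only evolutionary search over label vectors with elitist selection."""
    rng = random.Random(seed)
    n = len(points)
    # Each individual is a list of n cluster labels (label-based encoding).
    population = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        for parent in population:
            child = parent[:]
            # Unguided mutation: move one random point to a random cluster.
            child[rng.randrange(n)] = rng.randrange(k)
            offspring.append(child)
        # Keep the pop_size best of parents plus offspring, so the best
        # individual found so far is never lost.
        merged = population + offspring
        merged.sort(key=lambda ind: sse(points, ind, k))
        population = merged[:pop_size]
    return population[0]

# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
best = evolve_clustering(points, k=2)
best_score = sse(points, best, 2)
```

A guided operator would instead bias the mutation, for example by moving a point to the cluster with the nearest centroid, and a variable-number-of-clusters design would let k itself be part of the encoding.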

