A decomposition model to track gene expression signatures: preview on observer-independent classification of ovarian cancer (2002)

by A-M Martoglio, J W Miskin, S K Smith, D J MacKay
Venue:Bioinformatics

Results 1 - 8 of 8

Infinite Sparse Factor Analysis and Infinite Independent Components Analysis

by David Knowles, Zoubin Ghahramani
Cited by 17 (4 self)
Abstract. A nonparametric Bayesian extension of Independent Components Analysis (ICA) is proposed where observed data Y is modelled as a linear superposition, G, of a potentially infinite number of hidden sources, X. Whether a given source is active for a specific data point is specified by an infinite binary matrix, Z. The resulting sparse representation allows increased data reduction compared to standard ICA. We define a prior on Z using the Indian Buffet Process (IBP). We describe four variants of the model, with Gaussian or Laplacian priors on X and the one- or two-parameter IBPs. We demonstrate Bayesian inference under these models using a Markov Chain Monte Carlo (MCMC) algorithm on synthetic and gene expression data and compare to standard ICA algorithms.
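As a rough illustration of the generative model this abstract describes, the Python sketch below draws a binary activity matrix Z from the one-parameter IBP and forms the data as a noisy linear mixture of the active sources. It is a sketch under assumptions, not the authors' implementation: the function names, the Gaussian mixing weights in G, the noise level, and the one-row-per-data-point layout are illustrative choices, and no MCMC inference is performed.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw from Poisson(rate) with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_points, alpha, rng):
    """Draw a binary matrix from the one-parameter Indian Buffet Process.

    Returns one row per data point; entry k is 1 if source k is active.
    """
    counts = []  # counts[k] = number of earlier points that use source k
    rows = []
    for i in range(1, num_points + 1):
        # Reuse an existing source with probability proportional to its popularity.
        row = [1 if rng.random() < m / i else 0 for m in counts]
        # Then switch on a Poisson(alpha / i) number of brand-new sources.
        new = sample_poisson(alpha / i, rng)
        row.extend([1] * new)
        counts = [m + z for m, z in zip(counts, row)] + [1] * new
        rows.append(row)
    width = len(counts)
    # Pad earlier rows with zeros for sources introduced later.
    return [row + [0] * (width - len(row)) for row in rows]


def generate_data(num_points, num_dims, alpha, noise_std, seed=0):
    """Assumed toy model: y_n = G (z_n * x_n) + Gaussian noise."""
    rng = random.Random(seed)
    Z = sample_ibp(num_points, alpha, rng)
    K = len(Z[0]) if Z else 0
    X = [[rng.gauss(0.0, 1.0) for _ in range(K)] for _ in range(num_points)]
    G = [[rng.gauss(0.0, 1.0) for _ in range(K)] for _ in range(num_dims)]
    Y = [
        [
            sum(G[d][k] * Z[n][k] * X[n][k] for k in range(K))
            + rng.gauss(0.0, noise_std)
            for d in range(num_dims)
        ]
        for n in range(num_points)
    ]
    return Y, Z
```

Calling `generate_data(20, 5, alpha=2.0, noise_std=0.1)` returns 20 observations of dimension 5 together with the binary matrix that records which sources were active for each observation. Replacing the Gaussian draw for X with a Laplacian one would give a toy version of the other prior mentioned in the abstract.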

Biologically valid linear factor models of gene expression

by Mark Girolami, Rainer Breitling - Bioinformatics, 2004
Cited by 16 (1 self)
Motivation: The identification of physiological processes underlying and generating the expression pattern observed in microarray experiments is a major challenge. Principal Component Analysis (PCA) is a linear multivariate statistical method that is regularly employed for that purpose as it provides a reduced-dimensional representation for subsequent study of possible biological processes responding to the particular experimental conditions. Making explicit the data assumptions underlying PCA highlights their lack of biological validity, thus making biological interpretation of the principal components problematic. A microarray data representation which enables clear biological interpretation is a desirable analysis tool. Results: We address this issue by employing the probabilistic interpretation of Principal Component Analysis and proposing alternative Linear Factor Models which are based on refined biological assumptions. A practical study on two well-understood microarray data sets highlights the weakness of Principal Component Analysis and the greater biological interpretability of the linear models we have developed. Availability: The model estimation routines are currently implemented as Matlab routines and these, as well as data and results reported, are available from the following URL

Modeling Cellular Processes with Variational Bayesian Cooperative Vector Quantizer

by X. Lu, M. Hauskrecht, R. S. Day - In Proceedings of Pacific Symposium on Biocomputing, 2004
Cited by 3 (3 self)
Gene expression of a cell is controlled by sophisticated cellular processes.

Independent component analysis of starch deficient pgm mutants

by Matthias Scholz, Yves Gibon, Mark Stitt, Joachim Selbig - In Giegerich, R. and Stoye, J. (eds), Proc. of the German Conference on Bioinformatics 2004. Gesellschaft für Informatik, 2004
Cited by 2 (0 self)
Abstract: Changes in enzymatic activities in response to carbon starvation were investigated in Arabidopsis thaliana in two distinct experiments. One compares the Columbia ecotype (Col-0) and its starch-deficient pgm mutant (plastidial phosphoglucomutase); the other investigates the enzymatic activities of Col-0 under extended night conditions. A classical technique for detecting and visualizing relevant information from the measured data is principal component analysis (PCA). We show that independent component analysis (ICA) is more suitable for our questions and the results are more precise than those obtained with PCA. This higher informative power is only achieved when ICA is combined with suitable pre-processing and evaluation criteria. It is essential to first reduce the dimensionality of the data set using PCA. The number of principal components significantly affects the quality of ICA; therefore we propose a criterion for estimating the optimal dimension automatically. The measure of kurtosis is used to sort the extracted components. We found that ICA could detect, on the one hand, the time component of the extended night experiment and, on the other hand, a discriminating component in the pgm mutant experiment. In both components the most important enzymes were the same, confirming the carbon starvation phenotype in the mutant.
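The kurtosis-based ordering mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration under assumptions: the abstract does not say which kurtosis estimator or sort direction was used, so the sketch assumes the population excess kurtosis and a descending sort, and the names `excess_kurtosis` and `sort_by_kurtosis` are hypothetical. The PCA reduction and the ICA step itself are not shown.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (0 for a Gaussian).

    Assumes the component is not constant, otherwise m2 is zero.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0


def sort_by_kurtosis(components):
    """Order components from most to least peaked (assumed direction)."""
    return sorted(components, key=excess_kurtosis, reverse=True)


# Two toy "components": one flat and bimodal, one dominated by a single spike.
flat = [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0]
spiky = [0.0] * 7 + [10.0]
ranked = sort_by_kurtosis([flat, spiky])
```

In this toy case the spiky component is ranked first, which matches the intuition that strongly non-Gaussian components are the ones of interest after ICA.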

GEOMETRIC OPTIMIZATION METHODS FOR INDEPENDENT COMPONENT ANALYSIS APPLIED ON GENE EXPRESSION DATA

by M. Journée, A. E. Teschendorff, P.-A. Absil, R. Sepulchre
Cited by 2 (2 self)
DNA microarrays provide a huge amount of data and therefore require dimensionality reduction methods to extract meaningful biological information. Independent Component Analysis (ICA) was proposed by several authors as an interesting means. Unfortunately, experimental data are usually of poor quality because of noise, outliers and lack of samples. Robustness to these hurdles will thus be a key feature for an ICA algorithm. This paper identifies a robust contrast function and proposes a new ICA algorithm. Index Terms: Independent Component Analysis (ICA)

Full address of the corresponding author:

by Francesca Ruffino, Marco Muselli, Giorgio Valentini
In the framework of gene expression data analysis, the selection of biologically relevant sets of genes and the discovery of new subclasses of diseases at bio-molecular level represent two significant problems. Unfortunately, in both cases the correct solution is usually unknown and the evaluation of the performance of gene selection and clustering methods is difficult and in many cases unfeasible. A natural approach to this complex issue consists in developing an artificial model for the generation of biologically plausible gene expression data, so that the set of relevant genes and the functional classes involved in the problem are known in advance. In this work we propose a mathematical model, based on positive Boolean functions, for the generation of synthetic gene expression data. Despite its simplicity, this model is sufficiently rich to take account of the specific peculiarities of gene expression, including the biological variability, viewed as a sort of random source. As an applicative example, we also provide some data simulations and numerical experiments for the analysis of the performance of gene selection methods. Key words: Gene expression modeling, gene selection, gene expression data clustering, positive Boolean functions, DNA microarrays.
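To make the idea of a positive Boolean function concrete, the Python sketch below labels synthetic samples with a negation-free OR-of-ANDs over gene activity indicators, so the relevant genes are known by construction. It is a hedged illustration only, not the model from the paper: the specific function, the activity probability, the Gaussian expression levels, and the names `evaluate` and `make_sample` are all assumptions.

```python
import random


def evaluate(terms, active):
    """Positive Boolean function in DNF: OR over terms, AND within a term.

    No gene is ever negated, so switching a gene on can never switch
    the output off.
    """
    return any(all(active[g] for g in term) for term in terms)


def make_sample(terms, num_genes, rng, p_active=0.5):
    """Draw one synthetic sample (all numeric settings are hypothetical)."""
    active = [rng.random() < p_active for _ in range(num_genes)]
    # Assumed noise model: active genes centred at 2.0, inactive at 0.0.
    expression = [rng.gauss(2.0 if a else 0.0, 0.5) for a in active]
    label = evaluate(terms, active)
    return expression, active, label


# f = (gene0 AND gene1) OR gene2; genes 3 to 5 are irrelevant by design.
terms = [(0, 1), (2,)]
rng = random.Random(0)
samples = [make_sample(terms, 6, rng) for _ in range(50)]
```

Because genes 0, 1 and 2 are the only ones that influence the label, a gene selection method run on `samples` can be scored against that known set.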

analysis of

by M. Journée, A. E. Teschendorff, P.-A. Absil, S. Tavaré (Liège, Belgium)
optimization methods for the

Geometric Optimization Methods for the Analysis of Gene Expression Data

by Michel Journée, Andrew E. Teschendorff, Pierre-Antoine Absil, Rodolphe Sepulchre (Liège, Belgium)
MJ and AET contributed equally to this work. Summary: DNA microarrays provide such a huge amount of data that unsupervised methods are required to reduce the dimension of the data set and to extract meaningful biological information. This work shows that Independent Component Analysis (ICA) is a promising approach for the analysis of genome-wide transcriptomic data. The paper first presents an overview of the most popular algorithms to perform ICA. These algorithms are then applied to a microarray breast-cancer data set. Some issues about the application of ICA and the evaluation of biological relevance of the results are discussed. This study indicates that ICA significantly outperforms Principal Component Analysis (PCA). The transcriptome is the set of all mRNA molecules in a given cell. Unlike the genome, which is roughly similar for all the cells of an organism, the transcriptome may vary from one cell to another according to the biological functions of that cell as well as the external stimuli. The transcriptome reflects the