The landscape of human proteins interacting with viruses and other pathogens
- PLoS Pathog
, 2008
Cited by 38 (1 self)
Infectious diseases result in millions of deaths each year. Mechanisms of infection have been studied in detail for many pathogens. However, many questions are relatively unexplored. What are the properties of human proteins that interact with pathogens? Do pathogens interact with certain functional classes of human proteins? Which infection mechanisms and pathways are commonly triggered by multiple pathogens? In this paper, to our knowledge, we provide the first study of the landscape of human proteins interacting with pathogens. We integrate human–pathogen protein–protein interactions (PPIs) for 190 pathogen strains from seven public databases. Nearly all of the 10,477 human-pathogen PPIs are for viral systems (98.3%), with the majority belonging to the human–HIV system (77.9%). We find that both viral and bacterial pathogens tend to interact with hubs (proteins with many interacting partners) and bottlenecks (proteins that are central to many paths in the network) in the human PPI network. We construct separate sets of human proteins interacting with bacterial pathogens, viral pathogens, and those interacting with multiple bacteria and with multiple viruses. Gene Ontology functions enriched in these sets reveal a number of processes, such as cell cycle regulation, nuclear transport, and immune response that participate in interactions with different pathogens. Our results provide the first global view of strategies used by pathogens to subvert human cellular processes and infect human cells. Supplementary data accompanying this paper is available at
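The distinction between hubs and bottlenecks in the abstract above can be illustrated on a toy network. The sketch below is not the authors' pipeline: it assumes that "hub" means high degree and that "bottleneck" can be approximated by betweenness centrality (computed here with Brandes' algorithm). The node names and the toy graph are hypothetical.

```python
from collections import deque

def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degrees(adj):
    """Degree of each node: the number of interaction partners."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def betweenness(adj):
    """Unnormalized betweenness centrality for an unweighted, undirected graph
    (Brandes' algorithm: one BFS per source plus dependency accumulation)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical toy network: two fully connected groups of four nodes,
# joined only through node X (D - X - E).
clique1 = ["A", "B", "C", "D"]
clique2 = ["E", "F", "G", "H"]
edges = [(u, v) for c in (clique1, clique2)
         for i, u in enumerate(c) for v in c[i + 1:]]
edges += [("D", "X"), ("X", "E")]

adj = build_adjacency(edges)
deg = degrees(adj)
bc = betweenness(adj)
# X has only 2 partners (not a hub) but lies on all 16 shortest paths
# between the two groups, so it has the highest betweenness.
```

In this toy graph D and E are both hubs and bottlenecks, while X is a bottleneck without being a hub, which is why the two properties are usually reported separately.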
FABIA: factor analysis for bicluster acquisition
, 2010
Cited by 33 (0 self)
Motivation: Biclustering of transcriptomic data groups genes and samples simultaneously. It is emerging as a standard tool for extracting knowledge from gene expression measurements. We propose a novel generative approach for biclustering called ‘FABIA: Factor Analysis for Bicluster Acquisition’. FABIA is based on a multiplicative model, which accounts for linear dependencies between gene expression and conditions, and also captures heavy-tailed distributions as observed in real-world transcriptomic data. The generative framework allows the use of well-founded model selection methods and the application of Bayesian techniques. Results: On 100 simulated datasets with known true, artificially implanted biclusters, FABIA clearly outperformed all 11 competitors. On these datasets, FABIA was able to separate spurious biclusters from true biclusters by ranking biclusters according to their information content. FABIA was tested on three microarray datasets with known subclusters, where it was the best method twice and the second best once among the compared biclustering approaches. Availability: FABIA is available as an R package on Bioconductor
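One way to picture a multiplicative bicluster model is as the outer product of a sparse gene vector and a sparse sample vector: the nonzero entries of the two vectors select the rows and columns of the bicluster, and every gene in it varies linearly with the sample factor. The sketch below assumes this reading. It omits noise, heavy-tailed priors, and model fitting, and it is not the API of the FABIA R package; the names `lam` and `z` and all values are hypothetical.

```python
def outer(lam, z):
    """Outer product: entry (i, j) is gene loading i times sample factor j."""
    return [[l * f for f in z] for l in lam]

# Hypothetical sparse vectors: zeros mark genes/samples outside the bicluster.
lam = [0.0, 2.0, -1.0, 0.0, 0.5]   # one loading per gene
z = [1.0, 0.0, 0.0, 3.0]           # one factor value per sample

X = outer(lam, z)                  # noise-free 5 x 4 expression matrix

# The bicluster is the set of rows and columns with nonzero entries.
rows = [i for i, l in enumerate(lam) if l != 0]
cols = [j for j, f in enumerate(z) if f != 0]
```

Several overlapping biclusters would correspond to a sum of such outer products, which is the form a factor analysis model takes.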
Duplessis: Mining gene expression data with pattern structures in formal concept analysis
- Information Sciences
, 2011
A toolbox for bicluster analysis in R
- Tech. Rep. 028, Ludwig-Maximilians-Universität München
, 2008
Cited by 19 (3 self)
Over the last decade, bicluster methods have become more and more popular in different fields of two-way data analysis, and a wide variety of algorithms and analysis methods have been published. In this paper we introduce the R package biclust, which contains a collection of bicluster algorithms, preprocessing methods for two-way data, and validation and visualization techniques for bicluster results. For the first time, such a package is provided on a platform like R, where data analysts can easily add new bicluster algorithms and adapt them to their special needs.
A scalable framework for discovering coherent co-clusters in noisy data
- In ICML ’08
Cited by 19 (4 self)
QUBIC: a qualitative biclustering algorithm for analyses of gene expression data
, 2009
Identification of regulatory modules in time-series gene expression data using a linear time biclustering algorithm
- IEEE/ACM Transactions on Computational Biology and Bioinformatics
Cited by 18 (2 self)
Several non-supervised machine learning methods have been used in the analysis of gene expression data obtained from microarray experiments. Recently, biclustering, a non-supervised approach that performs simultaneous clustering on the row and column dimensions of the data matrix, has been shown to be remarkably effective in a variety of applications. The goal of biclustering is to find subgroups of genes and subgroups of experimental conditions, where the genes exhibit highly correlated behaviors. These correlated behaviors correspond to coherent expression patterns and can be used to identify potential regulatory modules possibly involved in regulatory mechanisms. Many specific versions of the biclustering problem have been shown to be NP-complete. However, when we are interested in identifying biclusters in time series expression data, we can restrict the problem by finding only maximal biclusters with contiguous columns. This restriction leads to a tractable problem. Its motivation is the fact that biological processes start and finish in an identifiable contiguous period of time, leading to increased (or decreased) activity of sets of genes forming biclusters with contiguous
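The restriction to contiguous columns described in this abstract can be made concrete with a small sketch. It assumes each gene's time series has already been discretized into one symbol per time point (here U = up, D = down, N = no change; this alphabet is an assumption), and it enumerates every time window by brute force. It is not the linear-time algorithm of the paper and does not filter for maximality; the function name and the data are hypothetical.

```python
from collections import defaultdict

def contiguous_biclusters(patterns, min_genes=2, min_cols=2):
    """Group genes that share the same symbols over a contiguous time window.

    patterns: dict mapping gene name -> string of symbols, one per time point.
    Returns tuples (genes, start, end, shared_pattern), with end exclusive.
    """
    n_cols = len(next(iter(patterns.values())))
    found = []
    for start in range(n_cols):
        for end in range(start + min_cols, n_cols + 1):
            groups = defaultdict(list)
            for gene, pattern in patterns.items():
                groups[pattern[start:end]].append(gene)
            for shared, genes in groups.items():
                if len(genes) >= min_genes:
                    found.append((sorted(genes), start, end, shared))
    return found

# Hypothetical discretized expression patterns over four time points.
patterns = {"g1": "UDUN", "g2": "UDUD", "g3": "NDUN"}
found = contiguous_biclusters(patterns)
```

Because only windows of adjacent time points are considered, the number of candidate column sets grows quadratically rather than exponentially with the number of time points, which gives an intuition for why this restriction makes the problem tractable.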
On Using Class-Labels in Evaluation of Clusterings
Cited by 15 (9 self)
Although clustering has been studied for several decades, the fundamental problem of a valid evaluation has not yet been solved. The sound evaluation of clustering results, in particular on real data, is inherently difficult. In the literature, new clustering algorithms and their results are often externally evaluated with respect to an existing class labeling. These class labels, however, may not be adequate for the structure of the data or the evaluated cluster model. Here, we survey the literature of different related research areas that have observed this problem. We discuss common “defects” that clustering algorithms exhibit w.r.t. this evaluation, and show them on several real-world data sets of different domains, along with a discussion of why the detected clusters do not indicate a bad performance of the algorithm but are valid and useful results. A useful alternative evaluation method requires more extensive data labeling than the commonly used class labels, or it needs a combination of information measures to take subgroups, supergroups, and overlapping sets of traditional classes into account. Finally, we discuss an evaluation scenario that regards the possible existence of several complementary sets of labels and hope to stimulate the discussion among different sub-communities (like ensemble-clustering, subspace-clustering, multi-label classification, hierarchical classification or hierarchical clustering, and multiview-clustering or alternative clustering) regarding requirements on enhanced evaluation methods.
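The subgroup problem mentioned in this abstract can be shown with a few lines of arithmetic. The sketch below uses pair-counting precision and recall as the external measure; this choice of measure, the function name, and the toy labels are assumptions made for illustration, not the evaluation protocol of the paper.

```python
from itertools import combinations

def pair_counting_scores(class_labels, cluster_ids):
    """Pair-counting precision and recall of a clustering against class labels.

    A pair of objects is "positive" in the reference if both share a class,
    and "predicted positive" if both share a cluster.
    """
    same_class = same_cluster = both = 0
    for i, j in combinations(range(len(class_labels)), 2):
        in_class = class_labels[i] == class_labels[j]
        in_cluster = cluster_ids[i] == cluster_ids[j]
        same_class += in_class
        same_cluster += in_cluster
        both += in_class and in_cluster
    precision = both / same_cluster if same_cluster else 0.0
    recall = both / same_class if same_class else 0.0
    return precision, recall

# Hypothetical data: class "A" really consists of two subgroups, and the
# clustering finds them as clusters 0 and 1; class "B" is found intact.
class_labels = ["A", "A", "A", "A", "B", "B", "B", "B"]
cluster_ids = [0, 0, 1, 1, 2, 2, 2, 2]

precision, recall = pair_counting_scores(class_labels, cluster_ids)
# precision is 1.0, recall is 8/12: the clustering is penalized only for
# splitting class "A", even if the two subgroups are genuinely distinct.
```

No cluster mixes classes here, yet recall falls to two thirds, which is the kind of score drop that may reflect an inadequate labeling rather than a poor algorithm.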