Integrative analysis of histone ChIP-seq and transcription data using Bayesian mixture models
Bioinformatics, 2014
Cited by 1 (0 self)
Motivation: Histone modifications are a key epigenetic mechanism to activate or repress the transcription of genes. Datasets of matched transcription data and histone modification data obtained by ChIP-seq exist, but methods for integrative analysis of both data types are still rare. Here, we present a novel bioinformatics approach to detect genes that show different transcript abundances between two conditions putatively caused by alterations in histone modification. Results: We introduce a correlation measure for integrative analysis of ChIP-seq and gene transcription data measured by RNA sequencing or microarrays and demonstrate that a proper normalization of ChIP-seq data is crucial. We suggest applying Bayesian mixture models of different types of distributions to further study the distribution of the correlation measure. The implicit classification of the mixture models is used to detect genes with differences between two conditions in both gene transcription and histone modification. The method is applied to different datasets, and its superiority to a naive separate analysis of both data types is demonstrated. Availability and implementation: R/Bioconductor package epigenomix. Contact:
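The "implicit classification" of a mixture model can be illustrated with a minimal sketch. This is not the epigenomix implementation: it swaps the Bayesian mixture for a plain two-component Gaussian mixture fitted by maximum-likelihood EM, and the input scores are made-up placeholders for the correlation measure, whose definition the abstract does not give. The function names (fit_two_component_mixture, classify) are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(values, n_iter=200):
    """Fit a two-component Gaussian mixture by EM (an assumed stand-in for
    the Bayesian mixture models named in the abstract).

    Returns (weights, means, sds, responsibilities).
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]                      # start the components at the extremes
    spread = (hi - lo) / 4.0 or 1.0       # fall back to 1.0 if all values are equal
    sds = [spread, spread]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in values:
            dens = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and standard deviation per component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue                  # leave an empty component unchanged
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            sds[k] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsing component
    return weights, means, sds, resp


def classify(resp):
    """Assign each value to the component with the larger posterior probability."""
    return [0 if r[0] >= r[1] else 1 for r in resp]


# Placeholder scores: five near 0 ("no joint change"), five near 3 ("joint change").
scores = [-0.2, -0.1, 0.0, 0.1, 0.2, 2.8, 2.9, 3.0, 3.1, 3.2]
weights, means, sds, resp = fit_two_component_mixture(scores)
labels = classify(resp)
```

In this sketch the classification comes from the fitted model itself: no separate cut-off on the score is chosen by hand, which is the sense in which the classification is implicit.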
Integrative analysis of histone ChIP-seq and transcription data using Bayesian mixture models
Motivation: Histone modifications are a key epigenetic mechanism to activate or repress the transcription of genes. Data sets of matched transcription data and histone modification data obtained by ChIP-seq exist, but methods for integrative analysis of both data types are still rare. Here, we present a novel bioinformatics approach to detect genes that show different transcript abundances between two conditions putatively caused by alterations in histone modification. Results: We introduce a correlation measure for integrative analysis of ChIP-seq and gene transcription data measured by RNA-seq or microarrays and demonstrate that a proper normalisation of ChIP-seq data is crucial. We suggest applying Bayesian mixture models of different types of distributions to further study the distribution of the correlation measure. The implicit classification of the mixture models is used to detect genes with differences between two conditions in both gene transcription and histone modification. The method is applied to different data sets and its superiority to a naive separate analysis of both data types is demonstrated. Availability: R/Bioconductor package epigenomix. Contact:
Multimodal probabilistic generative models for time-course gene expression data and Gene Ontology (GO) tags
2015
We propose four probabilistic generative models for simultaneously modeling gene expression levels and Gene Ontology (GO) tags. Unlike previous approaches for using GO tags, the joint modeling framework allows the two sources of information to complement and reinforce each other. We fit our models to three time-course datasets collected to study biological processes, specifically blood vessel growth (angiogenesis) and mitotic cell cycles. The proposed models result in a joint clustering of genes and GO annotations. Different models group genes based on GO tags and their behavior over the entire time-course, within biological stages, or even individual time points. We show how such models can be used for biological stage boundary estimation de novo. We also evaluate our models on biological stage prediction accuracy of held out samples. Our results suggest that the models usually perform better when GO tag information is included.
Bayesian consensus clustering
Bioinformatics, Original Paper, Systems biology, doi:10.1093/bioinformatics/btt425, Advance Access publication August 28, 2013
Motivation: In biomedical research a growing number of platforms and technologies are used to measure diverse but related information, and the task of clustering a set of objects based on multiple sources of data arises in several applications. Most current approaches to multi-source clustering either independently determine a separate clustering for each data source or determine a single ‘joint’ clustering for all data sources. There is a need for more flexible approaches that simultaneously model the dependence and the heterogeneity of the data sources. Results: We propose an integrative statistical model that permits a separate clustering of the objects for each data source. These separate clusterings adhere loosely to an overall consensus clustering, and hence they are not independent. We describe a computationally scalable Bayesian framework for simultaneous estimation of both the consensus clustering and the source-specific clusterings. We demonstrate that this flexible approach is more robust than joint clustering of all data sources, and is more powerful than clustering each data source independently. We present an application to subtype identification of breast cancer tumor samples using publicly available data
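One simple way to picture source-specific clusterings that "adhere loosely" to a consensus is a conditional label distribution with a single adherence parameter. The sketch below is an assumption made for illustration and is not taken from the abstract: an object keeps its consensus label in a given source with probability `adherence`, and the remaining probability is split evenly over the other clusters. The function name source_label_probs and its parameters are hypothetical.

```python
def source_label_probs(consensus_label, n_clusters, adherence):
    """Probability of each source-specific label given the consensus label.

    Assumed form (illustrative only): the consensus label gets probability
    `adherence`; every other label gets (1 - adherence) / (n_clusters - 1).
    """
    if n_clusters < 2:
        raise ValueError("need at least two clusters")
    if not (1.0 / n_clusters <= adherence <= 1.0):
        raise ValueError("adherence must lie in [1/n_clusters, 1]")
    off = (1.0 - adherence) / (n_clusters - 1)
    return [adherence if k == consensus_label else off for k in range(n_clusters)]


# adherence = 1 ties every source to the consensus (one joint clustering);
# adherence = 1/n_clusters makes the source label independent of the consensus.
loose = source_label_probs(consensus_label=1, n_clusters=3, adherence=0.8)
strict = source_label_probs(consensus_label=1, n_clusters=3, adherence=1.0)
independent = source_label_probs(consensus_label=1, n_clusters=3, adherence=1.0 / 3)
```

Under this assumed form, the two approaches the abstract contrasts (one joint clustering, or fully independent clusterings) appear as the two ends of the adherence range.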
A Standardised Vocabulary for Identifying Benthic Biota and Substrata from Underwater Imagery: The CATAMI Classification Scheme
Research Article
☯ These authors contributed equally to this work. ‡ These authors made smaller but essential contributions to this work.
Integration Strategy Is a Key Step in Network-Based Analysis and Dramatically Affects Network Topological Properties and Inferring Outcomes
Research Article
Copyright © 2014 Nana Jin et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. An increasing number of experiments have been designed to detect intracellular and intercellular molecular interactions. Based on these molecular interactions (especially protein interactions), molecular networks have been built for use in several typical applications, such as the discovery of new disease genes and the identification of drug targets and molecular complexes. Because the data are incomplete and a considerable number of false-positive interactions exist, protein interactions from different sources are commonly integrated in network analyses to build a stable molecular network. Although various types of integration strategies are being applied in current studies, the topological properties of the networks from these different integration strategies, especially typical applications based on these network integration strategies, have not been rigorously evaluated. In this paper, systematic analyses were performed to evaluate 11 frequently used methods using two types of integration strategies: empirical and machine learning methods. The topological properties of the networks of these different integration strategies were found to differ significantly. Moreover, these networks were found to dramatically affect the outcomes of typical applications, such as disease gene predictions, drug target detections, and molecular complex identifications. The analysis presented in this paper could provide an
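A small sketch shows how the choice of integration rule alone can change a network's topology. The three rules used here (union, majority vote, intersection) are generic illustrations and are not claimed to be among the 11 methods the paper evaluates; the protein names, the helper names (integrate, mean_degree), and the min_support parameter are all hypothetical.

```python
from collections import Counter


def normalize(edge):
    """Treat interactions as undirected: (B, A) and (A, B) are the same edge."""
    a, b = edge
    return (a, b) if a <= b else (b, a)


def integrate(sources, min_support=1):
    """Keep every interaction reported by at least `min_support` sources."""
    counts = Counter()
    for edges in sources:
        counts.update({normalize(e) for e in edges})  # one vote per source
    return {e for e, c in counts.items() if c >= min_support}


def mean_degree(edges):
    """Average number of interactions per protein present in the network."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sum(degree.values()) / len(degree) if degree else 0.0


# Three made-up interaction sources over proteins A..E.
sources = [
    [("A", "B"), ("B", "C"), ("C", "D")],
    [("B", "A"), ("B", "C"), ("D", "E")],
    [("A", "B"), ("C", "D"), ("A", "E")],
]

union = integrate(sources, min_support=1)         # 5 edges
majority = integrate(sources, min_support=2)      # 3 edges
intersection = integrate(sources, min_support=3)  # 1 edge
```

On this toy input the mean degree falls from 2.0 (union) to 1.5 (majority) to 1.0 (intersection), so any downstream analysis that depends on degree or connectivity would see a different network under each rule.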
unknown title
dependent partition-valued process for multitask clustering and time evolving network modelling
A coupled finite mixture model for transcriptional module discovery
Approaches to elucidate complex gene regulatory networks usually rely on the analysis of transcriptional modules (TMs). Two high-throughput technologies, gene expression microarray and Chromatin Immuno-Precipitation on Chip, often provide complementary information for discovering TMs. To efficiently integrate these two data sources, we propose a novel Bayesian model referred to as Coupled Finite Mixture Model (CFMM), which permits a separate clustering for each data source and also explicitly models their dependence. We validate our model on both a synthetic dataset and a real dataset. Our method is shown to find more consensus genes, and the resulting TMs have better biological functional coherence than those inferred by other state-of-the-art methods. Key Words: ChIP-chip data, gene expression, integrative clustering
Modelling transcriptional networks in leaf senescence
Journal of Experimental Botany, Review Paper, doi:10.1093/jxb/eru054
The process of leaf senescence is induced by an extensive range of developmental and environmental signals and controlled by multiple, cross-linking pathways, many of which overlap with plant stress-response signals. Elucidation of this complex regulation requires a step beyond a traditional one-gene-at-a-time analysis. Application of a more global analysis using statistical and mathematical tools of systems biology is an approach that is being applied to address this problem. A variety of modelling methods applicable to the analysis of current and future senescence data are reviewed and discussed using some senescence-specific examples. Network modelling with a senescence transcriptome time course followed by testing predictions with gene-expression data illustrates the application of systems biology tools. Key words: Gene regulation, modelling, senescence, systems biology, transcriptional networks.