Reverse Engineering of Molecular Networks from a Common Combinatorial Approach
Multiscale Binarization of Gene Expression Data for Reconstructing Boolean Networks
 IEEE/ACM transactions on computational biology and bioinformatics
, 2011
Cited by 2 (0 self)
Abstract—Network inference algorithms can assist life scientists in unraveling gene-regulatory systems on a molecular level. In recent years, great attention has been drawn to the reconstruction of Boolean networks from time series. These need to be binarized, as such networks model genes as binary variables (either “expressed” or “not expressed”). Common binarization methods often cluster measurements or separate them according to statistical or information theoretic characteristics and may require many data points to determine a robust threshold. Yet, time series measurements frequently comprise only a small number of samples. To overcome this limitation, we propose a binarization that incorporates measurements at multiple resolutions. We introduce two such binarization approaches which determine thresholds based on limited numbers of samples and additionally provide a measure of threshold validity. Thus, network reconstruction and further analysis can be restricted to genes with meaningful thresholds. This reduces the complexity of network inference. The performance of our binarization algorithms was evaluated in network reconstruction experiments using artificial data as well as real-world yeast expression time series. The new approaches yield considerably improved correct network identification rates compared to other binarization techniques by effectively reducing the amount of candidate networks. Index Terms—Binarization, gene-regulatory networks, Boolean networks, reconstruction.
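To make the binarization step concrete, the sketch below thresholds a short expression time series at the midpoint of the widest gap between its sorted values. This is a deliberately simple stand-in, not the multiscale method the abstract proposes, and it does not compute the threshold validity measure the authors describe. The function name `binarize_by_largest_gap` and the sample values are hypothetical.

```python
def binarize_by_largest_gap(values):
    """Binarize a short series by thresholding at the widest gap
    between adjacent sorted values (illustrative baseline only)."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    ordered = sorted(values)
    # Gap between each pair of adjacent sorted values, with its position.
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    _, idx = max(gaps)
    # Place the threshold in the middle of the widest gap.
    threshold = (ordered[idx] + ordered[idx + 1]) / 2.0
    bits = [1 if v > threshold else 0 for v in values]
    return threshold, bits


# Hypothetical six-point time series for one gene.
series = [0.1, 0.2, 0.15, 0.9, 1.0, 0.95]
threshold, bits = binarize_by_largest_gap(series)
print(threshold, bits)
```

With so few samples, a single widest-gap threshold can be unstable, which is the limitation the abstract's multi-resolution approach is meant to address.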
Faster Mass Spectrometry-based Protein Inference: Junction Trees are More Efficient than Sampling and Marginalization by Enumeration
Cited by 1 (0 self)
The problem of identifying the proteins in a complex mixture using tandem mass spectrometry can be framed as an inference problem on a graph that connects peptides to proteins. Several existing protein identification methods make use of statistical inference methods for graphical models, including expectation maximization, Markov chain Monte Carlo, and full marginalization coupled with approximation heuristics. We show that, for this problem, the majority of the cost of inference usually comes from a few highly connected subgraphs. Furthermore, we evaluate three different statistical inference methods using a common graphical model, and we demonstrate that junction tree inference substantially improves rates of convergence compared to existing methods. The python code used for this paper is available at
An Unsupervised Conditional Random Fields Approach for Clustering Gene Expression Time Series
, 2008
Cited by 1 (0 self)
Motivation: There is a growing interest in extracting statistical patterns from gene expression time series data, in which a key challenge is the development of stable and accurate probabilistic models. Currently popular models, however, would be computationally prohibitive unless some independence assumptions are made to describe large scale data. We propose an unsupervised conditional random fields model to overcome this problem by progressively infusing information into the labelling process through a small variable voting pool. Results: An unsupervised conditional random fields model (CRF) is proposed for efficient analysis of gene expression time series and is successfully applied to gene class discovery and class prediction. The proposed model treats each time series as a random field and assigns an optimal cluster label to each time series, so as to partition the time series into clusters without a priori knowledge about the number of clusters and the initial centroids. Another advantage of the proposed method is the relaxation of independence assumptions.
Model-checking based Approaches to Parameter Estimation of Gene Regulatory Networks
Abstract—The expression of genes is a fundamental process in living cells, both eukaryotic and prokaryotic. The regulation of gene expression is achieved via sophisticated networks of interactions between DNA, RNA, proteins, and small chemical compounds. The qualitative and quantitative characterisation of interactions between genes is one of the major current research targets in systems biology. In this PhD research project, we view gene regulatory networks as Markov chains, resulting from popular formalisation frameworks such as Dynamic Bayesian Networks and Probabilistic Boolean Networks. This will allow us to reason about both the structure and strength of gene interactions. Our goal is to develop new algorithms and tools, which are tailored for the modelling and analysis of gene regulatory networks, by exploring model checking techniques that have been developed and widely used in computer science. More specifically, we will combine model checking techniques with sampling and optimisation methods from the literature to derive new techniques to solve the parameter estimation problem of Markov models of gene regulatory networks.
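As a rough illustration of how a Probabilistic Boolean Network induces a Markov chain, the sketch below builds the state transition matrix for a tiny network in which each gene picks one of its update rules independently at every step. It does not implement the authors' model checking or parameter estimation techniques. The function `pbn_transition_matrix`, the two-gene network, and its rule probabilities are all hypothetical.

```python
from itertools import product


def pbn_transition_matrix(rule_sets, n_genes):
    """Build the Markov transition matrix of a small probabilistic
    Boolean network. rule_sets[g] is a list of (probability, function)
    pairs for gene g; each function maps a state tuple to 0 or 1."""
    states = list(product([0, 1], repeat=n_genes))
    index = {s: i for i, s in enumerate(states)}
    matrix = [[0.0] * len(states) for _ in states]
    for s in states:
        # Enumerate every combination of one rule per gene.
        for choice in product(*rule_sets):
            prob = 1.0
            nxt = []
            for p, f in choice:
                prob *= p
                nxt.append(int(f(s)))
            matrix[index[s]][index[tuple(nxt)]] += prob
    return states, matrix


# Hypothetical two-gene network (assumed rules and probabilities).
rules = [
    [(0.7, lambda s: s[1]), (0.3, lambda s: s[0])],  # gene 0
    [(1.0, lambda s: not s[0])],                     # gene 1
]
states, P = pbn_transition_matrix(rules, 2)
for s, row in zip(states, P):
    print(s, row)
```

Each row of `P` sums to one, so the entries can be read as the interaction strengths that a parameter estimation procedure would try to recover from data.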
Structural Systems Identification of Genetic Regulatory Networks
 Bioinformatics, Original Paper (Systems biology), doi:10.1093/bioinformatics/btm623
Motivation: Reverse engineering of genetic regulatory networks from experimental data is the first step toward the modeling of genetic networks. Linear state-space models, also known as linear dynamical models, have been applied to model genetic networks from gene expression time series data, but existing works have not taken into account available structural information. Without structural constraints, estimated models may contradict biological knowledge and estimation methods may overfit. Results: In this report, we extended expectation-maximization (EM) algorithms to incorporate prior network structure and to estimate genetic regulatory networks that can track and predict gene expression profiles. We applied our method to synthetic data and to SOS data and showed that our method significantly outperforms the regular EM without structural constraints. Availability: The Matlab code is available upon request and the SOS data can be downloaded from
Computational modeling of Caenorhabditis elegans
 Bioinformatics, Vol. 23, ISMB/ECCB 2007, pages i499–i507, doi:10.1093/bioinformatics/btm214
Motivation: Caenorhabditis elegans vulval development is a paradigmatic example of animal organogenesis with extensive experimental data. During vulval induction, each of the six multipotent vulval precursor cells (VPCs) commits to one of three fates (1,2,3). The precise 123 formation of VPC fates is controlled by a network of intercellular signaling, intracellular signal transduction and transcriptional regulation. The construction of mathematical models for this network will enable hypothesis generation, biological mechanism discovery and system behavior analysis. Results: We have developed a mathematical model based on dynamic Bayesian networks to model the biological network that governs the VPC 123 pattern formation process. Our model has six interconnected subnetworks corresponding to six VPCs. Each VPC subnetwork contains 20 components. The causal relationships among network components are quantitatively encoded in the structure and parameters of the model. Statistical machine learning techniques were developed to automatically learn both the structure and parameters of the model from data collected from the literature. The learned model is capable of simulating vulval induction under 36 different genetic conditions. Our model also contains a few hypothetical causal relationships between network components, and hence can serve as guidance for designing future experiments. The statistical learning nature of our methodology makes it easy not only to handle noise in data but also to automatically incorporate new experimental data to refine the model. Contact: