Results 1 - 10
of
107
Learning Module Networks
, 2003
"... Methods for learning Bayesian networks can discover dependency structure between observed variables. Although these methods are useful in many applications, they run into computational and statistical problems in domains that involve a large number of variables. In this paper, we ..."
Abstract
-
Cited by 30 (4 self)
- Add to MetaCart
Methods for learning Bayesian networks can discover dependency structure between observed variables. Although these methods are useful in many applications, they run into computational and statistical problems in domains that involve a large number of variables. In this paper, we
Theory-based causal induction
- In
, 2003
"... Inducing causal relationships from observations is a classic problem in scientific inference, statistics, and machine learning. It is also a central part of human learning, and a task that people perform remarkably well given its notorious difficulties. People can learn causal structure in various s ..."
Abstract
-
Cited by 23 (13 self)
- Add to MetaCart
Inducing causal relationships from observations is a classic problem in scientific inference, statistics, and machine learning. It is also a central part of human learning, and a task that people perform remarkably well given its notorious difficulties. People can learn causal structure in various settings, from diverse forms of data: observations of the co-occurrence frequencies between causes and effects, interactions between physical objects, or patterns of spatial or temporal coincidence. These different modes of learning are typically thought of as distinct psychological processes and are rarely studied together, but at heart they present the same inductive challenge—identifying the unobservable mechanisms that generate observable relations between variables, objects, or events, given only sparse and limited data. We present a computational-level analysis of this inductive problem and a framework for its solution, which allows us to model all these forms of causal learning in a common language. In this framework, causal induction is the product of domain-general statistical inference guided by domain-specific prior knowledge, in the form of an abstract causal theory. We identify 3 key aspects of abstract prior knowledge—the ontology of entities, properties, and relations that organizes a domain; the plausibility of specific causal relationships; and the functional form of those relationships—and show how they provide the constraints that people need to induce useful causal models from sparse data.
Biclustering Algorithms: A Survey
- In Handbook of Computational Molecular Biology Edited by: Aluru S. Chapman & Hall/CRC Computer and Information Science Series
, 2005
"... Analysis of large scale geonomics data, notably gene expression, has initially focused on clustering methods. Recently, biclustering techniques were proposed for revealing submatrices showing unique patterns. We review some of the algorithmic approaches to biclustering and discuss their properties. ..."
Abstract
-
Cited by 17 (0 self)
- Add to MetaCart
Analysis of large scale geonomics data, notably gene expression, has initially focused on clustering methods. Recently, biclustering techniques were proposed for revealing submatrices showing unique patterns. We review some of the algorithmic approaches to biclustering and discuss their properties. 1
Probabilistic discovery of overlapping cellular processes and their regulation
- J Comput Biol
, 2004
"... Many of the functions carried out by a living cell are regulated at the transcriptional level, to ensure that genes are expressed when they are needed. Thus, to understand biological processes, it is thus necessary to understand the cell’s transcriptional network. In this paper, we propose a novel p ..."
Abstract
-
Cited by 16 (1 self)
- Add to MetaCart
Many of the functions carried out by a living cell are regulated at the transcriptional level, to ensure that genes are expressed when they are needed. Thus, to understand biological processes, it is thus necessary to understand the cell’s transcriptional network. In this paper, we propose a novel probabilistic model of gene regulation for the task of identifying overlapping biological processes and the regulatory mechanism controlling their activation. A key feature of our approach is that we allow genes to participate in multiple processes, thus providing a more biologically plausible model for the process of gene regulation. We present an algorithm to learn this model automatically from data, using only genome-wide measurements of gene expression as input. We compare our results to those obtained by other approaches, and show significant benefits can be gained by modeling both the organization of genes into overlapping cellular processes and the regulatory programs of these processes. Moreover, our method successfully grouped genes known to function together, recovered many regulatory relationships that are known in the literature, and suggested novel hypotheses regarding the regulatory role of previously uncharacterized proteins.
Associative clustering for exploring dependencies between functional genomics data sets
- IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
, 2005
"... ..."
Information bottleneck for non co-occurrence data
- In Advances in Neural Information Processing Systems 19
, 2007
"... We present a general model-independent approach to the analysis of data in cases when these data do not appear in the form of co-occurrence of two variables X, Y, but rather as a sample of values of an unknown (stochastic) function Z(X, Y). For example, in gene expression data, the expression level ..."
Abstract
-
Cited by 11 (5 self)
- Add to MetaCart
We present a general model-independent approach to the analysis of data in cases when these data do not appear in the form of co-occurrence of two variables X, Y, but rather as a sample of values of an unknown (stochastic) function Z(X, Y). For example, in gene expression data, the expression level Z is a function of gene X and condition Y; or in movie ratings data the rating Z is a function of viewer X and movie Y. The approach represents a consistent extension of the Information Bottleneck method that has previously relied on the availability of co-occurrence statistics. By altering the relevance variable we eliminate the need in the sample of joint distribution of all input variables. This new formulation also enables simple MDL-like model complexity control and prediction of missing values of Z. The approach is analyzed and shown to be on a par with the best known clustering algorithms for a wide range of domains. For the prediction of missing values (collaborative filtering) it improves the currently best known results. 1
An extended transcriptional regulatory network of Escherichia coli and analysis of its hierarchical structure and network motifs
, 2004
"... ..."
An integrative approach for causal gene identification and gene regulatory pathway inference
- Bioinformatics
, 2006
"... doi:10.1093/bioinformatics/btl234 ..."

