Theory-based causal induction
 In
, 2003
Abstract

Cited by 33 (14 self)
Inducing causal relationships from observations is a classic problem in scientific inference, statistics, and machine learning. It is also a central part of human learning, and a task that people perform remarkably well given its notorious difficulties. People can learn causal structure in various settings, from diverse forms of data: observations of the co-occurrence frequencies between causes and effects, interactions between physical objects, or patterns of spatial or temporal coincidence. These different modes of learning are typically thought of as distinct psychological processes and are rarely studied together, but at heart they present the same inductive challenge: identifying the unobservable mechanisms that generate observable relations between variables, objects, or events, given only sparse and limited data. We present a computational-level analysis of this inductive problem and a framework for its solution, which allows us to model all these forms of causal learning in a common language. In this framework, causal induction is the product of domain-general statistical inference guided by domain-specific prior knowledge, in the form of an abstract causal theory. We identify 3 key aspects of abstract prior knowledge (the ontology of entities, properties, and relations that organizes a domain; the plausibility of specific causal relationships; and the functional form of those relationships) and show how they provide the constraints that people need to induce useful causal models from sparse data.
Structured priors for structure learning
 In Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI)
, 2006
Abstract

Cited by 19 (8 self)
Traditional approaches to Bayes net structure learning typically assume little regularity in graph structure other than sparseness. However, in many cases, we expect more systematicity: variables in real-world systems often group into classes that predict the kinds of probabilistic dependencies they participate in. Here we capture this form of prior knowledge in a hierarchical Bayesian framework, and exploit it to enable structure learning and type discovery from small datasets. Specifically, we present a nonparametric generative model for directed acyclic graphs as a prior for Bayes net structure learning. Our model assumes that variables come in one or more classes and that the prior probability of an edge existing between two variables is a function only of their classes. We derive an MCMC algorithm for simultaneous inference of the number of classes, the class assignments of variables, and the Bayes net structure over variables. For several realistic, sparse datasets, we show that the bias towards systematicity of connections provided by our model can yield more accurate learned networks than the traditional approach of using a uniform prior, and that the classes found by our model are appropriate.
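The central assumption in this abstract, that the prior probability of an edge depends only on the classes of its two endpoints, can be illustrated with a small Python sketch. This is not the authors' model: the class assignments and the edge-probability table `eta` are fixed by hand here (the paper infers them, along with the number of classes), acyclicity is not enforced or normalized for, and the names `log_edge_prior`, `classes`, and `eta` are hypothetical.

```python
import math

def log_edge_prior(adj, classes, eta):
    """Log prior of a directed graph under a class-based edge model.

    adj[i][j] == 1 means there is an edge i -> j.
    classes[i] is the (assumed, hand-picked) class of variable i.
    eta[a][b] is the assumed probability of an edge from a class-a
    variable to a class-b variable. Acyclicity is not checked.
    """
    n = len(adj)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-loops
            p = eta[classes[i]][classes[j]]
            total += math.log(p if adj[i][j] else 1.0 - p)
    return total

# Illustrative setup: two class-0 variables, two class-1 variables,
# with edges from class 0 to class 1 assumed to be likely.
classes = [0, 0, 1, 1]
eta = [[0.05, 0.80],
       [0.05, 0.05]]

# Every class-0 variable points to every class-1 variable.
systematic = [[0, 0, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 0, 0],
              [0, 0, 0, 0]]

# Edges 0 -> 1, 2 -> 0, 2 -> 3: none follows the class pattern.
scattered = [[0, 1, 0, 0],
             [0, 0, 0, 0],
             [1, 0, 0, 1],
             [0, 0, 0, 0]]

print(log_edge_prior(systematic, classes, eta))
print(log_edge_prior(scattered, classes, eta))
```

Under these hand-picked values the systematic graph receives the higher log prior, which is the kind of bias towards class-consistent connections the abstract describes.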
Probabilistic discovery of overlapping cellular processes and their regulation
 J Comput Biol
, 2004
Abstract

Cited by 19 (1 self)
Many of the functions carried out by a living cell are regulated at the transcriptional level, to ensure that genes are expressed when they are needed. Thus, to understand biological processes, it is necessary to understand the cell’s transcriptional network. In this paper, we propose a novel probabilistic model of gene regulation for the task of identifying overlapping biological processes and the regulatory mechanism controlling their activation. A key feature of our approach is that we allow genes to participate in multiple processes, thus providing a more biologically plausible model for the process of gene regulation. We present an algorithm to learn this model automatically from data, using only genome-wide measurements of gene expression as input. We compare our results to those obtained by other approaches, and show that significant benefits can be gained by modeling both the organization of genes into overlapping cellular processes and the regulatory programs of these processes. Moreover, our method successfully grouped genes known to function together, recovered many regulatory relationships that are known in the literature, and suggested novel hypotheses regarding the regulatory role of previously uncharacterized proteins.
Bayesian Network Learning with Parameter Constraints
, 2006
Abstract

Cited by 19 (2 self)
The task of learning models for many real-world problems requires incorporating domain knowledge into learning algorithms, to enable accurate learning from a realistic volume of training data.
Exploiting parameter domain knowledge for learning in Bayesian networks
 Carnegie Mellon University
, 2005
Abstract

Cited by 9 (1 self)
implied, of any sponsoring institution, the U.S. government or any other entity.
Finding Optimal Bayesian Network Given a Super-Structure
Abstract

Cited by 7 (0 self)
Classical approaches to learning Bayesian network structure from data have drawbacks in complexity and in the accuracy of their results. However, a recent empirical study has shown that a hybrid algorithm markedly improves both accuracy and speed: it learns a skeleton with an independency test (IT) approach and uses it to constrain the directed acyclic graphs (DAGs) considered during the search-and-score phase. We formalize this structural constraint by introducing the concept of a super-structure S, an undirected graph that restricts the search to networks whose skeleton is a subgraph of S. We develop a super-structure constrained optimal search (COS): its time complexity is upper bounded by O(γ_m^n), where γ_m < 2 depends on the maximal degree m of S. Empirically, complexity depends on the average degree m̃, and sparse structures allow larger graphs to be calculated. Our algorithm is faster than an optimal search by several orders of magnitude and even finds more accurate results when given a sound super-structure. In practice, S can be approximated by IT approaches; the significance level of the tests controls its sparseness, making it possible to control the trade-off between speed and accuracy. For incomplete super-structures, a greedily post-processed version (COS+) still significantly outperforms other heuristic searches. Keywords: Bayesian networks, structure learning, optimal search, super-structure, connected subset
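To make the super-structure constraint concrete, the following Python sketch shows how an undirected graph S limits the parent sets a search has to consider: a node's parents may only be drawn from its neighbors in S. This illustrates the constraint only; it is not the COS algorithm, and the function names `candidate_parent_sets` and `skeleton_within`, the dictionary encoding of S, and the three-node example are all assumptions made for illustration.

```python
from itertools import combinations

def candidate_parent_sets(node, super_structure, max_size=None):
    """Yield every parent set allowed for `node` under the super-structure.

    super_structure maps each node to the set of its neighbors in the
    undirected graph S; parents may only come from those neighbors.
    """
    neighbors = sorted(super_structure[node])
    limit = len(neighbors) if max_size is None else min(max_size, len(neighbors))
    for k in range(limit + 1):
        for subset in combinations(neighbors, k):
            yield frozenset(subset)

def skeleton_within(dag_edges, super_structure):
    """Return True if every directed edge (parent, child) is an edge of S."""
    return all(child in super_structure[parent] for parent, child in dag_edges)

# Hypothetical super-structure: the chain A - B - C.
S = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}

sets_B = list(candidate_parent_sets("B", S))
print(sets_B)  # the subsets of {A, C}

print(skeleton_within([("A", "B"), ("B", "C")], S))  # edges lie inside S
print(skeleton_within([("A", "C")], S))              # A - C is not in S
```

In this toy example node B has four candidate parent sets instead of the four it would have anyway with two other nodes, but A and C each drop from four candidates to two; on larger sparse super-structures the reduction is what makes the constrained search cheaper.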
A Functional and Regulatory Map of Asthma
Abstract

Cited by 6 (2 self)
The prevalence and morbidity of asthma, a chronic inflammatory airway disease, are increasing. Animal models provide a meaningful but limited view of the mechanisms of asthma in humans. A systems-level view of asthma that integrates multiple levels of molecular and functional information is needed. For this, we compiled a gene expression compendium from five publicly available mouse microarray datasets and a gene knowledge base of 4,305 gene annotation sets. Using this collection we generated a high-level map of the functional themes that characterize animal models of asthma, dominated by innate and adaptive immune response. We used Module Networks analysis to identify co-regulated gene modules. The resulting modules reflect four distinct responses to treatment, including early response, general induction, repression, and IL-13-dependent response. One module with a persistent induction in response to treatment is mainly composed of genes with suggested roles in asthma,
Compact modeling of data using independent variable group analysis
 IEEE Transactions on Neural Networks
, 2007
Abstract

Cited by 4 (2 self)
We introduce a modeling approach called independent variable group analysis (IVGA) which can be used for finding an efficient structural representation for a given data set. The basic idea is to determine a grouping of the variables of the data set such that mutually dependent variables are grouped together whereas mutually independent or weakly dependent variables end up in separate groups. Computation of an IVGA model requires a combinatorial algorithm for grouping of the variables and a modeling algorithm for the groups. In order to be able to compare different groupings, a cost function which reflects the quality of a grouping is also required. Such a cost function can be derived, for example, using the variational Bayesian approach, which is employed in our study. This approach is also shown to be approximately equivalent to minimizing the mutual information between the groups. The modeling task is computationally demanding. We describe an efficient heuristic grouping algorithm for the variables and derive a computationally light nonlinear mixture model for modeling of the dependencies within the groups. Finally, we carry out a set of experiments which indicate that IVGA may turn out to be beneficial in many different applications. Index Terms: compact modeling, independent variable group analysis, mutual information, variable grouping, variational Bayesian learning
Learning gene regulatory networks via globally regularized risk minimization
 In Proceedings of the Fifth Annual RECOMB Satellite Workshop on Comparative Genomics
, 2007
Abstract

Cited by 3 (2 self)
Learning the structure of a gene regulatory network from time-series gene expression data is a significant challenge. Most approaches proposed in the literature to date attempt to predict the regulators of each target gene individually, but fail to share regulatory information between related genes. In this paper, we propose a new globally regularized risk minimization approach to address this problem. Our approach first clusters genes according to their time-series expression profiles, identifying related groups of genes. Given a clustering, we then develop a simple technique that exploits the assumption that genes with similar expression patterns are likely to be co-regulated by encouraging the genes in the same group to share common regulators. Our experiments on both synthetic and real gene expression data suggest that our new approach is more effective at identifying important transcription-factor-based regulatory mechanisms than the standard independent approach and a prototype-based approach.