Results 1  10
of
54
Using Bayesian networks to analyze expression data
 Journal of Computational Biology
, 2000
"... DNA hybridization arrays simultaneously measure the expression level for thousands of genes. These measurements provide a “snapshot ” of transcription levels within the cell. A major challenge in computational biology is to uncover, from such measurements, gene/protein interactions and key biologica ..."
Abstract

Cited by 1077 (18 self)
 Add to MetaCart
(Show Context)
DNA hybridization arrays simultaneously measure the expression level for thousands of genes. These measurements provide a “snapshot ” of transcription levels within the cell. A major challenge in computational biology is to uncover, from such measurements, gene/protein interactions and key biological features of cellular systems. In this paper, we propose a new framework for discovering interactions between genes based on multiple expression measurements. This framework builds on the use of Bayesian networks for representing statistical dependencies. A Bayesian network is a graphbased model of joint multivariate probability distributions that captures properties of conditional independence between variables. Such models are attractive for their ability to describe complex stochastic processes and because they provide a clear methodology for learning from (noisy) observations. We start by showing how Bayesian networks can describe interactions between genes. We then describe a method for recovering gene interactions from microarray data using tools for learning Bayesian networks. Finally, we demonstrate this method on the S. cerevisiae cellcycle measurements of Spellman et al. (1998). Key words: gene expression, microarrays, Bayesian methods. 1.
Learning Bayesian Networks With Local Structure
, 1996
"... . We examine a novel addition to the known methods for learning Bayesian networks from data that improves the quality of the learned networks. Our approach explicitly represents and learns the local structure in the conditional probability distributions (CPDs) that quantify these networks. This inc ..."
Abstract

Cited by 273 (13 self)
 Add to MetaCart
. We examine a novel addition to the known methods for learning Bayesian networks from data that improves the quality of the learned networks. Our approach explicitly represents and learns the local structure in the conditional probability distributions (CPDs) that quantify these networks. This increases the space of possible models, enabling the representation of CPDs with a variable number of parameters. The resulting learning procedure induces models that better emulate the interactions present in the data. We describe the theoretical foundations and practical aspects of learning local structures and provide an empirical evaluation of the proposed learning procedure. This evaluation indicates that learning curves characterizing this procedure converge faster, in the number of training instances, than those of the standard procedure, which ignores the local structure of the CPDs. Our results also show that networks learned with local structures tend to be more complex (in terms of a...
Adaptive Probabilistic Networks with Hidden Variables
 Machine Learning
, 1997
"... . Probabilistic networks (also known as Bayesian belief networks) allow a compact description of complex stochastic relationships among several random variables. They are rapidly becoming the tool of choice for uncertain reasoning in artificial intelligence. In this paper, we investigate the problem ..."
Abstract

Cited by 185 (10 self)
 Add to MetaCart
. Probabilistic networks (also known as Bayesian belief networks) allow a compact description of complex stochastic relationships among several random variables. They are rapidly becoming the tool of choice for uncertain reasoning in artificial intelligence. In this paper, we investigate the problem of learning probabilistic networks with known structure and hidden variables. This is an important problem, because structure is much easier to elicit from experts than numbers, and the world is rarely fully observable. We present a gradientbased algorithmand show that the gradient can be computed locally, using information that is available as a byproduct of standard probabilistic network inference algorithms. Our experimental results demonstrate that using prior knowledge about the structure, even with hidden variables, can significantly improve the learning rate of probabilistic networks. We extend the method to networks in which the conditional probability tables are described using a ...
Bayesian Optimization Algorithm: From Single Level to Hierarchy
, 2002
"... There are four primary goals of this dissertation. First, design a competent optimization algorithm capable of learning and exploiting appropriate problem decomposition by sampling and evaluating candidate solutions. Second, extend the proposed algorithm to enable the use of hierarchical decompositi ..."
Abstract

Cited by 101 (19 self)
 Add to MetaCart
(Show Context)
There are four primary goals of this dissertation. First, design a competent optimization algorithm capable of learning and exploiting appropriate problem decomposition by sampling and evaluating candidate solutions. Second, extend the proposed algorithm to enable the use of hierarchical decomposition as opposed to decomposition on only a single level. Third, design a class of difficult hierarchical problems that can be used to test the algorithms that attempt to exploit hierarchical decomposition. Fourth, test the developed algorithms on the designed class of problems and several realworld applications. The dissertation proposes the Bayesian optimization algorithm (BOA), which uses Bayesian networks to model the promising solutions found so far and sample new candidate solutions. BOA is theoretically and empirically shown to be capable of both learning a proper decomposition of the problem and exploiting the learned decomposition to ensure robust and scalable search for the optimum across a wide range of problems. The dissertation then identifies important features that must be incorporated into the basic BOA to solve problems that are not decomposable on a single level, but that can still be solved by decomposition over multiple levels of difficulty. Hierarchical
Learning Evaluation Functions for Global Optimization and Boolean Satisfiability
 In Proc. of 15th National Conf. on Artificial Intelligence (AAAI
, 1998
"... This paper describes STAGE, a learning approach to automatically improving search performance on optimization problems. STAGE learns an evaluation function which predicts the outcome of a local search algorithm, such as hillclimbing or WALKSAT, as a function of state features along its search ..."
Abstract

Cited by 65 (3 self)
 Add to MetaCart
This paper describes STAGE, a learning approach to automatically improving search performance on optimization problems. STAGE learns an evaluation function which predicts the outcome of a local search algorithm, such as hillclimbing or WALKSAT, as a function of state features along its search trajectories. The learned evaluation function is used to bias future search trajectories toward better optima. We present positive results on six largescale optimization domains.
Learning factor graphs in polynomial time and sample complexity
 JMLR
, 2006
"... We study the computational and sample complexity of parameter and structure learning in graphical models. Our main result shows that the class of factor graphs with bounded degree can be learned in polynomial time and from a polynomial number of training examples, assuming that the data is generated ..."
Abstract

Cited by 65 (0 self)
 Add to MetaCart
We study the computational and sample complexity of parameter and structure learning in graphical models. Our main result shows that the class of factor graphs with bounded degree can be learned in polynomial time and from a polynomial number of training examples, assuming that the data is generated by a network in this class. This result covers both parameter estimation for a known network structure and structure learning. It implies as a corollary that we can learn factor graphs for both Bayesian networks and Markov networks of bounded degree, in polynomial time and sample complexity. Importantly, unlike standard maximum likelihood estimation algorithms, our method does not require inference in the underlying network, and so applies to networks where inference is intractable. We also show that the error of our learned model degrades gracefully when the generating distribution is not a member of the target class of networks. In addition to our main result, we show that the sample complexity of parameter learning in graphical models has an O(1) dependence on the number of variables in the model when using the KLdivergence normalized by the number of variables as the performance criterion.
Learning Evaluation Functions to Improve Optimization by Local Search
 Journal of Machine Learning Research
, 2000
"... This paper describes algorithms that learn to improve search performance on largescale optimization tasks. The main algorithm, Stage, works by learning an evaluation function that predicts the outcome of a local search algorithm, such as hillclimbing or Walksat, from features of states visited durin ..."
Abstract

Cited by 59 (0 self)
 Add to MetaCart
(Show Context)
This paper describes algorithms that learn to improve search performance on largescale optimization tasks. The main algorithm, Stage, works by learning an evaluation function that predicts the outcome of a local search algorithm, such as hillclimbing or Walksat, from features of states visited during search. The learned evaluation function is then used to bias future search trajectories toward better optima on the same problem. Another algorithm, XStage, transfers previously learned evaluation functions to new, similar optimization problems. Empirical results are provided on seven largescale optimization domains: binpacking, channel routing, Bayesian network structurefinding, radiotherapy treatment planning, cartogram design, Boolean satisfiability, and Boggle board setup.
Feature Subset Selection by Bayesian networks: a comparison with genetic and sequential algorithms
"... In this paper we perform a comparison among FSSEBNA, a randomized, populationbased and evolutionary algorithm, and two genetic and other two sequential search approaches in the well known Feature Subset Selection (FSS) problem. In FSSEBNA, the FSS problem, stated as a search problem, uses the E ..."
Abstract

Cited by 55 (15 self)
 Add to MetaCart
In this paper we perform a comparison among FSSEBNA, a randomized, populationbased and evolutionary algorithm, and two genetic and other two sequential search approaches in the well known Feature Subset Selection (FSS) problem. In FSSEBNA, the FSS problem, stated as a search problem, uses the EBNA (Estimation of Bayesian Network Algorithm) search engine, an algorithm within the EDA (Estimation of Distribution Algorithm) approach. The EDA paradigm is born from the roots of the GA community in order to explicitly discover the relationships among the features of the problem and not disrupt them by genetic recombination operators. The EDA paradigm avoids the use of recombination operators and it guarantees the evolution of the population of solutions and the discovery of these relationships by the factorization of the probability distribution of best individuals in each generation of the search. In EBNA, this factorization is carried out by a Bayesian network induced by a chea...
Optimization by learning and simulation of Bayesian and Gaussian networks
, 1999
"... Estimation of Distribution Algorithms (EDA) constitute an example of stochastics heuristics based on populations of individuals every of which encode the possible solutions to the optimization problem. These populations of individuals evolve in succesive generations as the search progresses  organ ..."
Abstract

Cited by 54 (7 self)
 Add to MetaCart
Estimation of Distribution Algorithms (EDA) constitute an example of stochastics heuristics based on populations of individuals every of which encode the possible solutions to the optimization problem. These populations of individuals evolve in succesive generations as the search progresses  organized in the same way as most evolutionary computation heuristics. In opposition to most evolutionary computation paradigms which consider the crossing and mutation operators as essential tools to generate new populations, EDA replaces those operators by the estimation and simulation of the joint probability distribution of the selected individuals. In this work, after making a review of the different approaches based on EDA for problems of combinatorial optimization as well as for problems of optimization in continuous domains, we propose new approaches based on the theory of probabilistic graphical models to solve problems in both domains. More precisely, we propose to adapt algorit...
Learning Bayesian Nets that Perform Well
 In UAI97
, 1997
"... A Bayesian net (BN) is more than a succinct way to encode a probabilistic distribution; it also corresponds to a function used to answer queries. A BN can therefore be evaluated by the accuracy of the answers it returns. Many algorithms for learning BNs, however, attempt to optimize another criterio ..."
Abstract

Cited by 46 (16 self)
 Add to MetaCart
(Show Context)
A Bayesian net (BN) is more than a succinct way to encode a probabilistic distribution; it also corresponds to a function used to answer queries. A BN can therefore be evaluated by the accuracy of the answers it returns. Many algorithms for learning BNs, however, attempt to optimize another criterion (usually likelihood, possibly augmented with a regularizing term), which is independent of the distribution of queries that are posed. This paper takes the "performance criteria" seriously, and considers the challenge of computing the BN whose performance  read "accuracy over the distribution of queries"  is optimal. We show that many aspects of this learning task are more difficult than the corresponding subtasks in the standard model. To appear in Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI97), Providence, RI, August 1997. 1 INTRODUCTION Many tasks require answering questions; this model applies, for example, to both expert systems th...