Estimating high-dimensional directed acyclic graphs with the PC-algorithm
- Journal of Machine Learning Research
, 2005
Cited by 31 (4 self)
We consider the PC-algorithm (Spirtes et al., 2000) for estimating the skeleton and equivalence class of a very high-dimensional directed acyclic graph (DAG) with corresponding Gaussian distribution. The PC-algorithm is computationally feasible and often very fast for sparse problems with many nodes (variables), and it has the attractive property of automatically achieving high computational efficiency as a function of sparseness of the true underlying DAG. We prove uniform consistency of the algorithm for very high-dimensional, sparse DAGs where the number of nodes is allowed to grow quickly with sample size n, as fast as O(n^a) for any 0 < a < ∞. The sparseness assumption is rather minimal, requiring only that the neighborhoods in the DAG are of lower order than sample size n. We also demonstrate the PC-algorithm for simulated data.
Keywords: asymptotic consistency, DAG, graphical model, PC-algorithm, skeleton
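The abstract above does not say how conditional independence is tested. For Gaussian data a common choice is Fisher's z-transform of the partial correlation, and the sketch below assumes that test. The function names, the significance level `alpha`, and the three-variable chain example are illustrative assumptions, not taken from the paper.

```python
import math

def partial_correlation(corr, i, j, cond):
    """Partial correlation of variables i and j given the indices in cond,
    computed with the standard recursion on the size of the conditioning set."""
    if not cond:
        return corr[i][j]
    k, rest = cond[0], cond[1:]
    r_ij = partial_correlation(corr, i, j, rest)
    r_ik = partial_correlation(corr, i, k, rest)
    r_jk = partial_correlation(corr, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))

def fisher_z_independent(corr, n, i, j, cond, alpha=0.01):
    """Return True if 'i independent of j given cond' is NOT rejected at level alpha.
    Requires n > len(cond) + 3. alpha is a hypothetical default."""
    r = partial_correlation(corr, i, j, cond)
    r = min(max(r, -0.999999), 0.999999)        # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r))       # Fisher's z-transform
    stat = math.sqrt(n - len(cond) - 3) * abs(z)
    p_value = math.erfc(stat / math.sqrt(2))    # two-sided standard normal p-value
    return p_value > alpha

# Population correlations of a chain X -> Y -> Z with unit-variance noise
# (Var X = 1, Var Y = 2, Var Z = 3); used here instead of sampled data.
r_xy, r_xz, r_yz = 1 / math.sqrt(2), 1 / math.sqrt(3), 2 / math.sqrt(6)
chain_corr = [[1.0, r_xy, r_xz],
              [r_xy, 1.0, r_yz],
              [r_xz, r_yz, 1.0]]
```

With `chain_corr` and a nominal sample size of 500, the test keeps the X–Z dependence when the conditioning set is empty and finds X and Z independent given Y, which is the kind of decision a skeleton search uses to delete an edge.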
Kernel measures of conditional dependence
- In Adv. NIPS
, 2008
Cited by 31 (24 self)
We propose a new measure of conditional dependence of random variables, based on normalized cross-covariance operators on reproducing kernel Hilbert spaces. Unlike previous kernel dependence measures, the proposed criterion does not depend on the choice of kernel in the limit of infinite data, for a wide class of kernels. At the same time, it has a straightforward empirical estimate with good convergence behaviour. We discuss the theoretical properties of the measure, and demonstrate its application in experiments.
Learning Probabilistic Networks
- THE KNOWLEDGE ENGINEERING REVIEW
, 1998
Cited by 27 (1 self)
A probabilistic network is a graphical model that encodes probabilistic relationships between variables of interest. Such a model records qualitative influences between variables in addition to the numerical parameters of the probability distribution. As such it provides an ideal form for combining prior knowledge, which might be limited solely to experience of the influences between some of the variables of interest, and data. In this paper, we first show how data can be used to revise initial estimates of the parameters of a model. We then progress to showing how the structure of the model can be revised as data is obtained. Techniques for learning with incomplete data are also covered.
Expanding From Discrete To Continuous Estimation Of Distribution Algorithms: The IDEA
- In Parallel Problem Solving From Nature - PPSN VI
, 2000
Cited by 24 (7 self)
The direct application of statistics to stochastic optimization based on iterated density estimation has become more important and present in evolutionary computation over the last few years. The estimation of densities over selected samples and the sampling from the resulting distributions is a combination of the recombination and mutation steps used in evolutionary algorithms. We introduce the framework named IDEA to formalize this notion. By combining continuous probability theory with techniques from existing algorithms, this framework allows us to define new continuous evolutionary optimization algorithms.
1 Introduction
Algorithms in evolutionary optimization guide their search through statistics based on a vector of samples, often called a population. By using this stochastic information, non-deterministic induction is performed in order to attempt to use the structure of the search space and thereby aid the search for the optimal solution. In order to perform induct...
A robust procedure for Gaussian graphical model search from microarray data with p larger than n
- Journal of Machine Learning Research
, 2006
Cited by 19 (1 self)
Learning of large-scale networks of interactions from microarray data is an important and challenging problem in bioinformatics. A widely used approach is to assume that the available data constitute a random sample from a multivariate distribution belonging to a Gaussian graphical model. As a consequence, the prime objects of inference are full-order partial correlations, which are partial correlations between two variables given the remaining ones. In the context of microarray data, the number of variables exceeds the sample size, and this precludes the application of traditional structure learning procedures because a sampling version of full-order partial correlations does not exist. In this paper we consider limited-order partial correlations, which are partial correlations computed on marginal distributions of manageable size, and provide a set of rules that allow one to assess the usefulness of these quantities to derive the independence structure of the underlying Gaussian graphical model. Furthermore, we introduce a novel structure learning procedure based on a quantity, obtained from limited-order partial correlations, that we call the non-rejection rate. The applicability and usefulness of the procedure are demonstrated by both simulated and real data.
Inference and Learning in Hybrid Bayesian Networks
, 1998
Cited by 18 (2 self)
We survey the literature on methods for inference and learning in Bayesian Networks composed of discrete and continuous nodes, in which the continuous nodes have a multivariate Gaussian distribution whose mean and variance depend on the values of the discrete nodes. We also briefly consider hybrid Dynamic Bayesian Networks, an extension of switching Kalman filters. This report is meant to summarize what is known at a sufficient level of detail to enable someone to implement the algorithms, but without dwelling on formalities.
Remarks concerning graphical models for time series and point processes
- Revista de Econometria
, 1996
Cited by 18 (3 self)
A statistical network is a collection of nodes representing random variables and a set of edges that connect the nodes. A probabilistic model for such a network is called a graphical model. These models, graphs, and networks are particularly useful for examining statistical dependencies based on conditioning, as often occurs in economics and statistics. In this paper the nodal random variables will be time series or point processes. The focus is on the cases of undirected and directed graphs.
The Posterior Probability of Bayes Nets with Strong Dependences
- Soft Computing
, 1999
Cited by 14 (1 self)
Stochastic independence is an idealized relationship located at one end of a continuum of values measuring degrees of dependence. When modeling real-world systems, we are often not interested in the distinction between exact independence and any degree of dependence, but between weak, ignorable dependence and strong, substantial dependence. Good models map significant deviance from independence and neglect approximate independence or dependence weaker than a noise threshold. This intuition is applied to learning the structure of Bayes nets from data. We determine the conditional posterior probabilities of structures given that the degree of dependence at each of their nodes exceeds a critical noise level. Deviance from independence is measured by mutual information. Arc probabilities are determined by whether the amount of mutual information the neighbors contribute to a node is greater than a critical minimum deviance from independence. A χ² approximation for the probability density function of mutual info...
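To make the dependence measure in the abstract above concrete, the sketch below computes the mutual information of two discrete variables from a joint probability table and compares it with a threshold. The function names and the `noise_level` value are hypothetical; the paper's own posterior computation is not reproduced here.

```python
import math

def mutual_information(joint):
    """Mutual information (in nats) of two discrete variables, given their
    joint probability table as a list of rows that sums to 1."""
    row_marginals = [sum(row) for row in joint]
    col_marginals = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:  # zero cells contribute nothing
                mi += p * math.log(p / (row_marginals[a] * col_marginals[b]))
    return mi

def is_substantial(joint, noise_level=0.01):
    """True if the dependence exceeds the (hypothetical) noise threshold."""
    return mutual_information(joint) > noise_level

independent = [[0.25, 0.25], [0.25, 0.25]]   # exact independence
weak = [[0.26, 0.24], [0.24, 0.26]]          # slight deviation from independence
deterministic = [[0.5, 0.0], [0.0, 0.5]]     # one variable fixes the other
```

Under this threshold the `weak` table counts as ignorable dependence even though it is not exactly independent, which is the distinction the abstract draws.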
Enumerating Markov Equivalence Classes of Acyclic Digraph Models
- PROC. OF THE CONF. ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE
, 2001
Cited by 14 (0 self)
Graphical Markov models determined by acyclic digraphs (ADGs), also called directed acyclic graphs (DAGs), are widely studied in statistics, computer science (as Bayesian networks), operations research (as influence diagrams), and many related fields. Because different ADGs may determine the same Markov equivalence class, it has long been of interest to determine the efficiency gained in model specification and search by working directly with Markov equivalence classes of ADGs rather than with ADGs themselves. A computer program was written to enumerate the equivalence classes of ADG models as specified by Pearl & Verma's equivalence criterion. The program counted equivalence classes for models up to and including 10 vertices. The ratio of numbers of classes to ADGs appears to approach an asymptote of about 0.267. Classes were analyzed according to number of edges and class size. By edges, the distribution of number of classes approaches a Gaussian shape. By class size, classes of size 1 are most common, with the proportions for larger sizes initially decreasing but then following a more irregular pattern. The maximum number of classes generated by any undirected graph was found to increase approximately factorially. The program also includes a new variation of the orderly algorithm for generating undirected graphs.
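The equivalence criterion named in the abstract above states that two DAGs are Markov equivalent exactly when they share the same skeleton and the same v-structures (a -> c <- b with a and b non-adjacent). The brute-force sketch below applies that criterion to very small vertex counts. It is an illustration only, not the paper's enumeration program, and all function names are hypothetical.

```python
from itertools import combinations, product

def is_acyclic(n, edges):
    """Check acyclicity by repeatedly removing vertices with no incoming edge."""
    remaining = set(range(n))
    while remaining:
        sources = [v for v in remaining
                   if not any(b == v and a in remaining for a, b in edges)]
        if not sources:
            return False  # every remaining vertex has a parent: there is a cycle
        remaining -= set(sources)
    return True

def all_dags(n):
    """Yield every labeled DAG on n vertices as a frozenset of (parent, child)."""
    pairs = list(combinations(range(n), 2))
    # For each unordered pair: 0 = no edge, 1 = a -> b, 2 = b -> a.
    for choice in product((0, 1, 2), repeat=len(pairs)):
        edges = frozenset((a, b) if c == 1 else (b, a)
                          for (a, b), c in zip(pairs, choice) if c)
        if is_acyclic(n, edges):
            yield edges

def signature(edges):
    """Skeleton plus v-structures; equal signatures mean Markov equivalence."""
    skeleton = frozenset(frozenset(e) for e in edges)
    v_structures = frozenset(
        (frozenset((a, b)), c)
        for (a, c) in edges for (b, c2) in edges
        if c == c2 and a < b and frozenset((a, b)) not in skeleton)
    return skeleton, v_structures

def count_dags_and_classes(n):
    """Return (number of DAGs, number of Markov equivalence classes)."""
    dags = list(all_dags(n))
    return len(dags), len({signature(d) for d in dags})
```

For three vertices this gives 25 DAGs in 11 equivalence classes. The search space grows as 3^(n(n-1)/2), so this approach is only practical for a handful of vertices.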
Software Systems for Tabular Data Releases
- INT. J. UNCERTAINTY, FUZZINESS AND KNOWLEDGE BASED SYSTEMS
, 2002
Cited by 13 (12 self)
We describe two classes of software systems that release tabular summaries of an underlying database. Table servers respond to user queries for (marginal) sub-tables of the "full" table summarizing the entire database, and are characterized by dynamic assessment of disclosure risk, in light of previously answered queries. Optimal tabular releases are static releases of sets of sub-tables that are characterized by maximizing the amount of information released, as given by a measure of data utility, subject to a constraint on disclosure risk. Underlying abstractions (primarily associated with the query space, as well as released and unreleasable sub-tables and frontiers), computational algorithms and issues (especially scalability), and prototype software implementations are discussed.

