Results 1–10 of 13
A Tutorial on Learning Bayesian Networks
Communications of the ACM, 1995
"... We examine a graphical representation of uncertain knowledge called a Bayesian network. The representation is easy to construct and interpret, yet has formal probabilistic semantics making it suitable for statistical manipulation. We show how we can use the representation to learn new knowledge by c ..."
Abstract

Cited by 299 (13 self)
 Add to MetaCart
We examine a graphical representation of uncertain knowledge called a Bayesian network. The representation is easy to construct and interpret, yet has formal probabilistic semantics, making it suitable for statistical manipulation. We show how we can use the representation to learn new knowledge by combining domain knowledge with statistical data.

1 Introduction

Many techniques for learning rely heavily on data. In contrast, the knowledge encoded in expert systems usually comes solely from an expert. In this paper, we examine a knowledge representation, called a Bayesian network, that lets us have the best of both worlds. Namely, the representation allows us to learn new knowledge by combining expert domain knowledge and statistical data. A Bayesian network is a graphical representation of uncertain knowledge that most people find easy to construct and interpret. In addition, the representation has formal probabilistic semantics, making it suitable for statistical manipulation (Howard,...
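As a toy illustration of what the abstract describes, the sketch below (not from the paper; the three-node chain and every probability table are made-up numbers) shows the core mechanics of a Bayesian network: the joint distribution factors into local conditional probabilities, and queries reduce to sums over that factored product.

```python
# Hypothetical chain network Rain -> Sprinkler -> WetGrass.
# All probability tables below are invented for illustration only.

P_rain = {True: 0.2, False: 0.8}
P_sprinkler = {  # P(Sprinkler | Rain)
    True:  {True: 0.01, False: 0.99},
    False: {True: 0.40, False: 0.60},
}
P_wet = {  # P(WetGrass | Sprinkler)
    True:  {True: 0.90, False: 0.10},
    False: {True: 0.05, False: 0.95},
}

def joint(rain, sprinkler, wet):
    """P(rain, sprinkler, wet) = P(rain) * P(sprinkler|rain) * P(wet|sprinkler)."""
    return P_rain[rain] * P_sprinkler[rain][sprinkler] * P_wet[sprinkler][wet]

# Marginal P(WetGrass = True): sum the factored joint over the parents.
p_wet_true = sum(
    joint(r, s, True) for r in (True, False) for s in (True, False)
)
```

The factorization is what makes the representation both interpretable (each table is a local, expert-elicitable statement) and statistically manipulable, as the abstract emphasizes.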
Estimating Returns to College Quality with Multiple Proxies for Quality
Journal of Labor Economics, 2006
"... We thank Art Goldberger for helpful pointers to the literature. Estimating the Returns to College Quality with Multiple ..."
Abstract

Cited by 34 (1 self)
 Add to MetaCart
We thank Art Goldberger for helpful pointers to the literature. Estimating the Returns to College Quality with Multiple
Quantifying and visualizing attribute interactions: An approach based on entropy
http://arxiv.org/abs/cs.AI/0308002 v3, 2004
"... Interactions are patterns between several attributes in data that cannot be inferred from any subset of these attributes. While mutual information is a wellestablished approach to evaluating the interactions between two attributes, we surveyed its generalizations as to quantify interactions between ..."
Abstract

Cited by 25 (4 self)
 Add to MetaCart
Interactions are patterns among several attributes in data that cannot be inferred from any subset of those attributes. While mutual information is a well-established approach to evaluating the interaction between two attributes, we survey its generalizations that quantify interactions among several attributes. We have chosen McGill’s interaction information, which has been independently rediscovered a number of times under various names in various disciplines, because of its many intuitively appealing properties. We apply interaction information to visually present the most important interactions in the data. Visualization of interactions has provided insight into the structure of data in a number of domains: identifying redundant attributes and opportunities for constructing new features, discovering unexpected regularities in data, and helping during the construction of predictive models; we illustrate the methods on numerous examples. A machine learning method that disregards interactions may get caught in two traps: myopia is caused by learning algorithms assuming independence in spite of interactions, whereas fragmentation arises from assuming an interaction in spite of independence.
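McGill's interaction information for three variables can be computed directly from a joint distribution. The sketch below (not from the paper) uses the convention I(X;Y;Z) = I(X;Y|Z) − I(X;Y), expanded in entropies; note that sign conventions differ across the literature. The XOR distribution is the classic example of a pure three-way interaction: no attribute pair is dependent, yet the triple is.

```python
import math
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in bits of a distribution {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(joint, axes):
    """Marginalize a joint {(x, y, z): p} onto the given coordinate indices."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in axes)] += p
    return dict(out)

def interaction_information(joint):
    """I(X;Y;Z) = I(X;Y|Z) - I(X;Y), expanded in entropies.
    Positive values indicate synergy, negative values redundancy
    (under this sign convention)."""
    H = entropy
    return (H(marginal(joint, (0, 1))) + H(marginal(joint, (0, 2)))
            + H(marginal(joint, (1, 2)))
            - H(marginal(joint, (0,))) - H(marginal(joint, (1,)))
            - H(marginal(joint, (2,))) - H(joint))

# XOR: X, Y independent fair bits, Z = X xor Y.
# Every pair is independent, but the triple carries one bit of interaction.
xor_joint = {(x, y, x ^ y): 0.25 for x in (0, 1) for y in (0, 1)}
```

A method that only measures pairwise mutual information would see nothing in `xor_joint`; that is exactly the "myopia" trap the abstract describes.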
Partial autocorrelation function of a nonstationary time series
J. Multivariate Anal., 2003
"... The second order properties of a process are usually characterized by the autocovariance function. In the stationary case, the parameterization by the partial autocorrelation function is relatively recent. We extend this parameterization to the nonstationary case. The advantage of this function is t ..."
Abstract

Cited by 9 (5 self)
 Add to MetaCart
The second-order properties of a process are usually characterized by the autocovariance function. In the stationary case, the parameterization by the partial autocorrelation function is relatively recent. We extend this parameterization to the nonstationary case. The advantage of this function is that it is subject to very simple constraints, in comparison with the autocovariance function, which must be nonnegative definite. As in the stationary case, this parameterization is well adapted to autoregressive models or to the identification of deterministic processes.
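The stationary version of this parameterization is computed by the classical Durbin–Levinson recursion, which maps autocovariances to partial autocorrelations; the sketch below (a standard textbook construction, not the paper's nonstationary extension) illustrates why the PACF is so convenient: each coefficient is only constrained to lie in (−1, 1), whereas the autocovariance sequence must stay nonnegative definite as a whole.

```python
def pacf_from_autocovariance(gamma):
    """Durbin-Levinson recursion: partial autocorrelations alpha(1..p)
    from stationary autocovariances gamma[0..p]. Stationary case only."""
    p = len(gamma) - 1
    alpha = []   # partial autocorrelations alpha(1), ..., alpha(p)
    phi = []     # AR coefficients of the current-order predictor
    v = gamma[0]  # innovation (one-step prediction error) variance
    for k in range(1, p + 1):
        a = (gamma[k] - sum(phi[j] * gamma[k - 1 - j] for j in range(k - 1))) / v
        alpha.append(a)
        # Update the order-k predictor coefficients from the order-(k-1) ones.
        phi = [phi[j] - a * phi[k - 2 - j] for j in range(k - 1)] + [a]
        v *= 1.0 - a * a
    return alpha

# Hypothetical AR(1) example with coefficient 0.5 and unit innovation variance:
# gamma(h) = 0.5**h / (1 - 0.25); the PACF should be 0.5 at lag 1 and 0 beyond.
gamma = [0.5 ** h / 0.75 for h in range(4)]
pacf = pacf_from_autocovariance(gamma)
```

The cutoff of the PACF after lag 1 is what identifies the AR(1) structure, the sense in which the abstract calls this parameterization "well adapted to autoregressive models".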
Extension of autocovariance coefficients sequence for periodically correlated processes
 Journal of Time Series Analysis
"... Abstract. The extension of stationary process autocorrelation coefficient sequence is a classical problem in the field of spectral estimation. In this note, we treat this extension problem for the periodically correlated processes by using the partial autocorrelation function. We show that the theor ..."
Abstract

Cited by 6 (4 self)
 Add to MetaCart
The extension of a stationary process's autocorrelation coefficient sequence is a classical problem in the field of spectral estimation. In this note, we treat this extension problem for periodically correlated processes by using the partial autocorrelation function. We show that the theory of nonstationary processes can be adapted to periodically correlated processes. The partial autocorrelation function has a clear advantage for parameterization over the autocovariance function, which must be checked for nonnegative definiteness. In this way, we show that, contrary to the stationary case, the Yule–Walker equations (for a periodically correlated process) are no longer a tool for extending the first autocovariance coefficients to an autocovariance function. Next, we treat the extension problem and present a maximum entropy extension method through the partial autocorrelation function. We show that the solution maximizing the entropy is a periodic autoregressive process and compare this approach with others.
Fisher and Regression
"... Abstract. In 1922 R. A. Fisher introduced the modern regression model, synthesizing the regression theory of Pearson and Yule and the least squares theory of Gauss. The innovation was based on Fisher’s realization that the distribution associated with the regression coefficient was unaffected by the ..."
Abstract

Cited by 3 (3 self)
 Add to MetaCart
In 1922 R. A. Fisher introduced the modern regression model, synthesizing the regression theory of Pearson and Yule and the least squares theory of Gauss. The innovation was based on Fisher’s realization that the distribution associated with the regression coefficient was unaffected by the distribution of X. Subsequently Fisher interpreted the fixed-X assumption in terms of his notion of ancillarity. This paper considers these developments against the background of the development of statistical theory in the early twentieth century.
Covariance Estimation: The GLM and Regularization Perspectives
"... Finding an unconstrained and statistically interpretable reparameterization of a covariance matrix is still an open problem in statistics. Its solution is of central importance in covariance estimation, particularly in the recent highdimensional data environment where enforcing the positivedefinit ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
Finding an unconstrained and statistically interpretable reparameterization of a covariance matrix is still an open problem in statistics. Its solution is of central importance in covariance estimation, particularly in the recent high-dimensional data environment, where enforcing the positive-definiteness constraint could be computationally expensive. We provide a survey of the progress made in modeling covariance matrices from the perspectives of generalized linear models (GLM), or parsimony and use of covariates in low dimensions; regularization (shrinkage, sparsity) for high-dimensional data; and the role of various matrix factorizations. A viable and emerging regression-based setup, which is suitable for both the GLM and the regularization approaches, is to link a covariance matrix, its inverse, or their factors to certain regression models and then solve the relevant (penalized) least squares problems. We point out several instances of this regression-based setup in the literature. A notable case is in the Gaussian graphical models, where linear regressions with a LASSO penalty are used to estimate the neighborhood of one node at a time (Meinshausen and Bühlmann, 2006). Some advantages
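One concrete instance of the regression link the abstract mentions is the modified Cholesky decomposition Σ = L D Lᵀ, with L unit lower triangular and D diagonal: D holds the innovation variances of successively regressing each variable on its predecessors, and the below-diagonal entries of L are unconstrained reals, unlike Σ itself, which must stay positive definite. The sketch below (pure Python, with a made-up 3×3 matrix; the Meinshausen–Bühlmann LASSO step is not implemented here) derives this parameterization from an ordinary Cholesky factor.

```python
def cholesky(A):
    """Plain Cholesky factorization A = C C^T for a small SPD matrix."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(C[i][k] * C[j][k] for k in range(j))
            if i == j:
                C[i][j] = (A[i][i] - s) ** 0.5
            else:
                C[i][j] = (A[i][j] - s) / C[j][j]
    return C

def regression_parameterization(sigma):
    """Modified Cholesky: Sigma = L D L^T with L unit lower triangular.
    D[i] is the residual variance of regressing variable i on variables
    0..i-1; L's below-diagonal entries and log D[i] are unconstrained,
    which is the point of this reparameterization."""
    C = cholesky(sigma)
    n = len(sigma)
    D = [C[i][i] ** 2 for i in range(n)]
    L = [[C[i][j] / C[j][j] for j in range(i)] + [1.0] + [0.0] * (n - 1 - i)
         for i in range(n)]
    return L, D

# Hypothetical 3x3 covariance matrix (made-up numbers, positive definite).
sigma = [[4.0, 2.0, 1.0], [2.0, 3.0, 1.0], [1.0, 1.0, 2.0]]
L, D = regression_parameterization(sigma)
```

Because L and D are unconstrained, they can be modeled or penalized entry by entry (the GLM and regularization perspectives of the survey) without ever having to check a positive-definiteness constraint.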
Geometrical Aspects of Linear Prediction Algorithms
"... In this paper, an old identity of G. U. Yule among partial correlation coefficients is recognized as being equal to the cosine law of spherical trigonometry. Exploiting this connection enables us to derive some new (and potentially useful) relations among partial correlation coefficients. Moreover, ..."
Abstract
 Add to MetaCart
In this paper, an old identity of G. U. Yule among partial correlation coefficients is recognized as being equal to the cosine law of spherical trigonometry. Exploiting this connection enables us to derive some new (and potentially useful) relations among partial correlation coefficients. Moreover, this observation provides new (dual) non-Euclidean geometrical interpretations of the Schur and Levinson–Szegö algorithms.
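The identity in question is the standard first-order partial correlation formula: reading each correlation as the cosine of an angle between variables, it is a rearrangement of the spherical cosine law cos a = cos b cos c + sin b sin c cos A. A minimal sketch (illustrative numbers are made-up, not from the paper):

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation r_{xy.z} from the three pairwise
    correlations -- Yule's identity, i.e. the spherical cosine law
    solved for cos A."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))

# Hypothetical numbers: if x and y are associated only through z
# (r_xy = r_xz * r_yz), conditioning on z removes the association entirely.
r = partial_corr(0.6 * 0.7, 0.6, 0.7)
```

The geometric reading is that r_{xy.z} is the cosine of the dihedral angle at z on the unit sphere, which is the connection the abstract exploits.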