Results 1–10 of 55
Determinants of long-term growth: a Bayesian Averaging of Classical Estimates (BACE) approach
 American Economic Review
Abstract

Cited by 301 (2 self)
This paper examines the robustness of explanatory variables in cross-country economic growth regressions. It introduces and employs a novel approach, Bayesian Averaging of Classical Estimates (BACE), which constructs estimates by averaging OLS coefficients across models. The weights given to individual regressions have a Bayesian justification similar to the Schwarz model selection criterion. Of 67 explanatory variables we find 18 to be significantly and robustly partially correlated with long-term growth and another three variables to be marginally related. The strongest evidence is for the relative price of investment, primary school enrollment, and the initial level of real GDP per capita. (JEL O51, O52,
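The averaging step this abstract describes can be sketched compactly: fit OLS on every candidate subset of regressors, weight each model by the Schwarz (BIC) approximation, and average the coefficients. The following is a minimal illustration on synthetic data; the data, variable names, and the small number of candidate regressors are all hypothetical, not from the paper.

```python
# BACE-style model averaging (illustrative sketch, not the authors' code).
# Model weight follows the Schwarz approximation: w_j ∝ n^(-k_j/2) * SSE_j^(-n/2).
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, K = 200, 4                       # observations, candidate regressors
X = rng.normal(size=(n, K))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=n)

log_weights, coefs = [], []
for k in range(K + 1):
    for subset in itertools.combinations(range(K), k):
        Z = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        sse = np.sum((y - Z @ beta) ** 2)
        # log of n^(-k/2) * SSE^(-n/2); constant factors cancel after normalizing
        log_weights.append(-0.5 * Z.shape[1] * np.log(n) - 0.5 * n * np.log(sse))
        full = np.zeros(K)
        full[list(subset)] = beta[1:]   # excluded regressors get coefficient 0
        coefs.append(full)

lw = np.array(log_weights)
w = np.exp(lw - lw.max())
w /= w.sum()
bace_beta = w @ np.array(coefs)     # model-averaged coefficients
print(np.round(bace_beta, 2))       # weight should concentrate on X0 and X1
```

Because the BIC-style weights concentrate on the data-generating subset, the averaged coefficients stay close to the true values while shrinking the irrelevant regressors toward zero.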
A Bayesian Approach to Causal Discovery
, 1997
Abstract

Cited by 93 (1 self)
We examine the Bayesian approach to the discovery of directed acyclic causal models and compare it to the constraint-based approach. Both approaches rely on the Causal Markov assumption, but the two differ significantly in theory and practice. An important difference between the approaches is that the constraint-based approach uses categorical information about conditional-independence constraints in the domain, whereas the Bayesian approach weighs the degree to which such constraints hold. As a result, the Bayesian approach has three distinct advantages over its constraint-based counterpart. One, conclusions derived from the Bayesian approach are not susceptible to incorrect categorical decisions about independence facts that can occur with data sets of finite size. Two, using the Bayesian approach, finer distinctions among model structures, both quantitative and qualitative, can be made. Three, information from several models can be combined to make better inferences and to better ...
Knowledge Acquisition from Examples Via Multiple Models
 In Proceedings of the Fourteenth International Conference on Machine Learning
, 1997
Abstract

Cited by 58 (7 self)
If it is to qualify as knowledge, a learner's output should be accurate, stable and comprehensible. Learning multiple models can improve significantly on the accuracy and stability of single models, but at the cost of losing their comprehensibility (when they possess it, as do, for example, simple decision trees and rule sets). This paper proposes and evaluates CMM, a meta-learner that seeks to retain most of the accuracy gains of multiple-model approaches, while still producing a single comprehensible model. CMM is based on reapplying the base learner to recover the frontiers implicit in the multiple-model ensemble. This is done by giving the base learner a new training set, composed of a large number of examples generated and classified according to the ensemble, plus the original examples. CMM is evaluated using C4.5RULES as the base learner, and bagging as the multiple-model methodology. On 26 benchmark datasets, CMM retains on average 60% of the accuracy gains obtained by bagging ...
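The CMM procedure this abstract outlines — bag an ensemble, generate new examples labeled by the ensemble, then re-apply the base learner — can be sketched with scikit-learn, where decision trees stand in for C4.5RULES and the dataset, sample sizes, and uniform example generator are illustrative assumptions:

```python
# A minimal sketch of the CMM idea (scikit-learn trees stand in for C4.5RULES).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=5, random_state=0)

# 1. Learn the multiple-model ensemble by bagging the base learner.
ensemble = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25,
                             random_state=0).fit(X, y)

# 2. Generate new examples (here: uniform over the observed feature ranges)
#    and classify them according to the ensemble.
X_new = rng.uniform(X.min(axis=0), X.max(axis=0), size=(1000, X.shape[1]))
y_new = ensemble.predict(X_new)

# 3. Re-apply the base learner to the new examples plus the originals,
#    yielding a single comprehensible model approximating the ensemble.
X_cmm = np.vstack([X, X_new])
y_cmm = np.concatenate([y, y_new])
single = DecisionTreeClassifier(random_state=0).fit(X_cmm, y_cmm)

agreement = (single.predict(X) == ensemble.predict(X)).mean()
print(f"agreement with ensemble on training data: {agreement:.2f}")
```

The single tree recovers the ensemble's frontiers to the extent that the generated examples cover the regions where the ensemble's decision surface differs from any one member's.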
Feature Subset Selection by Bayesian networks: a comparison with genetic and sequential algorithms
Abstract

Cited by 51 (15 self)
In this paper we compare FSS-EBNA, a randomized, population-based evolutionary algorithm, with two genetic and two sequential search approaches on the well-known Feature Subset Selection (FSS) problem. In FSS-EBNA, the FSS problem, stated as a search problem, uses the EBNA (Estimation of Bayesian Network Algorithm) search engine, an algorithm within the EDA (Estimation of Distribution Algorithm) approach. The EDA paradigm grew out of the GA community as a way to explicitly discover the relationships among the features of the problem rather than disrupt them with genetic recombination operators. The EDA paradigm avoids recombination operators altogether: it evolves the population of solutions, and discovers these relationships, by factorizing the probability distribution of the best individuals in each generation of the search. In EBNA, this factorization is carried out by a Bayesian network induced by a chea...
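The EDA loop this abstract describes — sample a population from an explicit probability model, select the best individuals, re-estimate the model — can be sketched in its simplest form. Note this sketch uses independent Bernoulli marginals (a UMDA-style factorization) rather than the Bayesian network that EBNA induces, and the k-NN cross-validation fitness, dataset, and population sizes are all illustrative assumptions:

```python
# A much-simplified EDA sketch for feature subset selection (UMDA-style
# independent marginals, NOT the Bayesian-network factorization of EBNA).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=3, random_state=0)

def fitness(mask):
    """Cross-validated accuracy of k-NN on the selected feature subset."""
    if not mask.any():
        return 0.0
    return cross_val_score(KNeighborsClassifier(3), X[:, mask], y, cv=3).mean()

pop_size, n_select, n_gen = 30, 10, 10
p = np.full(X.shape[1], 0.5)             # marginal selection probabilities
for _ in range(n_gen):
    pop = rng.random((pop_size, X.shape[1])) < p   # sample feature subsets
    scores = np.array([fitness(m) for m in pop])
    best = pop[np.argsort(scores)[-n_select:]]     # truncation selection
    # Re-estimate the distribution from the best individuals
    p = best.mean(axis=0).clip(0.05, 0.95)

print("selection probabilities:", np.round(p, 2))
```

EBNA replaces the final re-estimation line with the induction of a Bayesian network over the feature-inclusion bits, so that dependencies between features are modeled rather than assumed away.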
Learning Probabilistic Networks
 THE KNOWLEDGE ENGINEERING REVIEW
, 1998
Abstract

Cited by 43 (2 self)
A probabilistic network is a graphical model that encodes probabilistic relationships between variables of interest. Such a model records qualitative influences between variables in addition to the numerical parameters of the probability distribution. As such it provides an ideal form for combining prior knowledge, which might be limited solely to experience of the influences between some of the variables of interest, and data. In this paper, we first show how data can be used to revise initial estimates of the parameters of a model. We then progress to showing how the structure of the model can be revised as data is obtained. Techniques for learning with incomplete data are also covered.
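The first step this abstract mentions — revising initial parameter estimates from data — is commonly done with conjugate updating. A minimal illustration for a single binary node, with purely illustrative pseudo-counts and observations:

```python
# Revising a network parameter from data: a Beta prior over P(X=1)
# updated by observed counts (conjugate Beta-Binomial updating).
a, b = 2.0, 2.0                      # prior pseudo-counts for X=1 / X=0
prior_mean = a / (a + b)

data = [1, 1, 0, 1, 1, 0, 1, 1]      # observations of the binary node X
a_post = a + sum(data)
b_post = b + len(data) - sum(data)
post_mean = a_post / (a_post + b_post)

print(f"prior     P(X=1) = {prior_mean:.3f}")   # 0.500
print(f"posterior P(X=1) = {post_mean:.3f}")    # (2+6)/(4+8) = 0.667
```

The same counting argument applies per parent configuration for each node of the network, which is what makes prior knowledge and data so easy to combine in this model class.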
Cumulative advantage as a mechanism for inequality: A review of theoretical and empirical development
 Annual Review of Sociology
, 2006
Abstract

Cited by 40 (3 self)
While originally developed by Merton to explain advancement in scientific careers, cumulative advantage is a general mechanism for inequality across any temporal process (e.g., life course, family generations) in which a favorable relative position becomes a resource that produces further relative gains. We show that the term "cumulative advantage" has come to have multiple meanings in the sociological literature. We distinguish between these alternative forms, discuss mechanisms that have been proposed in the literature that might produce cumulative advantage, and review the empirical literature in the areas of education, work careers, and related life course processes.
The State of Boosting
, 1999
Abstract

Cited by 35 (2 self)
In many problem domains, combining the predictions of several models often results in a model with improved predictive performance. Boosting is one such method that has shown great promise. On the applied side, empirical studies have shown that combining models using boosting methods produces more accurate classification and regression models. These methods are extendible to the exponential family as well as proportional hazards regression models. This article shows that boosting, which is still new to statistics, is widely applicable. I will introduce boosting, discuss the current state of boosting, and show how these methods connect to more standard statistical practice.
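The core boosting mechanism this abstract surveys — reweighting examples so that later models focus on earlier mistakes — can be sketched as discrete AdaBoost with decision stumps. The synthetic data and scikit-learn stumps are illustrative assumptions; any weak learner that accepts example weights would do:

```python
# A minimal sketch of discrete AdaBoost with decision stumps.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y01 = make_classification(n_samples=300, n_features=5, random_state=0)
y = 2 * y01 - 1                      # labels in {-1, +1}

n = len(y)
w = np.full(n, 1.0 / n)              # uniform initial example weights
stumps, alphas = [], []
for _ in range(20):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    pred = stump.predict(X)
    err = w[pred != y].sum()                       # weighted error
    alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))
    w *= np.exp(-alpha * y * pred)                 # up-weight mistakes
    w /= w.sum()
    stumps.append(stump)
    alphas.append(alpha)

# The boosted classifier is a weighted vote of the stumps.
F = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
train_acc = (np.sign(F) == y).mean()
print("training accuracy:", train_acc)
```

The exponential reweighting is also what connects boosting to stagewise additive modeling, which is the bridge to the exponential-family and proportional-hazards extensions the article mentions.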
Why Does Bagging Work? A Bayesian Account and its Implications
 In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining
, 1997
Abstract

Cited by 35 (7 self)
The error rate of decision-tree and other classification learners can often be much reduced by bagging: learning multiple models from bootstrap samples of the database, and combining them by uniform voting. In this paper we empirically test two alternative explanations for this, both based on Bayesian learning theory: (1) bagging works because it is an approximation to the optimal procedure of Bayesian model averaging, with an appropriate implicit prior; (2) bagging works because it effectively shifts the prior to a more appropriate region of model space. All the experimental evidence contradicts the first hypothesis, and confirms the second. Bagging (Breiman 1996a) is a simple and effective way to reduce the error rate of many classification learning algorithms. For example, in the empirical study described below, it reduces the error of a decision-tree learner in 19 of 26 databases, by 4% on average. In the bagging procedure, given a training set of size s, a "bootstrap" re...
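The bagging procedure the abstract describes — bootstrap resampling plus uniform voting — is short enough to sketch directly. The synthetic data and scikit-learn trees are illustrative stand-ins for the paper's decision-tree learner:

```python
# A minimal sketch of bagging: trees on bootstrap resamples, uniform voting.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=5, random_state=0)

models = []
for _ in range(25):
    idx = rng.integers(0, len(y), size=len(y))   # bootstrap resample (with replacement)
    models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

# Uniform voting: each model gets one vote; predict the majority class.
votes = np.stack([m.predict(X) for m in models])
bagged = (votes.mean(axis=0) > 0.5).astype(int)
acc = (bagged == y).mean()
print("training accuracy:", acc)
```

Note the uniform vote is exactly what the paper's first hypothesis contrasts with Bayesian model averaging, which would instead weight each model by its posterior probability.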
Knowledge Discovery Via Multiple Models
 Intelligent Data Analysis
, 1998
Abstract

Cited by 34 (0 self)
If it is to qualify as knowledge, a learner's output should be accurate, stable and comprehensible. Learning multiple models can improve significantly on the accuracy and stability of single models, but at the cost of losing their comprehensibility (when they possess it, as do, for example, simple decision trees and rule sets). This article proposes and evaluates CMM, a meta-learner that seeks to retain most of the accuracy gains of multiple-model approaches, while still producing a single comprehensible model. CMM is based on reapplying the base learner to recover the frontiers implicit in the multiple-model ensemble. This is done by giving the base learner a new training set, composed of a large number of examples generated and classified according to the ensemble, plus the original examples. CMM is evaluated using C4.5RULES as the base learner, and bagging as the multiple-model methodology. On 26 benchmark datasets, CMM retains on average 60% of the accuracy gains obtained by baggin...