Results 1 – 7 of 7
A Guide to the Literature on Learning Probabilistic Networks From Data
, 1996
Abstract

Cited by 172 (0 self)
This literature review discusses different methods under the general rubric of learning Bayesian networks from data, and includes some overlapping work on more general probabilistic networks. Connections are drawn between the statistical, neural network, and uncertainty communities, and between the different methodological communities, such as Bayesian, description length, and classical statistics. Basic concepts for learning and Bayesian networks are introduced and methods are then reviewed. Methods are discussed for learning parameters of a probabilistic network, for learning the structure, and for learning hidden variables. The presentation avoids formal definitions and theorems, as these are plentiful in the literature, and instead illustrates key concepts with simplified examples.
Keywords: Bayesian networks, graphical models, hidden variables, learning, learning structure, probabilistic networks, knowledge discovery.
I. Introduction
Probabilistic networks or probabilistic gra...
Very Fast EM-based Mixture Model Clustering Using Multiresolution kd-trees
 In Advances in Neural Information Processing Systems 11
, 1998
Abstract

Cited by 89 (4 self)
Clustering is important in many fields including manufacturing, biology, finance, and astronomy. Mixture models are a popular approach due to their statistical foundations, and EM is a very popular method for finding mixture models. EM, however, requires many accesses of the data, and thus has been dismissed as impractical (e.g. (Zhang, Ramakrishnan, & Livny, 1996)) for data mining of enormous datasets.
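The EM loop the abstract refers to can be sketched in its simplest form, a two-component 1-D Gaussian mixture. This is an illustrative sketch only, not the paper's kd-tree-accelerated algorithm; the initialization scheme and iteration count are arbitrary assumptions:

```python
import math
import random

def em_gmm_1d(data, iters=50):
    """Minimal EM for a two-component 1-D Gaussian mixture (sketch).

    Each iteration makes a full pass over the data -- exactly the cost
    the paper's multiresolution kd-tree approach is designed to avoid.
    """
    # Crude initialization from the data range (an assumption, not the
    # paper's initialization).
    mu = [min(data), max(data)]
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [pi[k] / math.sqrt(2 * math.pi * var[k])
                    * math.exp(-(x - mu[k]) ** 2 / (2 * var[k]))
                    for k in range(2)]
            s = sum(dens)
            resp.append([d / s for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = max(1e-6, sum(r[k] * (x - mu[k]) ** 2
                                   for r, x in zip(resp, data)) / nk)
    return mu, var, pi

random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(200)] +
        [random.gauss(5.0, 1.0) for _ in range(200)])
mu, var, pi = em_gmm_1d(data)
print(sorted(mu))  # means should land near the true centers 0 and 5
```

Every E-step touches every data point, which is why plain EM scales poorly to enormous datasets.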
Learning Bayesian Networks Using Feature Selection
 in D. Fisher & H. Lenz, eds, Proceedings of the fifth International Workshop on Artificial Intelligence and Statistics, Ft. Lauderdale, FL
, 1995
Abstract

Cited by 19 (2 self)
This paper introduces a novel enhancement for learning Bayesian networks with a bias for small, high-predictive-accuracy networks. The new approach selects a subset of features which maximizes predictive accuracy prior to the network learning phase. We examine explicitly the effects of two aspects of the algorithm, feature selection and node ordering. Our approach generates networks which are computationally simpler to evaluate and which display predictive accuracy comparable to that of Bayesian networks which model all attributes.
1 INTRODUCTION
Bayesian networks are being increasingly recognized as an important representation for probabilistic reasoning. For many domains, the need to specify the probability distributions for a Bayesian network is considerable, and learning these probabilities from data using an algorithm like K2 [8] could alleviate such specification difficulties. We describe an extension to the Bayesian network learning approaches introduced in K2. Rather than ...
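Selecting a feature subset that maximizes predictive accuracy before structure learning can be sketched as a greedy forward search. The search strategy and the `score` callable below are assumptions for illustration; the abstract does not specify the paper's actual search procedure or accuracy estimate:

```python
def forward_select(features, score):
    """Greedy forward selection (sketch): grow the feature subset as
    long as the score -- e.g. an estimate of predictive accuracy --
    improves. `score` maps a frozenset of feature names to a number."""
    selected = frozenset()
    best = score(selected)
    while True:
        best_f, best_s = None, best
        for f in features - selected:
            s = score(selected | {f})
            if s > best_s:
                best_f, best_s = f, s
        if best_f is None:          # no single addition improves the score
            return selected, best
        selected, best = selected | {best_f}, best_s

# Hypothetical toy score: reward two genuinely informative features,
# with a small penalty per feature to bias toward small subsets.
informative = {"age", "dose"}
def toy_score(subset):
    return len(subset & informative) - 0.1 * len(subset)

sel, val = forward_select({"age", "dose", "noise1", "noise2"}, toy_score)
print(sorted(sel))  # -> ['age', 'dose']
```

The size penalty in the toy score mirrors the paper's stated bias toward small networks: noise features are never worth their cost.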
The TETRAD Project: Constraint Based Aids to Causal Model Specification
 MULTIVARIATE BEHAVIORAL RESEARCH
On the Statistical Comparison of Inductive Learning Methods
 In D. Fisher & H.J. Lenz (Eds.), Learning from Data: Artificial and Intelligence V
, 1996
Abstract

Cited by 6 (0 self)
Experimental comparisons between statistical and machine learning methods appear with increasing frequency in the literature. However, there does not seem to be a consensus on how such a comparison is performed in a methodologically sound way. Especially the effect of testing multiple hypotheses on the probability of producing a "false alarm" is often ignored. We transfer multiple comparison procedures from the statistical literature to the type of study discussed in this paper. These testing procedures take the number of tests performed into account, thereby controlling the probability of generating "false alarms". The multiple comparison procedures selected are illustrated on well-known regression and classification data sets.
26.1 Introduction
Recent interactions between the statistical and artificial intelligence communities (see e.g. [Han93, CO94]) have led to many studies that compare the performance of empirical statistical and machine learning methods on real-life data sets; ...
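The kind of adjustment the chapter advocates can be illustrated with the Holm step-down procedure, a standard multiple comparison method that scales each raw p-value by the number of hypotheses still in play. This is a generic sketch, not necessarily one of the procedures the chapter selects:

```python
def holm_adjust(pvalues):
    """Holm step-down adjustment of raw p-values (sketch).

    Controls the family-wise error rate -- the probability of at least
    one "false alarm" -- when m hypotheses are tested simultaneously.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        adj = min(1.0, (m - rank) * pvalues[i])
        running_max = max(running_max, adj)  # keep adjustments monotone
        adjusted[i] = running_max
    return adjusted

raw = [0.01, 0.04, 0.03, 0.50]
print([round(p, 4) for p in holm_adjust(raw)])  # -> [0.04, 0.09, 0.09, 0.5]
```

Comparing four methods pairwise at a nominal 0.05 level without such a correction would let the family-wise false-alarm probability climb well above 0.05.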
Neural Networks and Logistic Regression
, 1996
Abstract

Cited by 3 (0 self)
In this paper we investigated whether neural nets are worth considering as an alternative to logistic models in settings relevant for biomedical research. The first drawback of neural nets is that they give us no direct information on the value of a single covariate for the prediction. The examples in Section 8 illustrate that there are no simple strategies to interpret the weights in this sense. The question remains whether, even in those applications which focus on estimation of the regression function, it is justified to neglect the possible gain in scientific knowledge available from the identification of influential factors. If we restrict ourselves to estimation of the regression function, many biomedical applications involve fewer than 400 observations and/or more than five covariates. Larger samples are the exception, but the investigations in Section 10 reveal that neural networks need large samples to take advantage of their flexibility. Even if we have large samples, neural nets will be superior to the considered model selection procedures only if the true regression function cannot be approximated by a parsimonious member of the class (8.2). Hence the question remains whether more complex regression functions are needed in a particular biomedical application. In our opinion, for most applications such complex functions are not very plausible, because the covariates represent meaningful biological factors. It should be emphasized that the successful use of neural networks appears mainly in fields like pattern recognition, where covariates like the grey scale of a pixel are more or less meaningless with respect to their single values and we can expect results only from combining the values of many pixels. As a final point we should mention that neural ...
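The interpretability point can be made concrete: a fitted logistic model exposes one coefficient per covariate, and exp(coefficient) is the odds ratio per unit change. The toy gradient-descent fit and the simulated data below are assumptions for illustration, not the paper's fitting method:

```python
import math
import random

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Plain logistic regression via batch gradient descent (sketch).
    Returns (intercept, weights); each weight belongs to one covariate."""
    p = len(xs[0])
    w = [0.0] * p
    b = 0.0
    n = len(xs)
    for _ in range(epochs):
        gb = 0.0
        gw = [0.0] * p
        for x, y in zip(xs, ys):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            pred = 1.0 / (1.0 + math.exp(-z))
            err = pred - y
            gb += err
            for j in range(p):
                gw[j] += err * x[j]
        b -= lr * gb / n
        for j in range(p):
            w[j] -= lr * gw[j] / n
    return b, w

# Toy biomedical-style data: the outcome is driven by the first
# covariate only; the second is pure noise.
random.seed(1)
xs = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(300)]
ys = [1 if 2 * x[0] + random.gauss(0, 1) > 0 else 0 for x in xs]
b, w = fit_logistic(xs, ys)
# Unlike a neural net's hidden-layer weights, w[0] and w[1] are directly
# interpretable: the fit assigns a large weight to the influential
# covariate and a near-zero weight to the noise covariate.
print(w[0] > 1.0, abs(w[1]) < 0.5)
```

No comparably direct reading exists for the weights of a fitted neural network, which is the drawback the abstract emphasizes.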
Modeling and Monitoring Dynamic Systems by Chain Graphs
Abstract
It is widely recognized that probabilistic graphical models provide a good framework for both knowledge representation and probabilistic inference (e.g., see [Cheeseman94], [Whittaker90]). The dynamic behaviour of a system which changes over time requires an implicit or explicit time representation. In this paper, an implicit time representation using dynamic graphical models is proposed. Our goal is to model the state of a system and its evolution over time in a richer and more natural way than other approaches together with a more suitable treatment of the inference on variables of interest.
7.1 Introduction
It is widely recognized that probabilistic graphical models provide a good framework for both knowledge representation and probabilistic inference (e.g., see [Cheeseman94] and [Whittaker90]). The dynamic behaviour of any specific system which changes over time requires an implicit or explicit time representation. To model such systems is a very important task: the initial struc...