Results 1–10 of 39
Operations for Learning with Graphical Models
Journal of Artificial Intelligence Research, 1994
"... This paper is a multidisciplinary review of empirical, statistical learning from a graphical model perspective. Wellknown examples of graphical models include Bayesian networks, directed graphs representing a Markov chain, and undirected networks representing a Markov field. These graphical models ..."
Abstract

Cited by 249 (12 self)
 Add to MetaCart
This paper is a multidisciplinary review of empirical, statistical learning from a graphical model perspective. Well-known examples of graphical models include Bayesian networks, directed graphs representing a Markov chain, and undirected networks representing a Markov field. These graphical models are extended to model data analysis and empirical learning using the notation of plates. Graphical operations for simplifying and manipulating a problem are provided, including decomposition, differentiation, and the manipulation of probability models from the exponential family. Two standard algorithm schemas for learning are reviewed in a graphical framework: Gibbs sampling and the expectation-maximization algorithm. Using these operations and schemas, some popular algorithms can be synthesized from their graphical specification. This includes versions of linear regression, techniques for feedforward networks, and learning Gaussian and discrete Bayesian networks from data. The paper conclu...
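The expectation-maximization schema the abstract mentions can be illustrated with a minimal sketch: EM for a two-component one-dimensional Gaussian mixture. This is only a generic illustration of the algorithm schema, not the paper's plate notation or graphical operations; the data and initialization are invented.

```python
# Minimal EM sketch for a two-component 1-D Gaussian mixture.
# Illustrative only: data, initialization, and iteration count are assumptions.
import math
import random

def em_gmm(data, iters=50):
    # Initialize the two components at the data extremes (an illustrative choice).
    mu = [min(data), max(data)]
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            p = [pi[k] / math.sqrt(2 * math.pi * var[k])
                 * math.exp(-(x - mu[k]) ** 2 / (2 * var[k])) for k in range(2)]
            z = sum(p)
            resp.append([pk / z for pk in p])
        # M-step: re-estimate mixing weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(var[k], 1e-6)  # guard against variance collapse
    return mu, var, pi

random.seed(0)
data = [random.gauss(0, 1) for _ in range(200)] + \
       [random.gauss(5, 1) for _ in range(200)]
mu, var, pi = em_gmm(data)
```

Each iteration provably does not decrease the data log-likelihood, which is why EM serves as one of the two standard learning schemas the review covers.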
A Guide to the Literature on Learning Probabilistic Networks From Data
1996
"... This literature review discusses different methods under the general rubric of learning Bayesian networks from data, and includes some overlapping work on more general probabilistic networks. Connections are drawn between the statistical, neural network, and uncertainty communities, and between the ..."
Abstract

Cited by 172 (0 self)
 Add to MetaCart
This literature review discusses different methods under the general rubric of learning Bayesian networks from data, and includes some overlapping work on more general probabilistic networks. Connections are drawn between the statistical, neural network, and uncertainty communities, and between the different methodological communities, such as Bayesian, description length, and classical statistics. Basic concepts for learning and Bayesian networks are introduced and methods are then reviewed. Methods are discussed for learning the parameters of a probabilistic network, for learning the structure, and for learning hidden variables. The presentation avoids formal definitions and theorems, as these are plentiful in the literature, and instead illustrates key concepts with simplified examples.
Keywords: Bayesian networks, graphical models, hidden variables, learning, learning structure, probabilistic networks, knowledge discovery.
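The simplest case the review covers, learning the parameters of a network with known structure and complete data, reduces to counting. A minimal sketch for a hypothetical two-node network Rain -> WetGrass, with add-one (Laplace) smoothing; the variable names and observations are invented for illustration:

```python
# Hypothetical example: estimate CPTs for a fixed two-node Bayesian
# network Rain -> WetGrass from complete data by smoothed counting.
from collections import Counter

data = [  # (rain, wet_grass) observations, invented for illustration
    (1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1), (1, 1), (0, 0),
]

# P(Rain): marginal counts with add-one smoothing over the two states.
rain_counts = Counter(r for r, _ in data)
p_rain = {r: (rain_counts[r] + 1) / (len(data) + 2) for r in (0, 1)}

# P(WetGrass | Rain): counts within each parent configuration.
joint = Counter(data)
p_wet_given_rain = {
    r: {w: (joint[(r, w)] + 1) / (rain_counts[r] + 2) for w in (0, 1)}
    for r in (0, 1)
}
```

The smoothing corresponds to a uniform Dirichlet prior; structure learning and hidden-variable methods, also surveyed in the paper, build on this counting step.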
A Hierarchical Unsupervised Growing Neural Network for Clustering Gene Expression Patterns
2001
"... Motivation: We describe a new approach to the analysis of gene expression data coming from DNA array experiments, using an unsupervised neural network. DNA array technologies allow monitoring thousands of genes rapidly and efficiently. One of the interests of these studies is the search for correlat ..."
Abstract

Cited by 135 (13 self)
 Add to MetaCart
Motivation: We describe a new approach to the analysis of gene expression data coming from DNA array experiments, using an unsupervised neural network. DNA array technologies allow monitoring thousands of genes rapidly and efficiently. One of the interests of these studies is the search for correlated gene expression patterns, and this is usually achieved by clustering them. The Self-Organising Tree Algorithm (SOTA) (Dopazo, J. and Carazo, J.M. (1997) J. Mol. Evol., 44, 226-233) is a neural network that grows adopting the topology of a binary tree. The result of the algorithm is a hierarchical cluster obtained with the accuracy and robustness of a neural network. Results: SOTA clustering confers several advantages over classical hierarchical clustering methods. SOTA is a divisive method: the clustering process is performed from top to bottom, i.e. the highest hierarchical levels are resolved before going into the details of the lowest levels. The growing can be stopped at the desired hierarchical level. Moreover, a criterion to stop the growing of the tree, based on the approximate distribution of probability obtained by randomisation of the original data set, is provided. By means of this criterion, statistical support for the definition of clusters is proposed. In addition, obtaining average gene expression patterns is a built-in feature of the algorithm. Different neurons defining the different hierarchical levels represent the averages of the gene expression patterns contained in the clusters. Since SOTA runtimes are approximately linear with the number of items to be classified, it is especially suitable for dealing with huge amounts of data. The method proposed is very general and applies to any data provided that they can be coded as a series of numbers and t...
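The divisive, top-down idea can be sketched generically: split the data two ways, then recurse into each half until a stopping rule fires. This is not SOTA itself (which grows a tree of neurons and stops via a randomisation-based criterion), only a toy recursive two-means split on 1-D values with a size/depth stopping rule as a stand-in:

```python
# Toy divisive (top-down) clustering via recursive two-way splits.
# Stand-in for the SOTA idea only; SOTA grows neurons and uses a
# randomisation-based stopping criterion instead of size/depth limits.
def two_means(points, iters=20):
    # Simple 1-D 2-means: centroids start at the extremes.
    c = [min(points), max(points)]
    for _ in range(iters):
        groups = ([], [])
        for p in points:
            # Assign each point to its nearer centroid.
            groups[abs(p - c[0]) > abs(p - c[1])].append(p)
        c = [sum(g) / len(g) if g else c[i] for i, g in enumerate(groups)]
    return groups

def divisive(points, min_size=3, depth=0, max_depth=3):
    # Highest levels are resolved first; small clusters become leaves.
    if len(points) <= min_size or depth >= max_depth:
        return [points]
    left, right = two_means(points)
    if not left or not right:
        return [points]
    return divisive(left, min_size, depth + 1, max_depth) + \
           divisive(right, min_size, depth + 1, max_depth)

clusters = divisive([0.1, 0.2, 0.15, 5.0, 5.1, 4.9, 9.8, 10.0, 10.2])
```

Because each level only partitions the current cluster, the tree can be cut at any depth, which mirrors the advantage the abstract claims over agglomerative methods.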
Wrappers For Performance Enhancement And Oblivious Decision Graphs
1995
"... In this doctoral dissertation, we study three basic problems in machine learning and two new hypothesis spaces with corresponding learning algorithms. The problems we investigate are: accuracy estimation, feature subset selection, and parameter tuning. The latter two problems are related and are stu ..."
Abstract

Cited by 107 (8 self)
 Add to MetaCart
In this doctoral dissertation, we study three basic problems in machine learning and two new hypothesis spaces with corresponding learning algorithms. The problems we investigate are: accuracy estimation, feature subset selection, and parameter tuning. The latter two problems are related and are studied under the wrapper approach. The hypothesis spaces we investigate are: decision tables with a default majority rule (DTMs) and oblivious read-once decision graphs (OODGs).
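The wrapper approach treats the learner as a black box and scores candidate feature subsets by estimated accuracy. A minimal sketch with forward selection wrapped around a hand-rolled 1-nearest-neighbour classifier scored by leave-one-out accuracy; the classifier choice and data are stand-ins, not the dissertation's experimental setup:

```python
# Wrapper-style forward feature selection: score each candidate subset by
# leave-one-out accuracy of a 1-NN classifier (an assumed, illustrative learner).
def loo_accuracy(X, y, features):
    if not features:
        return 0.0
    correct = 0
    for i in range(len(X)):
        # Classify example i by its nearest neighbour on the selected features.
        best, pred = None, None
        for j in range(len(X)):
            if j == i:
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if best is None or d < best:
                best, pred = d, y[j]
        correct += pred == y[i]
    return correct / len(X)

def forward_select(X, y):
    remaining = set(range(len(X[0])))
    chosen, score = [], 0.0
    while remaining:
        # Greedily add the feature giving the best estimated accuracy.
        f, s = max(((f, loo_accuracy(X, y, chosen + [f])) for f in remaining),
                   key=lambda t: t[1])
        if s <= score:
            break
        chosen.append(f)
        remaining.remove(f)
        score = s
    return chosen, score

# Feature 0 is informative, feature 1 is noise (hypothetical data).
X = [[0, 9], [1, 1], [0, 5], [1, 7], [0, 2], [1, 8]]
y = [0, 1, 0, 1, 0, 1]
chosen, score = forward_select(X, y)
```

Because the subset is evaluated with the same learner that will be deployed, the search automatically accounts for the learner's biases, which is the central argument for wrappers over filter methods.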
Evaluating Message Understanding Systems: An Analysis of . . .
COMPUTATIONAL LINGUISTICS, 1993
"... This paper describes and analyzes the results of the Third Message Understanding Conference (MUC3). It reviews the purpose, history, and methodology of the conference, summarizes the participating systems, discusses issues of measuring system effectiveness, describes the linguistic phenomena tests, ..."
Abstract

Cited by 56 (4 self)
 Add to MetaCart
This paper describes and analyzes the results of the Third Message Understanding Conference (MUC-3). It reviews the purpose, history, and methodology of the conference, summarizes the participating systems, discusses issues of measuring system effectiveness, describes the linguistic phenomena tests, and provides a critical look at the evaluation in terms of the lessons learned. One of the common problems with evaluations is that the statistical significance of the results is unknown. In the discussion of system performance, the statistical significance of the evaluation results is reported and the use of approximate randomization to calculate the statistical significance of the results of MUC-3 is described.
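Approximate randomization tests whether an observed difference between two systems could arise by chance: repeatedly shuffle which system each paired response is attributed to, and count how often the shuffled difference is at least as large as the observed one. A minimal sketch with invented per-document scores (not MUC-3 data):

```python
# Approximate randomization test for the difference in mean paired scores
# between two systems. The scores below are invented for illustration.
import random

def approx_randomization(a, b, trials=10000, seed=1):
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    count = 0
    for _ in range(trials):
        # Randomly swap each paired response between the two systems.
        xa, xb = [], []
        for u, v in zip(a, b):
            if rng.random() < 0.5:
                u, v = v, u
            xa.append(u)
            xb.append(v)
        if abs(sum(xa) / len(xa) - sum(xb) / len(xb)) >= observed:
            count += 1
    # (count + 1) / (trials + 1) is the usual conservative p-value estimate.
    return (count + 1) / (trials + 1)

sys_a = [0.9, 0.8, 0.85, 0.95, 0.9, 0.88, 0.92, 0.87]  # hypothetical scores
sys_b = [0.6, 0.55, 0.65, 0.6, 0.58, 0.62, 0.57, 0.61]
p = approx_randomization(sys_a, sys_b)
```

The test makes no distributional assumptions about the scores, which is what makes it attractive for evaluation metrics like those used at MUC-3.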
How representative are the known structures of the proteins in a complete genome? A comprehensive structural census
Fold Des 3
1998
"... Manuscript is 43 Pages in Length (including this one) ..."
Abstract

Cited by 48 (25 self)
 Add to MetaCart
Manuscript is 43 Pages in Length (including this one)
The psychometric function: II. Bootstrap-based confidence intervals and sampling
Perception and Psychophysics 63
2001
"... The psychometric function relates an observer’s performance to an independent variable, usually a physical quantity of an experimental stimulus. Even if a model is successfully fit to the data and its goodness of fit is acceptable, experimenters require an estimate of the variability of the paramete ..."
Abstract

Cited by 41 (12 self)
 Add to MetaCart
The psychometric function relates an observer’s performance to an independent variable, usually a physical quantity of an experimental stimulus. Even if a model is successfully fit to the data and its goodness of fit is acceptable, experimenters require an estimate of the variability of the parameters to assess whether differences across conditions are significant. Accurate estimates of variability are difficult to obtain, however, given the typically small size of psychophysical data sets: Traditional statistical techniques are only asymptotically correct and can be shown to be unreliable in some common situations. Here and in our companion paper (Wichmann & Hill, 2001), we suggest alternative statistical techniques based on Monte Carlo resampling methods. The present paper’s principal topic is the estimation of the variability of fitted parameters and derived quantities, such as thresholds and slopes. First, we outline the basic bootstrap procedure and argue in favor of the parametric, as opposed to the nonparametric, bootstrap. Second, we describe how the bootstrap bridging assumption, on which the validity of the procedure depends, can be tested. Third, we show how one’s choice of sampling scheme (the placement of sample points on the stimulus axis) strongly affects the reliability of bootstrap confidence intervals, and we make recommendations on how to sample the psychometric function efficiently. Fourth, we show that, under certain circumstances, the (arbitrary) choice of the distribution function can exert an unwanted influence on
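The parametric bootstrap the abstract argues for works by simulating new data sets from the fitted model and refitting each one. A crude sketch with a logistic psychometric function and a grid-search maximum-likelihood fit; the stimulus levels and response counts are invented, and the paper's actual fitting procedure is more careful than this:

```python
# Parametric bootstrap sketch for the threshold of a logistic psychometric
# function. Data, grid ranges, and the crude grid-search fit are assumptions.
import math
import random

levels = [1, 2, 3, 4, 5]            # hypothetical stimulus levels
n = 40                              # trials per level
correct = [22, 26, 32, 37, 39]      # hypothetical correct-response counts

def logistic(x, a, b):
    # 0.5 lower asymptote (2AFC-style), saturating at 1; a = threshold, b = slope.
    return 0.5 + 0.5 / (1 + math.exp(-b * (x - a)))

def fit(counts):
    # Maximum likelihood over a coarse parameter grid.
    best, best_ab = -1e18, None
    for a in [i * 0.1 for i in range(10, 51)]:
        for b in [0.5 * j for j in range(1, 9)]:
            ll = 0.0
            for x, k in zip(levels, counts):
                p = min(max(logistic(x, a, b), 1e-9), 1 - 1e-9)
                ll += k * math.log(p) + (n - k) * math.log(1 - p)
            if ll > best:
                best, best_ab = ll, (a, b)
    return best_ab

rng = random.Random(0)
a_hat, b_hat = fit(correct)
boot = []
for _ in range(200):
    # Simulate binomial data from the fitted curve, then refit (parametric bootstrap).
    sim = [sum(rng.random() < logistic(x, a_hat, b_hat) for _ in range(n))
           for x in levels]
    boot.append(fit(sim)[0])
boot.sort()
ci = (boot[4], boot[194])  # ~95% percentile interval for the threshold
```

The spread of the bootstrap threshold estimates directly reflects how the chosen sampling scheme (the placement of the stimulus levels) constrains the fit, which is the paper's third point.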
Comprehensive analysis of pseudogenes in prokaryotes: widespread gene decay and failure of putative horizontally transferred genes
Genome Biol, 2004
"... electronic version of this article is the complete one and can be found online at ..."
Abstract

Cited by 28 (4 self)
 Add to MetaCart
The electronic version of this article is the complete one and can be found online at
A Toolbox for Analyzing Programs
"... The paper describes two separate but synergistic tools for running experiments on large Lisp programs. The first tool, ..."
Abstract

Cited by 9 (1 self)
 Add to MetaCart
The paper describes two separate but synergistic tools for running experiments on large Lisp programs. The first tool,
Survey of Data Mining Approaches to User Modeling for Adaptive Hypermedia
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, 2006
"... The ability of an adaptive hypermedia system to create tailored environments depends mainly on the amount and accuracy of information stored in each user model. Some of the difficulties that user modeling faces are the amount of data available to create user models, the adequacy of the data, the noi ..."
Abstract

Cited by 7 (3 self)
 Add to MetaCart
The ability of an adaptive hypermedia system to create tailored environments depends mainly on the amount and accuracy of information stored in each user model. Some of the difficulties that user modeling faces are the amount of data available to create user models, the adequacy of the data, the noise within that data, and the necessity of capturing the imprecise nature of human behavior. Data mining and machine learning techniques have the ability to handle large amounts of data and to process uncertainty. These characteristics make these techniques suitable for automatic generation of user models that simulate human decision making. This paper surveys different data mining techniques that can be used to efficiently and accurately capture user behavior. The paper also presents guidelines that show which techniques may be used more efficiently according to the task implemented by the application.
Index Terms: Adaptive hypermedia (AH), data mining, machine learning, user modeling (UM).