Results 1 – 7 of 7
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
"... Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have bee ..."
Abstract

Cited by 564 (3 self)
 Add to MetaCart
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
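To make the "factored state" idea concrete, here is a minimal sketch (an illustration, not code from the thesis) of the simplest case: a DBN whose state consists of two binary factors that evolve independently. Its equivalent flat HMM has 4 states, and its transition matrix is the Kronecker product of the two per-factor tables, so the factored form stores two 2×2 tables (8 entries) where the flat HMM stores one 4×4 table (16 entries). The transition values A1 and A2 are made up for the example.

```python
import numpy as np

# Hypothetical per-factor transition CPDs for two binary state factors that
# evolve independently (the simplest factored DBN; real DBNs also allow
# dependencies between factors within and across time slices).
A1 = np.array([[0.9, 0.1],   # P(X1_t | X1_{t-1})
               [0.2, 0.8]])
A2 = np.array([[0.7, 0.3],   # P(X2_t | X2_{t-1})
               [0.4, 0.6]])

# The equivalent flat HMM over the joint state (X1, X2) has 2*2 = 4 states;
# with independent factors its transition matrix is the Kronecker product.
A_flat = np.kron(A1, A2)     # shape (4, 4)

print(A_flat)
print(A_flat.sum(axis=1))    # each row sums to 1.0, as a transition matrix must
```

With k binary factors the flat representation needs a 2^k × 2^k table while the factored DBN grows only linearly in k, which is the compactness gap behind the abstract's "expressive power" remark.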
Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey
 Data Mining and Knowledge Discovery
, 1997
"... Decision trees have proved to be valuable tools for the description, classification and generalization of data. Work on constructing decision trees from data exists in multiple disciplines such as statistics, pattern recognition, decision theory, signal processing, machine learning and artificial ne ..."
Abstract

Cited by 146 (1 self)
 Add to MetaCart
Decision trees have proved to be valuable tools for the description, classification and generalization of data. Work on constructing decision trees from data exists in multiple disciplines such as statistics, pattern recognition, decision theory, signal processing, machine learning and artificial neural networks. Researchers in these disciplines, sometimes working on quite different problems, identified similar issues and heuristics for decision tree construction. This paper surveys existing work on decision tree construction, attempting to identify the important issues involved, directions the work has taken and the current state of the art.
Keywords: classification, tree-structured classifiers, data compaction
1. Introduction: Advances in data collection methods, storage and processing technology are providing a unique challenge and opportunity for automated data exploration techniques. Enormous amounts of data are being collected daily from major scientific projects, e.g., Human Genome...
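As a concrete illustration of the greedy, entropy-driven construction heuristics the survey covers, here is a minimal ID3-style split-selection sketch (my own illustration, not code from the paper); examples are assumed to be (feature-dict, label) pairs and the weather data is made up.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, attr):
    """Reduction in label entropy from partitioning examples on attr."""
    labels = [y for _, y in examples]
    remainder = 0.0
    for v in {x[attr] for x, _ in examples}:
        subset = [y for x, y in examples if x[attr] == v]
        remainder += len(subset) / len(examples) * entropy(subset)
    return entropy(labels) - remainder

def best_split(examples, attrs):
    """Greedy ID3 step: pick the attribute with the highest information gain."""
    return max(attrs, key=lambda a: information_gain(examples, a))

# Tiny usage example with hypothetical weather data:
data = [({"outlook": "sunny", "windy": False}, "no"),
        ({"outlook": "sunny", "windy": True}, "no"),
        ({"outlook": "overcast", "windy": False}, "yes"),
        ({"outlook": "rain", "windy": False}, "yes"),
        ({"outlook": "rain", "windy": True}, "no")]
print(best_split(data, ["outlook", "windy"]))  # -> "outlook"
```

A full tree builder recurses on each partition until the labels are pure; the survey's point is that essentially this same greedy scheme was rediscovered across several disciplines.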
Symbolic and neural learning algorithms: an experimental comparison
 Machine Learning
, 1991
"... Abstract Despite the fact that many symbolic and neural network (connectionist) learning algorithms address the same problem of learning from classified examples, very little is known regarding their comparative strengths and weaknesses. Experiments comparing the ID3 symbolic learning algorithm with ..."
Abstract

Cited by 99 (6 self)
 Add to MetaCart
Despite the fact that many symbolic and neural network (connectionist) learning algorithms address the same problem of learning from classified examples, very little is known regarding their comparative strengths and weaknesses. Experiments comparing the ID3 symbolic learning algorithm with the perceptron and backpropagation neural learning algorithms have been performed using five large, real-world data sets. Overall, backpropagation performs slightly better than the other two algorithms in terms of classification accuracy on new examples, but takes much longer to train. Experimental results suggest that backpropagation can work significantly better on data sets containing numerical data. Also analyzed empirically are the effects of (1) the amount of training data, (2) imperfect training examples, and (3) the encoding of the desired outputs. Backpropagation occasionally outperforms the other two systems when given relatively small amounts of training data. It is slightly more accurate than ID3 when examples are noisy or incompletely specified. Finally, backpropagation more effectively utilizes a "distributed" output encoding.
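A minimal sketch of this kind of comparison protocol, using scikit-learn stand-ins (DecisionTreeClassifier uses CART rather than ID3, and the dataset and split here are illustrative, not the paper's five data sets):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

learners = {
    "decision tree (ID3 stand-in)": DecisionTreeClassifier(random_state=0),
    "perceptron": Perceptron(max_iter=1000),
    "backpropagation (MLP)": MLPClassifier(max_iter=2000, random_state=0),
}
for name, clf in learners.items():
    clf.fit(X_tr, y_tr)
    # Compare generalization accuracy on held-out examples.
    print(f"{name}: {clf.score(X_te, y_te):.3f}")
```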
A Theory of Learning Classification Rules
, 1992
"... The main contributions of this thesis are a Bayesian theory of learning classification rules, the unification and comparison of this theory with some previous theories of learning, and two extensive applications of the theory to the problems of learning class probability trees and bounding error whe ..."
Abstract

Cited by 79 (6 self)
 Add to MetaCart
The main contributions of this thesis are a Bayesian theory of learning classification rules, the unification and comparison of this theory with some previous theories of learning, and two extensive applications of the theory to the problems of learning class probability trees and bounding error when learning logical rules. The thesis is motivated by considering some current research issues in machine learning such as bias, overfitting and search, and considering the requirements placed on a learning system when it is used for knowledge acquisition. Basic Bayesian decision theory relevant to the problem of learning classification rules is reviewed, then a Bayesian framework for such learning is presented. The framework has three components: the hypothesis space, the learning protocol, and criteria for successful learning. Several learning protocols are analysed in detail: queries, logical, noisy, uncertain and positive-only examples. The analysis is done by interpreting a protocol as a...
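The framework the abstract sketches amounts to maintaining a posterior over a hypothesis space and averaging predictions. Here is a toy illustration (mine, not the thesis's) under a noisy-example protocol: hypotheses are simple threshold rules, and each observed label agrees with the true rule with probability 1 - eps. All names and values are made up for the example.

```python
# Toy Bayesian rule learning: hypotheses are threshold rules over one
# integer feature; labels are observed under class noise with rate eps.
eps = 0.1
hypotheses = {f"x > {t}": (lambda x, t=t: x > t) for t in range(5)}
prior = {h: 1.0 / len(hypotheses) for h in hypotheses}

data = [(1, False), (2, False), (3, True), (4, True)]  # (x, label) pairs

# Posterior(h) is proportional to prior(h) * product_i P(label_i | h, x_i),
# where P(label | h, x) = 1 - eps if h agrees with the label, else eps.
posterior = {}
for h, f in hypotheses.items():
    like = 1.0
    for x, label in data:
        like *= (1 - eps) if f(x) == label else eps
    posterior[h] = prior[h] * like
z = sum(posterior.values())
posterior = {h: p / z for h, p in posterior.items()}

# Class probability for a new example by Bayesian model averaging.
x_new = 3
p_true = sum(p * ((1 - eps) if hypotheses[h](x_new) else eps)
             for h, p in posterior.items())
print(posterior)
print(f"P(label=True | x={x_new}) = {p_true:.3f}")
```

Changing how the likelihood term is computed is exactly what distinguishes the protocols the thesis analyses (queries, noisy, uncertain, positive-only examples).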
Dynamical Selection of Learning Algorithms
, 1995
"... Determining the conditions for which a given learning algorithm is appropriate is an open problem in machine learning. Methods for selecting a learning algorithm for a given domain have met with limited success. This paper proposes a new approach to predicting a given example's class by locating it ..."
Abstract

Cited by 22 (3 self)
 Add to MetaCart
Determining the conditions for which a given learning algorithm is appropriate is an open problem in machine learning. Methods for selecting a learning algorithm for a given domain have met with limited success. This paper proposes a new approach to predicting a given example's class by locating it in the "example space" and then choosing the best learner(s) in that region of the example space to make predictions. The regions of the example space are defined by the prediction patterns of the learners being used. The learner(s) chosen for prediction are selected according to their past performance in that region. This dynamic approach to learning algorithm selection is compared to other methods for selecting from multiple learning algorithms. The approach is then extended to weight rather than select the algorithms according to their past performance in a given region. Both approaches are further evaluated on a set of ten domains and compared to several other meta-learning strategies...
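One plausible reading of the region mechanism, as a sketch (the names, the stub learners, and the exact signature scheme are my assumptions, not the paper's code): a region is identified by the pattern of predictions the base learners make on an example, per-region accuracies are tallied on validation data, and at test time the learner with the best record in the example's region answers.

```python
from collections import defaultdict

def region_of(learners, x):
    """Region signature: the pattern of predictions the learners make on x."""
    return tuple(learner(x) for learner in learners)

def fit_region_scores(learners, validation):
    """Count each learner's correct predictions within each region."""
    scores = defaultdict(lambda: [0] * len(learners))
    for x, y in validation:
        r = region_of(learners, x)
        for i, learner in enumerate(learners):
            scores[r][i] += (learner(x) == y)
    return scores

def predict(learners, scores, x):
    """Ask the learner with the best past record in x's region."""
    r = region_of(learners, x)
    tallies = scores.get(r, [0] * len(learners))
    best = max(range(len(learners)), key=lambda i: tallies[i])
    return learners[best](x)

# Hypothetical pre-trained base learners (stubs for illustration):
learners = [lambda x: x > 0, lambda x: x > 2, lambda x: x % 2 == 0]
validation = [(1, True), (3, True), (4, False), (-1, False)]
scores = fit_region_scores(learners, validation)
print(predict(learners, scores, 5))
```

The weighted variant the abstract mentions would combine the learners' votes in proportion to `tallies` instead of picking a single winner.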
Seer: Maximum Likelihood Regression for Learning-Speed Curves
 University of Illinois at
, 1995
"... The research presented here focuses on modeling machinelearning performance. The thesis introduces Seer, a system that generates empirical observations of classificationlearning performance and then uses those observations to create statistical models. The models can be used to predict the number ..."
Abstract

Cited by 10 (0 self)
 Add to MetaCart
The research presented here focuses on modeling machine-learning performance. The thesis introduces Seer, a system that generates empirical observations of classification-learning performance and then uses those observations to create statistical models. The models can be used to predict the number of training examples needed to achieve a desired accuracy level and the maximum accuracy possible given an unlimited number of training examples. Seer advances the state of the art with 1) models that embody the best constraints for classification learning and the most useful parameters, 2) algorithms that efficiently find maximum-likelihood models, and 3) a demonstration, on real-world data from three domains, of a practicable application of such modeling. The first part of the thesis gives an overview of the requirements for a good maximum-likelihood model of classification-learning performance. Next, reasonable design choices for such models are explored. Selection among such models is a task of nonlinear programming, but by exploiting appropriate problem constraints, the task is reduced to a nonlinear regression task that can be solved with an efficient iterative algorithm. The latter part of the thesis describes almost 100 experiments in the domains of soybean disease, heart disease, and audiological problems. The tests show that Seer is excellent at characterizing learning performance and that it seems to be as good as possible at predicting learning...
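Seer's exact model family is developed in the thesis; as a generic stand-in, the sketch below fits a common power-law learning curve acc(n) = a - b * n^(-c) by nonlinear least squares (with i.i.d. Gaussian error this is the maximum-likelihood fit), where the fitted a plays the role of the predicted asymptotic accuracy. The observations are fabricated for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def learning_curve(n, a, b, c):
    """Power-law learning curve: accuracy approaches asymptote a as n grows."""
    return a - b * n ** (-c)

# Hypothetical observations: (training-set size, measured test accuracy).
sizes = np.array([25, 50, 100, 200, 400, 800], dtype=float)
accs = np.array([0.61, 0.68, 0.74, 0.78, 0.81, 0.83])

# Nonlinear regression for the curve parameters.
(a, b, c), _ = curve_fit(learning_curve, sizes, accs, p0=[0.9, 1.0, 0.5])
print(f"predicted asymptotic accuracy: {a:.3f}")

# Invert the curve to predict the training-set size needed for 0.85 accuracy.
target = 0.85
if target < a:
    n_needed = (b / (a - target)) ** (1.0 / c)
    print(f"examples needed for {target}: ~{n_needed:.0f}")
```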
MESO: Supporting online decision making in autonomic computing systems
 IEEE Transactions on Knowledge and Data Engineering (TKDE)
, 2007
"... Abstract—Autonomic computing systems must be able to detect and respond to errant behavior or changing conditions with little or no human intervention. Clearly, decision making is a critical issue in such systems, which must learn how and when to invoke corrective actions based on past experience. T ..."
Abstract

Cited by 5 (4 self)
 Add to MetaCart
Autonomic computing systems must be able to detect and respond to errant behavior or changing conditions with little or no human intervention. Clearly, decision making is a critical issue in such systems, which must learn how and when to invoke corrective actions based on past experience. This paper describes the design, implementation, and evaluation of MESO, a pattern classifier designed to support online, incremental learning and decision making in autonomic systems. A novel feature of MESO is its use of small agglomerative clusters, called sensitivity spheres, that aggregate similar training samples. Sensitivity spheres are partitioned into sets during the construction of a memory-efficient hierarchical data structure. This structure facilitates data compression, which is important to many autonomic systems. Results are presented demonstrating that MESO achieves high accuracy while enabling rapid incremental training and classification. A case study is described in which MESO enables a mobile computing application to learn, by imitation, user preferences for balancing wireless network packet loss and bandwidth consumption. Once trained, the application can autonomously adjust error control parameters as needed while the user roams about a wireless cell.
Index Terms: Autonomic computing, adaptive software, pattern classification, decision making, imitative learning, machine learning, mobile computing, perceptual memory, reinforcement learning.
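A minimal sketch of the sensitivity-sphere idea as the abstract describes it (the radius policy and MESO's hierarchical partitioning are simplified away, and all names and data here are mine): each training sample joins the nearest sphere if it falls within a fixed radius, otherwise it seeds a new sphere, and queries are answered by the nearest sphere's majority label.

```python
import numpy as np
from collections import Counter

class SensitivitySpheres:
    """Incremental clustering into fixed-radius spheres (simplified sketch)."""

    def __init__(self, radius):
        self.radius = radius
        self.centers = []   # running mean of each sphere's samples
        self.counts = []    # samples per sphere, for the running mean
        self.labels = []    # Counter of class labels per sphere

    def train_one(self, x, y):
        x = np.asarray(x, dtype=float)
        if self.centers:
            d = [np.linalg.norm(x - c) for c in self.centers]
            i = int(np.argmin(d))
            if d[i] <= self.radius:
                # Absorb the sample: update the sphere's running mean.
                self.counts[i] += 1
                self.centers[i] += (x - self.centers[i]) / self.counts[i]
                self.labels[i][y] += 1
                return
        # Otherwise start a new sphere around this sample.
        self.centers.append(x.copy())
        self.counts.append(1)
        self.labels.append(Counter({y: 1}))

    def classify(self, x):
        x = np.asarray(x, dtype=float)
        i = int(np.argmin([np.linalg.norm(x - c) for c in self.centers]))
        return self.labels[i].most_common(1)[0][0]

# Usage: incremental training, then classification.
meso = SensitivitySpheres(radius=1.5)
for x, y in [((0, 0), "low"), ((0.5, 0.2), "low"), ((5, 5), "high")]:
    meso.train_one(x, y)
print(meso.classify((0.3, 0.1)))  # -> "low"
```

Storing per-sphere statistics instead of raw samples is what gives the data compression the abstract highlights.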