Results 1  10
of
226
Learning in graphical models
 STATISTICAL SCIENCE
, 2004
"... Statistical applications in fields such as bioinformatics, information retrieval, speech processing, image processing and communications often involve largescale models in which thousands or millions of random variables are linked in complex ways. Graphical models provide a general methodology for ..."
Abstract

Cited by 806 (10 self)
 Add to MetaCart
(Show Context)
Statistical applications in fields such as bioinformatics, information retrieval, speech processing, image processing and communications often involve largescale models in which thousands or millions of random variables are linked in complex ways. Graphical models provide a general methodology for approaching these problems, and indeed many of the models developed by researchers in these applied fields are instances of the general graphical model formalism. We review some of the basic ideas underlying graphical models, including the algorithmic ideas that allow graphical models to be deployed in largescale data analysis problems. We also present examples of graphical models in bioinformatics, errorcontrol coding and language processing.
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
"... Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have bee ..."
Abstract

Cited by 770 (3 self)
 Add to MetaCart
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs
and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linearGaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing
Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T 3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of
applying RaoBlackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization
and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
Detecting intrusion using system calls: alternative data models
 In Proceedings of the IEEE Symposium on Security and Privacy
, 1999
"... Intrusion detection systems rely on a wide variety of observable data to distinguish between legitimate and illegitimate activities. In this paper we study one such observable— sequences of system calls into the kernel of an operating system. Using systemcall data sets generated by several differen ..."
Abstract

Cited by 433 (3 self)
 Add to MetaCart
(Show Context)
Intrusion detection systems rely on a wide variety of observable data to distinguish between legitimate and illegitimate activities. In this paper we study one such observable— sequences of system calls into the kernel of an operating system. Using systemcall data sets generated by several different programs, we compare the ability of different data modeling methods to represent normal behavior accurately and to recognize intrusions. We compare the following methods: Simple enumeration of observed sequences, comparison of relative frequencies of different sequences, a rule induction technique, and Hidden Markov Models (HMMs). We discuss the factors affecting the performance of each method, and conclude that for this particular problem, weaker methods than HMMs are likely sufficient. 1.
The Hierarchical Hidden Markov Model: Analysis and Applications
 MACHINE LEARNING
, 1998
"... . We introduce, analyze and demonstrate a recursive hierarchical generalization of the widely used hidden Markov models, which we name Hierarchical Hidden Markov Models (HHMM). Our model is motivated by the complex multiscale structure which appears in many natural sequences, particularly in langua ..."
Abstract

Cited by 326 (3 self)
 Add to MetaCart
. We introduce, analyze and demonstrate a recursive hierarchical generalization of the widely used hidden Markov models, which we name Hierarchical Hidden Markov Models (HHMM). Our model is motivated by the complex multiscale structure which appears in many natural sequences, particularly in language, handwriting and speech. We seek a systematic unsupervised approach to the modeling of such structures. By extendingthe standard forwardbackward(BaumWelch) algorithm, we derive an efficient procedure for estimating the model parameters from unlabeled data. We then use the trained model for automatic hierarchical parsing of observation sequences. We describe two applications of our model and its parameter estimation procedure. In the first application we show how to construct hierarchical models of natural English text. In these models different levels of the hierarchy correspond to structures on different length scales in the text. In the second application we demonstrate how HHMMs can b...
Selective Markov Models for Predicting WebPage Accesses
, 2001
"... The problem of predicting a user’s behavior on a website has gained importance due to the rapid growth of the worldwideweb and the need to personalize and influence a user’s browsing experience. Markov models and their variations have been found well suited for addressing this problem. Of the dif ..."
Abstract

Cited by 166 (1 self)
 Add to MetaCart
The problem of predicting a user’s behavior on a website has gained importance due to the rapid growth of the worldwideweb and the need to personalize and influence a user’s browsing experience. Markov models and their variations have been found well suited for addressing this problem. Of the different variations or Markov models it is generally found that higherorder Markov models display high predictive accuracies. However higher order models are also extremely complicated due to their large number of states that increases their space and runtime requirements. In this paper we present different techniques for intelligently selecting parts of different order Markov models so that the resulting model has a reduced state complexity and improved prediction accuracy. We have tested our models on various datasets and have found that their performance is consistently superior to that obtained by higherorder Markov models.
Markovian Models for Sequential Data
, 1996
"... Hidden Markov Models (HMMs) are statistical models of sequential data that have been used successfully in many machine learning applications, especially for speech recognition. Furthermore, in the last few years, many new and promising probabilistic models related to HMMs have been proposed. We firs ..."
Abstract

Cited by 119 (2 self)
 Add to MetaCart
Hidden Markov Models (HMMs) are statistical models of sequential data that have been used successfully in many machine learning applications, especially for speech recognition. Furthermore, in the last few years, many new and promising probabilistic models related to HMMs have been proposed. We first summarize the basics of HMMs, and then review several recent related learning algorithms and extensions of HMMs, including in particular hybrids of HMMs with artificial neural networks, InputOutput HMMs (which are conditional HMMs using neural networks to compute probabilities), weighted transducers, variablelength Markov models and Markov switching statespace models. Finally, we discuss some of the challenges of future research in this very active area. 1 Introduction Hidden Markov Models (HMMs) are statistical models of sequential data that have been used successfully in many applications in artificial intelligence, pattern recognition, speech recognition, and modeling of biological ...
The Continuator: Musical Interaction with Style
 INTERNATIONAL COMPUTER MUSIC CONFERENCE, GOTHEBORG (SWEDEN), ICMA
, 2002
"... We propose a system, the Continuator, that bridges the gap between two classes of traditionally incompatible musical systems: 1) interactive musical systems, limited in their ability to generate stylistically consistent material, and 2) music imitation systems, which are fundamentally not interactiv ..."
Abstract

Cited by 116 (16 self)
 Add to MetaCart
We propose a system, the Continuator, that bridges the gap between two classes of traditionally incompatible musical systems: 1) interactive musical systems, limited in their ability to generate stylistically consistent material, and 2) music imitation systems, which are fundamentally not interactive. Our purpose is to allow musicians to extend their technical ability with stylistically consistent, automatically learnt material. This goal requires the ability for the system to build operational representations of musical styles in a real time context. Our approach is based on a Markov model of musical styles augmented to account for musical issues such as management of rhythm, beat, harmony, and imprecision. The resulting system is able to learn and generate music in any style, either in standalone mode, as continuations of musician's input, or as interactive improvisation back up. Lastly, the very design of the system makes possible new modes of musical collaborative playing. We describe the architecture, implementation issues and experimentations conducted with the system in several real world contexts.
On prediction using variable order Markov models
 JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH
, 2004
"... This paper is concerned with algorithms for prediction of discrete sequences over a finite alphabet, using variable order Markov models. The class of such algorithms is large and in principle includes any lossless compression algorithm. We focus on six prominent prediction algorithms, including Cont ..."
Abstract

Cited by 105 (1 self)
 Add to MetaCart
(Show Context)
This paper is concerned with algorithms for prediction of discrete sequences over a finite alphabet, using variable order Markov models. The class of such algorithms is large and in principle includes any lossless compression algorithm. We focus on six prominent prediction algorithms, including Context Tree Weighting (CTW), Prediction by Partial Match (PPM) and Probabilistic Suffix Trees (PSTs). We discuss the properties of these algorithms and compare their performance using real life sequences from three domains: proteins, English text and music pieces. The comparison is made with respect to prediction quality as measured by the average logloss. We also compare classification algorithms based on these predictors with respect to a number of large protein classification tasks. Our results indicate that a “decomposed” CTW (a variant of the CTW algorithm) and PPM outperform all other algorithms in sequence prediction tasks. Somewhat surprisingly, a different algorithm, which is a modification of the LempelZiv compression algorithm, significantly outperforms all algorithms on the protein classification problems.
Variations on Probabilistic Suffix Trees: Statistical Modeling and Prediction of Protein Families
, 2001
"... Motivation: We present a method for modeling protein families by means of probabilistic suffix trees (PSTs). The method is based on identifying significant patterns in a set of related protein sequences. The patterns can be of arbitrary length, and the input sequences do not need to be aligned, nor ..."
Abstract

Cited by 82 (7 self)
 Add to MetaCart
Motivation: We present a method for modeling protein families by means of probabilistic suffix trees (PSTs). The method is based on identifying significant patterns in a set of related protein sequences. The patterns can be of arbitrary length, and the input sequences do not need to be aligned, nor is delineation of domain boundaries required. The method is automatic, and can be applied, without assuming any preliminary biological information, with surprising success. Basic biological considerations such as amino acid background probabilities, and amino acids substitution probabilities can be incorporated to improve performance. Results: The PST can serve as a predictive tool for protein sequence classification, and for detecting conserved patterns (possibly functionally or structurally important) within protein sequences. The method was tested on the Pfam database of protein families with more than satisfactory performance. Exhaustive evaluations show that the PST model detects much more related sequences than pairwise methods such as GappedBLAST, and is almost as sensitive as a hidden Markov model that is trained from a multiple alignment of the input sequences, while being much faster. Availability: The programs are available upon request from the authors. Contact: jill@cs.huji.ac.il; golan@cs.cornell.edu
Mixed memory Markov models: decomposing complex stochastic processes as mixtures of simpler ones
, 1998
"... . We study Markov models whose state spaces arise from the Cartesian product of two or more discrete random variables. We show how to parameterize the transition matrices of these models as a convex combinationor mixtureof simpler dynamical models. The parameters in these models admit a simple ..."
Abstract

Cited by 73 (1 self)
 Add to MetaCart
(Show Context)
. We study Markov models whose state spaces arise from the Cartesian product of two or more discrete random variables. We show how to parameterize the transition matrices of these models as a convex combinationor mixtureof simpler dynamical models. The parameters in these models admit a simple probabilistic interpretation and can be fitted iteratively by an ExpectationMaximization (EM) procedure. We derive a set of generalized BaumWelch updates for factorial hidden Markov models that make use of this parameterization. We also describe a simple iterative procedure for approximately computing the statistics of the hidden states. Throughout, we give examples where mixed memory models provide a useful representation of complex stochastic processes. Keywords: Markov models, mixture models, discrete time series 1. Introduction The modeling of time series is a fundamental problem in machine learning, with widespread applications. These include speech recognition (Rabiner, 1989), natu...