Results 1–10 of 47
Dynamic Bayesian Networks: Representation, Inference and Learning
2002
Abstract

Cited by 564 (3 self)
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
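As an illustrative aside (not from the thesis itself), the forward recursion for a plain HMM, the model class that DBNs generalize by factoring the hidden state, can be sketched in a few lines. The toy transition and emission tables below are made up:

```python
# Toy example: the forward algorithm for an HMM, computing P(obs).
# DBNs generalize this model by factoring the single discrete state.

def hmm_forward(init, trans, emit, obs):
    """Return P(obs) under an HMM via the forward recursion, O(T * K^2)."""
    K = len(init)
    # alpha[i] = P(o_1..o_t, state_t = i)
    alpha = [init[i] * emit[i][obs[0]] for i in range(K)]
    for o in obs[1:]:
        alpha = [sum(alpha[j] * trans[j][i] for j in range(K)) * emit[i][o]
                 for i in range(K)]
    return sum(alpha)

init = [0.6, 0.4]                         # initial state distribution
trans = [[0.7, 0.3], [0.4, 0.6]]          # trans[j][i] = P(i | j)
emit = [[0.9, 0.1], [0.2, 0.8]]           # emit[i][o] = P(o | state i)
p = hmm_forward(init, trans, emit, [0, 1, 0])
```

Each step costs O(K^2) for K states; a DBN with n binary state variables has K = 2^n, which is why the factored representations and approximate inference methods listed above matter.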
Structural extension to logistic regression: Discriminative parameter learning of belief net classifiers
In Proceedings of the Eighteenth Annual National Conference on Artificial Intelligence (AAAI-02), 2002
Abstract

Cited by 58 (8 self)
Bayesian belief nets (BNs) are often used for classification tasks — typically to return the most likely class label for each specified instance. Many BN learners, however, attempt to find the BN that maximizes a different objective function — viz., likelihood, rather than classification accuracy — typically by first learning an appropriate graphical structure, then finding the parameters for that structure that maximize the likelihood of the data. As these parameters may not maximize the classification accuracy, “discriminative parameter learners” follow the alternative approach of seeking the parameters that maximize conditional likelihood (CL), over the distribution of instances the BN will have to classify. This paper first formally specifies this task, shows how it extends standard logistic regression, and analyzes its inherent sample and computational complexity. We then present a general algorithm for this task, ELR, that applies to arbitrary BN structures and that works effectively even when given incomplete training data. Unfortunately, ELR is not guaranteed to find the parameters that optimize conditional likelihood; moreover, even the optimal-CL parameters need not have minimal classification error. This paper therefore presents empirical evidence that ELR produces effective classifiers, often superior to the ones produced by the standard “generative” algorithms, especially in common situations where the given BN structure is incorrect.
Keywords: (Bayesian) belief nets, logistic regression, classification, PAC-learning, computational/sample complexity
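To make the connection to logistic regression concrete, here is a minimal sketch (our illustration, not the ELR algorithm itself) of discriminative parameter learning in its simplest special case: stochastic gradient ascent on the conditional log-likelihood of a logistic model. The data and learning rate are fabricated:

```python
import math

# Sketch: maximizing conditional likelihood P(y | x) rather than the
# joint likelihood. For a naive-Bayes-like structure this reduces to
# logistic regression; here we fit it by stochastic gradient ascent.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=500):
    w = [0.0] * (len(X[0]) + 1)          # bias followed by feature weights
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            g = yi - p                   # gradient of conditional log-lik.
            w[0] += lr * g
            for j, xj in enumerate(xi):
                w[j + 1] += lr * g * xj
    return w

X = [[0.0], [1.0], [2.0], [3.0]]         # fabricated 1-D training data
y = [0, 0, 1, 1]
w = fit_logistic(X, y)
pred = [1 if sigmoid(w[0] + w[1] * x[0]) > 0.5 else 0 for x in X]
```

A generative learner would instead set parameters from class-conditional counts, which can misclassify when the assumed structure is wrong, which is the situation the abstract highlights.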
A variational approximation for Bayesian networks with discrete and continuous latent variables
In UAI, 1999
Abstract

Cited by 43 (6 self)
We show how to use a variational approximation to the logistic function to perform approximate inference in Bayesian networks containing discrete nodes with continuous parents. Essentially, we convert the logistic function to a Gaussian, which facilitates exact inference, and then iteratively adjust the variational parameters to improve the quality of the approximation. We demonstrate experimentally that this approximation is much faster than sampling, but comparable in accuracy. We also introduce a simple new technique for handling evidence, which allows us to handle arbitrary distributions on observed nodes, as well as achieving a significant speedup in networks with discrete variables of large cardinality.
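The "convert the logistic function to a Gaussian" step is commonly done with a quadratic (Gaussian-form) lower bound on the logistic function; the sketch below checks one standard such bound (the Jaakkola and Jordan form) numerically. The specific bound and parameter values used in the paper may differ:

```python
import math

# Numerical sketch of a variational lower bound on the logistic function:
#   sigma(x) >= sigma(xi) * exp((x - xi)/2 - lam(xi) * (x**2 - xi**2)),
# with lam(xi) = (sigma(xi) - 0.5) / (2 * xi). The exponent is quadratic
# in x (Gaussian form), which is what makes exact inference tractable.

def sigma(x):
    return 1.0 / (1.0 + math.exp(-x))

def lam(xi):
    return (sigma(xi) - 0.5) / (2.0 * xi)

def lower_bound(x, xi):
    return sigma(xi) * math.exp((x - xi) / 2.0 - lam(xi) * (x * x - xi * xi))

xi = 2.0                                             # variational parameter
grid = [i / 2.0 - 5.0 for i in range(21)]            # x in -5.0 .. 5.0
gap = max(sigma(x) - lower_bound(x, xi) for x in grid)
tight = abs(lower_bound(xi, xi) - sigma(xi))         # exact at x = xi
```

Iteratively re-setting xi per node is what "adjust the variational parameters to improve the quality of the approximation" amounts to in this family of methods.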
Survey of Neural Transfer Functions
Neural Computing Surveys, 1999
Abstract

Cited by 35 (19 self)
The choice of transfer functions may strongly influence the complexity and performance of neural networks. Although sigmoidal transfer functions are the most common, there is no a priori reason why models based on such functions should always provide optimal decision borders. A large number of alternative transfer functions have been described in the literature. A taxonomy of activation and output functions is proposed, and advantages of various non-local and local neural transfer functions are discussed. Several less-known types of transfer functions and new combinations of activation/output functions are described. Universal transfer functions, parametrized to change from localized to delocalized type, are of greatest interest. Other types of neural transfer functions discussed here include functions with activations based on non-Euclidean distance measures, bicentral functions, formed from products or linear combinations of pairs of sigmoids, and extensions of such functions making rotations...
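As a small illustration of the survey's distinction between non-local and local transfer functions, the sketch below (our example; the center, width, and slope values are made up) builds a localized, bicentral-style unit as a product of two sigmoids:

```python
import math

# Illustrative sketch: a localized transfer function formed as a product
# of a rising and a falling sigmoid, contrasted with the plain sigmoid.

def sigmoid(x, s=1.0):
    return 1.0 / (1.0 + math.exp(-s * x))

def bicentral(x, center=0.0, width=2.0, slope=4.0):
    # product of two opposed sigmoids -> a window around `center`
    return (sigmoid(x - center + width / 2.0, slope)
            * sigmoid(-(x - center - width / 2.0), slope))

# The bicentral unit is local (large near its center, vanishing far away),
# while the sigmoid is monotone, giving non-local decision borders.
near = bicentral(0.0)
far = bicentral(10.0)
```

Adding independent slopes and centers per side (and rotations, as the abstract mentions) gives the more flexible variants the survey catalogues.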
Irrelevance and parameter learning in Bayesian networks
Artificial Intelligence, 1996
Abstract

Cited by 24 (2 self)
Bayesian network classifiers have been widely used for classification problems. Given a fixed Bayesian network structure, parameter learning can take two different approaches: generative and discriminative learning. While generative parameter learning is more efficient, discriminative parameter learning is more effective. In this paper, we propose a simple, efficient, and effective discriminative parameter learning method, called Discriminative Frequency Estimate (DFE), which learns parameters by discriminatively computing frequencies from data. Empirical studies show that the DFE algorithm integrates the advantages of both generative and discriminative learning: it performs as well as the state-of-the-art discriminative parameter learning method ELR in accuracy, but is significantly more efficient.
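The abstract leaves the update rule unspecified; one plausible reading of "discriminatively computing frequencies" is sketched below for a naive-Bayes-structured classifier, where each count is incremented by how much the current model under-predicts the true class. The details (binary features, Laplace smoothing, number of passes) are our assumptions, not the paper's:

```python
from collections import defaultdict

# Hedged sketch of a DFE-style learner: counts grow by (1 - P(y | x))
# instead of by 1, so well-classified instances stop moving parameters.

def posterior(counts, class_counts, x, classes, alpha=1.0):
    post = {}
    for c in classes:
        p = class_counts[c] + alpha                      # smoothed prior
        for j, v in enumerate(x):
            p *= (counts[(c, j, v)] + alpha) / (class_counts[c] + 2 * alpha)
        post[c] = p
    z = sum(post.values())
    return {c: pc / z for c, pc in post.items()}

def dfe_fit(data, classes, passes=20):
    counts, class_counts = defaultdict(float), defaultdict(float)
    for _ in range(passes):
        for x, y in data:
            post = posterior(counts, class_counts, x, classes)
            step = 1.0 - post[y]                         # under-prediction of y
            class_counts[y] += step
            for j, v in enumerate(x):
                counts[(y, j, v)] += step
    return counts, class_counts

data = [([0, 0], 0), ([0, 1], 0), ([1, 1], 1), ([1, 0], 1)]  # fabricated
counts, class_counts = dfe_fit(data, classes=[0, 1])
pred = []
for x, _ in data:
    post = posterior(counts, class_counts, x, [0, 1])
    pred.append(max(post, key=post.get))
```

Each update is a single counting pass rather than a gradient computation, which is consistent with the efficiency claim relative to ELR.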
Perceptually Inspired Signal-Processing Strategies for Robust Speech Recognition in Reverberant Environments
1998
Abstract

Cited by 14 (0 self)
Natural, hands-free interaction with computers is currently one of the great unfulfilled promises of automatic speech recognition (ASR), in part because ASR systems cannot reliably recognize speech under everyday, reverberant conditions that pose no problems for most human listeners. The specific properties of the auditory representation of speech likely contribute to reliable human speech recognition under such conditions. This dissertation explores the use of perceptually inspired signal-processing strategies: critical-band-like frequency analysis, an emphasis of slow changes in the spectral structure of the speech signal, adaptation, integration of phonetic information over syllabic durations, and use of multiple signal representations for...
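As a hedged illustration of one ingredient mentioned above, critical-band-like frequency analysis, the sketch below computes filter center frequencies on the mel scale, a common auditory frequency warp; the dissertation's actual filterbank may differ:

```python
import math

# Illustrative only: mel-scale filter centers, spaced roughly linearly at
# low frequencies and logarithmically at high frequencies, mimicking the
# critical-band resolution of the auditory system.

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_centers(n_filters, f_lo, f_hi):
    lo, hi = hz_to_mel(f_lo), hz_to_mel(f_hi)
    mels = [lo + i * (hi - lo) / (n_filters + 1)
            for i in range(1, n_filters + 1)]
    return [700.0 * (10.0 ** (m / 2595.0) - 1.0) for m in mels]

centers = mel_centers(10, 0.0, 8000.0)   # 10 filters over 0-8 kHz
```

Equal spacing on the mel axis yields filter spacing that widens with frequency, which is the "critical-band-like" property the abstract refers to.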
Probabilistic Alignment of Motifs with Sequences
Bioinformatics, 2002
Abstract

Cited by 13 (1 self)
Motivation: Motif detection is an important component of the classification and annotation of protein sequences. A method for aligning motifs with an amino acid sequence is introduced. The motifs can be described by the secondary (i.e. functional, biophysical, etc.) characteristics of a signal or pattern to be detected. The results produced are based on the statistical relevance of the alignment. The method was targeted to avoid the problems (i.e. overfitting, biological interpretation and mathematical soundness) encountered in other methods currently available.
Suboptimal behavior of Bayes and MDL in classification under misspecification
COLT, 2004
Abstract

Cited by 13 (3 self)
We show that forms of Bayesian and MDL inference that are often applied to classification problems can be inconsistent. This means that there exists a learning problem such that for all amounts of data the generalization errors of the MDL classifier and the Bayes classifier relative to the Bayesian posterior both remain bounded away from the smallest achievable generalization error. From a Bayesian point of view, the result can be reinterpreted as saying that Bayesian inference can be inconsistent under misspecification, even for countably infinite models. We extensively discuss the result from both a Bayesian and an MDL perspective.
A Contribution to Multi-Modal Identity Verification Using Decision Fusion
 Department of
1999
Abstract

Cited by 12 (1 self)
The contribution of this paper is to compare paradigms coming from the classes of parametric and non-parametric techniques to solve the decision fusion problem encountered in the design of a multimodal biometric identity verification system. The multimodal identity verification system under consideration is built of d modalities in parallel, each one delivering as output a scalar number, called a score, stating how well the claimed identity is verified. A decision fusion module receiving as input the d scores has to take a binary decision: accept or reject the claimed identity. We have solved this fusion problem using parametric and non-parametric classifiers. The performances of all these fusion modules have been evaluated and compared with other approaches on a multimodal database containing both vocal and visual biometric modalities.
Keywords: Multimodal identity verification, biometrics, decision fusion.

1 Introduction

The automatic verification of a person is more and...
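A minimal sketch of one parametric fusion rule consistent with the setup described (our illustration, with fabricated scores): model each modality's score as class-conditional Gaussian and accept when the joint log-likelihood ratio favors the genuine class:

```python
import math
import statistics

# Toy parametric decision fusion over d = 2 modality scores: fit a
# per-class Gaussian to each score dimension, then accept iff the joint
# log-likelihood under "genuine" exceeds that under "impostor".

def fit_gaussians(scores_by_class):
    # scores_by_class: {label: list of d-dimensional score vectors}
    params = {}
    for label, vecs in scores_by_class.items():
        cols = list(zip(*vecs))
        params[label] = [(statistics.mean(c), statistics.stdev(c))
                         for c in cols]
    return params

def log_lik(vec, dims):
    return sum(-math.log(s * math.sqrt(2.0 * math.pi))
               - (x - m) ** 2 / (2.0 * s * s)
               for x, (m, s) in zip(vec, dims))

def accept(params, vec):
    return log_lik(vec, params["genuine"]) > log_lik(vec, params["impostor"])

train = {  # fabricated scores from two modalities (e.g. voice, face)
    "genuine":  [[0.9, 0.8], [0.8, 0.9], [0.85, 0.7], [0.95, 0.85]],
    "impostor": [[0.2, 0.3], [0.1, 0.25], [0.3, 0.2], [0.15, 0.35]],
}
params = fit_gaussians(train)
decision = accept(params, [0.88, 0.82])   # both modalities score high
```

A non-parametric alternative in the same spirit would replace the Gaussian densities with, for example, nearest-neighbour estimates over the same training scores.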