Results 11 – 18 of 18
Supervised Learning of Bayesian Network Parameters Made Easy
, 2002
Abstract

Cited by 4 (1 self)
 Add to MetaCart
Bayesian network models are widely used for supervised prediction tasks such as classification. Usually the parameters of such models are determined using 'unsupervised' methods such as maximization of the joint likelihood. In many cases, the reason is that it is not clear how to find the parameters maximizing the supervised (conditional) likelihood. We show how the supervised learning problem can be solved efficiently for a large class of Bayesian network models, including the Naive Bayes (NB) and tree-augmented NB (TAN) classifiers. We do this by showing that under a certain general condition on the network structure, the supervised learning problem is exactly equivalent to logistic regression. Hitherto this was known only for Naive Bayes models. Since logistic regression models have a concave log-likelihood surface, the global maximum can be easily found by local optimization methods.
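The concavity claim in this abstract can be illustrated with a minimal sketch, assuming a plain binary logistic regression (the data, step size, and function names below are invented for illustration, not taken from the paper): gradient ascent on the conditional log-likelihood cannot get trapped in a non-global local maximum.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conditional_log_likelihood(w, X, y, eps=1e-12):
    """Supervised (conditional) log-likelihood of labels y given inputs X."""
    p = sigmoid(X @ w)
    return float(np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

def fit_logistic(X, y, lr=0.5, steps=2000):
    """Gradient ascent on the concave conditional log-likelihood."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        w += lr * X.T @ (y - p) / len(y)  # exact gradient of the log-likelihood
    return w

# Invented toy data: labels drawn from a known logistic model.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = (rng.random(300) < sigmoid(X @ w_true)).astype(float)

w_hat = fit_logistic(X, y)
```

Because the surface is concave, any step size small enough to converge reaches the same (global) optimum that the paper's equivalence result makes available for NB and TAN parameters.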
Unsupervised Bayesian Visualization of High-Dimensional Data
 In
, 2000
Abstract

Cited by 3 (0 self)
 Add to MetaCart
We propose a data reduction method based on a probabilistic similarity framework where two vectors are considered similar if they lead to similar predictions. We show how this type of a probabilistic similarity metric can be defined both in a supervised and unsupervised manner. As a concrete application of the suggested multidimensional scaling scheme, we describe how the method can be used for producing visual images of high-dimensional data, and give several examples of visualizations obtained by using the suggested scheme with probabilistic Bayesian network models.
1. INTRODUCTION Multidimensional scaling (see, e.g., [3, 2]) is a data compression or data reduction task where the goal is to replace the original high-dimensional data vectors with much shorter vectors, while losing as little information as possible. Intuitively speaking, it can be argued that a pragmatically sensible data reduction scheme is such that two vectors close to each other in the original multidimensional s...
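The core idea of "similar if they lead to similar predictions" can be sketched as follows (hypothetical names throughout; the paper's actual model is a Bayesian network, stood in for here by any function returning a predictive distribution): compare the predictive distributions two vectors induce, e.g. with a symmetrised Kullback–Leibler divergence.

```python
import numpy as np

def sym_kl(p, q, eps=1e-12):
    """Symmetrised KL divergence between two discrete distributions."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def predictive_distance(x1, x2, predict):
    """Distance between data vectors = divergence of their predictions.

    `predict` is any model mapping a vector to a probability distribution
    over outcomes (a stand-in for the Bayesian network model)."""
    return sym_kl(predict(x1), predict(x2))

# Toy stand-in model: softmax over inner products with fixed prototypes.
prototypes = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])

def toy_predict(x):
    scores = prototypes @ np.asarray(x, dtype=float)
    e = np.exp(scores - scores.max())
    return e / e.sum()
```

Feeding these pairwise distances into an off-the-shelf multidimensional scaling routine would then yield the kind of 2-D visualizations the abstract describes.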
Fast Factored Density Estimation and Compression with Bayesian Networks
, 2002
Abstract

Cited by 3 (1 self)
 Add to MetaCart
Many important data analysis tasks can be addressed by formulating them as probability estimation problems. For example, a popular general approach to automatic classification problems is to learn a probabilistic model of each class from data in which the classes are known, and then use Bayes's rule with these models to predict the correct classes of other data for which they are not known. Anomaly detection and scientific discovery tasks can often be addressed by learning probability models over possible events and then looking for events to which these models assign low probabilities. Many data compression algorithms such as Huffman coding and arithmetic coding rely on probabilistic models of the data stream in order to achieve high compression rates.
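The link between a probability model and code length can be sketched with a small Huffman coder (the standard textbook construction, not the thesis's own implementation): more probable symbols receive shorter codes.

```python
import heapq

def huffman_codes(probs):
    """Build a prefix-free Huffman code from a symbol -> probability map."""
    if len(probs) == 1:
        return {next(iter(probs)): "0"}
    # Heap entries: [total probability, tie-breaker, partial code table].
    heap = [[p, i, {s: ""}] for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap)  # two least probable subtrees
        p2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, [p1 + p2, counter, merged])
        counter += 1
    return heap[0][2]

codes = huffman_codes({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
```

For this dyadic distribution the code lengths (1, 2, 3, 3 bits) exactly match the symbols' information content, which is why a better probability model of the stream directly translates into a higher compression rate.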
Knowledge Discovery From Distributed And Textual Data
 Hong Kong University of Science and Technology
, 1999
An Unsupervised Bayesian Distance Measure
 Proceedings of the Fifth European Workshop on Case-based Reasoning (EWCBR'2000), LNAI 1898
, 2000
Abstract

Cited by 2 (0 self)
 Add to MetaCart
We introduce a distance measure based on the idea that two vectors are considered similar if they lead to similar predictive probability distributions. The suggested approach avoids the scaling problem inherent to many alternative techniques, as the method automatically transforms the original attribute space to a probability space where all the numbers lie between 0 and 1. The method is also flexible in the sense that it allows different attribute types (discrete or continuous) in the same consistent framework. To study the validity of the suggested measure, we ran a series of experiments with publicly available data sets. The empirical results demonstrate that the unsupervised distance measure is sensible in the sense that it can be used for discovering the hidden clustering structure of the data.
1 Introduction Machine learning techniques usually aim at compressing available sample data into more compact representations called models. These models can then be used for ...
Supervised Naive Bayes Parameters
, 2002
Abstract

Cited by 2 (1 self)
 Add to MetaCart
In this paper we show how this supervised learning problem can be solved efficiently. We introduce an alternative parametrization in which the supervised likelihood becomes concave. From this result it follows that there can be at most one maximum, which is easily found by local optimization methods. We present test results that show this is feasible and highly beneficial.
The design and collection of COSINE, a multi-microphone in-situ speech corpus recorded in noisy environments
Abstract
 Add to MetaCart
We present an overview of the data collection and transcription efforts for the COnversational Speech In Noisy Environments (COSINE) corpus. The corpus is a set of multi-party conversations recorded in real-world environments, with background noise, that can be used to train noise-robust speech recognition systems or develop speech denoising algorithms. We explain the motivation for creating such a corpus, and describe the resulting audio recordings and transcriptions that comprise the corpus. These high-quality recordings were captured in-situ on a custom wearable recording system, whose design and construction are also described. On separate synchronized audio channels, seven-channel audio is captured with a 4-channel far-field microphone array, along with a close-talking, a monophonic far-field, and a throat microphone. This corpus thus creates many possibilities for speech algorithm research.
Bayesian Network Classifier for Medical Data Analysis
Abstract
 Add to MetaCart
Abstract: Bayesian networks encode causal relations between variables using probability and graph theory. They can be used both for prediction of an outcome and for interpretation of predictions based on the encoded causal relations. In this paper we analyse a tree-like Bayesian network learning algorithm optimised for classification of data, and we give solutions to the interpretation and analysis of predictions. The classification of logical – i.e. binary – data arises specifically in the field of medical diagnosis, where we have to predict the survival chance based on different types of medical observations, or we must select the most relevant cause corresponding to a given patient record. Surgery survival prediction was examined with the algorithm. Bypass surgery survival chance must be computed for a given patient, given a dataset of 66 medical examinations for 313 patients.
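The prediction step described here can be sketched minimally with a plain Naive Bayes posterior (simpler than the tree-like network the paper analyses; every probability below is invented, not taken from the 313-patient dataset): the survival chance for a patient record follows from Bayes's rule over per-class likelihoods.

```python
import numpy as np

def nb_posterior(x, prior, cond):
    """Posterior over classes for a binary feature vector x.

    prior[c]   = P(class = c)
    cond[c, j] = P(x_j = 1 | class = c)
    """
    x = np.asarray(x)
    log_post = np.log(prior)
    for j in range(len(x)):
        # Multiply in (in log space) the likelihood of each observed finding.
        log_post = log_post + np.log(cond[:, j] if x[j] == 1 else 1.0 - cond[:, j])
    p = np.exp(log_post - log_post.max())  # stabilise before normalising
    return p / p.sum()

# Invented numbers: two classes (survives / does not), two binary findings.
prior = np.array([0.7, 0.3])
cond = np.array([[0.9, 0.2],   # P(finding_j = 1 | survives)
                 [0.1, 0.8]])  # P(finding_j = 1 | does not survive)
posterior = nb_posterior([1, 1], prior, cond)
```

A tree-augmented network would replace the independent per-finding factors with conditionals that also depend on a parent finding, but the Bayes's-rule normalisation step is the same.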