Results 1  10
of
272,028
Estimating Continuous Distributions in Bayesian Classifiers
 In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence
, 1995
"... When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon the normality ..."
Abstract

Cited by 489 (2 self)
 Add to MetaCart
the normality assumption and instead use statistical methods for nonparametric density estimation. For a naive Bayesian classifier, we present experimental results on a variety of natural and artificial domains, comparing two methods of density estimation: assuming normality and modeling each conditional
A maximum likelihood approach to continuous speech recognition
 IEEE Trans. Pattern Anal. Machine Intell
, 1983
"... AbstractSpeech recognition is formulated as a problem of maximum likelihood decoding. This formulation requires statistical models of the speech production process. In this paper, we describe a number of statistical models for use in speech recognition. We give special attention to determining the ..."
Abstract

Cited by 472 (9 self)
 Add to MetaCart
with them. Index TermsMarkov models, maximum likelihood, parameter estimation, speech recognition, statistical models. I.
A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood
, 2003
"... The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximumlikelihood principle, which clearly satisfies these requirements. The ..."
Abstract

Cited by 2109 (30 self)
 Add to MetaCart
of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximumlikelihood programs and much higher than the performance
Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods
 ADVANCES IN LARGE MARGIN CLASSIFIERS
, 1999
"... The output of a classifier should be a calibrated posterior probability to enable postprocessing. Standard SVMs do not provide such probabilities. One method to create probabilities is to directly train a kernel classifier with a logit link function and a regularized maximum likelihood score. Howev ..."
Abstract

Cited by 1041 (0 self)
 Add to MetaCart
The output of a classifier should be a calibrated posterior probability to enable postprocessing. Standard SVMs do not provide such probabilities. One method to create probabilities is to directly train a kernel classifier with a logit link function and a regularized maximum likelihood score
Probabilistic Latent Semantic Analysis
 In Proc. of Uncertainty in Artificial Intelligence, UAI’99
, 1999
"... Probabilistic Latent Semantic Analysis is a novel statistical technique for the analysis of twomode and cooccurrence data, which has applications in information retrieval and filtering, natural language processing, machine learning from text, and in related areas. Compared to standard Latent Sema ..."
Abstract

Cited by 760 (9 self)
 Add to MetaCart
to avoid overfitting, we propose a widely applicable generalization of maximum likelihood model fitting by tempered EM. Our approach yields substantial and consistent improvements over Latent Semantic Analysis in a number of experiments.
Probabilistic Principal Component Analysis
 Journal of the Royal Statistical Society, Series B
, 1999
"... Principal component analysis (PCA) is a ubiquitous technique for data analysis and processing, but one which is not based upon a probability model. In this paper we demonstrate how the principal axes of a set of observed data vectors may be determined through maximumlikelihood estimation of paramet ..."
Abstract

Cited by 703 (5 self)
 Add to MetaCart
of parameters in a latent variable model closely related to factor analysis. We consider the properties of the associated likelihood function, giving an EM algorithm for estimating the principal subspace iteratively, and discuss, with illustrative examples, the advantages conveyed by this probabilistic approach
Maximum Likelihood Phylogenetic Estimation from DNA Sequences with Variable Rates over Sites: Approximate Methods
 J. Mol. Evol
, 1994
"... Two approximate methods are proposed for maximum likelihood phylogenetic estimation, which allow variable rates of substitution across nucleotide sites. Three data sets with quite different characteristics were analyzed to examine empirically the performance of these methods. The first, called ..."
Abstract

Cited by 540 (28 self)
 Add to MetaCart
Two approximate methods are proposed for maximum likelihood phylogenetic estimation, which allow variable rates of substitution across nucleotide sites. Three data sets with quite different characteristics were analyzed to examine empirically the performance of these methods. The first, called
Mixtures of Probabilistic Principal Component Analysers
, 1998
"... Principal component analysis (PCA) is one of the most popular techniques for processing, compressing and visualising data, although its effectiveness is limited by its global linearity. While nonlinear variants of PCA have been proposed, an alternative paradigm is to capture data complexity by a com ..."
Abstract

Cited by 537 (6 self)
 Add to MetaCart
maximumlikelihood framework, based on a specific form of Gaussian latent variable model. This leads to a welldefined mixture model for probabilistic principal component analysers, whose parameters can be determined using an EM algorithm. We discuss the advantages of this model in the context
Learning probabilistic relational models
 In IJCAI
, 1999
"... A large portion of realworld data is stored in commercial relational database systems. In contrast, most statistical learning methods work only with "flat " data representations. Thus, to apply these methods, we are forced to convert our data into a flat form, thereby losing much ..."
Abstract

Cited by 619 (31 self)
 Add to MetaCart
objects. Although PRMs are significantly more expressive than standard models, such as Bayesian networks, we show how to extend wellknown statistical methods for learning Bayesian networks to learn these models. We describe both parameter estimation and structure learning — the automatic induction
Bayesian Network Classifiers
, 1997
"... Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with stateoftheart classifiers such as C4.5. This fact raises the question of whether a classifier with less restr ..."
Abstract

Cited by 788 (23 self)
 Add to MetaCart
Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with stateoftheart classifiers such as C4.5. This fact raises the question of whether a classifier with less
Results 1  10
of
272,028