Results 11 - 19 of 19
Latent Dirichlet conditional naive-Bayes models
In ICDM, 2007
Cited by 5 (3 self)
In spite of the popularity of probabilistic mixture models for latent structure discovery from data, mixture models do not have a natural mechanism for handling sparsity, where each data point has only a few nonzero observations. In this paper, we introduce conditional naive-Bayes (CNB) models, which generalize naive-Bayes mixture models to naturally handle sparsity by conditioning the model on observed features. Further, we present latent Dirichlet conditional naive-Bayes (LDCNB) models, which constitute a family of powerful hierarchical Bayesian models for latent structure discovery from sparse data. The proposed family of models is quite general and can work with arbitrary regular exponential family conditional distributions. We present a variational inference based EM algorithm for
Local sparsity control for Naive Bayes with extreme misclassification costs
Cited by 4 (2 self)
In applications of data mining characterized by highly skewed misclassification costs, certain types of errors become virtually unacceptable. This limits the utility of a classifier to a range in which such constraints can be met. Naive Bayes, which has proven to be very useful in text mining applications due to high scalability, can be particularly affected. Although its 0/1 loss tends to be small, its misclassifications are often made with apparently high confidence. Aside from efforts to better calibrate Naive Bayes scores, it has been shown that its accuracy depends on document sparsity and feature selection can lead to marked improvement in classification performance. Traditionally, sparsity is controlled globally, and the result for any particular document may vary. In this work we examine the merits of local sparsity control for Naive Bayes in the context of highly asymmetric misclassification costs. In experiments with three benchmark document collections we demonstrate clear advantages of document-level feature selection. In the extreme cost setting, multinomial Naive Bayes with local sparsity control is able to outperform even some of the recently proposed effective improvements to the Naive Bayes classifier. There are also indications that local feature selection may be preferable in different cost settings.
On Decision Boundaries of Naive Bayes In Continuous Domains
2003
Cited by 1 (1 self)
Naive Bayesian classifiers assume the conditional independence of attribute values given the class. Although this assumption is often violated in practice, these simple classifiers have been found efficient, effective, and robust to noise.
Data mining for hypertext: A tutorial survey
2000
With over 800 million pages covering most areas of human endeavor, the Worldwide Web is a fertile ground for data mining research to make a difference to the effectiveness of information search. Today, Web surfers access the Web through two dominant interfaces: clicking on hyperlinks and searching via keyword queries. This process is often tentative and unsatisfactory. Better support is needed for expressing one's information need and dealing with a search result in more structured ways than are available now. Data mining and machine learning have significant roles to play towards this end. In this paper we will survey recent advances in learning and mining problems related to hypertext in general and the Web in particular. We will review the continuum of supervised to semi-supervised to unsupervised learning problems, highlight the specific challenges which distinguish data mining in the hypertext domain from data mining in the context of data warehouses, and summarize the key areas of recent and ongoing research.
Online Detection of Rule Violations in Table Soccer
2008
In table soccer, humans cannot always thoroughly observe fast actions like rod spins and kicks. However, this is necessary in order to detect rule violations, for example in tournament play. We describe an automatic system using sensors on a regular soccer table to detect rule violations in real time. Naive Bayes is used for kick classification; its parameters are trained using supervised learning. In the online experiments, rule violations were detected at a higher rate than by the human players. The implementation proved its usefulness by being used by humans in real games and sets a basis for future research using probability models in table soccer.
A Probabilistic Similarity Framework for Content-Based Image Retrieval
2001
This is to certify that I have examined this copy of a doctoral dissertation by
unknown title
2003
"... A nonparametric model for transcription factor binding sites ..."
Comparison of Classification Algorithms with Wrapper-Based Feature Selection for Predicting Osteoporosis Outcome Based on Genetic Factors in a Taiwanese Women Population
Copyright © 2013 Hsueh-Wei Chang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. An essential task in a genomic analysis of a human disease is limiting the number of strongly associated genes when studying susceptibility to the disease. The goal of this study was to compare computational tools with and without feature selection for predicting osteoporosis outcome in Taiwanese women based on genetic factors such as single nucleotide polymorphisms (SNPs). To elucidate relationships between osteoporosis and SNPs in this population, three classification algorithms were applied: multilayer feedforward neural network (MFNN), naive Bayes, and logistic regression. A wrapper-based feature selection method was also used to identify a subset of major SNPs. Experimental results showed that the MFNN model with the wrapper-based approach was the best predictive model for inferring disease susceptibility based on the complex relationship between osteoporosis and SNPs in Taiwanese women. The findings suggest that patients and doctors can use the proposed tool to enhance decision making based on clinical factors such as SNP genotyping data.