Results 11 - 20
of
318
Dynamic Bayesian Multinets
, 2000
"... In this work, dynamic Bayesian multinets are introduced where a Markov chain state at time t determines conditional independence patterns between random variables lying within a local time window surrounding t. It is shown how information-theoretic criterion functions can be used to induce spa ..."
Abstract
-
Cited by 54 (14 self)
- Add to MetaCart
In this work, dynamic Bayesian multinets are introduced where a Markov chain state at time t determines conditional independence patterns between random variables lying within a local time window surrounding t. It is shown how information-theoretic criterion functions can be used to induce sparse, discriminative, and classconditional network structures that yield an optimal approximation to the class posterior probability, and therefore are useful for the classification task. Using a new structure learning heuristic, the resulting models are tested on a medium-vocabulary isolated-word speech recognition task. It is demonstrated that these discriminatively structured dynamic Bayesian multinets, when trained in a maximum likelihood setting using EM, can outperform both HMMs and other dynamic Bayesian networks with a similar number of parameters. 1 Introduction While Markov chains are sometimes a useful model for sequences, such simple independence assumptions can lead...
Modeling Dependencies in Protein-DNA Binding Sites
, 2003
"... The availability of whole genome sequences and high-throughput genomic assays opens the door for in silico analysis of transcription regulation. This includes methods for discovering and characterizing the binding sites of DNA-binding proteins, such as transcription factors. A common representation ..."
Abstract
-
Cited by 54 (1 self)
- Add to MetaCart
The availability of whole genome sequences and high-throughput genomic assays opens the door for in silico analysis of transcription regulation. This includes methods for discovering and characterizing the binding sites of DNA-binding proteins, such as transcription factors. A common representation of transcription factor binding sites is aposition specific score matrix (PSSM). This representation makes the strong assumption that binding site positions are independent of each other. In this work, we explore Bayesian network representations of binding sites that provide different tradeoffs between complexity (number of parameters) and the richness of dependencies between positions. We develop the formal machinery for learning such models from data and for estimating the statistical significance of putative binding sites. We then evaluate the ramifications of these richer representations in characterizing binding site motifs and predicting their genomic locations. We show that these richer representations improve over the PSSM model in both tasks.
Semi-supervised Learning of Classifiers: Theory, Algorithms and Their Application to Human-Computer Interaction
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2004
"... Automatic classification is one of the basic tasks required in any pattern recognition and human computer interaction application. In this paper we discuss training probabilistic classifiers with labeled and unlabeled data. We provide a new analysis that shows under what conditions unlabeled data ..."
Abstract
-
Cited by 47 (14 self)
- Add to MetaCart
Automatic classification is one of the basic tasks required in any pattern recognition and human computer interaction application. In this paper we discuss training probabilistic classifiers with labeled and unlabeled data. We provide a new analysis that shows under what conditions unlabeled data can be used in learning to improve classification performance. We also show that if the conditions are violated, using unlabeled data can be detrimental to classification performance. We discuss the implications of this analysis to a specific type of probabilistic classifiers, Bayesian networks, and propose a new structure learning algorithm that can utilize unlabeled data to improve classification. Finally, we show how the resulting algorithms are successfully employed in two applications related to human-computer interaction and pattern recognition; facial expression recognition and face detection.
Learning Bayesian Belief Network Classifiers: Algorithms and System
- Proceedings of 14 th Biennial conference of the
, 2001
"... This paper investigates the methods for learning predictive classifiers based on Bayesian belief networks (BN) -- primarily unrestricted Bayesian networks and Bayesian multinets. We present our algorithms for learning these classifiers, and discuss how these methods address the overfitting proble ..."
Abstract
-
Cited by 45 (3 self)
- Add to MetaCart
This paper investigates the methods for learning predictive classifiers based on Bayesian belief networks (BN) -- primarily unrestricted Bayesian networks and Bayesian multinets. We present our algorithms for learning these classifiers, and discuss how these methods address the overfitting problem and provide a natural method for feature subset selection. Using a set of standard classification problems, we empirically evaluate the performance of various BN-based classifiers. The results show that the proposed BN and Bayes multi-net classifiers are competitive with (or superior to) the best known classifiers, based on both BN and other formalisms; and that the computational time for learning and using these classifiers is relatively small. These results argue that BN based classifiers deserve more attention in the data mining community. 1 In t roduct i on Many tasks -- including fault diagnosis, pattern recognition and forecasting -- can be viewed as classification, as each r...
Not so naive Bayes: Aggregating one-dependence estimators
- Machine Learning
, 2005
"... Of numerous proposals to improve the accuracy of naive Bayes by weakening its attribute independence assumption, both LBR and super-parent TAN have demonstrated remarkable error performance. However, both techniques obtain this outcome at a considerable computational cost. We present a new approach ..."
Abstract
-
Cited by 44 (8 self)
- Add to MetaCart
Of numerous proposals to improve the accuracy of naive Bayes by weakening its attribute independence assumption, both LBR and super-parent TAN have demonstrated remarkable error performance. However, both techniques obtain this outcome at a considerable computational cost. We present a new approach to weakening the attribute independence assumption by averaging all of a constrained class of classifiers. In extensive experiments this technique delivers comparable prediction accuracy to LBR and super-parent TAN with substantially improved computational e#ciency at test time relative to the former and at training time relative to the latter. The new algorithm is shown to have low variance and is suited to incremental learning.
Learning Bayesian network classifiers by maximizing conditional likelihood
- In ICML2004
, 2004
"... Bayesian networks are a powerful probabilistic representation, and their use for classification has received considerable attention. However, they tend to perform poorly when learned in the standard way. This is attributable to a mismatch between the objective function used (likelihood or a function ..."
Abstract
-
Cited by 44 (0 self)
- Add to MetaCart
Bayesian networks are a powerful probabilistic representation, and their use for classification has received considerable attention. However, they tend to perform poorly when learned in the standard way. This is attributable to a mismatch between the objective function used (likelihood or a function thereof) and the goal of classification (maximizing accuracy or conditional likelihood). Unfortunately, the computational cost of optimizing structure and parameters for conditional likelihood is prohibitive. In this paper we show that a simple approximation— choosing structures by maximizing conditional likelihood while setting parameters by maximum likelihood—yields good results. On a large suite of benchmark datasets, this approach produces better class probability estimates than naive Bayes, TAN, and generatively-trained Bayesian networks. 1.
Recognizing Planned, Multiperson Action
- Computer Vision and Image Understanding
, 2001
"... This paper demonstrates how highly structured, multiperson action can be recognized from noisy perceptual data using visually grounded goal-based primitives and low-order temporal relationships that are integrated in a probabilistic framework. The representation, which is motivated by work in mo ..."
Abstract
-
Cited by 41 (2 self)
- Add to MetaCart
This paper demonstrates how highly structured, multiperson action can be recognized from noisy perceptual data using visually grounded goal-based primitives and low-order temporal relationships that are integrated in a probabilistic framework. The representation, which is motivated by work in model-based object recognition and probabilistic plan recognition, makes four principal assumptions: (1) the goals of individual agents are natural atomic representational units for specifying the temporal relationships between agents engaged in group activities, (2) a high-level description of temporal structure of the action using a small set of low-order temporal and logical constraints is adequate for representing the relationships between the agent goals for highly structured, multiagent action recognition, (3) Bayesian networks provide a suitable mechanism for integrating multiple sources of uncertain visual perceptual feature evidence, and (4) an automatically generated Bayesian
Context-Specific Bayesian Clustering for Gene Expression Data
, 2002
"... The recent growth in genomic data and measurements of genome-wide expression patterns allows us to apply computational tools to examine gene regulation by transcription factors. ..."
Abstract
-
Cited by 41 (5 self)
- Add to MetaCart
The recent growth in genomic data and measurements of genome-wide expression patterns allows us to apply computational tools to examine gene regulation by transcription factors.
Augmenting Naive Bayes Classifiers with Statistical Language Models
, 2003
"... We augment naive Bayes models with statistical n-gram language models to address shortcomings of the standard naive Bayes text classifier. The result is a generalized naive Bayes classifier ..."
Abstract
-
Cited by 38 (0 self)
- Add to MetaCart
We augment naive Bayes models with statistical n-gram language models to address shortcomings of the standard naive Bayes text classifier. The result is a generalized naive Bayes classifier
Structural extension to logistic regression: Discriminative parameter learning of belief net classifiers
- In Proceedings of the Eighteenth Annual National Conference on Artificial Intelligence (AAAI-02
, 2002
"... Abstract. Bayesian belief nets (BNs) are often used for classification tasks — typically to return the most likely class label for each specified instance. Many BN-learners, however, attempt to find the BN that maximizes a different objective function — viz., likelihood, rather than classification a ..."
Abstract
-
Cited by 36 (4 self)
- Add to MetaCart
Abstract. Bayesian belief nets (BNs) are often used for classification tasks — typically to return the most likely class label for each specified instance. Many BN-learners, however, attempt to find the BN that maximizes a different objective function — viz., likelihood, rather than classification accuracy — typically by first learning an appropriate graphical structure, then finding the parameters for that structure that maximize the likelihood of the data. As these parameters may not maximize the classification accuracy, “discriminative parameter learners ” follow the alternative approach of seeking the parameters that maximize conditional likelihood (CL), over the distribution of instances the BN will have to classify. This paper first formally specifies this task, shows how it extends standard logistic regression, and analyzes its inherent sample and computational complexity. We then present a general algorithm for this task, ELR, that applies to arbitrary BN structures and that works effectively even when given incomplete training data. Unfortunately, ELR is not guaranteed to find the parameters that optimize conditional likelihood; moreover, even the optimal-CL parameters need not have minimal classification error. This paper therefore presents empirical evidence that ELR produces effective classifiers, often superior to the ones produced by the standard “generative” algorithms, especially in common situations where the given BN-structure is incorrect. Keywords: (Bayesian) belief nets, Logistic regression, Classification, PAC-learning, Computational/sample complexity

