Results 1-10 of 48
Probability Estimates for Multiclass Classification by Pairwise Coupling
 Journal of Machine Learning Research
, 2003
Cited by 266 (1 self)
Pairwise coupling is a popular multiclass classification method that combines together all pairwise comparisons for each pair of classes. This paper presents two approaches for obtaining class probabilities. Both methods can be reduced to linear systems and are easy to implement.
A tutorial on MM algorithms
 Amer. Statist
, 2004
Cited by 122 (4 self)
Most problems in frequentist statistics involve optimization of a function such as a likelihood or a sum of squares. EM algorithms are among the most effective algorithms for maximum likelihood estimation because they consistently drive the likelihood uphill by maximizing a simple surrogate function for the log-likelihood. Iterative optimization of a surrogate function as exemplified by an EM algorithm does not necessarily require missing data. Indeed, every EM algorithm is a special case of the more general class of MM optimization algorithms, which typically exploit convexity rather than missing data in majorizing or minorizing an objective function. In our opinion, MM algorithms deserve to be part of the standard toolkit of professional statisticians. The current article explains the principle behind MM algorithms, suggests some methods for constructing them, and discusses some of their attractive features. We include numerous examples throughout the article to illustrate the concepts described. In addition to surveying previous work on MM algorithms, this article introduces some new material on constrained optimization and standard error estimation. Key words and phrases: constrained optimization, EM algorithm, majorization, minorization, Newton-Raphson
Active exploration for learning rankings from clickthrough data
 In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
, 2007
Cited by 71 (4 self)
We address the task of learning rankings of documents from search engine logs of user behavior. Previous work on this problem has relied on passively collected clickthrough data. In contrast, we show that an active exploration strategy can provide data that leads to much faster learning. Specifically, we develop a Bayesian approach for selecting rankings to present users so that interactions result in more informative training data. Our results using the TREC-10 Web corpus, as well as synthetic data, demonstrate that a directed exploration strategy quickly leads to users being presented improved rankings in an online learning setting. We find that active exploration substantially outperforms passive observation and random exploration.
Computing Elo Ratings of Move Patterns in the Game of Go
Cited by 64 (0 self)
Move patterns are an essential method to incorporate domain knowledge into Go-playing programs. This paper presents a new Bayesian technique for supervised learning of such patterns from game records, based on a generalization of Elo ratings. Each sample move in the training data is considered as a victory of a team of pattern features. Elo ratings of individual pattern features are computed from these victories, and can be used in previously unseen positions to compute a probability distribution over legal moves. In this approach, several pattern features may be combined, without an exponential cost in the number of features. Despite a very small number of training games (652), this algorithm outperforms most previous pattern-learning algorithms, both in terms of mean log-evidence (−2.69), and prediction rate (34.9%). A 19 × 19 Monte-Carlo program improved with these patterns reached the level of the strongest classical programs.
Bayesian inference for Plackett-Luce ranking models
Cited by 29 (0 self)
This paper gives an efficient Bayesian method for inferring the parameters of a Plackett-Luce ranking model. Such models are parameterised distributions over rankings of a finite set of objects, and have typically been studied and applied within the psychometric, sociometric and econometric literature. The inference scheme is an application of Power EP (expectation propagation). The scheme is robust and can be readily applied to large scale data sets. The inference algorithm extends to variations of the basic Plackett-Luce model, including partial rankings. We show a number of advantages of the EP approach over the traditional maximum likelihood method. We apply the method to aggregate rankings of NASCAR racing drivers over the 2002 season, and also to rankings of movie genres.
Random Utility Theory for Social Choice
Cited by 21 (10 self)
Random utility theory models an agent’s preferences on alternatives by drawing a real-valued score on each alternative (typically independently) from a parameterized distribution, and then ranking the alternatives according to scores. A special case that has received significant attention is the Plackett-Luce model, for which fast inference methods for maximum likelihood estimators are available. This paper develops conditions on general random utility models that enable fast inference within a Bayesian framework through MC-EM, providing concave log-likelihood functions and bounded sets of global maxima solutions. Results on both real-world and simulated data provide support for the scalability of the approach and capability for model selection among general random utility models including Plackett-Luce.
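As a rough illustration of the random utility view described in this abstract, the sketch below draws one noisy score per alternative and sorts by score. It assumes Gumbel noise added to log-weights, the standard construction under which the sampled rankings follow a Plackett-Luce distribution. The function names and the `weights` parameter are hypothetical and are not taken from the paper.

```python
import math
import random


def gumbel_noise(rng):
    """Draw one standard Gumbel variate by inverse transform sampling."""
    u = rng.random()
    while u == 0.0:  # random() can return exactly 0.0; avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def sample_ranking(weights, rng):
    """Sample a ranking (best first) from a random utility model.

    Assumption: utility_i = log(weights[i]) + Gumbel noise, which yields
    Plackett-Luce rankings with the positive weights as parameters.
    """
    utilities = [math.log(w) + gumbel_noise(rng) for w in weights]
    return sorted(range(len(weights)), key=lambda i: utilities[i], reverse=True)


rng = random.Random(0)
ranking = sample_ranking([5.0, 2.0, 1.0], rng)  # a permutation of 0, 1, 2
```

Swapping the Gumbel draw for another noise distribution gives a different random utility model while leaving the ranking step unchanged.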
Computing Maximum Likelihood Estimates in log-linear models
, 2006
Cited by 19 (3 self)
We develop computational strategies for extended maximum likelihood estimation, as defined in Rinaldo (2006), for general classes of log-linear models of widespread use, under Poisson and product-multinomial sampling schemes. We derive numerically efficient procedures for generating and manipulating design matrices and we propose various algorithms for computing the extended maximum likelihood estimates of the expectations of the cell counts. These algorithms allow one to identify the set of estimable cell means for any given observable table and can be used for modifying traditional goodness-of-fit tests to accommodate for a nonexistent MLE. We describe and take advantage of the connections between extended maximum likelihood
Efficient Bayesian Inference for Generalized Bradley-Terry Models
 Journal of Computational and Graphical Statistics
, 2012
Cited by 15 (5 self)
The Bradley-Terry model is a popular approach to describe probabilities of the possible outcomes when elements of a set are repeatedly compared with one another in pairs. It has found many applications including animal behaviour, chess ranking and multiclass classification. Numerous extensions of the basic model have also been proposed in the literature including models with ties, multiple comparisons, group comparisons and random graphs. From a computational point of view, Hunter (2004) has proposed efficient iterative MM (minorization-maximization) algorithms to perform maximum likelihood estimation for these generalized Bradley-Terry models whereas Bayesian inference is typically performed using MCMC (Markov chain Monte Carlo) algorithms based on tailored Metropolis-Hastings (MH) proposals. We show here that these MM algorithms can be reinterpreted as special instances of Expectation-Maximization (EM) algorithms associated to suitable sets of latent variables and propose some original extensions. These latent variables allow us to derive simple Gibbs samplers for Bayesian inference. We demonstrate experimentally the efficiency of these algorithms on a variety of applications.
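For orientation, the MM iteration mentioned in this abstract can be sketched for the basic Bradley-Terry model only (no ties, no group comparisons). This is a minimal sketch, not the EM or Gibbs sampler the paper proposes. It assumes a hypothetical `wins` matrix in which `wins[i][j]` counts how often item i beat item j, with a zero diagonal, and it assumes the comparisons are connected enough for the maximum likelihood estimate to exist.

```python
def bradley_terry_mm(wins, iters=200):
    """Estimate Bradley-Terry strengths with the classical MM update.

    wins[i][j] is the number of times item i beat item j (hypothetical
    input layout). Returns strengths normalised to sum to 1.
    """
    n = len(wins)
    p = [1.0 / n] * n
    for _ in range(iters):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i])
            # Sum over opponents of (games played against j) / (p_i + p_j).
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            new_p.append(total_wins / denom)
        scale = sum(new_p)
        p = [x / scale for x in new_p]  # fix the scale, which is not identified
    return p


# Two items, item 0 won 3 of 4 games: the estimate is its empirical win rate.
strengths = bradley_terry_mm([[0, 3], [1, 0]])
```

The normalisation step is a design choice: the likelihood is unchanged when all strengths are multiplied by a constant, so some constraint is needed to pin the scale.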
Supplement to “A mixture of experts model for rank data with applications in election studies”
, 2008
Cited by 14 (3 self)
A voting bloc is defined to be a group of voters who have similar voting preferences. The cleavage of the Irish electorate into voting blocs is of interest. Irish elections employ a “single transferable vote” electoral system; under this system voters rank some or all of the electoral candidates in order of preference. These rank votes provide a rich source of preference information from which inferences about the composition of the electorate may be drawn. Additionally, the influence of social factors or covariates on the electorate composition is of interest. A mixture of experts model is a mixture model in which the model parameters are functions of covariates. A mixture of experts model for rank data is developed to provide a model-based method to cluster Irish voters into voting blocs, to examine the influence of social factors on this clustering and to examine the characteristic preferences of the voting blocs. The Benter model for rank data is employed as the
Generalized Method-of-Moments for Rank Aggregation
Cited by 12 (3 self)
In this paper we propose a class of efficient Generalized Method-of-Moments (GMM) algorithms for computing parameters of the Plackett-Luce model, where the data consists of full rankings over alternatives. Our technique is based on breaking the full rankings into pairwise comparisons, and then computing parameters that satisfy a set of generalized moment conditions. We identify conditions for the output of GMM to be unique, and identify a general class of consistent and inconsistent breakings. We then show by theory and experiments that our algorithms run significantly faster than the classical Minorize-Maximization (MM) algorithm, while achieving competitive statistical efficiency.
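The first step described in this abstract, breaking full rankings into pairwise comparisons, might look roughly like the sketch below. It assumes the simplest choice, in which every pair implied by a ranking is extracted; the abstract does not say which breakings the paper recommends, and the moment-condition step is not shown. The function name and input layout are hypothetical.

```python
def full_breaking(rankings, num_items):
    """Count pairwise outcomes implied by full rankings.

    rankings: list of rankings, each a list of item indices, best first.
    Returns a matrix where wins[a][b] is the number of rankings that
    place item a ahead of item b.
    """
    wins = [[0] * num_items for _ in range(num_items)]
    for ranking in rankings:
        for pos, winner in enumerate(ranking):
            # Every item ranked below `winner` counts as a pairwise loss.
            for loser in ranking[pos + 1:]:
                wins[winner][loser] += 1
    return wins


counts = full_breaking([[0, 1, 2], [1, 0, 2]], num_items=3)
# counts == [[0, 1, 2], [1, 0, 2], [0, 0, 0]]
```

Other breakings would keep only a subset of these pairs, for example those involving the top-ranked item, by restricting the inner loops.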