Results 11-20 of 54
Cluster Analysis of Heterogeneous Rank Data
Cited by 26 (0 self)
This revision of the ICML 2007 proceedings article corrects an error in Sec. 3. Cluster analysis of ranking data, which occurs in consumer questionnaires, voting forms or other inquiries of preferences, attempts to identify typical groups of rank choices. Empirically measured rankings are often incomplete, i.e. different numbers of filled rank positions cause heterogeneity in the data. We propose a mixture approach for clustering of heterogeneous rank data. Rankings of different lengths can be described and compared by means of a single probabilistic model. A maximum entropy approach avoids hidden assumptions about missing rank positions. Parameter estimators and an efficient EM algorithm for unsupervised inference are derived for the ranking mixture model. Experiments on both synthetic data and real-world data demonstrate significantly improved parameter estimates on heterogeneous data when the incomplete rankings are included in the inference process.
Analysis of systematic scan Metropolis algorithms using Iwahori–Hecke algebra techniques
Michigan Math. J., 2000
Cited by 22 (7 self)
We give the first analysis of a systematic scan version of the Metropolis algorithm. Our examples include generating random elements of a Coxeter group with probability determined by the length function. The analysis is based on interpreting Metropolis walks in terms of the multiplication in the Iwahori–Hecke algebra.
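To make the idea of a systematic scan concrete, here is a minimal sketch for the symmetric group, where the Coxeter length of a permutation is its number of inversions. It assumes a target distribution proportional to theta ** length(w) with 0 <= theta <= 1; the function names and the left-to-right sweep order are illustrative choices, not taken from the paper.

```python
import random


def inversions(w):
    """Coxeter length of a permutation in one-line notation (inversion count)."""
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])


def systematic_scan_sweep(w, theta, rng):
    """One sweep over the adjacent transpositions in fixed order.

    Assumed target: pi(w) proportional to theta ** inversions(w).
    Swapping neighbours changes the inversion count by exactly +1 or -1,
    so the Metropolis acceptance probability is min(1, theta ** (+1 or -1)).
    """
    w = list(w)
    for i in range(len(w) - 1):
        if w[i] > w[i + 1]:
            # The swap removes one inversion: ratio 1/theta >= 1, always accept.
            w[i], w[i + 1] = w[i + 1], w[i]
        elif rng.random() < theta:
            # The swap adds one inversion: accept with probability theta.
            w[i], w[i + 1] = w[i + 1], w[i]
    return w


def sample(n, theta, sweeps, seed=0):
    """Run a number of sweeps from the identity (hypothetical helper)."""
    rng = random.Random(seed)
    w = list(range(n))
    for _ in range(sweeps):
        w = systematic_scan_sweep(w, theta, rng)
    return w
```

With theta = 0 every sweep is a bubble-sort pass, which is a quick sanity check that the acceptance rule is oriented correctly.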
Nonparametric modeling of partially ranked data
Journal of Machine Learning Research
Cited by 22 (2 self)
Statistical models on full and partial rankings of n items are often of limited practical use for large n due to computational considerations. We explore the use of nonparametric models for partially ranked data and derive computationally efficient procedures for their use for large n. The derivations are largely possible through combinatorial and algebraic manipulations based on the lattice of partial rankings. A bias-variance analysis and an experimental study demonstrate the applicability of the proposed method.
Similarity of Personal Preferences: Theoretical Foundations and Empirical Analysis
2003
Cited by 20 (0 self)
We study the problem of defining similarity measures on preferences from a decision-theoretic point of view. We propose a similarity measure, called probabilistic distance, that originates from Kendall's tau function, a well-known concept in the statistical literature. We compare this measure to other existing similarity measures on preferences. The key advantage of this measure is its extensibility to accommodate partial preferences and uncertainty. We develop efficient methods to compute this measure, exactly or approximately, under all circumstances. These methods make use of recent advances in the area of Markov chain Monte Carlo simulation. We discuss two applications of the probabilistic distance: in the construction of the Decision-Theoretic Video Advisor (DIVA), and in robustness analysis of a theory refinement technique for preference elicitation.
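For orientation, the classical Kendall tau distance that this measure originates from can be sketched as follows. This covers only complete rankings without ties; the extension to partial preferences and uncertainty described in the abstract is not reproduced here, and the function names are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_distance(rank_a, rank_b):
    """Count item pairs that the two complete rankings order differently.

    rank_a, rank_b: dicts mapping each item to its rank position
    (both must rank the same items, with no ties).
    """
    discordant = 0
    for x, y in combinations(list(rank_a), 2):
        # Opposite signs mean the pair is ordered one way in a, the other in b.
        if (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y]) < 0:
            discordant += 1
    return discordant


def normalized_kendall_tau_distance(rank_a, rank_b):
    """Scale the distance to [0, 1] by the number of item pairs."""
    n = len(rank_a)
    pairs = n * (n - 1) // 2
    return kendall_tau_distance(rank_a, rank_b) / pairs if pairs else 0.0
```

Two identical rankings are at distance 0, and a ranking and its reversal are at normalized distance 1.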
Fourier Theoretic Probabilistic Inference over Permutations
Journal of Machine Learning Research, 2009
Cited by 19 (8 self)
Permutations are ubiquitous in many real-world problems, such as voting, ranking, and data association. Representing uncertainty over permutations is challenging, since there are n! possibilities, and typical compact and factorized probability distribution representations, such as graphical models, cannot capture the mutual exclusivity constraints associated with permutations. In this paper, we use the “low-frequency” terms of a Fourier decomposition to represent distributions over permutations compactly. We present Kronecker conditioning, a novel approach for maintaining and updating these distributions directly in the Fourier domain, allowing for polynomial-time bandlimited approximations. Low-order Fourier-based approximations, however, may lead to functions that do not correspond to valid distributions. To address this problem, we present a quadratic program defined directly in the Fourier domain for projecting the approximation onto a relaxation of the polytope of legal marginal distributions. We demonstrate the effectiveness of our approach on a real camera-based multi-person tracking scenario.
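As a rough illustration of a compact, low-order summary of a distribution over permutations, the sketch below builds the matrix of first-order marginals P(sigma(j) = i) from an explicitly enumerated distribution. Treating this matrix as the lowest-frequency information is an assumption made here for illustration; the paper's Fourier-domain machinery and Kronecker conditioning are not reproduced, and the function name is hypothetical.

```python
def first_order_marginals(perms, weights=None):
    """Return M with M[i][j] = P(sigma(j) = i).

    perms: list of permutations of 0..n-1 in one-line notation.
    weights: optional probabilities summing to 1 (uniform if omitted).
    """
    n = len(perms[0])
    if weights is None:
        weights = [1.0 / len(perms)] * len(perms)
    marginals = [[0.0] * n for _ in range(n)]
    for sigma, weight in zip(perms, weights):
        for j, i in enumerate(sigma):
            marginals[i][j] += weight  # sigma sends j to i
    return marginals
```

Because each permutation sends every j to exactly one i and hits every i exactly once, all rows and columns of the result sum to 1. That is the mutual exclusivity constraint in matrix form, stored in n * n numbers rather than n! probabilities.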
Comparing partial rankings
SIAM Journal on Discrete Mathematics, 2004
Cited by 18 (1 self)
We provide a comprehensive picture of how to compare partial rankings, that is, rankings that allow ties. We propose several metrics to compare partial rankings and prove that they are within constant multiples of each other. Key words: partial ranking, bucket order, permutation, metric. AMS subject classifications: 06A06, 68R99. DOI: 10.1137/05063088X
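One common way to compare rankings with ties is a Kendall-style distance that charges a fixed penalty p when a pair is tied in one ranking but not in the other. The sketch below illustrates that construction; whether it matches the exact metrics defined in the paper is an assumption, and the function name and default p = 0.5 are illustrative.

```python
from itertools import combinations


def kendall_with_ties(bucket_a, bucket_b, p=0.5):
    """Kendall-style distance between two bucket orders.

    bucket_a, bucket_b: dicts mapping each item to its bucket index
    (items sharing an index are tied). Per pair of items:
      - ordered in both, opposite directions: penalty 1
      - tied in exactly one ranking:          penalty p
      - otherwise (agree, or tied in both):   penalty 0
    """
    total = 0.0
    for x, y in combinations(sorted(bucket_a), 2):
        diff_a = bucket_a[x] - bucket_a[y]
        diff_b = bucket_b[x] - bucket_b[y]
        if diff_a != 0 and diff_b != 0:
            if diff_a * diff_b < 0:
                total += 1.0
        elif (diff_a == 0) != (diff_b == 0):
            total += p
    return total
```

When neither ranking has ties, this reduces to the ordinary Kendall tau distance.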
The Markov Chain Monte Carlo Revolution
Cited by 18 (1 self)
The use of simulation for high-dimensional intractable computations has revolutionized applied mathematics. Designing, improving and understanding the new tools leads to (and leans on) fascinating mathematics, from representation theory through microlocal analysis.
Metrics on permutations, a survey
Journal of Combinatorics, Information and System Sciences, 1998
Cited by 17 (0 self)
This is a survey of distances on the symmetric groups S_n together with their applications in many contexts, for example statistics, coding theory, computing, and bell-ringing, which were originally seen as unrelated. This paper takes a first step in this direction in the hope that it will stimulate more research and eventually lead to a systematic study of the subject. Distances on S_n have been used in many papers in different contexts; for example, in statistics (see [Cr] and its references), coding theory (see [BCD] and its references), computing (see, for example, [Kn]), bell-ringing and so on. Here we attempt to give a brief bird's-eye view of distances on S_n according to the types of problems considered:
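As a small illustration of what distances on S_n look like in practice, the sketch below implements three standard examples: the Hamming distance, Spearman's footrule, and the Cayley distance. These are chosen here as familiar cases; the survey's own selection and notation may differ, and the function names are illustrative.

```python
def hamming(s, t):
    """Number of positions at which the two permutations disagree."""
    return sum(a != b for a, b in zip(s, t))


def footrule(s, t):
    """Spearman's footrule: total absolute displacement of the values."""
    return sum(abs(a - b) for a, b in zip(s, t))


def cayley(s, t):
    """Minimum number of transpositions turning s into t.

    Equals n minus the number of cycles of the permutation s^{-1} composed with t.
    Permutations are tuples over 0..n-1 in one-line notation.
    """
    n = len(s)
    inverse_s = [0] * n
    for position, value in enumerate(s):
        inverse_s[value] = position
    composed = [inverse_s[value] for value in t]
    seen = [False] * n
    cycles = 0
    for start in range(n):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = composed[j]
    return n - cycles
```

For example, a single swap of two entries gives Hamming distance 2 and Cayley distance 1, while a cyclic shift of four entries moves every position and needs three transpositions.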
Unsupervised Rank Aggregation with Distance-Based Models
Cited by 16 (6 self)
The need to meaningfully combine sets of rankings often comes up when one deals with ranked data. Although a number of heuristic and supervised learning approaches to rank aggregation exist, they require domain knowledge or supervised ranked data, both of which are expensive to acquire. In order to address these limitations, we propose a mathematical and algorithmic framework for learning to aggregate (partial) rankings without supervision. We instantiate the framework for the cases of combining permutations and combining top-k lists, and propose a novel metric for the latter. Experiments in both scenarios demonstrate the effectiveness of the proposed formalism.
Region proximity in metric spaces and its use for approximate similarity search
ACM Trans. Inf. Syst., 2003
Cited by 10 (1 self)
Similarity search structures for metric data typically bound object partitions by ball regions. Since regions can overlap, a relevant issue is to estimate the proximity of regions in order to predict the number of objects in the regions' intersection. This paper analyzes the problem using a probabilistic approach and provides a solution that effectively computes the proximity through realistic heuristics that only require small amounts of auxiliary data. An extensive simulation to validate the technique is provided. An application is developed to demonstrate how the proximity measure can be successfully applied to the approximate similarity search. Search speedup is achieved by ignoring data regions whose proximity to the query region is smaller than a user-defined threshold. This idea is implemented in a metric tree environment for the similarity range and “nearest neighbors” queries. Several measures of efficiency and effectiveness are applied to evaluate the proposed approximate search algorithms on real-life data sets. An analytical model is developed to relate proximity parameters and the quality of search. Improvements of two orders of magnitude are achieved for moderately approximated search results. We demonstrate that the precision of proximity measures can significantly influence the quality of approximated algorithms.