Analysis of systematic scan Metropolis algorithms using Iwahori–Hecke algebra techniques
Michigan Math. J., 2000
Cited by 13 (3 self)
Abstract. We give the first analysis of a systematic scan version of the Metropolis algorithm. Our examples include generating random elements of a Coxeter group with probability determined by the length function. The analysis is based on interpreting Metropolis walks in terms of the multiplication in the Iwahori–Hecke algebra.
Metrics on permutations, a survey
Journal of Combinatorics, Information and System Sciences, 1998
Cited by 13 (0 self)
Abstract: This is a survey of distances on the symmetric groups Sn and their applications in many contexts, for example statistics, coding theory, computing, and bell-ringing, which were originally seen as unrelated. This paper takes a first step of research in this direction, in the hope that it will stimulate more research and eventually lead to a systematic study of the subject. Distances on Sn have been used in many papers in different contexts, for example in statistics (see [Cr] and its references), coding theory (see [BCD] and its references), computing (see, for example, [Kn]), and bell-ringing. Here we attempt to give a brief bird's-eye view of distances on Sn according to the types of problems considered:
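To make the notion of a distance on Sn concrete, the following minimal Python sketch computes three commonly used permutation distances: Kendall tau, Spearman footrule, and Hamming. The function names and the representation of a permutation as a list of items in rank order are assumptions made for illustration; they are not taken from the survey.

```python
from itertools import combinations


def positions(ranking):
    """Map each item to its 0-based position in the ranking."""
    return {item: i for i, item in enumerate(ranking)}


def kendall_tau_distance(sigma, tau):
    """Count the item pairs that the two rankings order differently."""
    pos_t = positions(tau)
    # combinations() yields (a, b) with a ranked before b in sigma,
    # so the pair is discordant exactly when tau ranks b before a.
    return sum(1 for a, b in combinations(sigma, 2) if pos_t[a] > pos_t[b])


def spearman_footrule(sigma, tau):
    """Sum over items of the absolute difference in position."""
    pos_s, pos_t = positions(sigma), positions(tau)
    return sum(abs(pos_s[x] - pos_t[x]) for x in pos_s)


def hamming_distance(sigma, tau):
    """Count the positions at which the two rankings hold different items."""
    return sum(1 for a, b in zip(sigma, tau) if a != b)


if __name__ == "__main__":
    sigma = [1, 2, 3, 4]
    tau = [2, 1, 4, 3]
    print(kendall_tau_distance(sigma, tau))  # 2
    print(spearman_footrule(sigma, tau))     # 4
    print(hamming_distance(sigma, tau))      # 4
```

The pairwise loop in `kendall_tau_distance` takes quadratic time in n, which keeps the sketch short; a merge-sort-based inversion count would be the usual choice for long rankings.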
Non-parametric modeling of partially ranked data
Journal of Machine Learning Research
Cited by 12 (2 self)
Statistical models on full and partial rankings of n items are often of limited practical use for large n due to computational considerations. We explore the use of non-parametric models for partially ranked data and derive computationally efficient procedures for their use for large n. The derivations are largely possible through combinatorial and algebraic manipulations based on the lattice of partial rankings. A bias-variance analysis and an experimental study demonstrate the applicability of the proposed method.
Cluster Analysis of Heterogeneous Rank Data
Cited by 11 (0 self)
This revision of the ICML 2007 proceedings article corrects an error in Sec. 3. Cluster analysis of ranking data, which occurs in consumer questionnaires, voting forms or other inquiries of preferences, attempts to identify typical groups of rank choices. Empirically measured rankings are often incomplete, i.e. different numbers of filled rank positions cause heterogeneity in the data. We propose a mixture approach for clustering of heterogeneous rank data. Rankings of different lengths can be described and compared by means of a single probabilistic model. A maximum entropy approach avoids hidden assumptions about missing rank positions. Parameter estimators and an efficient EM algorithm for unsupervised inference are derived for the ranking mixture model. Experiments on both synthetic data and real-world data demonstrate significantly improved parameter estimates on heterogeneous data when the incomplete rankings are included in the inference process.
Data Clustering: 50 Years Beyond K-Means
2008
Cited by 11 (0 self)
Organizing data into sensible groupings is one of the most fundamental modes of understanding and learning. As an example, a common scheme of scientific classification puts organisms into taxonomic ranks: domain, kingdom, phylum, class, etc. Cluster analysis is the formal study of algorithms and methods for grouping, or clustering, objects according to measured or perceived intrinsic characteristics or similarity. Cluster analysis does not use category labels that tag objects with prior identifiers, i.e., class labels. The absence of category information distinguishes data clustering (unsupervised learning) from classification or discriminant analysis (supervised learning). The aim of clustering is exploratory: to find structure in data. Clustering has a long and rich history in a variety of scientific fields. One of the most popular and simple clustering algorithms, K-means, was first published in 1955. In spite of the fact that K-means was proposed over 50 years ago and thousands of clustering algorithms have been published since then, K-means is still widely used. This speaks to the difficulty of designing a general-purpose clustering algorithm and to the ill-posed nature of the clustering problem. We provide a brief overview of clustering, summarize well-known clustering methods, discuss the major challenges and key issues in designing clustering algorithms, and point out some of the emerging and useful research directions, including semi-supervised clustering, ensemble clustering, simultaneous feature selection during data clustering, and large-scale data clustering.
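As a concrete reference point for the K-means algorithm mentioned in this abstract, the following is a minimal sketch of the standard assign-then-update iteration in plain Python. The function names, the tuple-based point representation, and the caller-supplied initial centers are assumptions made for illustration; the survey itself covers many variants and initialization schemes.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centers, max_iters=100):
    """Alternate assignment and mean-update steps until the centers stop moving.

    points: list of tuples of numbers.
    initial_centers: list of k tuples (hypothetical interface; supplied by
    the caller so that the run is deterministic).
    Returns (centers, clusters), where clusters[j] lists the points
    assigned to centers[j].
    """
    centers = [tuple(c) for c in initial_centers]
    k = len(centers)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centers[j]
            for j, cluster in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, clusters = kmeans(data, [(0, 0), (10, 10)])
    print(centers)
    print(clusters)
```

The result depends on the initial centers, which is one reason the sketch takes them as an explicit argument rather than choosing them at random.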
Comparing partial rankings
SIAM Journal on Discrete Mathematics, 2004
Cited by 10 (0 self)
Abstract. We provide a comprehensive picture of how to compare partial rankings, that is, rankings that allow ties. We propose several metrics to compare partial rankings and prove that they are within constant multiples of each other. Key words: partial ranking, bucket order, permutation, metric. AMS subject classifications: 06A06, 68R99. DOI: 10.1137/05063088X
THE MARKOV CHAIN MONTE CARLO REVOLUTION
Cited by 10 (0 self)
Abstract. The use of simulation for high-dimensional intractable computations has revolutionized applied mathematics. Designing, improving and understanding the new tools leads to (and leans on) fascinating mathematics, from representation theory through micro-local analysis.
Region proximity in metric spaces and its use for approximate similarity search
ACM Trans. Inf. Syst., 2003
Cited by 9 (1 self)
Similarity search structures for metric data typically bound object partitions by ball regions. Since regions can overlap, a relevant issue is to estimate the proximity of regions in order to predict the number of objects in the regions' intersection. This paper analyzes the problem using a probabilistic approach and provides a solution that effectively computes the proximity through realistic heuristics that only require small amounts of auxiliary data. An extensive simulation to validate the technique is provided. An application is developed to demonstrate how the proximity measure can be successfully applied to approximate similarity search. Search speedup is achieved by ignoring data regions whose proximity to the query region is smaller than a user-defined threshold. This idea is implemented in a metric tree environment for the similarity range and “nearest neighbors” queries. Several measures of efficiency and effectiveness are applied to evaluate the proposed approximate search algorithms on real-life data sets. An analytical model is developed to relate proximity parameters and the quality of search. Improvements of two orders of magnitude are achieved for moderately approximated search results. We demonstrate that the precision of proximity measures can significantly influence the quality of approximated algorithms.
Unsupervised Rank Aggregation with Distance-Based Models
Cited by 9 (5 self)
The need to meaningfully combine sets of rankings often comes up when one deals with ranked data. Although a number of heuristic and supervised learning approaches to rank aggregation exist, they require domain knowledge or supervised ranked data, both of which are expensive to acquire. In order to address these limitations, we propose a mathematical and algorithmic framework for learning to aggregate (partial) rankings without supervision. We instantiate the framework for the cases of combining permutations and combining top-k lists, and propose a novel metric for the latter. Experiments in both scenarios demonstrate the effectiveness of the proposed formalism.
On Rankings Generated By Pairwise Linear Discriminant Analysis of Populations
1995
Cited by 7 (5 self)
this paper in which the corresponding population is given the rank 1. Furthermore, the Voronoi diagram is generalized in a variety of ways. One generalization which is closely related to our theory is the (ordered) order-k Voronoi diagram (Okabe, Boots, and Sugihara [12]). Our regions in pairwise linear discriminant analysis of m populations are the interiors of the "ordered order-m Voronoi polyhedrons." For a comprehensive treatment of the Voronoi diagram, the reader is referred to Okabe, Boots, and Sugihara [12].

