Rank Aggregation Methods for the Web
, 2001
Cited by 325 (5 self)
We consider the problem of combining ranking results from various sources. In the context of the Web, the main applications include building metasearch engines, combining ranking functions, selecting documents based on multiple criteria, and improving search precision through word associations. We develop a set of techniques for the rank aggregation problem and compare their performance to that of well-known methods. A primary goal of our work is to design rank aggregation techniques that can effectively combat "spam," a serious problem in Web searches. Experiments show that our methods are simple, efficient, and effective. Keywords: rank aggregation, ranking functions, metasearch, multiword queries, spam
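To make the rank aggregation problem concrete, here is a minimal sketch of Borda count, a classical positional method that is commonly used as a baseline. It is not taken from the paper and is not the authors' technique; the function name `borda_aggregate` and the scoring convention (an item in position `pos` earns `n - pos` points, unranked items earn 0) are assumptions made for illustration.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Combine several ranked lists into one list using Borda count.

    rankings: list of lists, each ordered from best to worst.
    Lists may be partial; items missing from a list get 0 points from it.
    """
    items = set().union(*rankings)
    n = len(items)
    scores = defaultdict(int)
    for ranking in rankings:
        for pos, item in enumerate(ranking):
            # Best position earns n points, the next n - 1, and so on.
            scores[item] += n - pos
    # Sort by descending score; break ties by item name for determinism.
    return sorted(items, key=lambda item: (-scores[item], item))


if __name__ == "__main__":
    engines = [
        ["a", "b", "c"],
        ["b", "a", "c"],
        ["b", "c", "a"],
    ]
    print(borda_aggregate(engines))
```

Positional scores like these are easy to compute, but a single source that promotes one item heavily can still shift the result, which is the kind of weakness spam-resistant aggregation aims to address.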
The Markov Chain Monte Carlo method: an approach to approximate counting and integration
, 1996
Cited by 234 (13 self)
In the area of statistical physics, Monte Carlo algorithms based on Markov chain simulation have been in use for many years. The validity of these algorithms depends crucially on the rate of convergence to equilibrium of the Markov chain being simulated. Unfortunately, the classical theory of stochastic processes hardly touches on the sort of non-asymptotic analysis required in this application. As a consequence, it had previously not been possible to make useful, mathematically rigorous statements about the quality of the estimates obtained. Within the last ten years, analytical tools have been devised with the aim of correcting this deficiency. As well as permitting the analysis of Monte Carlo algorithms for classical problems in statistical physics, the introduction of these tools has spurred the development of new approximation algorithms for a wider class of problems in combinatorial enumeration and optimization. The “Markov chain Monte Carlo” method has been applied to a variety of such problems, and often provides the only known efficient (i.e., polynomial time) solution technique.
Improved bounds for mixing rates of Markov chains and multicommodity flow
 Combinatorics, Probability and Computing
, 1992
Cited by 186 (8 self)
The paper is concerned with tools for the quantitative analysis of finite Markov chains whose states are combinatorial structures. Chains of this kind have algorithmic applications in many areas, including random sampling, approximate counting, statistical physics and combinatorial optimisation. The efficiency of the resulting algorithms depends crucially on the mixing rate of the chain, i.e., the time taken for it to reach its stationary or equilibrium distribution. The paper presents a new upper bound on the mixing rate, based on the solution to a multicommodity flow problem in the Markov chain viewed as a graph. The bound gives sharper estimates for the mixing rate of several important complex Markov chains. As a result, improved bounds are obtained for the runtimes of randomised approximation algorithms for various problems, including computing the permanent of a 0-1 matrix, counting matchings in graphs, and computing the partition function of a ferromagnetic Ising system. Moreove...
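The notion of mixing rate can be illustrated on a tiny chain by brute force. The sketch below measures, for every start state, the total variation distance between the t-step distribution and the stationary distribution, and reports the first t at which the worst case drops to a threshold. It does not implement the multicommodity flow bound from the paper; the lazy random walk on a cycle, the threshold `eps = 0.25`, and all function names are assumptions chosen for illustration.

```python
def step(dist, P):
    """Advance a row distribution one step: returns dist * P."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def tv_distance(p, q):
    """Total variation distance between two distributions on the same states."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def mixing_time(P, pi, eps=0.25, max_steps=10_000):
    """Smallest t with max over start states of TV(P^t(x, .), pi) <= eps."""
    n = len(P)
    # One point-mass starting distribution per state.
    dists = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    for t in range(max_steps + 1):
        if max(tv_distance(d, pi) for d in dists) <= eps:
            return t
        dists = [step(d, P) for d in dists]
    return None  # did not mix within max_steps


def lazy_cycle(n):
    """Lazy random walk on an n-cycle: stay w.p. 1/2, move left or right w.p. 1/4."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][i] = 0.5
        P[i][(i + 1) % n] += 0.25
        P[i][(i - 1) % n] += 0.25
    return P


if __name__ == "__main__":
    for n in (4, 8, 16):
        uniform = [1.0 / n] * n
        print(n, mixing_time(lazy_cycle(n), uniform))
```

Direct computation like this is only feasible when the state space is small enough to enumerate, which is why analytical bounds are needed for chains over combinatorial structures.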
Learning Decision Trees using the Fourier Spectrum
, 1991
Cited by 182 (10 self)
This work gives a polynomial time algorithm for learning decision trees with respect to the uniform distribution. (This algorithm uses membership queries.) The decision tree model that is considered is an extension of the traditional boolean decision tree model that allows linear operations in each node (i.e., summation of a subset of the input variables over GF(2)). This paper shows how to learn in polynomial time any function that can be approximated (in the L2 norm) by a polynomially sparse function (i.e., a function with only polynomially many nonzero Fourier coefficients). The authors demonstrate that any function f whose L1 norm (i.e., the sum of the absolute values of the Fourier coefficients) is polynomial can be approximated by a polynomially sparse function, and prove that boolean decision trees with linear operations are a subset of this class of functions. Moreover, it is shown that the functions with polynomial L1 norm can be learned deterministically. The algorithm can a...
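The quantities named in this abstract can be computed directly for very small functions. The sketch below enumerates all Fourier coefficients of a function f from {0,1}^n to {-1,+1} and sums their absolute values to get the L1 norm. It runs in time exponential in n and is not the polynomial-time learning algorithm from the paper; the ±1 output convention and the function names are assumptions made for illustration.

```python
from itertools import product


def fourier_coefficients(f, n):
    """Return {S: f_hat(S)} where S ranges over {0,1}^n (as tuples).

    f_hat(S) is the average over all x of f(x) * chi_S(x), with
    chi_S(x) = (-1)^(sum_i S_i * x_i), i.e. a parity over GF(2).
    """
    points = list(product((0, 1), repeat=n))
    coeffs = {}
    for s in points:
        total = 0
        for x in points:
            parity = sum(si * xi for si, xi in zip(s, x)) % 2
            total += f(x) * (-1 if parity else 1)
        coeffs[s] = total / len(points)
    return coeffs


def l1_norm(coeffs):
    """Sum of the absolute values of the Fourier coefficients."""
    return sum(abs(c) for c in coeffs.values())


if __name__ == "__main__":
    # A depth-2 decision tree computing AND, with -1 meaning "true".
    def and2(x):
        return -1 if (x[0] and x[1]) else 1

    coeffs = fourier_coefficients(and2, 2)
    print(coeffs)
    print("L1 norm:", l1_norm(coeffs))
```

A single parity has exactly one nonzero coefficient, while the two-variable AND spreads its weight over all four coefficients.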
Supporting top-k join queries in relational databases
 In VLDB
, 2003
Cited by 114 (13 self)
Ranking queries, also known as top-k queries, produce results that are ordered on some computed score. Typically, these queries involve joins, where users are usually interested only in the top-k join results. Top-k queries are dominant in many emerging applications, e.g., multimedia retrieval by content, Web databases, data mining, middleware, and most information retrieval applications. Current relational query processors do not handle ranking queries efficiently, especially when joins are involved. In this paper, we address supporting top-k join queries in relational query processors. We introduce a new rank-join algorithm that makes use of the individual orders of its inputs to produce join results ordered on a user-specified scoring function. The idea is to rank the join results progressively during the join operation. We introduce two physical query operators based on variants of ripple join that implement the rank-join algorithm. The operators are non-blocking and can be integrated into pipelined execution plans. We also propose an efficient heuristic designed to optimize a top-k join query by choosing the best join order. We address several practical issues and optimization heuristics to integrate the new join operators in practical query processors. We implement the new operators inside a prototype database engine based on PREDATOR. The experimental evaluation of our approach compares recent algorithms for joining ranked inputs and shows superior performance. Keywords: Ranking, Top-k queries, Rank aggregation, Query operators
General state space Markov chains and MCMC algorithms
 PROBABILITY SURVEYS
, 2004
Cited by 114 (27 self)
This paper surveys various results about Markov chains on general (non-countable) state spaces. It begins with an introduction to Markov chain Monte Carlo (MCMC) algorithms, which provide the motivation and context for the theory which follows. Then, sufficient conditions for geometric and uniform ergodicity are presented, along with quantitative bounds on the rate of convergence to stationarity. Many of these results are proved using direct coupling constructions based on minorisation and drift conditions. Necessary and sufficient conditions for Central Limit Theorems (CLTs) are also presented, in some cases proved via the Poisson Equation or direct regeneration constructions. Finally, optimal scaling and weak convergence results for Metropolis-Hastings algorithms are discussed. None of the results presented is new, though many of the proofs are. We also describe some Open Problems.
Metropolized Independent Sampling with Comparisons to Rejection Sampling and Importance Sampling
, 1996
Cited by 96 (3 self)
In this paper, a special Metropolis-Hastings type algorithm, Metropolized independent sampling, first proposed in Hastings (1970), is studied in full detail. The eigenvalues and eigenvectors of the corresponding Markov chain, as well as a sharp bound for the total variation distance between the nth updated distribution and the target distribution, are provided. Furthermore, the relationship between this scheme, rejection sampling, and importance sampling is studied with emphasis on their relative efficiencies. It is shown that Metropolized independent sampling is superior to rejection sampling in two aspects: asymptotic efficiency and ease of computation. Key Words: Coupling, Delta method, Eigen analysis, Importance ratio.
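As a rough illustration of the scheme, the sketch below builds the transition matrix of a Metropolized independence sampler on a small finite state space. A candidate y is drawn from a fixed proposal q regardless of the current state x, and accepted with probability min(1, w(y)/w(x)), where w = pi/q is the importance ratio. The finite state space, the function names, and the example distributions are assumptions made for illustration; the paper's eigenvalue analysis is not reproduced here.

```python
def mis_transition_matrix(pi, q):
    """Transition matrix of a Metropolized independence sampler.

    pi: target probabilities (all positive).
    q:  proposal probabilities (all positive), independent of the current state.
    """
    n = len(pi)
    w = [pi[i] / q[i] for i in range(n)]  # importance ratios
    K = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if y != x:
                # Propose y with probability q[y], accept with min(1, w[y] / w[x]).
                K[x][y] = q[y] * min(1.0, w[y] / w[x])
        # Remaining mass: proposing x itself, plus all rejected proposals.
        K[x][x] = 1.0 - sum(K[x])
    return K


def apply(dist, K):
    """One step of the chain applied to a row distribution: dist * K."""
    n = len(dist)
    return [sum(dist[i] * K[i][j] for i in range(n)) for j in range(n)]


if __name__ == "__main__":
    pi = [0.5, 0.3, 0.2]
    q = [1 / 3, 1 / 3, 1 / 3]
    K = mis_transition_matrix(pi, q)
    print(apply(pi, K))  # should reproduce pi up to rounding
```

Checking that pi is left unchanged by one step confirms that the acceptance rule makes the target distribution stationary.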
Shuffling cards and stopping times
 In Proceedings of the 43rd IEEE Conference on Decision and Control
, 1986
Cited by 93 (11 self)
1. Introduction. How many times must a deck of cards be shuffled until it is close to random? There is an elementary technique which often yields sharp estimates in such problems. The method is best understood through a simple example. EXAMPLE 1. Top in at random shuffle. Consider the following method of mixing a deck of cards: the top card is removed and inserted into the deck at a random position. This procedure is