The Cache Performance and Optimizations of Blocked Algorithms
 In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems
, 1991
"... Blocking is a well-known optimization technique for improving the effectiveness of memory hierarchies. Instead of operating on entire rows or columns of an array, blocked algorithms operate on submatrices or blocks, so that data loaded into the faster levels of the memory hierarchy are reused. This ..."
Cited by 574 (5 self)
is highly sensitive to the stride of data accesses and the size of the blocks, and can cause wide variations in machine performance for different matrix sizes. The conventional wisdom of trying to use the entire cache, or even a fixed fraction of the cache, is incorrect. If a fixed block size is used for a
CURE: An Efficient Clustering Algorithm for Large Data sets
 Published in the Proceedings of the ACM SIGMOD Conference
, 1998
"... Clustering, in data mining, is useful for discovering groups and identifying interesting distributions in the underlying data. Traditional clustering algorithms either favor clusters with spherical shapes and similar sizes, or are very fragile in the presence of outliers. We propose a new clustering ..."
Cited by 722 (5 self)
clustering algorithm called CURE that is more robust to outliers, and identifies clusters having non-spherical shapes and wide variances in size. CURE achieves this by representing each cluster by a certain fixed number of points that are generated by selecting well scattered points from the cluster
Reevaluating Amdahl’s law
 Commun. ACM
, 1988
"... At Sandia National Laboratories, we are currently engaged in research involving massively parallel processing. There is considerable skepticism regarding the viability of massive parallelism; the skepticism centers around Amdahl’s law, an argument put forth by Gene Amdahl in 1967 [1] that even w ..."
Cited by 316 (4 self)
when the fraction of serial work in a given problem is small, say, s, the maximum speedup obtainable from even an infinite number of parallel processors is only 1/s. We now have timing results for a 1024-processor system that demonstrate that the assumptions underlying Amdahl’s 1967 argument
Further computations of the consequences of setting the annual krill catch limit to a fixed fraction of the estimate of krill biomass from a survey
 CCAMLR Science
, 1994
"... Butterworth et al. (1992) put forward an extension of the harvesting model of Beddington and Cooke (1983) to relate potential krill yield to a pre-exploitation survey estimate of krill biomass. In this paper, the approach is extended further so as to incorporate most of the amendments specified by ..."
Cited by 5 (0 self)
by the Third and Fourth Meetings of the Working Group on Krill (WG-Krill). The most important of these extensions is integration over the ranges of uncertainty for a number of model parameters. Results are provided for the probability of spawning biomass falling below various fractions of its median pre
APPROXIMATION ALGORITHMS FOR SCHEDULING UNRELATED PARALLEL MACHINES
, 1990
"... We consider the following scheduling problem. There are m parallel machines and n independent jobs. Each job is to be assigned to one of the machines. The processing of job j on machine i requires time p_ij. The objective is to find a schedule that minimizes the makespan. Our main result is a polynomi ..."
Cited by 265 (7 self)
polynomial algorithm which constructs a schedule that is guaranteed to be no longer than twice the optimum. We also present a polynomial approximation scheme for the case that the number of machines is fixed. Both approximation results are corollaries of a theorem about the relationship of a class of integer
Drowsy Caches: Simple Techniques for Reducing Leakage Power
 PROC. 29TH INT’L SYMP. COMPUTER ARCHITECTURE
, 2002
"... On-chip caches represent a sizable fraction of the total power consumption of microprocessors. Although large caches can significantly improve performance, they have the potential to increase power consumption. As feature sizes shrink, the dominant component of this power loss will be leakage. Howev ..."
Cited by 251 (1 self)
Truthful Mechanisms for One-Parameter Agents
"... In this paper, we show how to design truthful (dominant strategy) mechanisms for several combinatorial problems where each agent’s secret data is naturally expressed by a single positive real number. The goal of the mechanisms we consider is to allocate loads placed on the agents, and an agent’s sec ..."
Cited by 232 (3 self)
problems in combinatorial optimization to which the celebrated VCG mechanism does not apply. For scheduling related parallel machines (Q||C_max), we give a 3-approximation mechanism based on randomized rounding of the optimal fractional solution. This problem is NP-complete, and the standard approximation
Nekrasov, “Gravity duals of fractional branes and logarithmic RG flow,”
 Nucl. Phys. B
"... We study fractional branes in N = 2 orbifold and N = 1 conifold theories. Placing a large number N of regular D3-branes at the singularity produces the dual AdS_5 × X_5 geometry, and we describe the fractional branes as small perturbations to this background. For the orbifolds, X_5 = S^5/Γ and fract ..."
Cited by 138 (6 self)
and fractional D3-branes excite complex scalars from the twisted sector which are localized on the fixed circle of X_5. The resulting solutions are given by holomorphic functions and the field-theoretic beta-function is simply reproduced. For N regular and M fractional D3-branes at the conifold singularity we
Filterbank-based fingerprint matching
 IEEE TRANSACTIONS ON IMAGE PROCESSING
, 2000
"... With identity fraud in our society reaching unprecedented proportions and with an increasing emphasis on the emerging automatic personal identification applications, biometrics-based verification, especially fingerprint-based identification, is receiving a lot of attention. There are two major shor ..."
Cited by 219 (26 self)
shortcomings of the traditional approaches to fingerprint representation. For a considerable fraction of population, the representations based on explicit detection of complete ridge structures in the fingerprint are difficult to extract automatically. The widely used minutiae-based representation does
Adapting to unknown sparsity by controlling the false discovery rate
, 2000
"... We attempt to recover a high-dimensional vector observed in white noise, where the vector is known to be sparse, but the degree of sparsity is unknown. We consider three different ways of defining sparsity of a vector: using the fraction of nonzero terms; imposing power-law decay bounds on the order ..."
Cited by 183 (23 self)
