High-Performance Algorithm Engineering for Computational Phylogenetics
 J. Supercomputing
, 2002
Cited by 24 (7 self)
A phylogeny is the evolutionary history of a group of organisms; systematists (and other biologists) attempt to reconstruct this history from various forms of data about contemporary organisms. Phylogeny reconstruction is a crucial step in the understanding of evolution as well as an important tool in biological, pharmaceutical, and medical research. Phylogeny reconstruction from molecular data is very difficult: almost all optimization models give rise to NP-hard (and thus computationally intractable) problems. Yet approximations must be of very high quality in order to avoid outright biological nonsense. Thus many biologists have been willing to run farms of processors for many months in order to analyze just one dataset. High-performance algorithm engineering offers a battery of tools that can reduce, sometimes spectacularly, the running time of existing phylogenetic algorithms, as well as help designers produce better algorithms. We present an overview of algorithm engineering techniques, illustrating them with an application to the "breakpoint analysis" method of Sankoff et al., which resulted in the GRAPPA software suite. GRAPPA demonstrated a speedup in running time by over eight orders of magnitude over the original implementation on a variety of real and simulated datasets. We show how these algorithmic engineering techniques are directly applicable to a large variety of challenging combinatorial problems in computational biology.
Semi-matchings for bipartite graphs and load balancing
 In Proc. 8th WADS
, 2003
Cited by 14 (0 self)
We consider the problem of fairly matching the left-hand vertices of a bipartite graph to the right-hand vertices. We refer to this problem as the optimal semi-matching problem; it is a relaxation of the known bipartite matching problem. We present a way to evaluate the quality of a given semi-matching and show that, under this measure, an optimal semi-matching balances the load on the right-hand vertices with respect to any L_p-norm. In particular, when modeling a job assignment system, an optimal semi-matching achieves the minimal makespan and the minimal flow time for the system. The problem of finding optimal semi-matchings is a special case of certain scheduling problems for which known solutions exist. However, these known solutions are based on general network optimization algorithms, and are not the most efficient way to solve the optimal semi-matching problem. To compute optimal semi-matchings efficiently, we present and analyze two new algorithms. The first algorithm generalizes the Hungarian method for computing maximum bipartite matchings, while the second, more efficient algorithm is based on a new notion of cost-reducing paths. Our experimental results demonstrate that the second algorithm is vastly superior to using known network optimization algorithms to solve the optimal semi-matching problem. Furthermore, this same algorithm can also be used to find maximum bipartite matchings and is shown to be roughly as efficient as the best known algorithms for this goal. Key words: bipartite graphs, load balancing, matching algorithms, optimal algorithms, semi-matching
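The quality measures named in this abstract (makespan, flow time, L_p-norm of the loads) can be illustrated with a minimal Python sketch. It assumes unit-length jobs, so a machine with load l contributes l(l+1)/2 to the total flow time. The function name `semi_matching_stats` and the sample job and machine names are hypothetical, and the code only evaluates a given semi-matching; it is not one of the paper's two algorithms.

```python
from collections import Counter


def semi_matching_stats(assignment, p=2):
    """Evaluate a semi-matching given as {left vertex (job): right vertex (machine)}.

    Assumes unit-length jobs. Returns (makespan, total flow time, L_p-norm of loads).
    """
    # Load of a right-hand vertex = number of left-hand vertices assigned to it.
    loads = Counter(assignment.values())
    makespan = max(loads.values(), default=0)
    # With unit jobs, a machine with load l finishes its jobs at times 1..l.
    flow_time = sum(l * (l + 1) // 2 for l in loads.values())
    lp_norm = sum(l ** p for l in loads.values()) ** (1.0 / p)
    return makespan, flow_time, lp_norm


# Two semi-matchings of four jobs onto two machines (hypothetical data).
balanced = {"j1": "m1", "j2": "m2", "j3": "m1", "j4": "m2"}
skewed = {"j1": "m1", "j2": "m1", "j3": "m1", "j4": "m2"}

print(semi_matching_stats(balanced))
print(semi_matching_stats(skewed))
```

On this toy input the balanced assignment is no worse than the skewed one on all three measures, which is the behavior the abstract attributes to optimal semi-matchings.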
Graph and Hashing Algorithms for Modern Architectures: Design and Performance
 PROC. 2ND WORKSHOP ON ALGORITHM ENG. WAE 98, MAX-PLANCK INST. FÜR INFORMATIK, 1998, IN TR MPI-I-98-1-019
, 1998
Cited by 12 (0 self)
We study the effects of caches on basic graph and hashing algorithms and show how cache effects influence the best solutions to these problems. We study the performance of basic data structures for storing lists of values and use these results to design and evaluate algorithms for hashing, Breadth-First-Search (BFS) and Depth-First-Search (DFS). For the basic
A cache-aware parallel implementation of the push-relabel network flow algorithm and experimental evaluation of the gap relabeling heuristic
 In Proc. 18th intl. conf. on Parallel and Distributed Computing Systems
, 2005
Cited by 12 (0 self)
The maximum flow problem is a combinatorial problem of significant importance in a wide variety of research and commercial applications. It has been extensively studied and implemented over the past 40 years. The push-relabel method has been shown to be superior to other methods, both in theoretical bounds and in experimental implementations. Our study discusses the implementation of the push-relabel network flow algorithm on present-day symmetric multiprocessors (SMPs) with large shared memories. The maximum flow problem is an irregular graph problem and requires frequent fine-grained locking of edges and vertices. Over a decade ago, Anderson and Setubal implemented Goldberg’s push-relabel algorithm for shared-memory parallel computers; however, modern systems differ significantly from those targeted by their implementation in that SMPs today have deep memory hierarchies and different performance costs for synchronization and fine-grained locking. Besides our new cache-aware implementation of Goldberg’s parallel algorithm for modern shared-memory parallel computers, our main new contribution is the first parallel implementation and analysis of the gap relabeling heuristic that runs from 2.1 to 4.3 times faster for sparse graphs.
Searching with Numbers
 Proceedings of WWW
, 2002
Cited by 11 (0 self)
A large fraction of the useful web comprises specification documents that largely consist of ⟨attribute name, numeric value⟩ pairs embedded in text. Examples include product information, classified advertisements, resumes, etc. The approach taken in the past to search these documents by first establishing correspondences between values and their names has achieved limited success because of the difficulty of extracting this information from free text. We propose a new approach that does not require this correspondence to be accurately established. Provided the data has "low reflectivity", we can do effective search even if the values in the data have not been assigned attribute names and the user has omitted attribute names in the query. We give algorithms and indexing structures for implementing the search. We also show how hints (i.e., imprecise, partial correspondences) from automatic data extraction techniques can be incorporated into our approach for better accuracy on high-reflectivity datasets. Finally, we validate our approach by showing that we get high precision in our answers on real datasets from a variety of domains.
Heuristic Initialization for Bipartite Matching Problems
, 2010
Cited by 10 (6 self)
It is a well-established result that improved pivoting in linear solvers can be achieved by computing a bipartite matching between matrix entries and positions on the main diagonal. With the availability of increasingly faster linear solvers, the speed of bipartite matching computations must keep up to avoid slowing down the main computation. Fast algorithms for bipartite matching, which are usually initialized with simple heuristics, have been known for a long time. However, the performance of these algorithms is largely dependent on the quality of the heuristic. We compare combinations of several known heuristics and exact algorithms to find fast combined methods, using real-world matrices as well as randomly generated instances. In addition, we present a new heuristic aimed at obtaining high-quality matchings and compare its impact on bipartite matching algorithms with that of other heuristics. The experiments suggest that its performance compares favorably to the best-known heuristics, and that it is especially suited for application in linear solvers.
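The general pattern described here, a cheap heuristic that produces an initial matching followed by an exact algorithm that completes it, can be sketched in Python as follows. This is an illustrative sketch only: `greedy_init` is the simplest possible first-fit heuristic, not the new heuristic the paper proposes, and the exact phase is a plain depth-first augmenting-path search. All function names and the sample graph are hypothetical.

```python
def greedy_init(adj, n_right):
    """First-fit heuristic: match each left vertex to its first free neighbour."""
    match_left = [-1] * len(adj)
    match_right = [-1] * n_right
    for u, neighbours in enumerate(adj):
        for v in neighbours:
            if match_right[v] == -1:
                match_left[u], match_right[v] = v, u
                break
    return match_left, match_right


def augment(u, adj, match_left, match_right, seen):
    """Depth-first search for an augmenting path starting at free left vertex u."""
    for v in adj[u]:
        if v in seen:
            continue
        seen.add(v)
        # v is free, or the left vertex currently matched to v can be re-routed.
        if match_right[v] == -1 or augment(match_right[v], adj,
                                           match_left, match_right, seen):
            match_left[u], match_right[v] = v, u
            return True
    return False


def maximum_matching(adj, n_right):
    """Heuristic initialization followed by exact augmentation."""
    match_left, match_right = greedy_init(adj, n_right)
    for u in range(len(adj)):
        if match_left[u] == -1:
            augment(u, adj, match_left, match_right, set())
    return match_left


# adj[u] lists the right-hand neighbours of left vertex u (hypothetical instance).
adj = [[0, 1], [0], [1, 2]]
initial, _ = greedy_init(adj, 3)
final = maximum_matching(adj, 3)
print(initial, final)
```

The better the initial matching, the fewer augmenting-path searches the exact phase has to run. The recursive search is adequate for small inputs; a large sparse matrix would call for an iterative version.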
Algorithms and Experiments: The New (and Old) Methodology
 J. Univ. Comput. Sci
, 2001
Cited by 9 (4 self)
The last twenty years have seen enormous progress in the design of algorithms, but little of it has been put into practice. Because many recently developed algorithms are hard to characterize theoretically and have large running-time coefficients, the gap between theory and practice has widened over these years. Experimentation is indispensable in the assessment of heuristics for hard problems, in the characterization of asymptotic behavior of complex algorithms, and in the comparison of competing designs for tractable problems. Implementation, although perhaps not rigorous experimentation, was characteristic of early work in algorithms and data structures. Donald Knuth has throughout insisted on testing every algorithm and conducting analyses that can predict behavior on actual data; more recently, Jon Bentley has vividly illustrated the difficulty of implementation and the value of testing. Numerical analysts have long understood the need for standardized test suites to ensure robustness, precision and efficiency of numerical libraries. It is only recently, however, that the algorithms community has shown signs of returning to implementation and testing as an integral part of algorithm development. The emerging disciplines of experimental algorithmics and algorithm engineering have revived and are extending many of the approaches used by computing pioneers such as Floyd and Knuth and are placing on a formal basis many of Bentley's observations. We reflect on these issues, looking back at the last thirty years of algorithm development and forward to new challenges: designing cache-aware algorithms, algorithms for mixed models of computation, algorithms for external memory, and algorithms for scientific research.
Matching Algorithms Are Fast in Sparse Random Graphs
 THEORY OF COMPUTING SYSTEMS
, 2005
Cited by 7 (0 self)
We present an improved average-case analysis of the maximum cardinality matching problem. We show that in a bipartite or general random graph on n vertices, with high probability every non-maximum matching has an augmenting path of length O(log n). This implies that augmenting path algorithms like the Hopcroft–Karp algorithm for bipartite graphs and the Micali–Vazirani algorithm for general graphs, which have a worst-case running time of O(m√n), run in time O(m log n) with high probability, where m is the number of edges in the graph. Motwani proved these results for random graphs when the average degree is at least ln(n)
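The step from short augmenting paths to the O(m log n) bound can be spelled out for the Hopcroft–Karp case. The sketch below relies on two standard facts about that algorithm: each phase costs O(m), and the length of the shortest augmenting path strictly increases from one phase to the next. The constant c and the symbol L_k are notation introduced here for illustration, not taken from the paper.

```latex
% Assumption: every non-maximum matching has an augmenting path of length at most c \log n.
% L_k denotes the length of the shortest augmenting path at the start of phase k.
\begin{align*}
  L_k &\ge 2k - 1 && \text{(lengths are odd and strictly increase from phase to phase)} \\
  L_k &\le c \log n && \text{(a phase runs only while the matching is non-maximum)} \\
  \Rightarrow\quad k &\le \tfrac{1}{2}\,(c \log n + 1) = O(\log n) \\
  \text{total time} &= O(m) \cdot O(\log n) = O(m \log n)
\end{align*}
```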
Design, implementation, and analysis of maximum transversal algorithms
, 2010
Cited by 7 (5 self)
We report on careful implementations of seven algorithms for solving the problem of finding a maximum transversal of a sparse matrix. We analyse the algorithms and discuss the design choices. To the best of our knowledge, this is the most comprehensive comparison of maximum transversal algorithms based on augmenting paths. Previous papers with the same objective either do not have all the algorithms discussed in this paper or they used nonuniform implementations from different researchers. We use a common base to implement all of the algorithms and compare their relative performance on a wide range of graphs and matrices. We systematize, develop and use several ideas for enhancing performance. One of these ideas improves the performance of one of the existing algorithms in most cases, sometimes significantly, so much so that we use it as the eighth algorithm in comparisons.
Experiments on push-relabel-based maximum cardinality matching algorithms for bipartite graphs
, 2011
Cited by 4 (2 self)
We report on careful implementations of several push-relabel-based algorithms for solving the problem of finding a maximum cardinality matching in a bipartite graph and compare them with fast augmenting-path-based algorithms. We analyze the algorithms using a common base for all implementations and compare their relative performance and stability on a wide range of graphs. The effect of a set of known initialization heuristics on the performance of matching algorithms is also investigated. Our results identify a variant of the push-relabel algorithm and a variant of the augmenting-path-based algorithm as the fastest with proper initialization heuristics, with the push-relabel-based one having better worst-case performance.