Results 1–10 of 22
High-Performance Algorithm Engineering for Computational Phylogenetics
J. Supercomputing, 2002
Abstract

Cited by 21 (7 self)
A phylogeny is the evolutionary history of a group of organisms; systematists (and other biologists) attempt to reconstruct this history from various forms of data about contemporary organisms. Phylogeny reconstruction is a crucial step in the understanding of evolution as well as an important tool in biological, pharmaceutical, and medical research. Phylogeny reconstruction from molecular data is very difficult: almost all optimization models give rise to NP-hard (and thus computationally intractable) problems. Yet approximations must be of very high quality in order to avoid outright biological nonsense. Thus many biologists have been willing to run farms of processors for many months in order to analyze just one dataset. High-performance algorithm engineering offers a battery of tools that can reduce, sometimes spectacularly, the running time of existing phylogenetic algorithms, as well as help designers produce better algorithms. We present an overview of algorithm engineering techniques, illustrating them with an application to the "breakpoint analysis" method of Sankoff et al., which resulted in the GRAPPA software suite. GRAPPA demonstrated a speedup in running time of over eight orders of magnitude over the original implementation on a variety of real and simulated datasets. We show how these algorithm engineering techniques are directly applicable to a large variety of challenging combinatorial problems in computational biology.
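The breakpoint distance underlying the "breakpoint analysis" mentioned above is simple to state: count adjacent gene pairs in one gene order that are not adjacent in the other. A minimal sketch for unsigned, linear gene orders (the function name and the list-based encoding are illustrative; GRAPPA itself also handles signed genes and circular genomes):

```python
def breakpoint_distance(pi, sigma):
    """Count adjacent gene pairs in pi that are not adjacent in sigma.

    Gene orders are plain lists of gene labels; adjacency is treated
    as unordered, and gene signs/circularity are ignored for brevity.
    """
    adjacent_in_sigma = {frozenset(pair) for pair in zip(sigma, sigma[1:])}
    return sum(frozenset(pair) not in adjacent_in_sigma
               for pair in zip(pi, pi[1:]))

# Transposing genes 2 and 3 breaks two adjacencies, (1,2) and (3,4):
d = breakpoint_distance([1, 2, 3, 4, 5], [1, 3, 2, 4, 5])  # → 2
```

Breakpoint analysis searches for ancestral gene orders minimizing the total of such distances over a tree, which is where the heavy computation (and the engineering payoff) comes in.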
Semi-matchings for bipartite graphs and load balancing
In Proc. 8th WADS, 2003
Abstract

Cited by 13 (0 self)
We consider the problem of fairly matching the left-hand vertices of a bipartite graph to the right-hand vertices. We refer to this problem as the optimal semi-matching problem; it is a relaxation of the known bipartite matching problem. We present a way to evaluate the quality of a given semi-matching and show that, under this measure, an optimal semi-matching balances the load on the right-hand vertices with respect to any Lp-norm. In particular, when modeling a job assignment system, an optimal semi-matching achieves the minimal makespan and the minimal flow time for the system. The problem of finding optimal semi-matchings is a special case of certain scheduling problems for which known solutions exist. However, these known solutions are based on general network optimization algorithms, and are not the most efficient way to solve the optimal semi-matching problem. To compute optimal semi-matchings efficiently, we present and analyze two new algorithms. The first algorithm generalizes the Hungarian method for computing maximum bipartite matchings, while the second, more efficient algorithm is based on a new notion of cost-reducing paths. Our experimental results demonstrate that the second algorithm is vastly superior to using known network optimization algorithms to solve the optimal semi-matching problem. Furthermore, this same algorithm can also be used to find maximum bipartite matchings and is shown to be roughly as efficient as the best known algorithms for this goal. Key words: bipartite graphs, load balancing, matching algorithms, optimal algorithms, semi-matching
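The load-balancing objective is easy to illustrate in code. The sketch below is a naive greedy baseline, not the paper's Hungarian-style or cost-reducing-path algorithms: every left vertex (job) is assigned to whichever of its right neighbors (machines) currently carries the least load. All names and the dict-based graph encoding are illustrative.

```python
from collections import defaultdict

def greedy_semi_matching(adj):
    """Assign every left vertex to one of its right neighbors,
    greedily picking the currently least-loaded neighbor.

    adj: dict mapping each left vertex to a list of right neighbors.
    Returns a dict mapping each left vertex to its chosen neighbor.
    """
    load = defaultdict(int)  # current load on each right vertex
    match = {}
    # Process the most constrained left vertices first (a common tie-break).
    for u in sorted(adj, key=lambda u: len(adj[u])):
        v = min(adj[u], key=lambda v: load[v])
        match[u] = v
        load[v] += 1
    return match

# Three jobs, two machines: every job is placed, loads end up 2 and 1.
m = greedy_semi_matching({"j1": ["m1", "m2"], "j2": ["m1"], "j3": ["m1", "m2"]})
```

Unlike a matching, a semi-matching assigns every left vertex, so right vertices may be reused; the paper's algorithms then optimize the resulting load profile exactly.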
Graph and Hashing Algorithms for Modern Architectures: Design and Performance
Proc. 2nd Workshop on Algorithm Eng. (WAE '98), Max-Planck-Inst. für Informatik, TR MPI-I-98-1-019, 1998
Abstract

Cited by 12 (0 self)
We study the effects of caches on basic graph and hashing algorithms and show how cache effects influence the best solutions to these problems. We study the performance of basic data structures for storing lists of values and use these results to design and evaluate algorithms for hashing, Breadth-First Search (BFS) and Depth-First Search (DFS). For the basic
Algorithms and Experiments: The New (and Old) Methodology
J. Univ. Comput. Sci., 2001
Abstract

Cited by 9 (4 self)
The last twenty years have seen enormous progress in the design of algorithms, but little of it has been put into practice. Because many recently developed algorithms are hard to characterize theoretically and have large running-time coefficients, the gap between theory and practice has widened over these years. Experimentation is indispensable in the assessment of heuristics for hard problems, in the characterization of asymptotic behavior of complex algorithms, and in the comparison of competing designs for tractable problems. Implementation, although perhaps not rigorous experimentation, was characteristic of early work in algorithms and data structures. Donald Knuth has throughout insisted on testing every algorithm and conducting analyses that can predict behavior on actual data; more recently, Jon Bentley has vividly illustrated the difficulty of implementation and the value of testing. Numerical analysts have long understood the need for standardized test suites to ensure robustness, precision and efficiency of numerical libraries. It is only recently, however, that the algorithms community has shown signs of returning to implementation and testing as an integral part of algorithm development. The emerging disciplines of experimental algorithmics and algorithm engineering have revived and are extending many of the approaches used by computing pioneers such as Floyd and Knuth and are placing on a formal basis many of Bentley's observations. We reflect on these issues, looking back at the last thirty years of algorithm development and forward to new challenges: designing cache-aware algorithms, algorithms for mixed models of computation, algorithms for external memory, and algorithms for scientific research.
Searching with Numbers
Proceedings of WWW, 2002
Abstract

Cited by 9 (0 self)
A large fraction of the useful web comprises specification documents that largely consist of ⟨attribute name, numeric value⟩ pairs embedded in text. Examples include product information, classified advertisements, resumes, etc. The approach taken in the past, of searching these documents by first establishing correspondences between values and their names, has achieved limited success because of the difficulty of extracting this information from free text. We propose a new approach that does not require this correspondence to be accurately established. Provided the data has "low reflectivity", we can do effective search even if the values in the data have not been assigned attribute names and the user has omitted attribute names in the query. We give algorithms and indexing structures for implementing the search. We also show how hints (i.e., imprecise, partial correspondences) from automatic data extraction techniques can be incorporated into our approach for better accuracy on high-reflectivity datasets. Finally, we validate our approach by showing that we get high precision in our answers on real datasets from a variety of domains.
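The attribute-name-free setting can be sketched with a toy nearest-number scoring rule. This is an illustration of the problem setting only, not the paper's actual algorithms or index structures, and all names in it are invented:

```python
def match_score(query_nums, doc_nums):
    """Score a document against a numbers-only query.

    Each query number is charged the relative distance to its nearest
    document number; 0 means every query number appears (nearly) exactly.
    Attribute names are ignored entirely, as in the low-reflectivity
    setting: the numbers alone are assumed to disambiguate attributes.
    """
    def rel_dist(q, d):
        return abs(q - d) / max(abs(q), abs(d), 1)
    return sum(min(rel_dist(q, d) for d in doc_nums) for q in query_nums)

# Query "512 3.2" against a listing containing 512 (MB), 3.2 (GHz), 60 (GB):
s = match_score([512, 3.2], [512, 3.2, 60])  # → 0 (perfect match)
```

Reflectivity, roughly, measures how often different attributes share overlapping value ranges; when it is low, a score like this ranks the right documents highly even without any value-to-name extraction.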
Heuristic Initialization for Bipartite Matching Problems
2010
Abstract

Cited by 9 (5 self)
It is a well-established result that improved pivoting in linear solvers can be achieved by computing a bipartite matching between matrix entries and positions on the main diagonal. With the availability of increasingly faster linear solvers, the speed of bipartite matching computations must keep up to avoid slowing down the main computation. Fast algorithms for bipartite matching, which are usually initialized with simple heuristics, have been known for a long time. However, the performance of these algorithms is largely dependent on the quality of the heuristic. We compare combinations of several known heuristics and exact algorithms to find fast combined methods, using real-world matrices as well as randomly generated instances. In addition, we present a new heuristic aimed at obtaining high-quality matchings and compare its impact on bipartite matching algorithms with that of other heuristics. The experiments suggest that its performance compares favorably to the best-known heuristics, and that it is especially suited for application in linear solvers.
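The role of an initialization heuristic is easy to see in code. Below is the simplest such heuristic, a single greedy pass (this is a textbook baseline, not the new heuristic proposed in the paper): it typically matches most vertices cheaply, and an exact augmenting-path algorithm then only has to fix the leftovers.

```python
def greedy_init(adj):
    """Greedy initialization for bipartite matching.

    adj maps each left vertex to a list of right neighbors.
    Each left vertex takes the first still-free right neighbor, if any;
    returns the partial matching as a left-to-right dict.
    """
    match_l = {}            # left vertex  -> chosen right vertex
    taken = set()           # right vertices already used
    for u, neighbors in adj.items():
        for v in neighbors:
            if v not in taken:
                match_l[u] = v
                taken.add(v)
                break
    return match_l

# Greedy matches only "a"; an exact algorithm would augment by
# re-routing "a" to 1, freeing 0 for "b", for a matching of size 2.
init = greedy_init({"a": [0, 1], "b": [0]})
```

The quality gap shown in the comment is exactly why heuristic choice matters: the fewer unmatched vertices remain, the fewer expensive augmenting-path searches the exact phase must run.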
Matching algorithms are fast in sparse random graphs
Proc. of the 21st Annual Symposium on Theoretical Aspects of Computer Science (STACS), LNCS 2996, 2006
Abstract

Cited by 7 (0 self)
We present an improved average-case analysis of the maximum cardinality matching problem. We show that in a bipartite or general random graph on n vertices, with high probability every non-maximum matching has an augmenting path of length O(log n). This implies that augmenting-path algorithms like the Hopcroft–Karp algorithm for bipartite graphs and the Micali–Vazirani algorithm for general graphs, which have a worst-case running time of O(m√n), run in time O(m log n) with high probability, where m is the number of edges in the graph. Motwani proved these results for random graphs whose average degree is at least ln(n) [Average Case Analysis of Algorithms for Matchings and Related Problems, Journal of the ACM, 41(6), 1994]. Our results hold as soon as the average degree is a large enough constant. At the same time, we simplify Motwani's analysis.
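The augmenting-path mechanism that the analysis above bounds can be shown with the simpler single-path (Kuhn-style) algorithm; this sketch deliberately omits Hopcroft–Karp's phase structure of BFS-batched shortest augmenting paths, and its names and adjacency-list encoding are illustrative.

```python
def max_bipartite_matching(adj, n_right):
    """Maximum bipartite matching via repeated augmenting-path search.

    adj[u] lists the right neighbors of left vertex u (0-indexed);
    n_right is the number of right vertices.
    Returns (matching size, match_r) where match_r[v] is the left
    vertex matched to right vertex v, or None.
    """
    match_r = [None] * n_right

    def try_augment(u, visited):
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # Use v if it is free, or if its partner can move elsewhere.
            if match_r[v] is None or try_augment(match_r[v], visited):
                match_r[v] = u
                return True
        return False

    size = sum(try_augment(u, set()) for u in range(len(adj)))
    return size, match_r

size, match_r = max_bipartite_matching([[0, 1], [0], [1]], 2)  # size → 2
```

Each successful `try_augment` call flips one augmenting path; the result cited above says that in sparse random graphs these paths are O(log n) long with high probability, which is what makes the whole loop fast in practice.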
Design, implementation, and analysis of maximum transversal algorithms
2010
Abstract

Cited by 6 (4 self)
We report on careful implementations of seven algorithms for solving the problem of finding a maximum transversal of a sparse matrix. We analyse the algorithms and discuss the design choices. To the best of our knowledge, this is the most comprehensive comparison of maximum transversal algorithms based on augmenting paths. Previous papers with the same objective either do not cover all the algorithms discussed in this paper, or they used non-uniform implementations from different researchers. We use a common base to implement all of the algorithms and compare their relative performance on a wide range of graphs and matrices. We systematize, develop and use several ideas for enhancing performance. One of these ideas improves the performance of one of the existing algorithms in most cases, sometimes significantly; so much so that we use it as the eighth algorithm in our comparisons.
Two-Level Push-Relabel Algorithm for the Maximum Flow Problem
Abstract

Cited by 3 (1 self)
We describe a two-level push-relabel algorithm for the maximum flow problem and compare it to competing codes. The algorithm generalizes a practical algorithm for bipartite flows. Experiments show that the algorithm performs well on several problem families.
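Push and relabel are two small local operations, which is what makes the method easy to engineer. Sketched below is a textbook single-level FIFO variant; the two-level scheme of the paper and its engineering are not reproduced here, and the dense capacity-matrix representation is chosen purely for brevity.

```python
from collections import deque

def push_relabel_max_flow(cap, s, t):
    """Textbook FIFO push-relabel max flow on a capacity matrix cap."""
    n = len(cap)
    flow = [[0] * n for _ in range(n)]
    height, excess = [0] * n, [0] * n
    height[s] = n
    active = deque()
    for v in range(n):            # saturate all edges out of the source
        if cap[s][v] > 0:
            flow[s][v], flow[v][s] = cap[s][v], -cap[s][v]
            excess[v] += cap[s][v]
            excess[s] -= cap[s][v]
            if v != t:
                active.append(v)
    while active:
        u = active.popleft()
        while excess[u] > 0:      # discharge u completely
            pushed = False
            for v in range(n):
                residual = cap[u][v] - flow[u][v]
                if residual > 0 and height[u] == height[v] + 1:
                    d = min(excess[u], residual)   # push d units u -> v
                    flow[u][v] += d
                    flow[v][u] -= d
                    excess[u] -= d
                    excess[v] += d
                    if v not in (s, t) and excess[v] == d:
                        active.append(v)           # v just became active
                    pushed = True
                    if excess[u] == 0:
                        break
            if not pushed:        # relabel: lift u above a residual neighbor
                height[u] = 1 + min(height[v] for v in range(n)
                                    if cap[u][v] - flow[u][v] > 0)
    return sum(flow[s][v] for v in range(n))

# s=0, t=3; edges s->1 (3), s->2 (2), 1->2 (1), 1->3 (2), 2->3 (3).
cap = [[0, 3, 2, 0], [0, 0, 1, 2], [0, 0, 0, 3], [0, 0, 0, 0]]
f = push_relabel_max_flow(cap, 0, 3)  # → 5
```

The "two-level" refinement in the paper changes how relabeling is organized, not these core operations, which is why the generic version above remains a useful mental model.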
Matching Algorithms and Feature Match Quality Measures for Model-Based Object Recognition with Applications to Automatic Target Recognition
York University, 1999
Abstract

Cited by 2 (0 self)
Preface. Needless to say, this work would not have been possible without the continuing support of Robert Hummel and Benjamin Goldberg. To them goes my deepest gratitude.