## Empirical Hardness Models: Methodology and a Case Study on Combinatorial Auctions

Citations: 18 (6 self)

### BibTeX

@MISC{Leyton-brown_empiricalhardness,

author = {Kevin Leyton-Brown},

title = {Empirical Hardness Models: Methodology and a Case Study on Combinatorial Auctions},

year = {}

}

### Abstract

Is it possible to predict how long an algorithm will take to solve a previously-unseen instance of an NP-complete problem? If so, what uses can be found for models that make such predictions? This paper provides answers to these questions and evaluates the answers experimentally. We propose the use of supervised machine learning to build models that predict an algorithm’s runtime given a problem instance. We discuss the construction of these models and describe techniques for interpreting them to gain understanding of the characteristics that cause instances to be hard or easy. We also present two applications of our models: building algorithm portfolios that outperform their constituent algorithms, and generating test distributions that emphasize hard problems. We demonstrate the effectiveness of our techniques in a case study of the combinatorial auction winner determination problem. Our experimental results show that we can build very accurate models of an algorithm’s running time, interpret our models, build an algorithm portfolio that strongly outperforms the best single algorithm, and tune a standard benchmark suite to generate much harder problem instances.
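The idea summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the synthetic features, the two hypothetical algorithms `a` and `b`, the choice of log10 runtime as the regression target, and the ridge penalty `lam` are all assumptions made for illustration. The sketch fits a quadratic regression model per algorithm and then builds a portfolio that runs, on each instance, the algorithm with the lowest predicted runtime.

```python
import numpy as np

def quadratic_features(X):
    """Expand raw features with a bias term and all pairwise products."""
    n, d = X.shape
    cols = [np.ones(n)] + [X[:, i] for i in range(d)]
    cols += [X[:, i] * X[:, j] for i in range(d) for j in range(i, d)]
    return np.column_stack(cols)

def fit_ridge(Phi, y, lam=1e-3):
    """Closed-form ridge regression: w = (Phi^T Phi + lam*I)^-1 Phi^T y."""
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

def predict(w, X):
    """Predicted log10 runtime for each row of raw features X."""
    return quadratic_features(X) @ w

def select_algorithm(models, X):
    """Per-instance portfolio choice: index of the lowest predicted runtime."""
    preds = np.column_stack([predict(w, X) for w in models])
    return preds.argmin(axis=1)

# Synthetic data (assumption): 3 instance features, 2 hypothetical algorithms
# whose log10 runtimes depend on the features in different ways.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(400, 3))
log_rt_a = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1] * X[:, 2] + rng.normal(0, 0.05, 400)
log_rt_b = 1.0 - 2.0 * X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.05, 400)

train, test = slice(0, 300), slice(300, 400)
models = [fit_ridge(quadratic_features(X[train]), y[train])
          for y in (log_rt_a, log_rt_b)]

# Model accuracy on held-out instances, measured as RMSE in log10 runtime.
rmse_a = np.sqrt(np.mean((predict(models[0], X[test]) - log_rt_a[test]) ** 2))

# Portfolio: run only the algorithm predicted to be faster on each instance.
choice = select_algorithm(models, X[test])
actual = np.column_stack([log_rt_a[test], log_rt_b[test]])
portfolio_time = (10.0 ** actual[np.arange(100), choice]).sum()
best_single_time = (10.0 ** actual).sum(axis=0).min()

print(f"RMSE (algorithm a, log10 runtime): {rmse_a:.3f}")
print(f"portfolio total runtime:   {portfolio_time:.1f}")
print(f"best single algorithm:     {best_single_time:.1f}")
```

In this toy setting the two algorithms are fast on complementary regions of the feature space, so a portfolio driven by the learned models should beat either algorithm run alone. How large the gap is on real data depends on how well the features explain runtime.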

### Citations

2066 | The Elements of Statistical Learning
- Hastie, Tibshirani, et al.
- 2001
Citation Context ...riety of different regression techniques; the most appropriate for our purposes perform supervised learning. A large literature addresses these statistical techniques; for an introduction see, e.g., [Hastie et al. 2001]. Such techniques choose a function from a given hypothesis space (i.e., a space of candidate mappings from the features to the running time) in order to minimize a given error metric (a function tha... |

1040 | Wrappers for feature subset selection
- Kohavi, John
- 1997
Citation Context ... also an option. 14 For a detailed discussion of techniques for selecting relevant feature subsets and for comparisons of different definitions of “relevant,” focusing on classification problems, see [Kohavi and John 1997]. Because it is often the case that many variables are correlated in complex ways, there may be other very different sets of features that would achieve similar performance. Furthermore, since our... |

901 | Sequential Monte Carlo Methods in Practice - Doucet, Freitas, et al. - 2001 |

666 | The strength of weak learnability - Schapire - 1989 |

578 | Where the really hard problems are - Cheeseman, Kanefsky, et al. - 1991 |

515 | Algorithm for optimal winner determination in combinatorial auctions
- Sandholm
- 2002
Citation Context ...actable subcases of the problem and addressed computational approaches to the general WDP only briefly. The first algorithms designed specifically for the general WDP were published at IJCAI in 1999 [Sandholm 1999; Fujishima et al. 1999]; the authors of these papers subsequently improved and extended upon their algorithms in [Sandholm et al. 2001; Leyton-Brown et al. 2000b]. BoB [Sandholm 1999] and CASS [Fujis... |

427 | Multivariate Adaptive Regression Splines
- Friedman
- 1988
Citation Context ...able to analysis. Before discussing quadratic regression further, we will begin by describing the other approaches that we considered. First, we tried Multivariate Adaptive Regression Splines (MARS) [Friedman 1991]. MARS models are linear combinations of the products of one or more basis functions, where basis functions are the positive parts of linear functions of single features. The RMSE on our MARS models ... |

316 | Spectrum Auctions - Cramton |

313 | Computationally manageable combinatorial auctions
- Rothkopf, Pekeč, et al.
- 1998
Citation Context ...h step in our case study. 3.1 Step 1: Selecting an Algorithm. There are many WDP algorithms from which to choose, as much recent work has addressed this problem. A very influential early paper was [Rothkopf et al. 1998], but it focused on tractable subcases of the problem and addressed computational approaches to the general WDP only briefly. The first algorithms designed specifically for the general WDP were publi... |

269 | Taming the Computational Complexity of Combinatorial Auctions: Optimal and Approximate Approaches
- Fujishima, Leyton-Brown, et al.
- 1999
Citation Context ...s of the problem and addressed computational approaches to the general WDP only briefly. The first algorithms designed specifically for the general WDP were published at IJCAI in 1999 [Sandholm 1999; Fujishima et al. 1999]; the authors of these papers subsequently improved and extended upon their algorithms in [Sandholm et al. 2001; Leyton-Brown et al. 2000b]. BoB [Sandholm 1999] and CASS [Fujishima et al. 1999] made ... |

246 | Bidding and allocation in combinatorial auctions - Nisan - 2000 |

240 | Stochastic Local Search: Foundations and Applications
- Hoos, Stützle
- 2004
Citation Context ...which may explain the observed empirical behavior of SAT solvers. A lot of effort has also gone into the study of search space topologies for stochastic local search algorithms [Hoos and Stützle 1999; Hoos and Stützle 2004]. Finally, some work has been more theoretical in nature. For example, Kolaitis [2003] defined and studied “islands of tractability” of hard problems. 1.2 Past Work: Empirical Hardness of Optimizat... |

177 | Bayesian experimental design: A review
- Chaloner, Verdinelli
- 1995
Citation Context ...arlier work [Leyton-Brown et al. 2002], though we present it somewhat differently here. We also note that it is related to approaches for statistical experiment design (see, e.g., [Mason et al. 2003; Chaloner and Verdinelli 1995]). 1. Select an algorithm of interest. 2. Select an instance distribution. In practice, this may be achieved as a distribution over different instance generators, along with a distribution over each ... |

169 | Combinatorial auctions: A survey - Vries, Vohra - 2003 |

149 | Heavy-tailed phenomena in satisfiability and constraint satisfaction problems
- Gomes, Selman, et al.
- 2000
Citation Context ...dy natural phenomena. This approach has also been applied to other decision problems such as quasigroup completion [Gomes and Selman 1997]. Follow-up work took a closer look at runtime distributions [Gomes et al. 2000], demonstrating that runtimes of many SAT algorithms tend to follow power-law distributions, and that random restarts provably improve such algorithms. Later, Gomes et al. [2004] refined these notions... |

138 | Towards a universal test suite for combinatorial auction algorithms - Leyton-Brown, Shoham - 2000 |

137 | Analysis of two simple heuristics on a random instance of k-SAT
- Frieze, Suen
- 1996
Citation Context ...ansition occurs, particularly for uniform random 3-SAT and k-SAT, and describing how the easy–hard–less hard transition arises, particularly for simple backtracking-based (so-called DPLL) algorithms [Frieze and Suen 1996; Dubois and Boufkhad 1997; Beame et al. 1998; Dubois et al. 2000; Franco 2001; Achlioptas 2001; Cocco and Monasson 2004; Achlioptas et al. 2004]. The last decade or so has seen increased enthusiasm f... |

127 | Algorithm portfolios - Gomes, Selman - 2001 |

122 | CABOB: A fast optimal algorithm for combinatorial auctions. IJCAI
- Sandholm, Suri, et al.
- 2001
Citation Context ...igned specifically for the general WDP were published at IJCAI in 1999 [Sandholm 1999; Fujishima et al. 1999]; the authors of these papers subsequently improved and extended upon their algorithms in [Sandholm et al. 2001; Leyton-Brown et al. 2000b]. BoB [Sandholm 1999] and CASS [Fujishima et al. 1999] made use of classical AI heuristic search techniques, structuring their search by branching on bids and goods respect... |

111 | An economics approach to hard computational problems - Huberman, Lukose, et al. - 1997 |

103 | Estimating the efficiency of backtracking programs - Knuth - 1975 |

98 | Generating hard satisfiability problems - Selman, Mitchell, et al. - 1996 |

93 | Solving combinatorial auctions using stochastic local search
- Hoos, Boutilier
- 2000
Citation Context ...lgorithms using artificial distributions. 3.2.1 Legacy Data Distributions. Along with the first wave of algorithms for the WDP, seven distributions were proposed [Sandholm 1999; Fujishima et al. 1999; Hoos and Boutilier 2000]. These distributions have been widely used by other researchers including many 4 We should mention that the CABOB algorithm continues to be developed commercially through CombineNet, Inc. (http://ww... |

92 | SATzilla: portfolio-based algorithm selection for SAT
- Xu, Hutter, et al.
- 2008
Citation Context ...ethodology may be used to build empirical hardness models for both randomized tree search algorithms and stochastic local search algorithms [Nudelman et al. 2004b; Hutter et al. 2006; Xu et al. 2007; Xu et al. 2008]. Based on the findings in that work, we observe that our techniques extend directly to cases where the algorithm’s running time varies from one invocation to another. Indeed, even when we do restric... |

86 | The algorithm selection problem
- Rice
- 1976
Citation Context ... highly variable from instance to instance. When algorithms exhibit high runtime variance, it can be difficult to decide which algorithm to use; Rice dubbed this the “algorithm selection problem” [Rice 1976]. In the nearly three decades that have followed, the issue of algorithm selection has failed to receive widespread study, though of course some excellent work does exist. (We discuss much of this wo... |

86 | Backdoors to typical case complexity - Williams, Gomes, et al. - 2003 |

85 | Typical random 3-SAT formulae and the satisfiability threshold
- Dubois, Boufkhad, et al.
- 2000
Citation Context ...nd describing how the easy–hard–less hard transition arises, particularly for simple backtracking-based (so-called DPLL) algorithms [Frieze and Suen 1996; Dubois and Boufkhad 1997; Beame et al. 1998; Dubois et al. 2000; Franco 2001; Achlioptas 2001; Cocco and Monasson 2004; Achlioptas et al. 2004]. The last decade or so has seen increased enthusiasm for the idea of studying algorithm performance experimentally, usi... |

80 | Generating satisfiable problem instances
- Achlioptas, Gomes, et al.
- 2000
Citation Context ...ed recent work. Journal of the ACM, Vol. 22, No. 4, June 2009. Empirical Hardness Models: Methodology and a Case Study on Combinatorial Auctions · 3 on the notion of a backbone [Monasson et al. 1998; Achlioptas et al. 2000], which is the set of solution invariants. Williams et al. [2003] defined the concept of a backdoor of a CSP instance: the set of variables, which, if assigned correctly, lead to a residual problem t... |

71 | Problem structure in the presence of perturbations
- Gomes, Selman
- 1997
Citation Context ...his work has complemented the theoretical worst-case analysis of algorithms, leading to interesting findings and concepts. For example, this approach was applied to the quasigroup completion problem [Gomes and Selman 1997]. Follow-up work took a closer look at runtime distributions [Gomes et al. 2000], demonstrating that runtimes of many SAT algorithms tend to follow power-law distributions and that random restarts pr... |

69 | Optimal solutions for multi-unit combinatorial auctions: Branch and bound heuristics
- Gonen, Lehmann
- 2000
Citation Context ...e these distributions so that they would generate hard instances. Based on experimental evidence, some researchers have remarked that some CATS problems are comparatively easy in practice (see e.g., [Gonen and Lehmann 2000; Sandholm et al. 2001]). In Section 3.5.1 we show experimentally that some CATS distributions are always very easy for CPLEX, while others can be extremely hard. We consider whether these distributio... |

68 | Lower bounds for random 3-SAT via differential equations. Theoretical Computer Science 265 - Achlioptas - 2001 |

63 | A Bayesian approach to tackling hard computational problems - Horvitz, Ruan, et al. - 2001 |

62 | High-level optimization via automated statistical modeling - Brewer - 1995 |

60 | An algorithm for multi-unit combinatorial auctions
- Leyton-Brown, Shoham, et al.
- 2000
Citation Context ...goods selected. 3.2.2 CATS Distributions. The above distributions have been criticized in several ways, perhaps most significantly for lacking economic justification [see, e.g., Anderson et al. 2000; Leyton-Brown et al. 2000; de Vries and Vohra 2003]. This criticism was significant because the WDP is ultimately a weighted set packing problem; if the data on which algorithms are evaluated lacks any connection to the combi... |

59 | Learning the empirical hardness of optimization problems: The case of combinatorial auctions
- Leyton-Brown, Nudelman, et al.
- 2002
Citation Context ...he running time of a given algorithm on individual instances of a problem such as WDP, where instances are drawn from some arbitrary distribution. We first discussed this methodology in earlier work [Leyton-Brown et al. 2002], though we present it somewhat differently here. We also note that it is related to approaches for statistical experiment design (see, e.g., [Mason et al. 2003; Chaloner and Verdinelli 1995]). 1. Se... |

56 | Least angle regression. The Annals of Statistics
- Efron, Hastie, et al.
Citation Context ...mber of features. Thus, it is worthwhile to explore faster ways of building these models. The Shermann16 When we applied our methodology to SAT [Nudelman et al. 2004b] we also used the LAR algorithm [Efron et al. 2004]. LAR is a shrinkage technique for linear regression that can set the coefficients of sufficiently unimportant variables to zero as well as simply reducing them; thus, it can also be used for subs... |

55 | A general upper bound for the satisfiability threshold of random r-SAT formulae
- Dubois, Boufkhad
- 1997
Citation Context ...cularly for uniform random 3-SAT and k-SAT, and describing how the easy–hard–less hard transition arises, particularly for simple backtracking-based (so-called DPLL) algorithms [Frieze and Suen 1996; Dubois and Boufkhad 1997; Beame et al. 1998; Dubois et al. 2000; Franco 2001; Achlioptas 2001; Cocco and Monasson 2004; Achlioptas et al. 2004]. The last decade or so has seen increased enthusiasm for the idea of studying al... |

52 | Algorithm selection using reinforcement learning - Lagoudakis, Littman - 2000 |

52 | On the complexity of unsatisfiability proofs for random k-cnf formulas
- Beame, Karp, et al.
- 1998
Citation Context ... 3-SAT and k-SAT, and describing how the easy–hard–less hard transition arises, particularly for simple backtracking-based (so-called DPLL) algorithms [Frieze and Suen 1996; Dubois and Boufkhad 1997; Beame et al. 1998; Dubois et al. 2000; Franco 2001; Achlioptas 2001; Cocco and Monasson 2004; Achlioptas et al. 2004]. The last decade or so has seen increased enthusiasm for the idea of studying algorithm performance... |

49 | CABOB: A Fast Optimal Algorithm for Winner Determination in Combinatorial Auctions. Management Science 51(3 - Sandholm, Suri, et al. - 2005 |

45 | Understanding random sat: beyond the clauses-to-variables ratio - Nudelman, Leyton-Brown, et al. - 2004 |

44 | Towards a characterisation of the behaviour of stochastic local search algorithms for SAT
- Hoos, Stützle
- 1999
Citation Context ...have small backdoors, which may explain the observed empirical behavior of SAT solvers. A lot of effort has also gone into the study of search space topologies for stochastic local search algorithms [Hoos and Stützle 1999; Hoos and Stützle 2004]. Finally, some work has been more theoretical in nature. For example, Kolaitis [2003] defined and studied “islands of tractability” of hard problems. 1.2 Past Work: Empirical... |

38 | A portfolio approach to algorithm selection
- Leyton-Brown, Nudelman, et al.
- 2003
Citation Context ...nts both on fixed and variable-sized data. We used three fixed-size datasets which we had studied in our previously published conference papers on empirical hardness models [Leyton-Brown et al. 2002; Leyton-Brown et al. 2003b; Leyton-Brown et al. 2003a]. These datasets were collected using CPLEX 7.1. We also added a new variable-size dataset which represented roughly three times as much computer time. Because a new versi... |

36 | The adaptive constraint engine - Epstein, Freuder, et al. - 2002 |

33 | Applying machine learning to low-knowledge control of optimization algorithms - Carchrae, Beck |

33 | SATzilla07: The design and analysis of an algorithm portfolio for SAT
- Xu, Hutter, et al.
- 2007
Citation Context ...rated that our methodology may be used to build empirical hardness models for both randomized tree search algorithms and stochastic local search algorithms [Nudelman et al. 2004b; Hutter et al. 2006; Xu et al. 2007; Xu et al. 2008]. Based on the findings in that work, we observe that our techniques extend directly to cases where the algorithm’s running time varies from one invocation to another. Indeed, even wh... |

31 | Complexity analysis of admissible heuristic search - Reid - 1998 |

31 | Restart policies with dependence among runs : a dynamic programming approach - Ruan, Horvitz, et al. - 2002 |

30 | Linear programming helps solving large multi-unit combinatorial auctions
- Gonen, Lehmann
- 2002
Citation Context ...urpose software, Sandholm et al.’s CABOB [Sandholm et al. 2001]. (In fact, CABOB makes use of CPLEX’s linear programming package as a subroutine and also uses branch-and-bound search.) Likewise, GL ([Gonen and Lehmann 2001]) is a branch-and-bound algorithm that uses CPLEX’s LP solver as its heuristic. Thus, the combinatorial auctions research community has seen convergence towards branch-and-bound search with an LP heu... |

28 | Statistical regimes across constrainedness regions - Gomes, Fernández, et al. - 2005 |