## The Fastest And Shortest Algorithm For All Well-Defined Problems (2002)

### Download Links

- [ftp.idsia.ch]
- [www.hutter1.net]
- [www.idsia.ch]
- [www.hutter1.de]
- [arxiv.org]
- DBLP

### Other Repositories/Bibliography

Citations: 37 (7 self)

### BibTeX

@MISC{Hutter02thefastest,

author = {Marcus Hutter},

title = {The Fastest And Shortest Algorithm For All Well-Defined Problems},

year = {2002}

}

### Abstract

An algorithm M is described that solves any well-defined problem p as quickly as the fastest algorithm computing a solution to p, save for a factor of 5 and low-order additive terms. M optimally distributes resources between the execution of provably correct p-solving programs and an enumeration of all proofs, including relevant proofs of program correctness and of time bounds on program runtimes. M avoids Blum's speed-up theorem by ignoring programs without correctness proof. M has broader applicability and can be faster than Levin's universal search, the fastest method for inverting functions save for a large multiplicative constant. An extension of Kolmogorov complexity and two novel natural measures of function complexity are used to show that the most efficient program computing some function f is also among the shortest programs provably computing f.
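The abstract compares M with Levin's universal search, which runs all candidate programs in parallel and gives shorter programs a larger share of the time. The following Python sketch illustrates only that time-sharing idea for inverting a function; it is not Hutter's algorithm M, which additionally enumerates proofs of correctness and of time bounds. The names `levin_search`, `PROGRAMS`, the toy programs, and their assigned lengths are hypothetical and chosen for illustration.

```python
import math
from typing import Callable, Iterator, List, Optional, Tuple

# A toy "program" is (assumed length in bits, generator factory). The generator
# yields None for each step of work and yields a candidate witness when it has one.
Program = Tuple[int, Callable[[int], Iterator[Optional[int]]]]


def linear_scan(x: int) -> Iterator[Optional[int]]:
    """Slow but correct: try y = 0, 1, 2, ... one candidate per step."""
    for y in range(x + 1):
        if y * y == x:
            yield y
            return
        yield None


def wrong_guess(x: int) -> Iterator[Optional[int]]:
    """Fast but usually wrong: proposes x itself as the square root."""
    yield x


def isqrt_guess(x: int) -> Iterator[Optional[int]]:
    """Fast and correct for perfect squares: one step using math.isqrt."""
    yield math.isqrt(x)


def square(y: int) -> int:
    """The function g to invert; it must be cheap to evaluate."""
    return y * y


# Hypothetical program lengths: shorter programs get a larger time share.
PROGRAMS: List[Program] = [(1, linear_scan), (2, wrong_guess), (3, isqrt_guess)]


def levin_search(g: Callable[[int], int], x: int,
                 programs: List[Program], max_phase: int = 20) -> Optional[int]:
    """Levin-style search: in phase k, program p is restarted and run for
    2**(k - l(p)) steps, so its share of each phase is proportional to 2**(-l(p)).
    Every proposed candidate y is verified with g(y) == x before it is returned."""
    for phase in range(1, max_phase + 1):
        for length, make in programs:
            budget = 2 ** (phase - length) if phase >= length else 0
            # zip stops after `budget` steps without pulling an extra item.
            for _, candidate in zip(range(budget), make(x)):
                if candidate is not None:
                    if g(candidate) == x:
                        return candidate
                    break  # wrong answer: drop this program for this phase
    return None


if __name__ == "__main__":
    print(levin_search(square, 49, PROGRAMS))
```

Restarting every program in each phase keeps the sketch short at the cost of repeated work; because budgets double from phase to phase, that repetition adds only a constant factor. The verification step `g(candidate) == x` is what lets the search tolerate incorrect programs such as `wrong_guess`.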

### Citations

6494 | The mathematical theory of communication
- Shannon
- 1948
Citation Context ...B) The time assignment of algorithm B to the $t_p$'s only works if the Kraft inequality $\sum_{(p,t_p)\in L} 2^{-l(p)-l(t_p)} \le 1$ is satisfied [10]. This can be ensured by using prefix-free (e.g. Shannon-Fano) codes [17, 13]. The number of steps to calculate $t_{p'}(x)$ is, by definition, $\mathrm{time}_{t_{p'}}(x)$. The relative computation time $\alpha$ available for computing $t_{p'}(x)$ is $10\% \cdot 2^{-l(p')-l(t_{p'})}$. Hence, $t_{p'}(x)$ is computed and tfas... |

1735 | An introduction to Kolmogorov complexity and its applications
- Li, Vitányi
- 1993
Citation Context ...n steps, but cannot improve the time order. In section 2 we review Levin search and the universal search algorithms SIMPLE and SEARCH, described in [13]. We point out that SIMPLE has the same asymptotic time complexity as SEARCH not only w.r.t. the problem instance, but also w.r.t. the problem class. In Section 3 we elucidate Theorem 1 and the ran... |

545 | Three approaches to the quantitative definition of information
- Kolmogorov
- 1968
Citation Context ...the most likely to be correct. This has been put into a rigorous scheme by [18] and proved to be optimal in [19, 7]. Kolmogorov Complexity is a universal notion of the information content of a string [9, 4, 23]. It is defined as the length of the shortest program computing string x. $K_U(x) := \min_p \{l(p) : U(p) = x\} = K(x) + O(1)$ where U is some universal Turing Machine. It can be shown that $K_U(x)$ varies, at... |

422 | First-order logic and automated theorem proving
- Fitting
- 1990
Citation Context ...to any other time t and space l bounded agent. The computation time of $AI\xi^{tl}$ is of the order $t \cdot 2^l$. sequence by applying the inference rules. See [5] or any other textbook on logic or proof theory. We only need to know that provability, Turing Machines, and computation time can be formalized: 1. The set of (correct) proofs is enumerable. 2. A term... |

386 | Gaussian elimination is not optimal
- Strassen
- 1969
Citation Context ... size $l(x) \sim n^2$, then $t_{p^*}(x) := 2n^3$ upper bounds the true computation time $\mathrm{time}_{p^*}(x) = n^2(2n-1)$. We know there exists an algorithm $p'$ for matrix multiplication with $\mathrm{time}_{p'}(x) \le t_{p'}(x) := c \cdot n^{2.81}$ [21]. The time-bound function (cast to an integer) can, as in many cases, be computed very quickly, $\mathrm{time}_{t_{p'}}(x) = O(\log^2 n)$. Hence, using Theorem 1, also $M_{p^*}$ is fast, $\mathrm{time}_{M_{p^*}}(x) \le 5c \cdot n^{2.81} + O(\log^2 n)$. ... |

262 | Checking computations in polylogarithmic time
- Babai, Fortnow, et al.
- 1991
Citation Context ...ted to subclasses of problems. A more fascinating (and more speculative) way may be the utilization of so-called transparent or holographic proofs [1]. Under certain circumstances they allow an exponential speed-up for checking proofs. This would reduce the constants $c_p$ and $d_p$ to their logarithm, which is a small value. I would like to conclude wit... |

234 | On the length of programs for computing finite binary sequences: statistical considerations
- Chaitin
- 1969
Citation Context ...the most likely to be correct. This has been put into a rigorous scheme by [18] and proved to be optimal in [19, 7]. Kolmogorov Complexity is a universal notion of the information content of a string [9, 4, 23]. It is defined as the length of the shortest program computing string x. $K_U(x) := \min_p \{l(p) : U(p) = x\} = K(x) + O(1)$ where U is some universal Turing Machine. It can be shown that $K_U(x)$ varies, at... |

195 | The complexity of finite objects and the development of the concepts of information and randomness by means of the theory of algorithms
- Zvonkin, Levin
- 1970
Citation Context ...the most likely to be correct. This has been put into a rigorous scheme by [18] and proved to be optimal in [19, 7]. Kolmogorov Complexity is a universal notion of the information content of a string [9, 4, 23]. It is defined as the length of the shortest program computing string x. $K_U(x) := \min_p \{l(p) : U(p) = x\} = K(x) + O(1)$ where U is some universal Turing Machine. It can be shown that $K_U(x)$ varies, at... |

150 | A machine-independent theory of the complexity of recursive functions
- Blum
- 1967
Citation Context ...gorithmic), specification of the problem. Ideally, we would like to have the fastest algorithm, maybe apart from some small constant factor in computation time. Unfortunately, Blum’s Speed-up Theorem [2, 3] shows that there are problems for which an (incomputable) sequence of speed-improving algorithms (of increasing size) exists, but no fastest algorithm. In the approach presented here, we consider onl... |

129 | Complexity-Based Induction Systems: Comparisons and Convergence Theorems
- Solomonoff
- 1978
Citation Context ...ee interpretation of Occam’s razor is that the shortest theory consistent with past data is the most likely to be correct. This has been put into a rigorous scheme by [18] and proved to be optimal in [19, 7]. Kolmogorov Complexity is a universal notion of the information content of a string [9, 4, 23]. It is defined as the length of the shortest program computing string x. $K_U(x) := \min_p \{l(p) : U(p) = x$... |

123 | Universal sequential search problems
- Levin
- 1973
Citation Context ...rch Levin search is one of the few rather general speed-up algorithms. Within a (typically large) factor, it is the fastest algorithm for inverting a function $g : Y \to X$, if g can be evaluated quickly [11, 12]. Given x, an inversion algorithm p tries to find a $y \in Y$, called g-witness for x, with $g(y) = x$. Levin search just runs and verifies the result of all algorithms p in parallel with relative computatio... |

102 | Randomness conservation inequalities: information and independence in mathematical theories
- Levin
- 1984
Citation Context ...rch Levin search is one of the few rather general speed-up algorithms. Within a (typically large) factor, it is the fastest algorithm for inverting a function $g : Y \to X$, if g can be evaluated quickly [11, 12]. Given x, an inversion algorithm p tries to find a $y \in Y$, called g-witness for x, with $g(y) = x$. Levin search just runs and verifies the result of all algorithms p in parallel with relative computatio... |

68 | Minimum description length induction, Bayesianism, and Kolmogorov complexity
- Vitanyi, Li
Citation Context ...s uniquely defined up to an additive constant. K(x) can be approximated from above (is co-enumerable), but not finitely computable. See [13] for an excellent introduction to Kolmogorov Complexity and [22] for a review of Kolmogorov inspired prediction schemes. Recently, Schmidhuber [15] has generalized Kolmogorov complexity in various ways to the limits of computability and beyond. In the following, w... |

63 | Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement
- Schmidhuber, Zhao, et al.
- 1997
Citation Context ...s one might take the field $\mathbb{F}_2 = \{0, 1\}$ to avoid subtleties arising from large numbers. Arithmetic operations are assumed to need one unit of time. [14, 16], when handled with care. The same should hold for Theorem 1, as will be discussed. We avoid the O() notation as far as possible, as it can be severely misleading (e.g. $10^{42} = O(1)^{O(1)} = O(1)$). This... |

54 | A device for quantizing, grouping and coding amplitude modulated pulses
- Kraft
- 1949
Citation Context ...mpanied by different time bounds $t_p$; for instance $(p, \mathrm{time}_p)$ will occur. B) The time assignment of algorithm B to the $t_p$'s only works if the Kraft inequality $\sum_{(p,t_p)\in L} 2^{-l(p)-l(t_p)} \le 1$ is satisfied [10]. This can be ensured by using prefix-free (e.g. Shannon-Fano) codes [17, 13]. The number of steps to calculate $t_{p'}(x)$ is, by definition, $\mathrm{time}_{t_{p'}}(x)$. The relative computation time $\alpha$ available for co... |

52 | A formal theory of inductive inference: Parts 1 and 2
- Solomonoff
- 1964
Citation Context ...sting, classification, ...). A free interpretation of Occam’s razor is that the shortest theory consistent with past data is the most likely to be correct. This has been put into a rigorous scheme by [18] and proved to be optimal in [19, 7]. Kolmogorov Complexity is a universal notion of the information content of a string [9, 4, 23]. It is defined as the length of the shortest program computing strin... |

49 | Universal sequential search problems, Problems Inform - Levin - 1973 |

49 | Discovering neural nets with low Kolmogorov complexity and high generalization capability
- Schmidhuber
- 1997
Citation Context ...function. The large constants $c_p$ and $d_p$ seem to spoil a direct implementation of $M_{p^*}$. On the other hand, Levin search has been successfully applied to solve rather difficult machine learning problems [14, 16], even though it suffers from a large multiplicative factor of similar origin. The use of more elaborate theorem-provers, rather than brute force enumeration of all proofs, could lead to smaller const... |

47 | Gaussian elimination is not optimal. Numerische Mathematik 13(4), 354–356 - Strassen - 1969 |

39 | On the Length of Programs for Computing Binary Sequences - Chaitin - 1966 |

32 | Algorithmic Theories of Everything
- Schmidhuber
- 2000
Citation Context ...(is co-enumerable), but not finitely computable. See [13] for an excellent introduction to Kolmogorov Complexity and [22] for a review of Kolmogorov inspired prediction schemes. Recently, Schmidhuber [15] has generalized Kolmogorov complexity in various ways to the limits of computability and beyond. In the following, we also need a generalization, but of a different kind. We need a short description ... |

22 | New error bounds for Solomonoff prediction
- Hutter
- 1999
Citation Context ...ee interpretation of Occam’s razor is that the shortest theory consistent with past data is the most likely to be correct. This has been put into a rigorous scheme by [18] and proved to be optimal in [19, 7]. Kolmogorov Complexity is a universal notion of the information content of a string [9, 4, 23]. It is defined as the length of the shortest program computing string x. $K_U(x) := \min_p \{l(p) : U(p) = x$... |

21 | On effective procedures for speeding up algorithms
- Blum
- 1969
Citation Context ...gorithmic), specification of the problem. Ideally, we would like to have the fastest algorithm, maybe apart from some small constant factor in computation time. Unfortunately, Blum’s Speed-up Theorem [2, 3] shows that there are problems for which an (incomputable) sequence of speed-improving algorithms (of increasing size) exists, but no fastest algorithm. In the approach presented here, we consider onl... |

19 | Three approaches to the quantitative definition of information - Kolmogorov - 1965 |

18 | A theory of universal artificial intelligence based on algorithmic complexity
- Hutter
- 2000
Citation Context ...roblems as well [20]. Many, but not all problems, are of inversion or optimization type. The matrix multiplication example (section 3), the decision problem SAT [13, p503], and reinforcement learning [8], for instance, cannot be brought into this form. Furthermore, the large factor $2^{l(p)}$ somewhat limits the applicability of Levin search. See [13, pp518-519] for a historical review and further refere... |

17 | Complexity-based induction systems: comparison and convergence theorems - Solomonoff - 1978 |

11 | The complexity of objects and the development of the concepts of information and randomness by means of the theory of algorithms - Zvonkin, Levin - 1970 |

8 | A formal theory of inductive inference: Part 1 and 2 - Solomonoff - 1964 |

5 | New error bounds for Solomonoff prediction - Hutter - 1999 |

5 | Algorithmic theories of everything. (Report IDSIA-20-00 - Schmidhuber - 2000 |

4 | A theory of universal artificial intelligence based on algorithmic complexity (Technical Report - Hutter - 2000 |

3 | Applications of algorithmic probability to artificial intelligence
- Solomonoff
- 1986
Citation Context ...search. However, in practice, SEARCH should be favored, because also constants matter, and $2^{k_{\mathrm{SEARCH}}}$ is rather large. Levin search can be modified to handle time-limited optimization problems as well [20]. Many, but not all problems, are of inversion or optimization type. The matrix multiplication example (section 3), the decision problem SAT [13, p503], and reinforcement learning [8], for instance, c... |

2 | Relations between diagonalization, proof systems, and complexity gaps
- Hartmanis
- 1979
Citation Context ...rmalized within the considered proof system. A formal proof of the correctness of $M_{p^*}$ would prove the consistency of the proof system, which is impossible by Gödel's second incompleteness theorem. See [6] for details in a related context. 8 Generalizations. If $p^*$ has to be evaluated repeatedly, algorithm A can be modified to remember its current sta... |

1 | Applications of algorithmic probability to artificial intelligence - Solomonoff - 1986 |