## Simplicity, Truth, and the Unending Game of Science (2005)

### BibTeX

@MISC{Kelly05simplicity,

author = {Kevin T. Kelly},

title = {Simplicity, Truth, and the Unending Game of Science},

year = {2005}

}

### Abstract

This paper presents a new explanation of how preferring the simplest theory compatible with experience assists one in finding the true answer to a scientific question when the answers are theories or models. Science is portrayed as an infinite game between science and nature. Simplicity is a structural invariant reflecting sequences of theory choices nature could force the scientist to produce. It is demonstrated that among the methods that converge to the truth in an empirical problem, the ones that do so with a minimum number of reversals of opinion prior to convergence are exactly the ones that prefer simple theories. The idea explains not only simplicity tastes in model selection, but aspects of theory testing and the unwillingness of natural science to break symmetries without a reason. In natural science, one typically faces a situation in which several (or even infinitely many) available theories are compatible with experience. Standard practice is to choose the simplest theory among them and to cite “Ockham’s razor” as the excuse (figure

### Citations

1941 | The Structure of Scientific Revolutions - Kuhn - 1970 |

1691 | An Introduction to Kolmogorov Complexity and Its Applications - Li, Vitanyi - 1997 |
Citation Context: ...ibly be necessary for efficient convergence to the truth in a wide range of distinct problems possessing different structures. That is the trouble with concepts of simplicity like notational brevity (Li and Vitanyi 1997), uniformity of worlds (Carnap 1950), prior probabilistic biases, and historical “entrenchment” (Goodman 1983). Left to themselves, none of these ideas conforms to the essential structural interplay ...

1250 | Information theory and an extension of the maximum likelihood principle - Akaike - 1973 |
Citation Context: ...ibution by setting the free parameters in some statistical model. In that case, the expected squared predictive error of the estimated model will be higher if the model employed is too complex (e.g., Akaike 1973, Forster and Sober 1994). This is a kind of objective, short-run connection between simplicity and truth-finding, but it doesn’t really address the question at hand, which is how Ockham’s razor hel...

941 | The Logic of Scientific Discovery - Popper - 1972 |
Citation Context: ...t should be. Many philosophers have observed that simple theories have various “virtues”, most notably, that simpler or more unified theories are more thoroughly tested by a given evidence set (e.g., Popper 1968, Glymour 1981, Friedman 1983). For if a theory has many free parameters (ways of being true) then new evidence simply “sets” the parameters and there is no risk of the theory itself being refuted alt...

477 | Categories for the Working Mathematician - MacLane - 1972 |
Citation Context: ...0 Examples of non-stacked problems illustrate intuitive ideas about empirical symmetry and will be considered in the next section. The result is: 19 Such a result is called a universal factorization (MacLane 1971, pp. 1-2). 20 To see that the particle-counting problem is stacked, suppose that A is not Ockham upon seeing, say, four particles. Let U be the Ockham answer “four”. Then the binary sequence A ∗ U ma...

328 | Classical descriptive set theory - Kechris - 1995 |
Citation Context: ...ically invariant structure of boundary points between answers to a question. 0.8 The Unending Game of Science Each scientific problem determines an infinite, zero-sum game of perfect information (cf. Kechris 1991) between the scientist, who responds to each information state by selecting an answer (or by refusing to choose), and the impish inductive demon, who responds to the scientist’s current guess hi...

265 | Logical Foundations of Probability - Carnap - 1950 |
Citation Context: ...to the truth in a wide range of distinct problems possessing different structures. That is the trouble with concepts of simplicity like notational brevity (Li and Vitanyi 1997), uniformity of worlds (Carnap 1950), prior probabilistic biases, and historical “entrenchment” (Goodman 1983). Left to themselves, none of these ideas conforms to the essential structural interplay between a problem’s question and its...

218 | Systems that Learn: An Introduction to Learning Theory, second edition - Jain, Osherson, et al. - 1999 |
Citation Context: ... that they are 3 The idea of counting mind-changes already appears in (Putnam 1965). Since then, the idea has been studied extensively by computer scientists interested in computational learning (cf. Jain et al. 1999 for a review). The focus, however, is on categorizing the complexities of problems rather than on singling out Ockham’s razor as an optimal method. Oliver Schulte and I began looking at retraction mi...

172 | On the problem of the most efficient tests of statistical hypotheses - Neyman, Pearson - 1933 |

141 | All of Statistics. A Concise Course in Statistical Inference. Springer - Wasserman - 2004 |

120 | The logic of reliable inquiry - Kelly - 1996 |
Citation Context: ... be grounded in a problem’s topological structure. For example, if the space is separable and the question is a countable partition, then solvability is equivalent to each cell being $\Delta^0_2$ Borel (cf. Kelly 1996). Such questions are not strictly necessary for understanding Ockham’s ... 0.9 Comparing Mind-Changes Consider two possible sequences of answers, σ and τ. Say that σ maps into τ (written σ ≤ τ) just i...

118 | Borel determinacy - Martin - 1975 |

98 | Trial and error predicates and the solution to a problem of Mostowski - Putnam - 1965 |
Citation Context: ...rs, but tuning the parameters toward zero makes the extra effects arbitrarily small and, therefore, arbitrarily hard to detect so that they are 3 The idea of counting mind-changes already appears in (Putnam 1965). Since then, the idea has been studied extensively by computer scientists interested in computational learning (cf. Jain et al. 1999 for a review). The focus, however, is on categorizing the complex...

98 | Theory and evidence - Glymour - 1980 |

86 | “How to tell when Simpler, More Unified or Less ad Hoc Theories will Provide More Accurate Predictions”, The British Journal for the Philosophy of Science - Forster, Sober - 1994 |
Citation Context: ...tting the free parameters in some statistical model. In that case, the expected squared predictive error of the estimated model will be higher if the model employed is too complex (e.g., Akaike 1973, Forster and Sober 1994). This is a kind of objective, short-run connection between simplicity and truth-finding, but it doesn’t really address the question at hand, which is how Ockham’s razor helps you find the true mod...

59 | Foundations of Space-Time Theories - Friedman - 1983 |
Citation Context: ...ers have observed that simple theories have various “virtues”, most notably, that simpler or more unified theories are more thoroughly tested by a given evidence set (e.g., Popper 1968, Glymour 1981, Friedman 1983). For if a theory has many free parameters (ways of being true) then new evidence simply “sets” the parameters and there is no risk of the theory itself being refuted altogether. But a simple theory ...

59 | Logical Foundations of Probability. Chicago - Carnap - 1950 |
Citation Context: ... in a wide range of distinct problems possessing different structures. That is the trouble with concepts of simplicity like notational brevity (Li and Vitanyi 1997 pp. 317-337), uniformity of worlds (Carnap 1950 pp. 562-567), prior probabilistic biases, and historical “entrenchment” (Goodman 1983 pp. 90-100). Left to themselves, none of these ideas conforms to the essential structural interplay between a prob...

57 | Bayesian model selection and model averaging - Wasserman - 2000 |
Citation Context: ...for doesn’t the more testable theory end up more probable after a fair contest? 1 This is a discrete version of the typical restrictions on prior probability in Bayesian model selection described in (Wasserman 2000). [Figure 9: The Miracle Explanation.] One must beware wh...

30 | Minimum description length induction - Vitányi, Li |

28 | A confutation of convergent realism - Laudan - 1981 |
Citation Context: ... failure again and guess “five” (or some greater number of your choosing) rather than the Ockham answer “four”. Philosophers of science call this the “negative induction from the history of science” (Laudan 1981). Why side with Ockham rather than with the negative induction against him? Think of the game of inquiry as starting from scratch at the moment you first say “five”. I am not concerned about what you...

23 | A purely inductive proof of Borel determinacy - Martin |

18 | Why Probability does not Capture the Logic of Scientific Justification - Kelly, Glymour - 2004 |
Citation Context: ...dea to the inference of conservation laws in particle physics (Schulte 2001). The ideas in this essay build upon and substantially simplify and generalize the initial approach taken in (Kelly 2002), (Kelly and Glymour 2004) and (Kelly 2004). ... detected arbitrarily late. For example, in curve fitting, the curvature of a quadratic curve may be so slight that it requires a huge amount of data to notice that the curve is ...

17 | Fact, Fiction, and Forecast, Fourth Edition - Goodman - 1983 |
Citation Context: ...ructures. That is the trouble with concepts of simplicity like notational brevity (Li and Vitanyi 1997), uniformity of worlds (Carnap 1950), prior probabilistic biases, and historical “entrenchment” (Goodman 1983). Left to themselves, none of these ideas conforms to the essential structural interplay between a problem’s question and its underlying informational topology, so none of them could contribute objec...

17 | Efficient Convergence Implies Ockham’s Razor, in - Kelly, E - 2002 |

17 | “Means-Ends Epistemology”, The British - Schulte - 1999 |
Citation Context: ...singling out Ockham’s razor as an optimal method. Oliver Schulte and I began looking at retraction minimization as a way to severely constrain one’s choice of hypothesis in the short run in 1996 (cf. Schulte 1999a, 1999b). Schulte has also applied the idea to the inference of conservation laws in particle physics (Schulte 2001). The ideas in this essay build upon and substantially simplify and generalize the ...

13 | Causation, prediction, and search (second edition) - Spirtes, Glymour, et al. - 2000 |
Citation Context: ...nt that Ockham’s razor is, in some 3 This is particularly true when the features of the model have counterfactual import beyond prediction of the actual sampling distribution, as in causal inference (Spirtes et al. 2000, pp. 47-53). 4 Cf. (Halmos 1974, p. 212, theorem A). Also, see the critical discussion of Bayesian convergence theorems in (Kelly 1996, pp. 302-330). How is this flat tire helping me to get home? B...

10 | Justification as Truth-finding Efficiency - Kelly - 2004 |
Citation Context: ...rvation laws in particle physics (Schulte 2001). The ideas in this essay build upon and substantially simplify and generalize the initial approach taken in (Kelly 2002), (Kelly and Glymour 2004) and (Kelly 2004). ... detected arbitrarily late. For example, in curve fitting, the curvature of a quadratic curve may be so slight that it requires a huge amount of data to notice that the curve is non-linear. 4 So ...

10 | Inferring Conservation Laws in Particle Physics: A Case Study - Schulte - 2001 |
Citation Context: ... way to severely constrain one’s choice of hypothesis in the short run in 1996 (cf. Schulte 1999a, 1999b). Schulte has also applied the idea to the inference of conservation laws in particle physics (Schulte 2001). The ideas in this essay build upon and substantially simplify and generalize the initial approach taken in (Kelly 2002), (Kelly and Glymour 2004) and (Kelly 2004). ... detected arbitrarily late. For...

6 | “Why Glymour is a Bayesian,” in Testing Scientific Theories - Rosenkrantz - 1983 |
Citation Context: ...the complex theory could save the data just as well as the simple one, the simple theory that did so without any ad hoc fiddling ends up being “confirmed” much more sharply by the same data Et (e.g., Rosenkrantz 1983). Surely that explains how severe testability is a mark of truth, for doesn’t the more testable theory end up more probable after a fair contest? 1 This is a discrete version of the typical restricti...

5 | The Logic of Reliable and Efficient - Schulte - 1999 |
Citation Context: ...singling out Ockham’s razor as an optimal method. Oliver Schulte and I began looking at retraction minimization as a way to severely constrain one’s choice of hypothesis in the short run in 1996 (cf. Schulte 1999a, 1999b). Schulte has also applied the idea to the inference of conservation laws in particle physics (Schulte 2001). The ideas in this essay build upon and substantially simplify and generalize the ...

5 | (1714) Monadologie, in Die Philosophischen Schriften von - Leibniz |

3 | Theory and Evidence, Princeton - Glymour - 1980 |

2 | Measure Theory - unknown authors - 1974 |

2 | Kant’s gesammelte Schriften - Kant - 1900 |

2 | Trial and Error Predicates and a Solution to a Problem of Mostowski - unknown authors - 1965 |

1 | Domains for Denotational Semantics Automata - Scott - 1982 |
Citation Context: ...p in the Rice-Shapiro theorem, which characterizes computable verifiability (i.e. recursive enumerability). Furthermore, topology is used to model partial information states in denotational semantics (Scott 1982). ... when a particle appears yields a verification procedure for the proposition that at least one particle will appear. The contradiction is the empty set of worlds (it can’t possibly be true) (figu...

1 | (1790) Kritik der Urtheilskraft. Berlin and Libau: Lagarde und Friederich - Kant |

1 | “How Ockham’s Razor Helps You Find the Truth— Without Pointing at it”, forthcoming - Kelly, K - 2005 |