Algorithmic Statistics
- IEEE Transactions on Information Theory
, 2001
Cited by 41 (8 self)
While Kolmogorov complexity is the accepted absolute measure of information content of an individual finite object, a similarly absolute notion is needed for the relation between an individual data sample and an individual model summarizing the information in the data, for example, a finite set (or probability distribution) where the data sample typically came from. The statistical theory based on such relations between individual objects can be called algorithmic statistics, in contrast to classical statistical theory that deals with relations between probabilistic ensembles. We develop the algorithmic theory of statistic, sufficient statistic, and minimal sufficient statistic. This theory is based on two-part codes consisting of the code for the statistic (the model summarizing the regularity, the meaningful information, in the data) and the model-to-data code. In contrast to the situation in probabilistic statistical theory, the algorithmic relation of (minimal) sufficiency is an absolute relation between the individual model and the individual data sample. We distinguish implicit and explicit descriptions of the models. We give characterizations of algorithmic (Kolmogorov) minimal sufficient statistic for all data samples for both description modes, in the explicit mode under some constraints. We also strengthen and elaborate earlier results on the "Kolmogorov structure function" and "absolutely non-stochastic objects", those rare objects for which the simplest models that summarize their relevant information (minimal sufficient statistics) are at least as complex as the objects themselves. We demonstrate a close relation between the probabilistic notions and the algorithmic ones: (i) in both cases there is an "information non-increase" law; (ii) it is shown that a function is a...
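The two-part code mentioned in this abstract can be written out as follows. This is a sketch in the standard notation of the field rather than a quotation from the paper: the symbols $x$ (the data sample), $S$ (a finite set containing $x$, used as the model) and $K(\cdot)$ (prefix Kolmogorov complexity) are assumptions introduced here for illustration.

```latex
% Two-part description of x via a finite-set model S with x \in S:
% first describe S, then give the index of x within S.
K(x) \;\le\; K(S) + \log |S| + O(1)

% S is an (algorithmic) sufficient statistic for x when the
% two-part code is as short as the shortest description of x:
K(S) + \log |S| \;=\; K(x) + O(1)
```

A minimal sufficient statistic is then a sufficient statistic $S$ whose own complexity $K(S)$ is as small as possible.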
Kolmogorov's Structure Functions and Model Selection
- IEEE TRANS. INFORM. THEORY
, 2003
Cited by 17 (7 self)
In 1974 Kolmogorov proposed a non-probabilistic approach to statistics, an individual combinatorial relation between the data and its model, expressed by the so-called "structure function" of the data. We show that the structure function determines all stochastic properties of the data in the sense of determining the best-fitting model at every model-complexity level. A consequence is this: minimizing the data-to-model code length (finding the ML estimator or MDL estimator), in a class of contemplated models of prescribed maximal (Kolmogorov) complexity, always results in a model of best fit, irrespective of whether the source producing the data is in the model class considered. In this setting, code minimization always separates optimal model information from the remaining accidental information, and not only with high probability. The function that maps the maximal allowed model complexity to the goodness-of-fit (expressed as minimal "randomness deficiency") of the best model cannot itself be monotonically approximated. However, the shortest one-part or two-part code above can -- implicitly optimizing this elusive goodness-of-fit. We show that -- within the obvious constraints -- every graph is realized by the structure function of some data. We determine the (un)computability properties of the various functions contemplated and of the "algorithmic minimal sufficient statistic."
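For orientation, the functions this abstract refers to are usually defined as below. The notation is an assumption made here for illustration ($x$ is the data, $S$ ranges over finite sets containing $x$, $\alpha$ is the allowed model complexity, $K$ is Kolmogorov complexity); the abstract itself does not give formulas.

```latex
% Kolmogorov structure function: the smallest log-cardinality of a
% model of complexity at most \alpha that contains x.
h_x(\alpha) \;=\; \min_{S} \bigl\{ \log |S| \;:\; x \in S,\; K(S) \le \alpha \bigr\}

% Randomness deficiency of x in S: how far x is from being a
% typical (maximally complex) element of S.
\delta(x \mid S) \;=\; \log |S| - K(x \mid S)

% Best achievable fit at model complexity \alpha.
\beta_x(\alpha) \;=\; \min_{S} \bigl\{ \delta(x \mid S) \;:\; x \in S,\; K(S) \le \alpha \bigr\}
```

Read this way, the abstract's claim is that a model attaining $h_x(\alpha)$ also attains $\beta_x(\alpha)$ up to small additive terms, even though $\beta_x$ itself cannot be monotonically approximated.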
The Generalized Universal Law of Generalization
- Journal of Mathematical Psychology
, 2001
Cited by 14 (3 self)
It has been argued by Shepard that there is a robust psychological law that relates the distance between a pair of items in psychological space and the probability that they will be confused with each other. Specifically, the probability of confusion is a negative exponential function of the distance between the pair of items. In experimental contexts, distance is typically defined in terms of a multidimensional Euclidean space---but this assumption seems unlikely to hold for complex stimuli. We show that, nonetheless, the Universal Law of Generalization can be derived in the more complex setting of arbitrary stimuli, using a much more universal measure of distance. This universal distance is defined as the length of the shortest program that transforms the representations of the two items of interest into one another: the algorithmic information distance. It is universal in the sense that it minorizes every computable distance: it is the smallest computable distance. We show ...
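The algorithmic information distance described here is defined through shortest programs and is therefore not computable. As a rough illustration only, the sketch below swaps in a different, practical technique: the normalized compression distance, with `zlib` standing in for Kolmogorov complexity. The function names, the `scale` parameter, and the sample byte strings are all hypothetical and do not come from the paper.

```python
import math
import zlib


def compressed_len(data: bytes) -> int:
    """Length of the zlib-compressed data: a crude stand-in for K(data)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance, a computable proxy (an assumption
    here) for the uncomputable algorithmic information distance."""
    cx, cy, cxy = compressed_len(x), compressed_len(y), compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def confusion_probability(x: bytes, y: bytes, scale: float = 1.0) -> float:
    """Shepard-style generalization: a negative exponential of distance.
    The scale factor is a hypothetical free parameter."""
    return math.exp(-scale * ncd(x, y))


# Illustrative inputs: one highly repetitive item, one with little
# short-range repetition.
repetitive = b"the quick brown fox " * 40
varied = bytes((i * 37 + 11) % 251 for i in range(800))

near = confusion_probability(repetitive, repetitive)
far = confusion_probability(repetitive, varied)
```

Under this proxy, an item compared with itself should come out closer, and hence more confusable, than two unrelated items; how well a real compressor tracks the theoretical distance is not something this sketch establishes.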
Predicting and Controlling Resource Usage in a Heterogeneous Active Network
- In Proceedings of the Third International Workshop on Active Middleware Services
, 2001
Cited by 13 (1 self)
Active network technology envisions deployment of virtual execution environments within network elements, such as switches and routers. As a result, inhomogeneous processing can be applied to network traffic. To use such technology safely and efficiently, individual nodes must provide mechanisms to enforce resource limits. This implies that each node must understand the varying resource requirements for specific network traffic. This paper presents an approach to model the CPU time requirements of active applications in a form that can be interpreted among heterogeneous nodes. Further, the paper demonstrates how this approach can be used successfully to control resources consumed at an active-network node and to predict load among nodes in an active network, when integrated within the Active Virtual Network Management Prediction system.
Minimum description length principle: Generators are preferable to closed patterns
- In AAAI
, 2006
Cited by 9 (2 self)
The generators and the unique closed pattern of an equivalence class of itemsets share a common set of transactions. The generators are the minimal ones among the equivalent itemsets, while the closed pattern is the maximum one. As a generator is usually smaller than the closed pattern in cardinality, by the Minimum Description Length Principle, the generator is preferable to the closed pattern in inductive inference and classification. To efficiently discover frequent generators from a large dataset, we develop a depth-first algorithm called Gr-growth. The idea is novel in contrast to traditional breadth-first bottom-up generator-mining algorithms. Our extensive performance study shows that Gr-growth is significantly faster (by one or even two orders of magnitude when the support thresholds are low) than the existing generator-mining algorithms. It can also be faster than the state-of-the-art frequent closed itemset mining algorithms such as FPclose and CLOSET+.
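The definitions of generator and closed pattern can be made concrete with a small brute-force sketch. This is not Gr-growth, which the abstract describes only at a high level; it simply enumerates every itemset, which is feasible only for toy data. The function names and the three example transactions are hypothetical.

```python
from itertools import combinations


def equivalence_classes(transactions):
    """Group non-empty itemsets by the set of transactions containing them.

    Brute force over all itemsets: exponential in the number of items,
    so suitable for illustration only.
    """
    items = sorted(set().union(*transactions))
    classes = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            cover = frozenset(
                i for i, t in enumerate(transactions) if itemset <= t
            )
            if cover:  # skip itemsets that occur in no transaction
                classes.setdefault(cover, []).append(itemset)
    return classes


def closed_and_generators(members):
    """Return the closed pattern (the maximum member, which equals the
    union of all members) and the generators (the minimal members)."""
    closed = frozenset().union(*members)
    generators = [s for s in members if not any(o < s for o in members)]
    return closed, generators


# Hypothetical toy dataset: transaction indices are 0, 1, 2.
transactions = [frozenset("abc"), frozenset("abc"), frozenset("ad")]
classes = equivalence_classes(transactions)
closed, generators = closed_and_generators(classes[frozenset({0, 1})])
```

For the class of itemsets occurring exactly in transactions 0 and 1, the closed pattern is {a, b, c} while the generators are the single items {b} and {c}, which shows why a generator is typically the shorter description of the same set of transactions.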
Kolmogorov's Structure Functions with an Application to the Foundations of Model Selection
- In Proceedings of the 43rd Annual Symposium on Foundations of Computer Science. IEEE Computer Society
, 2002
Cited by 9 (0 self)
In 1974 Kolmogorov proposed a non-probabilistic approach to statistics, an individual combinatorial relation between the data and its model. We vindicate, for the first time, the rightness of the original "structure function", proposed by Kolmogorov: minimizing the data-to-model code length (finding the ML estimator or MDL estimator), in a class of contemplated models of prescribed maximal (Kolmogorov) complexity, always results in a model of best fit (expressed as minimal randomness deficiency). We show that both the structure function and the minimum randomness deficiency function can assume all shapes over their full domain (improving an old result of L.A. Levin and both an old and a recent one of V.V. Vyugin). We determine the (un)computability properties of the various functions and "algorithmic sufficient statistic."
Abstract Hierarchy in Cognitive Maps and the Simplicity Principle
, 2002
The aim of this paper is to relate the research on cognitive maps, mental representations of the external world, which are thought to be organized hierarchically, with the simplicity principle, a formal version of Occam’s Razor inductive bias, which states that short hypotheses are to be preferred over longer ones. First, I will describe some important properties of hierarchical structures and define what is meant by simplicity. Next, I will review some results supporting the view that cognitive maps are structured hierarchically, not following Euclidean metrics, and that some cognitive functions seem to prefer simpler representations over complex ones. This hints at the possibility that the hierarchical organization of cognitive maps may also follow the simplicity principle. Finally, I will suggest an experiment in which such a hypothesis could be tested and discuss the implications and limitations of integrating both approaches. 1 What are hierarchies and why are they important? Hierarchies are ordered structures of objects arranged over several distinct levels. Individuals at the same level share common properties and are related to each other through a partial ordering relation. Ob-
Towards an Algorithmic Statistics (Extended Abstract)
Peter Gács, John Tromp, and Paul Vitányi. Abstract. While Kolmogorov complexity is the accepted absolute measure of information content of an individual finite object, a similarly absolute notion is needed for the relation between an individual data sample and an individual model summarizing the information in the data, for example, a finite set where the data sample typically came from. The statistical theory based on such relations between individual objects can be called algorithmic statistics, in contrast to ordinary statistical theory that deals with relations between probabilistic ensembles. We develop a new algorithmic theory of typical statistic, sufficient statistic, and minimal sufficient statistic. 1 Introduction We take statistical theory to ideally consider the following problem: Given a data sample and a family of models (hypotheses) one wants to select the model that produced the data. But a priori it is possible that the data is atypical for the...
An Active Model-Based Prototype for Predictive Network Management
, 2005
If current trends continue, the next generation of enterprise networks is likely to become a more complex mixture of hardware, communication media, architectures, protocols, and standards. One approach toward reducing the management burden caused by growing complexity is to integrate management support into the inherent function of network operation. In this paper, management support is provided in the form of network components that, simultaneously with their network function, collaboratively project and adjust projections of future state based upon actual network state. It is well known that more accurate predictions over a longer time horizon enable better control decisions. This paper focuses upon improving prediction; the many potential uses of predictive capabilities for predictive network control will be addressed in future work.

