Approaches to the Automatic Discovery of Patterns in Biosequences
, 1995
Abstract
Cited by 138 (21 self)
This paper is a survey of approaches and algorithms used for the automatic discovery of patterns in biosequences. Patterns with the expressive power in the class of regular languages are considered, and a classification of pattern languages in this class is developed, covering those patterns which are the most frequently used in molecular bioinformatics. A formulation is given of the problem of the automatic discovery of such patterns from a set of sequences, and an analysis is presented of the ways in which an assessment can be made of the significance and usefulness of the discovered patterns. It is shown that this problem is related to problems studied in the field of machine learning. The largest part of this paper comprises a review of a number of existing methods developed to solve this problem and how these relate to each other, focusing on the algorithms underlying the approaches. A comparison is given of the algorithms, and examples are given of patterns that have been discovered...
A tutorial introduction to the minimum description length principle
 in Advances in Minimum Description Length: Theory and Applications
, 2005
Algorithmic Statistics
 IEEE Transactions on Information Theory
, 2001
Abstract
Cited by 52 (14 self)
While Kolmogorov complexity is the accepted absolute measure of the information content of an individual finite object, a similarly absolute notion is needed for the relation between an individual data sample and an individual model summarizing the information in the data, for example, a finite set (or probability distribution) where the data sample typically came from. The statistical theory based on such relations between individual objects can be called algorithmic statistics, in contrast to classical statistical theory, which deals with relations between probabilistic ensembles. We develop the algorithmic theory of statistic, sufficient statistic, and minimal sufficient statistic. This theory is based on two-part codes consisting of the code for the statistic (the model summarizing the regularity, the meaningful information, in the data) and the model-to-data code. In contrast to the situation in probabilistic statistical theory, the algorithmic relation of (minimal) sufficiency is an absolute relation between the individual model and the individual data sample. We distinguish implicit and explicit descriptions of the models. We give characterizations of the algorithmic (Kolmogorov) minimal sufficient statistic for all data samples for both description modes; in the explicit mode under some constraints. We also strengthen and elaborate earlier results on the "Kolmogorov structure function" and "absolutely nonstochastic objects": those rare objects for which the simplest models that summarize their relevant information (minimal sufficient statistics) are at least as complex as the objects themselves. We demonstrate a close relation between the probabilistic notions and the algorithmic ones: (i) in both cases there is an "information non-increase" law; (ii) it is shown that a function is a...
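The two-part code idea above can be made concrete with a small, self-contained sketch. This is not the paper's construction, only an illustration: the "statistic" is the number of ones in a binary string, and the "model-to-data" part is the string's index within the class of strings sharing that statistic (an enumerative two-part code). Regular strings get short total descriptions; near-random ones do not.

```python
from math import comb, log2

def two_part_code_length(bits):
    """Two-part code for a binary string: first the number of ones
    (the 'statistic' / model part), then the index of the string among
    all strings with that many ones (the model-to-data part)."""
    n, k = len(bits), sum(bits)
    model_bits = log2(n + 1)       # which k in {0, ..., n}
    data_bits = log2(comb(n, k))   # index within the type class
    return model_bits + data_bits

biased = [0] * 90 + [1] * 10   # regular: few ones, small type class
mixed = [0, 1] * 50            # k = n/2: the largest type class

assert two_part_code_length(biased) < 100   # beats the literal 100 bits
assert two_part_code_length(mixed) > 100    # this code cannot shorten it
```

For the biased string the total is roughly 51 bits against 100 literal bits; for the balanced string the type class is maximal and the two-part code saves nothing, mirroring the "nonstochastic" intuition that some strings have no model simpler than themselves.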
Model Selection based on Minimum Description Length
 Journal of Mathematical Psychology
, 1999
Abstract
Cited by 38 (3 self)
this paper is, of necessity, quite technical. To get a first but much gentler glimpse, we advise reading just the following section (2) and the last section (7), which discusses in what sense we may expect Occam's razor to actually work.
A Minimum Description Length Approach to Grammar Inference
 Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language, volume 1004 of Lecture Notes in AI
, 1994
Abstract
Cited by 35 (4 self)
We describe a new abstract model for the computational learning of grammars. The model deals with a learning process in which an algorithm is given as input a large set of training sentences that belong to some unknown grammar. The algorithm then tries to infer this grammar. Our model is based on the well-known Minimum Description Length Principle. It is quite close to, but more general than, several other existing approaches. We have shown that one of these approaches (based on n-gram statistics) coincides exactly with a restricted version of our own model. We have used a restricted version of the algorithm implied by the model to find classes of related words in natural language texts. It turns out that for this task, which can be seen as a 'degenerate' case of grammar learning, our approach gives quite good results. As opposed to many other approaches, it also provides a clear 'stopping criterion' indicating at what point the learning process should stop.
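The MDL selection criterion behind such grammar inference can be sketched in miniature. This is an illustration under standard MDL conventions, not the paper's algorithm: each model's score is its data code length (Shannon code under the empirical distribution) plus a parameter cost of (k/2) log2 n bits for k free parameters. A richer model (here, a bigram model versus a unigram model over characters) is adopted only when its data savings exceed its extra model cost, which is exactly the kind of built-in stopping criterion the abstract mentions.

```python
from collections import Counter
from math import log2

def unigram_dl(s):
    """Description length of s under a unigram model:
    parameter cost + Shannon code length of the data."""
    n, counts = len(s), Counter(s)
    data = -sum(c * log2(c / n) for c in counts.values())
    return (len(counts) - 1) / 2 * log2(n) + data

def bigram_dl(s):
    """Description length of s under a first-order (bigram) model."""
    n = len(s)
    uni = Counter(s)
    ctx = Counter(s[:-1])                # context counts for conditioning
    bi = Counter(zip(s, s[1:]))          # adjacent-pair counts
    data = -log2(uni[s[0]] / n)          # first symbol, coded by unigram
    data += -sum(c * log2(c / ctx[a]) for (a, b), c in bi.items())
    k = len(uni)
    return k * (k - 1) / 2 * log2(n) + data  # k(k-1) conditional parameters

s = "ab" * 100   # strongly structured: each symbol determines the next
assert bigram_dl(s) < unigram_dl(s)   # the richer model earns its cost here
```

On "ab" repeated, the bigram model's data cost collapses to about one bit while the unigram model pays a full bit per symbol, so MDL prefers the bigram model; on data with no sequential structure, the extra parameter cost tells the learner to stop.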
Complexity distortion theory
 in Proc. IEEE Int. Symp. Information Theory
, 1997
Abstract
Cited by 23 (2 self)
Complexity distortion theory (CDT) is a mathematical framework providing a unifying perspective on media representation. The key component of this theory is the substitution of the decoder in Shannon's classical communication model with a universal Turing machine. Using this model, the mathematical framework for examining the efficiency of coding schemes is algorithmic (Kolmogorov) complexity. CDT extends this framework to include distortion by defining the complexity distortion function. We show that despite their different natures, CDT and rate distortion theory (RDT) predict asymptotically the same results, under stationary and ergodic assumptions. This closes the circle of representation models, from the probabilistic models of information proposed by Shannon in information and rate distortion theories, to the deterministic algorithmic models proposed by Kolmogorov in Kolmogorov complexity theory and its extension to lossy source coding, CDT.
Index Terms: Kolmogorov complexity, Markov types, rate distortion function, universal coding.
Measuring Sets in Infinite Groups
, 2002
Abstract
Cited by 13 (6 self)
We are now witnessing a rapid growth of a new part of group theory which has become known as "statistical group theory". A typical result in this area would say something like "a random element (or a tuple of elements) of a group G has a property P with probability p". The validity of a statement like that does, of course, heavily depend on how one defines probability on groups, or, equivalently, how one measures sets in a group (in particular, in a free group). We hope that new approaches to defining probabilities on groups as outlined in this paper create, among other things, an appropriate framework for the study of the "average case" complexity of algorithms on groups.
Algorithmic information theory
 In Handbook on the Philosophy of Information
Abstract
Cited by 6 (1 self)
We introduce algorithmic information theory, also known as the theory of Kolmogorov complexity. We explain the main concepts of this quantitative approach to defining ‘information’. We discuss the extent to which Kolmogorov’s and Shannon’s information theory have a common purpose, and where they are fundamentally different. We indicate how recent developments within the theory allow one to formally distinguish between ‘structural’ (meaningful) and ‘random’ information as measured by the Kolmogorov structure function, which leads to a mathematical formalization of Occam’s razor in inductive inference. We end by discussing some of the philosophical implications of the theory.
Upper Semi-Lattice of Binary Strings with the Relation "x is Simple Conditional to y"
, 1997
Abstract
Cited by 5 (3 self)
We study the properties of the set of binary strings with the relation "the Kolmogorov complexity of x conditional to y is small". We prove that there are pairs of strings which have no greatest common lower bound with respect to this preorder. We present several examples where the greatest common lower bound exists but its complexity is much less than the mutual information (extending the result of Gács and Körner [2]).

1 Introduction

The family of Turing degrees is a well-studied object. Is there any finite analog of it? In the present paper we study such a finite analog: instead of subsets of N we take binary strings, and instead of Turing reducibility we take the relation "the Kolmogorov complexity of x conditional to y is small". This structure has many properties in common with the set of Turing degrees. For example, it also forms an upper semi-lattice. And it is richer, since we can measure the complexity of finite strings. Of course, we should make precise what it means that the Kolmogorov com...
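The relation "K(x|y) is small" can be approximated in practice, since true conditional Kolmogorov complexity is uncomputable. A common heuristic (not from this paper) uses a real compressor: the extra compressed bytes needed for x once y is already available serve as a crude proxy for K(x|y).

```python
import zlib

def cond_cost(x: bytes, y: bytes) -> int:
    """Crude stand-in for K(x|y): extra compressed bytes needed for x
    when y is already present. zlib is only a rough proxy for the
    uncomputable Kolmogorov complexity."""
    return len(zlib.compress(y + x)) - len(zlib.compress(y))

y = b"the quick brown fox jumps over the lazy dog " * 20
x_similar = y[:200]              # a prefix of y: simple conditional to y
x_unrelated = bytes(range(200))  # shares essentially no structure with y

# x_similar sits low in the preorder relative to y; x_unrelated does not.
assert cond_cost(x_similar, y) < cond_cost(x_unrelated, y)
```

Under this proxy the preorder's intuition is visible directly: a string built from material already in y costs almost nothing given y, while an unrelated string costs roughly its own full description.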