Results 1 – 8 of 8
The Similarity Metric
IEEE Transactions on Information Theory, 2003
Cited by 192 (29 self)
A new class of distances appropriate for measuring similarity relations between sequences, say one type of similarity per distance, is studied. We propose a new "normalized information distance", based on the noncomputable notion of Kolmogorov complexity, and show that it is in this class and that it minorizes every computable distance in the class (that is, it is universal in that it discovers all computable similarities). We demonstrate that it is a metric and call it the similarity metric. This theory forms the foundation for a new practical tool. To demonstrate generality and robustness, we give two distinctive applications in widely divergent areas, using standard compression programs like gzip and GenCompress. First, we compare whole mitochondrial genomes and infer their evolutionary history. This yields the first completely automatically computed whole-mitochondrial phylogeny tree. Second, we fully automatically compute the language tree of 52 different languages.
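In practice, the normalized information distance of this abstract is approximated by the normalized compression distance, replacing Kolmogorov complexity with the output length of a real compressor. A minimal sketch in Python, using zlib as a stand-in for the gzip/GenCompress compressors the abstract mentions (the function names are my own):

```python
import zlib

def compressed_len(data: bytes) -> int:
    # zlib at maximum compression, standing in for gzip/GenCompress
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: a computable approximation
    of the (noncomputable) normalized information distance."""
    cx, cy, cxy = compressed_len(x), compressed_len(y), compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

phrase = b"the quick brown fox jumps over the lazy dog " * 20
other = bytes((i * 37) % 256 for i in range(880))
print(ncd(phrase, phrase) < ncd(phrase, other))  # True: a sequence is nearer to itself
```

The intuition: if y is similar to x, then compressing their concatenation costs little more than compressing x alone, so the numerator is small.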
Dualities Between Entropy Functions and Network Codes
2008
Cited by 14 (6 self)
Characterization of the set of entropy functions Γ* is an important open problem in information theory. The region Γ* is central to the theory of information inequalities, and as such could be regarded as a key to the basic laws of information theory. Characterization of Γ* has several important consequences. In probability theory, it would provide a solution to the implication problem of conditional independence. In communications networks, the capacity region of multi-source network coding is given in terms of Γ*. More broadly, determination of Γ* would have an impact on converse theorems for multiterminal problems in information theory. This paper provides several new dualities between entropy functions and network codes. Given a function g ≥ 0 defined on all proper subsets of N random variables, we provide a construction for a network multicast problem which is "solvable" if and only if g is the entropy function of a set of quasi-uniform random variables. The underlying network topology is fixed, and the multicast problem depends on g only through link capacities and source rates. A corresponding duality is developed for linear network codes, where the constructed multicast problem is linearly solvable if and only if g is linear group characterizable. Relaxing the requirement that the domain of g be subsets of random variables, we obtain a similar duality between polymatroids and the linear programming bound. These duality results provide an alternative proof of the insufficiency of linear (and abelian) network codes, and demonstrate the utility of non-Shannon inequalities in tightening outer bounds on network coding capacity regions.
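For intuition about the polymatroid duality mentioned above: the Shannon outer bound on Γ* consists exactly of the polymatroids, i.e. set functions g with g(∅) = 0 that are monotone and submodular, and every entropy function satisfies these axioms. A small Python sketch (an illustration with my own naming, not code from the paper) that checks them for a candidate g given as a dict keyed by sorted index tuples:

```python
from itertools import chain, combinations

def subsets(n):
    """All subsets of {0, ..., n-1} as sorted tuples."""
    return chain.from_iterable(combinations(range(n), r) for r in range(n + 1))

def is_polymatroid(g, n, tol=1e-9):
    """Check the Shannon (polymatroid) axioms that every entropy
    function must satisfy: normalization, monotonicity, submodularity."""
    if abs(g[()]) > tol:                          # g(empty set) = 0
        return False
    for A in subsets(n):
        for i in range(n):
            if i not in A:
                Ai = tuple(sorted(A + (i,)))
                if g[A] > g[Ai] + tol:            # monotonicity: g(A) <= g(A ∪ {i})
                    return False
    for A in subsets(n):
        for B in subsets(n):
            U = tuple(sorted(set(A) | set(B)))
            I = tuple(sorted(set(A) & set(B)))
            if g[U] + g[I] > g[A] + g[B] + tol:   # submodularity
                return False
    return True

# Cardinality |A| is modular, hence a polymatroid:
g = {A: float(len(A)) for A in subsets(3)}
print(is_polymatroid(g, 3))  # True
```

Membership in Γ* itself is the hard open problem; these axioms only give the (Shannon) outer bound that non-Shannon inequalities tighten.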
A new class of non-Shannon-type inequalities for entropies
Communications in Information and Systems, 2002
Cited by 7 (0 self)
In this paper we prove a countable set of non-Shannon-type linear information inequalities for entropies of discrete random variables, i.e., information inequalities which cannot be reduced to the basic inequality I(X : Y | Z) ≥ 0. Our results generalize the inequalities of Z. Zhang and R. Yeung (1998), who found the first examples of non-Shannon-type information inequalities.

1. Introduction. A central notion of information theory is Shannon's entropy. Given a set of jointly distributed random variables x1, ..., xn, we can consider entropies of all single random variables H(xi), entropies of all pairs H(xi, xj), etc. (2^n − 1 entropy values for all nonempty subsets of {x1, ..., xn}). For every n-tuple of random variables we get a point in R^(2^n − 1) representing the entropies of the given distribution.
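The entropy point in R^(2^n − 1) described in the introduction can be computed directly from a joint distribution. A short Python sketch (the function names are my own) that maps an empirical joint distribution of n variables to this point, keyed by index subset:

```python
from collections import Counter
from itertools import combinations
from math import log2

def entropy_vector(samples, n):
    """Map joint samples of (x1, ..., xn) to the point in R^(2^n - 1):
    the entropy H(x_A) for every nonempty subset A of indices."""
    def H(idx):
        counts = Counter(tuple(s[i] for i in idx) for s in samples)
        total = sum(counts.values())
        return -sum(c / total * log2(c / total) for c in counts.values())
    return {A: H(A)
            for k in range(1, n + 1)
            for A in combinations(range(n), k)}

# x1 a fair bit, x2 = x1, x3 an independent fair bit (samples uniform)
samples = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
v = entropy_vector(samples, 3)
print(v[(0,)], v[(0, 1)], v[(0, 1, 2)])  # 1.0 1.0 2.0
```

Here H(x1, x2) = H(x1) because x2 duplicates x1, while the independent x3 adds a full bit; information inequalities constrain which such points are achievable.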
On the combinatorial representation of information
The Twelfth Annual International Computing and Combinatorics Conference (COCOON’06), LNCS volume 4112, 2006
Cited by 2 (2 self)
Kolmogorov introduced a combinatorial measure of the information I(x : y) about the unknown value of a variable y conveyed by an input variable x taking a given value x. The paper extends this definition of information to a more general setting where ‘x = x’ may provide a vaguer description of the possible value of y. As an application, the space P({0, 1}^[n]) of classes of binary functions f : [n] → {0, 1}, [n] = {1, ..., n}, is considered, where y represents an unknown function t ∈ {0, 1}^[n]; as input, two extreme cases are considered: x = x_{M_d} and x = x_{M'_d}, which indicate that t is an element of a set G ⊆ {0, 1}^n that satisfies a property M_d or M'_d, respectively. Property M_d (or M'_d) means that there exists an E ⊆ [n], |E| = d, such that |tr_E(G)| = 1 (or 2^d), where tr_E(G) denotes the trace of G on E. Estimates of the information values I(x_{M_d} : t) and I(x_{M'_d} : t) are obtained. When d is fixed, it is shown that I(x_{M_d} : t) ≈ d and I(x_{M'_d} : t) ≈ 1 as n → ∞. Key words: information theory, combinatorial complexity, VC-dimension.
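The trace tr_E(G) in the abstract is simply the set of restrictions of the functions in G to the coordinates in E. A toy Python sketch of the definition (my own naming, with functions represented as bit tuples indexed by [n]):

```python
def trace(G, E):
    """Trace tr_E(G): the set of restrictions to coordinates E of the
    binary functions in G, each given as a tuple of bits."""
    return {tuple(f[i] for i in E) for f in G}

# All members of G agree on E = (0, 1), so |tr_E(G)| = 1 (property M_d, d = 2)
G = {(0, 1, 0, 0), (0, 1, 1, 0), (0, 1, 0, 1)}
print(len(trace(G, (0, 1))))  # 1
# On E = (2, 3) the trace has 3 patterns; property M'_d would need all 2^2 = 4
print(len(trace(G, (2, 3))))  # 3
```

Property M_d thus says some d coordinates carry no variation within G, while M'_d says some d coordinates are fully shattered by G.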
Partitioning multidimensional sets in a small number of “uniform” parts
Cited by 2 (1 self)
Our main result implies the following easily formulated statement. The set of edges E of every finite bipartite graph can be split into poly(log |E|) subsets so that all the resulting bipartite graphs are almost regular. The latter means that the ratio between the maximal and minimal nonzero degree of the left nodes is bounded by a constant, and the same condition holds for the right nodes. Stated differently, every finite 2-dimensional set S ⊂ N^2 can be partitioned into poly(log |S|) parts so that in every part the ratio between the maximal size and the minimal size of a nonempty horizontal section is bounded by a constant, and the same condition holds for vertical sections. We prove a similar statement for n-dimensional sets for any n and show how it can be used to relate information inequalities for the Shannon entropy of random variables to inequalities between the sizes of sections and projections of multidimensional finite sets. Let S be a finite n-dimensional set, that is, a subset of X1 × X2 × · · · × Xn for some X1, X2, ..., Xn. For every set of indices A ⊂ {1, 2, ..., n} = [n] …
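The "almost regular" condition on a part is easy to state in code. A small Python sketch (an illustration with my own naming) computing, for a finite 2-dimensional set, the max/min ratio of nonempty section sizes in each coordinate direction:

```python
from collections import Counter

def section_ratios(S):
    """For a finite S ⊆ N^2, return the ratio between the largest and
    smallest nonempty section size in each coordinate direction.
    An 'almost regular' part keeps both ratios bounded by a constant."""
    first = Counter(x for x, _ in S)   # section sizes with first coordinate fixed
    second = Counter(y for _, y in S)  # section sizes with second coordinate fixed
    return (max(first.values()) / min(first.values()),
            max(second.values()) / min(second.values()))

S = {(0, 0), (0, 1), (0, 2), (1, 0), (2, 1)}
print(section_ratios(S))  # (3.0, 2.0)
```

The theorem says any finite S can be partitioned into polylogarithmically many parts on which both ratios are O(1).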
Similarity Distance and Phylogeny
2002
Cited by 1 (1 self)
A new class of similarity measures appropriate for measuring relations between sequences is studied.
Partitioning Multi-Dimensional Sets in a Small Number of “Uniform” Parts
"... Our main result implies the following easily formulated statement. ..."
On the Inequalities in Information Theory
Rethnakaran Pulikkoonattu
… information theory, that enabled engineers for the first time to deal quantitatively with the elusive concept of information.” In his celebrated work, Shannon laid the foundation for the transmission and storage of information. Using a probabilistic model, his theory gave insight into what is achievable and what is not in terms of quantifiable information transfer. Indeed, the very same concepts are used to predict the limits on data compression and the achievable transmission rate over a probabilistic channel. These underlying concepts can be thought of as inequalities involving measures of probability distributions. Shannon defined several such basic measures in his original work. The field of information theory grew as researchers found more results and insights into the fundamental problem of transmission and storage of information using probabilistic models. By the nature of the subject itself, the results obtained are usually inequalities involving basic Shannon measures such as entropies; some are elementary, some rather complicated expressions. Proving further theorems also required checking whether certain expressions are true in an information-theoretic sense, which motivated researchers to seek a formal method for checking all possible inequalities. Raymond Yeung [2] in 1998 came out with a remarkable framework which could verify many of the inequalities in this field; it enables verification of all inequalities derivable from the basic properties of Shannon's measures. A central notion of information theory is entropy, which Shannon defines as a measure of information itself. Given a set of jointly distributed random variables X1, X2, ..., Xn, we can consider entropies of all single random variables H(Xi), entropies of all pairs H(Xi, Xj), etc. (2^n − 1 entropy values for all nonempty subsets of {X1, X2, ..., Xn}). For every n-tuple of random …
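A toy numeric check in the spirit of the framework described above (not the actual ITIP software): verify the basic Shannon inequality I(X : Y) = H(X) + H(Y) − H(X, Y) ≥ 0 on an empirical joint distribution, in Python with my own naming:

```python
from collections import Counter
from math import log2

def H(samples):
    """Empirical Shannon entropy (in bits) of a list of hashable outcomes."""
    counts = Counter(samples)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

# Joint samples of (X, Y), each marginal a fair bit, X and Y correlated
pairs = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0)]
hx = H([x for x, _ in pairs])
hy = H([y for _, y in pairs])
hxy = H(pairs)
# Basic Shannon inequality: I(X : Y) = H(X) + H(Y) - H(X, Y) >= 0
print(hx + hy - hxy >= 0)  # True
```

Yeung's framework generalizes this idea: it decides by linear programming whether an inequality follows from the basic properties of Shannon's measures, rather than testing particular distributions.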