Results 1–10 of 34
Minimax Entropy Principle and Its Application to Texture Modeling
, 1997
Abstract

Cited by 193 (39 self)
This article proposes a general theory and methodology, called the minimax entropy principle, for building statistical models for images (or signals) in a variety of applications. This principle consists of two parts. The first is the maximum entropy principle for feature binding (or fusion): for a certain set of feature statistics, a distribution can be built to bind these feature statistics together by maximizing the entropy over all distributions that reproduce them. The second is the minimum entropy principle for feature selection: among all plausible sets of feature statistics, we choose the set whose maximum entropy distribution has the minimum entropy. Computational and inferential issues in both parts are addressed; in particular, a feature pursuit procedure is proposed for approximately selecting the optimal set of features. The model complexity is restricted because of the sample variation in the observed feature statistics. The minimax entropy principle is applied to texture modeling, where a novel Markov random field (MRF) model, called FRAME (Filter, Random field, And Minimax Entropy), is derived, and encouraging results are obtained in experiments on a variety of texture images. The relationship between our theory and the mechanisms of neural computation is also discussed.
Prior Learning and Gibbs Reaction-Diffusion
, 1997
Abstract

Cited by 148 (18 self)
This article addresses two important themes in early visual computation: first, it presents a novel theory for learning the universal statistics of natural images (a prior model for typical cluttered scenes of the world) from a set of natural images; second, it proposes a general framework for designing reaction-diffusion equations for image processing. We start by studying the statistics of natural images, including their scale-invariant properties; generic prior models are then learned to duplicate the observed statistics, based on the minimax entropy theory studied in two previous papers. The resulting Gibbs distributions have potentials of the form U(I; Λ, S) = Σ_{α=1}^{K} Σ_{x,y} λ^{(α)}((F^{(α)} * I)(x, y)), with S = {F^{(α)}} a set of filters and Λ = {λ^{(α)}} the potential functions. The learned Gibbs distributions confirm and improve the form of existing prior models such as the line process, but in contrast to all previous models, inverted potentials (i.e., λ(x) decreasing as a function of |x|) were found to be necessary. We find that the partial differential equations given by gradient descent on U(I; Λ, S) are essentially reaction-diffusion equations, where the usual energy terms produce anisotropic diffusion while the inverted energy terms produce reaction associated with pattern formation, enhancing preferred image features. We illustrate how these models can be used for texture pattern rendering, denoising, image enhancement, and clutter removal by careful choice of both prior and data models of this type, incorporating the appropriate features. Song Chun Zhu is now with the Computer Science Department, Stanford University, Stanford, CA 94305, and David Mumford is with the Division of Applied Mathematics, Brown University, Providence, RI 02912. This work started when the authors were at ...
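As a rough illustration of energies of this form, the sketch below evaluates U(I) = Σ_α Σ_{x,y} λ^{(α)}((F^{(α)} * I)(x, y)) for a tiny image, using made-up gradient filters and a simple absolute-value potential rather than the filters and inverted potentials learned in the paper:

```python
def convolve_valid(image, kernel):
    """2-D 'valid' correlation of a list-of-lists image with a small kernel."""
    H, W = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(H - kh + 1):
        row = [sum(kernel[a][b] * image[i + a][j + b]
                   for a in range(kh) for b in range(kw))
               for j in range(W - kw + 1)]
        out.append(row)
    return out

def gibbs_energy(image, filters, potentials):
    """U(I) = sum over filters alpha of sum over positions of lambda(filter response)."""
    total = 0.0
    for kernel, phi in zip(filters, potentials):
        for row in convolve_valid(image, kernel):
            total += sum(phi(r) for r in row)
    return total

# Hypothetical choices: horizontal/vertical difference filters and a
# simple (non-inverted) absolute-value potential.
grad_x = [[1, -1]]
grad_y = [[1], [-1]]
phi = abs

image = [[0, 0, 1],
         [0, 1, 1],
         [1, 1, 1]]
energy = gibbs_energy(image, [grad_x, grad_y], [phi, phi])
```

Gradient descent on such an energy is what yields the reaction-diffusion dynamics the abstract describes; smooth images have low energy under this choice of potential.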
Bayesian measures of model complexity and fit
 Journal of the Royal Statistical Society, Series B
, 2002
Abstract

Cited by 132 (2 self)
[Read before The Royal Statistical Society at a meeting organized by the Research
Sum Rules For Jacobi Matrices And Their Applications To Spectral Theory
 Ann. of Math
Abstract

Cited by 99 (38 self)
We discuss the proof and systematic application of Case's sum rules for Jacobi matrices. Of special interest is a linear combination of two of his sum rules which has strictly positive terms. Among our results are a complete classification of the spectral measures of all Jacobi matrices J for which J − J₀ is Hilbert–Schmidt, and a proof of Nevai's conjecture that the Szegő condition holds if J − J₀ is trace class.
Ensemble Learning
, 2000
Abstract

Cited by 62 (2 self)
Introduction. When we say we are making a model of a system, we are setting up a tool which can be used to make inferences, predictions, and decisions. Each model can be seen as a hypothesis, or explanation, which makes assertions about the quantities which are directly observable and those which can only be inferred from their effect on observable quantities. In the Bayesian framework, knowledge is contained in the conditional probability distributions of the models. We can use Bayes' theorem to evaluate the conditional probability distributions for the unknown parameters, y, given the set of observed quantities, x, using p(y|x) = p(x|y) p(y) / p(x) (1). The prior distribution p(y) contains our knowledge of the unknown variables before we make any observations ...
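A toy numerical reading of equation (1), with hypothetical prior and likelihood values chosen purely for illustration:

```python
def posterior(prior, likelihood):
    """Bayes' theorem: p(y|x) = p(x|y) p(y) / p(x), with p(x) obtained
    by summing p(x|y) p(y) over all values of the unknown y."""
    evidence = sum(prior[y] * likelihood[y] for y in prior)  # p(x)
    return {y: prior[y] * likelihood[y] / evidence for y in prior}

# Made-up numbers: two hypotheses for y, and the likelihood of the
# observed x under each.
prior = {"h0": 0.8, "h1": 0.2}        # p(y)
likelihood = {"h0": 0.1, "h1": 0.9}   # p(x|y) for the observed x
post = posterior(prior, likelihood)
```

Even with a strong prior on h0, the observation shifts most of the posterior mass to h1, because the data are far more likely under it.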
Enhanced Word Clustering for Hierarchical Text Classification
, 2002
Abstract

Cited by 44 (2 self)
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering" of features has been found to achieve improvements over feature selection in terms of classification accuracy, especially at lower numbers of features [2, 28]. However, the existing clustering techniques are agglomerative in nature and result in (i) suboptimal word clusters and (ii) high computational cost. In order to explicitly capture the optimality of word clusters in an information-theoretic framework, we first derive a global criterion for feature clustering. We then present a fast, divisive algorithm that monotonically decreases this objective function value, thus converging to a local minimum. We show that our algorithm minimizes the "within-cluster Jensen-Shannon divergence" while simultaneously maximizing the "between-cluster Jensen-Shannon divergence". In comparison to the previously proposed agglomerative strategies, our divisive algorithm achieves higher classification accuracy, especially at lower numbers of features. We further show that feature clustering is an effective technique for building smaller class models in hierarchical classification. We present detailed experimental results using Naive Bayes and Support Vector Machines on the 20 Newsgroups data set and a 3-level hierarchy of HTML documents collected from the Dmoz Open Directory.
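The generalized Jensen-Shannon divergence underlying such clustering objectives can be sketched directly from its standard definition, H(Σᵢ wᵢ pᵢ) − Σᵢ wᵢ H(pᵢ); the distributions below are made up for illustration, not taken from the paper:

```python
import math

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(x * math.log(x, 2) for x in p if x > 0)

def js_divergence(dists, weights):
    """Generalized Jensen-Shannon divergence of several distributions:
    H(sum_i w_i p_i) - sum_i w_i H(p_i)."""
    mixture = [sum(w * p[k] for w, p in zip(weights, dists))
               for k in range(len(dists[0]))]
    return entropy(mixture) - sum(w * entropy(p)
                                  for w, p in zip(weights, dists))

# Two hypothetical word-given-class distributions over three words.
p1 = [0.5, 0.5, 0.0]
p2 = [0.0, 0.5, 0.5]
d = js_divergence([p1, p2], [0.5, 0.5])
```

Identical distributions give a divergence of 0; the more the distributions differ, the larger the value, which is why a clustering that minimizes it within clusters groups words with similar class profiles.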
An analysis of quantitative measures associated with rules
 Proceedings of PAKDD’99
, 1999
Abstract

Cited by 34 (25 self)
Abstract. In this paper, we analyze quantitative measures associated with if-then type rules. Basic quantities are identified and many existing measures are examined using the basic quantities. The main objective is to provide a synthesis of existing results in a simple and unified framework. The quantitative measure is viewed as a multi-faceted concept, representing the confidence, uncertainty, applicability, quality, accuracy, and interestingness of rules. Roughly, they may be classified as representing one-way and two-way supports.
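As a hedged illustration of such measures, the sketch below computes support, confidence, and lift for a rule A ⇒ B from a 2×2 contingency table of counts; the measure names follow common data-mining usage and need not match the paper's exact definitions:

```python
def rule_measures(n_ab, n_a_notb, n_nota_b, n_nota_notb):
    """Basic quantities for a rule A => B from the four cell counts of
    the (A, B) contingency table."""
    n = n_ab + n_a_notb + n_nota_b + n_nota_notb
    support = n_ab / n                       # P(A, B)
    confidence = n_ab / (n_ab + n_a_notb)    # P(B | A), a one-way measure
    p_a = (n_ab + n_a_notb) / n
    p_b = (n_ab + n_nota_b) / n
    lift = support / (p_a * p_b)             # a symmetric, two-way measure
    return {"support": support, "confidence": confidence, "lift": lift}

# Made-up counts: 100 transactions in total.
m = rule_measures(40, 10, 20, 30)
```

A lift above 1 indicates that A and B co-occur more often than independence would predict, which is one common reading of "two-way support".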
Exploring Texture Ensembles by Efficient Markov Chain Monte Carlo – Towards a "Trichromacy" Theory of Texture
, 1999
Abstract

Cited by 32 (13 self)
This article presents a mathematical definition of texture – the Julesz ensemble Ω(h), which is the set of all images (defined on Z²) that share identical statistics h. Then texture modeling is posed as an inverse problem: given a set of images sampled from an unknown Julesz ensemble, we search for the statistics h which define the ensemble. A Julesz ensemble Ω(h) has an associated probability distribution q(I; h), which is uniform over the images in the ensemble and has zero probability outside. In a companion paper [32], q(I; h) is shown to be the limit distribution of the FRAME (Filter, Random Field, And Minimax Entropy) model [35] as the image lattice tends to Z². This conclusion establishes the intrinsic link between the scientific definition of texture on Z² and the mathematical models of texture on finite lattices. It brings two advantages to computer vision: 1) the engineering practice of synthesizing texture images by matching statistics has been put on a mathematical fou...
Learning Models for Robot Navigation
, 1998
Abstract

Cited by 26 (2 self)
Hidden Markov models (hmms) and partially observable Markov decision processes (pomdps) provide a useful tool for modeling dynamical systems. They are particularly useful for representing environments such as road networks and office buildings, which are typical for robot navigation and planning. The work presented here describes a formal framework for incorporating readily available odometric information into both the models and the algorithm that learns them. By taking advantage of such information, learning hmms/pomdps can be made better and require fewer iterations, while being robust in the face of data reduction. That is, the performance of our algorithm does not significantly deteriorate as the training sequences provided to it become significantly shorter. Formal proofs for the convergence of the algorithm to a local maximum of the likelihood function are provided. Experimental results, obtained from both simulated and real robot data, demonstrate the effectiveness of the approach....
Modeling Belief in Dynamic Systems. Part I: Foundations
 Artificial Intelligence
, 1997
Abstract

Cited by 23 (11 self)
Belief change is a fundamental problem in AI: agents constantly have to update their beliefs to accommodate new observations. In recent years, there has been much work on axiomatic characterizations of belief change. We claim that a better understanding of belief change can be gained from examining appropriate semantic models. In this paper we propose a general framework in which to model belief change. We begin by defining belief in terms of knowledge and plausibility: an agent believes φ if he knows that φ is more plausible than ¬φ. We then consider some properties defining the interaction between knowledge and plausibility, and show how these properties affect the properties of belief. In particular, we show that by assuming two of the most natural properties, belief becomes a KD45 operator. Finally, we add time to the picture. This gives us a framework in which we can talk about knowledge, plausibility (and hence belief), and time, which extends the framework of Halpern and Fagi...