Results 1–10 of 74
A survey of smoothing techniques for ME models
IEEE Transactions on Speech and Audio Processing, 2000
Abstract

Cited by 85 (1 self)
Abstract—In certain contexts, maximum entropy (ME) modeling can be viewed as maximum likelihood (ML) training for exponential models, and like other ML methods is prone to overfitting of training data. Several smoothing methods for ME models have been proposed to address this problem, but previous results do not make it clear how these smoothing methods compare with smoothing methods for other types of related models. In this work, we survey previous work in ME smoothing and compare the performance of several of these algorithms with conventional techniques for smoothing n-gram language models. Because of the mature body of research in n-gram model smoothing and the close connection between ME and conventional n-gram models, this domain is well suited to gauge the performance of ME smoothing methods. Over a large number of data sets, we find that fuzzy ME smoothing performs as well as or better than all other algorithms under consideration. We contrast this method with previous n-gram smoothing methods to explain its superior performance. Index Terms—Exponential models, language modeling, maximum entropy, minimum divergence, n-gram models, smoothing.
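As a point of reference for the conventional n-gram smoothing techniques this survey benchmarks against, here is a minimal additive (add-δ) smoothing sketch for a bigram model. This is an illustrative construction of one simple classical method, not the survey's fuzzy ME algorithm; all names and the toy corpus are mine.

```python
from collections import Counter

def bigram_prob(tokens, vocab, delta=0.5):
    """Add-delta smoothed bigram probabilities P(w2 | w1).

    Additive smoothing is one of the simplest n-gram smoothing methods:
    it reserves probability mass for unseen bigrams by pretending every
    possible bigram occurred delta extra times.
    """
    unigram = Counter(tokens[:-1])          # counts of w1 as a bigram head
    bigram = Counter(zip(tokens, tokens[1:]))
    V = len(vocab)

    def p(w2, w1):
        return (bigram[(w1, w2)] + delta) / (unigram[w1] + delta * V)

    return p

tokens = ["the", "cat", "sat", "on", "the", "mat"]
vocab = set(tokens)
p = bigram_prob(tokens, vocab)
total = sum(p(w, "the") for w in vocab)  # sums to 1 over the vocabulary
```

Even unseen bigrams such as ("the", "sat") receive nonzero probability, which is the overfitting correction that all of the smoothing methods under comparison provide in more refined forms.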
From Laplace To Supernova Sn 1987a: Bayesian Inference In Astrophysics
1990
Abstract

Cited by 51 (2 self)
The Bayesian approach to probability theory is presented as an alternative to the currently used long-run relative frequency approach, which does not offer clear, compelling criteria for the design of statistical methods. Bayesian probability theory offers unique and demonstrably optimal solutions to well-posed statistical problems, and is historically the original approach to statistics. The reasons for earlier rejection of Bayesian methods are discussed, and it is noted that the work of Cox, Jaynes, and others answers earlier objections, giving Bayesian inference a firm logical and mathematical foundation as the correct mathematical language for quantifying uncertainty. The Bayesian approaches to parameter estimation and model comparison are outlined and illustrated by application to a simple problem based on the Gaussian distribution. As further illustrations of the Bayesian paradigm, Bayesian solutions to two interesting astrophysical problems are outlined: the measurement of wea...
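The kind of simple Gaussian example the abstract mentions can be sketched as a conjugate normal-normal update for an unknown mean with known noise variance. This is a generic textbook construction, not the paper's own worked example; the function name and numbers are hypothetical.

```python
def posterior_mean_var(data, mu0, var0, noise_var):
    """Posterior over an unknown Gaussian mean.

    Prior: mean ~ N(mu0, var0); likelihood: each x ~ N(mean, noise_var).
    Conjugacy gives a closed-form Gaussian posterior whose precision is
    the sum of the prior precision and the data precision.
    """
    n = len(data)
    xbar = sum(data) / n
    prec = 1.0 / var0 + n / noise_var                      # posterior precision
    mean = (mu0 / var0 + n * xbar / noise_var) / prec      # precision-weighted average
    return mean, 1.0 / prec

# Weak prior (large var0): the posterior mean is pulled almost entirely
# to the sample mean, and the posterior variance shrinks below the prior's.
post_mean, post_var = posterior_mean_var(
    [4.8, 5.1, 5.3], mu0=0.0, var0=100.0, noise_var=1.0
)
```

The same machinery underlies Bayesian model comparison: the marginal likelihood of each model integrates the likelihood over this prior.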
Unknown quantum states: the quantum de Finetti representation
 J. Math. Phys
Abstract

Cited by 44 (7 self)
We present an elementary proof of the quantum de Finetti representation theorem, a quantum analogue of de Finetti’s classical theorem on exchangeable probability assignments. This contrasts with the original proof of Hudson and Moody [Z. Wahrschein. verw. Geb. 33, 343 (1976)], which relies on advanced mathematics and does not share the same potential for generalization. The classical de Finetti theorem provides an operational definition of the concept of an unknown probability in Bayesian probability theory, where probabilities are taken to be degrees of belief instead of objective states of nature. The quantum de Finetti theorem, in a closely analogous fashion, deals with exchangeable density-operator assignments and provides an operational definition of the concept of an “unknown quantum state” in quantum-state tomography. This result is especially important for information-based interpretations of quantum mechanics, where quantum states, like probabilities, are taken to be states of knowledge rather than states of nature. We further demonstrate that the theorem fails for real Hilbert spaces and discuss the significance of this point.
Nonequilibrium measurements of free energy differences for microscopically reversible markovian systems
1998
Abstract

Cited by 38 (1 self)
An equality has recently been shown relating the free energy difference between two equilibrium ensembles of a system and an ensemble average of the work required to switch between these two configurations. In the present paper it is shown that this result can be derived under the assumption that the system's dynamics is Markovian and microscopically reversible. KEY WORDS: Nonequilibrium statistical mechanics; free energy; work; thermodynamic integration; thermodynamic perturbation.
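The equality in question is Jarzynski's relation, ΔF = −β⁻¹ ln⟨e^{−βW}⟩. A minimal numerical check (my construction, not the paper's) uses an instantaneous switch between two harmonic potentials, where the exact answer is ΔF = (1/2β) ln(k1/k0): the work done on a sample drawn from equilibrium in the first potential is just the potential energy difference.

```python
import math
import random

random.seed(0)
beta, k0, k1 = 1.0, 1.0, 2.0

# Draw equilibrium samples from U0(x) = k0 x^2 / 2, i.e. x ~ N(0, 1/(beta*k0)).
# For an instantaneous switch of the spring constant, W = U1(x) - U0(x).
n = 200_000
acc = 0.0
for _ in range(n):
    x = random.gauss(0.0, math.sqrt(1.0 / (beta * k0)))
    w = 0.5 * (k1 - k0) * x * x
    acc += math.exp(-beta * w)

dF_estimate = -math.log(acc / n) / beta          # Jarzynski estimate of Delta F
dF_exact = 0.5 * math.log(k1 / k0) / beta        # exact harmonic free energy difference
```

The exponential average converges quickly here because W ≥ 0 for k1 > k0; for switches where rare negative-work trajectories dominate, the estimator's variance is the well-known practical difficulty.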
ESTIMATING FUNCTIONS OF PROBABILITY DISTRIBUTIONS FROM A FINITE SET OF SAMPLES Part II: Bayes Estimators for Mutual Information, Chi-Squared, Covariance, and other Statistics.
Abstract

Cited by 38 (3 self)
This paper is the second in a series of two on the problem of estimating a function of a probability distribution from a finite set of samples of that distribution. In the first paper, the Bayes estimator for a function of a probability distribution was introduced, the optimal properties of the Bayes estimator were discussed, and the Bayes and frequency-counts estimators for the Shannon entropy were derived and graphically contrasted. In the current paper the analysis of the first paper is extended by the derivation of Bayes estimators for several other functions of interest in statistics and information theory. These functions are (powers of) the mutual information, chi-squared for tests of independence, variance, covariance, and average. Finding Bayes estimators for several of these functions requires extensions to the analytical techniques developed in the first paper, and these extensions form the main body of this paper. This paper extends the analysis in other ways as well, for example by enlarging the class of potential priors beyond the uniform prior assumed in the first paper. In particular, the use of the entropic and Dirichlet priors is considered.
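The flavor of such Bayes estimators can be illustrated by contrasting the frequency-counts (plug-in) entropy estimate with the posterior-mean estimate under a uniform Dirichlet prior. This is a Monte Carlo sketch of the posterior mean, not the paper's closed-form derivations; the counts are hypothetical.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
counts = np.array([3, 1, 0, 0])  # hypothetical sample counts over 4 bins

# Frequency-counts (plug-in) estimator: entropy of the empirical distribution.
p_hat = counts / counts.sum()
plug_in = -sum(p * math.log(p) for p in p_hat if p > 0)

# Bayes estimator under a uniform prior: the posterior over p is
# Dirichlet(counts + 1); the posterior mean of H(p) is approximated
# by averaging the entropy over posterior samples.
samples = rng.dirichlet(counts + 1, size=100_000)
ent = -np.sum(samples * np.log(np.clip(samples, 1e-300, None)), axis=1)
bayes = float(ent.mean())
```

With sparse counts the plug-in estimator is biased low, so the Bayes estimate typically exceeds it, a contrast the first paper of the series develops in detail for the Shannon entropy.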
Multimedia eventbased video indexing using time intervals
IEEE TRANS. MULTIMEDIA, 2005
Abstract

Cited by 19 (4 self)
We propose the time interval multimedia event (TIME) framework as a robust approach for classification of semantic events in multimodal video documents. The representation used in TIME extends the Allen temporal interval relations and allows for proper inclusion of context and synchronization of the heterogeneous information sources involved in multimodal video analysis. To demonstrate the viability of our approach, it was evaluated on the domains of soccer and news broadcasts. For automatic classification of semantic events, we compare three different machine learning techniques, i.e., C4.5 decision tree, maximum entropy, and support vector machine. The results show that semantic video indexing benefits significantly from using the TIME framework.
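The Allen temporal relations that TIME extends can be sketched as a classifier over closed intervals. This is a generic rendering of Allen's interval algebra (13 relations: 6 inverse pairs plus equality), not the TIME framework itself.

```python
def allen(a, b):
    """Return the Allen relation that holds between intervals a and b.

    Intervals are (start, end) pairs with start < end. Exactly one of the
    13 relations holds for any pair of intervals.
    """
    a1, a2 = a
    b1, b2 = b
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    return "overlaps" if a1 < b1 else "overlapped-by"
```

In an event-based indexing setting, relations like these let a classifier reason about, say, whether a crowd-cheer segment overlaps or immediately follows a goal segment.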
Quantification and Segmentation of Brain Tissues from MR Images: A Probabilistic Neural Network Approach
IEEE Transactions on Image Processing, 1998
Abstract

Cited by 17 (6 self)
This paper presents a probabilistic neural network based technique for unsupervised quantification and segmentation of brain tissues from magnetic resonance images. It is shown that this problem can be solved by distribution learning and relaxation labeling, resulting in an efficient method that may be particularly useful in quantifying and segmenting abnormal brain tissues, where the number of tissue types is unknown and the distributions of tissue types heavily overlap. The new technique uses suitable statistical models for both the pixel and context images and formulates the problem in terms of model-histogram fitting and global consistency labeling. The quantification is achieved by probabilistic self-organizing mixtures and the segmentation by a probabilistic constraint relaxation network. The experimental results show the efficient and robust performance of the new algorithm and that it outperforms conventional classification-based approaches.
THE SECOND LAW OF THERMODYNAMICS AND THE GLOBAL CLIMATE SYSTEM: A REVIEW OF THE MAXIMUM ENTROPY PRODUCTION PRINCIPLE
Abstract

Cited by 14 (0 self)
The long-term mean properties of the global climate system and those of turbulent fluid systems are reviewed from a thermodynamic viewpoint. Two general expressions are derived for a rate of entropy production due to thermal and viscous dissipation (turbulent dissipation) in a fluid system. It is shown with these expressions that maximum entropy production in the Earth's climate system suggested by Paltridge, as well as maximum transport properties of heat or momentum in a turbulent system suggested by Malkus and Busse, correspond to a state in which the rate of entropy production due to the turbulent dissipation is at a maximum. Entropy production due to absorption of solar radiation in the climate system is found to be irrelevant to the maximized properties associated with turbulence. The hypothesis of maximum entropy production also seems to be applicable to the planetary atmospheres of Mars and Titan and perhaps to mantle convection. Lorenz's conjecture on maximum generation of available potential energy is shown to be akin to this hypothesis with a few minor approximations. A possible mechanism by which turbulent fluid systems adjust themselves to states of maximum entropy production is presented as a self-feedback mechanism for the generation of available potential energy. These results tend to support the hypothesis of maximum entropy production that underlies a wide variety of nonlinear fluid systems, including our planet as well as other planets and stars.
Random Sets Unify, Explain, And Aid Known Uncertainty Methods In Expert Systems
Random Sets: Theory and Applications, 1997
Abstract

Cited by 13 (12 self)
Numerous formalisms have been proposed for representing and processing uncertainty in expert systems. Several of these formalisms are somewhat ad hoc, in the sense that some of their formulas seem to have been chosen rather arbitrarily. In this paper, we show that random sets provide a natural general framework for describing uncertainty, a framework in which many existing formalisms appear as particular cases. This interpretation of known formalisms (e.g., of fuzzy logic) in terms of random sets enables us to justify many "ad hoc" formulas. In some cases, when several alternative formulas have been proposed, random sets help to choose the best ones (in some reasonable sense). One of the main objectives of expert systems is not only to describe the current state of the world, but also to provide us with reasonable actions. The simplest case is when we have the exact objective function. In this case, random sets can help in choosing the proper method of "fuzzy optimization." As a t...
Maximum entropy, fluctuations and priors
2000
Abstract

Cited by 13 (6 self)
The method of maximum entropy (ME) is extended to address the following problem: once one accepts that the ME distribution is to be preferred over all others, to what extent are distributions with lower entropy ruled out? Two applications are given. The first is to the theory of thermodynamic fluctuations. The formulation is exact, covariant under changes of coordinates, and allows fluctuations of both the extensive and the conjugate intensive variables. The second application is to the construction of an objective prior for Bayesian inference. The prior obtained by following the ME method to its inevitable conclusion turns out to be a special case (α = 1) of what are currently known as entropic priors.