Results 1-10 of 532
Probabilistic Inference Using Markov Chain Monte Carlo Methods
, 1993
Cited by 567 (20 self)
Probabilistic inference is an attractive approach to uncertain reasoning and empirical learning in artificial intelligence. Computational difficulties arise, however, because probabilistic models with the necessary realism and flexibility lead to complex distributions over high-dimensional spaces. Related problems in other fields have been tackled using Monte Carlo methods based on sampling using Markov chains, providing a rich array of techniques that can be applied to problems in artificial intelligence. The "Metropolis algorithm" has been used to solve difficult problems in statistical physics for over forty years, and, in the last few years, the related method of "Gibbs sampling" has been applied to problems of statistical inference. Concurrently, an alternative method for solving problems in statistical physics by means of dynamical simulation has been developed, and has recently been unified with the Metropolis algorithm to produce the "hybrid Monte Carlo" method. In computer science, Markov chain sampling is the basis of the heuristic optimization technique of "simulated annealing", and has recently been used in randomized algorithms for approximate counting of large sets. In this review, I outline the role of probabilistic inference in artificial intelligence, present the theory of Markov chains, and describe various Markov chain Monte Carlo algorithms, along with a number of supporting techniques. I try to present a comprehensive picture of the range of methods that have been developed, including techniques from the varied literature that have not yet seen wide application in artificial intelligence, but which appear relevant. As illustrative examples, I use the problems of probabilistic inference in expert systems, discovery of latent classes from data, and Bayesian learning for neural networks.
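For readers unfamiliar with the Metropolis algorithm named in this abstract, the following is a minimal sketch of a random-walk Metropolis sampler. It is not taken from the review itself: the function names, the Gaussian proposal, the step size, and the standard normal target are all assumptions chosen for illustration.

```python
import math
import random


def metropolis(log_density, x0, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_density may be unnormalized; only density ratios are used.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_density(x)
    samples = []
    for _ in range(n_steps):
        # Propose a symmetric Gaussian perturbation of the current state.
        candidate = x + rng.gauss(0.0, step_size)
        log_p_candidate = log_density(candidate)
        # Accept with probability min(1, p(candidate) / p(x)).
        if math.log(1.0 - rng.random()) < log_p_candidate - log_p:
            x, log_p = candidate, log_p_candidate
        # A rejected proposal repeats the current state in the chain.
        samples.append(x)
    return samples


def standard_normal_log_density(x):
    # Illustrative target (an assumption): standard normal, up to a constant.
    return -0.5 * x * x


samples = metropolis(standard_normal_log_density, x0=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean {mean:.3f}, sample variance {variance:.3f}")
```

The draws are correlated, so the sample mean and variance only approximate the target's values of 0 and 1; longer chains or a better-tuned step size would tighten them.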
Evaluating the Accuracy of Sampling-Based Approaches to the Calculation of Posterior Moments
 IN BAYESIAN STATISTICS
, 1992
Cited by 269 (11 self)
Data augmentation and Gibbs sampling are two closely related, sampling-based approaches to the calculation of posterior moments. The fact that each produces a sample whose constituents are neither independent nor identically distributed complicates the assessment of convergence and numerical accuracy of the approximations to the expected value of functions of interest under the posterior. In this paper methods from spectral analysis are used to evaluate numerical accuracy formally and construct diagnostics for convergence. These methods are illustrated in the normal linear model with informative priors, and in the Tobit-censored regression model.
Evolutionary Programming Made Faster
 IEEE Transactions on Evolutionary Computation
, 1999
Cited by 206 (36 self)
Evolutionary programming (EP) has been applied with success to many numerical and combinatorial optimization problems in recent years. EP has rather slow convergence rates, however, on some function optimization problems. In this paper, a "fast EP" (FEP) is proposed which uses a Cauchy instead of Gaussian mutation as the primary search operator. The relationship between FEP and classical EP (CEP) is similar to that between fast simulated annealing and the classical version. Both analytical and empirical studies have been carried out to evaluate the performance of FEP and CEP for different function optimization problems. This paper shows that FEP is very good at search in a large neighborhood while CEP is better at search in a small local neighborhood. For a suite of 23 benchmark problems, FEP performs much better than CEP for multimodal functions with many local minima while being comparable to CEP in performance for unimodal and multimodal functions with only a few local minima. This paper also shows the relationship between the search step size and the probability of finding a global optimum and thus explains why FEP performs better than CEP on some functions but not on others. In addition, the importance of the neighborhood size and its relationship to the probability of finding a near-optimum is investigated. Based on these analyses, an improved FEP (IFEP) is proposed and tested empirically. This technique mixes different search operators (mutations). The experimental results show that IFEP performs better than or as well as the better of FEP and CEP for most benchmark problems tested.
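The contrast between Cauchy and Gaussian mutation can be illustrated with a small sketch. This is not the paper's FEP or CEP implementation: it uses a fixed mutation scale rather than self-adapted step sizes, and the function names, the threshold of 3, and the trial count are assumptions. It only shows that Cauchy perturbations produce long jumps far more often, which is the property the abstract links to search in a large neighborhood.

```python
import math
import random


def gaussian_mutation(x, scale, rng):
    """Perturb each coordinate with a Gaussian step (classical-style mutation)."""
    return [xi + scale * rng.gauss(0.0, 1.0) for xi in x]


def cauchy_mutation(x, scale, rng):
    """Perturb each coordinate with a standard Cauchy step.

    A standard Cauchy variate is drawn by inverting its CDF:
    tan(pi * (u - 0.5)) for u uniform on [0, 1).
    """
    return [xi + scale * math.tan(math.pi * (rng.random() - 0.5)) for xi in x]


def long_jump_fraction(mutate, n_trials=20000, threshold=3.0, seed=0):
    """Fraction of one-dimensional mutations from 0 that land beyond threshold."""
    rng = random.Random(seed)
    jumps = 0
    for _ in range(n_trials):
        (child,) = mutate([0.0], 1.0, rng)
        if abs(child) > threshold:
            jumps += 1
    return jumps / n_trials


gaussian_fraction = long_jump_fraction(gaussian_mutation)
cauchy_fraction = long_jump_fraction(cauchy_mutation)
print(f"Gaussian long jumps: {gaussian_fraction:.4f}")
print(f"Cauchy long jumps:   {cauchy_fraction:.4f}")
```

With unit scale, roughly a fifth of Cauchy steps exceed the threshold, against well under one percent of Gaussian steps.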
Correlation And Dependence In Risk Management: Properties And Pitfalls
 RISK MANAGEMENT: VALUE AT RISK AND BEYOND
, 1999
Cited by 195 (30 self)
Modern risk management calls for an understanding of stochastic dependence going beyond simple linear correlation. This paper deals with the static (non-time-dependent) case and emphasizes the copula representation of dependence for a random vector. Linear correlation is a natural dependence measure for multivariate normally and, more generally, elliptically distributed risks but other dependence concepts like comonotonicity and rank correlation should also be understood by the risk management practitioner. Using counterexamples the falsity of some commonly held views on correlation is demonstrated; in general, these fallacies arise from the naive assumption that dependence properties of the elliptical world also hold in the non-elliptical world. In particular, the problem of finding multivariate models which are consistent with prespecified marginal distributions and correlations is addressed. Pitfalls are highlighted and simulation algorithms avoiding these problems are constructed. ...
Using Secure Coprocessors
, 1994
Cited by 150 (8 self)
How do we build distributed systems that are secure? Cryptographic techniques can be used to secure the communications between physically separated systems, but this is not enough: we must be able to guarantee the privacy of the cryptographic keys and the integrity of the cryptographic functions, in addition to the integrity of the security kernel and access control databases we have on the machines. Physical security is a central assumption upon which secure distributed systems are built; without this foundation even the best cryptosystem or the most secure kernel will crumble. In this thesis, I address the distributed security problem by proposing the addition of a small, physically secure hardware module, a secure coprocessor, to standard workstations and PCs. My central axiom is that secure coprocessors are able to maintain the privacy of the data they process. This thesis attacks the distributed security problem from multiple sides. First, I analyze the security properties of existing system components, both at the hardware and
Wavelet-Based Texture Retrieval Using Generalized Gaussian Density and Kullback-Leibler Distance
 IEEE Trans. Image Processing
, 2002
Cited by 145 (4 self)
We present a statistical view of the texture retrieval problem by combining the two related tasks, namely feature extraction (FE) and similarity measurement (SM), into a joint modeling and classification scheme. We show that using a consistent estimator of texture model parameters for the FE step followed by computing the Kullback-Leibler distance (KLD) between estimated models for the SM step is asymptotically optimal in terms of retrieval error probability. The statistical scheme leads to a new wavelet-based texture retrieval method that is based on the accurate modeling of the marginal distribution of wavelet coefficients using generalized Gaussian density (GGD) and on the existence of a closed form for the KLD between GGDs. The proposed method provides greater accuracy and flexibility in capturing texture information, while its simplified form has a close resemblance to the existing methods that use energy distribution in the frequency domain to identify textures. Experimental results on a database of 640 texture images indicate that the new method significantly improves retrieval rates, e.g., from 65% to 77%, compared with traditional approaches, while it retains comparable levels of computational complexity.
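The abstract mentions a closed form for the KLD between two GGDs without stating it. The sketch below assumes the common zero-mean parameterization p(x; alpha, beta) = beta / (2 alpha Gamma(1/beta)) exp(-(|x|/alpha)^beta), with scale alpha and shape beta, and evaluates the closed form that follows from it. The function name and the parameterization are assumptions and may differ from the paper's notation.

```python
import math


def ggd_kld(alpha1, beta1, alpha2, beta2):
    """KLD from GGD(alpha1, beta1) to GGD(alpha2, beta2), both zero-mean.

    Assumed closed form:
      log(beta1 * alpha2 * Gamma(1/beta2) / (beta2 * alpha1 * Gamma(1/beta1)))
      + (alpha1 / alpha2)**beta2 * Gamma((beta2 + 1) / beta1) / Gamma(1/beta1)
      - 1 / beta1
    """
    # Log of the ratio of normalizing constants, via lgamma for stability.
    log_ratio = (
        math.log(beta1) - math.log(beta2)
        + math.log(alpha2) - math.log(alpha1)
        + math.lgamma(1.0 / beta2) - math.lgamma(1.0 / beta1)
    )
    # Expectation of (|x| / alpha2)**beta2 under the first density.
    moment = (alpha1 / alpha2) ** beta2 * math.exp(
        math.lgamma((beta2 + 1.0) / beta1) - math.lgamma(1.0 / beta1)
    )
    return log_ratio + moment - 1.0 / beta1


# Sanity check: with beta = 2 the GGD is Gaussian with alpha = sigma * sqrt(2),
# so the result should match the Gaussian KL divergence.
sigma1, sigma2 = 1.0, 2.0
kld = ggd_kld(sigma1 * math.sqrt(2.0), 2.0, sigma2 * math.sqrt(2.0), 2.0)
gaussian_kl = math.log(sigma2 / sigma1) + sigma1**2 / (2.0 * sigma2**2) - 0.5
print(kld, gaussian_kl)
```

Because the distance depends only on the two fitted parameters per density, comparing a pair of textures costs a handful of arithmetic operations per wavelet subband once the parameters are estimated.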
Random number generation
Cited by 136 (30 self)
Random numbers are the nuts and bolts of simulation. Typically, all the randomness required by the model is simulated by a random number generator whose output is assumed to be a sequence of independent and identically distributed (IID) U(0, 1) random variables (i.e., continuous random variables distributed uniformly over the interval
Efficient Simulation from the Multivariate Normal and Student-t Distributions Subject to Linear Constraints and the Evaluation of Constraint Probabilities
, 1991
Cited by 128 (8 self)
The construction and implementation of a Gibbs sampler for efficient simulation from the truncated multivariate normal and Student-t distributions is described. It is shown how the accuracy and convergence of integrals based on the Gibbs sample may be constructed, and how an estimate of the probability of the constraint set under the unrestricted distribution may be produced. Keywords: Bayesian inference; Gibbs sampler; Monte Carlo; multiple integration; truncated normal. This paper was prepared for a presentation at the meeting Computing Science and Statistics: the Twenty-Third Symposium on the Interface, Seattle, April 22-24, 1991. Research assistance from Zhenyu Wang and financial support from National Science Foundation Grant SES-8908365 are gratefully acknowledged. The software for the examples may be requested by electronic mail, and will be returned by that medium. 1. Introduction. The generation of random samples from a truncated multivariate normal distribution, that is, a ...
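As a rough illustration of Gibbs sampling from a truncated multivariate normal, the sketch below draws from a bivariate standard normal with correlation rho restricted to the positive quadrant. It is not the paper's algorithm: the univariate truncated normal draws here use plain inverse-CDF sampling, which is a swapped-in simplification, and the function names, the constraint set, and the burn-in length are assumptions.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def truncated_normal_above_zero(mean, sd, rng):
    """Draw from N(mean, sd^2) restricted to x > 0 by inverting the CDF.

    Inverse-CDF sampling loses accuracy when the truncation point lies far
    in the upper tail; this sketch does not handle that case.
    """
    lower = STD_NORMAL.cdf((0.0 - mean) / sd)
    u = lower + (1.0 - lower) * rng.random()
    # Keep u strictly inside (0, 1), as inv_cdf requires.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    # Guard against rounding just below the boundary.
    return max(mean + sd * STD_NORMAL.inv_cdf(u), 0.0)


def gibbs_positive_quadrant(rho, n_draws, burn_in=500, seed=0):
    """Gibbs sampler for a correlated bivariate normal with x1 > 0 and x2 > 0."""
    rng = random.Random(seed)
    cond_sd = (1.0 - rho * rho) ** 0.5
    x1, x2 = 1.0, 1.0
    draws = []
    for i in range(burn_in + n_draws):
        # Each full conditional is a univariate truncated normal.
        x1 = truncated_normal_above_zero(rho * x2, cond_sd, rng)
        x2 = truncated_normal_above_zero(rho * x1, cond_sd, rng)
        if i >= burn_in:
            draws.append((x1, x2))
    return draws


draws = gibbs_positive_quadrant(rho=0.5, n_draws=2000)
mean_x1 = sum(d[0] for d in draws) / len(draws)
print(f"estimated E[x1 | x1 > 0, x2 > 0] = {mean_x1:.3f}")
```

Successive draws are dependent, so averages computed from them need an accuracy assessment that accounts for serial correlation rather than the usual independent-sample standard error.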
A Monte Carlo Approach to Nonnormal and Nonlinear State-Space Modeling
, 1992
Cited by 126 (14 self)
this article then is to develop methodology for modeling the nonnormality of the $u_t$, the $v_t$, or both. A second departure from the model specification (1) is to allow for unknown variances in the state or observational equation, as well as for unknown parameters in the transition matrices $F_t$ and $H_t$. As a third generalization we allow for nonlinear model structures; that is, $x_t = f_t(x_{t-1}) + u_t$ and $y_t = h_t(x_t) + v_t$, $t = 1, \ldots, n$, (2) where $f_t(\cdot)$ and $h_t(\cdot)$ are given, but perhaps also depend on some unknown parameters. The experimenter may wish to entertain a variety of error distributions. Our goal throughout the article is an analysis for general state-space models that does not resort to convenient assumptions at the expense of model adequacy
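To make the nonlinear model structure concrete, the sketch below simulates a state equation x_t = f_t(x_{t-1}) + u_t and an observation equation y_t = h_t(x_t) + v_t. It only generates data and is not the article's inference method. The particular f, h, the Laplace state errors, and the Gaussian observation errors are hypothetical choices made for illustration.

```python
import math
import random


def simulate_state_space(f, h, n, x0, draw_u, draw_v):
    """Simulate states and observations for t = 1, ..., n.

    f(t, x) and h(t, x) are the state and observation functions;
    draw_u() and draw_v() return one error draw each.
    """
    states, observations = [], []
    x = x0
    for t in range(1, n + 1):
        x = f(t, x) + draw_u()  # state equation
        y = h(t, x) + draw_v()  # observation equation
        states.append(x)
        observations.append(y)
    return states, observations


rng = random.Random(0)


def laplace_noise(scale):
    # The difference of two independent exponentials is Laplace distributed,
    # giving a simple nonnormal, heavier-tailed error (hypothetical choice).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


states, observations = simulate_state_space(
    f=lambda t, x: 0.9 * math.tanh(x),  # hypothetical nonlinear transition
    h=lambda t, x: x * x / 5.0,         # hypothetical nonlinear observation
    n=100,
    x0=0.0,
    draw_u=lambda: laplace_noise(0.5),
    draw_v=lambda: rng.gauss(0.0, 0.2),
)
print(states[:3], observations[:3])
```

Swapping `draw_u` or `draw_v` for another generator is enough to entertain a different error distribution, which mirrors the flexibility the abstract asks for.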
Generative models for discovering sparse distributed representations
 Philosophical Transactions of the Royal Society B
, 1997
Cited by 120 (5 self)
We describe a hierarchical, generative model that can be viewed as a nonlinear generalization of factor analysis and can be implemented in a neural network. The model uses bottom-up, top-down and lateral connections to perform Bayesian perceptual inference correctly. Once perceptual inference has been performed the connection strengths can be updated using a very simple learning rule that only requires locally available information. We demonstrate that the network learns to extract sparse, distributed, hierarchical representations.