Results 1–10 of 161
Operations for Learning with Graphical Models
Journal of Artificial Intelligence Research, 1994
Cited by 246 (12 self)
Abstract:
This paper is a multidisciplinary review of empirical, statistical learning from a graphical model perspective. Well-known examples of graphical models include Bayesian networks, directed graphs representing a Markov chain, and undirected networks representing a Markov field. These graphical models are extended to model data analysis and empirical learning using the notation of plates. Graphical operations for simplifying and manipulating a problem are provided, including decomposition, differentiation, and the manipulation of probability models from the exponential family. Two standard algorithm schemas for learning are reviewed in a graphical framework: Gibbs sampling and the expectation-maximization algorithm. Using these operations and schemas, some popular algorithms can be synthesized from their graphical specification. This includes versions of linear regression, techniques for feedforward networks, and learning Gaussian and discrete Bayesian networks from data. The paper conclu...
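One of the two algorithm schemas the review covers, Gibbs sampling, can be illustrated with a minimal sketch (not taken from the paper): a bivariate Gaussian with correlation rho, whose full conditionals are known in closed form, so the alternating-conditional structure of the schema is directly visible.

```python
import numpy as np

# Toy Gibbs sampler for a bivariate Gaussian with correlation rho.
# The full conditionals x|y ~ N(rho*y, 1-rho^2) and y|x ~ N(rho*x, 1-rho^2)
# are exactly the alternating updates of the Gibbs schema.
rng = np.random.default_rng(0)
rho = 0.8
x, y = 0.0, 0.0
samples = []
for t in range(20000):
    x = rng.normal(rho * y, np.sqrt(1 - rho**2))
    y = rng.normal(rho * x, np.sqrt(1 - rho**2))
    if t >= 2000:                      # discard burn-in
        samples.append((x, y))
xs, ys = np.array(samples).T
print(np.corrcoef(xs, ys)[0, 1])       # empirical correlation close to rho
```

The chain's empirical correlation recovers rho, which is the basic consistency check one would run before applying the same schema to a larger graphical model.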
Bayesian color constancy
Journal of the Optical Society of America A, 1997
Cited by 135 (18 self)
Abstract:
The problem of color constancy may be solved if we can recover the physical properties of illuminants and surfaces from photosensor responses. We consider this problem within the framework of Bayesian decision theory. First, we model the relation among illuminants, surfaces, and photosensor responses. Second, we construct prior distributions that describe the probability that particular illuminants and surfaces exist in the world. Given a set of photosensor responses, we can then use Bayes’s rule to compute the posterior distribution for the illuminants and the surfaces in the scene. There are two widely used methods for obtaining a single best estimate from a posterior distribution. These are maximum a posteriori (MAP) and minimum mean-squared-error (MMSE) estimation. We argue that neither is appropriate for perception problems. We describe a new estimator, which we call the maximum local mass (MLM) estimate, that integrates local probability density. The new method uses an optimality criterion that is appropriate for perception tasks: It finds the most probable approximately correct answer. For the case of low observation noise, we provide an efficient approximation. We develop the MLM estimator for the color-constancy problem in which flat matte surfaces are uniformly illuminated. In simulations we show that the MLM method performs better than the MAP estimator and better than a number of standard color-constancy algorithms. We note conditions under which even the optimal estimator produces poor estimates: when the spectral properties of the surfaces in the scene are biased. © 1997 Optical Society of America [S0740-3232(97)01607-4]
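The contrast between MAP and the maximum-local-mass idea is easy to see on a synthetic one-dimensional posterior. The densities, tolerance window, and grid below are illustrative assumptions, not the paper's color model: a tall narrow spike wins under MAP, while integrating probability mass over a small window favors a broad bump.

```python
import numpy as np

# Synthetic 1-D posterior: tall narrow spike near x=0, broad bump near x=3.
x = np.linspace(-1, 6, 7001)
dx = x[1] - x[0]
post = 5.0 * np.exp(-(x / 0.01) ** 2) + 1.0 * np.exp(-((x - 3) / 0.5) ** 2)
post /= post.sum() * dx                      # normalize to a density

map_est = x[np.argmax(post)]                 # MAP: highest-density point

half = int(0.25 / dx)                        # +/- 0.25 tolerance window
mass = np.convolve(post, np.ones(2 * half + 1), mode="same") * dx
mlm_est = x[np.argmax(mass)]                 # MLM-style: most local mass

print(map_est, mlm_est)                      # spike near 0 vs bump near 3
```

MAP locks onto the spike even though almost all of the probability mass sits under the broad bump; the local-mass criterion returns the "most probable approximately correct" answer near 3.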
One-Dimensional Quantum Walks
STOC'01, 2001
Cited by 83 (11 self)
Abstract:
We define and analyze quantum computational variants of random walks on one-dimensional lattices. In particular, we analyze a quantum analog of the symmetric random walk, which we call the Hadamard walk. Several striking differences between the quantum and classical cases are observed. For example, when unrestricted in either direction, the Hadamard walk has position that is nearly uniformly distributed in the range [−t/√2, t/√2] ...
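A direct simulation makes the faster spreading of the Hadamard walk visible. The state layout, the starting coin state, and the step count below are illustrative choices, not taken from the paper:

```python
import numpy as np

# Minimal Hadamard-walk simulation on the line.  State: complex amplitudes
# psi[pos, coin] with coin in {0 = move left, 1 = move right}.  Each step
# applies the Hadamard coin, then shifts conditionally on the coin.
t = 100
n = 2 * t + 1                          # positions -t .. t
psi = np.zeros((n, 2), dtype=complex)
psi[t, 0] = 1.0                        # start at the origin, coin |0>

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
for _ in range(t):
    psi = psi @ H.T                    # coin flip at every position
    shifted = np.zeros_like(psi)
    shifted[:-1, 0] = psi[1:, 0]       # coin 0 moves one step left
    shifted[1:, 1] = psi[:-1, 1]       # coin 1 moves one step right
    psi = shifted

prob = (np.abs(psi) ** 2).sum(axis=1)
positions = np.arange(-t, t + 1)
std = np.sqrt((prob * positions**2).sum())
print(std)                             # grows linearly in t, not like sqrt(t)
```

The standard deviation grows linearly in t, in contrast to the sqrt(t) spread of the classical symmetric walk; with this (asymmetric) initial coin state the distribution is also visibly skewed.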
Frequency content of randomly scattered signals
Part I, Wave Motion, 1990
Cited by 72 (20 self)
Abstract:
The statistical properties of acoustic signals reflected by a randomly layered medium are analyzed when a pulsed spherical wave issuing from a point source is incident upon it. The asymptotic analysis of stochastic equations and geometrical acoustics is used to arrive at a set of transport equations that characterize multiply scattered signals observed at the surface of the layered medium. The results of extensive numerical simulations are presented, illustrating the scope of the theory. A number of inverse problems for randomly layered media are also formulated where we
Asymptotically Optimal Importance Sampling and Stratification for Pricing Path-Dependent Options
Mathematical Finance, 1999
Cited by 61 (13 self)
Abstract:
This paper develops a variance reduction technique for Monte Carlo simulations of path-dependent options driven by high-dimensional Gaussian vectors. The method combines importance sampling based on a change of drift with stratified sampling along a small number of key dimensions. The change of drift is selected through a large deviations analysis and is shown to be optimal in an asymptotic sense. The drift selected has an interpretation as the path of the underlying state variables which maximizes the product of probability and payoff: the most important path. The directions used for stratified sampling are optimal for a quadratic approximation to the integrand or payoff function. Indeed, under differentiability assumptions our importance sampling method eliminates variability due to the linear part of the payoff function, and stratification eliminates much of the variability due to the quadratic part of the payoff. The two parts of the method are linked because the asymptotically optimal drift vector frequently provides a particularly effective direction for stratification. We illustrate the use of the method with path-dependent options, a stochastic volatility model, and interest rate derivatives. The method reveals novel features of the structure of their payoffs. KEY WORDS: Monte Carlo methods, variance reduction, large deviations, Laplace principle
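The drift-change half of the method can be sketched on a toy problem. The payoff, drift, and dimension below are illustrative assumptions rather than the paper's option models: for a rare-event payoff, sampling under a shifted drift mu and reweighting by the likelihood ratio exp(−mu·Z + |mu|²/2) keeps the estimator unbiased while sharply reducing its variance.

```python
import numpy as np

# Toy "payoff" driven by a d-dimensional Gaussian: a deep out-of-the-money
# call on a standardized sum, f(Z) = max(sum(Z)/sqrt(d) - K, 0).
rng = np.random.default_rng(1)
d, K, N = 16, 3.0, 100_000

def payoff(z):
    return np.maximum(z.sum(axis=1) / np.sqrt(d) - K, 0.0)

# Plain Monte Carlo: almost all samples contribute zero.
Z = rng.standard_normal((N, d))
plain = payoff(Z)

# Importance sampling: the drift pushes samples toward the important region.
mu = np.full(d, K / np.sqrt(d))          # shifts sum(Z)/sqrt(d) by K
Zs = rng.standard_normal((N, d)) + mu
lr = np.exp(-Zs @ mu + 0.5 * mu @ mu)    # likelihood-ratio weight
isamp = payoff(Zs) * lr

print(plain.mean(), plain.std())
print(isamp.mean(), isamp.std())         # same mean, much smaller std
```

Here mu plays the role of the "most important path": it is the deterministic shift that makes the otherwise-rare payoff region typical under the sampling measure.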
Random maps, coalescing saddles, singularity analysis, and Airy phenomena
Random Structures & Algorithms, 2001
Cited by 45 (6 self)
Abstract:
A considerable number of asymptotic distributions arising in random combinatorics and analysis of algorithms are of the exponential-quadratic type, that is, Gaussian. We exhibit a class of "universal" phenomena that are of the exponential-cubic type, corresponding to distributions that involve the Airy function. In this paper, such Airy phenomena are related to the coalescence of saddle points and the confluence of singularities of generating functions. For about a dozen types of random planar maps, a common Airy distribution (equivalently, a stable law of exponent 3/2) describes the sizes of cores and of largest (multi)connected components. Consequences include the analysis and fine optimization of random generation algorithms for multiply connected planar graphs. Based on an extension of the singularity analysis framework suggested by the Airy case, the paper also presents a general classification of compositional schemas in analytic combinatorics.
Quantum Walks on the Hypercube
Proc. of RANDOM '02, 2002
Cited by 42 (0 self)
Abstract:
Recently, it has been shown that one-dimensional quantum walks can mix more quickly than classical random walks, suggesting that quantum Monte Carlo algorithms can outperform their classical counterparts. We study two quantum walks on the n-dimensional hypercube, one in discrete time and one in continuous time. In both cases we show that the instantaneous mixing time is (π/4)n steps, faster than the Θ(n log n) steps required by the classical walk. In the continuous-time case, the probability distribution is exactly uniform at this time. On the other hand, we show that the average mixing time as defined by Aharonov et al. [AAKV01] is Ω(n^{3/2}) in the discrete-time case, slower than the classical walk, and nonexistent in the continuous-time case. This suggests that the instantaneous mixing time is a more relevant notion than the average mixing time for quantum walks on large, well-connected graphs. Our analysis treats interference between terms of different phase more carefully than is necessary for the walk on the cycle; previous general bounds predict an exponential average mixing time when applied to the hypercube.
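The continuous-time claim can be checked numerically, because a walk generated by the hypercube adjacency matrix factorizes over bits: the adjacency matrix is a sum of commuting single-bit flips X_j, and e^{iθX} = cos θ · I + i sin θ · X. The normalization (generator scaled by 1/n) and the value of n below are illustrative choices consistent with an instantaneous mixing time of (π/4)n.

```python
import numpy as np
from itertools import product

# Per-bit evolution e^{i(t/n)X} gives amplitude cos(t/n) for keeping a bit
# and i*sin(t/n) for flipping it, so a vertex at Hamming weight k from the
# start has probability cos^2(t/n)^(n-k) * sin^2(t/n)^k.
n = 8
t = np.pi / 4 * n                       # the claimed instantaneous mixing time
c2, s2 = np.cos(t / n) ** 2, np.sin(t / n) ** 2
probs = np.array([c2 ** (n - sum(v)) * s2 ** sum(v)
                  for v in product([0, 1], repeat=n)])
print(probs.min(), probs.max(), probs.sum())   # all equal 2^-n, summing to 1
```

At t = (π/4)n each per-bit probability is exactly 1/2, so the distribution over all 2^n vertices is exactly uniform, matching the continuous-time result quoted in the abstract.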
Considering Cost Asymmetry in Learning Classifiers
J. Machine Learning Research, 2006
Cited by 25 (5 self)
Abstract:
Receiver Operating Characteristic (ROC) curves are a standard way to display the performance of a set of binary classifiers for all feasible ratios of the costs associated with false positives and false negatives. For linear classifiers, the set of classifiers is typically obtained by training once, holding constant the estimated slope and then varying the intercept to obtain a parameterized set of classifiers whose performances can be plotted in the ROC plane. We consider the alternative of varying the asymmetry of the cost function used for training. We show that the ROC curve obtained by varying both the intercept and the asymmetry, and hence the slope, always outperforms the ROC curve obtained by varying only the intercept. In addition, we present a path-following algorithm for the support vector machine (SVM) that can efficiently compute the entire ROC curve, and that has the same computational complexity as training a single classifier. Finally, we provide a theoretical analysis of the relationship between the asymmetric cost model assumed when training a classifier and the cost model assumed in applying the classifier. In particular, we show that the mismatch between the step function used for testing and its convex upper bounds, usually used for training, leads to a provable and quantifiable difference around extreme asymmetries.
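The intercept-only baseline that the paper improves on amounts to sweeping a threshold over fixed classifier scores. The Gaussian score distributions below are illustrative assumptions, not the paper's experiments; the sketch builds that threshold-sweep ROC curve and its AUC.

```python
import numpy as np

# Fixed linear scores for the two classes (illustrative Gaussians).
rng = np.random.default_rng(0)
pos = rng.normal(1.0, 1.0, 500)     # scores of positive examples
neg = rng.normal(-1.0, 1.0, 500)    # scores of negative examples

# Intercept-only ROC: sweep the decision threshold over the fixed scores.
thresholds = np.linspace(-4, 4, 81)
roc = [((neg > th).mean(), (pos > th).mean()) for th in thresholds]  # (FPR, TPR)

# AUC of the fixed scorer via the rank (Mann-Whitney) statistic.
auc = (pos[:, None] > neg[None, :]).mean()
print(roc[40], round(auc, 3))       # operating point at threshold 0, and AUC
```

The paper's observation is that this curve is only a lower envelope: re-training with asymmetric costs changes the slope as well as the intercept, and the resulting curve dominates the threshold-sweep curve pointwise.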
Exploiting the generic viewpoint assumption
IJCV, 1996
Cited by 22 (1 self)
Abstract:
The "generic viewpoint" assumption states that an observer is not in a special position relative to the scene. It is commonly used to disqualify scene interpretations that assume special viewpoints, following a binary decision that the viewpoint was either generic or accidental. In this paper, we apply Bayesian statistics to quantify the probability of a view, and so derive a useful tool to estimate scene parameters. This approach may increase the scope and accuracy of scene estimates. It applies to a range of vision problems. We show shape from shading examples, where we rank shapes or reflectance functions in cases where these are otherwise unknown. The rankings agree with the perceived values.
Criteria For Irrationality Of Euler's Constant
 Proc. Amer. Math. Soc
Cited by 22 (7 self)
Abstract:
By modifying Beukers' proof of Apéry's theorem that ζ(3) is irrational, we derive criteria for irrationality of Euler's constant, γ. For n > 0, we define a double integral I_n and a positive integer S_n, and prove that with d_n = LCM(1, ..., n) the following are equivalent. 1. The fractional part of log S_n is given by {log S_n} = d_{2n} I_n for some n. 2. The formula holds for all sufficiently large n. 3. Euler's constant is a rational number. A corollary is that if {log S_n} ≥ 2^{-n} infinitely often, then γ is irrational. Indeed, if the inequality holds for a given n (we present numerical evidence for 1 ≤ n ≤ 2500) and γ is rational, then its denominator does not divide d_{2n}. We prove a new combinatorial identity in order to show that a certain linear form in logarithms is in fact log S_n. A byproduct is a rapidly converging asymptotic formula for γ, used by P. Sebah to compute γ correct to 18063 decimals.
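The closing remark, that a rapidly converging formula matters for computing γ to many digits, can be illustrated generically. The sketch below uses the standard Euler-Maclaurin correction to H_n − ln n, not the paper's formula:

```python
from math import log

# gamma = lim (H_n - ln n), but the naive difference converges like 1/(2n).
# Adding the first Euler-Maclaurin correction terms gives O(n^-4) error.
n = 1000
H = sum(1.0 / k for k in range(1, n + 1))            # harmonic number H_n
naive = H - log(n)                                   # error ~ 1/(2n)
refined = H - log(n) - 1 / (2 * n) + 1 / (12 * n**2) # error ~ 1/(120 n^4)
print(naive, refined)
```

With n = 1000 the naive estimate is off in the fourth decimal, while the corrected one already agrees with γ = 0.5772156649... to about twelve decimals, which is why high-precision computations need formulas with rapidly decaying error terms.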