Results 1 - 10 of 7,673
Thompson Sampling for 1-Dimensional Exponential Family Bandits. In Neural Information Processing Systems, 2013.
"... ar ..."
Graphical models, exponential families, and variational inference, 2008.
"... The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fiel ..."
Cited by 819 (28 self).
... of probability distributions are best studied in the general setting. Working with exponential family representations, and exploiting the conjugate duality between the cumulant function and the entropy for exponential families, we develop general variational representations of the problems of computing ...
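For context on the "general variational representations" the excerpt mentions, the underlying identity (standard exponential family duality, stated here from general knowledge rather than quoted from the paper) is

$$A(\theta) \;=\; \sup_{\mu \in \mathcal{M}} \bigl\{ \langle \theta, \mu \rangle - A^{*}(\mu) \bigr\},$$

where $A$ is the cumulant (log-partition) function, $\mathcal{M}$ is the set of realizable mean parameters, and the conjugate $A^{*}$ agrees with the negative entropy on the interior of $\mathcal{M}$; approximate inference schemes arise by relaxing $\mathcal{M}$ and approximating $A^{*}$.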
Articulated body motion capture by annealed particle filtering. In IEEE Conf. on Computer Vision and Pattern Recognition, 2000.
"... The main challenge in articulated body motion tracking is the large number of degrees of freedom (around 30) to be recovered. Search algorithms, either deterministic or stochastic, that search such a space without constraint, fall foul of exponential computational complexity. One approach is to intr ..."
Cited by 494 (4 self).
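To make the annealing idea concrete, here is a hedged toy sketch of one annealed-filtering frame on a 1-D state; the likelihood, annealing schedule, and noise scale are illustrative assumptions (the paper itself tracks roughly 30-DOF body poses against image features):

```python
import math
import random

rng = random.Random(0)

def likelihood(x, observation):
    # Placeholder likelihood; a real tracker would compare a rendered body
    # model against edge/silhouette features (this stand-in is an assumption).
    return math.exp(-0.5 * (x - observation) ** 2)

def annealed_filter_step(particles, observation, layers=5):
    """One frame of a toy annealed particle filter on a 1-D state.
    Annealing exponents beta rise toward 1, so early layers keep a broad,
    smoothed posterior and later layers sharpen around strong peaks."""
    n = len(particles)
    for m in range(layers):
        beta = (m + 1) / layers  # annealing schedule beta_m -> 1
        w = [likelihood(x, observation) ** beta for x in particles]
        total = sum(w)
        w = [wi / total for wi in w]
        # Resample proportionally to the annealed weights.
        particles = rng.choices(particles, weights=w, k=n)
        # Diffuse with noise that shrinks as the layers sharpen.
        sigma = 1.0 * (1.0 - m / layers) + 0.05
        particles = [x + rng.gauss(0.0, sigma) for x in particles]
    return particles

particles = [rng.uniform(-10, 10) for _ in range(200)]
particles = annealed_filter_step(particles, observation=3.0)
print(sum(particles) / len(particles))  # estimate should land near 3.0
```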
Dual polyhedra and mirror symmetry for Calabi–Yau hypersurfaces in toric varieties. J. Alg. Geom., 1994.
"... We consider families F(∆) consisting of complex (n − 1)-dimensional projective algebraic compactifications of ∆-regular affine hypersurfaces Zf defined by Laurent polynomials f with a fixed n-dimensional Newton polyhedron ∆ in n-dimensional algebraic torus T = (C ∗ ) n. If the family F(∆) defined by ..."
Cited by 467 (20 self).
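For orientation, the dual polyhedron the title refers to is the polar dual; in standard notation (general background, not quoted from the abstract),

$$\Delta^{*} \;=\; \{\, y \in \mathbb{R}^{n} \;:\; \langle x, y \rangle \ge -1 \ \text{for all } x \in \Delta \,\},$$

and $\Delta$ is called reflexive when both $\Delta$ and $\Delta^{*}$ are lattice polytopes containing the origin, in which case the families $F(\Delta)$ and $F(\Delta^{*})$ are candidate mirror pairs.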
A Neural Probabilistic Language Model. Journal of Machine Learning Research, 2003.
"... A goal of statistical language modeling is to learn the joint probability function of sequences of words in a language. This is intrinsically difficult because of the curse of dimensionality: a word sequence on which the model will be tested is likely to be different from all the word sequences seen ..."
Cited by 447 (19 self).
... training sentence to inform the model about an exponential number of semantically neighboring sentences. The model learns simultaneously (1) a distributed representation for each word along with (2) the probability function for word sequences, expressed in terms of these representations. Generalization ...
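A minimal forward-pass sketch of the architecture the excerpt describes, assuming the standard form y = b + Wx + U tanh(d + Hx) with a softmax output; all dimensions and the random initialization below are illustrative assumptions, and training (backpropagation through C, H, U, W) is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumed for illustration): vocabulary V, embedding size m,
# context of n-1 previous words, hidden layer size h.
V, m, n_minus_1, h = 50, 8, 3, 16

C = rng.normal(0, 0.1, (V, m))              # word feature vectors
H = rng.normal(0, 0.1, (h, n_minus_1 * m))  # input-to-hidden weights
d = np.zeros(h)
U = rng.normal(0, 0.1, (V, h))              # hidden-to-output weights
W = rng.normal(0, 0.1, (V, n_minus_1 * m))  # direct input-to-output weights
b = np.zeros(V)

def next_word_probs(context_ids):
    """P(w_t | context) via y = b + Wx + U tanh(d + Hx), then softmax(y)."""
    x = C[context_ids].reshape(-1)          # concatenate context embeddings
    y = b + W @ x + U @ np.tanh(d + H @ x)
    e = np.exp(y - y.max())                 # numerically stable softmax
    return e / e.sum()

p = next_word_probs([3, 17, 42])
print(p.shape, p.sum())                     # (50,) with probabilities summing to ~1.0
```

Because similar words get similar feature vectors in C, probability mass spreads to unseen but semantically neighboring sequences, which is the generalization mechanism the excerpt refers to.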
Clustering with Bregman Divergences. Journal of Machine Learning Research, 2005.
"... A wide variety of distortion functions are used for clustering, e.g., squared Euclidean distance, Mahalanobis distance and relative entropy. In this paper, we propose and analyze parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergence ..."
Cited by 443 (57 self).
... this loss. Secondly, we show an explicit bijection between Bregman divergences and exponential families. The bijection enables the development of an alternative interpretation of an efficient EM scheme for learning models involving mixtures of exponential distributions. This leads to a simple soft clustering ...
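A sketch of the hard-clustering side of this, using the paper's key property that the divergence-minimizing representative of a cluster is always its arithmetic mean; the generalized I-divergence used here is just one example Bregman divergence, and the data are synthetic:

```python
import numpy as np

def i_divergence(x, mu):
    # Generalized I-divergence, the Bregman divergence for phi(x) = x log x;
    # defined for positive vectors.
    return np.sum(x * np.log(x / mu) - x + mu, axis=-1)

def bregman_hard_cluster(X, k, divergence, iters=50, seed=0):
    """Bregman hard clustering: assign each point to the nearest centroid
    under the chosen divergence; the optimal centroid update is the mean."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.stack([divergence(X, c) for c in centroids], axis=1)
        labels = dists.argmin(axis=1)
        new_centroids = []
        for j in range(k):
            members = X[labels == j]
            # Keep the old centroid if a cluster empties out.
            new_centroids.append(members.mean(axis=0) if len(members) else centroids[j])
        centroids = np.stack(new_centroids)
    return labels, centroids

X = np.abs(np.random.default_rng(1).normal(5, 1, (200, 4)))  # positive data
labels, centroids = bregman_hard_cluster(X, k=3, divergence=i_divergence)
print(centroids.round(2))
```

Swapping in squared Euclidean distance recovers ordinary k-means, which is why the paper treats k-means as one instance of the same scheme.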
Variational algorithms for approximate Bayesian inference, 2003.
"... The Bayesian framework for machine learning allows for the incorporation of prior knowledge in a coherent way, avoids overfitting problems, and provides a principled basis for selecting between alternative models. Unfortunately the computations required are usually intractable. This thesis presents ..."
Cited by 440 (9 self).
... the theoretical core of the thesis, generalising the expectation-maximisation (EM) algorithm for learning maximum likelihood parameters to the VB EM algorithm which integrates over model parameters. The algorithm is then specialised to the large family of conjugate-exponential (CE) graphical models, and several ...
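As background for the excerpt (standard variational Bayes material, not text from the thesis): with latent variables $x$, parameters $\theta$, and a factorized posterior $q_x(x)\,q_\theta(\theta)$, VB EM ascends the lower bound

$$\ln p(y) \;\ge\; \mathcal{F}(q_x, q_\theta) \;=\; \int q_x(x)\, q_\theta(\theta)\, \ln \frac{p(y, x \mid \theta)\, p(\theta)}{q_x(x)\, q_\theta(\theta)} \; dx \, d\theta$$

by alternating the updates $q_x(x) \propto \exp \langle \ln p(y, x \mid \theta) \rangle_{q_\theta}$ (VB-E step) and $q_\theta(\theta) \propto p(\theta) \exp \langle \ln p(y, x \mid \theta) \rangle_{q_x}$ (VB-M step); for conjugate-exponential models both updates stay in closed form, which is what makes the CE specialisation tractable.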
The Wiener-Askey Polynomial Chaos for Stochastic Differential Equations. SIAM J. Sci. Comput., 2002.
"... We present a new method for solving stochastic differential equations based on Galerkin projections and extensions of Wiener's polynomial chaos. Specifically, we represent the stochastic processes with an optimum trial basis from the Askey family of orthogonal polynomials that reduces the dime ..."
Abstract
-
Cited by 398 (42 self)
- Add to MetaCart
the dimensionality of the system and leads to exponential convergence of the error. Several continuous and discrete processes are treated, and numerical examples show substantial speed-up compared to Monte-Carlo simulations for low dimensional stochastic inputs.
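The expansion behind this (standard generalized polynomial chaos, stated from general knowledge rather than the abstract): a random quantity is written as

$$u(\omega) \;\approx\; \sum_{i=0}^{P} \hat{u}_i \, \Phi_i\bigl(\xi(\omega)\bigr),$$

where the $\Phi_i$ are Askey-scheme polynomials orthogonal under the density of the random input $\xi$ (Hermite for Gaussian inputs, Legendre for uniform, and so on). Substituting the expansion into the equation and requiring the residual $R$ to satisfy $\langle R, \Phi_k \rangle = 0$ for each $k$ yields a coupled deterministic system for the coefficients $\hat{u}_i$, and matching the basis to the input distribution is what drives the exponential error decay the excerpt mentions.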
The Web as a graph: measurements, models, and methods, 1999.
"... . The pages and hyperlinks of the World-Wide Web may be viewed as nodes and edges in a directed graph. This graph is a fascinating object of study: it has several hundred million nodes today, over a billion links, and appears to grow exponentially with time. There are many reasons --- mathematical, ..."
Cited by 373 (11 self).
A generalization of principal component analysis to the exponential family. In Advances in Neural Information Processing Systems, 2001.
"... Principal component analysis (PCA) is a commonly applied technique for dimensionality reduction. PCA implicitly minimizes a squared loss function, which may be inappropriate for data that is not real-valued, such as binary-valued data. This paper draws on ideas from the Exponential family, Generaliz ..."
Cited by 155 (1 self).
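The generalization the excerpt alludes to can be summarized as follows (a standard formulation from the exponential-family PCA literature, not quoted from the abstract): fit a low-rank matrix of natural parameters $\theta_{ij} = u_i^{\top} v_j$ by minimizing the negative log-likelihood

$$\min_{U, V} \; \sum_{i,j} \bigl[ G(u_i^{\top} v_j) - x_{ij}\, u_i^{\top} v_j \bigr],$$

where $G$ is the family's cumulant function; the Gaussian case $G(\theta) = \theta^2 / 2$ recovers ordinary PCA up to constants, while the Bernoulli case $G(\theta) = \log(1 + e^{\theta})$ yields a logistic PCA suited to binary data.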