Results 1-10 of 111
Diffusion Maps, Spectral Clustering and Reaction Coordinates of Dynamical Systems
 Applied and Computational Harmonic Analysis: Special issue on Diffusion Maps and Wavelets
, 2006
Abstract

Cited by 90 (14 self)
A central problem in data analysis is the low dimensional representation of high dimensional data, and the concise description of its underlying geometry and density. In the analysis of large scale simulations of complex dynamical systems, where the notion of time evolution comes into play, important problems are the identification of slow variables and dynamically meaningful reaction coordinates that capture the long time evolution of the system. In this paper we provide a unifying view of these apparently different tasks, by considering a family of diffusion maps, defined as the embedding of complex (high dimensional) data onto a low dimensional Euclidean space, via the eigenvectors of suitably defined random walks on the given datasets. Assuming that the data is randomly sampled from an underlying general probability distribution p(x) = e^{-U(x)}, we show that as the number of samples goes to infinity, the eigenvectors of each diffusion map converge to the eigenfunctions of a corresponding differential operator defined on the support of the probability distribution. Different normalizations of the Markov chain on the graph lead to different limiting differential operators.
Sampling the posterior: An approach to non-Gaussian data assimilation
, 2006
Abstract

Cited by 30 (10 self)
The viewpoint taken in this paper is that data assimilation is fundamentally a statistical problem and that this problem should be cast in a Bayesian framework. In the absence of model error, the correct solution to the data assimilation problem is to find the posterior distribution implied by this Bayesian setting. Methods for dealing with data assimilation should then be judged by their ability to probe this distribution. In this paper we propose a range of techniques for probing the posterior distribution, based on the Langevin equation, and we compare these new techniques with existing methods. When the underlying dynamics is deterministic, the posterior distribution is on the space of initial conditions, leading to a sampling problem over this space. When the underlying dynamics is stochastic, the posterior distribution is on the space of continuous time paths. By writing down a density, and conditioning on observations, it is possible to define
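The Langevin-based sampling idea described above can be sketched with an unadjusted Langevin iteration. The toy target below (a standard normal "posterior"), the step size, and the run length are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def langevin_sample(grad_log_post, x0, step, n_steps, rng):
    """Unadjusted Langevin algorithm: x_{k+1} = x_k + step*grad_log_post(x_k)
    + sqrt(2*step)*xi_k, a discretization of the overdamped Langevin SDE
    whose invariant density is the target posterior."""
    x = x0
    samples = np.empty(n_steps)
    for k in range(n_steps):
        x = x + step * grad_log_post(x) + np.sqrt(2.0 * step) * rng.standard_normal()
        samples[k] = x
    return samples

# Toy posterior: standard normal, log p(x) = -x^2/2, so grad log p(x) = -x.
rng = np.random.default_rng(0)
s = langevin_sample(lambda x: -x, x0=3.0, step=0.05, n_steps=20_000, rng=rng)
```

The discretization introduces a small bias in the invariant measure; a Metropolis accept/reject correction (MALA) is the usual way to remove it.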
Parameter estimation for multiscale diffusions
 J. Statist. Phys
, 2007
Abstract

Cited by 30 (10 self)
We study the problem of parameter estimation for time series possessing two, widely separated, characteristic time scales. The aim is to understand situations where it is desirable to fit a homogenized single-scale model to such multiscale data. We demonstrate, numerically and analytically, that if the data is sampled too finely then the parameter fit will fail, in that the correct parameters in the homogenized model are not identified. We also show, numerically and analytically, that if the data is subsampled at an appropriate rate then it is possible to estimate the coefficients of the homogenized model correctly. The ideas are studied in the context of thermally activated motion in a two-scale potential. However, the ideas may be expected to transfer to other situations where it is desirable to fit an averaged or homogenized equation to multiscale data.
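As a minimal illustration of the kind of drift estimator applied to (sub)sampled data, the sketch below fits the drift of a single-scale Ornstein-Uhlenbeck process by least squares on increments. The model, parameter values, and stride are assumptions for illustration and do not reproduce the paper's two-scale potential, where the interesting failure at fine sampling occurs:

```python
import numpy as np

def estimate_ou_drift(path, dt, stride):
    """Least-squares drift estimate for dX = -a X dt + sigma dW from a path
    subsampled with the given stride: regress the increments on the state."""
    x = path[::stride]
    h = dt * stride
    dx = np.diff(x)
    return -np.sum(x[:-1] * dx) / (h * np.sum(x[:-1] ** 2))

# Euler-Maruyama simulation of an OU path with a = 2, sigma = 1 (illustrative values).
rng = np.random.default_rng(1)
a, sigma, dt, n = 2.0, 1.0, 1e-3, 200_000
x = np.empty(n)
x[0] = 0.0
noise = rng.standard_normal(n - 1)
for k in range(n - 1):
    x[k + 1] = x[k] - a * x[k] * dt + sigma * np.sqrt(dt) * noise[k]

a_hat = estimate_ou_drift(x, dt, stride=10)   # subsample at h = 0.01
```

For single-scale data the estimate is close to the true drift at any reasonable stride; the paper's point is that for multiscale data the stride must be tuned to the scale separation.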
Consistency and stability of tau-leaping schemes for chemical reaction systems
 SIAM Multiscale Modeling
, 2005
Abstract

Cited by 29 (7 self)
We develop a theory of local errors for the explicit and implicit tau-leaping methods for simulating stochastic chemical systems, and we prove that these methods are first-order consistent. Our theory provides local error formulae that could serve as the basis for future step-size control techniques. We prove that, for the special case of systems with linear propensity functions, both tau-leaping methods are first-order convergent in all moments. We provide a stiff stability analysis of the mean of both leaping methods, and we confirm that the implicit method is unconditionally stable in the mean for stable systems. Finally, we give some theoretical and numerical examples to illustrate these results.
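A minimal sketch of the explicit tau-leaping method, applied to a linear birth-death system (the rates, leap size, and run length below are illustrative assumptions):

```python
import numpy as np

def tau_leap(x0, k_birth, k_death, tau, n_steps, rng):
    """Explicit tau-leaping for the birth-death system 0 -> S (propensity k_birth)
    and S -> 0 (propensity k_death * x): over each leap of length tau, each
    channel fires a Poisson number of times with the propensities frozen."""
    x = x0
    traj = [x]
    for _ in range(n_steps):
        births = rng.poisson(k_birth * tau)
        deaths = rng.poisson(k_death * x * tau)
        x = max(x + births - deaths, 0)   # clamp to avoid negative counts
        traj.append(x)
    return np.array(traj)

rng = np.random.default_rng(2)
traj = tau_leap(x0=0, k_birth=10.0, k_death=0.1, tau=0.5, n_steps=2000, rng=rng)
# For this linear system the stationary mean is k_birth / k_death = 100.
```

The clamp to zero is a crude fix for the negative-population problem; step-size control of the kind the paper's error formulae enable would choose tau adaptively instead.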
DIFFUSION MAPS, REDUCTION COORDINATES AND LOW DIMENSIONAL REPRESENTATION OF STOCHASTIC SYSTEMS
Abstract

Cited by 21 (4 self)
The concise representation of complex high dimensional stochastic systems via a few reduced coordinates is an important problem in computational physics, chemistry and biology. In this paper we use the first few eigenfunctions of the backward Fokker-Planck diffusion operator as a coarse grained low dimensional representation for the long term evolution of a stochastic system, and show that they are optimal under a certain mean squared error criterion. We denote the mapping from physical space to these eigenfunctions as the diffusion map. While in high dimensional systems these eigenfunctions are difficult to compute numerically by conventional methods such as finite differences or finite elements, we describe a simple computational data-driven method to approximate them from a large set of simulated data. Our method is based on defining an appropriately weighted graph on the set of simulated data, and computing the first few eigenvectors and eigenvalues of the corresponding random walk matrix on this graph. Thus, our algorithm incorporates the local geometry and density at each point into a global picture that merges in a natural way data from different simulation runs. Furthermore, we describe lifting and restriction operators between the diffusion map space and the original space. These operators facilitate the description of the coarse-grained dynamics, possibly in the form of a low-dimensional effective free energy surface parameterized by the diffusion map reduction coordinates. They also enable a systematic exploration of such effective free energy surfaces through the design of additional "intelligently biased" computational experiments. We conclude by demonstrating our method on a few examples. Key words. Diffusion maps, dimensional reduction, stochastic dynamical systems, Fokker-Planck operator, metastable states, normalized graph Laplacian. AMS subject classifications. 60H10, 60J60, 62M05
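The graph-based construction described above can be sketched as follows, assuming a plain Gaussian kernel with row normalization; the paper's density-dependent normalizations and the lifting/restriction operators are omitted, and the noisy-circle dataset and bandwidth are illustrative choices:

```python
import numpy as np

def diffusion_map(data, eps, n_coords=2):
    """Basic diffusion map: build a Gaussian-kernel graph on the data points,
    row-normalize it into a random-walk (Markov) matrix, and return the top
    non-trivial eigenvalues/eigenvectors as reduction coordinates."""
    d2 = np.sum((data[:, None, :] - data[None, :, :]) ** 2, axis=-1)
    K = np.exp(-d2 / eps)
    P = K / K.sum(axis=1, keepdims=True)          # random-walk matrix on the graph
    vals, vecs = np.linalg.eig(P)                 # real spectrum (P similar to symmetric)
    order = np.argsort(-vals.real)
    # Skip the trivial eigenvector (eigenvalue 1, constant on the data).
    return vals.real[order][1:n_coords + 1], vecs.real[:, order][:, 1:n_coords + 1]

# Data sampled near a circle: the first two coordinates recover the angle.
rng = np.random.default_rng(3)
theta = rng.uniform(0.0, 2.0 * np.pi, 200)
pts = np.column_stack([np.cos(theta), np.sin(theta)]) + 0.01 * rng.standard_normal((200, 2))
vals, coords = diffusion_map(pts, eps=0.1)
```

For large datasets one would truncate the kernel to nearest neighbors and use a sparse eigensolver rather than dense `eig`.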
A hierarchy of approximations of the master equation scaled by a size parameter
 J. SCI. COMPUT
, 2008
Abstract

Cited by 21 (3 self)
Solutions of the master equation are approximated using a hierarchy of models based on the solution of ordinary differential equations: the macroscopic equations, the linear noise approximation and the moment equations. The advantage of the approximations is that the computational work with deterministic algorithms grows as a polynomial in the number of species, instead of the exponential growth of conventional methods for the master equation. The relation between the approximations is investigated theoretically and in numerical examples. The solutions converge to the macroscopic equations when a parameter measuring the size of the system grows. A computational criterion is suggested for estimating the accuracy of the approximations. The numerical examples are models for the migration of people, for population dynamics, and for molecular biology.
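As a sketch of the lowest level of such a hierarchy, the snippet below integrates the macroscopic rate equation for a linear birth-death model; for linear propensities the macroscopic solution coincides with the exact mean of the master equation. The model and rate constants are illustrative assumptions:

```python
import numpy as np

def macroscopic(x0, k1, k2, dt, n_steps):
    """Macroscopic rate equation dx/dt = k1 - k2*x for the birth-death model
    0 -> S (rate k1), S -> 0 (propensity k2*x), integrated with forward Euler.
    For linear propensities this equals the exact mean of the master equation."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = x[k] + dt * (k1 - k2 * x[k])
    return x

x = macroscopic(x0=0.0, k1=10.0, k2=0.1, dt=0.01, n_steps=10_000)
# Exact solution: x(t) = (k1/k2) * (1 - exp(-k2*t)), approaching k1/k2 = 100.
```

The linear noise approximation and moment equations add ODEs for the covariance and higher moments on top of this mean equation, which is where the polynomial-in-species cost arises.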
Effective dynamics using conditional expectations
Abstract

Cited by 16 (6 self)
The question of coarse-graining is ubiquitous in molecular dynamics. In this article, we are interested in deriving effective properties for the dynamics of a coarse-grained variable ξ(x), where x describes the configuration of the system in a high-dimensional space R^n, and ξ is a smooth function with values in R (typically a reaction coordinate). It is well known that, given a Boltzmann-Gibbs distribution on x ∈ R^n, the equilibrium properties on ξ(x) are completely determined by the free energy. On the other hand, the question of the effective dynamics on ξ(x) is much more difficult to address. Starting from an overdamped Langevin equation on x ∈ R^n, we propose an effective dynamics for ξ(x) ∈ R using conditional expectations. Using entropy methods, we give sufficient conditions for the time marginals of the effective dynamics to be close to the original ones. We check numerically on some toy examples that these sufficient conditions yield an effective dynamics which accurately reproduces the residence times in the potential energy wells. We also discuss the accuracy of the effective dynamics in a pathwise sense, and the relevance of the free energy to build a coarse-grained dynamics. AMS classification scheme numbers: 35B40, 82C31, 60H10.
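The conditional-expectation idea can be illustrated on a toy quadratic potential, where the effective drift for the reaction coordinate ξ(x, y) = x is estimated by binning equilibrium samples. The potential, coupling constant, and binning below are assumptions for illustration, not the paper's scheme:

```python
import numpy as np

# Toy example: for V(x, y) = (x^2 + y^2)/2 + c*x*y and reaction coordinate
# xi(x, y) = x, the effective drift at xi = z is the conditional average
# b(z) = E[-dV/dx | x = z] under the Boltzmann-Gibbs measure exp(-V).
# For this Gaussian case the exact answer is b(z) = -(1 - c^2) * z.
c = 0.5
rng = np.random.default_rng(4)
prec = np.array([[1.0, c], [c, 1.0]])
cov = np.linalg.inv(prec)                       # covariance of the Gibbs measure
xy = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)
x_s, y_s = xy[:, 0], xy[:, 1]
drift = -(x_s + c * y_s)                        # -dV/dx at each sample

# Bin by the reaction coordinate and average the sampled drift within each bin.
bins = np.linspace(-1.5, 1.5, 13)
centers = 0.5 * (bins[:-1] + bins[1:])
idx = np.digitize(x_s, bins) - 1
b_eff = np.array([drift[idx == k].mean() for k in range(len(centers))])
slope = np.polyfit(centers, b_eff, 1)[0]        # theory: -(1 - c^2) = -0.75
```

The same binned conditional average, applied to the diffusion coefficient as well, is the basic numerical route to an effective one-dimensional SDE in ξ.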
Importance sampling techniques for estimation of diffusion models
 In
, 2012
Abstract

Cited by 14 (4 self)
This article develops a class of Monte Carlo (MC) methods for simulating conditioned diffusion sample paths, with special emphasis on importance sampling schemes. We restrict attention to a particular type of conditioned diffusions, the so-called diffusion bridge processes. The diffusion bridge is the process obtained by conditioning a diffusion to start and finish at specific values at two consecutive times t0 < t1. Diffusion bridge simulation is a highly non-trivial problem. At an even more elementary level, unconditional simulation of diffusions, that is, without fixing the value of the process at t1, is difficult. This is a simulation from the transition distribution of the diffusion, which is typically intractable. This intractability stems from the implicit specification of the diffusion as a solution of a stochastic differential equation (SDE). Although unconditional simulation can be carried out by various approximate schemes based on discretizations of the SDE, it is not feasible to devise similar schemes for diffusion bridges in general. This has motivated active research in the last 15 years or so for the development of MC methodology for diffusion bridges. The research in this direction has been fuelled by the fundamental role that diffusion bridge simulation plays in the statistical inference for diffusion processes. Any statistical analysis which requires the transition density of the process is halted whenever the latter is not explicitly available, which is typically the case. Hence it is challenging to fit diffusion models employed in applications to the incomplete data typically available. An interesting possibility is to approximate the intractable transition density using an appropriate MC scheme and carry
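An elementary ingredient in such importance sampling schemes is a bridge proposal. The sketch below samples a Brownian bridge pinned at two endpoints, a standard proposal process for diffusion-bridge importance sampling; the grid, endpoints, and times are illustrative:

```python
import numpy as np

def brownian_bridge(x0, x1, t0, t1, n, rng):
    """Sample a Brownian bridge from (t0, x0) to (t1, x1) on n+1 grid points:
    generate a free Brownian path w, then pin it by subtracting its scaled
    endpoint and interpolating linearly between x0 and x1."""
    t = np.linspace(t0, t1, n + 1)
    dt = np.diff(t)
    w = np.concatenate([[0.0], np.cumsum(np.sqrt(dt) * rng.standard_normal(n))])
    s = (t - t0) / (t1 - t0)
    return t, x0 + (x1 - x0) * s + (w - s * w[-1])

rng = np.random.default_rng(5)
t, path = brownian_bridge(x0=0.0, x1=2.0, t0=0.0, t1=1.0, n=100, rng=rng)
```

In an importance sampling scheme, paths like this are proposed and reweighted by the (Girsanov) likelihood ratio between the proposal law and the target diffusion bridge.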
Maximum Likelihood Drift Estimation for Multiscale Diffusions
 Stochastic Processes Applications
Abstract

Cited by 14 (5 self)
We study the problem of parameter estimation using maximum likelihood for fast/slow systems of stochastic differential equations. Our aim is to shed light on the problem of model/data mismatch at small scales. We consider two classes of fast/slow problems for which a closed coarse-grained equation for the slow variables can be rigorously derived, which we refer to as averaging and homogenization problems. We ask whether, given data from the slow variable in the fast/slow system, we can correctly estimate parameters in the drift of the coarse-grained equation for the slow variable, using maximum likelihood. We show that, whereas the maximum likelihood estimator is asymptotically unbiased for the averaging problem, for the homogenization problem maximum likelihood fails unless we subsample the data at an appropriate rate. An explicit formula for the asymptotic error in the log-likelihood function is presented. Our theory is applied to two simple examples from molecular dynamics.
On the convergence of population protocols when population goes to infinity
 Applied Mathematics and Computation
Abstract

Cited by 14 (2 self)
Population protocols have been introduced as a model of sensor networks consisting of very limited mobile agents with no control over their own movement. A population protocol corresponds to a collection of anonymous agents, modeled by finite automata, that interact with one another to carry out computations by updating their states according to some rules. Their computational power has been investigated under several hypotheses, but always restricted to finite size populations. In particular, predicates stably computable in the original model have been characterized as those definable in Presburger arithmetic. We study mathematically the convergence of population protocols when the size of the population goes to infinity. We do so by giving general results, which we illustrate through the example of a particular population protocol for which we even obtain an asymptotic expansion. This example shows in particular that these protocols seem to have a rather different computational power when a huge population hypothesis is considered.
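A minimal population-protocol simulation, using the one-bit "OR" protocol (both agents of a meeting pair leave in state 1 if either was 1) as an illustrative example; the population size and interaction count below are assumptions:

```python
import numpy as np

def run_or_protocol(states, n_interactions, rng):
    """Simulate the one-bit 'OR' population protocol: at each step a random
    pair of distinct agents meets, and both leave in state 1 if either was 1.
    Starting from any population containing a 1, it converges to all ones."""
    s = states.copy()
    n = len(s)
    for _ in range(n_interactions):
        i, j = rng.choice(n, size=2, replace=False)
        if s[i] or s[j]:
            s[i] = s[j] = 1
    return s

rng = np.random.default_rng(6)
pop = np.zeros(100, dtype=int)
pop[0] = 1                                       # a single informed agent
final = run_or_protocol(pop, n_interactions=20_000, rng=rng)
```

Convergence here is an epidemic-spreading process, taking on the order of n log n interactions in expectation; the paper's infinite-population analysis studies the deterministic mean-field limit of such dynamics.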