Results 11–20 of 201
The structure of multi-neuron firing patterns in primate retina
 Petrusca D, Sher A, Litke AM & Chichilnisky EJ
, 2006
Abstract

Cited by 54 (7 self)
Synchronized firing among neurons has been proposed to constitute an elementary aspect of the neural code in sensory and motor systems. However, it remains unclear how synchronized firing affects the large-scale patterns of activity and redundancy of visual signals in a complete population of neurons. We recorded simultaneously from hundreds of retinal ganglion cells in primate retina, and examined synchronized firing in completely sampled populations of ~50–100 ON-parasol cells, which form a major projection to the magnocellular layers of the lateral geniculate nucleus. Synchronized firing in pairs of cells was a subset of a much larger pattern of activity that exhibited local, isotropic spatial properties. However, a simple model based solely on interactions between adjacent cells reproduced 99% of the spatial structure and scale of synchronized firing. No more than 20% of the variability in firing of an individual cell was predictable from the activity of its neighbors. These results held both for spontaneous firing and in the presence of independent visual modulation of the firing of each cell. In sum, large-scale synchronized firing in the entire population of ON-parasol cells appears to reflect simple neighbor interactions, rather than a unique visual signal or a highly redundant coding scheme.
Thin Junction Trees
 Advances in Neural Information Processing Systems 14
, 2001
Abstract

Cited by 50 (1 self)
We present an algorithm that induces a class of models with thin junction trees: models that are characterized by an upper bound on the size of the maximal cliques of their triangulated graph. By ensuring that the junction tree is thin, inference in our models remains tractable throughout the learning process. This allows both an efficient implementation of an iterative scaling parameter estimation algorithm and also ensures that inference can be performed efficiently with the final model. We illustrate the approach with applications in handwritten digit recognition and DNA splice site detection.
Expectation maximization and posterior constraints
 In Advances in NIPS
, 2007
Abstract

Cited by 49 (11 self)
The expectation maximization (EM) algorithm is a widely used maximum likelihood estimation procedure for statistical models when the values of some of the variables in the model are not observed. Very often, however, our aim is primarily to find a model that assigns values to the latent variables that have intended meaning for our data, and maximizing expected likelihood only sometimes accomplishes this. Unfortunately, it is typically difficult to add even simple a priori information about latent variables in graphical models without making the models overly complex or intractable. In this paper, we present an efficient, principled way to inject rich constraints on the posteriors of latent variables into the EM algorithm. Our method can be used to learn tractable graphical models that satisfy additional, otherwise intractable constraints. Focusing on clustering and the alignment problem for statistical machine translation, we show that simple, intuitive posterior constraints can greatly improve the performance over standard baselines and be competitive with more complex, intractable models.
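A minimal numeric sketch of the posterior-constraint idea above (a hedged illustration, not the paper's algorithm; the function name, the target value c, and the toy posteriors are invented): the KL projection of per-instance posteriors over a binary latent variable onto the set whose average posterior mass on z = 1 equals c takes an exponential-tilting form, q_i(1) ∝ p_i(1)·e^λ, with λ found by bisection.

```python
import numpy as np

def project_posterior(p1, c, tol=1e-10):
    """KL-project posteriors p1[i] = P(z_i = 1 | x_i) so that mean_i q_i(1) = c.

    The projection tilts each posterior: q_i(1) proportional to p1[i] * exp(lam),
    q_i(0) proportional to 1 - p1[i]; lam is chosen by bisection, which works
    because the tilted mean is monotone increasing in lam.
    """
    def tilted_mean(lam):
        w = p1 * np.exp(lam)
        return np.mean(w / (w + (1.0 - p1)))

    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tilted_mean(mid) < c:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    w = p1 * np.exp(lam)
    return w / (w + (1.0 - p1))

# Unconstrained posteriors average 0.6; project them so they average 0.5.
p1 = np.array([0.9, 0.8, 0.1])
q1 = project_posterior(p1, 0.5)
```

In a posterior-regularized EM loop, the M-step would then use the projected q rather than the unconstrained posterior.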
A Bayesian Network Approach to Ontology Mapping
 In: Proceedings ISWC 2005
, 2005
Abstract

Cited by 40 (5 self)
Abstract. This paper presents our ongoing effort on developing a principled methodology for automatic ontology mapping based on BayesOWL, a probabilistic framework we developed for modeling uncertainty in the Semantic Web. In this approach, the source and target ontologies are first translated into Bayesian networks (BN); the concept mapping between the two ontologies is treated as evidential reasoning between the two translated BNs. Probabilities needed for constructing conditional probability tables (CPT) during translation and for measuring semantic similarity during mapping are learned using text classification techniques, where each concept in an ontology is associated with a set of semantically relevant text documents obtained by ontology-guided web mining. The basic ideas of this approach are validated by positive results from computer experiments on two small real-world ontologies.
Sufficient Dimensionality Reduction
 Journal of Machine Learning Research
, 2003
Abstract

Cited by 35 (8 self)
Dimensionality reduction of empirical co-occurrence data is a fundamental problem in unsupervised learning. It is also a well-studied problem in statistics, known as the analysis of cross-classified data. One principled approach to this problem is to represent the data in low dimension with minimal loss of the (mutual) information contained in the original data. In this paper we introduce an information-theoretic nonlinear method for finding such a maximally informative dimension reduction. In contrast with...
The Multiinformation Function As A Tool For Measuring Stochastic Dependence
 Learning in Graphical Models
, 1998
Abstract

Cited by 32 (0 self)
Given a collection of random variables [ξ_i]_{i∈N}, where N is a finite nonempty set, the corresponding multiinformation function ascribes to every subset A ⊆ N the relative entropy of the joint distribution of [ξ_i]_{i∈A} with respect to the product of the distributions of the individual random variables ξ_i for i ∈ A. We argue that it is a useful tool for problems concerning stochastic (conditional) dependence and independence (at least in the discrete case). First, it makes it possible to express the conditional mutual information between [ξ_i]_{i∈A} and [ξ_i]_{i∈B} given [ξ_i]_{i∈C} (for every disjoint A, B, C ⊆ N), which can be considered a good measure of conditional stochastic dependence. Second, one can introduce reasonable measures of dependence of level r among the variables [ξ_i]_{i∈A} (where A ⊆ N, 1 ≤ r < card A) which are expressible by means of the multiinformation function. Third, it enables one to derive theoretical results on (nonexistence of an) axiomatic characterization of stochastic c...
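In the discrete case the definition can be computed directly. The sketch below (the function name and the toy distributions are invented for illustration) evaluates the multiinformation as the relative entropy between a joint pmf and the product of its marginals; it is zero exactly when the variables are independent.

```python
import numpy as np

def multiinformation(joint):
    # Relative entropy D(P || P_1 x ... x P_n) between a joint pmf
    # (one ndarray axis per variable) and the product of its marginals.
    joint = np.asarray(joint, dtype=float)
    prod = np.ones_like(joint)
    for axis in range(joint.ndim):
        # Marginal of variable `axis`: sum out every other axis.
        marginal = joint.sum(axis=tuple(a for a in range(joint.ndim) if a != axis))
        shape = [1] * joint.ndim
        shape[axis] = -1
        prod = prod * marginal.reshape(shape)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / prod[nz])))

# Independent variables give multiinformation 0; dependence makes it positive.
indep = np.outer([0.3, 0.7], [0.6, 0.4])          # product distribution
dep = np.array([[0.5, 0.0], [0.0, 0.5]])          # perfectly correlated bits
```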
Minimax and Minimal Distance Martingale Measures and Their Relationship to Portfolio Optimization
, 2000
Abstract

Cited by 28 (3 self)
In this paper we give a characterization of minimal distance martingale measures with respect to f-divergence distances in a general semimartingale market model. We provide necessary and sufficient conditions for minimal distance martingale measures and determine them explicitly for exponential Lévy processes with respect to several classical distances. It is shown that the minimal distance martingale measures are equivalent to minimax martingale measures with respect to related utility functions, and that optimal portfolios can be characterized by them. Related results in the context of continuous-time diffusion models were first obtained by He and Pearson (1991b) and Karatzas et al. (1991), and in a general semimartingale setting by Kramkov and Schachermayer (1999). Finally, parts of the results are extended to utility-based hedging.
Conditional limit theorems under Markov conditioning
 IEEE Trans. on Information Theory
, 1987
Abstract

Cited by 27 (1 self)
Let X₁, X₂, ... be independent, identically distributed random variables taking values in a finite set X, and consider the conditional joint distribution of the first m elements of the sample X₁, ..., X_n on the condition that X₁ = x₁ and the sliding-block sample average of a function h(·,·) defined on X² exceeds a threshold α > Eh(X₁, X₂). For m fixed and n → ∞, this conditional joint distribution is shown to converge to the m-step joint distribution of a Markov chain started in x₁ which is closest to X₁, X₂, ... in Kullback–Leibler information divergence among all Markov chains whose two-dimensional stationary distribution P(·,·) satisfies Σ_{x,y} P(x,y) h(x,y) ≥ α, provided some distribution P on X² having equal marginals satisfies this constraint with strict inequality. Similar conditional limit theorems are obtained when X₁, X₂, ... is an arbitrary finite-order Markov chain and more general conditioning is allowed.
On Capacities of Quantum Channels
, 1997
Abstract

Cited by 26 (4 self)
Capacities of quantum mechanical channels are defined in terms of mutual information quantities. The geometry of the relative entropy is used to express capacity as a divergence radius. The symmetric quantum spin-1/2 channel and the attenuation channel of Boson fields are discussed as examples. 1. Introduction. A discrete communication system, as modeled by Shannon, is capable of transmitting successively symbols of a finite input alphabet {x₁, x₂, ..., x_m}. In the stochastic approach to the communication model it is assumed that the input symbols show up with certain probabilities. Let p_{ji} be the probability that the symbol x_i is sent over the channel and the output symbol y_j appears at the destination. The joint distribution p_{ji} yields marginal distributions (p₁, p₂, ..., p_m) and (q₁, q₂, ..., q_k) on the sets of input and output symbols, respectively. Shannon introduced the mutual information

I = Σ_{i,j} p_{ji} log( p_{ji} / (p_i q_j) )    (1.1)

in order to measur...
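For a classical channel, the capacity C = max_p I over input distributions p can be computed by the standard Blahut–Arimoto iteration. The sketch below is a hedged illustration of that classical maximization only (it does not cover the quantum case, and the binary symmetric channel used here is a made-up example, not one from the paper):

```python
import numpy as np

def blahut_arimoto(W, iters=500, tol=1e-12):
    """Capacity (in bits) of a discrete memoryless channel.

    W[j, i] = P(output y_j | input x_i); each column of W sums to 1.
    """
    m = W.shape[1]
    p = np.full(m, 1.0 / m)          # start from the uniform input distribution
    d = np.zeros(m)
    for _ in range(iters):
        q = W @ p                    # output marginal q_j = sum_i W[j, i] p_i
        for i in range(m):
            col = W[:, i]
            nz = col > 0
            d[i] = np.sum(col[nz] * np.log2(col[nz] / q[nz]))  # D(W(.|x_i) || q)
        new_p = p * np.power(2.0, d)  # multiplicative update toward the optimum
        new_p /= new_p.sum()
        if np.allclose(new_p, p, atol=tol):
            p = new_p
            break
        p = new_p
    return float(np.sum(p * d))      # at the fixed point this equals max_p I

# Binary symmetric channel with crossover 0.1; capacity is 1 - H(0.1) ~ 0.531 bits.
W = np.array([[0.9, 0.1],
              [0.1, 0.9]])
C = blahut_arimoto(W)
```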
Kullback-Leibler approximation of spectral density functions
 IEEE Trans. Inform. Theory
, 2003
Abstract

Cited by 26 (15 self)
Abstract—We introduce a Kullback–Leibler-type distance between spectral density functions of stationary stochastic processes and solve the problem of optimal approximation of a given spectral density Ψ by one that is consistent with prescribed second-order statistics. In general, such statistics are expressed as the state covariance of a linear filter driven by a stochastic process whose spectral density is sought. In this context, we show i) that there is a unique spectral density Φ which minimizes this Kullback–Leibler distance, ii) that this optimal approximant is of the form Ψ/Q, where the "correction term" Q is a rational spectral density function, and iii) that the coefficients of Q can be obtained numerically by solving a suitable convex optimization problem. In the special case where Ψ = 1, the convex functional becomes quadratic and the solution is then specified by linear equations. Index Terms—Approximation of power spectra, cross-entropy minimization, Kullback–Leibler distance, mutual information, optimization, spectral estimation.
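Numerically, a divergence of this Kullback–Leibler type between two spectral densities sampled on a frequency grid can be sketched as follows. This is a hedged illustration using the extended KL form for unnormalized positive densities; the paper's exact functional and normalization may differ, and both test spectra are made up:

```python
import numpy as np

# Frequency grid over [-pi, pi); the grid mean approximates (1/2pi) * integral.
theta = np.linspace(-np.pi, np.pi, 4096, endpoint=False)

def kl_spectral(psi, phi):
    """Extended Kullback-Leibler divergence between positive spectral densities.

    Approximates (1/2pi) * integral of psi*log(psi/phi) - psi + phi over the
    grid; it is nonnegative and zero exactly when psi == phi.
    """
    return float(np.mean(psi * np.log(psi / phi) - psi + phi))

flat = np.ones_like(theta)           # white-noise spectrum, Psi = 1
bumpy = 1.0 + 0.5 * np.cos(theta)    # a made-up smooth positive spectrum
```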