CiteSeerX

Results 1 - 10 of 6,739

A fast learning algorithm for deep belief nets

by Geoffrey E. Hinton, Simon Osindero - Neural Computation, 2006
"... We show how to use “complementary priors” to eliminate the explaining away effects that make inference difficult in densely-connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a ..."
Cited by 970 (49 self)
very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modelled by long ravines in the free

Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning

by Richard S. Sutton, Doina Precup, Satinder Singh, 1999
"... Learning, planning, and representing knowledge at multiple levels of temporal abstraction are key, longstanding challenges for AI. In this paper we consider how these challenges can be addressed within the mathematical framework of reinforcement learning and Markov decision processes (MDPs). We exte ..."
Cited by 569 (38 self)

Matching words and pictures

by Kobus Barnard, Pinar Duygulu, David Forsyth, Nando De Freitas, David M. Blei, Michael I. Jordan - Journal of Machine Learning Research, 2003
"... We present a new approach for modeling multi-modal data sets, focusing on the specific case of segmented images with associated text. Learning the joint distribution of image regions and words has many applications. We consider in detail predicting words associated with whole images (auto-annotation ..."
Cited by 665 (40 self)

Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network

by Kristina Toutanova, Dan Klein, Christopher D. Manning, Yoram Singer - In Proceedings of HLT-NAACL, 2003
"... We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective ..."
Cited by 693 (23 self)

Space-time block codes from orthogonal designs

by Vahid Tarokh, Hamid Jafarkhani, A. R. Calderbank - IEEE Trans. Inform. Theory, 1999
"... We introduce space–time block coding, a new paradigm for communication over Rayleigh fading channels using multiple transmit antennas. Data is encoded using a space–time block code and the encoded data is split into n streams which are simultaneously transmitted using n transmit antennas. ..."
Cited by 1524 (42 self)
The received signal at each receive antenna is a linear superposition of the n transmitted signals perturbed by noise. Maximum-likelihood decoding is achieved in a simple way through decoupling of the signals transmitted from different antennas rather than joint detection. This uses the orthogonal structure

Using Bayesian networks to analyze expression data

by Nir Friedman, Michal Linial, Iftach Nachman - Journal of Computational Biology, 2000
"... DNA hybridization arrays simultaneously measure the expression level for thousands of genes. These measurements provide a “snapshot” of transcription levels within the cell. A major challenge in computational biology is to uncover, from such measurements, gene/protein interactions and key biologica ..."
Cited by 1088 (17 self)
of joint multivariate probability distributions that captures properties of conditional independence between variables. Such models are attractive for their ability to describe complex stochastic processes and because they provide a clear methodology for learning from (noisy) observations. We start

The information bottleneck method

by Naftali Tishby, Fernando C. Pereira, William Bialek, 1999
"... We define the relevant information in a signal x ∈ X as being the information that this signal provides about another signal y ∈ Y. Examples include the information that face images provide about the names of the people portrayed, or the information that speech sounds provide about the words spoken. ..."
Cited by 540 (35 self)
about Y through a ‘bottleneck’ formed by a limited set of codewords X̃. This constrained optimization problem can be seen as a generalization of rate distortion theory in which the distortion measure d(x, x̃) emerges from the joint statistics of X and Y. This approach yields an exact set of self

Support vector machine learning for interdependent and structured output spaces

by Ioannis Tsochantaridis, Thomas Hofmann, Thorsten Joachims, Yasemin Altun - In ICML, 2004
"... Learning general functional dependencies is one of the main goals in machine learning. Recent progress in kernel-based methods has focused on designing flexible and powerful input representations. This paper addresses the complementary issue of problems involving complex outputs such as multiple depe ..."
Cited by 450 (20 self)
dependent output variables and structured output spaces. We propose to generalize multiclass Support Vector Machine learning in a formulation that involves features extracted jointly from inputs and outputs. The resulting optimization problem is solved efficiently by a cutting plane algorithm that exploits

TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object . . .

by J. Shotton, J. Winn, C. Rother, A. Criminisi - In ECCV, 2006
"... This paper proposes a new approach to learning a discriminative model of object classes, incorporating appearance, shape and context information efficiently. The learned model is used for automatic visual recognition and semantic segmentation of photographs. Our discriminative model exploits nov ..."
Cited by 426 (17 self)

A Neural Probabilistic Language Model

by Yoshua Bengio, Réjean Ducharme, Pascal Vincent, Christian Jauvin - Journal of Machine Learning Research, 2003
"... A goal of statistical language modeling is to learn the joint probability function of sequences of words in a language. This is intrinsically difficult because of the curse of dimensionality: a word sequence on which the model will be tested is likely to be different from all the word sequences seen ..."
Cited by 447 (19 self)

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University