Results 1 - 10 of 3,052
Option Discovery in Hierarchical Reinforcement Learning for Training Large Factor Graphs for Information Extraction
, 2009
"... Since exact training and inference is not possible for most factor graphs, a number of techniques have been proposed to train models approximately, but they do not scale to large factor graphs used in recent work on joint inference on multiple information extraction tasks. SampleRank is an MCMC ba ..."
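The snippet is cut off, but the SampleRank idea it names can be sketched as a pairwise-ranking update applied between consecutive MCMC samples; the proposal, feature, and objective functions below are hypothetical placeholders, and the paper's exact update and acceptance rules differ in their details.

def samplerank_step(theta, y, propose, features, objective, lr=0.1):
    # One SampleRank-style update (a rough sketch, not the authors' exact rule).
    # `theta` and `features(...)` are NumPy vectors; `propose`, `features`, and
    # `objective` are placeholder callables supplied by the model.
    y_new = propose(y)                        # a small local change to the configuration
    phi_old, phi_new = features(y), features(y_new)
    model_prefers_new = theta @ phi_new > theta @ phi_old
    truth_prefers_new = objective(y_new) > objective(y)
    if model_prefers_new != truth_prefers_new:
        # Perceptron-style correction toward the configuration the objective prefers.
        better, worse = (phi_new, phi_old) if truth_prefers_new else (phi_old, phi_new)
        theta = theta + lr * (better - worse)
    # Here the chain greedily follows the objective; other acceptance rules are used too.
    return theta, (y_new if truth_prefers_new else y)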
A distributed, developmental model of word recognition and naming
 PSYCHOLOGICAL REVIEW
, 1989
"... A parallel distributed processing model of visual word recognition and pronunciation is described. The model consists of sets of orthographic and phonological units and an interlevel of hidden units. Weights on connections between units were modified during a training phase using the backpropagatio ..."
Cited by 706 (49 self)
is simulated without pronunciation rules, and lexical decisions are simulated without accessing word-level representations. The performance of the model is largely determined by three factors: the nature of the input, a significant fragment of written English; the learning rule, which encodes the implicit
Cumulated Gain-based Evaluation of IR Techniques
 ACM Transactions on Information Systems
, 2002
"... Modem large retrieval environments tend to overwhelm their users by their large output. Since all documents are not of equal relevance to their users, highly relevant documents should be identified and ranked first for presentation to the users. In order to develop IR techniques to this direction, i ..."
Cited by 694 (3 self)
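The cumulated-gain measures the title refers to can be written down in a few lines; the log2 discount below is one common formulation (ranks 1 and 2 are not discounted) and the example relevance scores are invented.

import math

def dcg(relevances):
    # Discounted cumulated gain: each document's gain is divided by a log2
    # discount that grows with its rank.
    return sum(rel / math.log2(max(rank, 2)) for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    # Normalize by the DCG of the ideal ordering (documents sorted by relevance).
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Invented graded relevance judgments of the top-ranked documents, in ranked order.
print(ndcg([3, 2, 3, 0, 1, 2]))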
Variational algorithms for approximate Bayesian inference
, 2003
"... The Bayesian framework for machine learning allows for the incorporation of prior knowledge in a coherent way, avoids overfitting problems, and provides a principled basis for selecting between alternative models. Unfortunately the computations required are usually intractable. This thesis presents ..."
Cited by 440 (9 self)
theorems are presented to pave the road for automated VB derivation procedures in both directed and undirected graphs (Bayesian and Markov networks, respectively). Chapters 3-5 derive and apply the VB EM algorithm to three commonly-used and important models: mixtures of factor analysers, linear dynamical
Large margin dags for multiclass classification
 Advances in Neural Information Processing Systems 12
, 2000
"... We present a new learning architecture: the Decision Directed Acyclic Graph (DDAG), which is used to combine many twoclass classifiers into a multiclass classifier. For anclass problem, the DDAG contains � classifiers, one for each pair of classes. We present a VC analysis of the case when the nod ..."
Cited by 374 (1 self)
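A minimal sketch of how a trained DDAG classifies a point, assuming a dictionary of pairwise classifiers keyed by (earlier, later) class in a fixed class order; the node classifiers in the paper are SVMs, but any pairwise decision function works here.

def ddag_predict(x, classes, pairwise):
    # Evaluate a Decision DAG: each node applies one pairwise classifier and
    # eliminates the losing class, so an N-class decision needs only N-1 node
    # evaluations even though N(N-1)/2 classifiers were trained.
    # `pairwise[(a, b)]` is any callable returning a or b for input x, keyed by
    # classes in the order they appear in `classes` (a before b).
    remaining = list(classes)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        winner = pairwise[(a, b)](x)
        if winner == a:
            remaining.pop()      # b loses at this node and is eliminated
        else:
            remaining.pop(0)     # a loses at this node and is eliminated
    return remaining[0]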
A general approximation technique for constrained forest problems
 SIAM J. COMPUT.
, 1995
"... We present a general approximation technique for a large class of graph problems. Our technique mostly applies to problems of covering, at minimum cost, the vertices of a graph with trees, cycles, or paths satisfying certain requirements. In particular, many basic combinatorial optimization proble ..."
Cited by 414 (21 self)
Learning from Labeled and Unlabeled Data using Graph Mincuts
, 2001
"... Many application domains suffer from not having enough labeled training data for learning. However, large amounts of unlabeled examples can often be gathered cheaply. As a result, there has been a great deal of work in recent years on how unlabeled data can be used to aid classification. We consi ..."
Cited by 334 (5 self)
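A minimal sketch of the graph-mincut idea in the spirit of this paper, assuming networkx is available; how the similarity edges and their capacities are chosen is left abstract here, and that choice is where the real modelling effort goes.

import networkx as nx

def mincut_label(similar_pairs, positives, negatives):
    # Tie the labeled examples to a source and sink with uncapacitated
    # (infinite-capacity) edges, connect similar examples with unit-capacity
    # edges, and let the minimum s-t cut split the unlabeled examples into a
    # positive side and a negative side.
    g = nx.DiGraph()
    for u, v in similar_pairs:
        g.add_edge(u, v, capacity=1.0)
        g.add_edge(v, u, capacity=1.0)
    for p in positives:
        g.add_edge('SOURCE', p)          # no capacity attribute -> infinite capacity
    for n in negatives:
        g.add_edge(n, 'SINK')
    _, (source_side, sink_side) = nx.minimum_cut(g, 'SOURCE', 'SINK')
    return source_side - {'SOURCE'}, sink_side - {'SINK'}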
Online learning for matrix factorization and sparse coding
, 2010
"... Sparse coding—that is, modelling data vectors as sparse linear combinations of basis elements—is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on the largescale matrix factorization problem that consists of learning the basis set in order to ad ..."
Cited by 330 (31 self)
, which scales up gracefully to large data sets with millions of training samples, and extends naturally to various matrix factorization formulations, making it suitable for a wide range of learning problems. A proof of convergence is presented, along with experiments with natural images and genomic data
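scikit-learn's MiniBatchDictionaryLearning follows an online dictionary-learning scheme of this kind; a minimal usage sketch on random stand-in data (the patch size, number of atoms, and sparsity penalty below are arbitrary).

import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

# Random data standing in for image patches: rows are samples, columns are pixels.
X = np.random.RandomState(0).randn(1000, 64)

# Learn a 100-atom dictionary online, in mini-batches, with an L1 sparsity penalty,
# then encode every sample as a sparse combination of the learned atoms.
dico = MiniBatchDictionaryLearning(n_components=100, alpha=1.0, batch_size=32,
                                   random_state=0)
codes = dico.fit(X).transform(X)
print(dico.components_.shape, codes.shape)   # (100, 64) (1000, 100)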
An Introduction to Factor Graphs
 IEEE SIGNAL PROCESSING MAG., JAN. 2004
, 2004
"... A large variety of algorithms in coding, signal processing, and artificial intelligence may be viewed as instances of the summaryproduct algorithm (or belief/probability ..."
Cited by 197 (34 self)
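As a rough illustration of the summary-product idea, here is a toy two-variable factor graph where the marginal of one variable is obtained by multiplying the incoming factors and summing out ("summarizing over") the other variable; the factor values are invented.

import numpy as np

# A toy factor graph over two binary variables, x1 -- f12 -- x2, with unary
# factors f1(x1), f2(x2) and a pairwise factor f12(x1, x2).
f1 = np.array([0.6, 0.4])
f2 = np.array([0.3, 0.7])
f12 = np.array([[0.9, 0.1],
                [0.2, 0.8]])

# Summary-product message sent toward x2: multiply the factors on x1's side,
# then sum out x1.
msg_to_x2 = (f1[:, None] * f12).sum(axis=0)

# The marginal of x2 is its own unary factor times the incoming message, normalized.
p_x2 = f2 * msg_to_x2
p_x2 /= p_x2.sum()
print(p_x2)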
Efficient SVM training using low-rank kernel representations
 Journal of Machine Learning Research
, 2001
"... SVM training is a convex optimization problem which scales with the training set size rather than the feature space dimension. While this is usually considered to be a desired quality, in large scale problems it may cause training to be impractical. The common techniques to handle this difficulty ba ..."
Cited by 240 (3 self)
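The paper works with a low-rank factorization of the kernel matrix; as a stand-in illustration (not the paper's exact factorization), a Nyström feature map followed by a linear SVM achieves the same effect of avoiding the full n x n kernel matrix. The data and parameters below are invented.

import numpy as np
from sklearn.kernel_approximation import Nystroem
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)
X = rng.randn(5000, 20)
y = (X[:, 0] * X[:, 1] > 0).astype(int)    # a toy, non-linearly-separable target

# Instead of solving the kernel SVM QP against the full 5000 x 5000 kernel matrix,
# map the data through a rank-200 approximate kernel feature map and train a
# linear SVM in that low-rank feature space.
model = make_pipeline(
    Nystroem(kernel='rbf', gamma=0.2, n_components=200, random_state=0),
    LinearSVC(C=1.0),
)
model.fit(X, y)
print(model.score(X, y))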