Regularized adaptation: Theory, algorithms and applications (2007)

by X Li
Venue: University of Washington

Results 1 - 10 of 14

Tabula rasa: Model transfer for object category detection

by Yusuf Aytar, Andrew Zisserman - In Proc. ICCV, 2011
"... Our objective is transfer training of a discriminatively trained object category detector, in order to reduce the number of training images required. To this end we propose three transfer learning formulations where a template learnt previously for other categories is used to regularize the training ..."
Abstract - Cited by 56 (1 self) - Add to MetaCart
Our objective is transfer training of a discriminatively trained object category detector, in order to reduce the number of training images required. To this end we propose three transfer learning formulations where a template learnt previously for other categories is used to regularize the training of a new category. All the formulations result in convex optimization problems. Experiments (on PASCAL VOC) demonstrate significant performance gains by transfer learning from one class to another (e.g. motorbike to bicycle), including one-shot learning, specialization from class to a subordinate class (e.g. from quadruped to horse) and transfer using multiple components. In the case of multiple training samples it is shown that a detection performance approaching that of the state of the art can be achieved with substantially fewer training samples.

Citation Context

... determined by the source category for best results in the case of one-shot learning. Related work. Model-based transfer learning, originally developed in the machine learning literature as adaptation [14, 27], has been applied to computer vision primarily for image classification [17, 22, 23, 27], rather than detection. The work closest to ours is that of Tommasi et al. [22, 23] who also use a discriminat...
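
As a hedged illustration of the kind of convex, template-regularized objective the abstract describes (the notation is generic, not one of the paper's three formulations): with w_src the template learnt previously for other categories, and Γ and C illustrative hyperparameters,

    \min_{w} \; \tfrac{1}{2}\,\lVert w - \Gamma\, w_{\mathrm{src}} \rVert^{2} \;+\; C \sum_{i} \max\bigl(0,\; 1 - y_{i}\, w^{\top} x_{i}\bigr)

Setting Γ = 0 recovers an ordinary linear SVM; a larger Γ pulls the new detector toward the source template, which is what makes one-shot learning from a handful of target images feasible.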

Efficient learning of domain-invariant image representations

by Judy Hoffman, Erik Rodner, Trevor Darrell, Jeff Donahue, Kate Saenko - In Proc. ICLR
"... We present an algorithm that learns representations which explicitly compensate for domain mismatch and which can be efficiently realized as linear classifiers. Specifically, we form a linear transformation that maps features from the target (test) domain to the source (training) domain as part of t ..."
Abstract - Cited by 25 (10 self) - Add to MetaCart
We present an algorithm that learns representations which explicitly compensate for domain mismatch and which can be efficiently realized as linear classifiers. Specifically, we form a linear transformation that maps features from the target (test) domain to the source (training) domain as part of training the classifier. We optimize both the transformation and classifier parameters jointly, and introduce an efficient cost function based on misclassification loss. Our method combines several features previously unavailable in a single algorithm: multi-class adaptation through representation learning, ability to map across heterogeneous feature spaces, and scalability to large datasets. We present experiments on several image datasets that demonstrate improved accuracy and computational advantages compared to previous approaches.

Citation Context

...they are efficient and prevalent in vision applications, with fast linear SVMs forming the core of some of the most popular object detection methods [6, 7]. Previous work proposed to adapt linear SVMs [8, 9, 10], learning a perturbation of the source hyperplane by minimizing the classification error on labeled target examples for each binary task. These perturbations can be thought of as new feature represen...
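
As a minimal sketch of the joint optimization the abstract describes, assuming a binary task, a logistic surrogate for the misclassification loss, and plain batch gradient descent (the function and parameter names are illustrative, not the authors' algorithm, which also handles multi-class problems and heterogeneous feature spaces):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adapt(Xs, ys, Xt, yt, lr=0.1, reg=1e-2, iters=500):
    """Jointly learn a source-space classifier theta and a linear
    target->source feature map W by gradient descent on a logistic loss.
    Labels ys, yt are 0/1 arrays; Xs is (n_s, d_s), Xt is (n_t, d_t)."""
    n_s, d_s = Xs.shape
    n_t, d_t = Xt.shape
    rng = np.random.default_rng(0)
    theta = np.zeros(d_s)                        # classifier in source space
    W = rng.normal(scale=0.01, size=(d_s, d_t))  # maps target -> source
    for _ in range(iters):
        # Source examples feed the classifier directly.
        ps = sigmoid(Xs @ theta)
        g_theta = Xs.T @ (ps - ys) / n_s + reg * theta
        # Target examples are first mapped into the source space by W.
        Zt = Xt @ W.T
        pt = sigmoid(Zt @ theta)
        err = (pt - yt) / n_t
        g_theta += Zt.T @ err
        g_W = np.outer(theta, err @ Xt) + reg * W  # chain rule through W
        theta -= lr * g_theta
        W -= lr * g_W
    return theta, W

At test time a target point x is scored as sigmoid(theta @ (W @ x)), so the learned representation W x is what compensates for the domain mismatch.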

Natural Language Processing Tools for Reading Level Assessment and Text Simplification for . . .

by Sarah E. Petersen, 2007
"... ..."
Abstract - Cited by 10 (0 self) - Add to MetaCart
Abstract not found

RATIO SEMI-DEFINITE CLASSIFIERS

by Jonathan Malkin, Jeff Bilmes
"... We present a novel classification model that is formulated as a ratio of semi-definite polynomials. We derive an efficient learning algorithm for this classifier, and apply it to two separate phoneme classification corpora. Results show that our disciminatively trained model can achieve accuracies c ..."
Abstract - Cited by 7 (6 self) - Add to MetaCart
We present a novel classification model that is formulated as a ratio of semi-definite polynomials. We derive an efficient learning algorithm for this classifier, and apply it to two separate phoneme classification corpora. Results show that our discriminatively trained model can achieve accuracies comparable with state-of-the-art techniques such as multi-layer perceptrons, but does not possess the overconfident bias often found in models based on ratios of exponentials. Index Terms — Pattern recognition, Speech recognition

Citation Context

...arization and Penalty One side-effect of requiring summation to identity as in [5] is implicit regularization. Since regularization has been shown to be important for many machine learning algorithms [18, 3, 12], we have added regularization to our model, along with the aforementioned penalty, both of which we will explain here. Our full objective function is \max_{\Theta} \sum_i \log p(y_i \mid x_i) - \lambda_B \sum_k \lVert B_k \rVert_F^2 - \lambda_d \sum_k ...

Semi-supervised domain adaptation with instance constraints

by Jeff Donahue, Judy Hoffman, Erik Rodner, Kate Saenko, Trevor Darrell - in IEEE International Conference on Computer Vision, 2013
"... Most successful object classification and detection meth-ods rely on classifiers trained on large labeled datasets. However, for domains where labels are limited, simply bor-rowing labeled data from existing datasets can hurt per-formance, a phenomenon known as “dataset bias. ” We propose a general ..."
Abstract - Cited by 6 (2 self) - Add to MetaCart
Most successful object classification and detection methods rely on classifiers trained on large labeled datasets. However, for domains where labels are limited, simply borrowing labeled data from existing datasets can hurt performance, a phenomenon known as “dataset bias.” We propose a general framework for adapting classifiers from “borrowed” data to the target domain using a combination of available labeled and unlabeled examples. Specifically, we show that imposing smoothness constraints on the classifier scores over the unlabeled data can lead to improved adaptation results. Such constraints are often available in the form of instance correspondences, e.g. when the same object or individual is observed simultaneously from multiple views, or tracked between video frames. In these cases, the object labels are unknown but can be constrained to be the same or similar. We propose techniques that build on existing domain adaptation methods by explicitly modeling these relationships, and demonstrate empirically that they improve recognition accuracy in two scenarios, multi-category image classification and object detection in video.

Citation Context

...e been popular. These include simple methods such as a weighted combination of source and target SVMs; transductive SVMs applied to adaptation [2]; the feature replication method of [6]; Adaptive SVM [20, 27], where the source model parameters are adapted by adding a perturbation function; and its successor PMT-SVM [1]. The general idea behind Adaptive SVMs is to learn the target classifier f(x) as a pert...
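
A minimal sketch of that perturbation idea, assuming a linear binary SVM, labels in {-1, +1}, and plain subgradient descent (the names and the training loop are illustrative, not the exact A-SVM procedure from [20, 27]):

import numpy as np

def adaptive_svm(w_src, Xt, yt, C=1.0, lr=0.01, iters=1000):
    """Learn a perturbation dw so that f(x) = (w_src + dw) . x fits the
    labeled target data, while the ||dw||^2 term keeps the adapted model
    close to the source hyperplane. Minimizes
        0.5 * ||dw||^2 + (C / n) * sum_i max(0, 1 - y_i * f(x_i))
    by subgradient descent; yt must be in {-1, +1}."""
    dw = np.zeros_like(w_src, dtype=float)
    n = len(yt)
    for _ in range(iters):
        margins = yt * (Xt @ (w_src + dw))
        active = margins < 1.0  # points violating the margin
        g = dw - (C / n) * (yt[active][:, None] * Xt[active]).sum(axis=0)
        dw -= lr * g
    return w_src + dw

With no labeled target data the classifier stays exactly at w_src; with more target data the hinge term dominates and the model moves toward a classifier trained purely on the target domain.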

On the semi-supervised learning of multi-layered perceptrons

by Jonathan Malkin, Amarnag Subramanya, Jeff Bilmes - In Proc. Annual Conference of the International Speech Communication Association (INTERSPEECH), 2009
"... We present a novel approach for training a multi-layered perceptron (MLP) in a semi-supervised fashion. Our objective function, when optimized, balances training set accuracy with fidelity to a graph-based manifold over all points. Additionally, the objective favors smoothness via an entropy regular ..."
Abstract - Cited by 3 (1 self) - Add to MetaCart
We present a novel approach for training a multi-layered perceptron (MLP) in a semi-supervised fashion. Our objective function, when optimized, balances training set accuracy with fidelity to a graph-based manifold over all points. Additionally, the objective favors smoothness via an entropy regularizer over classifier outputs as well as straightforward ℓ2 regularization. Our approach also scales well enough to enable large-scale training. The results demonstrate significant improvement on several phone classification tasks over baseline MLPs. Index Terms: semi-supervised learning, neural networks, phone classification

Citation Context

... manifold [1, 19] embedded in the feature space. (The corpus is freely available online: web search for “Vocal Joystick vowel corpus”.) We use the same training, development and test sets specified in [14, 13]. For VJ, the training, development and test sets had roughly 220k, 41k and 90k samples, respectively. For TIMIT, those numbers were 1.4M, 124k and 515k, respectively. For TIMIT, the development set w...
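
Read together with the abstract above, the objective combines four terms: labeled-set accuracy, graph-manifold fidelity, an entropy regularizer, and ℓ2 regularization. A hedged sketch of such a composite loss, given precomputed classifier output distributions (all names, coefficients, and the sign convention on the entropy term are assumptions here, not the paper's exact objective):

import numpy as np

def ssl_objective(P, y_lab, lab_idx, W_graph, theta,
                  alpha=1.0, beta=0.1, gamma=1e-4):
    """Composite semi-supervised loss. P is (n, K): classifier output
    distributions for all n points; y_lab holds integer labels for the
    points indexed by lab_idx; W_graph is an (n, n) nonnegative
    similarity matrix; theta is a flat vector of MLP weights."""
    eps = 1e-12
    # 1) training-set accuracy: cross-entropy on the labeled points
    ce = -np.mean(np.log(P[lab_idx, y_lab] + eps))
    # 2) fidelity to the graph manifold: outputs of strongly
    #    connected points should agree
    i, j = np.nonzero(W_graph)
    smooth = np.sum(W_graph[i, j] * np.sum((P[i] - P[j]) ** 2, axis=1))
    # 3) entropy regularizer over the classifier outputs; minimizing
    #    negative entropy favors smooth, non-peaked outputs
    #    (this sign convention is an assumption)
    neg_ent = np.mean(np.sum(P * np.log(P + eps), axis=1))
    # 4) straightforward l2 regularization on the weights
    l2 = np.sum(theta ** 2)
    return ce + alpha * smooth + beta * neg_ent + gamma * l2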

MULTI-LAYER RATIO SEMI-DEFINITE CLASSIFIERS

by Jonathan Malkin, Jeff Bilmes
"... We develop a novel extension to the Ratio Semi-definite Classifier, a discriminative model formulated as a ratio of semi-definite polynomials. By adding a hidden layer to the model, we can efficiently train the model, while achieving higher accuracy than the original version. Results on artificial 2 ..."
Abstract - Cited by 3 (2 self) - Add to MetaCart
We develop a novel extension to the Ratio Semi-definite Classifier, a discriminative model formulated as a ratio of semi-definite polynomials. By adding a hidden layer to the model, we can efficiently train the model, while achieving higher accuracy than the original version. Results on artificial 2-D data as well as two separate phone classification corpora show that our multi-layer model still avoids the overconfidence bias found in models based on ratios of exponentials, while remaining competitive with state-of-the-art techniques such as multi-layer perceptrons. Index Terms — Pattern recognition, Speech recognition

Citation Context

...s a vector of ones. By appending an extra dimension with a constant 1 onto input vector x, this handles the input-to-hidden layer biases as well. 2.2. Regularization and Penalty We add regularization [2, 9] terms to the training objective via the Frobenius norm of the matrices Bk, the L2 norm of the shifts dk, and the Frobenius norm of the weight matrix W. We use regularization coefficients λB, λd and ...
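
The three penalty terms named in this excerpt are easy to state concretely. A small sketch, assuming squared norms and a third coefficient written here as lam_W (the excerpt truncates before naming it):

import numpy as np

def regularization_penalty(B_list, d_list, W, lam_B, lam_d, lam_W):
    """Penalty terms named in the excerpt: squared Frobenius norms of the
    matrices B_k, squared L2 norms of the shift vectors d_k, and the
    squared Frobenius norm of the weight matrix W."""
    pen_B = lam_B * sum(np.linalg.norm(B, "fro") ** 2 for B in B_list)
    pen_d = lam_d * sum(float(np.dot(d, d)) for d in d_list)
    pen_W = lam_W * np.linalg.norm(W, "fro") ** 2
    return pen_B + pen_d + pen_W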

Graphical Models for Integrating Syllabic Information

by Chris D. Bartels, Jeff A. Bilmes, 2009
"... We present graphical models that enhance a speech recognizer with information about syllabic segmentations. The segmentations are specified by locations of syllable nuclei, and the graphical models are able to use these locations to specify a “soft ” segmentation of the speech data. The graphs give ..."
Abstract - Cited by 2 (0 self) - Add to MetaCart
We present graphical models that enhance a speech recognizer with information about syllabic segmentations. The segmentations are specified by locations of syllable nuclei, and the graphical models are able to use these locations to specify a “soft” segmentation of the speech data. The graphs give improved discrimination between speech and noise when compared to a baseline model. When using locations derived from oracle information, an overall improvement is obtained, and when the oracle syllable nuclei are augmented with information about lexical stress, additional improvements are given over locations alone.

Citation Context

...ical stress, in time center of each syllable ... development set. This gave 200 hidden nodes and a 17 frame input window. The neural network training and decoding was performed using RegNet from Xiao Li [Li, 2007]. For the second ANN, an existing set of publicly available networks was used. A detailed description of the networks can be found in Frankel et al. [2007]. They were trained on 2000 hours of Fisher ...

A semi-supervised learning algorithm for multi-layered perceptrons

by Jonathan Malkin, Amarnag Subramanya, Jeff Bilmes, 2009
"... We address the issue of learning multi-layered perceptrons (MLPs) in a discriminative, inductive, multiclass, parametric, and semi-supervised fashion. We introduce a novel objective function that, when optimized, simultane-ously encourages 1) accuracy on the labeled points, 2) respect for an underly ..."
Abstract - Cited by 1 (1 self) - Add to MetaCart
We address the issue of learning multi-layered perceptrons (MLPs) in a discriminative, inductive, multiclass, parametric, and semi-supervised fashion. We introduce a novel objective function that, when optimized, simultaneously encourages 1) accuracy on the labeled points, 2) respect for an underlying graph-represented manifold on all points, 3) smoothness via an entropic regularizer of the classifier outputs, and 4) simplicity via an ℓ2 regularizer. Our approach provides a simple, elegant, and computationally efficient way to bring the benefits of semi-supervised learning (and what is typically an enormous amount of unlabeled training data) to MLPs, which are one of the most widely used pattern classifiers in practice. Our objective has the property that efficient learning is possible using stochastic gradient descent even on large datasets. Results demonstrate significant improvements compared both to a baseline supervised MLP, and also to a previous non-parametric manifold-regularized reproducing kernel Hilbert space classifier.

Citation Context

...vowel space. As a result, the notion that this real-world data lies on a manifold [2, 24, 28] is supported by phonetic theory as well. We use the same training, development and test sets specified in [22, 21]. TIMIT is a standard corpus for phonetic classification consisting of phonetically balanced read English sentences. In contrast with the VJ corpus, TIMIT has many more classes; we used the standard 3...

Scalable Graph-based Learning Applied to Human Language Technology

by Andrei Alexandrescu, 2009
"... ..."
Abstract - Add to MetaCart
Abstract not found