
Results 1 - 10 of 3,137

A Study on L2-Loss (Squared Hinge-Loss) Multi-Class SVM

by Ching-Pei Lee, Chih-Jen Lin
"... Crammer and Singer’s method is one of the most popular multi-class SVMs. It considers L1 loss (hinge loss) in a complicated optimization problem. In SVM, squared hinge loss (L2 loss) is a common alternative to L1 loss, but surprisingly we have not seen any paper studying details of Crammer and Singe ..."
Abstract
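The distinction the paper studies is between the two per-example penalties. As a minimal sketch (illustrative names and shapes, not the authors' code), the Crammer-Singer slack and its L1 versus L2 penalty can be computed as:

import numpy as np

# Illustrative sketch of the Crammer-Singer multi-class slack and its
# L1 (hinge) versus L2 (squared hinge) penalties; W, X, y are assumed names.
def crammer_singer_losses(W, X, y):
    """W: (k, d) class weights, X: (n, d) examples, y: (n,) labels in {0..k-1}."""
    scores = X @ W.T                                # (n, k) class scores
    true = scores[np.arange(len(y)), y]             # score of the correct class
    margins = scores - true[:, None] + 1.0          # margin violations
    margins[np.arange(len(y)), y] = 0.0             # no penalty for the true class
    xi = np.maximum(0.0, margins.max(axis=1))       # per-example slack
    return xi.sum(), (xi ** 2).sum()                # L1 loss vs. L2 loss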

Distance metric learning for large margin nearest neighbor classification

by Kilian Q. Weinberger, John Blitzer, Lawrence K. Saul - In NIPS, 2006
"... We show how to learn a Mahalanobis distance metric for k-nearest neighbor (kNN) classification by semidefinite programming. The metric is trained with the goal that the k-nearest neighbors always belong to the same class while examples from different classes are separated by a large margin. On seven ..."
Abstract - Cited by 695 (14 self)
convex optimization based on the hinge loss. Unlike learning in SVMs, however, our framework requires no modification or extension for problems in multiway (as opposed to binary) classification.
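A toy rendering of the objective described here (assumed variable names, not the authors' implementation): target neighbors are pulled together while differently labeled points are pushed past a unit margin via a hinge penalty, under a Mahalanobis metric M.

# Toy LMNN-style objective: pull target neighbors close, push impostors
# beyond a unit margin; M is a positive semidefinite Mahalanobis matrix,
# target_neighbors maps each index i to its list of same-class neighbors.
def lmnn_objective(M, X, y, target_neighbors, mu=0.5):  # mu is an assumed trade-off
    d2 = lambda i, j: (X[i] - X[j]) @ M @ (X[i] - X[j])
    pull = sum(d2(i, j) for i in range(len(X)) for j in target_neighbors[i])
    push = sum(max(0.0, 1.0 + d2(i, j) - d2(i, l))      # hinge on the margin
               for i in range(len(X)) for j in target_neighbors[i]
               for l in range(len(X)) if y[l] != y[i])
    return (1 - mu) * pull + mu * push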

Greedy Function Approximation: A Gradient Boosting Machine

by Jerome H. Friedman - Annals of Statistics, 2000
"... Function approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest{descent minimization. A general gradient{descent \boosting" paradigm is developed for additi ..."
Abstract - Cited by 1000 (13 self)
for additive expansions based on any fitting criterion. Specific algorithms are presented for least-squares, least-absolute-deviation, and Huber-M loss functions for regression, and multi-class logistic likelihood for classification. Special enhancements are derived for the particular case where the individual
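For the least-squares case, the gradient-boosting recipe reduces to repeatedly fitting the current residuals. A minimal sketch, using scikit-learn stumps purely for illustration:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Minimal least-squares gradient boosting: each stage fits a small tree to
# the residuals, i.e. the negative gradient of squared-error loss in
# function space, then takes a damped steepest-descent step.
def ls_boost(X, y, n_stages=100, lr=0.1):
    f0 = y.mean()                                   # constant initial fit F_0
    f, stages = np.full(len(y), f0), []
    for _ in range(n_stages):
        residuals = y - f                           # negative gradient of (1/2)(y - F)^2
        tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
        f += lr * tree.predict(X)                   # damped descent step
        stages.append(tree)
    return f0, stages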

The Performance of TCP/IP for Networks with High Bandwidth-Delay Products and Random Loss.

by T. V. Lakshman, Upamanyu Madhow - IEEE/ACM Trans. Networking, 1997
"... Abstract-This paper examines the performance of TCP/IP, the Internet data transport protocol, over wide-area networks (WANs) in which data traffic could coexist with real-time traffic such as voice and video. Specifically, we attempt to develop a basic understanding, using analysis and simulation, ..."
Abstract - Cited by 465 (6 self)
The following key results are obtained. First, random loss leads to significant throughput deterioration when the product of the loss probability and the square of the bandwidth-delay product is larger than one. Second, for multiple connections sharing a bottleneck link, TCP is grossly unfair toward connections
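The first result lends itself to a quick numeric check; the values below are illustrative, not taken from the paper.

# Check the stated condition: throughput degrades once
# loss_prob * (bandwidth-delay product in packets)^2 > 1.
def loss_regime(loss_prob, bandwidth_pkts_per_s, rtt_s):
    bdp = bandwidth_pkts_per_s * rtt_s              # pipe size in packets
    factor = loss_prob * bdp ** 2
    return bdp, factor, "degraded" if factor > 1 else "ok"

# e.g. ~100 Mb/s of 1 kB packets over a 100 ms round trip:
print(loss_regime(1e-4, 12_500, 0.1))               # factor ~156 -> degraded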

The Dantzig selector: statistical estimation when p is much larger than n

by Emmanuel Candes, Terence Tao , 2005
"... In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Ax + z, where x ∈ R p is a parameter vector of interest, A is a data matrix with possibly far fewer rows than columns, n ≪ ..."
Abstract - Cited by 879 (14 self)
$\|\hat{x} - x\|_{\ell_2}^2 \le C^2 \cdot 2 \log p \cdot \left(\sigma^2 + \sum_i \min(x_i^2, \sigma^2)\right)$. Our results are nonasymptotic and we give values for the constant C. In short, our estimator achieves a loss within a logarithmic factor of the ideal mean squared error one would achieve with an oracle which would supply perfect information
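The right-hand side of the bound is easy to evaluate directly. A small sketch; C here is an arbitrary illustrative constant, whereas the paper gives explicit values.

import numpy as np

# Evaluate the stated bound C^2 * 2 log p * (sigma^2 + sum_i min(x_i^2, sigma^2));
# the parenthesized part is the ideal oracle mean squared error.
def dantzig_risk_bound(x, sigma, C=4.0):            # C is illustrative only
    oracle_mse = sigma ** 2 + np.minimum(x ** 2, sigma ** 2).sum()
    return C ** 2 * 2 * np.log(len(x)) * oracle_mse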

Online passive-aggressive algorithms

by Koby Crammer, Ofer Dekel, Shai Shalev-Shwartz, Yoram Singer - JMLR, 2006
"... We present a unified view for online classification, regression, and uniclass problems. This view leads to a single algorithmic framework for the three problems. We prove worst case loss bounds for various algorithms for both the realizable case and the non-realizable case. The end result is new alg ..."
Abstract - Cited by 435 (24 self)
algorithms and accompanying loss bounds for hinge-loss regression and uniclass. We also get refined loss bounds for previously studied classification algorithms.
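The basic passive-aggressive classification update is compact enough to state in a few lines; a sketch of the standard rule, not tied to the paper's exact notation:

import numpy as np

# Basic PA update: stay passive when the hinge loss is zero; otherwise make
# the smallest weight change that zeroes the loss on the current example.
def pa_step(w, x, y):
    loss = max(0.0, 1.0 - y * (w @ x))              # hinge loss this round
    if loss > 0.0:
        tau = loss / (x @ x)                        # closed-form step size
        w = w + tau * y * x                         # aggressive correction
    return w

# e.g. one round on a 3-dimensional example:
w = pa_step(np.zeros(3), np.array([1.0, -2.0, 0.5]), +1)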

Clustering with Bregman Divergences

by Arindam Banerjee, Srujana Merugu, Inderjit Dhillon, Joydeep Ghosh - Journal of Machine Learning Research, 2005
"... A wide variety of distortion functions are used for clustering, e.g., squared Euclidean distance, Mahalanobis distance and relative entropy. In this paper, we propose and analyze parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergence ..."
Abstract - Cited by 443 (57 self)
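A practical upshot of the paper is that the k-means loop survives intact under any Bregman divergence, since the optimal cluster representative is always the arithmetic mean. A sketch using generalized I-divergence on positive data (assumed setup, not the authors' code):

import numpy as np

# Bregman hard clustering: k-means structure with assignment by a Bregman
# divergence (here generalized I-divergence); centroids stay arithmetic means.
def bregman_kmeans(X, k, n_iter=50, seed=0):
    d = lambda x, mu: np.sum(x * np.log(x / mu) - x + mu, axis=-1)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        labels = np.argmin([d(X, c) for c in centers], axis=0)
        centers = np.stack([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])  # keep empty clusters
    return labels, centers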

Climate change, coral bleaching and the future of the world’s coral reefs

by Ove Hoegh-Guldberg - Marine and Freshwater Research, 1999
"... Sea temperatures in the tropics have increased by almost 1oC over the past 100 years and are currently increasing at the rate of approximately 1-2oC per century. Reef-building corals, which are central to healthy coral reefs, are currently living close to their thermal maxima. They become stressed i ..."
Abstract - Cited by 428 (16 self)
the coral host. Corals tend to die in great numbers immediately following coral bleaching events, which may stretch across thousands of square kilometers of ocean. Bleaching events in 1998, the worst year on record, saw the complete loss of live coral in some parts of the world. This paper reviews our

How to Use Expert Advice

by Nicolò Cesa-Bianchi, Yoav Freund, David Haussler, David P. Helmbold, Robert E. Schapire, Manfred K. Warmuth - Journal of the Association for Computing Machinery, 1997
"... We analyze algorithms that predict a binary value by combining the predictions of several prediction strategies, called experts. Our analysis is for worst-case situations, i.e., we make no assumptions about the way the sequence of bits to be predicted is generated. We measure the performance of the ..."
Abstract - Cited by 377 (79 self)
is on the order of the square root of the number of mistakes of the best expert, and we give efficient algorithms that achieve this. Our upper and lower bounds have matching leading constants in most cases. We then show how this leads to certain kinds of pattern recognition/learning algorithms with performance
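One standard scheme from this line of work is exponentially weighted majority voting; a sketch, where eta is an assumed tunable rate rather than a value from the paper:

import numpy as np

# Exponentially weighted majority vote over binary experts: predict by
# weighted vote, then shrink the weights of every expert that erred.
def weighted_majority(expert_preds, outcomes, eta=0.5):
    w = np.ones(expert_preds.shape[1])              # one weight per expert
    mistakes = 0
    for preds, outcome in zip(expert_preds, outcomes):
        forecast = int(w @ preds >= w.sum() / 2)    # weighted vote on a bit
        mistakes += int(forecast != outcome)
        w *= np.exp(-eta * (preds != outcome))      # penalize wrong experts
    return mistakes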

$\nabla_w J = -2 \sum_i \max(0, 1 - y_i x_i^\top w)\, y_i x_i$

by Nisrine Jrad, Ronald Phlypo, Marco Congedo, Alain Rakotomamonjy
"... Given the training examples {xi, yi}, the squared Hinge loss is written as: J = n∑ i=1 max(0, 1 − yix>i w)2 and its gradient is: ..."
Abstract
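The two displayed formulas transcribe directly into code; a sketch with assumed array shapes:

import numpy as np

# Squared hinge loss J = sum_i max(0, 1 - y_i x_i^T w)^2 and its gradient
# grad_w J = -2 sum_i max(0, 1 - y_i x_i^T w) y_i x_i, vectorized over
# X: (n, d), y: (n,) labels in {-1, +1}, w: (d,).
def squared_hinge(w, X, y):
    m = np.maximum(0.0, 1.0 - y * (X @ w))          # per-example margin slack
    return np.sum(m ** 2), -2.0 * (m * y) @ X       # (loss, gradient)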