Results 11 - 20
of
162
General convergence results for linear discriminant updates
- Machine Learning
, 1997
"... Abstract. The problem of learning linear-discriminant concepts can be solved by various mistake-driven update procedures, including the Winnow family of algorithms and the well-known Perceptron algorithm. In this paper we define the general class of “quasi-additive ” algorithms, which includes Perce ..."
Abstract
-
Cited by 73 (0 self)
- Add to MetaCart
Abstract. The problem of learning linear-discriminant concepts can be solved by various mistake-driven update procedures, including the Winnow family of algorithms and the well-known Perceptron algorithm. In this paper we define the general class of “quasi-additive ” algorithms, which includes Perceptron and Winnow as special cases. We give a single proof of convergence that covers a broad subset of algorithms in this class, including both Perceptron and Winnow, but also many new algorithms. Our proof hinges on analyzing a generic measure of progress construction that gives insight as to when and how such algorithms converge. Our measure of progress construction also permits us to obtain good mistake bounds for individual algorithms. We apply our unified analysis to new algorithms as well as existing algorithms. When applied to known algorithms, our method “automatically ” produces close variants of existing proofs (recovering similar bounds)—thus showing that, in a certain sense, these seemingly diverse results are fundamentally isomorphic. However, we also demonstrate that the unifying principles are more broadly applicable, and analyze a new class of algorithms that smoothly interpolate between the additive-update behavior of Perceptron and the multiplicative-update behavior of Winnow.
Evaluating Topic-Driven Web Crawlers
, 2001
"... Due to limited bandwidth, storage, and computational resources, and to the dynamic nature of the Web, search engines cannot index every Web page, and even the covered portion of the Web cannot be monitored continuously for changes. Therefore it is essential to develop effective crawling strategies t ..."
Abstract
-
Cited by 72 (19 self)
- Add to MetaCart
Due to limited bandwidth, storage, and computational resources, and to the dynamic nature of the Web, search engines cannot index every Web page, and even the covered portion of the Web cannot be monitored continuously for changes. Therefore it is essential to develop effective crawling strategies to prioritize the pages to be indexed. The issue is even more important for topic-specific search engines, where crawlers must make additional decisions based on the relevance of visited pages. However, it is difficult to evaluate alternative crawling strategies because relevant sets are unknown and the search space is changing. We propose three different methods to evaluate crawling strategies. We apply the proposed metrics to compare three topic-driven crawling algorithms based on similarity ranking, link analysis, and adaptive agents.
On-line portfolio selection using multiplicative updates
- Mathematical Finance
, 1998
"... We present an on-line investment algorithm which achieves almost the same wealth as the best constant-rebalanced portfolio determined in hindsight from the actual market outcomes. The algorithm employs a multiplicative update rule derived using a framework introduced by Kivinen and Warmuth. Our algo ..."
Abstract
-
Cited by 67 (10 self)
- Add to MetaCart
We present an on-line investment algorithm which achieves almost the same wealth as the best constant-rebalanced portfolio determined in hindsight from the actual market outcomes. The algorithm employs a multiplicative update rule derived using a framework introduced by Kivinen and Warmuth. Our algorithm is very simple to implement and requires only constant storage and computing time per stock ineach trading period. We tested the performance of our algorithm on real stock data from the New York Stock Exchange accumulated during a 22-year period. On this data, our algorithm clearly outperforms the best single stock aswell as Cover's universal portfolio selection algorithm. We also present results for the situation in which the We present an on-line investment algorithm which achieves almost the same wealth as the best constant-rebalanced portfolio investment strategy. The algorithm employsamultiplicative update rule derived using a framework introduced by Kivinen and Warmuth [20]. Our algorithm is very simple to implement and its time and storage requirements grow linearly in the number of stocks.
Information-theoretic metric learning
- in NIPS 2006 Workshop on Learning to Compare Examples
, 2007
"... We formulate the metric learning problem as that of minimizing the differential relative entropy between two multivariate Gaussians under constraints on the Mahalanobis distance function. Via a surprising equivalence, we show that this problem can be solved as a low-rank kernel learning problem. Spe ..."
Abstract
-
Cited by 67 (8 self)
- Add to MetaCart
We formulate the metric learning problem as that of minimizing the differential relative entropy between two multivariate Gaussians under constraints on the Mahalanobis distance function. Via a surprising equivalence, we show that this problem can be solved as a low-rank kernel learning problem. Specifically, we minimize the Burg divergence of a low-rank kernel to an input kernel, subject to pairwise distance constraints. Our approach has several advantages over existing methods. First, we present a natural information-theoretic formulation for the problem. Second, the algorithm utilizes the methods developed by Kulis et al. [6], which do not involve any eigenvector computation; in particular, the running time of our method is faster than most existing techniques. Third, the formulation offers insights into connections between metric learning and kernel learning. 1
Approximate Solutions to Markov Decision Processes
, 1999
"... One of the basic problems of machine learning is deciding how to act in an uncertain world. For example, if I want my robot to bring me a cup of coffee, it must be able to compute the correct sequence of electrical impulses to send to its motors to navigate from the coffee pot to my office. In fact, ..."
Abstract
-
Cited by 62 (9 self)
- Add to MetaCart
One of the basic problems of machine learning is deciding how to act in an uncertain world. For example, if I want my robot to bring me a cup of coffee, it must be able to compute the correct sequence of electrical impulses to send to its motors to navigate from the coffee pot to my office. In fact, since the results of its actions are not completely predictable, it is not enough just to compute the correct sequence; instead the robot must sense and correct for deviations from its intended path. In order for any machine learner to act reasonably in an uncertain environment, it must solve problems like the above one quickly and reliably. Unfortunately, the world is often so complicated that it is difficult or impossible to find the optimal sequence of actions to achieve a given goal. So, in order to scale our learners up to real-world problems, we usually must settle for approximate solutions. One representation for a learner's environment and goals is a Markov decision process or MDP. ...
A Learning Approach to Shallow Parsing
- IN PROCEEDINGS OF EMNLP-WVLC'99. ASSOCIATION FOR COMPUTATIONAL LINGUISTICS
, 1999
"... A SNoW based learning approach to shallow parsing tasks is presented and studied experimentally. The approach learns to identify syntactic patterns by combining simple predictors to produce a coherent inference. Two instantiations of this approach are studied and experimental results for Noun-Phrase ..."
Abstract
-
Cited by 57 (23 self)
- Add to MetaCart
A SNoW based learning approach to shallow parsing tasks is presented and studied experimentally. The approach learns to identify syntactic patterns by combining simple predictors to produce a coherent inference. Two instantiations of this approach are studied and experimental results for Noun-Phrases (NP) and Subject-Verb (SV) phrases that compare favorably with the best published results are presented. In doing that, we compare two ways of modeling the problem of learning to recognize patterns and suggest that shallow parsing patterns are bet- ter learned using open/close predictors than using inside/outside predictors.
Large Margin Hierarchical Classification
- In Proceedings of the Twenty-First International Conference on Machine Learning
"... We present an algorithmic framework for supervised classification learning where the set of labels is organized in a predefined hierarchical structure. This structure is encoded by a rooted tree which induces a metric over the label set. ..."
Abstract
-
Cited by 52 (7 self)
- Add to MetaCart
We present an algorithmic framework for supervised classification learning where the set of labels is organized in a predefined hierarchical structure. This structure is encoded by a rooted tree which induces a metric over the label set.
Tight Worst-Case Loss Bounds for Predicting With Expert Advice
, 1994
"... this paper is somewhat different from the one just described. Assume that there are N experts E i , i = 1; : : : ; N , each trying to predict the outcomes y t as best they can. Let x t;i be the prediction of the ith expert E i about the ..."
Abstract
-
Cited by 51 (10 self)
- Add to MetaCart
this paper is somewhat different from the one just described. Assume that there are N experts E i , i = 1; : : : ; N , each trying to predict the outcomes y t as best they can. Let x t;i be the prediction of the ith expert E i about the
The Perceptron algorithm vs. Winnow: linear vs. logarithmic mistake bounds when few input variables are relevant
"... This paper addresses the familiar problem of predicting with a linear classifier . The ..."
Abstract
-
Cited by 50 (8 self)
- Add to MetaCart
This paper addresses the familiar problem of predicting with a linear classifier . The
Update rules for parameter estimation in Bayesian networks
, 1997
"... This paper re-examines the problem of parameter estimation in Bayesian networks with missing values and hidden variables from the perspective of recent work in on-line learning [12]. We provide a unified framework for parameter estimation that encompasses both on-line learning, where the model is co ..."
Abstract
-
Cited by 47 (2 self)
- Add to MetaCart
This paper re-examines the problem of parameter estimation in Bayesian networks with missing values and hidden variables from the perspective of recent work in on-line learning [12]. We provide a unified framework for parameter estimation that encompasses both on-line learning, where the model is continuously adapted to new data cases as they arrive, and the more traditional batch learning, where a pre-accumulated set of samples is used in a one-time model selection process. In the batch case, our framework encompassesboth the gradient projection algorithm [2, 3] and the EM algorithm [14] for Bayesian networks. The framework also leads to new on-line and batch parameter update schemes, including a parameterized version of EM. We provide both empirical and theoretical results indicating that parameterized EM allows faster convergence to the maximum likelihood parameters than does standard EM. 1 Introduction Over the past few years, there has been a growing interest in the problem of le...

