Results 1–7 of 7
Empirical Support for Winnow and Weighted-Majority Algorithms: Results on a Calendar Scheduling Domain
Machine Learning, 1995
Abstract

Cited by 126 (4 self)
This paper describes experimental results on using Winnow and Weighted-Majority based algorithms on a real-world calendar scheduling domain. These two algorithms have been highly studied in the theoretical machine learning literature. We show here that these algorithms can be quite competitive practically, outperforming the decision-tree approach currently in use in the Calendar Apprentice system in terms of both accuracy and speed. One of the contributions of this paper is a new variant on the Winnow algorithm (used in the experiments) that is especially suited to conditions with string-valued classifications, and we give a theoretical analysis of its performance. In addition we show how Winnow can be applied to achieve a good accuracy/coverage tradeoff and explore issues that arise such as concept drift. We also provide an analysis of a policy for discarding predictors in Weighted-Majority that allows it to speed up as it learns. Keywords: Winnow, Weighted-Majority, Multiplicative alg...
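For context, the classic Winnow rule that this line of work builds on (Littlestone's multiplicative-weight scheme for monotone disjunctions, not the paper's string-valued variant) can be sketched as follows; the function shape, threshold n/2 and promotion factor α = 2 are conventional textbook choices, not taken from the paper:

```python
import random

def winnow_train(examples, n, theta=None, alpha=2.0):
    """Classic Winnow for monotone disjunctions over n Boolean attributes.

    `examples` is a list of (x, label) pairs, x a 0/1 list of length n,
    label in {0, 1}.  Weights change multiplicatively, and only on mistakes,
    which is what gives the O(k log n) mistake bound for k-literal targets.
    """
    theta = n / 2 if theta is None else theta  # common threshold choice
    w = [1.0] * n
    for x, label in examples:
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
        if pred == label:
            continue
        if label == 1:      # false negative: promote the active attributes
            for i in range(n):
                if x[i]:
                    w[i] *= alpha
        else:               # false positive: demote the active attributes
            for i in range(n):
                if x[i]:
                    w[i] /= alpha
    return w
```

Because updates happen only on mistakes and irrelevant weights are halved on each false positive, the mistake bound grows only logarithmically in the number of attributes, which is what makes Winnow attractive in domains with many features.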
Efficient Learning of Typical Finite Automata from Random Walks
1997
Abstract

Cited by 48 (10 self)
This paper describes new and efficient algorithms for learning deterministic finite automata. Our approach is primarily distinguished by two features: (1) the adoption of an average-case setting to model the "typical" labeling of a finite automaton, while retaining a worst-case model for the underlying graph of the automaton, along with (2) a learning model in which the learner is not provided with the means to experiment with the machine, but rather must learn solely by observing the automaton's output behavior on a random input sequence. The main contribution of this paper is in presenting the first efficient algorithms for learning nontrivial classes of automata in an entirely passive learning model. We adopt an online learning model in which the learner is asked to predict the output of the next state, given the next symbol of the random input sequence; the goal of the learner is to make as few prediction mistakes as possible. Assuming the learner has a means of resetting the target machine to a fixed start state, we first present an efficient algorithm that
Approximating Hyper-Rectangles: Learning and Pseudorandom Sets
Journal of Computer and System Sciences, 1997
Abstract

Cited by 44 (3 self)
The PAC learning of rectangles has been studied because they have been found experimentally to yield excellent hypotheses for several applied learning problems. Also, pseudorandom sets for rectangles have been actively studied recently because (i) they are a subproblem common to the derandomization of depth-2 (DNF) circuits and derandomizing Randomized Logspace, and (ii) they approximate the distribution of n independent multivalued random variables. We present improved upper bounds for a class of such problems of "approximating" high-dimensional rectangles that arise in PAC learning and pseudorandomness. Key words and phrases: rectangles, machine learning, PAC learning, derandomization, pseudorandomness, multiple-instance learning, explicit constructions, Ramsey graphs, random graphs, sample complexity, approximations of distributions. 1 Introduction. A basic common theme of a large part of PAC learning and derandomization/computational pseudorandomness is to "approximate" a stru...
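Part of why rectangles yield excellent hypotheses is that the trivial "tightest fit" learner is already a consistent PAC learner for axis-parallel boxes. A minimal sketch of that standard textbook algorithm (ours, not this paper's construction):

```python
def tightest_rectangle(sample):
    """Return the smallest axis-parallel box containing the positive points.

    `sample` is a list of (point, label) pairs, each point a tuple of d
    coordinates, with label 1 inside the target box and 0 outside.  The
    learned box is always contained in the target box, so it can only err
    by predicting negative on a thin sliver near the target's boundary.
    """
    pos = [p for p, label in sample if label == 1]
    if not pos:
        return None  # no positives seen: predict negative everywhere
    d = len(pos[0])
    lo = tuple(min(p[i] for p in pos) for i in range(d))
    hi = tuple(max(p[i] for p in pos) for i in range(d))
    return lo, hi

def in_box(box, point):
    """Predict 1 iff `point` lies inside the learned box."""
    if box is None:
        return 0
    lo, hi = box
    return int(all(l <= x <= h for l, x, h in zip(lo, point, hi)))
```

Since the hypothesis only under-covers the target, a standard union-bound argument over the 2d boundary slivers gives a sample complexity on the order of (d/ε)·log(d/δ) in the PAC model.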
A simple population protocol for fast robust approximate majority
Distributed Computing, 21st International Symposium (DISC 2007), 2008
Abstract

Cited by 8 (2 self)
We describe and analyze a 3-state one-way population protocol to compute approximate majority in the model in which pairs of agents are drawn uniformly at random to interact. Given an initial configuration of x's, y's and blanks that contains at least one non-blank, the goal is for the agents to reach consensus on one of the values x or y. Additionally, the value chosen should be the majority non-blank initial value, provided it exceeds the minority by a sufficient margin. We prove that with high probability n agents reach consensus in O(n log n) interactions and the value chosen is the majority provided that its initial margin is at least ω(√(n log n)). This protocol has the additional property of tolerating Byzantine behavior in o(√n) of the agents, making it the first known population protocol that tolerates Byzantine agents.
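The three-state protocol is simple enough to simulate directly. The interaction rules below follow the abstract's description (one-way: only the responder of an ordered pair changes state); the sampling loop and function shape are our own sketch:

```python
import random

def approximate_majority(population, seed=0):
    """Simulate the 3-state one-way protocol until consensus.

    `population` is a list over {'x', 'y', 'b'} ('b' = blank).  Each step
    draws an ordered (initiator, responder) pair uniformly at random and
    updates only the responder:
        x,y -> x,b    y,x -> y,b    x,b -> x,x    y,b -> y,y
    Returns (consensus value, number of interactions taken).
    """
    rng = random.Random(seed)
    pop = list(population)
    n = len(pop)
    steps = 0
    while not (all(s == 'x' for s in pop) or all(s == 'y' for s in pop)):
        i, j = rng.sample(range(n), 2)   # initiator i, responder j
        a, b = pop[i], pop[j]
        if a in ('x', 'y'):
            if b not in (a, 'b'):
                pop[j] = 'b'             # opposite value knocked back to blank
            elif b == 'b':
                pop[j] = a               # blanks adopt the initiator's value
        steps += 1
    return pop[0], steps
```

With, say, a 90/10 split of 100 agents, the margin far exceeds the ω(√(n log n)) threshold, so the simulation essentially always finishes on the majority value within a number of interactions on the order of n log n.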
Convergence of Moments in a Markov-Chain Central Limit Theorem
2001
Abstract

Cited by 2 (1 self)
Let (X_i)_{i=0}^∞ be a V-uniformly ergodic Markov chain on a general state space X, and let π be its stationary distribution. For g : X → R, define W_k(g) := k^{−1/2} ∑_{i=0}^{k−1} (g(X_i) − π(g)). It is shown that if |g| ≤ V^{1/n} for a positive integer n, then E_x[W_k(g)^n] converges to the nth moment of a normal random variable with expectation 0 and variance γ_g² := π(g²) − π(g)² + 2 ∑_{j=1}^∞ ( ∫ g(x) E_x[g(X_j)] π(dx) − π(g)² ). This extends the existing Markov-chain central limit theorems, according to which expectations of bounded functionals of W_k(g) converge. We also derive non-asymptotic bounds for the error in approximating the moments of W_k(g) by the normal moments. These yield easy bounds of all feasible polynomial orders, and exponential bounds as well under some circumstances, for the probabilities of large deviations by the empirical measure along the Markov chain path (X_i).
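A quick numerical illustration of the n = 2 case (our own sketch, not from the paper): for a two-state chain that flips 0→1 with probability a and 1→0 with probability b, with g the indicator of state 1, the asymptotic variance has the closed form γ² = π(0)π(1)(2−a−b)/(a+b), and the second moment of W_k(g) should approach it as k grows:

```python
import random

def second_moment_Wk(k=1000, reps=2000, a=0.3, b=0.3, seed=1):
    """Estimate E[W_k(g)^2] for a two-state Markov chain started in
    stationarity, where g is the indicator of state 1.

    W_k(g) = k**-0.5 * sum_{i<k} (g(X_i) - pi(g)),  with pi(1) = a/(a+b).
    """
    rng = random.Random(seed)
    pi1 = a / (a + b)
    total = 0.0
    for _ in range(reps):
        x = 1 if rng.random() < pi1 else 0   # draw X_0 from pi
        s = 0.0
        for _ in range(k):
            s += x - pi1
            if rng.random() < (a if x == 0 else b):
                x = 1 - x                    # chain transition
        total += (s / k ** 0.5) ** 2
    return total / reps

# Closed-form target for a = b = 0.3:
# gamma^2 = pi(0)*pi(1)*(2 - a - b)/(a + b) = 0.25 * 1.4 / 0.6 ≈ 0.583
```

The paper's non-asymptotic bounds quantify exactly how fast such moment estimates converge to the normal moments as k grows.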
On the Sample Complexity of Weakly Learning
Information and Computation, 1992
Abstract

Cited by 1 (0 self)
In this paper, we study the sample complexity of weak learning. That is, we ask how much data must be collected from an unknown distribution in order to extract a small but significant advantage in prediction. We show that it is important to distinguish between those learning algorithms that output deterministic hypotheses and those that output randomized hypotheses. We prove that in the weak learning model, any algorithm using deterministic hypotheses to weakly learn a class of Vapnik-Chervonenkis dimension d(n) requires Ω(√d(n)) examples. In contrast, when randomized hypotheses are allowed, we show that Θ(1) examples suffice in some cases. We then show that there exists an efficient algorithm using deterministic hypotheses that weakly learns against any distribution on a set of size d(n) with only O(d(n)^{2/3}) examples. Thus for the class of symmetric Boolean functions over n variables, where the strong learning sample complexity is Θ(n), the sample complexi...
Avrim Blum. Empirical Support for Winnow and Weighted-Majority Algorithms: Results on a Calendar Scheduling Domain
Abstract
This paper describes experimental results on using Winnow and Weighted-Majority based algorithms on a real-world calendar scheduling domain. These two algorithms have been highly studied in the theoretical machine learning literature. We show here that these algorithms can be quite competitive practically, outperforming the decision-tree approach currently in use in the Calendar Apprentice system in terms of both accuracy and speed. One of the contributions of this paper is a new variant on the Winnow algorithm (used in the experiments) that is especially suited to conditions with string-valued classifications, and we give a theoretical analysis of its performance. In addition we show how Winnow can be applied to achieve a good accuracy/coverage tradeoff and explore issues that arise such as concept drift. We also provide an analysis of a policy for discarding predictors in Weighted-Majority that allows it to speed up as it learns.