Results 1–10 of 31
Probability Inequalities for Sums of Bounded Random Variables
, 1962
"... Upper bounds are derived for the probability that the sum S of n independent random variables exceeds its mean ES by a positive number nt. It is assumed that the range of each summand of S is bounded or bounded above. The bounds for Pr(SES> nt) depend only on the endpoints of the ranges of the smum ..."
Abstract

Cited by 1498 (2 self)
Upper bounds are derived for the probability that the sum S of n independent random variables exceeds its mean ES by a positive number nt. It is assumed that the range of each summand of S is bounded or bounded above. The bounds for Pr(S − ES ≥ nt) depend only on the endpoints of the ranges of the summands and the mean, or the mean and the variance of S. These results are then used to obtain analogous inequalities for certain sums of dependent random variables such as U-statistics and the sum of a random sample without replacement from a finite population.
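For summands taking values in [0, 1], the best-known special case of these bounds reads Pr(S − ES ≥ nt) ≤ exp(−2nt²). The snippet below is a quick Monte Carlo sanity check of that special case (a sketch, not the paper's general statement; the choice of uniform summands, n, t, and the trial count are arbitrary):

```python
import math
import random

def hoeffding_demo(n=200, t=0.1, trials=20000, seed=0):
    """Empirically compare Pr(S - ES >= n*t) with the bound exp(-2*n*t^2)
    for S a sum of n independent Uniform(0, 1) random variables."""
    rng = random.Random(seed)
    mean_s = n * 0.5  # ES for Uniform(0, 1) summands
    hits = 0
    for _ in range(trials):
        s = sum(rng.random() for _ in range(n))
        if s - mean_s >= n * t:
            hits += 1
    empirical = hits / trials
    bound = math.exp(-2 * n * t * t)
    return empirical, bound

emp, bound = hoeffding_demo()
```

With these parameters the bound is exp(−4) ≈ 0.018, while the empirical frequency of so large a deviation is essentially zero, illustrating that the bound holds (if loosely) at moderate n.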
Universal Limit Laws for Depths in Random Trees
 SIAM Journal on Computing
, 1998
"... Random binary search trees, bary search trees, medianof(2k+1) trees, quadtrees, simplex trees, tries, and digital search trees are special cases of random split trees. For these trees, we o#er a universal law of large numbers and a limit law for the depth of the last inserted point, as well as a ..."
Abstract

Cited by 50 (8 self)
Random binary search trees, b-ary search trees, median-of-(2k+1) trees, quadtrees, simplex trees, tries, and digital search trees are special cases of random split trees. For these trees, we offer a universal law of large numbers and a limit law for the depth of the last inserted point, as well as a law of large numbers for the height.
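For the random binary search tree, the law of large numbers for the depth says that the depth of the last inserted key, divided by ln n, converges to 2. The simulation below illustrates this (an illustrative sketch; the tree size and seed are arbitrary choices):

```python
import math
import random

class Node:
    __slots__ = ("key", "left", "right")
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert_depth(root, key):
    """Insert key into the BST rooted at root; return the new node's depth (root = depth 0)."""
    depth = 0
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return depth + 1
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return depth + 1
            node = node.right
        depth += 1

def depth_of_last(n, seed=0):
    """Depth of the n-th (last) key inserted into a random binary search tree."""
    rng = random.Random(seed)
    keys = list(range(n))
    rng.shuffle(keys)
    root = Node(keys[0])
    d = 0
    for k in keys[1:]:
        d = insert_depth(root, k)
    return d

n = 100_000
d = depth_of_last(n)
ratio = d / math.log(n)  # concentrates near 2 as n grows
```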
Concentration inequalities
 Advanced Lectures in Machine Learning
, 2004
"... Abstract. Concentration inequalities deal with deviations of functions of independent random variables from their expectation. In the last decade new tools have been introduced making it possible to establish simple and powerful inequalities. These inequalities are at the heart of the mathematical a ..."
Abstract

Cited by 32 (1 self)
Concentration inequalities deal with deviations of functions of independent random variables from their expectation. In the last decade new tools have been introduced making it possible to establish simple and powerful inequalities. These inequalities are at the heart of the mathematical analysis of various problems in machine learning and have made it possible to derive new efficient algorithms. This text attempts to summarize some of the basic tools.
Minimax-optimal classification with dyadic decision trees
 IEEE TRANSACTIONS ON INFORMATION THEORY
, 2006
"... Decision trees are among the most popular types of classifiers, with interpretability and ease of implementation being among their chief attributes. Despite the widespread use of decision trees, theoretical analysis of their performance has only begun to emerge in recent years. In this paper it is ..."
Abstract

Cited by 27 (4 self)
Decision trees are among the most popular types of classifiers, with interpretability and ease of implementation being among their chief attributes. Despite the widespread use of decision trees, theoretical analysis of their performance has only begun to emerge in recent years. In this paper it is shown that a new family of decision trees, dyadic decision trees (DDTs), attain nearly optimal (in a minimax sense) rates of convergence for a broad range of classification problems. Furthermore, DDTs are surprisingly adaptive in three important respects: They automatically (1) adapt to favorable conditions near the Bayes decision boundary; (2) focus on data distributed on lower-dimensional manifolds; and (3) reject irrelevant features. DDTs are constructed by penalized empirical risk minimization using a new data-dependent penalty and may be computed exactly with computational complexity that is nearly linear in the training sample size. DDTs are the first classifier known to achieve nearly optimal rates for the diverse class of distributions studied here while also being practical and implementable. This is also the first study (of which we are aware) to consider rates for adaptation to intrinsic data dimension and relevant features.
Probabilistic bounds on the coefficients of polynomials with only real zeros
 J. Combin. Theory Ser. A
, 1997
"... The work of Harper and subsequent authors has shown that nite sequences (a 0;;an) arising from combinatorial problems are often such that the polynomial A(z): = P n k=0 akz k has only real zeros. Basic examples include rows from the arrays of binomial coe cients, Stirling numbers of the rst and sec ..."
Abstract

Cited by 20 (0 self)
The work of Harper and subsequent authors has shown that finite sequences (a_0, …, a_n) arising from combinatorial problems are often such that the polynomial A(z) := ∑_{k=0}^n a_k z^k has only real zeros. Basic examples include rows from the arrays of binomial coefficients, Stirling numbers of the first and second kinds, and Eulerian numbers. Assuming the a_k are nonnegative, A(1) > 0 and that A(z) is not constant, it is known that A(z) has only real zeros if and only if the normalized sequence (a_0/A(1), …, a_n/A(1)) is the probability distribution of the number of successes in n independent trials for some sequence of success probabilities. Such sequences (a_0, …, a_n) are also known to be characterized by total positivity of the infinite matrix (a_{i−j}) indexed by nonnegative integers i and j. This paper reviews inequalities and approximations for such sequences, called Pólya frequency sequences, which follow from their probabilistic representation. In combinatorial examples these inequalities yield a number of improvements of known estimates.
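The probabilistic representation mentioned here can be made concrete: expanding A(z) = ∏_i (1 − p_i + p_i z) yields, as coefficients, the distribution of the number of successes in independent trials with success probabilities p_i. A minimal sketch (the probabilities and n below are arbitrary example choices):

```python
from math import comb

def success_count_pmf(ps):
    """Coefficients of A(z) = prod_i (1 - p_i + p_i * z): the distribution of
    the number of successes in independent trials with success probs ps."""
    coeffs = [1.0]
    for p in ps:
        new = [0.0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            new[k] += c * (1 - p)      # trial fails: success count unchanged
            new[k + 1] += c * p        # trial succeeds: success count + 1
        coeffs = new
    return coeffs

# Equal probabilities 1/2 recover the normalized binomial row C(n, k) / 2^n
n = 6
pmf = success_count_pmf([0.5] * n)
row = [comb(n, k) / 2 ** n for k in range(n + 1)]
```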
Pattern classification and learning theory
"... 1.1 A binary classification problem Pattern recognition (or classification or discrimination) is about guessing or predicting the unknown class of an observation. An observation is a collection of numerical measurements, represented by a ddimensional vector x. The unknown nature of the observation ..."
Abstract

Cited by 17 (7 self)
1.1 A binary classification problem Pattern recognition (or classification or discrimination) is about guessing or predicting the unknown class of an observation. An observation is a collection of numerical measurements, represented by a d-dimensional vector x. The unknown nature of the observation is called a class. It is denoted by y and takes values in the set {0, 1}. (For simplicity, we restrict our attention to binary classification.) In pattern recognition, one creates a function g : R^d → {0, 1} which represents one's guess of y given x. The mapping g is called a classifier. A classifier errs on x if g(x) ≠ y. To model the learning problem, we introduce a probabilistic setting, and let (X, Y) be an R^d × {0, 1}-valued random pair. The random pair (X, Y) may be described in a variety of ways: for example, it is defined by the pair (μ, η), where μ is the probability measure for X and η is the regression of Y on X. More precisely, for a Borel-measurable set A ⊆ R^d
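As a toy illustration of these definitions (the threshold classifier, the data distribution, and the 10% label-noise rate below are all invented for the example, not taken from the text):

```python
import random

def g(x):
    """A simple classifier g : R^d -> {0, 1}: threshold on the first coordinate."""
    return 1 if x[0] > 0.5 else 0

def empirical_error(classifier, pairs):
    """Fraction of pairs (x, y) on which the classifier errs, i.e. g(x) != y."""
    errors = sum(1 for x, y in pairs if classifier(x) != y)
    return errors / len(pairs)

# Hypothetical distribution: X uniform on [0, 1]^2, Y agrees with the
# threshold rule except that the label is flipped with probability 0.1.
rng = random.Random(1)

def draw_pair():
    x = (rng.random(), rng.random())
    y = 1 if x[0] > 0.5 else 0
    if rng.random() < 0.1:
        y = 1 - y
    return x, y

sample = [draw_pair() for _ in range(10_000)]
err = empirical_error(g, sample)  # close to the 0.1 noise rate here
```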
Distances and Finger Search in Random Binary Search Trees
 SIAM Journal on Computing
, 2004
"... For the random binary search tree with n nodes inserted the number of ancestors of the elements with ranks k and l, 1 <= k < l <= n, as well as the path distance between these elements in the tree are considered. For both quantities, central limit theorems for appropriately rescaled versions are der ..."
Abstract

Cited by 11 (1 self)
For the random binary search tree with n nodes inserted, the number of ancestors of the elements with ranks k and l, 1 ≤ k < l ≤ n, as well as the path distance between these elements in the tree are considered. For both quantities, central limit theorems for appropriately rescaled versions are derived. For the path distance, the condition l − k → ∞ as n → ∞ is required. We obtain tail bounds and the order of higher moments for the path distance. The path distance measures the complexity of finger search in the tree.
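The path distance between two keys can be computed as depth(k) + depth(l) − 2 · depth(LCA), where the lowest common ancestor of k < l is the first node on the search path whose key lies in [k, l]. A self-contained sketch for one random tree (the tree size, seed, and choice of ranks 1 and n are arbitrary):

```python
import random

class Node:
    __slots__ = ("key", "left", "right")
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def build(keys):
    """Build an unbalanced BST by inserting keys in the given order."""
    root = Node(keys[0])
    for k in keys[1:]:
        node = root
        while True:
            side = "left" if k < node.key else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, Node(k))
                break
            node = child
    return root

def depth_of(root, key):
    """Depth of an existing key (root = depth 0)."""
    d, node = 0, root
    while node.key != key:
        node = node.left if key < node.key else node.right
        d += 1
    return d

def path_distance(root, k, l):
    """Edges on the tree path between keys k < l:
    depth(k) + depth(l) - 2 * depth(lowest common ancestor)."""
    node, d = root, 0
    while not (k <= node.key <= l):  # descend until the key splits [k, l]
        node = node.left if l < node.key else node.right
        d += 1
    return depth_of(root, k) + depth_of(root, l) - 2 * d

n = 10_000
rng = random.Random(2)
keys = list(range(1, n + 1))
rng.shuffle(keys)
root = build(keys)
dist = path_distance(root, 1, n)  # ranks far apart: distance of order log n
```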
Expected time analysis for delaunay point location
 Comput. Geom. Theory Appl
, 2004
"... Abstract. We consider point location in Delaunay triangulations with the aid of simple data structures. In particular, we analyze methods in which a simple data structure is used to first locate a point close to the query point. For points uniformly distributed on the unit square, we show that the e ..."
Abstract

Cited by 11 (1 self)
We consider point location in Delaunay triangulations with the aid of simple data structures. In particular, we analyze methods in which a simple data structure is used to first locate a point close to the query point. For points uniformly distributed on the unit square, we show that the expected point location complexities are Θ(√n) for the Green–Sibson rectilinear search, Θ(n^{1/3}) for Jump and Walk, Θ(n^{1/4}) for BinSearch and Walk (which uses a 1-dimensional search tree), Θ(n^{0.056...}) for search based on a random 2-d tree, and Θ(log n) for search aided by a 2-d median tree.
On concentration of probability
 Combinatorics, Probability and Computing
"... Abstract. We give a survey of several methods to obtain sharp concentration results, typically with exponentially small error probabilities, for random variables occuring in combinatorial probability. ..."
Abstract

Cited by 8 (0 self)
We give a survey of several methods to obtain sharp concentration results, typically with exponentially small error probabilities, for random variables occurring in combinatorial probability.