Results 1  10
of
274
SelfTesting/Correcting with Applications to Numerical Problems
, 1990
"... Suppose someone gives us an extremely fast program P that we can call as a black box to compute a function f . Should we trust that P works correctly? A selftesting/correcting pair allows us to: (1) estimate the probability that P (x) 6= f(x) when x is randomly chosen; (2) on any input x, compute ..."
Abstract

Cited by 358 (30 self)
 Add to MetaCart
Suppose someone gives us an extremely fast program P that we can call as a black box to compute a function f . Should we trust that P works correctly? A selftesting/correcting pair allows us to: (1) estimate the probability that P (x) 6= f(x) when x is randomly chosen; (2) on any input x, compute f(x) correctly as long as P is not too faulty on average. Furthermore, both (1) and (2) take time only slightly more than Computer Science Division, U.C. Berkeley, Berkeley, California 94720, Supported by NSF Grant No. CCR 8813632. y International Computer Science Institute, Berkeley, California 94704 z Computer Science Division, U.C. Berkeley, Berkeley, California 94720, Supported by an IBM Graduate Fellowship and NSF Grant No. CCR 8813632. the original running time of P . We present general techniques for constructing simple to program selftesting /correcting pairs for a variety of numerical problems, including integer multiplication, modular multiplication, matrix multiplicatio...
Spectral Efficiency of CDMA with Random Spreading
 IEEE TRANS. INFORM. THEORY
, 1999
"... The CDMA channel with randomly and independently chosen spreading sequences accurately models the situation where pseudonoise sequences span many symbol periods. Furthermore, its analysis provides a comparison baseline for CDMA channels with deterministic signature waveforms spanning one symbol per ..."
Abstract

Cited by 274 (23 self)
 Add to MetaCart
The CDMA channel with randomly and independently chosen spreading sequences accurately models the situation where pseudonoise sequences span many symbol periods. Furthermore, its analysis provides a comparison baseline for CDMA channels with deterministic signature waveforms spanning one symbol period. We analyze the spectral efficiency (total capacity per chip) as a function of the number of users, spreading gain, and signaltonoise ratio, and we quantify the loss in efficiency relative to an optimally chosen set of signature sequences and relative to multiaccess with no spreading. White Gaussian background noise and equalpower synchronous users are assumed. The following receivers are analyzed: a) optimal joint processing, b) singleuser matched filtering, c) decorrelation, and d) MMSE linear processing.
Automatic Construction of Decision Trees from Data: A MultiDisciplinary Survey
 Data Mining and Knowledge Discovery
, 1997
"... Decision trees have proved to be valuable tools for the description, classification and generalization of data. Work on constructing decision trees from data exists in multiple disciplines such as statistics, pattern recognition, decision theory, signal processing, machine learning and artificial ne ..."
Abstract

Cited by 204 (1 self)
 Add to MetaCart
(Show Context)
Decision trees have proved to be valuable tools for the description, classification and generalization of data. Work on constructing decision trees from data exists in multiple disciplines such as statistics, pattern recognition, decision theory, signal processing, machine learning and artificial neural networks. Researchers in these disciplines, sometimes working on quite different problems, identified similar issues and heuristics for decision tree construction. This paper surveys existing work on decision tree construction, attempting to identify the important issues involved, directions the work has taken and the current state of the art. Keywords: classification, treestructured classifiers, data compaction 1. Introduction Advances in data collection methods, storage and processing technology are providing a unique challenge and opportunity for automated data exploration techniques. Enormous amounts of data are being collected daily from major scientific projects e.g., Human Genome...
Unbiased Bits from Sources of Weak Randomness and Probabilistic Communication Complexity
, 1988
"... , Introduction and References only) Benny Chor Oded Goldreich MIT \Gamma Laboratory for Computer Science Cambridge, Massachusetts 02139 ABSTRACT \Gamma A new model for weak random physical sources is presented. The new model strictly generalizes previous models (e.g. the Santha and Vazirani model [2 ..."
Abstract

Cited by 199 (6 self)
 Add to MetaCart
, Introduction and References only) Benny Chor Oded Goldreich MIT \Gamma Laboratory for Computer Science Cambridge, Massachusetts 02139 ABSTRACT \Gamma A new model for weak random physical sources is presented. The new model strictly generalizes previous models (e.g. the Santha and Vazirani model [24]). The sources considered output strings according to probability distributions in which no single string is too probable. The new model provides a fruitful viewpoint on problems studied previously as: ffl Extracting almost perfect bits from sources of weak randomness: the question of possibility as well as the question of efficiency of such extraction schemes are addressed. ffl Probabilistic Communication Complexity: it is shown that most functions have linear communication complexity in a very strong probabilistic sense. ffl Robustness of BPP with respect to sources of weak randomness (generalizing a result of Vazirani and Vazirani [27]). The paper has appeared in SIAM Journal o...
Approximate graph coloring by semidefinite programming
 Proc. 35 th IEEE FOCS, IEEE
, 1994
"... a coloring is called the chromatic number of�, and is usually denoted by��.Determining the chromatic number of a graph is known to be NPhard (cf. [19]). Besides its theoretical significance as a canonical NPhard problem, graph coloring arises naturally in a variety of applications such as register ..."
Abstract

Cited by 188 (6 self)
 Add to MetaCart
(Show Context)
a coloring is called the chromatic number of�, and is usually denoted by��.Determining the chromatic number of a graph is known to be NPhard (cf. [19]). Besides its theoretical significance as a canonical NPhard problem, graph coloring arises naturally in a variety of applications such as register allocation [11, 12, 13] is the maximum degree of any vertex. Beand timetable/examination scheduling [8, 40]. In many We consider the problem of coloring�colorable graphs with the fewest possible colors. We give a randomized polynomial time algorithm which colors a 3colorable graph on vertices with� � ���� colors where sides giving the best known approximation ratio in terms of, this marks the first nontrivial approximation result as a function of the maximum degree. This result can be generalized to�colorable graphs to obtain a coloring using�� � ��� � � � �colors. Our results are inspired by the recent work of Goemans and Williamson who used an algorithm for semidefinite optimization problems, which generalize linear programs, to obtain improved approximations for the MAX CUT and MAX 2SAT problems. An intriguing outcome of our work is a duality relationship established between the value of the optimum solution to our semidefinite program and the Lovász�function. We show lower bounds on the gap between the optimum solution of our semidefinite program and the actual chromatic number; by duality this also demonstrates interesting new facts about the�function. 1
Bounds on the sample complexity of Bayesian learning using information theory
 and the VC dimension,” in Proc. Conf. Comp. Learning Theory
, 1991
"... ..."
(Show Context)
Comparing clusterings: an axiomatic view
 In ICML ’05: Proceedings of the 22nd international conference on Machine learning
, 2005
"... This paper views clusterings as elements of a lattice. Distances between clusterings are analyzed in their relationship to the lattice. From this vantage point, we first give an axiomatic characterization of some criteria for comparing clusterings, including the variation of information and the unad ..."
Abstract

Cited by 100 (3 self)
 Add to MetaCart
(Show Context)
This paper views clusterings as elements of a lattice. Distances between clusterings are analyzed in their relationship to the lattice. From this vantage point, we first give an axiomatic characterization of some criteria for comparing clusterings, including the variation of information and the unadjusted Rand index. Then we study other distances between partitions w.r.t these axioms and prove an impossibility result: there is no “sensible” criterion for comparing clusterings that is simultaneously (1) aligned with the lattice of partitions, (2) convexely additive, and (3) bounded. 1.
Nearestneighbor searching and metric space dimensions
 In NearestNeighbor Methods for Learning and Vision: Theory and Practice
, 2006
"... Given a set S of n sites (points), and a distance measure d, the nearest neighbor searching problem is to build a data structure so that given a query point q, the site nearest to q can be found quickly. This paper gives a data structure for this problem; the data structure is built using the distan ..."
Abstract

Cited by 99 (0 self)
 Add to MetaCart
Given a set S of n sites (points), and a distance measure d, the nearest neighbor searching problem is to build a data structure so that given a query point q, the site nearest to q can be found quickly. This paper gives a data structure for this problem; the data structure is built using the distance function as a “black box”. The structure is able to speed up nearest neighbor searching in a variety of settings, for example: points in lowdimensional or structured Euclidean space, strings under Hamming and edit distance, and bit vector data from an OCR application. The data structures are observed to need linear space, with a modest constant factor. The preprocessing time needed per site is observed to match the query time. The data structure can be viewed as an application of a “kdtree ” approach in the metric space setting, using Voronoi regions of a subset in place of axisaligned boxes. 1