Results 1-10 of 91
Low-Density Parity-Check Codes
, 1963
Cited by 927 (1 self)
Preface. The Noisy Channel Coding Theorem, discovered by C. E. Shannon in 1948, offered communication engineers the possibility of reducing error rates on noisy channels to negligible levels without sacrificing data rates. The primary obstacle to the practical use of this theorem has been the equipment complexity and the computation time required to decode the noisy received data.
Solving multiclass learning problems via error-correcting output codes
 Journal of Artificial Intelligence Research
, 1995
Cited by 582 (9 self)
Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k "classes"). The definition is acquired by studying collections of training examples of the form ⟨x_i, f(x_i)⟩. Existing approaches to multiclass learning problems include direct application of multiclass algorithms such as the decision-tree algorithms C4.5 and CART, application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and application of binary concept learning algorithms with distributed output representations. This paper compares these three approaches to a new technique in which error-correcting codes are employed as a distributed output representation. We show that these output representations improve the generalization performance of both C4.5 and backpropagation on a wide range of multiclass learning tasks. We also demonstrate that this approach is robust with respect to changes in the size of the training sample, the assignment of distributed representations to particular classes, and the application of overfitting avoidance techniques such as decision-tree pruning. Finally, we show that, like the other methods, the error-correcting code technique can provide reliable class probability estimates. Taken together, these results demonstrate that error-correcting output codes provide a general-purpose method for improving the performance of inductive learning programs on multiclass problems.
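The idea of using an error-correcting code as a distributed output representation can be illustrated with a minimal sketch. The 4-class, 7-bit code matrix below and the names `CODE_MATRIX`, `hamming`, `min_distance`, and `decode` are assumptions chosen for illustration and are not taken from the paper. Each column stands for one binary learner, each row is a class codeword, and a vector of binary predictions is assigned to the class whose codeword is nearest in Hamming distance.

```python
from itertools import combinations

# Hypothetical code matrix: one row (codeword) per class, one column per
# binary learner. This particular matrix is an illustrative assumption.
CODE_MATRIX = [
    [1, 1, 1, 1, 1, 1, 1],  # class 0
    [0, 0, 0, 0, 1, 1, 1],  # class 1
    [0, 0, 1, 1, 0, 0, 1],  # class 2
    [0, 1, 0, 1, 0, 1, 0],  # class 3
]


def hamming(a, b):
    """Number of positions in which two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(code):
    """Smallest Hamming distance between any two distinct codewords."""
    return min(hamming(r1, r2) for r1, r2 in combinations(code, 2))


def decode(bits, code=CODE_MATRIX):
    """Return the index of the class whose codeword is closest to `bits`."""
    return min(range(len(code)), key=lambda c: hamming(bits, code[c]))


# Simulate one binary learner making a mistake on an example of class 2.
noisy = list(CODE_MATRIX[2])
noisy[0] ^= 1
predicted = decode(noisy)
```

With a minimum distance of d between codewords, nearest-codeword decoding tolerates up to floor((d - 1) / 2) wrong binary predictions, which is the sense in which such a representation is error-correcting.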
Low-density parity-check codes based on finite geometries: A rediscovery and new results
 IEEE Trans. Inform. Theory
, 2001
Cited by 127 (4 self)
This paper presents a geometric approach to the construction of low-density parity-check (LDPC) codes. Four classes of LDPC codes are constructed based on the lines and points of Euclidean and projective geometries over finite fields. Codes of these four classes have good minimum distances and their Tanner graphs have girth 6. Finite-geometry LDPC codes can be decoded in various ways, ranging from low to high decoding complexity and from reasonably good to very good performance. They perform very well with iterative decoding. Furthermore, they can be put in either cyclic or quasi-cyclic form. Consequently, their encoding can be achieved in linear time and implemented with simple feedback shift registers. This advantage is not shared by other LDPC codes in general and is important in practice. Finite-geometry LDPC codes can be extended and shortened in various ways to obtain other good LDPC codes. Several techniques of extension and shortening are presented. Long extended finite-geometry LDPC codes have been constructed and they achieve a performance only a few tenths of a decibel away from the Shannon theoretical limit with iterative decoding.
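A toy instance of the geometric construction can be sketched as follows. It uses the smallest projective plane (the Fano plane, 7 points and 7 lines) rather than any code from the paper, and the names `BASE_LINE`, `build_parity_check_matrix`, and `max_row_overlap` are assumptions for illustration. Each row of the matrix is the incidence vector of one line, and the rows are cyclic shifts of a single base line.

```python
from itertools import combinations

N_POINTS = 7
# Assumed base line of the Fano plane: {0, 1, 3} is a perfect difference
# set modulo 7, so its cyclic shifts give all 7 lines of the plane.
BASE_LINE = (0, 1, 3)


def build_parity_check_matrix(n=N_POINTS, base=BASE_LINE):
    """Rows are incidence vectors of lines; columns correspond to points."""
    rows = []
    for shift in range(n):
        support = {(p + shift) % n for p in base}
        rows.append([1 if j in support else 0 for j in range(n)])
    return rows


def max_row_overlap(H):
    """Largest number of columns in which two distinct rows both hold a 1."""
    return max(
        sum(a & b for a, b in zip(r1, r2)) for r1, r2 in combinations(H, 2)
    )


H = build_parity_check_matrix()
row_weights = [sum(row) for row in H]
col_weights = [sum(col) for col in zip(*H)]
overlap = max_row_overlap(H)
```

Because two distinct lines meet in exactly one point, no two rows share more than one column with a 1, so the Tanner graph has no cycles of length 4. The cyclic structure of the rows is the property that permits shift-register encoding.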
Error-Correcting Output Codes: A General Method for Improving Multiclass Inductive Learning Programs
 In Proceedings of AAAI-91
, 1991
Cited by 89 (7 self)
Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k "classes"). The definition is acquired by studying large collections of training examples of the form ⟨x_i, f(x_i)⟩. Existing approaches to this problem include (a) direct application of multiclass algorithms such as the decision-tree algorithms ID3 and CART, (b) application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and (c) application of binary concept learning algorithms with distributed output codes such as those employed by Sejnowski and Rosenberg in the NETtalk system. This paper compares these three approaches to a new technique in which BCH error-correcting codes are employed as a distributed output representation. We show that these output representations improve the performance of ID3 on the NETtalk task and of backpropagation on an isolated-letter speech-recognition t...
Everything Old Is New Again: A Fresh Look at Historical Approaches in Machine Learning
 PhD thesis, MIT
, 2002
Cited by 88 (6 self)
Combining Labeled and Unlabeled Data for Multi-Class Text Categorization
 In Proceedings of the International Conference on Machine Learning
, 2002
Cited by 40 (0 self)
Supervised learning techniques for text classification often require a large number of labeled examples to learn accurately. One way to reduce the amount of labeled data required is to develop algorithms that can learn effectively from a small number of labeled examples augmented with a large number of unlabeled examples. Current text learning techniques for combining labeled and unlabeled data, such as EM and Co-Training, are mostly applicable to classification tasks with a small number of classes and do not scale up well for large multiclass problems. In this paper, we develop a framework to incorporate unlabeled data in the Error-Correcting Output Coding (ECOC) setup by first decomposing multiclass problems into multiple binary problems and then using Co-Training to learn the individual binary classification problems.
Using Error-Correcting Codes for Text Classification
 In Proceedings of the Seventeenth International Conference on Machine Learning
, 2000
Cited by 34 (3 self)
This paper explores in detail the use of Error-Correcting Output Coding (ECOC) for learning text classifiers. We show that the accuracy of a Naive Bayes Classifier over text classification tasks can be significantly improved by taking advantage of the error-correcting properties of the code. We also explore the use of different kinds of codes, namely Error-Correcting Codes, Random Codes, and Domain and Data-specific codes and give experimental results for each of them. The ECOC method scales well to large data sets with a large number of classes. Experiments on a real-world data set show a reduction in classification error by up to 66% over the traditional Naive Bayes Classifier. We also compare our empirical results to semi-theoretical results and find that the two closely agree. 1. Introduction. Text Classification is the problem of grouping text documents into classes or categories. For the purpose of this paper, we define classification as categorizing documents in...
New results on error correcting output codes of kernel machines
 IEEE Transactions on Neural Networks
, 2004
Cited by 29 (0 self)
We study the problem of multiclass classification within the framework of error correcting output codes (ECOC) using margin-based binary classifiers. Specifically, we address two important open problems in this context: decoding and model selection. The decoding problem concerns how to map the outputs of the classifiers into class codewords. In this paper we introduce a new decoding function that combines the margins through an estimate of their class conditional probabilities. Concerning model selection, we present new theoretical results bounding the leave-one-out (LOO) error of ECOC of kernel machines, which can be used to tune kernel hyperparameters. We report experiments using support vector machines as the base binary classifiers, showing the advantage of the proposed decoding function over other functions of the margin commonly used in practice. Moreover, our empirical evaluations on model selection indicate that the bound leads to good estimates of kernel parameters. Index Terms: error correcting output codes (ECOC), machine learning, statistical learning theory, support vector machines.
Achieving High-Accuracy Text-to-Speech with Machine Learning
 In Data mining in speech synthesis
, 1997
Cited by 24 (2 self)
In 1987, Sejnowski and Rosenberg developed their famous NETtalk system for English text-to-speech. This chapter describes a machine learning approach to text-to-speech that builds upon and extends the initial NETtalk work. Among the many extensions to the NETtalk system were the following: a different learning algorithm, a wider input "window", error-correcting output coding, a right-to-left scan of the word to be pronounced (with the results of each decision influencing subsequent decisions), and the addition of several useful input features. These changes yielded a system that performs much better than the original NETtalk system. After training on 19,002 words, the system achieves 93.7% correct pronunciation of individual phonemes and 64.8% correct pronunciation of whole words (where the pronunciation must exactly match the dictionary pronunciation to be correct). Based on the judgements of three human participants in a blind assessment study, our system was estimated to have a seri...
Efficient fault-tolerant quantum computing
 Nature
, 1999
Cited by 20 (4 self)
Fault-tolerant quantum computing methods which work with efficient quantum error correcting codes are discussed. Several new techniques are introduced to restrict accumulation of errors before or during the recovery. Classes of eligible quantum codes are obtained, and good candidates exhibited. This permits a new analysis of the permissible error rates and minimum overheads for robust quantum computing. It is found that, under the standard noise model of ubiquitous stochastic, uncorrelated errors, a quantum computer need be only an order of magnitude larger than the logical machine contained within it in order to be reliable. For example, a scale-up by a factor of 22, with gate error rate of order 10^−5, is sufficient to permit large quantum algorithms such as factorization of thousand-digit numbers.