Mining Optimal Decision Trees from Itemset Lattices
, 2007
Cited by 10 (5 self)
We present an exact algorithm for finding a decision tree that optimizes a ranking function under size, depth, accuracy and leaf constraints. Because the discovery of optimal trees has high theoretical complexity, until now no efforts have been made to compute such trees for real-world datasets. An exact algorithm is of both scientific and practical interest. From the scientific point of view, it can be used as a gold standard to evaluate the performance of heuristic decision tree learners, and it can be used to gain new insight into traditional decision tree learners. From the application point of view, it can be used to discover trees that cannot be found by heuristic decision tree learners. The key idea behind our algorithm is the relation between constraints on decision trees and constraints on itemsets. We propose to exploit lattices of itemsets, from which we can extract optimal decision trees in linear time. We give several strategies to efficiently build these lattices and show that the test set accuracies of C4.5 compete with the test set accuracies of optimal trees.
A theory of pattern rejection
 IN ARPA IMAGE UNDERSTANDING WORKSHOP
, 1996
Cited by 7 (4 self)
The efficiency of pattern recognition is critical when a large number of classes are to be discriminated, or when the recognition algorithm needs to be applied a large number of times. We propose and analyze a general technique, namely pattern rejection, that results in efficient pattern recognition. Rejectors are introduced as algorithms that can very quickly eliminate from further consideration most classes or inputs (depending on the setting). Rejectors may be combined to form composite rejectors, which are more effective than any single rejector. Composite rejectors are analyzed and conditions derived which guarantee both efficiency and practicality. A general technique is proposed for the construction of composite rejectors, based on a single assumption about the classes. The generality of this assumption is shown through its connection with the Karhunen-Loève expansion. A relation of pattern rejection with Fisher's discriminant analysis is also shown. Composite rejectors were constructed for two applications, namely, object recognition and local feature detection. In both cases, a substantial improvement in efficiency over existing techniques was found.
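The idea of chaining rejectors can be illustrated with a minimal sketch. It is not the construction from the paper: the axis-aligned boxes, the `axis_rejector` helper and the class names are all hypothetical, chosen only to show how each cheap test narrows the set of surviving classes.

```python
from typing import Callable, Dict, Iterable, List, Sequence, Set, Tuple

# Hypothetical class model: one (low, high) interval per axis.
Box = Tuple[Tuple[float, float], ...]
Rejector = Callable[[Sequence[float], Set[str]], Set[str]]


def axis_rejector(boxes: Dict[str, Box], axis: int) -> Rejector:
    """Build a cheap rejector that tests a single coordinate of the input."""

    def reject(x: Sequence[float], remaining: Set[str]) -> Set[str]:
        # Keep only classes whose interval on this axis contains x[axis].
        return {
            name
            for name in remaining
            if boxes[name][axis][0] <= x[axis] <= boxes[name][axis][1]
        }

    return reject


def composite_rejector(
    rejectors: List[Rejector], x: Sequence[float], classes: Iterable[str]
) -> Set[str]:
    """Apply rejectors in sequence, stopping once at most one class survives."""
    remaining = set(classes)
    for reject in rejectors:
        remaining = reject(x, remaining)
        if len(remaining) <= 1:
            break
    return remaining


boxes = {
    "a": ((0.0, 1.0), (0.0, 1.0)),
    "b": ((0.0, 1.0), (2.0, 3.0)),
    "c": ((5.0, 6.0), (0.0, 1.0)),
}
rejectors = [axis_rejector(boxes, 0), axis_rejector(boxes, 1)]
survivors = composite_rejector(rejectors, (0.5, 2.5), boxes)
```

Here the first rejector discards class `c` and the second discards class `a`, so only the surviving class would need to be passed to a full (more expensive) recognizer.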
Hyper-rectangle-based discriminative data generalization and applications in data mining
, 2007
Cited by 5 (2 self)
The ultimate goal of data mining is to extract knowledge from massive data. Knowledge is ideally represented as human-comprehensible patterns from which end-users can gain intuitions and insights. Axis-parallel hyper-rectangles provide interpretable generalizations for multi-dimensional data points with numerical attributes. In this dissertation, we study the fundamental problem of rectangle-based discriminative data generalization in the context of several useful data mining applications: cluster description, rule learning, and Nearest Rectangle classification. Clustering is one of the most important data mining tasks. However, most clustering methods output sets of points as clusters and do not generalize them into interpretable patterns. We perform a systematic study of cluster description, where we propose novel description formats leading to enhanced expressive power and introduce novel description problems specifying different trade-offs between interpretability and accuracy. We also present efficient heuristic algorithms for the introduced problems in the proposed formats. If-then rules are
Class-Dependent Features and Multicategory Classification
, 2001
Cited by 3 (0 self)
Doctor of Philosophy thesis by Alex Bailey, Faculty of Engineering and Applied Science, Department of Electronics and Computer Science. The problem of pattern classification is considered for the case of multicategory classification where the number of classes, k, is greater than two. Many classification algorithms are in fact 2-class classifiers and are generalised to solve k-class problems. Which classifiers are naturally multicategory, and the nature of the generalisation of a 2-class classifier to k classes, are not often investigated. A thorough analysis of multicategory classification is given in this thesis which provides a new taxonomy of popular classification algorithms, and goes on to derive these from a probabilistic viewpoint. A clear distinction is made between classifiers that partition the input space and those that partition the set of k classes. Of the classifiers which partition the set of classes, the one-of-n, pairwise and hierarchical methods of decomposition are shown to be equivalent in the knowledge of the true data distributions. The scaling properties of these algorithms are analysed for increasing k. The effects of learning models on finite data are then investigated to show the practical differences between each decomposition.
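As a rough illustration of how the three decompositions scale with k, the sketch below counts the 2-class sub-problems each one needs. The counts are the standard ones and are an assumption here, not figures taken from the thesis: one classifier per class for one-of-n, one per unordered pair of classes for pairwise, and one per internal node of a binary class hierarchy. The function name is hypothetical.

```python
def binary_subproblems(k: int) -> dict:
    """Count the 2-class sub-problems needed to decompose a k-class problem."""
    if k < 2:
        raise ValueError("k must be at least 2")
    return {
        "one_of_n": k,                  # each class against all the others
        "pairwise": k * (k - 1) // 2,   # every unordered pair of classes
        "hierarchical": k - 1,          # internal nodes of a binary tree with k leaves
    }


counts = binary_subproblems(10)
```

The pairwise count grows quadratically in k while the other two grow linearly, although each pairwise sub-problem only involves the data of two classes.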
Optimized Blist Form (OBF)
Cited by 3 (0 self)
Any Boolean expression may be converted into positive form, which has only union and intersection operators. Let E be a positive-form expression with n literals. Assume that the truth-values of the literals are read one at a time. The numbers s(n) of steps (operations) and b(n) of working memory bits (footprint) needed to evaluate E depend on E and on the evaluation technique. A recursive evaluation performs s(n) = n – 1 steps, but requires b(n) = log(n) + 1 bits. Evaluating the disjunctive form of E uses only b(n) = 2 bits, but may lead to an exponential growth of s(n). We propose a new Optimized Blist Form (OBF), which requires only s(n) = n steps and b(n) = ⌈log₂ j⌉ bits, where j = ⌈log₂(2n/3 + 2)⌉. We provide a simple and linear-cost algorithm for converting positive-form expressions to their OBF. We discuss three applications: (1) Direct CSG rendering, where a candidate surfel is classified against an arbitrarily complex Boolean expression (up to 27,600,000,000,000,000,000 literals) using a footprint of only 6 stencil bits; (2) the new programmable Logic Matrix (LM), which evaluates any positive-form logical expression of n literals in a single clock cycle and uses a matrix of at most n×j wire/line connections; and (3) the new programmable Logic Pipe (LP), which uses n gates connected by a pipe of ⌈log₂ j⌉ lines and, when receiving a staggered stream of input vectors, produces a value of a logical expression at each clock cycle.
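The footprint formula can be checked numerically. The sketch below assumes the bracket symbols in the abstract denote ceilings, i.e. b(n) = ceil(log2 j) with j = ceil(log2(2n/3 + 2)), and uses integer arithmetic to avoid floating-point rounding for very large n. The function name is hypothetical.

```python
def obf_footprint_bits(n: int) -> int:
    """Footprint b(n) = ceil(log2 j), where j = ceil(log2(2n/3 + 2))."""
    if n < 1:
        raise ValueError("n must be a positive number of literals")
    # j is the smallest integer with 2**j >= 2n/3 + 2, i.e. 3 * 2**j >= 2n + 6.
    j = 0
    while 3 * (1 << j) < 2 * n + 6:
        j += 1
    # For an integer j >= 1, ceil(log2 j) equals the bit length of j - 1.
    return (j - 1).bit_length()


bits_for_quoted_limit = obf_footprint_bits(27_600_000_000_000_000_000)
```

Under this reading, the literal count quoted for direct CSG rendering gives j = 64 and therefore 6 bits, which matches the 6 stencil bits stated in the abstract.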
CST: Constructive Solid Trimming for Rendering BReps and CSG
 IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS
Cited by 2 (0 self)
To eliminate the need to evaluate the intersection curves in explicit representations of surface cutouts or of trimmed faces in BReps of CSG solids, we advocate using Constructive Solid Trimming (CST). A CST face is the intersection of a surface with a Blist representation of a trimming CSG volume. We propose a new, GPU-based, CSG rendering algorithm, which trims the boundary of each primitive using a Blist of its Active Zone. This approach is faster than the previously reported Blister approach, eliminates occasional speckles of wrongly colored pixels, and provides additional capabilities: painting on surfaces, rendering semitransparent CSG models, and highlighting selected features in the BReps of CSG models.
Codebook Generation for Vector Quantization on Orthogonal Polynomials based Transform Coding
Cited by 1 (0 self)
In this paper, a new algorithm for generating a codebook is proposed for vector quantization (VQ) in image coding. The significant features of the training image vectors are extracted by using the proposed Orthogonal Polynomials based transformation. We propose to generate the codebook by partitioning these feature vectors into a binary tree. Each feature vector at a nonterminal node of the binary tree is directed to one of the two descendants by comparing a single feature associated with that node to a threshold. The binary tree codebook is used for encoding and decoding the feature vectors. In the decoding process the feature vectors are subjected to inverse transformation with the help of the basis functions of the proposed Orthogonal Polynomials based transformation to get back the approximated input image training vectors. The results of the proposed coding are compared with VQ using the Discrete Cosine Transform (DCT) and the Pairwise Nearest Neighbor (PNN) algorithm. The new algorithm results in a considerable reduction in computation time and provides better reconstructed picture quality.
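The tree-structured lookup described above can be sketched as follows. This is only an illustration of routing a feature vector by single-feature threshold tests; the `Node` and `Leaf` types, the thresholds and the codewords are hypothetical, and the orthogonal-polynomial transform and the tree construction itself are not shown.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Union


@dataclass
class Leaf:
    index: int                  # codeword index emitted by the encoder
    codeword: Sequence[float]   # representative feature vector for this cell


@dataclass
class Node:
    feature: int                # which feature this node tests
    threshold: float
    left: "Tree"                # followed when vector[feature] <= threshold
    right: "Tree"               # followed otherwise


Tree = Union[Leaf, Node]


def encode(tree: Tree, vector: Sequence[float]) -> int:
    """Route a feature vector to a leaf using one comparison per level."""
    node = tree
    while isinstance(node, Node):
        node = node.left if vector[node.feature] <= node.threshold else node.right
    return node.index


def build_decoder(tree: Tree) -> Dict[int, Sequence[float]]:
    """Collect index -> codeword pairs so the decoder is a table lookup."""
    if isinstance(tree, Leaf):
        return {tree.index: tree.codeword}
    table = build_decoder(tree.left)
    table.update(build_decoder(tree.right))
    return table


# Hypothetical three-codeword tree over two features.
tree = Node(0, 0.5, Leaf(0, [0.2, 0.1]),
            Node(1, 1.0, Leaf(1, [0.8, 0.5]), Leaf(2, [0.9, 1.5])))
decoder = build_decoder(tree)
code = encode(tree, [0.7, 2.0])
approximation = decoder[code]
```

Encoding costs one comparison per tree level instead of a distance computation against every codeword, which is consistent with the reduction in computation time that the abstract reports.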
Technical Note: Algorithms for Optimal Dyadic Decision Trees
A dynamic programming algorithm for constructing optimal dyadic decision trees was recently introduced, analyzed, and shown to be very effective for low-dimensional data sets. This paper enhances and extends this algorithm by: introducing an adaptive grid search for the regularization parameter that guarantees optimal solutions for all relevant tree sizes; replacing the dynamic programming algorithm with a memoized recursive algorithm whose run time is substantially smaller for most regularization parameter values on the grid; and incorporating new data structures and data preprocessing steps that provide significant run time enhancement in practice.
Mining Optimal Decision Trees from Itemset Lattices (2007). DOI: 10.1145/1281192.1281250
, 2009
We present DL8, an exact algorithm for finding a decision tree that optimizes a ranking function under size, depth, accuracy and leaf constraints. Because the discovery of optimal trees has high theoretical complexity, until now no efforts have been made to compute such trees for real-world datasets. An exact algorithm is of both scientific and practical interest. From a scientific point of view, it can be used as a gold standard to evaluate the performance of heuristic decision tree learners and to gain new insight into these traditional learners. From the application point of view, it can be used to discover trees that cannot be found by heuristic decision tree learners. The key idea behind our algorithm is the relation between constraints on decision trees and constraints on itemsets. We propose to exploit lattices of itemsets, from which we can extract optimal decision trees in linear time. We give several strategies to efficiently build these lattices. Experiments show that under the same constraints, DL8 has better test results than C4.5, which confirms that exhaustive search does not always imply overfitting. The results also show that DL8 is a useful and interesting tool to learn decision trees under constraints.