A theory of pattern rejection
In ARPA Image Understanding Workshop, 1996
Abstract - Cited by 7 (4 self)
The efficiency of pattern recognition is critical when a large number of classes are to be discriminated, or when the recognition algorithm needs to be applied a large number of times. We propose and analyze a general technique, namely pattern rejection, that results in efficient pattern recognition. Rejectors are introduced as algorithms that can very quickly eliminate from further consideration most classes or inputs (depending on the setting). Rejectors may be combined to form composite rejectors, which are more effective than any single rejector. Composite rejectors are analyzed, and conditions are derived that guarantee both efficiency and practicality. A general technique is proposed for the construction of composite rejectors, based on a single assumption about the classes. The generality of this assumption is shown through its connection with the Karhunen-Loeve expansion. A relation between pattern rejection and Fisher's discriminant analysis is also shown. Composite rejectors were constructed for two applications, namely object recognition and local feature detection. In both cases, a substantial improvement in efficiency over existing techniques was found.
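As a rough illustration of the rejector idea in this abstract, the sketch below chains cheap tests, each of which discards classes that an input cannot belong to. The interval-based tests, the class names and the helper names (`make_range_rejector`, `composite_reject`) are hypothetical; this is not the construction proposed in the paper.

```python
def make_range_rejector(feature, ranges):
    """Build a rejector that discards class c when x[feature] lies outside ranges[c].

    `ranges` maps each class name to an assumed (low, high) interval.
    """
    def reject(x, c):
        low, high = ranges[c]
        return not (low <= x[feature] <= high)
    return reject


def composite_reject(rejectors, classes, x):
    """Apply the rejectors in sequence and return the classes that survive all of them."""
    remaining = list(classes)
    for reject in rejectors:
        remaining = [c for c in remaining if not reject(x, c)]
        if len(remaining) <= 1:
            break  # nothing left to eliminate
    return remaining


# Two cheap tests over made-up feature ranges for three made-up classes.
rejectors = [
    make_range_rejector(0, {"a": (0, 5), "b": (3, 9), "c": (8, 12)}),
    make_range_rejector(1, {"a": (0, 1), "b": (2, 4), "c": (2, 4)}),
]
survivors = composite_reject(rejectors, ["a", "b", "c"], x=(4, 3))
```

Only the surviving classes would then be passed to a full, more expensive recognizer.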
Mining Optimal Decision Trees from Itemset Lattices
2007
Abstract - Cited by 6 (4 self)
We present an exact algorithm for finding a decision tree that optimizes a ranking function under size, depth, accuracy and leaf constraints. Because the discovery of optimal trees has high theoretical complexity, until now no efforts have been made to compute such trees for real-world datasets. An exact algorithm is of both scientific and practical interest. From the scientific point of view, it can be used as a gold standard to evaluate the performance of heuristic decision tree learners, and it can be used to gain new insight into traditional decision tree learners. From the application point of view, it can be used to discover trees that cannot be found by heuristic decision tree learners. The key idea behind our algorithm is the relation between constraints on decision trees and constraints on itemsets. We propose to exploit lattices of itemsets, from which we can extract optimal decision trees in linear time. We give several strategies to efficiently build these lattices and show that the test set accuracies of C4.5 compete with the test set accuracies of optimal trees.
Class-Dependent Features and Multicategory Classification
2001
Abstract - Cited by 3 (0 self)
Doctor of Philosophy thesis by Alex Bailey, Faculty of Engineering and Applied Science, Department of Electronics and Computer Science. The problem of pattern classification is considered for the case of multicategory classification where the number of classes, k, is greater than two. Many classification algorithms are in fact 2-class classifiers and are generalised to solve k-class problems. Which classifiers are naturally multicategory, and the nature of the generalisation of a 2-class classifier to k classes, are not often investigated. A thorough analysis of multicategory classification is given in this thesis which provides a new taxonomy of popular classification algorithms, and goes on to derive these from a probabilistic viewpoint. A clear distinction is made between classifiers that partition the input space and those that partition the set of k classes. Of the classifiers which partition the set of classes, the one-of-n, pairwise and hierarchical methods of decomposition are shown to be equivalent in the knowledge of the true data distributions. The scaling properties of these algorithms are analysed for increasing k. The effects of learning models on finite data are then investigated to show the practical differences between each decomposition.
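To make the three class decompositions named in this abstract concrete, the sketch below enumerates the 2-class subproblems each one produces for k classes. The function names are hypothetical, and the balanced halving used for the hierarchical case is only one assumed way to build such a hierarchy; the thesis may use a different one.

```python
from itertools import combinations


def one_of_n(classes):
    """One 2-class problem per class: that class against all the others."""
    return [([c], [o for o in classes if o != c]) for c in classes]


def pairwise(classes):
    """One 2-class problem per unordered pair of classes."""
    return [([a], [b]) for a, b in combinations(classes, 2)]


def hierarchical(classes):
    """One 2-class problem per internal node of a binary tree over the classes.

    Assumption: the class list is simply halved at each node.
    """
    if len(classes) < 2:
        return []
    mid = len(classes) // 2
    left, right = classes[:mid], classes[mid:]
    return [(left, right)] + hierarchical(left) + hierarchical(right)


classes = ["c1", "c2", "c3", "c4", "c5"]
counts = {
    "one_of_n": len(one_of_n(classes)),          # k problems
    "pairwise": len(pairwise(classes)),          # k(k-1)/2 problems
    "hierarchical": len(hierarchical(classes)),  # k-1 problems
}
```

The counts grow as k, k(k-1)/2 and k-1 respectively, which is the kind of scaling with k that the abstract refers to.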
Hyper-rectangle-based discriminative data generalization and applications in data mining
2007
Abstract - Cited by 3 (2 self)
The ultimate goal of data mining is to extract knowledge from massive data. Knowledge is ideally represented as human-comprehensible patterns from which end-users can gain intuitions and insights. Axis-parallel hyper-rectangles provide interpretable generalizations for multi-dimensional data points with numerical attributes. In this dissertation, we study the fundamental problem of rectangle-based discriminative data generalization in the context of several useful data mining applications: cluster description, rule learning, and Nearest Rectangle classification. Clustering is one of the most important data mining tasks. However, most clustering methods output sets of points as clusters and do not generalize them into interpretable patterns. We perform a systematic study of cluster description, where we propose novel description formats leading to enhanced expressive power and introduce novel description problems specifying different trade-offs between interpretability and accuracy. We also present efficient heuristic algorithms for the introduced problems in the proposed formats. If-then rules are
CST: Constructive Solid Trimming for Rendering BReps and CSG
IEEE Transactions on Visualization and Computer Graphics
Abstract - Cited by 2 (0 self)
To eliminate the need to evaluate the intersection curves in explicit representations of surface cutouts or of trimmed faces in BReps of CSG solids, we advocate using Constructive Solid Trimming (CST). A CST face is the intersection of a surface with a Blist representation of a trimming CSG volume. We propose a new, GPU-based, CSG rendering algorithm, which trims the boundary of each primitive using a Blist of its Active Zone. This approach is faster than the previously reported Blister approach, eliminates occasional speckles of wrongly colored pixels, and provides additional capabilities: painting on surfaces, rendering semitransparent CSG models, and highlighting selected features in the BReps of CSG models.
Optimized Blist Form (OBF)
Abstract - Cited by 2 (0 self)
Any Boolean expression may be converted into positive form, which has only union and intersection operators. Let E be a positive-form expression with n literals. Assume that the truth-values of the literals are read one at a time. The numbers s(n) of steps (operations) and b(n) of working memory bits (footprint) needed to evaluate E depend on E and on the evaluation technique. A recursive evaluation performs s(n)=n–1 steps but requires b(n)=log(n)+1 bits. Evaluating the disjunctive form of E uses only b(n)=2 bits, but may lead to an exponential growth of s(n). We propose a new Optimized Blist Form (OBF) that requires only s(n)=n steps and b(n)=⌈log₂ j⌉ bits, where j=⌈log₂(2n/3+2)⌉. We provide a simple, linear-cost algorithm for converting positive-form expressions to their OBF. We discuss three applications: (1) Direct CSG rendering, where a candidate surfel stored at a pixel is classified against an arbitrarily complex Boolean expression using a footprint of only 6 stencil bits; (2) the new Logic Matrix (LM), which evaluates any positive-form logical expression of n literals in a single cycle and uses a matrix of at most n×j wire/line connections; and (3) the new Logic Pipe (LP), which uses n gates that are connected by a pipe of ⌈log₂ j⌉ lines and, when receiving a staggered stream of input vectors, produces a value of a logical expression at each cycle.
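The footprint bound quoted in this abstract can be checked numerically. The sketch below reads the bound as b(n) = ceil(log2(j)) with j = ceil(log2(2n/3 + 2)) and evaluates it in exact integer arithmetic. The function names are hypothetical, and the code only evaluates the formula; it does not implement the OBF conversion or evaluation itself.

```python
def ceil_log2_ratio(numerator, denominator):
    """Smallest integer e >= 0 such that 2**e >= numerator / denominator.

    Uses integers only, so there is no floating-point rounding for large n.
    """
    e = 0
    while denominator * (1 << e) < numerator:
        e += 1
    return e


def obf_footprint_bits(n):
    """Footprint b(n) for a positive-form expression with n literals."""
    j = ceil_log2_ratio(2 * n + 6, 3)  # j = ceil(log2(2n/3 + 2))
    return ceil_log2_ratio(j, 1)       # b = ceil(log2(j))


def max_literals(bits):
    """Largest n whose footprint fits in `bits`, obtained by inverting the formula."""
    return 3 * (2 ** (2 ** bits - 1) - 1)


bits_for_ten_literals = obf_footprint_bits(10)
limit_for_six_bits = max_literals(6)
```

Under this reading of the formula, 6 bits cover expressions with up to 3·(2^63 − 1) literals, which is consistent with the abstract's statement that 6 stencil bits suffice for arbitrarily complex expressions in practice.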
Technical Note: Algorithms for Optimal Dyadic Decision Trees
Abstract
A dynamic programming algorithm for constructing optimal dyadic decision trees was recently introduced, analyzed, and shown to be very effective for low-dimensional data sets. This paper enhances and extends this algorithm by: introducing an adaptive grid search for the regularization parameter that guarantees optimal solutions for all relevant tree sizes, replacing the dynamic programming algorithm with a memoized recursive algorithm whose run time is substantially smaller for most regularization parameter values on the grid, and incorporating new data structures and data pre-processing steps that provide significant run time enhancement in practice.
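The memoized recursion mentioned in this abstract can be illustrated with a small two-dimensional sketch. It assumes points in the unit square, binary labels, a cost of misclassified points plus a fixed per-leaf `penalty`, and midpoint splits along either axis; the function name and this cost are assumptions, and the paper's adaptive grid search, data structures and pre-processing are not reproduced.

```python
from functools import lru_cache


def optimal_dyadic_cost(points, labels, max_depth, penalty):
    """Minimum of (training errors + penalty * leaves) over dyadic trees on [0, 1)^2."""
    data = list(zip(points, labels))

    @lru_cache(maxsize=None)
    def solve(dx, ix, dy, iy):
        # The cell is the ix-th of 2**dx slices along x and the iy-th of 2**dy along y.
        wx, wy = 1.0 / (1 << dx), 1.0 / (1 << dy)
        inside = [lab for (x, y), lab in data
                  if ix * wx <= x < (ix + 1) * wx and iy * wy <= y < (iy + 1) * wy]
        ones = sum(inside)
        # Option 1: stop here and predict the majority label in the cell.
        best = min(ones, len(inside) - ones) + penalty
        if not inside:
            return best  # splitting an empty cell can only add leaves
        # Option 2: split at the midpoint along x. The same child cell can be
        # reached through different split orders, so the cache avoids repeated work.
        if dx < max_depth:
            best = min(best, solve(dx + 1, 2 * ix, dy, iy)
                       + solve(dx + 1, 2 * ix + 1, dy, iy))
        # Option 3: split at the midpoint along y.
        if dy < max_depth:
            best = min(best, solve(dx, ix, dy + 1, 2 * iy)
                       + solve(dx, ix, dy + 1, 2 * iy + 1))
        return best

    return solve(0, 0, 0, 0)


# XOR-like toy data: no single split separates the labels, but two levels do.
points = [(0.25, 0.25), (0.75, 0.75), (0.25, 0.75), (0.75, 0.25)]
labels = [0, 0, 1, 1]
small_penalty_cost = optimal_dyadic_cost(points, labels, max_depth=1, penalty=0.1)
large_penalty_cost = optimal_dyadic_cost(points, labels, max_depth=1, penalty=5)
```

With a small penalty the four-leaf tree wins; with a large penalty the single-leaf tree wins. This is why the choice of regularization parameter changes which tree is optimal.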

