Results 1 – 10 of 102
Hierarchical Discriminant Analysis for Image Retrieval
 IEEE Trans. PAMI
, 1999
"... Abstract—A selforganizing framework for object recognition is described. We describe a hierarchical database structure for image retrieval. The SelfOrganizing Hierarchical Optimal Subspace Learning and Inference Framework (SHOSLIF) system uses the theories of optimal linear projection for automati ..."
Abstract

Cited by 47 (3 self)
Abstract—A self-organizing framework for object recognition is described. We describe a hierarchical database structure for image retrieval. The Self-Organizing Hierarchical Optimal Subspace Learning and Inference Framework (SHOSLIF) system uses the theories of optimal linear projection for automatic optimal feature derivation and a hierarchical structure to achieve a logarithmic retrieval complexity. A Space-Tessellation Tree is automatically generated using the Most Expressive Features (MEFs) and the Most Discriminating Features (MDFs) at each level of the tree. The major characteristics of the proposed hierarchical discriminant analysis include: 1) avoiding the limitation of global linear features (hyperplanes as separators) by deriving a recursively better-fitted set of features for each of the recursively subdivided sets of training samples; 2) generating a smaller tree whose cell boundaries separate the samples along the class boundaries better than principal component analysis, thereby giving a better generalization capability (i.e., a better recognition rate in a disjoint test); 3) accelerating the retrieval using a tree structure for data pruning, utilizing a different set of discriminant features at each level of the tree. We allow for perturbations in the size and position of objects in the images through learning. We demonstrate the technique on a large image database of widely varying real-world objects taken in natural settings, and show the applicability of the approach for variability in position, size, and 3D orientation. This paper concentrates on the hierarchical partitioning of the feature spaces. Index Terms—Principal component analysis, discriminant analysis, hierarchical image database, image retrieval, tessellation, partitioning, object recognition, face recognition, complexity with large image databases.
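As a rough illustration of the MEF/MDF idea described above (not the SHOSLIF implementation), the sketch below derives MEFs as the top principal components of the samples and MDFs as Fisher discriminants computed inside the MEF subspace. The function name and dimensions are illustrative assumptions.

```python
import numpy as np

def mef_mdf_features(X, y, n_mef=2, n_mdf=1):
    """Derive MEFs (PCA) and then MDFs (Fisher discriminants) in the MEF subspace."""
    # MEFs: top principal components of the centered samples
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    mef = Vt[:n_mef]                    # rows are the most expressive features
    Z = Xc @ mef.T                      # samples projected into the MEF subspace
    # MDFs: Fisher discriminants computed from class scatter in that subspace
    mean = Z.mean(axis=0)
    Sw = np.zeros((n_mef, n_mef))       # within-class scatter
    Sb = np.zeros((n_mef, n_mef))       # between-class scatter
    for c in np.unique(y):
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sw += (Zc - mc).T @ (Zc - mc)
        d = (mc - mean)[:, None]
        Sb += len(Zc) * (d @ d.T)
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-evals.real)
    mdf = evecs.real[:, order[:n_mdf]].T  # rows are the most discriminating features
    return mef, mdf
```

In the paper this derivation is applied recursively, with a fresh MEF/MDF pair per tree node rather than one global projection.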
Hierarchical discriminant regression
 IEEE Trans. Pattern Anal. Mach. Intell
, 2000
"... AbstractÐThe main motivation of this paper is to propose a new classification and regression method for challenging highdimensional data. The proposed new technique casts classification problems (class labels as output) and regression problems (numeric values as output) into a unified regression pro ..."
Abstract

Cited by 46 (24 self)
Abstract—The main motivation of this paper is to propose a new classification and regression method for challenging high-dimensional data. The proposed new technique casts classification problems (class labels as output) and regression problems (numeric values as output) into a unified regression problem. This unified view enables classification problems to use numeric information in the output space that is available for regression problems but is traditionally not readily available for classification problems—a distance metric among clustered class labels for coarse and fine classifications. A doubly clustered subspace-based hierarchical discriminating regression (HDR) method is proposed in this work. The major characteristics include: 1) Clustering is performed in both the output space and the input space at each internal node, termed "doubly clustered." Clustering in the output space provides virtual labels for computing clusters in the input space. 2) Discriminants in the input space are automatically derived from the clusters in the input space. These discriminants span the discriminating subspace at each internal node of the tree. 3) A hierarchical probability distribution model is applied to the resulting discriminating subspace at each internal node. This realizes a coarse-to-fine approximation of the probability distribution of the input samples in the hierarchical discriminating subspaces. No global distribution models are assumed. 4) To relax the per-class sample requirement of traditional discriminant analysis techniques, a sample-size-dependent negative-log-likelihood (NLL) is introduced. This new technique is designed to automatically deal with small-sample applications, large-sample applications, and unbalanced-sample applications. 5) The execution of the HDR method is fast, due to the empirical logarithmic time complexity of the HDR algorithm.
Although the method is applicable to any data, we report experimental results for three types of data: synthetic data for examining the near-optimal performance, large raw face-image databases, and traditional databases with manually selected features, along with a comparison with some major existing methods, such as CART,
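The "doubly clustered" step can be sketched as follows. This is a toy reading of the abstract, not the authors' implementation: a tiny k-means stands in for whatever output-space clustering HDR actually uses, and all names are mine.

```python
import numpy as np

def kmeans(Y, k, iters=20):
    """Tiny k-means; deterministic spread-out initialization for reproducibility."""
    centers = Y[np.linspace(0, len(Y) - 1, k).astype(int)].astype(float).copy()
    labels = np.zeros(len(Y), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((Y[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)
    return labels

def doubly_clustered_discriminants(X, Y, k=2, n_disc=1):
    # 1) cluster the OUTPUT space; cluster ids serve as "virtual labels"
    virtual = kmeans(Y, k)
    # 2) derive Fisher discriminants in the INPUT space from those virtual labels
    d = X.shape[1]
    mean = X.mean(axis=0)
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in range(k):
        Xc = X[virtual == c]
        if len(Xc) == 0:
            continue
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-evals.real)
    return virtual, evecs.real[:, order[:n_disc]]   # labels and discriminant directions
```

In HDR itself this happens independently at every internal node, on the samples routed to that node.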
RankGene: identification of diagnostic genes based on expression data
 Bioinformatics
, 2003
"... Summary: RankGene is a program for analyzing gene expression data and computing diagnostic genes based on their predictive power in distinguishing between different types of samples. The program integrates into one system a variety of popular ranking criteria, ranging from the traditional tstatisti ..."
Abstract

Cited by 33 (1 self)
Summary: RankGene is a program for analyzing gene expression data and computing diagnostic genes based on their predictive power in distinguishing between different types of samples. The program integrates into one system a variety of popular ranking criteria, ranging from the traditional t-statistic to one-dimensional support vector machines. This flexibility makes RankGene a useful tool in gene expression analysis and feature selection. Availability:
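For intuition, the simplest of the ranking criteria mentioned, the t-statistic, can be sketched as below. This is a hedged toy version, not RankGene's code; the Welch form and the function name are my assumptions.

```python
import numpy as np

def rank_genes_tstat(X, y):
    """Rank genes (columns of X) by |Welch t-statistic| between the two classes in y."""
    a, b = X[y == 0], X[y == 1]
    t = (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(
        a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.argsort(-np.abs(t))       # best-separating gene first
```

RankGene's contribution is collecting many such criteria (and SVM-based ones) behind one interface, so rankings can be compared.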
Discovering Interesting Patterns for Investment Decision Making with GLOWER – A Genetic Learner Overlaid With Entropy Reduction
, 2000
"... Prediction in financial domains is notoriously difficult for a number of reasons. First, theories tend to be weak or nonexistent, which makes problem formulation open ended by forcing us to consider a large number of independent variables and thereby increasing the dimensionality of the search spac ..."
Abstract

Cited by 31 (0 self)
Prediction in financial domains is notoriously difficult for a number of reasons. First, theories tend to be weak or nonexistent, which makes problem formulation open-ended by forcing us to consider a large number of independent variables and thereby increasing the dimensionality of the search space. Second, the weak relationships among variables tend to be nonlinear, and may hold only in limited areas of the search space. Third, in financial practice, where analysts conduct extensive manual analysis of historically well-performing indicators, a key is to find the hidden interactions among variables that perform well in combination. Unfortunately, these are exactly the patterns that the greedy search biases incorporated by many standard rule learning algorithms will miss. In this paper, we describe and evaluate several variations of a new genetic learning algorithm (GLOWER) on a variety of data sets. The design of GLOWER has been motivated by financial prediction problems, but incorpo...
Minimax-optimal classification with dyadic decision trees
 IEEE Trans. Inf. Theory
, 2006
"... Decision trees are among the most popular types of classifiers, with interpretability and ease of implementation being among their chief attributes. Despite the widespread use of decision trees, theoretical analysis of their performance has only begun to emerge in recent years. In this paper it is ..."
Abstract

Cited by 27 (4 self)
Decision trees are among the most popular types of classifiers, with interpretability and ease of implementation being among their chief attributes. Despite the widespread use of decision trees, theoretical analysis of their performance has only begun to emerge in recent years. In this paper it is shown that a new family of decision trees, dyadic decision trees (DDTs), attain nearly optimal (in a minimax sense) rates of convergence for a broad range of classification problems. Furthermore, DDTs are surprisingly adaptive in three important respects: They automatically (1) adapt to favorable conditions near the Bayes decision boundary; (2) focus on data distributed on lower dimensional manifolds; and (3) reject irrelevant features. DDTs are constructed by penalized empirical risk minimization using a new data-dependent penalty and may be computed exactly with computational complexity that is nearly linear in the training sample size. DDTs are the first classifier known to achieve nearly optimal rates for the diverse class of distributions studied here while also being practical and implementable. This is also the first study (of which we are aware) to consider rates for adaptation to intrinsic data dimension and relevant features.
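The defining constraint of a dyadic tree is that a cell may only be bisected at the midpoint of one coordinate. The sketch below illustrates just that constraint with a greedy misclassification criterion; the penalized pruning that drives the paper's rate results is omitted, and all names are mine.

```python
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0] if labels else 0

def grow_dyadic(pts, labels, lo, hi, depth=0, max_depth=4):
    """Grow a dyadic tree: a cell may only be bisected at the MIDPOINT of one coordinate."""
    if depth == max_depth or len(set(labels)) <= 1:
        return majority(labels)                      # leaf: majority vote
    best = None
    for d in range(len(lo)):
        mid = (lo[d] + hi[d]) / 2.0
        L = [i for i, p in enumerate(pts) if p[d] <= mid]
        R = [i for i in range(len(pts)) if i not in set(L)]
        # empirical misclassification error of the candidate bisection
        err = (sum(labels[i] != majority([labels[j] for j in L]) for i in L)
               + sum(labels[i] != majority([labels[j] for j in R]) for i in R))
        if best is None or err < best[0]:
            best = (err, d, mid, L, R)
    _, d, mid, L, R = best
    hi_left = list(hi); hi_left[d] = mid
    lo_right = list(lo); lo_right[d] = mid
    return (d, mid,
            grow_dyadic([pts[i] for i in L], [labels[i] for i in L], list(lo), hi_left,
                        depth + 1, max_depth),
            grow_dyadic([pts[i] for i in R], [labels[i] for i in R], lo_right, list(hi),
                        depth + 1, max_depth))

def predict_dyadic(tree, p):
    while isinstance(tree, tuple):
        d, mid, left, right = tree
        tree = left if p[d] <= mid else right
    return tree
```

Restricting splits to midpoints is what makes the space of candidate trees small enough for exact penalized search in nearly linear time.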
Shared Memory Parallelization of Data Mining Algorithms: Techniques, Programming Interface, and Performance
 In Proceedings of the second SIAM conference on Data Mining
, 2002
"... With recent technological advances, shared memory parallel machines have become more scalable, and oer large main memories and high bus bandwidths. They are emerging as good platforms for data warehousing and data mining. In this paper, we focus on shared memory parallelization of data mining alg ..."
Abstract

Cited by 27 (10 self)
With recent technological advances, shared memory parallel machines have become more scalable, and offer large main memories and high bus bandwidths. They are emerging as good platforms for data warehousing and data mining. In this paper, we focus on shared memory parallelization of data mining algorithms.
Application of Genetic Programming to Induction of Linear Classification Trees
 In Proceedings of the Third European Conference on Genetic Programming
, 2000
"... . A common problem in datamining is to find accurate classifiers for a dataset. For this purpose, genetic programming (GP) is applied to a set of benchmark classification problems. Using GP we are able to induce decision trees with a linear combination of variables in each function node. A new r ..."
Abstract

Cited by 20 (1 self)
A common problem in data mining is to find accurate classifiers for a dataset. For this purpose, genetic programming (GP) is applied to a set of benchmark classification problems. Using GP we are able to induce decision trees with a linear combination of variables in each function node. A new representation of decision trees using strong typing in GP is introduced. With this representation it is possible to let the GP classify into any number of classes. Results indicate that GP can be applied successfully to classification problems. Comparisons with current state-of-the-art algorithms in machine learning are presented and areas of future research are identified. 1 Introduction Classification problems form an important area in data mining. For example, a bank may want to classify its clients into good and bad credit risks, or a doctor may want to classify his patients as having diabetes or not. Classifiers may take the form of decision trees [11] (see Figure 1). In each node, a...
Top-Down Induction of Decision Trees Classifiers – A Survey
, 2002
"... Decision Trees are considered to be one of the most popular approaches for representing classifiers. Researchers from various disciplines such as statistics, machine learning, pattern recognition, and data mining considered the issue of growing a decision tree from available data. This paper present ..."
Abstract

Cited by 17 (3 self)
Decision trees are considered to be one of the most popular approaches for representing classifiers. Researchers from various disciplines such as statistics, machine learning, pattern recognition, and data mining have considered the issue of growing a decision tree from available data. This paper presents an updated survey of current methods for constructing decision tree classifiers in a top-down manner. The paper suggests a unified algorithmic framework for presenting these algorithms and provides detailed descriptions of the various splitting criteria and pruning methodologies.
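The generic top-down scheme such surveys cover can be sketched as follows: pick the best split by a criterion, recurse on the two children, stop at purity or a depth limit. This is a minimal illustration with the Gini criterion and no pruning; the names are mine, not the survey's.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Exhaustive search over (feature, threshold) pairs minimizing weighted child Gini."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i in range(len(rows)) if i not in set(left)]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    if len(set(labels)) == 1 or depth == max_depth:
        return Counter(labels).most_common(1)[0][0]   # pure or depth-limited leaf
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = split
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left],
                       depth + 1, max_depth),
            build_tree([rows[i] for i in right], [labels[i] for i in right],
                       depth + 1, max_depth))

def predict(tree, row):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree
```

Swapping `gini` for entropy, gain ratio, or another criterion yields most of the classic algorithms the survey unifies.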
Incremental Hierarchical Discriminant Regression
"... This paper presents Incremental Hierarchical Discriminant Regression (IHDR) which incrementally builds a decision tree or regression tree for very high dimensional regression or decision spaces by an online, realtime learning system. Biologically motivated, it is an approximate computational model ..."
Abstract

Cited by 16 (9 self)
This paper presents Incremental Hierarchical Discriminant Regression (IHDR), which incrementally builds a decision tree or regression tree for very high-dimensional regression or decision spaces by an online, real-time learning system. Biologically motivated, it is an approximate computational model for automatic development of associative cortex, with both bottom-up sensory inputs and top-down motor projections. At each internal node of the IHDR tree, information in the output space is used to automatically derive the local subspace spanned by the most discriminating features. Embedded in the tree is a hierarchical probability distribution model used to prune very unlikely cases during the search. The number of parameters in the coarse-to-fine approximation is dynamic and data-driven, enabling the IHDR tree to automatically fit data with unknown distribution shapes (for which it is difficult to select the number of parameters up front). The IHDR tree dynamically assigns long-term memory to avoid the loss-of-memory problem typical of a global-fitting learning algorithm for neural networks. A major challenge for an incrementally built tree is that the number of samples varies arbitrarily during the construction process. An incrementally updated probability model, called the sample-size-dependent negative-log-likelihood (SDNLL) metric, is used to deal with large-sample-size cases, small-sample-size cases, and unbalanced-sample-size cases, measured among different internal nodes of the IHDR tree. We report experimental results for four types of data: synthetic data to visualize the behavior of the algorithms, large face image data, continuous video streams from robot navigation, and publicly available data sets that use human-defined features.
Omnivariate Decision Trees
"... Univariate decision trees at each decision node consider the value of only one feature leading to axisaligned splits. In a linear multivariate decision tree, each decision node divides the input space into two with a hyperplane. In a nonlinear multivariate tree, a multilayer perceptron at each node ..."
Abstract

Cited by 14 (8 self)
Univariate decision trees consider, at each decision node, the value of only one feature, leading to axis-aligned splits. In a linear multivariate decision tree, each decision node divides the input space into two with a hyperplane. In a nonlinear multivariate tree, a multilayer perceptron at each node divides the input space arbitrarily, at the expense of increased complexity and a higher risk of overfitting. We propose omnivariate trees, where the decision node may be univariate, linear, or nonlinear depending on the outcome of comparative statistical tests on accuracy, thus automatically matching the complexity of the node with the subproblem defined by the data reaching that node. Such an architecture frees the designer from choosing the appropriate node type, doing model selection automatically at each node. Our simulation results indicate that such a decision tree induction method generalizes better than trees with the same type of node everywhere and induces small trees.