Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey
Data Mining and Knowledge Discovery, 1997
Cited by 122 (1 self)
Decision trees have proved to be valuable tools for the description, classification and generalization of data. Work on constructing decision trees from data exists in multiple disciplines such as statistics, pattern recognition, decision theory, signal processing, machine learning and artificial neural networks. Researchers in these disciplines, sometimes working on quite different problems, identified similar issues and heuristics for decision tree construction. This paper surveys existing work on decision tree construction, attempting to identify the important issues involved, directions the work has taken and the current state of the art.
Keywords: classification, tree-structured classifiers, data compaction
1. Introduction
Advances in data collection methods, storage and processing technology are providing a unique challenge and opportunity for automated data exploration techniques. Enormous amounts of data are being collected daily from major scientific projects, e.g., Human Genome...
Data Mining: Research Trends, Challenges, and Applications
in Rough Sets and Data Mining: Analysis of Imprecise Data, 1997
Cited by 14 (7 self)
Data mining is an interdisciplinary research area spanning several disciplines such as database systems, machine learning, intelligent information systems, statistics, and expert systems. Data mining has evolved into an important and active area of research because of theoretical challenges and practical applications associated with the problem of discovering (or extracting) interesting and previously unknown knowledge from very large real-world databases. Many aspects of data mining have been investigated in several related fields. A unique but important aspect of the problem lies in the need to extend these studies to include the nature of the contents of real-world databases. In this chapter, we discuss the theory and foundational issues in data mining, describe data mining methods and algorithms, and review data mining applications. Since a major focus of this book is on rough sets and their applications to database mining, one full section is devoted to summari...
Classification of High Dimensional Data With Limited Training Samples
1998
Cited by 13 (5 self)
High Dimensional Feature Reduction Via Projection Pursuit
1995
Cited by 13 (3 self)
Covariance approximation for fast and accurate computation of channelized Hotelling observer statistics
IEEE Trans. Nuc. Sci, 2000
Cited by 12 (2 self)
We describe a method for computing linear observer statistics for maximum a posteriori (MAP) reconstructions of PET images. The method is based on a theoretical approximation for the mean and covariance of MAP reconstructions. In particular, we derive here a closed form for the channelized Hotelling observer (CHO) statistic applied to 2D MAP images. We show reasonably good correspondence between these theoretical results and Monte Carlo studies. The accuracy and low computational cost of the approximation allow us to analyze the observer performance over a wide range of operating conditions and parameter settings for the MAP reconstruction algorithm.
Statistical Techniques for Language Recognition: An Introduction and Guide for Cryptanalysts
Cryptologia, 1993
Cited by 10 (2 self)
We explain how to apply statistical techniques to solve several language-recognition problems that arise in cryptanalysis and other domains. Language recognition is important in cryptanalysis because, among other applications, an exhaustive key search of any cryptosystem from ciphertext alone requires a test that recognizes valid plaintext. Written for cryptanalysts, this guide should also be helpful to others as an introduction to statistical inference on Markov chains. Modeling language as a finite stationary Markov process, we adapt a statistical model of pattern recognition to language recognition. Within this framework we consider four well-defined language-recognition problems: 1) recognizing a known language, 2) distinguishing a known language from uniform noise, 3) distinguishing unknown 0th-order noise from unknown 1st-order language, and 4) detecting non-uniform unknown language. For the second problem we give a most powerful test based on the Neyman-Pearson Lemma. For the oth...
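As a rough illustration of the second problem in the abstract above (a known language versus uniform noise), the Neyman-Pearson Lemma leads to a likelihood-ratio test. The sketch below assumes a first-order Markov model; the two-letter alphabet, the toy probabilities in `TOY_INIT` and `TOY_TRANS`, and the function name are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical toy "language" over the alphabet {a, b}: letters tend to repeat.
TOY_INIT = {"a": 0.5, "b": 0.5}
TOY_TRANS = {
    "a": {"a": 0.9, "b": 0.1},
    "b": {"a": 0.1, "b": 0.9},
}

def log_likelihood_ratio(text, init, trans, alphabet_size):
    """Return log P(text | Markov language) - log P(text | uniform noise)."""
    # Log-likelihood under the first-order Markov language model.
    ll_language = math.log(init[text[0]])
    for prev, cur in zip(text, text[1:]):
        ll_language += math.log(trans[prev][cur])
    # Under uniform noise every symbol has probability 1 / alphabet_size.
    ll_noise = len(text) * math.log(1.0 / alphabet_size)
    return ll_language - ll_noise

# Decide "language" when the ratio exceeds a threshold; the threshold
# (0.0 here, an assumption) trades false alarms against missed detections.
for candidate in ("aaaa", "abab"):
    ratio = log_likelihood_ratio(candidate, TOY_INIT, TOY_TRANS, 2)
    print(candidate, "language" if ratio > 0.0 else "noise")
```

In a key-search setting the same statistic would be computed for each trial decryption, with the model estimated from a corpus of the target language.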
Decision Tree Induction: How Effective is the Greedy Heuristic?
In Proceedings of the First International Conference on Knowledge Discovery and Data Mining, 1995
Cited by 8 (4 self)
Most existing decision tree systems use a greedy approach to induce trees: locally optimal splits are induced at every node of the tree. Although the greedy approach is suboptimal, it is believed to produce reasonably good trees. In the current work, we attempt to verify this belief. We quantify the goodness of greedy tree induction empirically, using the popular decision tree algorithms, C4.5 and CART. We induce decision trees on thousands of synthetic data sets and compare them to the corresponding optimal trees, which in turn are found using a novel map coloring idea. We measure the effect on greedy induction of variables such as the underlying concept complexity, training set size, noise and dimensionality. Our experiments show, among other things, that the expected classification cost of a greedily induced tree is consistently very close to that of the optimal tree.
Introduction
Decision trees are known to be effective classifiers in a variety of domains. Most of the methods ...
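The greedy step mentioned in the abstract above, picking a locally optimal split at a single node, can be sketched roughly as follows. This is a minimal illustration assuming numeric features, axis-parallel thresholds, and Gini impurity as the split criterion; the function names are hypothetical and this is not the experimental code used in the paper.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_impurity) of the best split.

    Candidate thresholds are midpoints between consecutive distinct values
    of each feature. Returns None if no feature has two distinct values.
    """
    n = len(rows)
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(row[j] for row in rows))
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2.0
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            # Impurity of the two children, weighted by their sizes.
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, threshold, score)
    return best
```

A full greedy inducer would apply `best_split` recursively to the two children until a stopping rule fires; no lookahead beyond the current node is performed, which is what makes the procedure greedy.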
Statistics enhancement in hyperspectral data analysis using spectral-spatial labeling, the EM algorithm, and the leave-one-out covariance estimator
Proc. SPIE, 1999
Cited by 6 (0 self)
Hyperspectral data potentially contain more information than multispectral data because of higher dimensionality. Information extraction algorithm performance is strongly related to the quantitative precision with which the desired classes are defined, a characteristic which increases rapidly with dimensionality. Due to the limited number of training samples used in defining classes, the information extraction of hyperspectral data may not perform as well as needed. In this paper, schemes for statistics enhancement are investigated for alleviating this problem. Previous works including the EM algorithm and the Leave-One-Out covariance estimator are discussed. The HALF covariance estimator is proposed for two-class problems by using the symmetry property of the normal distribution. A spectral-spatial labeling scheme is proposed to increase the training sample sizes automatically. We also seek to combine previous works with the proposed methods so as to take full advantage of statistics enhancement. Using these techniques, improvement in classification accuracy has been observed.
Robust support vector method for hyperspectral data classification and knowledge discovery
IEEE Transactions on Geoscience and Remote Sensing, 2004
Cited by 6 (4 self)
In this paper, we propose the use of Support Vector Machines (SVM) for automatic hyperspectral data classification and knowledge discovery. In the first stage of the study, we use SVMs for crop classification and analyze their performance in terms of efficiency and robustness, as compared to extensively used neural and fuzzy methods. Efficiency is assessed by evaluating accuracy and statistical differences in several scenes. Robustness is analyzed in terms of (a) suitability to working conditions when a feature selection stage is not possible, and (b) performance when different levels of Gaussian noise are introduced at their inputs. In the second stage of this work, we analyze the distribution of the support vectors (SV) and perform sensitivity analysis on the best classifier in order to analyze the significance of the input spectral bands. For classification purposes, six hyperspectral images acquired with the 128-band HyMAP spectrometer during the DAISEX-1999 campaign are used. Six crop classes were labelled for each image. A reduced set of labelled samples is used to train the models and the entire images are used to assess their performance. Several conclusions are drawn: (1) SVMs yield better outcomes than neural networks regarding accuracy, simplicity and robustness; (2) training neural and neurofuzzy models is unfeasible when working with high dimensional input spaces and great amounts of training data; (3) SVMs perform similarly for different training subsets with varying input dimension, which indicates that noisy bands are successfully detected; and (4) a valuable ranking of bands through sensitivity analysis is achieved.
Index Terms: Hyperspectral imagery, crop classification, knowledge discovery, Support Vector Machines, neural networks.
A General Model for Finite-Sample Effects in Training and Testing of Competing Classifiers
2003
Cited by 5 (0 self)
The conventional wisdom in the field of statistical pattern recognition (SPR) is that the size of the finite test sample dominates the variance in the assessment of the performance of a classical or neural classifier. The present work shows that this result has only narrow applicability. In particular, when competing algorithms are being compared, the finite training sample more commonly dominates this uncertainty. This general problem in SPR is analyzed using a formal structure recently developed for multivariate random-effects receiver operating characteristic (ROC) analysis. Monte Carlo trials within the general model are used to explore the detailed statistical structure of several representative problems in the sub-field of computer-aided diagnosis in medicine. The scaling laws between variance of accuracy measures and number of training samples and number of test samples are investigated and found to be comparable to those discussed in the classic text of Fukunaga, but important interaction terms have been neglected by previous authors. Finally, the importance of the contribution of finite trainers to the uncertainties argues for some form of bootstrap analysis to sample that uncertainty. The leading contemporary candidate is an extension of the 0.632 bootstrap and associated error analysis, as opposed to the more commonly used cross-validation.
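For orientation, the basic 0.632 bootstrap error estimate that the abstract above refers to can be sketched as below. This shows only the plain estimator, not the extension the paper discusses. The one-dimensional nearest-mean classifier, the function names, and the defaults are hypothetical choices for illustration, and the out-of-bag error is averaged per replicate, which is a simplification of the leave-one-out bootstrap.

```python
import random

def fit_nearest_mean(xs, ys):
    """Fit a 1-D nearest-mean classifier: map each class to its mean."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def error_rate(model, xs, ys):
    """Fraction of points whose nearest class mean is not their label."""
    wrong = sum(min(model, key=lambda c: abs(x - model[c])) != y
                for x, y in zip(xs, ys))
    return wrong / len(xs)

def bootstrap_632(xs, ys, n_boot=200, seed=0):
    """Return 0.368 * resubstitution error + 0.632 * out-of-bag error."""
    rng = random.Random(seed)
    n = len(xs)
    resub = error_rate(fit_nearest_mean(xs, ys), xs, ys)
    oob_errors = []
    for _ in range(n_boot):
        # Draw n training indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        held_out = [i for i in range(n) if i not in chosen]
        if not held_out:
            continue  # every point was drawn; nothing to test on
        model = fit_nearest_mean([xs[i] for i in idx], [ys[i] for i in idx])
        oob_errors.append(error_rate(model,
                                     [xs[i] for i in held_out],
                                     [ys[i] for i in held_out]))
    if not oob_errors:
        return resub  # fallback when no replicate left any point out
    oob = sum(oob_errors) / len(oob_errors)
    return 0.368 * resub + 0.632 * oob
```

Because each replicate retrains the classifier on a resampled training set, the spread of the per-replicate errors also gives a rough view of the training-sample variability that the abstract emphasizes.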

