Results 1  10
of
35
Automatic Construction of Decision Trees from Data: A MultiDisciplinary Survey
 Data Mining and Knowledge Discovery
, 1997
"... Decision trees have proved to be valuable tools for the description, classification and generalization of data. Work on constructing decision trees from data exists in multiple disciplines such as statistics, pattern recognition, decision theory, signal processing, machine learning and artificial ne ..."
Abstract

Cited by 146 (1 self)
 Add to MetaCart
Decision trees have proved to be valuable tools for the description, classification and generalization of data. Work on constructing decision trees from data exists in multiple disciplines such as statistics, pattern recognition, decision theory, signal processing, machine learning and artificial neural networks. Researchers in these disciplines, sometimes working on quite different problems, identified similar issues and heuristics for decision tree construction. This paper surveys existing work on decision tree construction, attempting to identify the important issues involved, directions the work has taken and the current state of the art. Keywords: classification, treestructured classifiers, data compaction 1. Introduction Advances in data collection methods, storage and processing technology are providing a unique challenge and opportunity for automated data exploration techniques. Enormous amounts of data are being collected daily from major scientific projects e.g., Human Genome...
RSVM: Reduced support vector machines
 Data Mining Institute, Computer Sciences Department, University of Wisconsin
, 2001
"... Abstract An algorithm is proposed which generates a nonlinear kernelbased separating surface that requires as little as 1 % of a large dataset for its explicit evaluation. To generate this nonlinear surface, the entire dataset is used as a constraint in an optimization problem with very few variabl ..."
Abstract

Cited by 122 (16 self)
 Add to MetaCart
Abstract An algorithm is proposed which generates a nonlinear kernelbased separating surface that requires as little as 1 % of a large dataset for its explicit evaluation. To generate this nonlinear surface, the entire dataset is used as a constraint in an optimization problem with very few variables corresponding to the 1%
Smoothing Methods for Convex Inequalities and Linear Complementarity Problems
 Mathematical Programming
, 1993
"... A smooth approximation p(x; ff) to the plus function: maxfx; 0g, is obtained by integrating the sigmoid function 1=(1 + e \Gammaffx ), commonly used in neural networks. By means of this approximation, linear and convex inequalities are converted into smooth, convex unconstrained minimization probl ..."
Abstract

Cited by 62 (6 self)
 Add to MetaCart
A smooth approximation p(x; ff) to the plus function: maxfx; 0g, is obtained by integrating the sigmoid function 1=(1 + e \Gammaffx ), commonly used in neural networks. By means of this approximation, linear and convex inequalities are converted into smooth, convex unconstrained minimization problems, the solution of which approximates the solution of the original problem to a high degree of accuracy for ff sufficiently large. In the special case when a Slater constraint qualification is satisfied, an exact solution can be obtained for finite ff. Speedup over MINOS 5.4 was as high as 515 times for linear inequalities of size 1000 \Theta 1000, and 580 times for convex inequalities with 400 variables. Linear complementarity problems are converted into a system of smooth nonlinear equations and are solved by a quadratically convergent Newton method. For monotone LCP's with as many as 400 variables, the proposed approach was as much as 85 times faster than Lemke's method. Key Words: Smo...
Mathematical Programming for Data Mining: Formulations and Challenges
 INFORMS Journal on Computing
, 1998
"... This paper is intended to serve as an overview of a rapidly emerging research and applications area. In addition to providing a general overview, motivating the importance of data mining problems within the area of knowledge discovery in databases, our aim is to list some of the pressing research ch ..."
Abstract

Cited by 47 (0 self)
 Add to MetaCart
This paper is intended to serve as an overview of a rapidly emerging research and applications area. In addition to providing a general overview, motivating the importance of data mining problems within the area of knowledge discovery in databases, our aim is to list some of the pressing research challenges, and outline opportunities for contributions by the optimization research communities. Towards these goals, we include formulations of the basic categories of data mining methods as optimization problems. We also provide examples of successful mathematical programming approaches to some data mining problems. keywords: data analysis, data mining, mathematical programming methods, challenges for massive data sets, classification, clustering, prediction, optimization. To appear: INFORMS: Journal of Compting, special issue on Data Mining, A. Basu and B. Golden (guest editors). Also appears as Mathematical Programming Technical Report 9801, Computer Sciences Department, University of Wi...
Nuclear Feature Extraction For Breast Tumor Diagnosis
, 1993
"... Interactive image processing techniques, along with a linearprogrammingbased inductive classifier, have been used to create a highly accurate system for diagnosis of breast tumors. A small fraction of a fine needle aspirate slide is selected and digitized. With an interactive interface, the user i ..."
Abstract

Cited by 37 (7 self)
 Add to MetaCart
Interactive image processing techniques, along with a linearprogrammingbased inductive classifier, have been used to create a highly accurate system for diagnosis of breast tumors. A small fraction of a fine needle aspirate slide is selected and digitized. With an interactive interface, the user initializes active contour models, known as snakes, near the boundaries of a set of cell nuclei. The customized snakes are deformed to the exact shape of the nuclei. This allows for precise, automated analysis of nuclear size, shape and texture. Ten such features are computed for each nucleus, and the mean value, largest (or "worst") value and standard error of each feature are found over the range of isolated cells. After 569 images were analyzed in this fashion, different combinations of features were tested to find those which best separate benign from malignant samples. Tenfold crossvalidation accuracy of 97% was achieved using a single separating plane on three of the thirty features: ...
A New Class Of Incremental Gradient Methods For Least Squares Problems
 SIAM J. Optim
, 1996
"... The LMS method for linear least squares problems di#ers from the steepest descent method in that it processes data blocks onebyone, with intermediate adjustment of the parameter vector under optimization. This mode of operation often leads to faster convergence when far from the eventual limit, an ..."
Abstract

Cited by 34 (3 self)
 Add to MetaCart
The LMS method for linear least squares problems di#ers from the steepest descent method in that it processes data blocks onebyone, with intermediate adjustment of the parameter vector under optimization. This mode of operation often leads to faster convergence when far from the eventual limit, and to slower (sublinear) convergence when close to the optimal solution. We embed both LMS and steepest descent, as well as other intermediate methods, within a oneparameter class of algorithms, and we propose a hybrid class of methods that combine the faster early convergence rate of LMS with the faster ultimate linear convergence rate of steepest descent. These methods are wellsuited for neural network training problems with large data sets. Furthermore, these methods allow the e#ective use of scaling based for example on diagonal or other approximations of the Hessian matrix. 1 Research supported by NSF under Grant 9300494DMI. 2 Dept. of Electrical Engineering and Computer Science, M...
SSVM: A Smooth Support Vector Machine for Classification
 Computational Optimization and Applications
, 1999
"... Smoothing methods, extensively used for solving important mathematical programming problems and applications, are applied here to generate and solve an unconstrained smooth reformulation of the support vector machine for pattern classification using a completely arbitrary kernel. We term such re ..."
Abstract

Cited by 31 (0 self)
 Add to MetaCart
Smoothing methods, extensively used for solving important mathematical programming problems and applications, are applied here to generate and solve an unconstrained smooth reformulation of the support vector machine for pattern classification using a completely arbitrary kernel. We term such reformulation a smooth support vec tor machine (SSVM). A fast NewtonArmijo algorithm for solving the SSVM converges globally and quadratically. Numerical results and comparisons are given to demonstrate the effectiveness and speed of the algorithm. On six publicly available datasets, tenfold cross validation correctness of SSVM was the highest compared with four other methods as well as the fastest. On larger problems, SSVM was compa rable or faster than SVM light [17], SOR [23] and SMO [27]. SSVM can also generate a highly nonlinear separating surface such as a checker board.
Machine Learning via Polyhedral Concave Minimization
, 1996
"... Two fundamental problems of machine learning, misclassification minimization [10, 24, 18] and feature selection, [25, 29, 14] are formulated as the minimization of a concave function on a polyhedral set. Other formulations of these problems utilize linear programs with equilibrium constraints [18, 1 ..."
Abstract

Cited by 27 (12 self)
 Add to MetaCart
Two fundamental problems of machine learning, misclassification minimization [10, 24, 18] and feature selection, [25, 29, 14] are formulated as the minimization of a concave function on a polyhedral set. Other formulations of these problems utilize linear programs with equilibrium constraints [18, 1, 4, 3] which are generally intractable. In contrast, for the proposed concave minimization formulation, a successive linearization algorithm without stepsize terminates after a maximum average of 7 linear programs on problems with as many as 4192 points in 14dimensional space. The algorithm terminates at a stationary point or a global solution to the problem. Preliminary numerical results indicate that the proposed approach is quite effective and more efficient than other approaches. 1 Introduction We shall consider the following two fundamental problems of machine learning: Problem 1.1 Misclassification Minimization [24, 18] Given two finite point sets A and B in the ndimensional real s...
Mathematical Programming in Data Mining
 Data Mining and Knowledge Discovery
, 1996
"... Mathematical programming approaches to three fundamental problems will be described: feature selection, clustering and robust representation. The feature selection problem considered is that of discriminating between two sets while recognizing irrelevant and redundant features and suppressing them. ..."
Abstract

Cited by 26 (3 self)
 Add to MetaCart
Mathematical programming approaches to three fundamental problems will be described: feature selection, clustering and robust representation. The feature selection problem considered is that of discriminating between two sets while recognizing irrelevant and redundant features and suppressing them. This creates a lean model that often generalizes better to new unseen data. Computational results on real data confirm improved generalization of leaner models. Clustering is exemplified by the unsupervised learning of patterns and clusters that may exist in a given database and is a useful tool for knowledge discovery in databases (KDD). A mathematical programming formulation of this problem is proposed that is theoretically justifiable and computationally implementable in a finite number of steps. A resulting kMedian Algorithm is utilized to discover very useful survival curves for breast cancer patients from a medical database. Robust representation is concerned with minimizing trained m...