Results 1  10
of
179
Regularization networks and support vector machines
 Advances in Computational Mathematics
, 2000
"... Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples – in particular the regression problem of approximating a multivariate function from sparse data. Radial Basis Functions, for example, are a special case of both regularization a ..."
Abstract

Cited by 266 (33 self)
 Add to MetaCart
Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples – in particular the regression problem of approximating a multivariate function from sparse data. Radial Basis Functions, for example, are a special case of both regularization and Support Vector Machines. We review both formulations in the context of Vapnik’s theory of statistical learning which provides a general foundation for the learning problem, combining functional analysis and statistics. The emphasis is on regression: classification is treated as a special case.
Survey of clustering algorithms
 IEEE TRANSACTIONS ON NEURAL NETWORKS
, 2005
"... Data analysis plays an indispensable role for understanding various phenomena. Cluster analysis, primitive exploration with little or no prior knowledge, consists of research developed across a wide variety of communities. The diversity, on one hand, equips us with many tools. On the other hand, the ..."
Abstract

Cited by 231 (3 self)
 Add to MetaCart
Data analysis plays an indispensable role for understanding various phenomena. Cluster analysis, primitive exploration with little or no prior knowledge, consists of research developed across a wide variety of communities. The diversity, on one hand, equips us with many tools. On the other hand, the profusion of options causes confusion. We survey clustering algorithms for data sets appearing in statistics, computer science, and machine learning, and illustrate their applications in some benchmark data sets, the traveling salesman problem, and bioinformatics, a new field attracting intensive efforts. Several tightly related topics, proximity measure, and cluster validation, are also discussed.
The positive false discovery rate: A Bayesian interpretation and the qvalue
 Annals of Statistics
, 2003
"... Multiple hypothesis testing is concerned with controlling the rate of false positives when testing several hypotheses simultaneously. One multiple hypothesis testing error measure is the false discovery rate (FDR), which is loosely defined to be the expected proportion of false positives among all s ..."
Abstract

Cited by 158 (3 self)
 Add to MetaCart
Multiple hypothesis testing is concerned with controlling the rate of false positives when testing several hypotheses simultaneously. One multiple hypothesis testing error measure is the false discovery rate (FDR), which is loosely defined to be the expected proportion of false positives among all significant hypotheses. The FDR is especially appropriate for exploratory analyses in which one is interested in finding several significant results among many tests. In this work, we introduce a modified version of the FDR called the “positive false discovery rate ” (pFDR). We discuss the advantages and disadvantages of the pFDR and investigate its statistical properties. When assuming the test statistics follow a mixture distribution, we show that the pFDR can be written as a Bayesian posterior probability and can be connected to classification theory. These properties remain asymptotically true under fairly general conditions, even under certain forms of dependence. Also, a new quantity called the “qvalue ” is introduced and investigated, which is a natural “Bayesian posterior pvalue, ” or rather the pFDR analogue of the pvalue.
RSVM: Reduced support vector machines
 Data Mining Institute, Computer Sciences Department, University of Wisconsin
, 2001
"... Abstract An algorithm is proposed which generates a nonlinear kernelbased separating surface that requires as little as 1 % of a large dataset for its explicit evaluation. To generate this nonlinear surface, the entire dataset is used as a constraint in an optimization problem with very few variabl ..."
Abstract

Cited by 122 (16 self)
 Add to MetaCart
Abstract An algorithm is proposed which generates a nonlinear kernelbased separating surface that requires as little as 1 % of a large dataset for its explicit evaluation. To generate this nonlinear surface, the entire dataset is used as a constraint in an optimization problem with very few variables corresponding to the 1%
Proximal support vector machine classifiers
 Proceedings KDD2001: Knowledge Discovery and Data Mining
, 2001
"... Abstract—A new approach to support vector machine (SVM) classification is proposed wherein each of two data sets are proximal to one of two distinct planes that are not parallel to each other. Each plane is generated such that it is closest to one of the two data sets and as far as possible from the ..."
Abstract

Cited by 109 (14 self)
 Add to MetaCart
Abstract—A new approach to support vector machine (SVM) classification is proposed wherein each of two data sets are proximal to one of two distinct planes that are not parallel to each other. Each plane is generated such that it is closest to one of the two data sets and as far as possible from the other data set. Each of the two nonparallel proximal planes is obtained by a single MATLAB command as the eigenvector corresponding to a smallest eigenvalue of a generalized eigenvalue problem. Classification by proximity to two distinct nonlinear surfaces generated by a nonlinear kernel also leads to two simple generalized eigenvalue problems. The effectiveness of the proposed method is demonstrated by tests on simple examples as well as on a number of public data sets. These examples show the advantages of the proposed approach in both computation time and test set correctness. Index Terms—Support vector machines, proximal classification, generalized eigenvalues. 1
Lagrangian Support Vector Machines
, 2000
"... An implicit Lagrangian for the dual of a simple reformulation of the standard quadratic program of a linear support vector machine is proposed. This leads to the minimization of an unconstrained differentiable convex function in a space of dimensionality equal to the number of classified points. Thi ..."
Abstract

Cited by 86 (11 self)
 Add to MetaCart
An implicit Lagrangian for the dual of a simple reformulation of the standard quadratic program of a linear support vector machine is proposed. This leads to the minimization of an unconstrained differentiable convex function in a space of dimensionality equal to the number of classified points. This problem is solvable by an extremely simple linearly convergent Lagrangian support vector machine (LSVM) algorithm. LSVM requires the inversion at the outset of a single matrix of the order of the much smaller dimensionality of the original input space plus one. The full algorithm is given in this paper in 11 lines of MATLAB code without any special optimization tools such as linear or quadratic programming solvers. This LSVM code can be used "as is" to solve classification problems with millions of points. For example, 2 million points in 10 dimensional input space were classified by a linear surface in 82 minutes on a Pentium III 500 MHz notebook with 384 megabytes of memory (and additional swap space), and in 7 minutes on a 250 MHz UltraSPARC II processor with 2 gigabytes of memory. Other standard classification test problems were also solved. Nonlinear kernel classification can also be solved by LSVM. Although it does not scale up to very large problems, it can handle any positive semidefinite kernel and is guaranteed to converge.
Web Usage Mining: Discovery and Application of Interestin Patterns from Web Data
, 2000
"... Web Usage Mining is the application of data mining techniques to Web clickstream data in order to extract usage patterns. As Web sites continue to grow in size and complexity, the results of Web Usage Mining have become critical for a number of applications such as Web site design, business and mark ..."
Abstract

Cited by 72 (0 self)
 Add to MetaCart
Web Usage Mining is the application of data mining techniques to Web clickstream data in order to extract usage patterns. As Web sites continue to grow in size and complexity, the results of Web Usage Mining have become critical for a number of applications such as Web site design, business and marketing decision support, personalization, usability studies, and network trac analysis. The two major challenges involved in Web Usage Mining are preprocessing the raw data to provide an accurate picture of how a site is being used, and ltering the results of the various data mining algorithms in order to present only the rules and patterns that are potentially interesting. This thesis develops and tests an architecture and algorithms for performing Web Usage Mining. An evidence combination framework referred to as the information lter is developed to compare and combine usage, content, and structure information about a Web site. The information lter automatically identi es the discovered ...
Islands of Music  Analysis, Organization, and Visualization of Music Archives
, 2001
"... This report summarizes the master's thesis Islands of Music: Analysis, Organization, and Visualization of Music Archives, which I submitted to the Vienna University of Technology on December 11th, 2001. I wrote it at the Department of Software Technology and Interactive Systems, supervised by Dr. An ..."
Abstract

Cited by 68 (15 self)
 Add to MetaCart
This report summarizes the master's thesis Islands of Music: Analysis, Organization, and Visualization of Music Archives, which I submitted to the Vienna University of Technology on December 11th, 2001. I wrote it at the Department of Software Technology and Interactive Systems, supervised by Dr. Andreas Rauber, and assessed by Prof. Dr. Dieter Merkl
Successive Overrelaxation for Support Vector Machines
 IEEE Transactions on Neural Networks
, 1998
"... Successive overrelaxation (SOR) for symmetric linear complementarity problems and quadratic programs [11, 12, 9] is used to train a support vector machine (SVM) [20, 3] for discriminating between the elements of two massive datasets, each with millions of points. Because SOR handles one point at a t ..."
Abstract

Cited by 66 (14 self)
 Add to MetaCart
Successive overrelaxation (SOR) for symmetric linear complementarity problems and quadratic programs [11, 12, 9] is used to train a support vector machine (SVM) [20, 3] for discriminating between the elements of two massive datasets, each with millions of points. Because SOR handles one point at a time, similar to Platt's sequential minimal optimization (SMO) algorithm [18] which handles two constraints at a time, it can process very large datasets that need not reside in memory. The algorithm converges linearly to a solution. Encouraging numerical results are presented on datasets with up to 10 million points. Such massive discrimination problems cannot be processed by conventional linear or quadratic programming methods, and to our knowledge have not been solved by other methods. 1 Introduction Successive overrelaxation, originally developed for the solution of large systems of linear equations [16, 15] has been successfully applied to mathematical programming problems [4, 11, 12, 1...