Results 1 - 10 of 22
An introduction to kernel-based learning algorithms
- IEEE TRANSACTIONS ON NEURAL NETWORKS
, 2001
Cited by 280 (46 self)
This paper provides an introduction to support vector machines (SVMs), kernel Fisher discriminant analysis, and
Primal-dual approximation algorithms for metric facility location and k-median problems
- Journal of the ACM
, 1999
A survey of outlier detection methodologies
- Artificial Intelligence Review
, 2004
Cited by 80 (3 self)
Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques drawn from the full gamut of Computer Science and Statistics are now used. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.
A Minsat approach for learning in logic domains
- INFORMS Journal on computing
, 2002
Cited by 12 (10 self)
This paper describes a method for learning logic relationships that correctly classify a given data set. The method derives from given logic data certain minimum cost satisfiability problems, solves these problems, and deduces from the solutions the desired logic relationships. Uses of the method include data mining, learning logic in expert systems, and identification of critical characteristics for recognition systems. Computational tests have proved that the method is fast and effective.
Statistical Analysis of Financial Networks
, 2005
Cited by 10 (0 self)
Massive datasets arise in a broad spectrum of scientific, engineering and commercial applications. In many practically important cases, a massive dataset can be represented as a very large graph with certain attributes associated with its vertices and edges. Studying the structure of this graph is essential for understanding the structural properties of the application it represents. Well-known examples of applying this approach are the Internet graph, the Web graph, and the Call graph. It turns out that the degree distributions of all these graphs can be described by the power-law model. Here we consider another important application: a network representation of the stock market. Stock markets generate huge amounts of data, which can be used for constructing the market graph reflecting the market behavior. We conduct the statistical analysis of this graph and show that it also follows the power-law model. Moreover, we detect cliques and independent sets in this graph. These special formations have a clear practical interpretation, and their analysis allows one to apply a new data mining technique of classifying financial instruments based on stock price data, which provides a deeper insight into the internal structure of the stock market.
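As a rough illustration of the degree distribution mentioned in this abstract, the sketch below counts how many vertices have each degree in an undirected graph given as an edge list. Under a power-law model the number of vertices of degree d falls off roughly as d raised to a negative exponent. The function name `degree_distribution` and the toy edge list are assumptions made for this example; they are not taken from the paper, and the abstract does not say how the market graph itself is built.

```python
from collections import Counter


def degree_distribution(edges):
    """Map each degree to the number of vertices that have it.

    `edges` is an iterable of (u, v) pairs of an undirected graph.
    Vertices that appear in no edge are not counted.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Count how many vertices share each degree value.
    return dict(Counter(degree.values()))


# Hypothetical toy edge list, not data from the paper.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
distribution = degree_distribution(toy_edges)
```

On real data one would typically plot these counts on log-log axes and check whether they fall near a straight line before claiming a power law.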
Genetic programming in classifying large-scale data: an ensemble method
- Information Sciences
, 2004
Cited by 8 (0 self)
This study demonstrates the potential of genetic programming (GP) as a base classifier algorithm in building ensembles in the context of large-scale data classification. An ensemble built upon base classifiers that were trained with GP was found to significantly outperform its counterparts built upon base classifiers that were trained with decision tree and logistic regression. The superiority of GP ensembles is attributed to the higher diversity among the base classifiers, both in the functional form of the models and in the variables defining them.
Parallel Inductive Logic in Data Mining
- In Workshop on Distributed and Parallel Knowledge Discovery, KDD2000
, 2000
Cited by 7 (1 self)
Data mining is the process of automatic extraction of novel, useful and understandable patterns from very large databases. High-performance, scalable, and parallel computing algorithms are crucial in data mining as datasets grow inexorably in size and complexity. Inductive logic is a research area in the intersection of machine learning and logic programming, which has been recently applied to data mining. Inductive logic studies learning from examples, within the framework provided by clausal logic. It provides a uniform and very expressive means of representation: all examples, background knowledge as well as the induced theory are expressed in first-order logic. However, such an expressive representation is often computationally expensive. This report first presents the background for parallel data mining, the BSP model, and inductive logic programming. Based on the study, this report gives an approach to parallel inductive logic in data mining that solves the potential performance problem. Both a parallel algorithm and a cost analysis are provided. This approach is applied to a number of problems and it shows a super-linear speedup. To justify this analysis, I implemented a parallel version of a core ILP system, Progol, in C with the support of the BSP parallel model. Three test cases are provided and a double speedup phenomenon is observed on all these datasets and on two different parallel computers.
Robust support vector method for hyperspectral data classification and knowledge discovery
- IEEE Transactions on Geoscience and Remote Sensing
, 2004
Cited by 6 (4 self)
Abstract — In this paper, we propose the use of Support Vector Machines (SVM) for automatic hyperspectral data classification and knowledge discovery. In the first stage of the study, we use SVMs for crop classification and analyze their performance in terms of efficiency and robustness, as compared to extensively used neural and fuzzy methods. Efficiency is assessed by evaluating accuracy and statistical differences in several scenes. Robustness is analyzed in terms of (a) suitability to working conditions when a feature selection stage is not possible, and (b) performance when different levels of Gaussian noise are introduced at their inputs. In the second stage of this work, we analyze the distribution of the support vectors (SV) and perform sensitivity analysis on the best classifier in order to analyze the significance of the input spectral bands. For classification purposes, six hyperspectral images acquired with the 128-band HyMAP spectrometer during the DAISEX-1999 campaign are used. Six crop classes were labelled for each image. A reduced set of labelled samples is used to train the models and the entire images are used to assess their performance. Several conclusions are drawn: (1) SVMs yield better outcomes than neural networks regarding accuracy, simplicity and robustness; (2) training neural and neurofuzzy models is unfeasible when working with high dimensional input spaces and great amounts of training data; (3) SVMs perform similarly for different training subsets with varying input dimension, which indicates that noisy bands are successfully detected; and (4) a valuable ranking of bands through sensitivity analysis is achieved. Index Terms — Hyperspectral imagery, crop classification, knowledge discovery, Support Vector Machines, neural networks.
A new theoretical framework for K-means-type clustering
- FOUNDATIONS AND ADVANCES IN DATA MINING
, 2005
Cited by 4 (1 self)
One of the fundamental clustering problems is to assign n points into k clusters based on the minimal sum-of-squares (MSSC), which is known to be NP-hard. In this paper, by using matrix arguments, we first model MSSC as a so-called 0-1 semidefinite programming (SDP) problem. The classical K-means algorithm can be interpreted as a special heuristic for the underlying 0-1 SDP. Moreover, the 0-1 SDP model can be further approximated by the relaxed and polynomially solvable linear and semidefinite programming. This opens new avenues for solving MSSC. The 0-1 SDP model can be applied not only to MSSC, but also to other scenarios of clustering. In particular, we show that the recently proposed normalized k-cut and spectral clustering can also be embedded into the 0-1 SDP model in various kernel spaces.
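For readers unfamiliar with the objective named in this abstract, the sketch below shows the minimum sum-of-squares cost and the classical K-means (Lloyd) heuristic that tries to reduce it. It is a minimal plain-Python illustration, not the 0-1 SDP model from the paper; the function names, the caller-supplied initial centers, and the toy points are all assumptions made for this example.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def mssc_cost(points, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    cost = 0.0
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            c = centroid(members)
            cost += sum(sq_dist(p, c) for p in members)
    return cost


def lloyd(points, initial_centers, max_iter=100):
    """Classical K-means iterations: assign to nearest center, then recenter."""
    centers = list(initial_centers)  # copy so the caller's list is untouched
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: local optimum reached
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = centroid(members)
    return labels, centers


# Hypothetical toy data: two well-separated pairs of points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_labels, toy_centers = lloyd(toy_points, [(0.0, 0.0), (10.0, 0.0)])
toy_cost = mssc_cost(toy_points, toy_labels, k=2)
```

Lloyd's iterations only guarantee a local optimum that depends on the starting centers, which is consistent with the abstract describing K-means as a heuristic for an NP-hard problem.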
Approximating k-means-type clustering via semidefinite programming
- SIAM Journal on Optimization
Cited by 3 (1 self)
One of the fundamental clustering problems is to assign n points into k clusters based on the minimal sum-of-squares (MSSC), which is known to be NP-hard. In this paper, by using matrix arguments, we first model MSSC as a so-called 0-1 semidefinite programming (SDP) problem. We show that our 0-1 SDP model provides a unified framework for several clustering approaches such as normalized k-cut and spectral clustering. Moreover, the 0-1 SDP model allows us to solve the underlying problem approximately via the relaxed linear and semidefinite programming. Secondly, we consider the issue of how to extract a feasible solution of the original MSSC model from the approximate solution of the relaxed SDP problem. By using principal component analysis, we develop a rounding procedure to construct a feasible partitioning from a solution of the relaxed problem. In our rounding procedure, we need to solve a k-means clustering problem in $\Re^{k-1}$, which can be solved in $O(n^{k^2(k-1)})$ time. In the case of bi-clustering, the running time of our rounding procedure can be reduced to $O(n \log n)$. We show that our algorithm can provide a 2-approximate solution to the original problem. Promising numerical results based on our new method are reported. Key words. K-means clustering, Principal component analysis, Semidefinite programming, Approximation.

