A comparative study on content-based music genre classification
- in Proc. SIGIR, 2003
Cited by 60 (9 self)
Content-based music genre classification is a fundamental component of music information retrieval systems and has been gaining importance and attention with the emergence of digital music on the Internet. Currently little work has been done on automatic music genre classification, and the reported classification accuracies are relatively low. This paper proposes a new feature extraction method for music genre classification, Daubechies Wavelet Coefficient Histograms (DWCHs). DWCHs capture the local and global information of music signals simultaneously by computing histograms on their Daubechies wavelet coefficients. The effectiveness of this new feature and of previously studied features is compared using various machine learning classification algorithms, including Support Vector Machines and Linear Discriminant Analysis. It is demonstrated that the use of DWCHs significantly improves the accuracy of music genre classification.
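The idea of summarizing wavelet subbands by their coefficient distributions can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the short db2 (4-tap) Daubechies filter, the periodic boundary handling, the choice of three levels, and the function names `dwt_step`, `subband_stats`, and `wavelet_histogram_features` are all assumptions made for the example. Each subband is described by the first three moments of its coefficient distribution plus the mean absolute coefficient.

```python
import math

# Daubechies db2 (4-tap) low-pass filter; the high-pass filter is its
# quadrature mirror. The filter choice is an assumption for this sketch.
_S3 = math.sqrt(3.0)
_NORM = 4.0 * math.sqrt(2.0)
LOW = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]
HIGH = [LOW[3], -LOW[2], LOW[1], -LOW[0]]


def dwt_step(signal):
    """One level of a periodic db2 wavelet transform: (approximation, detail).

    The signal length must be even and at least 4.
    """
    n = len(signal)
    approx, detail = [], []
    for i in range(0, n, 2):
        window = [signal[(i + k) % n] for k in range(4)]
        approx.append(sum(h * x for h, x in zip(LOW, window)))
        detail.append(sum(g * x for g, x in zip(HIGH, window)))
    return approx, detail


def subband_stats(coeffs):
    """Mean, variance, skewness, and mean absolute value of one subband."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    var = sum((c - mean) ** 2 for c in coeffs) / n
    std = math.sqrt(var)
    skew = sum((c - mean) ** 3 for c in coeffs) / (n * std ** 3) if std > 0 else 0.0
    energy = sum(abs(c) for c in coeffs) / n
    return [mean, var, skew, energy]


def wavelet_histogram_features(signal, levels=3):
    """Concatenate per-subband statistics over `levels` decomposition levels."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        features.extend(subband_stats(detail))  # local (fine-scale) information
    features.extend(subband_stats(approx))      # global (coarse-scale) information
    return features


# Toy input standing in for a frame of audio samples.
frame = [math.sin(0.3 * t) + 0.5 * math.sin(1.7 * t) for t in range(64)]
feature_vector = wavelet_histogram_features(frame, levels=3)
```

The resulting fixed-length vector (four numbers per subband) is the kind of input one would pass to a classifier such as an SVM or LDA.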
Extracting Shared Subspace for Multi-label Classification
Cited by 17 (1 self)
Multi-label problems arise in various domains such as multitopic document categorization and protein function prediction. One natural way to deal with such problems is to construct a binary classifier for each label, resulting in a set of independent binary classification problems. Since the multiple labels share the same input space, and the semantics conveyed by different labels are usually correlated, it is essential to exploit the correlation information contained in different labels. In this paper, we consider a general framework for extracting shared structures in multi-label classification. In this framework, a common subspace is assumed to be shared among multiple labels. We show that the optimal solution to the proposed formulation can be obtained by solving a generalized eigenvalue problem, though the problem is nonconvex. For high-dimensional problems, direct computation of the solution is expensive, and we develop an efficient algorithm for this case. One appealing feature of the proposed framework is that it includes several well-known algorithms as special cases, thus elucidating their intrinsic relationships. We have conducted extensive experiments on eleven multitopic web page categorization tasks, and results demonstrate the effectiveness of the proposed formulation in comparison with several representative algorithms.
Parallelization of the Incremental Proximal Support Vector Machine Classifier using a Heap-based Tree Topology
- In Parallel and Distributed Computing for Machine Learning, in conjunction with the 14th European Conference on Machine Learning (ECML’03) and the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD’03), 2003
Cited by 7 (1 self)
Support Vector Machines (SVMs) are an efficient data mining approach for classification, clustering and time series analysis. In recent years, a tremendous growth in the amount of data gathered has changed the focus of SVM classifier algorithms from providing accurate results to enabling incremental (and decremental) learning with new data (or unlearning old data) without the need for computationally costly retraining with the old data. In this paper we propose two efficient parallelized algorithms based on heaps of processing nodes for classification with the incremental proximal SVM introduced by Fung and Mangasarian.
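A minimal sketch of why the linear proximal SVM lends itself to incremental and decremental updates, assuming the standard Fung and Mangasarian formulation in which the classifier (w, gamma) solves (I/nu + E'E) [w; gamma] = E'De with E = [A, -e]. Because E'E and E'De are sums over data rows, new chunks can be added and old chunks subtracted without revisiting earlier data. The class name `IncrementalPSVM`, its methods, and the small Gaussian-elimination helper `solve` are hypothetical names for illustration; this is not the parallel heap-based algorithm of the paper.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


class IncrementalPSVM:
    """Linear proximal SVM that keeps only the running sums E'E and E'De."""

    def __init__(self, n_features, nu=1.0):
        self.dim = n_features + 1
        self.nu = nu
        self.gram = [[0.0] * self.dim for _ in range(self.dim)]  # running E'E
        self.rhs = [0.0] * self.dim                               # running E'De

    def _accumulate(self, points, labels, sign):
        for x, y in zip(points, labels):
            row = list(x) + [-1.0]  # one row of E = [A, -e]
            for i in range(self.dim):
                self.rhs[i] += sign * y * row[i]
                for j in range(self.dim):
                    self.gram[i][j] += sign * row[i] * row[j]

    def learn(self, points, labels):
        """Add a chunk of labeled data (labels are +1 or -1)."""
        self._accumulate(points, labels, 1.0)

    def unlearn(self, points, labels):
        """Remove a previously learned chunk."""
        self._accumulate(points, labels, -1.0)

    def weights(self):
        """Solve (I/nu + E'E) [w; gamma] = E'De and return (w, gamma)."""
        system = [[self.gram[i][j] + (1.0 / self.nu if i == j else 0.0)
                   for j in range(self.dim)] for i in range(self.dim)]
        solution = solve(system, self.rhs)
        return solution[:-1], solution[-1]

    def predict(self, x):
        w, gamma = self.weights()
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) - gamma >= 0 else -1
```

Since the per-chunk sums are independent of one another, separate workers could compute them for different chunks and a parent node could add them together, which is the property a tree-structured parallelization would rely on.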
Multicategory Incremental Proximal Support Vector Classifiers
- In Proceedings of the 7th International Conference on Knowledge-Based Information & Engineering Systems (KES’2003), number 2773 in Lecture Notes in Artificial Intelligence (LNAI), 2003
Cited by 5 (3 self)
Support Vector Machines (SVMs) are an efficient data mining approach for classification, clustering and time series analysis. In recent years, a tremendous growth in the amount of data gathered has changed the focus of SVM classifier algorithms from providing accurate results to enabling incremental (and decremental) learning with new data (or unlearning old data) without the need for computationally costly retraining with the old data. In this paper we propose an efficient algorithm for multicategory classification with the incremental proximal SVM introduced by Fung and Mangasarian.
A framework for kernel-based multi-category classification
, 2005
Cited by 3 (1 self)
A geometric framework for understanding multi-category classification is introduced, through which many existing ‘all-together’ algorithms can be understood. The structure allows the derivation of a parsimonious optimisation function, which is a direct extension of the binary
A Kernel-Based Two-Class Classifier for Imbalanced Data Sets
Cited by 3 (0 self)
Many kernel classifier construction algorithms adopt classification accuracy as the performance metric in model evaluation. Moreover, equal weighting is often applied to each data sample in parameter estimation. These modeling practices often become problematic if the data sets are imbalanced. We present a kernel classifier construction algorithm using orthogonal forward selection (OFS) in order to optimize the model generalization for imbalanced two-class data sets. This kernel classifier identification algorithm is based on a new regularized orthogonal weighted least squares (ROWLS) estimator and the model selection criterion of maximal leave-one-out area under curve (LOO-AUC) of the receiver operating characteristics (ROCs). It is shown that, owing to the orthogonalization procedure, the LOO-AUC can be calculated via an analytic formula based on the new ROWLS parameter estimator, without actually splitting the estimation data set. The proposed algorithm achieves minimal computational expense via a set of forward recursive updating formulae when searching for model terms with maximal incremental LOO-AUC value. Numerical examples are used to demonstrate the efficacy of the algorithm. Index Terms: forward selection, imbalanced data sets, kernel classifier, leave-one-out (LOO) cross validation, receiver operating characteristics (ROCs).
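The contrast between accuracy and AUC on imbalanced data can be illustrated with a short sketch. It computes the ordinary empirical AUC as a pairwise ranking statistic; it is not the analytic LOO-AUC formula of the paper, and the function names and the toy 5-versus-95 data set are assumptions made for the example.

```python
def accuracy(labels, predictions):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for y, p in zip(labels, predictions) if y == p) / len(labels)


def roc_auc(labels, scores):
    """Empirical AUC: the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y != 1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Imbalanced toy data: 5 positives, 95 negatives.
labels = [1] * 5 + [-1] * 95

# A degenerate model that always answers "negative" with a constant score.
majority_predictions = [-1] * 100
constant_scores = [0.0] * 100

majority_accuracy = accuracy(labels, majority_predictions)  # high despite being useless
majority_auc = roc_auc(labels, constant_scores)             # chance level

# A model that ranks every positive above every negative.
ideal_scores = [1.0] * 5 + [0.0] * 95
ideal_auc = roc_auc(labels, ideal_scores)
```

The always-negative model looks strong under accuracy but sits at chance level under AUC, which is the motivation for selecting models by an AUC-based criterion when classes are imbalanced.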
Multiclass Proximal Support Vector Machines
, 2004
Cited by 1 (1 self)
We propose an extension of proximal support vector machines (PSVM) to the multi-class case. Unlike the one-versus-rest approach, which constructs the decision rule from multiple binary classification tasks, the multiclass PSVM (MPSVM) considers all classes simultaneously and provides a unifying framework for both equal and unequal misclassification costs. The MPSVM is built in a regularization framework of reproducing kernel Hilbert space (RKHS) and implements the Bayes rule asymptotically. With regard to computation, the MPSVM simply solves a system of linear equations and demands much less computational effort than the SVM, which can be slow because it optimizes a large-scale quadratic program under linear constraints. An efficient algorithm is suggested, and a stable computation strategy is also provided for ill-posed cases. The effectiveness of the MPSVM is demonstrated by both simulation studies and applications to cancer classification using microarray data.
Siemens Medical Solutions and
Multi-label problems arise in various domains such as multi-topic document categorization, protein function prediction, and automatic image annotation. One natural way to deal with such problems is to construct a binary classifier for each label, resulting in a set of independent binary classification problems. Since multiple labels share the same input space, and the semantics conveyed by different labels are usually correlated, it is essential to exploit the correlation information contained in different labels. In this paper, we consider a general framework for extracting shared structures in multi-label classification. In this framework, a common subspace is assumed to be shared among multiple labels. We show that the optimal solution to the proposed formulation can be obtained by solving a generalized eigenvalue problem, though the problem is nonconvex. For high-dimensional problems, direct computation of the solution is expensive, and we develop an efficient algorithm for this case. One appealing feature of the proposed framework is that it includes several well-known algorithms as special cases, thus elucidating their intrinsic relationships. We further show that the proposed framework can be extended to the kernel-induced feature space. We have conducted extensive experiments on multi-topic web page categorization and automatic gene expression pattern image annotation tasks, and results demonstrate
Adapting Two-Class Support Vector Classification Methods to Many Class Problems
, 2005
A geometric construction is presented which is shown to be an effective tool for understanding and implementing multi-category support vector classification. It is demonstrated how this construction can be used to extend many other existing two-class kernel-based classification methodologies in a straightforward way while still preserving attractive properties of individual algorithms. Reducing
Embedding Proximal Support Vectors into Randomized Trees
By embedding multiple proximal SVM classifiers into a binary tree architecture, it is possible to turn an arbitrary multi-class problem into a hierarchy of binary classifications. The critical issue then consists in determining, in each node of the tree, how to aggregate the multiple classes into a pair of, say, overlay classes to discriminate. As a fundamental contribution, our paper proposes to deploy an ensemble of randomized trees, instead of a single optimized decision tree, to bypass the question of overlay class definition. Empirical results on various datasets demonstrate a significant gain in accuracy compared both to ‘one versus one’ SVM solutions and to conventional ensembles of decision tree classifiers.

