Results 1 - 10
of
343
Working set selection using the second order information for training SVM
- JOURNAL OF MACHINE LEARNING RESEARCH
, 2005
"... Working set selection is an important step in decomposition methods for training support vector machines (SVMs). This paper develops a new technique for working set selection in SMO-type decomposition methods. It uses second order information to achieve fast convergence. Theoretical properties such ..."
Abstract
-
Cited by 94 (6 self)
- Add to MetaCart
Working set selection is an important step in decomposition methods for training support vector machines (SVMs). This paper develops a new technique for working set selection in SMO-type decomposition methods. It uses second order information to achieve fast convergence. Theoretical properties such as linear convergence are established. Experiments demonstrate that the proposed method is faster than existing selection methods using first order information.
An interior-point method for large-scale l1-regularized logistic regression
- Journal of Machine Learning Research
, 2007
"... Logistic regression with ℓ1 regularization has been proposed as a promising method for feature selection in classification problems. In this paper we describe an efficient interior-point method for solving large-scale ℓ1-regularized logistic regression problems. Small problems with up to a thousand ..."
Abstract
-
Cited by 77 (3 self)
- Add to MetaCart
Logistic regression with ℓ1 regularization has been proposed as a promising method for feature selection in classification problems. In this paper we describe an efficient interior-point method for solving large-scale ℓ1-regularized logistic regression problems. Small problems with up to a thousand or so features and examples can be solved in seconds on a PC; medium sized problems, with tens of thousands of features and examples, can be solved in tens of seconds (assuming some sparsity in the data). A variation on the basic method, that uses a preconditioned conjugate gradient method to compute the search step, can solve very large problems, with a million features and examples (e.g., the 20 Newsgroups data set), in a few minutes, on a PC. Using warm-start techniques, a good approximation of the entire regularization path can be computed much more efficiently than by solving a family of problems independently.
A Note on Platt's Probabilistic Outputs for Support Vector Machines
, 2003
"... Platt's probabilistic outputs for Support Vector Machines [6] has been popular for applications that require posterior class probabilities. In this note, we propose an improvement which theoretically converges and avoids numerical difficulties. A simpler and ready-to-use pseudo code is included. ..."
Abstract
-
Cited by 63 (4 self)
- Add to MetaCart
Platt's probabilistic outputs for Support Vector Machines [6] has been popular for applications that require posterior class probabilities. In this note, we propose an improvement which theoretically converges and avoids numerical difficulties. A simpler and ready-to-use pseudo code is included.
Trust region Newton method for large-scale logistic regression
- In Proceedings of the 24th International Conference on Machine Learning (ICML
, 2007
"... Large-scale logistic regression arises in many applications such as document classification and natural language processing. In this paper, we apply a trust region Newton method to maximize the log-likelihood of the logistic regression model. The proposed method uses only approximate Newton steps in ..."
Abstract
-
Cited by 35 (5 self)
- Add to MetaCart
Large-scale logistic regression arises in many applications such as document classification and natural language processing. In this paper, we apply a trust region Newton method to maximize the log-likelihood of the logistic regression model. The proposed method uses only approximate Newton steps in the beginning, but achieves fast convergence in the end. Experiments show that it is faster than the commonly used quasi Newton approach for logistic regression. We also compare it with existing linear SVM implementations. 1
Introducing dendritic cells as a novel immune-inspired algorithm for anomaly detection
- In ICARIS-05, LNCS 3627
, 2005
"... Abstract. Dendritic cells are antigen presenting cells that provide a vital link between the innate and adaptive immune system. Research into this family of cells has revealed that they perform the role of coordinating T-cell based immune responses, both reactive and for generating tolerance. We hav ..."
Abstract
-
Cited by 35 (11 self)
- Add to MetaCart
Abstract. Dendritic cells are antigen presenting cells that provide a vital link between the innate and adaptive immune system. Research into this family of cells has revealed that they perform the role of coordinating T-cell based immune responses, both reactive and for generating tolerance. We have derived an algorithm based on the functionality of these cells, and have used the signals and differentiation pathways to build a control mechanism for an artificial immune system. We present our algorithmic details in addition to some preliminary results, where the algorithm was applied for the purpose of anomaly detection. We hope that this algorithm will eventually become the key component within a large, distributed immune system, based on sound immunological concepts.
Multi-task feature selection
- In the workshop of structural Knowledge Transfer for Machine Learning in the 23rd International Conference on Machine Learning (ICML
, 2006
"... We address the problem of joint feature selection across a group of related classification or regression tasks. We propose a novel type of joint regularization of the model parameters in order to couple feature selection across tasks. Intuitively, we extend the ℓ1 regularization for single-task esti ..."
Abstract
-
Cited by 31 (1 self)
- Add to MetaCart
We address the problem of joint feature selection across a group of related classification or regression tasks. We propose a novel type of joint regularization of the model parameters in order to couple feature selection across tasks. Intuitively, we extend the ℓ1 regularization for single-task estimation to the multi-task setting. By penalizing the sum of ℓ2-norms of the blocks of coefficients associated with each feature across different tasks, we encourage multiple predictors to have similar parameter sparsity patterns. To fit parameters under this regularization, we propose a blockwise boosting scheme that follows the regularization path. The algorithm introduces and updates simultaneously the coefficients associated with one feature in all tasks. We show empirically that this approach outperforms independent ℓ1-based feature selection on several datasets. 1
Efficient l1 regularized logistic regression
- In AAAI-06
, 2006
"... L1 regularized logistic regression is now a workhorse of machine learning: it is widely used for many classification problems, particularly ones with many features. L1 regularized logistic regression requires solving a convex optimization problem. However, standard algorithms for solving convex opti ..."
Abstract
-
Cited by 30 (4 self)
- Add to MetaCart
L1 regularized logistic regression is now a workhorse of machine learning: it is widely used for many classification problems, particularly ones with many features. L1 regularized logistic regression requires solving a convex optimization problem. However, standard algorithms for solving convex optimization problems do not scale well enough to handle the large datasets encountered in many practical settings. In this paper, we propose an efficient algorithm for L1 regularized logistic regression. Our algorithm iteratively approximates the objective function by a quadratic approximation at the current point, while maintaining the L1 constraint. In each iteration, it uses the efficient LARS (Least Angle Regression) algorithm to solve the resulting L1 constrained quadratic optimization problem. Our theoretical results show that our algorithm is guaranteed to converge to the global optimum. Our experiments show that our algorithm significantly outperforms standard algorithms for solving convex optimization problems. Moreover, our algorithm outperforms four previously published algorithms that were specifically designed to solve the L1 regularized logistic regression problem.
Secure anonymization for incremental datasets
- in the Third VLDB Workshop on Secure Data Management (SDM), 2006
"... Abstract. Data anonymization techniques based on the k-anonymity model have been the focus of intense research in the last few years. Although the k-anonymity model and the related techniques provide valuable solutions to data privacy, current solutions are limited only to static data release (i.e., ..."
Abstract
-
Cited by 29 (2 self)
- Add to MetaCart
Abstract. Data anonymization techniques based on the k-anonymity model have been the focus of intense research in the last few years. Although the k-anonymity model and the related techniques provide valuable solutions to data privacy, current solutions are limited only to static data release (i.e., the entire dataset is assumed to be available at the time of release). While this may be acceptable in some applications, today we see databases continuously growing everyday and even every hour. In such dynamic environments, the current techniques may suffer from poor data quality and/or vulnerability to inference. In this paper, we analyze various inference channels that may exist in multiple anonymized datasets and discuss how to avoid such inferences. We then present an approach to securely anonymizing a continuously growing dataset in an efficient manner while assuring high data quality. 1
A Clustering Approach for Data and Structural Anonymity in Social Networks
- In Privacy, Security, and Trust in KDD Workshop (PinKDD
, 2008
"... The advent of social network sites in the last few years seems to be a trend that will likely continue in the years to come. Online social interaction has become very popular around the globe and most sociologists agree that this will not fade away. Such a development is possible due to the advancem ..."
Abstract
-
Cited by 28 (3 self)
- Add to MetaCart
The advent of social network sites in the last few years seems to be a trend that will likely continue in the years to come. Online social interaction has become very popular around the globe and most sociologists agree that this will not fade away. Such a development is possible due to the advancements in computer power, technologies, and the spread of the World Wide Web. What many naïve technology users may not always realize is that the information they provide online is stored in massive data repositories and may be used for various purposes. Researchers have pointed out for some time the privacy implications of massive data gathering, and a lot of effort has been made to protect the data from unauthorized disclosure. However, most of the data privacy research has been focused on more traditional data models such as microdata (data stored as one relational table, where each row represents an individual entity). More recently, social network data has begun to be analyzed from a different, specific privacy perspective. Since the individual entities in social networks, besides the attribute values that characterize them, also have relationships with other entities, the possibility of privacy breaches increases. Our main contributions in this paper are the development of a greedy privacy algorithm for anonymizing a social network and the introduction of a structural information loss measure that quantifies the amount of information lost due to edge generalization in the anonymization process.
Cost curves: an improved method for visualizing classifier performance
- Machine Learning
, 2006
"... Abstract. This paper introduces cost curves, a graphical technique for visualizing the performance (error rate or expected cost) of 2-class classifiers over the full range of possible class distributions and misclassification costs. Cost curves are shown to be superior to ROC curves for visualizing ..."
Abstract
-
Cited by 27 (5 self)
- Add to MetaCart
Abstract. This paper introduces cost curves, a graphical technique for visualizing the performance (error rate or expected cost) of 2-class classifiers over the full range of possible class distributions and misclassification costs. Cost curves are shown to be superior to ROC curves for visualizing classifier performance for most purposes. This is because they visually support several crucial types of performance assessment that cannot be done easily with ROC curves, such as showing confidence intervals on a classifier’s performance, and visualizing the statistical significance of the difference in performance of two classifiers. A software tool supporting all the cost curve analysis described in this paper is available from the authors.

