Results 11–20 of 99
G.: Robust classification with interval data
, 2003
Abstract

Cited by 19 (0 self)
We consider a binary, linear classification problem in which the data points are assumed to be unknown, but bounded within given hyperrectangles, i.e., the covariates are bounded within intervals explicitly given for each data point separately. We address the problem of designing a robust classifier in this setting by minimizing the worst-case value of a given loss function over all possible choices of the data in these multidimensional intervals. We examine in detail the application of this methodology to three specific loss functions, arising in support vector machines, in logistic regression, and in minimax probability machines. We show that in each case, the resulting problem is amenable to efficient interior-point algorithms for convex optimization. The methods tend to produce sparse classifiers, i.e., they induce many zero coefficients in the resulting weight vectors, and we provide some theoretical grounds for this property. After presenting possible extensions of this framework to handle label errors and other uncertainty models, we discuss in some detail our implementation, which exploits the potential sparsity, or a more general property referred to as regularity, of the input matrices.
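For the hinge loss, the worst case over a hyperrectangle has a closed form: the adversary pushes each coordinate to the end of its interval that most erodes the margin, which subtracts an |w|-weighted term and hints at why the resulting classifiers are sparse. A minimal sketch, assuming box uncertainty given by a center and per-coordinate radii (names hypothetical, not the paper's implementation):

```python
import numpy as np

def worst_case_hinge(w, b, x_center, x_radius, y):
    """Worst-case hinge loss when the true point may lie anywhere in the
    box [x_center - x_radius, x_center + x_radius]. The adversarial x
    minimizes the margin y*(w.x + b); coordinate-wise this subtracts
    |w_i| * x_radius_i, an L1-like term that encourages sparse w."""
    nominal_margin = y * (np.dot(w, x_center) + b)
    worst_margin = nominal_margin - np.dot(np.abs(w), x_radius)
    return max(0.0, 1.0 - worst_margin)
```

Minimizing the sum of such losses over a training set is then an ordinary convex problem, which is what makes interior-point methods applicable.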
The minimum error minimax probability machine
 Journal of Machine Learning Research
, 2004
Abstract

Cited by 19 (7 self)
We construct a distribution-free Bayes optimal classifier called the Minimum Error Minimax Probability Machine (MEMPM) in a worst-case setting, i.e., under all possible choices of class-conditional densities with a given mean and covariance matrix. By assuming no specific distributions for the data, our model is thus distinguished from traditional Bayes optimal approaches, where an assumption on the data distribution is a must. This model is extended from the Minimax Probability Machine (MPM), a recently proposed novel classifier, and is demonstrated to be the general case of MPM. Moreover, it includes another special case named the Biased Minimax Probability Machine, which is appropriate for handling biased classification. One appealing feature of MEMPM is that it contains an explicit performance indicator, i.e., a lower bound on the worst-case accuracy, which is shown to be tighter than that of MPM. We provide conditions under which the worst-case Bayes optimal classifier converges to the Bayes optimal classifier. We demonstrate how to apply a more general statistical framework to estimate model input parameters robustly. We also show how to extend our model to nonlinear classification by exploiting kernelization techniques. A series of experiments on both synthetic data sets and real-world benchmark data sets validates our proposition and demonstrates the effectiveness of our model.
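The explicit performance indicator has a closed form for a fixed hyperplane: with kappa = (w'mu + b)/sqrt(w'Sigma w), the one-sided Chebyshev (Marshall-Olkin) bound gives a worst-case accuracy of kappa^2/(1 + kappa^2). A minimal sketch (hypothetical names, evaluating a given hyperplane rather than optimizing one):

```python
import numpy as np

def worst_case_accuracy(w, b, mu, Sigma):
    """Lower bound on P(w.x + b >= 0) over all distributions with mean
    mu and covariance Sigma (one-sided Chebyshev bound). This is the
    per-class quantity that MPM-style classifiers maximize."""
    margin = float(np.dot(w, mu) + b)
    if margin <= 0.0:
        return 0.0  # the bound is vacuous when the class mean is misclassified
    kappa = margin / np.sqrt(w @ Sigma @ w)
    return kappa**2 / (1.0 + kappa**2)
```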
Generalized Chebyshev bounds via semidefinite programming
 SIAM Review
Abstract

Cited by 18 (1 self)
Abstract. A sharp lower bound on the probability of a set defined by quadratic inequalities, given the first two moments of the distribution, can be efficiently computed using convex optimization. This result generalizes Chebyshev’s inequality for scalar random variables. Two semidefinite programming formulations are presented, with a constructive proof based on convex optimization duality and elementary linear algebra. Key words. Semidefinite programming, convex optimization, duality theory, Chebyshev inequalities, moment problems. AMS subject classifications. 90C22, 90C25, 6008.
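The scalar special case being generalized is the classical two-sided Chebyshev inequality, P(|X - mu| >= k*sigma) <= 1/k^2; a quick numerical sanity check (illustrative only, the paper's SDP formulations handle general quadratically defined sets):

```python
import numpy as np

def chebyshev_upper_bound(k):
    """Classical Chebyshev bound on P(|X - mu| >= k*sigma), for k > 0."""
    return 1.0 / k**2

# Empirical tail frequencies of a standard normal stay below the bound.
rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)
for k in (1.5, 2.0, 3.0):
    assert np.mean(np.abs(x) >= k) <= chebyshev_upper_bound(k)
```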
Learning Classifiers from Imbalanced Data Based on Biased Minimax Probability Machine
, 2004
Abstract

Cited by 17 (1 self)
We consider the problem of binary classification on imbalanced data, in which nearly all the instances are labelled as one class, while far fewer instances are labelled as the other, usually more important, class. Traditional machine learning methods that seek accurate performance over the full range of instances are not suitable for this problem, since they tend to classify all the data into the majority, usually less important, class. Moreover, some current methods try to utilize intermediate factors, e.g., the distribution of the training set, the decision thresholds, or the cost matrices, to influence the bias of the classification. However, it remains uncertain whether these methods can improve the performance in a systematic way. In this paper, we propose a novel model named the Biased Minimax Probability Machine. Different from previous methods, this model directly controls the worst-case real accuracy of classification of future data to build biased classifiers, and hence provides a rigorous treatment of imbalanced data. Experimental results comparing the novel model with three competitive methods, i.e., the naive Bayesian classifier, the k-Nearest Neighbor method, and the decision tree method C4.5, demonstrate the superiority of our model.
Static prediction games for adversarial learning problems
 Journal of Machine Learning Research
Abstract

Cited by 16 (0 self)
The standard assumption of identically distributed training and test data is violated when the test data are generated in response to the presence of a predictive model. This becomes apparent, for example, in the context of email spam filtering. Here, email service providers employ spam filters, and spam senders engineer campaign templates to achieve a high rate of successful deliveries despite the filters. We model the interaction between the learner and the data generator as a static game in which the cost functions of the learner and the data generator are not necessarily antagonistic. We identify conditions under which this prediction game has a unique Nash equilibrium and derive algorithms that find the equilibrial prediction model. We derive two instances, the Nash logistic regression and the Nash support vector machine, and empirically explore their properties in a case study on email spam filtering.
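When both cost functions are strongly convex and the coupling between players is weak, the equilibrium can be found by alternating closed-form best responses; a toy sketch with hypothetical scalar quadratic costs (not the paper's regularized losses):

```python
def nash_best_response(c_l, c_a, iters=100):
    """Toy static game: learner cost 0.5*w**2 + 0.5*w*a + c_l*w,
    adversary cost 0.5*a**2 - 0.5*w*a + c_a*a. Each best response is
    closed-form, and alternating them is a contraction that converges
    to the unique Nash equilibrium of this game."""
    w = a = 0.0
    for _ in range(iters):
        w = -(0.5 * a + c_l)  # argmin over w of the learner's cost
        a = 0.5 * w - c_a     # argmin over a of the adversary's cost
    return w, a
```

At the fixed point neither player can improve unilaterally, which is exactly the Nash condition the paper's algorithms enforce for the actual learner and data-generator costs.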
Nash Equilibria of Static Prediction Games
Abstract

Cited by 13 (2 self)
The standard assumption of identically distributed training and test data is violated when an adversary can exercise some control over the generation of the test data. In a prediction game, a learner produces a predictive model while an adversary may alter the distribution of input data. We study single-shot prediction games in which the cost functions of learner and adversary are not necessarily antagonistic. We identify conditions under which the prediction game has a unique Nash equilibrium, and derive algorithms that will find the equilibrial prediction models. In a case study, we explore properties of Nash-equilibrial prediction models for email spam filtering empirically.
Stackelberg games for adversarial prediction problems
 In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
, 2011
Abstract

Cited by 13 (1 self)
The standard assumption of identically distributed training and test data is violated when test data are generated in response to a predictive model. This becomes apparent, for example, in the context of email spam filtering, where an email service provider employs a spam filter and the spam sender can take this filter into account when generating new emails. We model the interaction between learner and data generator as a Stackelberg competition in which the learner plays the role of the leader and the data generator may react to the leader’s move. We derive an optimization problem to determine the solution of this game and present several instances of the Stackelberg prediction game. We show that the Stackelberg prediction game generalizes existing prediction models. Finally, we explore properties of the discussed models empirically in the context of email spam filtering.
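The difference from a static (Nash) game is commitment: the leader minimizes its cost after substituting the follower's best response. With hypothetical scalar quadratic costs (illustrative, not the paper's losses), the Stackelberg solution is closed-form:

```python
def stackelberg_leader(c_l, c_a):
    """Toy Stackelberg game: leader cost 0.5*w**2 + 0.5*w*a + c_l*w,
    follower cost 0.5*a**2 - 0.5*w*a + c_a*a. The follower's best
    response to a committed w is a(w) = 0.5*w - c_a; substituting it
    into the leader's cost gives 0.75*w**2 + (c_l - 0.5*c_a)*w,
    which is minimized in closed form below."""
    w = (0.5 * c_a - c_l) / 1.5
    return w, 0.5 * w - c_a
```

In this toy game with c_l = 1, c_a = 0, the leader's cost at the Stackelberg point (-1/3) is lower than at the game's Nash equilibrium (-0.32), illustrating the value of moving first.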
Robust sparse hyperplane classifiers: application to uncertain molecular profiling data
 Journal of Computational Biology
, 2004
Abstract

Cited by 12 (1 self)
Key words: robust sparse hyperplanes; second-order cone program; linear programming; breast cancer; molecular profiling; two-class high-dimensional data
Gaussian margin machines
 In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS)
, 2009
Abstract

Cited by 11 (4 self)
We introduce Gaussian Margin Machines (GMMs), which maintain a Gaussian distribution over weight vectors for binary classification. The learning algorithm for these machines seeks the least informative distribution that will classify the training data correctly with high probability. One formulation can be expressed as a convex constrained optimization problem whose solution can be represented linearly in terms of training instances and their inner and outer products, supporting kernelization. The algorithm admits a natural PAC-Bayesian justification and is shown to minimize a quantity directly related to a PAC-Bayesian generalization bound. A preliminary evaluation on handwriting recognition data shows that our algorithm improves on SVMs for the same task, achieving lower test error and lower test error variance.
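The per-example constraint behind "classify correctly with high probability" is Gaussian in closed form: under w ~ N(mu, Sigma), the margin y*w'x is normal with mean y*mu'x and variance x'Sigma x, so the probability is a normal CDF. A minimal sketch (hypothetical names, evaluating the constraint rather than solving the optimization):

```python
import math
import numpy as np

def prob_correct(mu, Sigma, x, y):
    """P(y * w.x >= 0) for w ~ N(mu, Sigma): the margin y*w.x is
    Gaussian with mean y*mu.x and variance x'Sigma x, so the
    probability is Phi(mean/std). GMM training requires this to be
    at least some 1 - delta for every training example."""
    mean = y * float(np.dot(mu, x))
    std = math.sqrt(float(x @ Sigma @ x))
    return 0.5 * (1.0 + math.erf(mean / (std * math.sqrt(2.0))))
```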
E.: Integrating the Content and Process of Strategic MIS Planning with Competitive Strategy
 Decision Sciences 22 (5)
, 1991
Abstract

Cited by 10 (0 self)
We review here the recent success in quantum annealing, i.e., optimization of the cost or energy functions of complex systems utilizing quantum fluctuations. The concept is introduced in successive steps through the studies of mapping of such computationally hard problems to the classical spin glass problems. The quantum spin glass problems arise with the introduction of quantum