Using machine teaching to identify optimal training-set attacks on machine learners
 in The Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015
Cited by 3 (3 self)
We investigate a problem at the intersection of machine learning and security: training-set attacks on machine learners. In such attacks an attacker contaminates the training data so that a specific learning algorithm would produce a model profitable to the attacker. Understanding training-set attacks is important as more intelligent agents (e.g., spam filters and robots) are equipped with learning capability and can potentially be hacked via data they receive from the environment. This paper identifies the optimal training-set attack on a broad family of machine learners. First we show that the optimal training-set attack can be formulated as a bilevel optimization problem. Then we show that for machine learners with certain Karush-Kuhn-Tucker conditions we can solve the bilevel problem efficiently using gradient methods on an implicit function. As examples, we demonstrate optimal training-set attacks on Support Vector Machines, logistic regression, and linear regression with extensive experiments. Finally, we discuss potential defenses against such attacks.
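The bilevel idea in this abstract can be illustrated with a deliberately small sketch. It is not the paper's method or code: it assumes a one-dimensional ridge regression learner, an attacker who may only alter labels, and a quadratic effort penalty. The names `learn_ridge`, `attack_labels`, `lam`, and `mu` are hypothetical. The learner's stationarity condition stands in for the KKT conditions, and differentiating it implicitly gives the gradient of the learned weight with respect to each label.

```python
import random

def learn_ridge(xs, ys, lam):
    """Lower-level problem: 1-D ridge regression, solved in closed form."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def attack_labels(xs, ys_clean, w_target, lam=1.0, mu=0.1, steps=200):
    """Upper-level problem (assumed form): minimise
    0.5 * (w(y) - w_target)^2 + 0.5 * mu * ||y - y_clean||^2 over labels y."""
    a = sum(x * x for x in xs) + lam
    # Learner's stationarity condition: a * w - sum_i x_i * y_i = 0.
    # Differentiating it implicitly with respect to y_i gives dw/dy_i = x_i / a.
    jac = [x / a for x in xs]
    # Step size 1/L, where L is the largest curvature of the attacker objective.
    step = 1.0 / (sum(j * j for j in jac) + mu)
    ys = list(ys_clean)
    for _ in range(steps):
        w = learn_ridge(xs, ys, lam)
        ys = [y - step * (j * (w - w_target) + mu * (y - y0))
              for y, j, y0 in zip(ys, jac, ys_clean)]
    return ys

# Synthetic data (hypothetical): the clean slope is about 2, the attacker wants -1.
random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(30)]
ys_clean = [2.0 * x + random.gauss(0.0, 0.1) for x in xs]
w_target = -1.0

ys_poisoned = attack_labels(xs, ys_clean, w_target)
w_clean = learn_ridge(xs, ys_clean, 1.0)
w_poisoned = learn_ridge(xs, ys_poisoned, 1.0)
print("clean weight:", w_clean, "poisoned weight:", w_poisoned)
```

In this toy setting the attacker's objective is a convex quadratic, so plain gradient descent suffices. The effort weight `mu` controls how far the learned weight is allowed to move toward the target.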
The Security of Latent Dirichlet Allocation
Cited by 1 (1 self)
Latent Dirichlet allocation (LDA) is an increasingly popular tool for data analysis in many domains. If LDA output affects decision making (especially when money is involved), there is an incentive for attackers to compromise it. We ask the question: how can an attacker minimally poison the corpus so that LDA produces topics that the attacker wants the LDA user to see? Answering this question is important to characterize such attacks, and to develop defenses in the future. We give a novel bilevel optimization formulation to identify the optimal poisoning attack. We present an efficient solution (up to local optima) using a descent method and implicit functions. We demonstrate poisoning attacks on LDA with extensive experiments, and discuss possible defenses.
Pattern recognition systems under attack
 Progress in Pattern Rec., Image Analysis, Computer Vision, and Applications, vol. 8258 of LNCS, 2013
Cited by 1 (1 self)
We analyze the problem of designing pattern recognition systems in adversarial settings from an engineering viewpoint, motivated by their increasing use in security-sensitive applications like spam and malware detection, even though their vulnerability to potential attacks is not yet deeply understood. We first review previous work and report examples of how a complex system may be evaded either by leveraging trivial vulnerabilities of its untrained components, e.g., parsing errors in the preprocessing steps, or by exploiting more subtle vulnerabilities of learning algorithms. We then discuss the need to exploit both reactive and proactive security paradigms in a complementary way to improve security by design. Our ultimate goal is to provide some useful guidelines for improving the security of pattern recognition in adversarial settings, and to suggest related open issues to foster research in this area.
On Robustness and Regularization of Structural Support Vector Machines
Previous analysis of binary support vector machines (SVMs) has demonstrated a deep connection between robustness to perturbations over uncertainty sets and regularization of the weights. In this paper, we explore the problem of learning robust models for structured prediction problems. We first formulate the problem of learning robust structural SVMs when there are perturbations in the sample space, and show how we can construct corresponding bounds on the perturbations in the feature space. We then show that robustness to perturbations in the feature space is equivalent to additional regularization. For an ellipsoidal uncertainty set, the additional regularizer is based on the dual norm of the norm that constrains the ellipsoidal uncertainty. For a polyhedral uncertainty set, the robust optimization problem is equivalent to adding a linear regularizer in a transformed weight space related to the linear constraints of the polyhedron. We also show that these constraint sets can be combined and demonstrate a number of interesting special cases. This represents the first theoretical analysis of robust optimization of structural support vector machines. Our experimental results show that our method outperforms the non-robust structural SVMs on real-world data when the test data distribution has drifted from the training data distribution.
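The robustness-regularization connection mentioned in the first sentence can be checked numerically in the simplest setting. This is only a sketch of the binary linear case, not the structural SVM analysis of the paper: it assumes a linear classifier, hinge loss, and an L2 ball of radius `eps` around one input. The names `worst_case_hinge` and `regularized_hinge` and the numbers are hypothetical. Under these assumptions, the worst-case hinge loss over the ball equals the ordinary hinge loss with an extra `eps * ||w||` term inside the hinge.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def hinge(w, x, y):
    """Hinge loss of a linear classifier with weights w on example (x, y), y in {-1, +1}."""
    return max(0.0, 1.0 - y * dot(w, x))

def worst_case_hinge(w, x, y, eps):
    """Hinge loss at the worst perturbation in the L2 ball of radius eps.
    The maximiser moves x against the margin: delta = -eps * y * w / ||w||."""
    scale = -eps * y / norm(w)
    x_adv = [xi + scale * wi for xi, wi in zip(x, w)]
    return hinge(w, x_adv, y)

def regularized_hinge(w, x, y, eps):
    """Hinge loss with the extra eps * ||w|| term (L2 is its own dual norm)."""
    return max(0.0, 1.0 - y * dot(w, x) + eps * norm(w))

# Hypothetical numbers for illustration.
w, x, y, eps = [0.8, -0.5], [1.0, 2.0], 1, 0.3
robust = worst_case_hinge(w, x, y, eps)
regularized = regularized_hinge(w, x, y, eps)

# Sanity check: random perturbations inside the ball never exceed the worst case.
random.seed(1)
sampled_max = 0.0
for _ in range(1000):
    angle = random.uniform(0.0, 2.0 * math.pi)
    radius = eps * random.random()
    delta = [radius * math.cos(angle), radius * math.sin(angle)]
    sampled_max = max(sampled_max, hinge(w, [a + b for a, b in zip(x, delta)], y))
print(robust, regularized, sampled_max)
```

The sampled maximum is only a lower bound on the true worst case, so it serves as a consistency check rather than a proof.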
Towards Adversarial Reasoning in Statistical Relational Domains
Statistical relational artificial intelligence combines first-order logic and probability in order to handle the complexity and uncertainty present in many real-world domains. However, many real-world domains also include multiple agents that cooperate or compete according to their diverse goals. In order to handle such domains, an autonomous agent must also consider the actions of other agents. In this paper, we show that existing statistical relational modeling and inference techniques can be readily adapted to certain adversarial or non-cooperative scenarios. We also discuss how learning methods can be adapted to be robust to the behavior of adversaries. Extending and applying these methods to real-world problems will extend the scope and impact of statistical relational artificial intelligence.
TU Dortmund University
We propose relational linear programming, a simple framework for combining linear programs (LPs) and logic programs. A relational linear program (RLP) is a declarative LP template defining the objective and the constraints through the logical concepts of objects, relations, and quantified variables. This allows one to express the LP objective and constraints relationally for a varying number of individuals and relations among them without enumerating them. Together with a logical knowledge base, effectively a logic program consisting of logical facts and rules, it induces a ground LP. This ground LP is solved using lifted linear programming. That is, symmetries within the ground LP are employed to reduce its dimensionality, if possible, and the reduced program is solved using any off-the-shelf LP solver. In contrast to mainstream LP template languages like AMPL, which feature a mixture of declarative and imperative programming styles, RLP's relational nature allows a more intuitive representation of optimization problems over relational domains. We illustrate this empirically by experiments on approximate inference in Markov logic networks using LP relaxations, on solving Markov decision processes, and on collective inference using LP support vector machines.