Results 1–10 of 14
Curvature and Optimal Algorithms for Learning and Minimizing Submodular Functions
 In NIPS
, 2013
Abstract

Cited by 9 (6 self)
We investigate three related and important problems connected to machine learning: approximating a submodular function everywhere, learning a submodular function (in a PAC-like setting [28]), and constrained minimization of submodular functions. We show that the complexity of all three problems depends on the “curvature” of the submodular function, and provide lower and upper bounds that refine and improve previous results [2, 6, 8, 27]. Our proof techniques are fairly generic. We either use a black-box transformation of the function (for approximation and learning), or a transformation of algorithms to use an appropriate surrogate function (for minimization). Curiously, curvature has been known to influence approximations for submodular maximization [3, 29], but its effect on minimization, approximation, and learning has hitherto been open. We complete this picture, and also support our theoretical claims with empirical results.
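The "curvature" the abstract refers to can be made concrete with a short sketch. The sketch below computes the total curvature of a monotone submodular function, kappa_f = 1 - min_j [f(V) - f(V \ {j})] / f({j}); the weighted coverage function, its sets, and its weights are illustrative examples, not taken from the paper.

```python
def curvature(f, ground_set):
    """Total curvature: kappa_f = 1 - min_j [f(V) - f(V minus {j})] / f({j})."""
    V = set(ground_set)
    ratios = []
    for j in V:
        gain_last = f(V) - f(V - {j})   # marginal gain of j when added last
        gain_first = f({j})             # marginal gain of j when added first
        if gain_first > 0:
            ratios.append(gain_last / gain_first)
    return 1.0 - min(ratios)

# Illustrative weighted coverage function (a monotone submodular function).
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"d"}}
w = {"a": 1.0, "b": 2.0, "c": 1.0, "d": 1.0}

def cover(S):
    covered = set().union(*(sets[i] for i in S)) if S else set()
    return sum(w[u] for u in covered)

print(curvature(cover, {1, 2, 3}))  # 2/3 here: sets 1 and 2 overlap on "b"
```

Curvature 0 corresponds to a modular (linear) function and curvature 1 to a fully curved one; the paper's bounds interpolate between these extremes.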
Fast Multi-Stage Submodular Maximization
, 2014
Abstract

Cited by 7 (3 self)
Motivated by extremely large-scale machine learning problems, we introduce a new multi-stage algorithmic framework for submodular maximization (called MultGreed), where at each stage we apply an approximate greedy procedure to maximize surrogate submodular functions. The surrogates serve as proxies for a target submodular function but require less memory and are easy to evaluate. We theoretically analyze the performance guarantee of the multi-stage framework and give examples of how to design instances of MultGreed for a broad range of natural submodular functions. We show that MultGreed performs very closely to the standard greedy algorithm given appropriate surrogate functions, and argue how our framework can easily be integrated with distributed algorithms for further optimization. We complement our theory by empirically evaluating on several real-world problems, including data subset selection on millions of speech samples, where MultGreed yields at least a thousand-fold speedup and superior results over state-of-the-art selection methods.
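The baseline this framework is measured against is the standard greedy algorithm for cardinality-constrained submodular maximization, sketched below. The objective `f` and ground set are illustrative stand-ins; the paper's contribution is replacing the expensive `f` in early stages with cheaper surrogates, which this sketch does not do.

```python
def greedy_max(f, ground_set, k):
    """Standard greedy: repeatedly add the element with the largest marginal gain.

    For monotone submodular f, the result is within a (1 - 1/e) factor of optimal.
    """
    S = set()
    for _ in range(k):
        best, best_gain = None, float("-inf")
        for v in ground_set - S:
            gain = f(S | {v}) - f(S)   # marginal gain of v given S
            if gain > best_gain:
                best, best_gain = v, gain
        S.add(best)
    return S

# Illustrative coverage objective: number of distinct items covered.
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"d"}}
f = lambda S: len(set().union(*(sets[i] for i in S))) if S else 0
chosen = greedy_max(f, {1, 2, 3}, 2)
print(f(chosen))  # the chosen pair covers 3 distinct items
```

Each greedy step makes one pass over the remaining ground set; a multi-stage scheme cuts the cost of these passes by evaluating a cheaper proxy instead of `f` itself.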
Composable Coresets for Diversity and Coverage Maximization (Extended Abstract)
, 2014
Abstract

Cited by 5 (0 self)
In this paper we consider efficient construction of “composable coresets” for basic diversity and coverage maximization problems. A coreset for a point set in a metric space is a subset of the point set with the property that an approximate solution to the whole point set can be obtained given the coreset alone. A composable coreset has the property that, for a collection of sets, an approximate solution to the union of the sets in the collection can be obtained given the union of the composable coresets for the point sets in the collection. Using composable coresets one can obtain efficient solutions to a wide variety of massive data processing applications, including nearest neighbor search, streaming algorithms, and MapReduce computation. Our main results are algorithms for constructing composable coresets for these problems.
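The composability pattern described above can be sketched in a few lines: build a small coreset per partition, take the union of the coresets, and solve on that union. The sketch below uses greedy farthest-point (Gonzalez-style) selection for diversity maximization on a toy 1-D metric; the data, the metric, and the choice of selection rule are illustrative assumptions, not the paper's specific constructions.

```python
def gonzalez(points, k, dist):
    """Greedy farthest-point selection: a small diverse subset of `points`."""
    core = [points[0]]
    while len(core) < k and len(core) < len(points):
        # Add the point farthest from everything selected so far.
        far = max(points, key=lambda p: min(dist(p, c) for c in core))
        core.append(far)
    return core

dist = lambda a, b: abs(a - b)          # toy 1-D metric
partitions = [[0.0, 0.1, 5.0], [2.0, 9.0, 9.1], [4.0, 4.2]]

# Composability: union the per-partition coresets, then solve on the union.
union_coreset = [p for part in partitions for p in gonzalez(part, 2, dist)]
final = gonzalez(union_coreset, 2, dist)
print(sorted(final))  # [0.0, 9.1], the most-dispersed pair of the full data
```

This two-round shape is exactly what makes composable coresets convenient for streaming and MapReduce settings: each partition is summarized independently, and only the small summaries are communicated.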
Machine teaching: an inverse problem to machine learning and an approach toward optimal education
 In the Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI “Blue Sky” Senior Member Presentation Track)
, 2015
Abstract

Cited by 5 (4 self)
I draw the reader’s attention to machine teaching, the problem of finding an optimal training set given a machine learning algorithm and a target model. In addition to generating fascinating mathematical questions for computer scientists to ponder, machine teaching holds the promise of enhancing education and personnel training. The Socratic dialogue style aims to stimulate critical thinking.
Provable submodular minimization using Wolfe’s algorithm
 In NIPS
, 2014
Abstract

Cited by 4 (0 self)
Owing to several applications in large-scale learning and vision problems, fast submodular function minimization (SFM) has become a critical problem. Theoretically, unconstrained SFM can be performed in polynomial time [10, 11]. However, these algorithms are typically not practical. In 1976, Wolfe [21] proposed an algorithm to find the minimum Euclidean norm point in a polytope, and in 1980, Fujishige [3] showed how Wolfe’s algorithm can be used for SFM. For general submodular functions, this Fujishige-Wolfe minimum norm algorithm seems to have the best empirical performance. Despite its good practical performance, very little is known about Wolfe’s minimum norm algorithm theoretically. To our knowledge, the only result is an exponential time analysis due to Wolfe [21] himself. In this paper we give a maiden convergence analysis of Wolfe’s algorithm. We prove that in t iterations, Wolfe’s algorithm returns an O(1/t)-approximate solution to the min-norm point on any polytope. We also prove a robust version of Fujishige’s theorem which shows that an O(1/n²)-approximate solution to the min-norm point on the base polytope implies exact submodular minimization. As a corollary, we get the first pseudo-polynomial time guarantee for the Fujishige-Wolfe minimum norm algorithm for unconstrained submodular function minimization.
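Each iteration of Wolfe's algorithm calls a linear-optimization oracle over the base polytope B(f), and for submodular f that oracle is Edmonds' greedy algorithm: sort the ground set by increasing weight and take marginal gains in that order, yielding the vertex of B(f) minimizing ⟨w, x⟩. (Fujishige's theorem then recovers a minimizer of f by thresholding the resulting min-norm point at zero.) The sketch below shows only this oracle step, on an illustrative submodular function f(S) = sqrt(|S|), not the full Fujishige-Wolfe procedure.

```python
import math

def base_polytope_vertex(f, ground, w):
    """Edmonds' greedy oracle: the vertex of B(f) minimizing <w, x>.

    Sort elements by increasing weight w, then assign each element its
    marginal gain with respect to the prefix before it.
    """
    order = sorted(ground, key=lambda i: w[i])
    x, prefix = {}, set()
    for i in order:
        x[i] = f(prefix | {i}) - f(prefix)   # marginal gain of i
        prefix.add(i)
    return x

f = lambda S: math.sqrt(len(S))              # illustrative submodular function
v = base_polytope_vertex(f, {0, 1, 2}, {0: 0.5, 1: -1.0, 2: 0.0})
print(v)  # element 1 (smallest weight) gets the largest marginal, 1.0
```

By construction the coordinates telescope to f(V), i.e. the point lies on the base of the polytope, which is the defining property of a B(f) vertex.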
Learning coverage functions and private release of marginals
 In COLT
, 2014
Abstract

Cited by 3 (1 self)
We study the problem of approximating and learning coverage functions. A function c : 2^[n] → R₊ is a coverage function if there exists a universe U with nonnegative weights w(u) for each u ∈ U and subsets A1, A2, ..., An of U such that c(S) = Σ_{u ∈ ∪_{i∈S} Ai} w(u). Alternatively, coverage functions can be described as nonnegative linear combinations of monotone disjunctions. They are a natural subclass of submodular functions and arise in a number of applications. We give an algorithm that, for any γ, δ > 0, given random and uniform examples of an unknown coverage function c, finds a function h that approximates c within factor 1 + γ on all but a δ-fraction of the points, in time poly(n, 1/γ, 1/δ). This is the first fully polynomial algorithm for learning an interesting class of functions in the demanding PMAC model of Balcan and Harvey [2012]. Our algorithms are based on several new structural properties of coverage functions. Using the results in [Feldman and Kothari, 2014], we also show that coverage functions are learnable agnostically with excess ℓ1-error ε over all product and symmetric distributions in time n^{log(1/ε)}. In contrast, we show that, without assumptions on the distribution, learning coverage functions is at least as hard as learning polynomial-size disjoint DNF formulas, a class of functions for which the best known algorithm runs in time 2^{Õ(n^{1/3})} [Klivans and Servedio, 2004]. As an application of our learning results, we give simple differentially private algorithms for releasing monotone conjunction counting queries with low average error. In particular, for any k ≤ n, we obtain private release of k-way marginals with average error ᾱ in time n^{O(log(1/ᾱ))}.
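The definition c(S) = Σ_{u ∈ ∪_{i∈S} Ai} w(u) is easy to instantiate directly, and a small example also makes the diminishing-returns (submodularity) property visible. The universe, sets, and weights below are illustrative, not from the paper.

```python
# Illustrative coverage instance: universe {u1, u2, u3} with weights w,
# and subsets A_1, A_2, A_3 of the universe.
A = {1: {"u1", "u2"}, 2: {"u2", "u3"}, 3: {"u3"}}
w = {"u1": 2.0, "u2": 1.0, "u3": 3.0}

def c(S):
    """Coverage function: total weight of the union of the chosen sets."""
    covered = set().union(*(A[i] for i in S)) if S else set()
    return sum(w[u] for u in covered)

# Diminishing returns: set 3 adds weight 3.0 on top of {1}, but nothing
# on top of {1, 2}, since set 2 already covers u3.
print(c({1, 3}) - c({1}))        # 3.0
print(c({1, 2, 3}) - c({1, 2}))  # 0.0
```

Equivalently, each universe element u contributes w(u) times the monotone disjunction "some i ∈ S has u ∈ Ai", matching the abstract's second characterization.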
On approximate non-submodular minimization via tree-structured supermodularity
 In the 18th International Conference on Artificial Intelligence and Statistics (AISTATS 2015)
, 2015
Abstract

Cited by 1 (1 self)
We address the problem of minimizing non-submodular functions where the supermodularity is restricted to tree-structured pairwise terms. We are motivated by several real-world applications which require submodularity along with structured supermodularity, and this forms a rich class of expressive models in which the non-submodularity is restricted to a tree. While this problem is NP-hard (as we show), we develop several practical algorithms to find approximate and near-optimal solutions; some of these provide lower bounds and others upper bounds, thereby allowing us to compute a tightness gap for any problem instance. We compare our algorithms on synthetic data, and also demonstrate the advantage of the formulation on the real-world application of image segmentation, where we incorporate structured supermodularity into higher-order submodular energy minimization.
Fast Multi-Stage Submodular Maximization: Extended version
Abstract

Cited by 1 (1 self)
Motivated by extremely large-scale machine learning problems, we introduce a new multi-stage algorithmic framework for submodular maximization (called MultGreed), where at each stage we apply an approximate greedy procedure to maximize surrogate submodular functions. The surrogates serve as proxies for a target submodular function but require less memory and are easy to evaluate. We theoretically analyze the performance guarantee of the multi-stage framework and give examples of how to design instances of MultGreed for a broad range of natural submodular functions. We show that MultGreed performs very closely to the standard greedy algorithm given appropriate surrogate functions, and argue how our framework can easily be integrated with distributed algorithms for further optimization. We complement our theory by empirically evaluating on several real-world problems, including data subset selection on millions of speech samples, where MultGreed yields at least a thousand-fold speedup and superior results over state-of-the-art selection methods.
Deep Submodular Functions: Definitions & Learning
Abstract
We propose and study a new class of submodular functions called deep submodular functions (DSFs). We define DSFs and situate them within the broader context of classes of submodular functions, in relationship both to various matroid ranks and to sums of concave functions composed with modular functions (SCMs). Notably, we find that DSFs constitute a strictly broader class than SCMs, thus motivating their use, but that they do not comprise all submodular functions. Interestingly, some DSFs can be seen as special cases of certain deep neural networks (DNNs), hence the name. Finally, we provide a method to learn DSFs in a max-margin framework, and offer preliminary results applying this to both synthetic and real-world data instances.
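A minimal sketch of the layered structure the abstract describes: modular (weighted-sum) features at the bottom, a concave activation per unit, and a concave combination with nonnegative weights on top. The specific features, weights, and the choice of sqrt as the activation are illustrative assumptions; the brute-force check simply verifies diminishing returns on this tiny instance.

```python
import math
from itertools import combinations

feat_a = {0: 1.0, 1: 2.0, 2: 0.5}   # bottom-layer modular features (nonnegative)
feat_b = {0: 0.5, 1: 0.5, 2: 3.0}

def dsf(S):
    """Two-layer DSF: concave (sqrt) over a nonnegative mix of concave-of-modular units."""
    unit_a = math.sqrt(sum(feat_a[i] for i in S))
    unit_b = math.sqrt(sum(feat_b[i] for i in S))
    return math.sqrt(0.6 * unit_a + 0.4 * unit_b)

def is_submodular(f, V):
    """Brute-force diminishing-returns check on a small ground set."""
    for r in range(len(V)):
        for A in combinations(V, r):
            A = set(A)
            for B in (A | {x} for x in V - A):      # A subset of B, |B| = |A| + 1
                for v in V - B:
                    if f(A | {v}) - f(A) < f(B | {v}) - f(B) - 1e-12:
                        return False
    return True

print(is_submodular(dsf, {0, 1, 2}))  # True
```

A single sqrt-of-modular unit would be an SCM; nesting a second concave layer on top is what moves the example into the deeper class the paper studies.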
Query Workload-Based RDF Graph Fragmentation and Allocation
Abstract
As the volume of RDF data becomes increasingly large, it is essential to design distributed database systems to manage it. In distributed RDF data design, it is common to partition the RDF data into parts, called fragments, which are then distributed. Thus, the distribution design consists of two steps: fragmentation and allocation. In this paper, we propose a method that exploits the intrinsic similarities among the structures of queries in a workload for fragmentation and allocation, with the aim of reducing the number of crossing matches and the communication cost during SPARQL query processing. Specifically, we mine and select frequent access patterns that reflect the characteristics of the workload. Based on the selected frequent access patterns, we propose two fragmentation strategies, vertical and horizontal, to divide RDF graphs while meeting different query processing objectives: vertical fragmentation is for better throughput, and horizontal fragmentation is for better performance. After fragmentation, we discuss how to allocate these fragments to the various sites, and finally how to process a query given the results of fragmentation and allocation. Extensive experiments confirm the superior performance of our proposed solutions.