Maximum margin clustering
 Advances in Neural Information Processing Systems 17
, 2005
Abstract

Cited by 77 (4 self)
We propose a new method for clustering based on finding maximum margin hyperplanes through data. By reformulating the problem in terms of the implied equivalence relation matrix, we can pose the problem as a convex integer program. Although this still yields a difficult computational problem, the hard-clustering constraints can be relaxed to a soft-clustering formulation which can be feasibly solved with a semidefinite program. Since our clustering technique only depends on the data through the kernel matrix, we can easily achieve nonlinear clusterings in the same manner as spectral clustering. Experimental results show that our maximum margin clustering technique often obtains more accurate results than conventional clustering methods. The real benefit of our approach, however, is that it leads naturally to a semi-supervised training method for support vector machines. By maximizing the margin simultaneously on labeled and unlabeled training data, we achieve state-of-the-art performance by using a single, integrated learning principle.
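The objective the abstract describes, finding the labeling whose induced two-class separator has the widest margin, can be illustrated on a toy scale. The sketch below is not the paper's semidefinite relaxation over the equivalence relation matrix; it brute-forces every balanced two-way labeling of 1-D points and keeps the one with the widest separating gap. The function name and data are illustrative:

```python
from itertools import combinations

def max_margin_clustering_1d(points):
    """Brute-force maximum margin clustering of 1-D points into two
    balanced clusters: try every balanced labeling and keep the one
    whose classes are separated by the widest gap (the margin of the
    best separating threshold). Exponential in n; toy use only."""
    n = len(points)
    best_margin, best_labels = -1.0, None
    for pos in combinations(range(n), n // 2):
        pos = set(pos)
        a = [points[i] for i in range(n) if i in pos]
        b = [points[i] for i in range(n) if i not in pos]
        # Separable only if one cluster lies entirely left of the other.
        if max(a) < min(b):
            margin = (min(b) - max(a)) / 2.0
        elif max(b) < min(a):
            margin = (min(a) - max(b)) / 2.0
        else:
            continue
        if margin > best_margin:
            best_margin = margin
            best_labels = [1 if i in pos else -1 for i in range(n)]
    return best_margin, best_labels

margin, labels = max_margin_clustering_1d([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
```

On the six points above the widest-margin balanced split separates {0, 1, 2} from {10, 11, 12}; the paper's contribution is making a relaxation of this search tractable via a semidefinite program rather than enumeration.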
Quality guarantees on k-optimal solutions for distributed constraint optimization
, 2007
Abstract

Cited by 23 (6 self)
A distributed constraint optimization problem (DCOP) is a formalism that captures the rewards and costs of local interactions within a team of agents. Because complete algorithms to solve DCOPs are unsuitable for some dynamic or anytime domains, researchers have explored incomplete DCOP algorithms that result in locally optimal solutions. One way to categorize such algorithms, and the solutions they produce, is k-optimality; a k-optimal solution is one that cannot be improved by any deviation by k or fewer agents. This paper presents the first known guarantees on solution quality for k-optimal solutions. The guarantees are independent of the costs and rewards in the DCOP, and once computed can be used for any DCOP with a given constraint graph structure.
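The definition quoted above (no joint deviation by k or fewer agents improves the solution) can be checked directly on small instances. A minimal sketch assuming binary domains and pairwise reward tables; all names and the example instance are illustrative, not from the paper:

```python
from itertools import combinations, product

def total_reward(assignment, constraints):
    """Sum pairwise constraint rewards; `constraints` maps an agent
    pair (i, j) to a table indexed by the two agents' values."""
    return sum(table[assignment[i]][assignment[j]]
               for (i, j), table in constraints.items())

def is_k_optimal(assignment, constraints, k, domain=(0, 1)):
    """True iff no joint deviation by k or fewer agents strictly
    improves the total reward (the definition of k-optimality)."""
    agents = sorted(assignment)
    base = total_reward(assignment, constraints)
    for size in range(1, k + 1):
        for group in combinations(agents, size):
            for values in product(domain, repeat=size):
                trial = dict(assignment)
                trial.update(zip(group, values))
                if total_reward(trial, constraints) > base:
                    return False
    return True

agree = [[10, 0], [0, 10]]  # reward 10 when the two agents agree
chain = {(0, 1): agree, (1, 2): agree, (2, 3): agree}
halves = {0: 1, 1: 1, 2: 0, 3: 0}  # stuck between two consensus optima
```

Here `halves` is 1-optimal (no single agent can improve the reward of 20) but not 2-optimal: agents 2 and 3 jointly switching to 1 raises the reward to 30, illustrating why the k in k-optimality matters for the quality guarantees.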
Discriminative unsupervised learning of structured predictors
 In Proceedings ICML
, 2006
Abstract

Cited by 21 (1 self)
We present a new unsupervised algorithm for training structured predictors that is discriminative, convex, and avoids the use of EM. The idea is to formulate an unsupervised version of structured learning methods, such as maximum margin Markov networks, that can be trained via semidefinite programming. The result is a discriminative training criterion for structured predictors (like hidden Markov models) that remains unsupervised and does not create local minima. To reduce training cost, we reformulate the training procedure to mitigate the dependence on semidefinite programming, and finally propose a heuristic procedure that avoids semidefinite programming entirely. Experimental results show that the convex discriminative procedure can produce better conditional models than conventional Baum-Welch (EM) training.
An Outer Bound for Multiple Access Channels with Correlated Sources
, 2006
Abstract

Cited by 13 (4 self)
The capacity region of the multiple access channel with correlated sources remains an open problem. Cover, El Gamal and Salehi gave an achievable region in the form of single-letter entropy and mutual information expressions, without a single-letter converse. Cover, El Gamal and Salehi also suggested a converse in terms of some n-letter mutual informations, which are incomputable. We have proposed an upper bound for the sum rate of this channel in a single-letter expression, by utilizing a new necessary condition for the Markov chain constraint on the valid channel input distributions. In this paper, we extend our results from the sum rate to the entire capacity region. We obtain an outer bound for the capacity region of the multiple access channel with correlated sources in finite-letter expressions.
New results on rationality and strongly polynomial solvability in Eisenberg-Gale markets
 In Proceedings of 2nd Workshop on Internet and Network Economics
, 2006
Abstract

Cited by 10 (9 self)
We study the structure of EG[2], the class of Eisenberg-Gale markets with two agents. We prove that all markets in this class are rational and they admit strongly polynomial algorithms whenever the polytope containing the set of feasible utilities of the two agents can be described via a combinatorial LP. This helps resolve positively the status of two markets left as open problems by [JV]: the capacity allocation market in a directed graph with two source-sink pairs and the network coding market in a directed network with two sources. Our algorithms for solving the corresponding nonlinear convex programs are fundamentally different from those obtained by [JV]; whereas they use the primal-dual schema, we use a carefully constructed binary search.
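The paper's binary search operates over polytopes described by combinatorial LPs; the sketch below only illustrates why bisection suffices for such convex programs, on the smallest hypothetical Eisenberg-Gale instance: two agents with equal budgets sharing one divisible good, where the EG objective log(x) + log(C - x) has a strictly decreasing derivative with a unique root:

```python
def eg_two_agent_split(capacity, tol=1e-9):
    """Bisection for the toy two-agent Eisenberg-Gale program
    max log(x) + log(capacity - x): the derivative
    g(x) = 1/x - 1/(capacity - x) is strictly decreasing on
    (0, capacity), so binary search finds its unique root."""
    lo, hi = tol, capacity - tol
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        g = 1.0 / mid - 1.0 / (capacity - mid)
        if g > 0:
            lo = mid  # derivative still positive: optimum lies right
        else:
            hi = mid
    return (lo + hi) / 2.0
```

With equal budgets the good is split evenly (`eg_two_agent_split(10.0)` converges to 5.0); the paper's contribution is constructing an analogous monotone search over combinatorially structured markets so the overall algorithm stays strongly polynomial.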
Robust Support Vector Machine Training via Convex Outlier Ablation
Abstract

Cited by 10 (2 self)
One of the well-known risks of large margin training methods, such as boosting and support vector machines (SVMs), is their sensitivity to outliers. These risks are normally mitigated by using a soft margin criterion, such as hinge loss, to reduce outlier sensitivity. In this paper, we present a more direct approach that explicitly incorporates outlier suppression in the training process. In particular, we show how outlier detection can be encoded in the large margin training principle of support vector machines. By expressing a convex relaxation of the joint training problem as a semidefinite program, one can use this approach to robustly train a support vector machine while suppressing outliers. We demonstrate that our approach can yield superior results to the standard soft margin approach in the presence of outliers.
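The paper encodes outlier indicators jointly in a convex semidefinite relaxation; the sketch below conveys only the ablation idea with drastically simplified stand-ins: a 0-1-loss threshold stump in place of the SVM subproblem, and a single fit/ablate/refit pass in place of the joint formulation. All names and the data are illustrative:

```python
def best_stump(points, labels):
    """Pick the 1-D threshold/sign stump minimizing 0-1 loss; a toy
    stand-in for the SVM training subproblem."""
    xs = sorted(points)
    cands = ([xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
             + [xs[-1] + 1])
    best = None
    for t in cands:
        for s in (1, -1):
            errs = sum(1 for x, y in zip(points, labels)
                       if (s if x > t else -s) != y)
            if best is None or errs < best[0]:
                best = (errs, t, s)
    return best  # (errors, threshold, sign)

def ablate_and_retrain(points, labels):
    """Outlier-ablation sketch: fit, drop the points the best stump
    misclassifies (suspected outliers), then refit on the rest."""
    errs, t, s = best_stump(points, labels)
    keep = [i for i, (x, y) in enumerate(zip(points, labels))
            if (s if x > t else -s) == y]
    pts = [points[i] for i in keep]
    lbs = [labels[i] for i in keep]
    return best_stump(pts, lbs), keep
```

On data with one mislabeled point (a positive example deep inside the negative cluster), the first fit flags exactly that point and the refit separates the remainder perfectly; the paper's convex formulation performs this suppression inside a single semidefinite program rather than as a greedy two-pass heuristic.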
Convex relaxations of latent variable training
 In Advances in Neural Information Processing Systems 20
, 2007
Abstract

Cited by 10 (5 self)
We investigate a new, convex relaxation of an expectation-maximization (EM) variant that approximates a standard objective while eliminating local minima. First, a cautionary result is presented, showing that any convex relaxation of EM over hidden variables must give trivial results if any dependence on the missing values is retained. Although this appears to be a strong negative outcome, we then demonstrate how the problem can be bypassed by using equivalence relations instead of value assignments over hidden variables. In particular, we develop new algorithms for estimating exponential conditional models that only require equivalence relation information over the variable values. This reformulation leads to an exact expression for EM variants in a wide range of problems. We then develop a semidefinite relaxation that yields global training by eliminating local minima.
Distributed Source Coding using Abelian Group Codes: Extracting Performance from Structure
Abstract

Cited by 9 (1 self)
In this work, we consider a distributed source coding problem with a joint distortion criterion depending on the sources and the reconstruction. This includes as special cases the problem of computing a function of the sources to within some distortion, as well as the classic Slepian-Wolf problem [12], the Berger-Tung problem [5], the Wyner-Ziv problem [4], the Yeung-Berger problem [6] and the Ahlswede-Korner-Wyner problem [3], [13]. While the prevalent trend in information theory has been to prove achievability results using Shannon’s random coding arguments, structured random codes offer rate gains over unstructured random codes for many problems. Motivated by this, we present a new achievable rate-distortion region (an inner bound to the performance limit) for this problem for discrete memoryless sources, based on “good” structured random nested codes built over abelian groups. We demonstrate rate gains for this problem over traditional coding schemes that use random unstructured codes. For certain sources and distortion functions, the new rate region is strictly bigger than the Berger-Tung rate region, which has been the best known achievable rate region for this problem until now. Further, no known unstructured random coding scheme achieves these rate gains. Achievable performance limits for single-user source coding using abelian group codes are also obtained as part of the proof of the main coding theorem. As a corollary, we also prove that nested linear codes achieve the Shannon rate-distortion bound in the single-user setting. Note that while group codes retain some structure, they are more general than linear codes, which can only be built over finite fields, which are known to exist only for certain sizes.
Convex Structure Learning for Bayesian Networks: Polynomial Feature Selection and Approximate Ordering
, 2006
Abstract

Cited by 7 (2 self)
We present a new approach to learning the structure and parameters of a Bayesian network based on regularized estimation in an exponential family representation. Here we show that, given a fixed variable order, the optimal structure and parameters can be learned efficiently, even without restricting the size of the parent variable sets. We then consider the problem of optimizing the variable order for a given set of features. This is still a computationally hard problem, but we present a convex relaxation that yields an optimal “soft” ordering in polynomial time. One novel aspect of the approach is that we do not perform a discrete search over DAG structures, nor over variable orders, but instead solve a continuous convex relaxation that can then be rounded to obtain a valid network structure. We conduct an experimental comparison against standard structure search procedures over standard objectives, which must cope with local minima, and evaluate the advantages of using convex relaxations that reduce the effects of local minima.
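For the fixed-order case, a linear-Gaussian stand-in (the paper works with general exponential family representations) shows the shape of the computation: each variable is regressed on its predecessors in the order, and predecessors with non-negligible coefficients are kept as parents. Function names, the regularizer, and the threshold are all illustrative assumptions:

```python
import numpy as np

def structure_given_order(X, order, lam=1.0, thresh=0.1):
    """Given a fixed variable order, fit each variable on its
    predecessors with ridge regression and keep predecessors whose
    coefficient magnitude exceeds `thresh` as parents. A linear-
    Gaussian sketch of per-node regularized estimation."""
    parents = {order[0]: []}
    for pos in range(1, len(order)):
        child = order[pos]
        preds = order[:pos]
        A = X[:, preds]
        # Ridge solution: (A^T A + lam I)^{-1} A^T x_child
        coef = np.linalg.solve(A.T @ A + lam * np.eye(len(preds)),
                               A.T @ X[:, child])
        parents[child] = [p for p, c in zip(preds, coef)
                          if abs(c) > thresh]
    return parents
```

On data generated from the chain x0 → x1 → x2 this recovers exactly that chain; the hard part the paper addresses is that the order itself is unknown, which it handles with a convex relaxation over "soft" orderings rather than a discrete search.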
Adaptive Large Margin Training for Multilabel Classification
Abstract

Cited by 4 (4 self)
Multilabel classification is a central problem in many areas of data analysis, including text and multimedia categorization, where individual data objects need to be assigned multiple labels. A key challenge in these tasks is to learn a classifier that can properly exploit label correlations without requiring exponential enumeration of label subsets during training or testing. We investigate novel loss functions for multilabel training within a large margin framework, identifying a simple alternative that yields improved generalization while still allowing efficient training. We furthermore show how covariances between the label models can be learned simultaneously with the classification model itself, in a jointly convex formulation, without compromising scalability. The resulting combination yields state-of-the-art accuracy in multilabel webpage classification.
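One common family of large margin multilabel losses penalizes any irrelevant label scored within a unit margin of a relevant one; whether this matches the specific alternative the paper identifies is not claimed here. A minimal sketch of that loss:

```python
def multilabel_margin_loss(scores, relevant):
    """Separation-ranking hinge: zero iff every relevant label
    outscores every irrelevant label by at least a unit margin,
    otherwise linear in the worst margin violation."""
    irrelevant = [i for i in range(len(scores)) if i not in relevant]
    if not relevant or not irrelevant:
        return 0.0
    worst_rel = min(scores[i] for i in relevant)
    best_irr = max(scores[i] for i in irrelevant)
    return max(0.0, 1.0 - (worst_rel - best_irr))
```

For example, with scores [2.0, 0.5, -1.0] and relevant labels {0, 1} the margin condition holds and the loss is zero, whereas scores [2.0, 0.2, 0.0] violate the unit margin between labels 1 and 2. Because the loss depends only on the extreme relevant and irrelevant scores, it avoids enumerating label subsets, the efficiency concern raised in the abstract.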