Results 1–10 of 56
Maximum margin clustering
In Advances in Neural Information Processing Systems 17, 2005
Abstract

Cited by 80 (4 self)
We propose a new method for clustering based on finding maximum margin hyperplanes through data. By reformulating the problem in terms of the implied equivalence relation matrix, we can pose the problem as a convex integer program. Although this still yields a difficult computational problem, the hard-clustering constraints can be relaxed to a soft-clustering formulation which can be feasibly solved with a semidefinite program. Since our clustering technique only depends on the data through the kernel matrix, we can easily achieve nonlinear clusterings in the same manner as spectral clustering. Experimental results show that our maximum margin clustering technique often obtains more accurate results than conventional clustering methods. The real benefit of our approach, however, is that it leads naturally to a semi-supervised training method for support vector machines. By maximizing the margin simultaneously on labeled and unlabeled training data, we achieve state-of-the-art performance using a single, integrated learning principle.
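The relaxation pattern this abstract describes can be sketched schematically (an illustration of the general idea, not the paper's exact program): binary cluster labels $y \in \{-1,+1\}^n$ enter a kernelized margin objective only through the equivalence-relation matrix $M = yy^\top$, so dropping the rank-one integrality requirement while keeping the convex constraints yields a semidefinite program:

\[
\max_{y \in \{-1,+1\}^n} f\bigl(yy^\top\bigr)
\;\;\leadsto\;\;
\max_{M} f(M)
\quad \text{s.t.} \quad M \succeq 0,\;\; \operatorname{diag}(M) = \mathbf{1},
\]

after which the soft solution $M$ can be rounded (e.g., via its leading eigenvector) to recover a hard clustering.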
Quality guarantees on k-optimal solutions for distributed constraint optimization problems
In International Joint Conference on Artificial Intelligence (IJCAI), 2007
Discriminative unsupervised learning of structured predictors
In Proceedings of ICML, 2006
Abstract

Cited by 22 (1 self)
We present a new unsupervised algorithm for training structured predictors that is discriminative, convex, and avoids the use of EM. The idea is to formulate an unsupervised version of structured learning methods, such as maximum margin Markov networks, that can be trained via semidefinite programming. The result is a discriminative training criterion for structured predictors (like hidden Markov models) that remains unsupervised and does not create local minima. To reduce training cost, we reformulate the training procedure to mitigate the dependence on semidefinite programming, and finally propose a heuristic procedure that avoids semidefinite programming entirely. Experimental results show that the convex discriminative procedure can produce better conditional models than conventional Baum-Welch (EM) training.
An Outer Bound for Multiple Access Channels with Correlated Sources, 2006
Abstract

Cited by 13 (4 self)
The capacity region of the multiple access channel with correlated sources remains an open problem. Cover, El Gamal and Salehi gave an achievable region in the form of single-letter entropy and mutual information expressions, without a single-letter converse. Cover, El Gamal and Salehi also suggested a converse in terms of some n-letter mutual informations, which are incomputable. We have proposed an upper bound for the sum rate of this channel in a single-letter expression, by utilizing a new necessary condition for the Markov chain constraint on the valid channel input distributions. In this paper, we extend our results from the sum rate to the entire capacity region. We obtain an outer bound for the capacity region of the multiple access channel with correlated sources in finite-letter expressions.
Robust Support Vector Machine Training via Convex Outlier Ablation
Abstract

Cited by 12 (2 self)
One of the well-known risks of large-margin training methods, such as boosting and support vector machines (SVMs), is their sensitivity to outliers. This risk is normally mitigated by using a soft-margin criterion, such as the hinge loss, to reduce outlier sensitivity. In this paper, we present a more direct approach that explicitly incorporates outlier suppression in the training process. In particular, we show how outlier detection can be encoded in the large-margin training principle of support vector machines. By expressing a convex relaxation of the joint training problem as a semidefinite program, one can use this approach to robustly train a support vector machine while suppressing outliers. We demonstrate that our approach can yield superior results to the standard soft-margin approach in the presence of outliers.
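One natural way to make explicit outlier suppression concrete (a schematic illustration under assumed notation, not necessarily the paper's exact formulation) is to attach a 0/1 indicator $\eta_i$ to each training example's hinge loss and allow at most $k$ examples to be ablated:

\[
\min_{w,\,b}\;\; \min_{\eta \in \{0,1\}^n,\ \sum_i \eta_i \ge n-k}\;\;
\tfrac{1}{2}\|w\|^2 \;+\; C \sum_{i=1}^{n} \eta_i \,\max\bigl(0,\; 1 - y_i(w^\top x_i + b)\bigr).
\]

For fixed $(w, b)$ the relaxed inner problem over $\eta \in [0,1]^n$ is linear, but the joint problem is non-convex, which is why a semidefinite relaxation of the joint training problem is developed.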
Convex relaxations of latent variable training
In Advances in Neural Information Processing Systems 20, 2007
Abstract

Cited by 11 (5 self)
We investigate a new, convex relaxation of an expectation-maximization (EM) variant that approximates a standard objective while eliminating local minima. First, a cautionary result is presented, showing that any convex relaxation of EM over hidden variables must give trivial results if any dependence on the missing values is retained. Although this appears to be a strong negative outcome, we then demonstrate how the problem can be bypassed by using equivalence relations instead of value assignments over hidden variables. In particular, we develop new algorithms for estimating exponential conditional models that only require equivalence relation information over the variable values. This reformulation leads to an exact expression for EM variants in a wide range of problems. We then develop a semidefinite relaxation that yields global training by eliminating local minima.
New results on rationality and strongly polynomial solvability in Eisenberg-Gale markets
In Proceedings of the 2nd Workshop on Internet and Network Economics, 2006
Abstract

Cited by 11 (10 self)
We study the structure of EG[2], the class of Eisenberg-Gale markets with two agents. We prove that all markets in this class are rational and that they admit strongly polynomial algorithms whenever the polytope containing the set of feasible utilities of the two agents can be described via a combinatorial LP. This helps resolve positively the status of two markets left as open problems by [JV]: the capacity allocation market in a directed graph with two source-sink pairs and the network coding market in a directed network with two sources. Our algorithms for solving the corresponding nonlinear convex programs are fundamentally different from those obtained by [JV]; whereas they use the primal-dual schema, we use a carefully constructed binary search.
Distributed Source Coding using Abelian Group Codes: Extracting Performance from Structure
Abstract

Cited by 9 (1 self)
In this work, we consider a distributed source coding problem with a joint distortion criterion depending on the sources and the reconstruction. This includes as special cases the problem of computing a function of the sources to within some distortion, and also the classic Slepian-Wolf problem [12], Berger-Tung problem [5], Wyner-Ziv problem [4], Yeung-Berger problem [6] and the Ahlswede-Korner-Wyner problem [3], [13]. While the prevalent trend in information theory has been to prove achievability results using Shannon's random coding arguments, structured random codes offer rate gains over unstructured random codes for many problems. Motivated by this, we present a new achievable rate-distortion region (an inner bound to the performance limit) for this problem for discrete memoryless sources, based on “good” structured random nested codes built over abelian groups. We demonstrate rate gains for this problem over traditional coding schemes using random unstructured codes. For certain sources and distortion functions, the new rate region is strictly bigger than the Berger-Tung rate region, which has been the best known achievable rate region for this problem until now. Further, there is no known unstructured random coding scheme that achieves these rate gains. Achievable performance limits for single-user source coding using abelian group codes are also obtained as part of the proof of the main coding theorem. As a corollary, we also prove that nested linear codes achieve the Shannon rate-distortion bound in the single-user setting. Note that while group codes retain some structure, they are more general than linear codes, which can only be built over finite fields, which are known to exist only for certain sizes.
Convex Structure Learning for Bayesian Networks: Polynomial Feature Selection and Approximate Ordering, 2006
Abstract

Cited by 8 (2 self)
We present a new approach to learning the structure and parameters of a Bayesian network based on regularized estimation in an exponential family representation. Here we show that, given a fixed variable order, the optimal structure and parameters can be learned efficiently, even without restricting the size of the parent variable sets. We then consider the problem of optimizing the variable order for a given set of features. This is still a computationally hard problem, but we present a convex relaxation that yields an optimal “soft” ordering in polynomial time. One novel aspect of the approach is that we do not perform a discrete search over DAG structures, nor over variable orders, but instead solve a continuous convex relaxation that can then be rounded to obtain a valid network structure. We conduct an experimental comparison against standard structure search procedures over standard objectives, which must cope with local minima, and evaluate the advantages of using convex relaxations that reduce the effects of local minima.
Fast Normalized Cut with Linear Constraints
Abstract

Cited by 5 (1 self)
Normalized Cut is a widely used technique for solving a variety of problems. Although finding the optimal normalized cut has been proven NP-hard, spectral relaxations can be applied, and the problem of minimizing the normalized cut can be approximately solved using eigenvalue computations. However, it is a challenge to incorporate prior information in this approach. In this paper, we express prior knowledge by linear constraints on the solution, with the goal of minimizing the normalized cut criterion with respect to these constraints. We develop a fast and effective algorithm that is guaranteed to converge. Convincing results are achieved on image segmentation tasks, where the prior knowledge is given as the grouping information of features.
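The spectral relaxation this abstract builds on can be illustrated with a small self-contained example (plain NumPy; the paper's contribution of handling linear constraints is not shown here): the second-smallest eigenvector of the normalized Laplacian is a relaxed cut indicator, which is thresholded to obtain a partition.

```python
import numpy as np

# Two well-separated clusters of points on a line.
points = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])

# Gaussian similarity matrix W and degrees d.
W = np.exp(-(points[:, None] - points[None, :]) ** 2)
d = W.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))

# Symmetric normalized Laplacian; its second-smallest eigenvector
# relaxes the discrete normalized-cut indicator vector.
L = np.eye(len(points)) - D_inv_sqrt @ W @ D_inv_sqrt
eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
fiedler = eigvecs[:, 1]

# Threshold at zero to round the relaxed solution to a partition.
labels = (fiedler > 0).astype(int)
print(labels)  # one cluster on each side of the gap
```

Sign ambiguity of eigenvectors means the two clusters may receive labels (0, 1) or (1, 0); only the partition itself is meaningful.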