Simultaneous Unsupervised Learning of Disparate Clusterings
Cited by 14 (0 self)
Most clustering algorithms produce a single clustering for a given data set even when the data can be clustered naturally in multiple ways. In this paper, we address the difficult problem of uncovering disparate clusterings from the data in a totally unsupervised manner. We propose two new approaches for this problem. In the first approach, we aim to find good clusterings of the data that are also decorrelated with one another. To this end, we give a new and tractable characterization of decorrelation between clusterings, and present an objective function to capture it. We provide an iterative "decorrelated" k-means type algorithm to minimize this objective function. In the second approach, we model the data as a sum of mixtures and associate each mixture with a clustering. This approach leads us to the problem of learning a convolution of mixture distributions. Though the latter problem can be formulated as one of factorial learning [8, 13, 16], the existing formulations and methods do not perform well on many real high-dimensional data sets. We propose a new regularized factorial learning framework that is more suitable for capturing the notion of disparate clusterings in modern, high-dimensional data sets. The resulting algorithm performs well at uncovering multiple clusterings and improves substantially on existing methods. We evaluate our methods on two real-world data sets: a music data set from the text mining domain, and a portrait data set from the computer vision domain. Our methods achieve substantially higher accuracy than existing factorial learning and traditional clustering algorithms.
Modified Descent Methods for Solving the Monotone Variational Inequality Problem
Operations Research Letters, 1998
Cited by 11 (4 self)
Recently, Fukushima proposed a differentiable optimization framework for solving strictly monotone and continuously differentiable variational inequalities. The main result of this paper is to show that Fukushima's results can be extended to monotone (not necessarily strictly monotone) and Lipschitz continuous (not necessarily continuously differentiable) variational inequalities, if one is willing to modify slightly the basic algorithmic scheme. The modification applies also to a general descent scheme introduced by Zhu and Marcotte.
Keywords: Variational inequalities. Descent methods. Projection. Global convergence.
1 Introduction
Let $C$ be a nonempty, closed and convex subset of $\mathbb{R}^n$ and let $F$ be a mapping from $\mathbb{R}^n$ into $\mathbb{R}^n$. We consider the variational inequality problem (VIP): Find $x^* \in C$ such that $\langle F(x^*), x - x^* \rangle \ge 0$ for all $x \in C$, (1) where $\langle \cdot, \cdot \rangle$ denotes the standard Euclidean inner product in $\mathbb{R}^n$. Traditionally, algorithms for solving variati...
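To make the variational inequality problem concrete, here is a minimal Python sketch on a toy instance. Everything in it is an assumption chosen for illustration and does not come from the paper: the feasible set is the unit box in the plane, the mapping is the affine map F(x) = Mx + q with a symmetric positive definite M (so F is strongly monotone and Lipschitz), and the solver is the basic fixed-step projection iteration, not the modified descent scheme the abstract describes. The sketch relies on the standard fact that a point of C solves the VIP exactly when it is a fixed point of x -> P_C(x - F(x)), where P_C is the Euclidean projection onto C.

```python
# Toy VIP: find x* in C = [0, 1]^2 with <F(x*), x - x*> >= 0 for all x in C.
# All names and numbers below are hypothetical, chosen only for illustration.

def project_box(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n (componentwise clipping)."""
    return [min(max(xi, lo), hi) for xi in x]

def F(x):
    """Affine map F(x) = M x + q with M = [[2, 1], [1, 2]] and q = [-0.5, -4]."""
    return [2.0 * x[0] + x[1] - 0.5, x[0] + 2.0 * x[1] - 4.0]

def natural_residual(x):
    """Max-norm of x - P_C(x - F(x)); it is zero exactly at solutions of the VIP."""
    fx = F(x)
    p = project_box([xi - fi for xi, fi in zip(x, fx)])
    return max(abs(xi - pi) for xi, pi in zip(x, p))

def solve_by_projection(x0, step=0.2, iters=500):
    """Basic projection iteration x <- P_C(x - step * F(x)) with a fixed step."""
    x = list(x0)
    for _ in range(iters):
        fx = F(x)
        x = project_box([xi - step * fi for xi, fi in zip(x, fx)])
    return x

def vip_holds_on_grid(x_star, n=10, tol=1e-9):
    """Check <F(x*), y - x*> >= 0 for every y on an (n+1) x (n+1) grid of the box."""
    fx = F(x_star)
    for i in range(n + 1):
        for j in range(n + 1):
            y = [i / n, j / n]
            if sum(f * (yi - xi) for f, yi, xi in zip(fx, y, x_star)) < -tol:
                return False
    return True

if __name__ == "__main__":
    x = solve_by_projection([0.5, 0.5])
    print("candidate solution:", x)
    print("natural residual:", natural_residual(x))
    print("inequality holds on grid:", vip_holds_on_grid(x))
```

The fixed step works here only because this toy F is strongly monotone. For mappings that are merely monotone, the setting the abstract targets, this plain iteration is not guaranteed to converge, which is part of the motivation for modified schemes.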
Time and cost tradeoff for distributed data processing
Computers ind. Engng, 1989
Cited by 3 (1 self)
An important design issue in distributed data processing systems is to determine optimal data distribution. The problem requires a tradeoff between time and cost. For instance, quick response time conflicts with low cost. The paper addresses the data distribution problem in this conflicting environment. A formulation of the problem as a non-linear program is developed. An algorithm employing a simple search procedure is presented, which gives an optimal data distribution. An example is solved to illustrate the method.

