Results 1  10
of
64
Combining labeled and unlabeled data with cotraining
, 1998
"... We consider the problem of using a large unlabeled sample to boost performance of a learning algorithm when only a small set of labeled examples is available. In particular, we consider a setting in which the description of each example can be partitioned into two distinct views, motivated by the ta ..."
Abstract

Cited by 1244 (28 self)
 Add to MetaCart
We consider the problem of using a large unlabeled sample to boost performance of a learning algorithm when only a small set of labeled examples is available. In particular, we consider a setting in which the description of each example can be partitioned into two distinct views, motivated by the task of learning to classify web pages. For example, the description of a web page can be partitioned into the words occurring on that page, and the words occurring in hyperlinks that point to that page. We assume that either view of the example would be su cient for learning if we had enough labeled data, but our goal is to use both views together to allow inexpensive unlabeled data to augment amuch smaller set of labeled examples. Speci cally, the presence of two distinct views of each example suggests strategies in which two learning algorithms are trained separately on each view, and then each algorithm's predictions on new unlabeled examples are used to enlarge the training set of the other. Our goal in this paper is to provide a PACstyle analysis for this setting, and, more broadly, a PACstyle framework for the general problem of learning from both labeled and unlabeled data. We also provide empirical results on real webpage data indicating that this use of unlabeled examples can lead to signi cant improvement of hypotheses in practice. As part of our analysis, we provide new re
An Experimental Comparison of MinCut/MaxFlow Algorithms for Energy Minimization in Vision
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2001
"... After [10, 15, 12, 2, 4] minimum cut/maximum flow algorithms on graphs emerged as an increasingly useful tool for exact or approximate energy minimization in lowlevel vision. The combinatorial optimization literature provides many mincut/maxflow algorithms with different polynomial time compl ..."
Abstract

Cited by 794 (48 self)
 Add to MetaCart
After [10, 15, 12, 2, 4] minimum cut/maximum flow algorithms on graphs emerged as an increasingly useful tool for exact or approximate energy minimization in lowlevel vision. The combinatorial optimization literature provides many mincut/maxflow algorithms with different polynomial time complexity. Their practical efficiency, however, has to date been studied mainly outside the scope of computer vision. The goal of this paper
A new approach to the minimum cut problem
 Journal of the ACM
, 1996
"... Abstract. This paper presents a new approach to finding minimum cuts in undirected graphs. The fundamental principle is simple: the edges in a graph’s minimum cut form an extremely small fraction of the graph’s edges. Using this idea, we give a randomized, strongly polynomial algorithm that finds th ..."
Abstract

Cited by 95 (8 self)
 Add to MetaCart
Abstract. This paper presents a new approach to finding minimum cuts in undirected graphs. The fundamental principle is simple: the edges in a graph’s minimum cut form an extremely small fraction of the graph’s edges. Using this idea, we give a randomized, strongly polynomial algorithm that finds the minimum cut in an arbitrarily weighted undirected graph with high probability. The algorithm runs in O(n 2 log 3 n) time, a significant improvement over the previous Õ(mn) time bounds based on maximum flows. It is simple and intuitive and uses no complex data structures. Our algorithm can be parallelized to run in �� � with n 2 processors; this gives the first proof that the minimum cut problem can be solved in ���. The algorithm does more than find a single minimum cut; it finds all of them. With minor modifications, our algorithm solves two other problems of interest. Our algorithm finds all cuts with value within a multiplicative factor of � of the minimum cut’s in expected Õ(n 2 � ) time, or in �� � with n 2 � processors. The problem of finding a minimum multiway cut of a graph into r pieces is solved in expected Õ(n 2(r�1) ) time, or in �� � with n 2(r�1) processors. The “trace ” of the algorithm’s execution on these two problems forms a new compact data structure for representing all small cuts and all multiway cuts in a graph. This data structure can be efficiently transformed into the
Improved Approximation Algorithms for Uniform Connectivity Problems
 J. Algorithms
"... The problem of finding minimum weight spanning subgraphs with a given connectivity requirement is considered. The problem is NPhard when the connectivity requirement is greater than one. Polynomial time approximation algorithms for various weighted and unweighted connectivity problems are given. Th ..."
Abstract

Cited by 71 (2 self)
 Add to MetaCart
The problem of finding minimum weight spanning subgraphs with a given connectivity requirement is considered. The problem is NPhard when the connectivity requirement is greater than one. Polynomial time approximation algorithms for various weighted and unweighted connectivity problems are given. The following results are presented: 1. For the unweighted kedgeconnectivity problem an approximation algorithm that achieves a performance ratio of 1.85 is described. This is the first polynomialtime algorithm that achieves a constant less than 2, for all k. 2. For the weighted kvertexconnectivity problem, a constant factor approximation algorithm is given assuming that the edgeweights satisfy the triangle inequality. This is the first constant factor approximation algorithm for this problem. 3. For the case of biconnectivity, with no assumptions about the weights of the edges, an algorithm that achieves a factor asymptotically approaching 2 is described. This matches the previous best...
Minimum Cuts in NearLinear Time
, 1999
"... We significantly improve known time bounds for solving the minimum cut problem on undirected graphs. We use a "semiduality" between minimum cuts and maximum spanning tree packings combined with our previously developed random sampling techniques. We give a randomized (Monte Carlo) algorithm that fi ..."
Abstract

Cited by 70 (10 self)
 Add to MetaCart
We significantly improve known time bounds for solving the minimum cut problem on undirected graphs. We use a "semiduality" between minimum cuts and maximum spanning tree packings combined with our previously developed random sampling techniques. We give a randomized (Monte Carlo) algorithm that finds a minimum cut in an medge, nvertex graph with high probability in O(m log³ n) time. We also give a simpler randomized algorithm that finds all minimum cuts with high probability in O(n² log n) time. This variant has an optimal RNC parallelization. Both variants improve on the previous best time bound of O(n² log³ n). Other applications of the treepacking approach are new, nearly tight bounds on the number of near minimum cuts a graph may have and a new data structure for representing them in a spaceefficient manner.
Experimental Study of Minimum Cut Algorithms
 PROCEEDINGS OF THE EIGHTH ANNUAL ACMSIAM SYMPOSIUM ON DISCRETE ALGORITHMS (SODA)
, 1997
"... Recently, several new algorithms have been developed for the minimum cut problem. These algorithms are very different from the earlier ones and from each other and substantially improve worstcase time bounds for the problem. We conduct experimental evaluation the relative performance of these algor ..."
Abstract

Cited by 40 (2 self)
 Add to MetaCart
Recently, several new algorithms have been developed for the minimum cut problem. These algorithms are very different from the earlier ones and from each other and substantially improve worstcase time bounds for the problem. We conduct experimental evaluation the relative performance of these algorithms. In the process, we develop heuristics and data structures that substantially improve practical performance of the algorithms. We also develop problem families for testing minimum cut algorithms. Our work leads to a better understanding of practical performance of the minimum cut algorithms and produces very efficient codes for the problem.
FINDING STRUCTURE WITH RANDOMNESS: PROBABILISTIC ALGORITHMS FOR CONSTRUCTING APPROXIMATE MATRIX DECOMPOSITIONS
"... Abstract. Lowrank matrix approximations, such as the truncated singular value decomposition and the rankrevealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful t ..."
Abstract

Cited by 40 (0 self)
 Add to MetaCart
Abstract. Lowrank matrix approximations, such as the truncated singular value decomposition and the rankrevealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing lowrank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired lowrank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, speed, and robustness. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition
Approximating MinimumSize kConnected Spanning Subgraphs via Matching
 SIAM J. Comput
, 1998
"... Abstract: An efficient heuristic is presented for the problem of finding a minimumsize k connected spanning subgraph of an (undirected or directed) simple graph G =(V#E). There are four versions of the problem, and the approximation guarantees are as follows: minimumsize knode connected spann ..."
Abstract

Cited by 35 (3 self)
 Add to MetaCart
Abstract: An efficient heuristic is presented for the problem of finding a minimumsize k connected spanning subgraph of an (undirected or directed) simple graph G =(V#E). There are four versions of the problem, and the approximation guarantees are as follows: minimumsize knode connected spanning subgraph of an undirected graph 1+[1=k], minimumsize knode connected spanning subgraph of a directed graph 1+[1=k], minimumsize kedge connected spanning subgraph of an undirected graph 1+[2=(k + 1)], and minimumsize kedge connected spanning subgraph of a directed graph 1+[4= p k].
FINDING STRUCTURE WITH RANDOMNESS: STOCHASTIC ALGORITHMS FOR CONSTRUCTING APPROXIMATE MATRIX DECOMPOSITIONS
, 2009
"... Lowrank matrix approximations, such as the truncated singular value decomposition and the rankrevealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys recent research which demonstrates that randomization offers a powerful tool for performing l ..."
Abstract

Cited by 25 (2 self)
 Add to MetaCart
Lowrank matrix approximations, such as the truncated singular value decomposition and the rankrevealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys recent research which demonstrates that randomization offers a powerful tool for performing lowrank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. In particular, these techniques offer a route toward principal component analysis (PCA) for petascale data. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired lowrank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, speed, and robustness. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider
Recent Developments in Maximum Flow Algorithms
 in Proceedings of Scandinavian Workshop on Algorithm Theory (SWAT
, 1998
"... Introduction The maximum flow problem is a classical optimization problem with many applications; see e.g. [1, 18, 39]. Algorithms for this problem have been studied for over four decades. Recently, significant improvements have been made in theoretical performance of maximum flow algorithms. In t ..."
Abstract

Cited by 23 (1 self)
 Add to MetaCart
Introduction The maximum flow problem is a classical optimization problem with many applications; see e.g. [1, 18, 39]. Algorithms for this problem have been studied for over four decades. Recently, significant improvements have been made in theoretical performance of maximum flow algorithms. In this survey we put these results in perspective and provide pointers to the literature. We assume that the reader is familiar with basic flow algorithms, including Dinitz' blocking flow algorithm [13]. 2 Preliminaries The maximum flow problem is to find a flow of the maximum value given a graph G with arc capacities, a source s, and a sink t, Here a flow is a function on arcs that satisfies capacity constraints for all arcs and conservation constraints for all vertices except the source and the sink. For more details, see [1, 18, 39]. We distinguish between directed