Results 1–10 of 37
HyPursuit: A hierarchical network search engine that exploits content-link hypertext clustering
 PROCEEDINGS OF THE SEVENTH ACM CONFERENCE ON HYPERTEXT
, 1996
Abstract

Cited by 99 (2 self)
HyPursuit is a new hierarchical network search engine that clusters hypertext documents to structure a given information space for browsing and search activities. Our content-link clustering algorithm is based on the semantic information embedded in hyperlink structures and document contents. HyPursuit admits multiple, coexisting cluster hierarchies based on different principles for grouping documents, such as the Library of Congress catalog scheme and automatically created hypertext clusters. HyPursuit's abstraction functions summarize cluster contents to support scalable query processing. The abstraction functions satisfy system resource limitations with controlled information loss. The result of query processing operations on a cluster summary approximates the result of performing the operations on the entire information space. We constructed a prototype system comprising 100 leaf World Wide Web sites and a hierarchy of 42 servers that route queries to the leaf sites. Experience with our system suggests that abstraction functions based on hypertext clustering can be used to construct meaningful and scalable cluster hierarchies. We are also encouraged by preliminary results on clustering based on both document contents and hyperlink structures.
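The content-link idea — scoring document pairs by both shared terms and shared hyperlinks — can be sketched in a few lines. This is a minimal illustration of the general idea only, not HyPursuit's actual clustering algorithm; the blending weight `alpha`, the function names, and the toy documents are our own.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def content_link_similarity(doc_a, doc_b, alpha=0.5):
    """Blend term similarity with hyperlink overlap (Jaccard over the
    documents' link sets). alpha is a hypothetical mixing parameter."""
    text = cosine(doc_a["terms"], doc_b["terms"])
    union = doc_a["links"] | doc_b["links"]
    link = len(doc_a["links"] & doc_b["links"]) / len(union) if union else 0.0
    return alpha * text + (1.0 - alpha) * link

a = {"terms": Counter("search engine cluster".split()), "links": {"u1", "u2"}}
b = {"terms": Counter("cluster search index".split()), "links": {"u2", "u3"}}
print(content_link_similarity(a, b))  # 0.5 * (2/3) + 0.5 * (1/3) = 0.5
```

A pairwise score of this shape can then feed any standard clustering procedure (e.g. agglomerative merging of the most similar pairs).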
A new approach to the minimum cut problem
 Journal of the ACM
, 1996
Abstract

Cited by 99 (8 self)
This paper presents a new approach to finding minimum cuts in undirected graphs. The fundamental principle is simple: the edges in a graph's minimum cut form an extremely small fraction of the graph's edges. Using this idea, we give a randomized, strongly polynomial algorithm that finds the minimum cut in an arbitrarily weighted undirected graph with high probability. The algorithm runs in O(n² log³ n) time, a significant improvement over the previous Õ(mn) time bounds based on maximum flows. It is simple and intuitive and uses no complex data structures. Our algorithm can be parallelized to run in RNC with n² processors; this gives the first proof that the minimum cut problem can be solved in RNC. The algorithm does more than find a single minimum cut; it finds all of them. With minor modifications, our algorithm solves two other problems of interest. Our algorithm finds all cuts with value within a multiplicative factor α of the minimum cut's in expected Õ(n^(2α)) time, or in RNC with n^(2α) processors. The problem of finding a minimum multiway cut of a graph into r pieces is solved in expected Õ(n^(2(r−1))) time, or in RNC with n^(2(r−1)) processors. The "trace" of the algorithm's execution on these two problems forms a new compact data structure for representing all small cuts and all multiway cuts in a graph. This data structure can be efficiently transformed into the ...
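The contraction step at the heart of this approach is easy to sketch. Below is a minimal, unweighted illustration of the basic contraction algorithm — not the paper's recursive, weighted algorithm — and the trial count is a simple heuristic choice rather than the paper's analyzed bound.

```python
import random

def karger_min_cut(edges, n, trials=None):
    """Estimate the minimum cut of an unweighted undirected multigraph
    (vertices 0..n-1) by repeated random edge contraction."""
    if trials is None:
        trials = n * n  # heuristic repetition count, not the paper's bound
    best = len(edges)
    for _ in range(trials):
        parent = list(range(n))  # union-find over contracted super-vertices

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        components = n
        order = edges[:]
        random.shuffle(order)  # processing a random permutation and skipping
        for u, v in order:     # self-loops == contracting uniform random edges
            if components == 2:
                break
            ru, rv = find(u), find(v)
            if ru != rv:       # skip edges already inside one super-vertex
                parent[ru] = rv
                components -= 1
        # Edges crossing the two remaining super-vertices form a cut.
        best = min(best, sum(1 for u, v in edges if find(u) != find(v)))
    return best

# Two triangles joined by a single bridge edge; the true minimum cut is 1.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(karger_min_cut(edges, 6))
```

Each trial succeeds with probability at least 2/(n(n−1)), which is why the algorithm is repeated many times and the smallest cut seen is returned.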
Minimum Cuts in Near-Linear Time
, 1999
Abstract

Cited by 73 (11 self)
We significantly improve known time bounds for solving the minimum cut problem on undirected graphs. We use a "semiduality" between minimum cuts and maximum spanning tree packings combined with our previously developed random sampling techniques. We give a randomized (Monte Carlo) algorithm that finds a minimum cut in an m-edge, n-vertex graph with high probability in O(m log³ n) time. We also give a simpler randomized algorithm that finds all minimum cuts with high probability in O(n² log n) time. This variant has an optimal RNC parallelization. Both variants improve on the previous best time bound of O(n² log³ n). Other applications of the tree-packing approach are new, nearly tight bounds on the number of near-minimum cuts a graph may have and a new data structure for representing them in a space-efficient manner.
An NC Algorithm for Minimum Cuts
 IN PROCEEDINGS OF THE 25TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING
Abstract

Cited by 48 (3 self)
We show that the minimum cut problem for weighted undirected graphs can be solved in NC using three separate and independently interesting results. The first is an (m²/n)-processor NC algorithm for finding a (2 + ε)-approximation to the minimum cut. The second is a randomized reduction from the minimum cut problem to the problem of obtaining a (2 + ε)-approximation to the minimum cut. This reduction involves a natural combinatorial Set-Isolation Problem that can be solved easily in RNC. The third result is a derandomization of this RNC solution that requires a combination of two widely used tools: pairwise independence and random walks on expanders. We believe that the set-isolation approach will prove useful in other derandomization problems. The techniques extend to two related problems: we describe NC algorithms finding minimum k-way cuts for any constant k and finding all cuts of value within any constant factor of the minimum. Another application of these techniques ...
Experimental Study of Minimum Cut Algorithms
 PROCEEDINGS OF THE EIGHTH ANNUAL ACMSIAM SYMPOSIUM ON DISCRETE ALGORITHMS (SODA)
, 1997
Abstract

Cited by 40 (2 self)
Recently, several new algorithms have been developed for the minimum cut problem. These algorithms are very different from the earlier ones and from each other, and substantially improve worst-case time bounds for the problem. We conduct an experimental evaluation of the relative performance of these algorithms. In the process, we develop heuristics and data structures that substantially improve the practical performance of the algorithms. We also develop problem families for testing minimum cut algorithms. Our work leads to a better understanding of the practical performance of minimum cut algorithms and produces very efficient codes for the problem.
A Scalable Self-Organizing Map Algorithm for Textual Classification: A Neural Network Approach to Thesaurus Generation
 Communication Cognition and Artificial Intelligence, Spring
, 1998
Abstract

Cited by 31 (5 self)
The rapid proliferation of textual and multimedia online databases, digital libraries, Internet servers, and intranet services has turned researchers' and practitioners' dream of creating an information-rich society into a nightmare of infogluts. Many researchers believe that turning an infoglut into a useful digital library requires automated techniques for organizing and categorizing large-scale information. This paper presents research in which we sought to develop a scalable textual classification and categorization system based on Kohonen's self-organizing feature map (SOM) algorithm. In our paper, we show how self-organization can be used for automatic thesaurus generation. Our proposed data structure and algorithm took advantage of the sparsity of coordinates in the document input vectors and reduced the SOM computational complexity by several orders of magnitude. The proposed Scalable SOM (SSOM) algorithm makes large-scale textual categorization tasks a possibility. ...
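The sparsity trick can be sketched from the identity d(w, x)² = ‖w‖² − 2(w·x) + ‖x‖²: since ‖x‖² is the same for every map node, the winner search only needs dot products over the input's nonzero coordinates. The following is a minimal illustration of that one idea, not the paper's SSOM implementation; the toy map weights are made up.

```python
def sparse_winner(weights, norms_sq, x_sparse):
    """Return the index of the best-matching map node for a sparse input.
    Scores norms_sq[node] - 2*(w . x); the constant ||x||^2 is dropped,
    so only the input's nonzero coordinates are ever touched."""
    best, best_score = 0, float("inf")
    for node, w in enumerate(weights):
        dot = sum(v * w[i] for i, v in x_sparse.items())
        score = norms_sq[node] - 2.0 * dot
        if score < best_score:
            best, best_score = node, score
    return best

# Toy map: two nodes over a 5-term vocabulary (weights are made up).
weights = [[0.9, 0.1, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.1, 0.8, 0.1]]
norms_sq = [sum(v * v for v in w) for w in weights]
x = {3: 1.0, 4: 0.5}  # sparse document vector: term index -> weight
print(sparse_winner(weights, norms_sq, x))  # → 1
```

With document vectors that have only a handful of nonzero terms out of thousands, each winner search costs O(nodes × nonzeros) instead of O(nodes × vocabulary).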
Approximating Layout Problems on Random Geometric Graphs
 Journal of Algorithms
, 2001
Abstract

Cited by 21 (10 self)
In this paper, we study the approximability of several layout problems on a family of random geometric graphs. Vertices of random geometric graphs are randomly distributed on the unit square and are connected by edges whenever they are closer than some given parameter. The layout problems that we consider are: Bandwidth, Minimum Linear Arrangement, Minimum Cut Width, Minimum Sum Cut, Vertex Separation, and Edge Bisection. We first prove that some of these problems remain NP-complete even for geometric graphs. Afterwards, we compute lower bounds that hold, almost surely, for random geometric graphs. Then, we present two heuristics that, almost surely, turn out to be constant-factor approximation algorithms for our layout problems on random geometric graphs. In fact, for the Bandwidth and Vertex Separation problems, these heuristics are asymptotically optimal. Finally, we use the theoretical results to empirically compare these and other well-known heuristics.
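The random model itself is straightforward to generate. A minimal sketch follows (function and parameter names are our own, and the paper's heuristics and experiments are not reproduced; the x-coordinate ordering at the end is just one simple projection-style layout heuristic, not necessarily the paper's).

```python
import math
import random

def random_geometric_graph(n, radius, seed=None):
    """Drop n points uniformly at random in the unit square and connect
    every pair at Euclidean distance below `radius`."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    edges = [(i, j)
             for i in range(n) for j in range(i + 1, n)
             if math.dist(pts[i], pts[j]) < radius]
    return pts, edges

pts, edges = random_geometric_graph(50, 0.3, seed=1)
print(len(edges))  # edge count depends on the sampled points

# A simple projection-style heuristic: order vertices by x-coordinate and
# read the permutation off as a linear arrangement.
layout = sorted(range(len(pts)), key=lambda v: pts[v][0])
```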
On the use of information retrieval techniques for the automatic construction of hypertext
 Information Processing and Management
, 1997
Abstract

Cited by 21 (7 self)
The first part of the paper briefly introduces what automatic authoring of a hypertext for information retrieval means. The most difficult part of the automatic construction of a hypertext is the creation of links connecting documents or document fragments that are semantically related. Because of this, it seemed natural to many researchers to use IR techniques for this purpose, since IR has always dealt with the construction of relationships between mutually relevant objects. The second part of the paper presents a survey of some of the attempts toward the automatic construction of hypertexts for information retrieval. This part identifies and compares the scope, advantages, and limitations of different approaches. The aim of this survey is to point out the main and most successful current lines of research.
Experiments On The Automatic Construction Of Hypertext From Texts
, 1995
Abstract

Cited by 15 (0 self)
In this paper we describe an approach we have developed to semi-automatically generate a hypertext from linear texts. This is based on initially creating nodes and composite nodes composed of "mini-hypertexts". Following this, we compute node-node similarity values using standard information retrieval techniques. These similarity measures are then used to selectively create node-node links based on the strength of similarity between nodes. What makes our process novel is that the link creation process also uses values from a dynamically computed metric which measures the topological compactness of the overall hypertext being generated. Thus link creation is a selective process based not only on node-node similarity but also on the overall layout of the hypertext. Experiments on generating a hypertext from a collection of 846 software product descriptions comprising 8.5 Mbytes of text are described. Our experiments with a variety of IR techniques and link creation approaches yield some guidelines on how the process should be automated. Finally, this text-to-hypertext conversion method is put into the context of an overall hypertext authoring tool currently under development.
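Similarity-based link creation of this kind can be sketched with tf-idf weighting and a cosine threshold. This is a minimal illustration only: the threshold value, the toy documents, and the function name are ours, and the paper's compactness metric is not modeled.

```python
import math
from collections import Counter

def tfidf_cosine_links(docs, threshold=0.1):
    """Link every pair of documents whose tf-idf cosine similarity
    exceeds `threshold` (a hypothetical cutoff)."""
    n = len(docs)
    tf = [Counter(d.lower().split()) for d in docs]
    df = Counter(t for c in tf for t in c)          # document frequency
    idf = {t: math.log(n / df[t]) for t in df}
    vecs = [{t: f * idf[t] for t, f in c.items()} for c in tf]

    def cos(a, b):
        dot = sum(a[t] * b.get(t, 0.0) for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if cos(vecs[i], vecs[j]) > threshold]

docs = ["text editor for source code",
        "source code editor with syntax highlighting",
        "weather forecast for the weekend"]
print(tfidf_cosine_links(docs))  # → [(0, 1)]
```

The paper's selective step would additionally accept or reject each candidate link according to how it changes the compactness of the hypertext built so far, rather than using a fixed threshold alone.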