Results 1 - 3 of 3
The Web as a graph, 2000
Cited by 147 (2 self)
The pages and hyperlinks of the World-Wide Web may be viewed as nodes and edges in a directed graph. This graph has about a billion nodes today, several billion links, and appears to grow exponentially with time. There are many reasons (mathematical, sociological, and commercial) for studying the evolution of this graph. We first review a set of algorithms that operate on the Web graph, addressing problems from Web search, automatic community discovery, and classification. We then recall a number of measurements and properties of the Web graph. Noting that traditional random graph models do not explain these observations, we propose a new family of random graph models.
A Study of the Structure of the Web
Cited by 1 (0 self)
The World Wide Web is a huge, growing repository of information on a wide range of topics. It is also becoming important, commercially and sociologically, as a place of human interaction within different communities. In this paper we present an experimental study of the structure of the Web. We analyze link topologies of various communities, and patterns of mirroring of content, on 1997 and 1999 snapshots of the Web. Our results give insight into patterns of interaction within communities and how they evolve, as well as patterns of data replication. We also describe the techniques we have developed for performing complex processing on this large data set, and our experiences in doing so. We present new algorithms for finding partial and complete mirrors in URL hierarchies; these are also of independent interest for search and redirection. In order to study and visualize link topologies of different communities, we have developed techniques to compact these large link graphs w...
Link Based Clustering of Web Search Results - Lecture Notes in Computer Science, 2001
With the proliferation of information on the Web, how to obtain high-quality information from the Web has become one of the hot research topics in many fields, such as databases, IR, and AI. The Web search engine is the most commonly used tool for information retrieval; however, its current state is far from satisfactory. In this paper, we propose a new approach to clustering the search results returned by a Web search engine using link analysis. Unlike document clustering algorithms in IR, which are based on common words/phrases shared between documents, our approach is based on common links shared by pages, using co-citation and coupling analysis. We also extend the standard K-means clustering algorithm so that it handles noise more naturally, and apply it to Web search results. By filtering out some irrelevant pages, our approach clusters high-quality pages into groups to facilitate users' access and browsing.
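As a rough illustration of the link-based similarity this abstract describes, the sketch below scores two pages by the out-links they share (coupling) and the in-links they share (co-citation). The function names, the cosine-style normalization, and the equal weighting of the two scores are assumptions made for this example; the abstract does not specify the paper's actual formula or its K-means extension.

```python
import math
from typing import Dict, Set

LinkMap = Dict[str, Set[str]]

def invert(out_links: LinkMap) -> LinkMap:
    """Build the in-link map (page -> pages that link to it)."""
    in_links: LinkMap = {}
    for src, dsts in out_links.items():
        for dst in dsts:
            in_links.setdefault(dst, set()).add(src)
    return in_links

def overlap(a: Set[str], b: Set[str]) -> float:
    """Cosine-style overlap of two link sets (assumed normalization)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def link_similarity(p: str, q: str, out_links: LinkMap, in_links: LinkMap) -> float:
    """Average of coupling (shared out-links) and co-citation (shared in-links)."""
    coupling = overlap(out_links.get(p, set()), out_links.get(q, set()))
    cocitation = overlap(in_links.get(p, set()), in_links.get(q, set()))
    return (coupling + cocitation) / 2  # equal weights are an assumption

# Hypothetical toy link data: pages "a" and "b" cite the same pages
# and are both cited by "h"; page "c" shares nothing with them.
out_links: LinkMap = {"a": {"x", "y"}, "b": {"x", "y"}, "c": {"z"}, "h": {"a", "b"}}
in_links = invert(out_links)
sim_ab = link_similarity("a", "b", out_links, in_links)
sim_ac = link_similarity("a", "c", out_links, in_links)
```

A pairwise score of this kind could then feed a clustering step such as K-means over link vectors, with low-scoring pages treated as noise.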

