Results 1 - 9 of 9
The Web as a graph: measurements, models, and methods
, 1999
Abstract - Cited by 257 (10 self)
The pages and hyperlinks of the World-Wide Web may be viewed as nodes and edges in a directed graph. This graph is a fascinating object of study: it has several hundred million nodes today, over a billion links, and appears to grow exponentially with time. There are many reasons --- mathematical, sociological, and commercial --- for studying the evolution of this graph. In this paper we begin by describing two algorithms that operate on the Web graph, addressing problems from Web search and automatic community discovery. We then report a number of measurements and properties of this graph that manifested themselves as we ran these algorithms on the Web. Finally, we observe that traditional random graph models do not explain these observations, and we propose a new family of random graph models. These models point to a rich new sub-field of the study of random graphs, and raise questions about the analysis of graph algorithms on the Web.
Trawling the Web for Emerging Cyber-Communities
- Computer Networks
, 1999
Abstract - Cited by 257 (7 self)
The web harbors a large number of communities -- groups of content-creators sharing a common interest -- each of which manifests itself as a set of interlinked web pages. Newsgroups and commercial web directories together contain on the order of 20,000 such communities; our particular interest here is in emerging communities -- those that have little or no representation in such fora. The subject of this paper is the systematic enumeration of over 100,000 such emerging communities from a web crawl: we call our process trawling. We motivate a graph-theoretic approach to locating such communities, and describe the algorithms and the algorithmic engineering necessary to find structures that subscribe to this notion, the challenges in handling such a huge data set, and the results of our experiment. Keywords: web mining, communities, trawling, link analysis
The Web as a graph
, 2000
Abstract - Cited by 147 (2 self)
The pages and hyperlinks of the World-Wide Web may be viewed as nodes and edges in a directed graph. This graph has about a billion nodes today, several billion links, and appears to grow exponentially with time. There are many reasons---mathematical, sociological, and commercial---for studying the evolution of this graph. We first review a set of algorithms that operate on the Web graph, addressing problems from Web search, automatic community discovery, and classification. We then recall a number of measurements and properties of the Web graph. Noting that traditional random graph models do not explain these observations, we propose a new family of random graph models.
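As an illustration of the "pages as nodes, hyperlinks as directed edges" view described in this abstract, the following is a minimal Python sketch, not code from the paper. The function names, the `.example` URLs, and the choice of an in-degree histogram as the measurement are all assumptions made for the example.

```python
from collections import defaultdict


def build_web_graph(links):
    """Build adjacency sets from (source_page, target_page) hyperlink pairs.

    Hypothetical helper: each pair is one directed edge of the Web graph.
    """
    out_links = defaultdict(set)  # page -> pages it links to
    in_links = defaultdict(set)   # page -> pages that link to it
    for src, dst in links:
        out_links[src].add(dst)
        in_links[dst].add(src)
    return out_links, in_links


def in_degree_histogram(in_links, pages):
    """Count how many pages have each in-degree (a simple graph measurement)."""
    hist = defaultdict(int)
    for page in pages:
        hist[len(in_links.get(page, ()))] += 1
    return dict(hist)


# Toy crawl: four pages and four hyperlinks (made-up data).
links = [
    ("a.example", "b.example"),
    ("a.example", "c.example"),
    ("b.example", "c.example"),
    ("d.example", "c.example"),
]
out_links, in_links = build_web_graph(links)
pages = {page for edge in links for page in edge}
hist = in_degree_histogram(in_links, pages)
print(hist)
```

On real crawl data the same two dictionaries would be far too large to hold in memory this naively; the sketch only shows the data model.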
ObjectGlobe: Ubiquitous Query Processing on the Internet
, 2001
Abstract - Cited by 41 (11 self)
We present the design of ObjectGlobe, a distributed and open query processor for Internet data sources. Today, data is published on the Internet via Web servers which have, if at all, very localized query processing capabilities. The goal of the ObjectGlobe project is to establish an open marketplace in which data and query processing capabilities can be distributed and used by any kind of Internet application. Furthermore, ObjectGlobe integrates cycle providers (i.e., machines) which carry out query processing operators. The overall picture is to make it possible to execute a query with -- in principle -- unrelated query operators, cycle providers, and data sources. Such an infrastructure can serve as enabling technology for scalable e-commerce applications, e.g., B2B and B2C market places, to be able to integrate data and data processing operations of a large number of participants. One of the main challenges in the design of such an open system is to ensure privacy and security.
Prototype for wrapping and visualizing geo-referenced data in a distributed environment using xml technology
- Proceedings of the 8th ACM Symposium on Advances in Geographic Information Systems (ACMGIS), Washington DC
, 2000
Abstract - Cited by 7 (1 self)
This paper proposes a prototype for integration and visualization of geo-referenced information (GRI) in a distributed environment in general and the World Wide Web in particular. This prototype adopts a three-tier architecture and includes three main components: a GRI wrapper for distributed GRI web sites, a GRI integration mediator, and a client-side visualization interface. In this prototype, XML is used as a communication protocol between distributed web sites that provide GRI and the mediator, and between the mediator and clients. Java Servlets are written to translate data in distributed websites into XML documents. Data in distributed websites can be stored in a flat file, relational database, object-oriented database, or object-relational database. A Java Servlet in the mediator server retrieves data from related distributed websites in an XML format upon a request from the client side, parses the retrieved XML documents, performs merge or other operations on the retrieved XML documents to build a new XML document, and sends it to the client side. When the client side gets the requested data from the mediator server, it parses the returned XML document and draws it inside the browser window by using a Java applet.
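The mediator step described in this abstract (parse the XML returned by each site, merge, and build a new XML document) can be sketched as follows. This is a minimal illustration in Python rather than the Java Servlets the paper uses, and it is not the authors' implementation: the function name, the `gri_collection` root tag, and the `site`/`feature` element layout with its coordinates are all hypothetical, since the abstract does not specify the XML schema.

```python
import xml.etree.ElementTree as ET


def merge_gri_documents(xml_documents, root_tag="gri_collection"):
    """Merge the children of several XML documents under one new root.

    Hypothetical mediator step: each input string stands for the XML
    returned by one distributed GRI web site.
    """
    merged = ET.Element(root_tag)
    for text in xml_documents:
        source_root = ET.fromstring(text)  # parse one site's response
        for child in list(source_root):    # copy its records across
            merged.append(child)
    return ET.tostring(merged, encoding="unicode")


# Made-up responses from two sites; the element names are assumptions.
site_a = '<site><feature name="park" lat="35.0" lon="135.7"/></site>'
site_b = '<site><feature name="station" lat="34.9" lon="135.8"/></site>'

merged_xml = merge_gri_documents([site_a, site_b])
merged_root = ET.fromstring(merged_xml)  # the client side would parse this
names = [f.get("name") for f in merged_root.findall("feature")]
print(names)
```

A plain concatenation is only one possible "merge"; the abstract also mentions "other operations", which this sketch does not attempt.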
Map-based User Interface for Digital City Kyoto
- In Proc. of INET2000 The Internet Global Summit
, 2000
Abstract - Cited by 6 (6 self)
We propose a map-based interface called InfoMap for Digital City Kyoto, a city-based information space including around 2600 home pages located in the Kyoto metropolitan area. InfoMap is an extended image map interface system that enables users to browse large amounts of information on geographical maps. While providing an effective means of information search in Digital City Kyoto, InfoMap offers functions such as switching between wide-area and detailed-area map images, and uses incremental data downloading via the Internet, in addition to normal image map functions. We also present usability test results and issues concerning this interface on the assumption that a user wants to find some information on Digital City Kyoto.
An Augmented Web Space for Digital Cities
- In Proceedings of the 2001 Symposium on Applications and the Internet (SAINT-2001)
, 2001
Abstract - Cited by 5 (1 self)
We propose an augmented Web space and its query language to support geographical querying and sequential plan creation utilizing a digital city that is a city-based information space on the Internet. The augmented Web space involves a new approach to integrate the World Wide Web (WWW) and a geographic information system (GIS). The augmented Web space consists of home pages (HP), hyperlinks, and generic links that represent geographical relations between HPs. The generic links are created dynamically using geographical evaluation functions included in a user's search query each time one is issued. A query also includes a path expression showing how to navigate the HPs, hyperlinks, and generic links. Since the path expression is an extended regular expression, we can describe an arbitrary sequence of users' search actions for navigating the augmented Web space. We have applied the proposed augmented Web space to Digital City Kyoto, a city information service system that is accessed through a 3D walk-through implementation and a map-based interface. Each time a user's query is issued through the 3D and 2D interfaces, Digital City Kyoto creates an augmented Web space, and navigates the Web information space based on the path expression in the query.
World Wide Web Search Technologies
Abstract - Cited by 5 (0 self)
With over 800 million pages covering most areas of human endeavor, the World Wide Web is fertile ground for information retrieval. Numerous search technologies have been applied to Web searches, and the dominant search method has yet to be identified. This chapter provides an overview of existing Web search technologies and classifies them into six categories: (i) hyperlink exploration, (ii) information retrieval, (iii) metasearches, (iv) SQL approaches, (v) content-based multimedia searches, and (vi) others. A comparative study of some major commercial and experimental search services is presented, and some future research directions for Web searches are suggested.

