Harvest: A Scalable, Customizable Discovery and Access System
, 1995
Cited by 159 (7 self)
Rapid growth in data volume, user base, and data diversity render Internet-accessible information increasingly difficult to use effectively. In this paper we introduce Harvest, a system that provides an integrated set of customizable tools for gathering information from diverse repositories, building topic-specific content indexes, flexibly searching the indexes, widely replicating them, and caching objects as they are retrieved across the Internet. The system interoperates with WWW clients and with HTTP, FTP, Gopher, and NetNews information resources. We discuss the design and implementation of Harvest and its subsystems, give examples of its uses, and provide measurements indicating that Harvest can significantly reduce server load, network traffic, and space requirements when building indexes, compared with previous systems. We also discuss several popular indexes we have built using Harvest, underscoring the customizability and scalability of the system.
Scalable Internet Resource Discovery: Research Problems and Approaches
, 1994
Cited by 121 (3 self)
Over the past several years, a number of information discovery and access tools have been introduced in the Internet, including Archie, Gopher, Netfind, and WAIS. These tools have become quite popular, and are helping to redefine how people think about wide-area network applications. Yet, they are not well suited to supporting the future information infrastructure, which will be characterized by enormous data volume, rapid growth in the user base, and burgeoning data diversity. In this paper we indicate trends in these three dimensions and survey problems these trends will create for current approaches. We then suggest several promising directions of future resource discovery research, along with some initial results from projects carried out by members of the Internet Research Task Force Research Group on Resource Discovery and Directory Service.
Copyright Protection for Electronic Publishing over Computer Networks
- AT&T Bell Laboratories
, 1994
Cited by 26 (5 self)
The increased availability of computers, printers and high-speed networks could make electronic publishing a reality. One of the major technical and economic challenges faced by electronic publishing is that of preventing individuals from easily copying and illegally distributing electronic documents. In this paper, we explore the use of cryptographic protocols to discourage the distribution of illicit electronic copies. We propose an architecture and two separate schemes for making electronic document distribution secure. The first strategy requires special-purpose firmware in the printers and displays to decrypt encrypted documents. In the second strategy, encrypted documents are decrypted in software in the recipient's computer. 1 Introduction The increased use of facsimile has made the electronic transfer of paper documents more accepted. Electronic mail, electronic bulletin boards and networks such as the Internet make it possible to distribute electronic information to large gro...
The future of Internet search
- in DOA’01 International Symposium on Distributed Objects and Applications, Short Papers, Roberto Baldoni
, 2001
Cited by 1 (1 self)
Comparing bandwidth growth with Internet content growth reveals that, given the current trend, search engines with a centrally hosted index will show rapidly decreasing quality of service over time. These considerations are backed by the numbers on search engine coverage and the latest research on size and growth comparisons between the "surface web" and the "deep web". On the other hand, today's business application architectures have to be "Internet-centric", meaning that selected parts of the business processes have to be accessible via the Internet. For example, customers should be able to place their orders via the world wide web or should be able to search the product catalog online. Current software architectures provide simple HTML search forms that usually cover only the contents of one particular web application and yield results in proprietary formats, deprived of structure. This approach limits public search engines' chances to navigate a site's contents and thus makes it difficult to place global queries across multiple sites of this kind. Thus, increasingly significant parts of the "deep web" remain hidden from search engines. This article discusses the limits of today's search engine architectures and presents an approach that reverses the paradigm of Internet search: instead of search engines doing all the work (discovery, harvesting, indexing and retrieval) the approach is based on content providers actively contributing to the searchability of the content served by them. The paper will demonstrate how modern distributed object technology helps in designing and implementing this approach. A prototype is used to validate the concepts. Furthermore, it shows how new applications, using e.g. J2EE as their component architecture, can be integrated with this approach by modeling their searchability in UML and automatically generating the required framework adapters.
Prototype of the National High-Performance Software Exchange
- IEEE Computational Science & Engineering, Summer
, 1995
This report describes a short-term effort to construct a prototype for the National High-Performance Software Exchange (NHSE). The prototype demonstrates how the evolving National Information Infrastructure (NII) can be used to facilitate sharing of software and information among members of the High Performance Computing and Communications (HPCC) community. Shortcomings of current information searching and retrieval tools are pointed out, and recommendations are given for areas in need of further development. The hypertext home page for the NHSE is accessible at http://www.netlib.org/nse/home.html. 1 Introduction Over the course of a two-month period, the NHSE developers team, consisting of researchers at the member institutions of the Center for Research on Parallel Computation, have undertaken the task of producing a prototype of the National High-Performance Software Exchange (NHSE). The NHSE is intended as an Internet-accessible resource which will facilitate the exchange of softw...

