Results 1 - 3 of 3
Fixing the "Broken-link" Problem: The W3Objects Approach
1996
Cited by 28 (5 self)
One of the most serious problems plaguing the World Wide Web today is that of broken hypertext links, which are a major annoyance to browsing users and also a cause of tarnished reputation and possible loss of opportunity for information providers. The root of the problem lies in the current Web architecture's lack of support for referential integrity. This paper presents a model for the provision of referential integrity for Web resources which supports resource migration and tolerates site and communication failures. The approach is object-oriented, highly flexible, completely distributed, and does not require any global administration. An attractive feature of our design is a lightweight mechanism which provides referential integrity and which may be customised on a per-resource basis to provide increased fault-tolerance and performance. Our system follows an evolutionary approach, supporting parallel operation with the existing Web, allowing users to gain the addition...
Analysis of lexical signatures for improving information persistence on the World Wide Web
ACM Transactions on Information Systems, 2004
Cited by 10 (0 self)
A lexical signature (LS) consisting of several key words from a Web document is often sufficient information for finding the document later, even if its URL has changed. We conduct a large-scale empirical study of nine methods for generating lexical signatures, including Phelps and Wilensky’s original proposal (PW), seven of our own static variations, and one new dynamic method. We examine their performance on the Web over a 10-month period, and on a TREC data set, evaluating their ability to both (1) uniquely identify the original (possibly modified) document, and (2) locate other relevant documents if the original is lost. Lexical signatures chosen to minimize document frequency (DF) are good at unique identification but poor at finding relevant documents. PW works well on the relatively small TREC data set, but acts almost identically to DF on the Web, which contains billions of documents. Term-frequency-based lexical signatures (TF) are very easy to compute and often perform well, but are highly dependent on the ranking system of the search engine used. The term-frequency inverse-document-frequency- (TFIDF-) based method and hybrid methods (which combine DF with TF or TFIDF) seem to be the most promising candidates among static methods for generating effective lexical signatures. We propose a dynamic LS generator
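The static ranking ideas named in this abstract (TF, DF, and TFIDF) can be illustrated with a minimal sketch. It is not the paper's implementation and not Phelps and Wilensky's method: document frequencies here are counted over a small in-memory corpus (an assumption; on the Web they would have to be estimated, for example from a search engine), and the names `tokenize` and `lexical_signature` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def lexical_signature(doc, corpus, method="tfidf", k=5):
    """Return the k highest-ranked terms of doc (hypothetical helper).

    method="tf":    rank by term frequency within doc (most frequent first)
    method="df":    rank by document frequency in corpus (rarest first)
    method="tfidf": rank by tf * log(N / df)
    Assumes doc is itself a member of corpus, so every term has df >= 1.
    """
    tf = Counter(tokenize(doc))
    n_docs = len(corpus)

    # Document frequency: in how many corpus documents each term appears.
    df = Counter()
    for other in corpus:
        df.update(set(tokenize(other)))

    def score(term):
        d = max(df[term], 1)  # guard in case doc is not in corpus
        if method == "tf":
            return tf[term]
        if method == "df":
            return -d  # fewer containing documents ranks higher
        if method == "tfidf":
            return tf[term] * math.log(n_docs / d)
        raise ValueError(f"unknown method: {method}")

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(tf, key=lambda t: (-score(t), t))
    return ranked[:k]


corpus = [
    "broken links plague the web and broken links annoy users",
    "the web contains billions of documents",
    "users search the web for documents",
]
print(lexical_signature(corpus[0], corpus, method="tfidf", k=3))
```

On this toy corpus the DF ranking favours any term that occurs in only one document, while TF and TFIDF both put the repeated terms first; TFIDF additionally pushes terms found in every document to the bottom of the ranking.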
M. Ruggier: Using WWW to improve software development and maintenance: Application of the LIGHT system to ALEPH programs
Cited by 1 (0 self)
Programmers who develop, use, maintain, and modify software are faced with the problem of scanning and understanding large numbers of documents, ranging from source code to requirements, analysis and design diagrams, user and reference manuals, etc. This task is nontrivial and time-consuming because of the number and size of the documents and the many implicit cross-references they contain. In large distributed development teams, where software and related documents are produced at various sites, the problem can be even more severe. LIGHT (LIfe cycle Global HyperText) is an attempt to solve the problem using WWW technology. The basic idea is to make all the software documents, including code, available and cross-connected on the WWW. The first application of this concept to go into production is JULIA/LIGHT, a system to convert and publish on WWW the software documentation of the JULIA reconstruction program of the ALEPH experiment at CERN,

