Results 1 - 4 of 4
Structuring and Visualising the WWW by Generalised Similarity Analysis, 1997
Cited by 32 (5 self)
This paper describes a generic approach to structuring and visualising a hypertext-based information space on the WWW. This approach, called Generalised Similarity Analysis (GSA), provides a unifying framework for extracting structural patterns from a range of proximity data concerning three fundamental relationships in hypertext, namely, hypertext linkage, content similarity and browsing patterns. GSA emphasizes the integral role of users' interests in dynamically structuring the underlying information space. Pathfinder networks are used as a natural vehicle for structuring and visualising the rich structure of an information space by highlighting salient relationships in proximity data. In this paper, we use the GSA framework in the study of hypertext documents automatically retrieved over the Internet, including a number of departmental WWW sites and conference proceedings on the WWW. We show that GSA has several distinct features for structuring and visualising hyp...
Data integrity problems in an open hypermedia link service, 1995
Cited by 11 (0 self)
A hypermedia link service is a system that stores the information describing hypertext links in a database separate from the data content over which the links are intended to operate. One of the first open hypermedia link services was Microcosm, which takes this philosophy to the extreme, storing not only the links in a separate database, but also the information about the endpoints of the links. The most important advantage of such an organisation is that the system remains open, so that hypertext functionality may be extended to third-party applications. The first part of this thesis describes the background to open hypermedia link services and the Microcosm system, which was developed by the Multimedia Research Group at the University of Southampton. The major problem with storing all the information about links separately from the content is that such a scheme introduces many
Design and Implementation of a Document Assembly Workbench, 1998
Cited by 3 (2 self)
Computers support the management of large collections of text documents, but efficient reuse of document collections for producing new documents remains inherently difficult. We describe and discuss the design and implementation of a document assembly system based on a document assembly model, where the user produces new specialized documents by querying and browsing a collection of structured document fragments.
Text Augmentation: Inserting XML tags into natural language text with PPM Models and Viterbi-like search, 2003
Cited by 2 (0 self)
This thesis develops work on using Hidden Markov Models to insert tags into natural language text. A taxonomy of tags is developed, unifying the fields of text segmentation tagging, part-of-speech tagging, proper noun extraction and hierarchical entity extraction. The search spaces for inserting tags are examined from both a theoretical and an experimental point of view, across the taxonomy and on four corpora. An analysis of different correctness measures for different types of tag-insertion problems is undertaken, and a technique to determine whether tag-insertion errors are the result of a modelling failure or a searching failure is discovered.

