Automatic Hypertext Conversion of Paper Document Collections
Lecture Notes in Computer Science, 1995
Cited by 8 (2 self)
Digital libraries should include all the enhanced search functionality that state-of-the-art electronic tools can provide. With respect to this main goal, supporting intuitive searches through hypertextual features is important. To bring these features to the browsing functionality for raster image representations of documents as well, the underlying implicit and explicit hypertext structure of library objects has to be modelled and detected. This internal conversion of real library objects into hypertext objects has to be automated as far as possible to be feasible at all. Yet the conversion has to be flexible enough to cope with the whole range of library objects. To do so, it has to use explicit information, such as words, phrases, and paragraphs, as well as the implicit information contained in fonts and layout. In this chapter we will therefore describe the automatic hypertext conversion of printed arti...
Hierarchical Taxonomies using Divisive Partitioning
1998
Cited by 8 (3 self)
We propose an unsupervised divisive partitioning algorithm for document data sets which enjoys many favorable properties. In particular, the algorithm scales well to large data collections and produces high-quality clusters that are competitive with other clustering methods. The algorithm yields information on the significant and distinctive words within each cluster, and these words can be inserted into the naturally occurring hierarchical structure produced by the algorithm. The result is an automatically generated hierarchical topical taxonomy of a document set. In this paper, we show how the algorithm's cost scales linearly with the size of the data, illustrate experimentally the quality of the clusters produced, and show how the algorithm can produce a hierarchical topical taxonomy.
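The abstract does not say how a cluster is split, so the following is only a minimal sketch of divisive partitioning under stated assumptions: documents become term-frequency vectors, and each cluster is bisected by the sign of its documents' projections onto the principal direction of the mean-centred cluster (found here by power iteration). The splitting rule and all names (`tf_vectors`, `principal_direction`, `split`, `build_tree`, `DOCS`) are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical toy collection: two documents per topic.
DOCS = [
    "bayesian network software testing",
    "bayesian network testing uncertainty",
    "hypertext library document conversion",
    "hypertext library document retrieval",
]

def tf_vectors(docs):
    """Turn whitespace-tokenised documents into term-frequency vectors."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for d in docs:
        v = [0.0] * len(vocab)
        for w, c in Counter(d.lower().split()).items():
            v[index[w]] = float(c)
        vectors.append(v)
    return vocab, vectors

def principal_direction(rows, iters=50):
    """Power iteration for the leading eigenvector of rows^T rows."""
    dim = len(rows[0])
    u = [1.0 + 0.01 * j for j in range(dim)]  # deterministic, non-uniform start
    for _ in range(iters):
        proj = [sum(r[j] * u[j] for j in range(dim)) for r in rows]
        u = [sum(proj[i] * rows[i][j] for i in range(len(rows))) for j in range(dim)]
        norm = sum(x * x for x in u) ** 0.5
        if norm == 0.0:  # all rows identical: no direction to split on
            break
        u = [x / norm for x in u]
    return u

def split(vectors, ids):
    """Bisect one cluster by the sign of each projection (assumed rule)."""
    dim = len(vectors[0])
    mean = [sum(vectors[i][j] for i in ids) / len(ids) for j in range(dim)]
    centred = [[vectors[i][j] - mean[j] for j in range(dim)] for i in ids]
    u = principal_direction(centred)
    left = [i for i, r in zip(ids, centred)
            if sum(a * b for a, b in zip(r, u)) <= 0]
    right = [i for i in ids if i not in left]
    return left, right

def build_tree(vectors, ids, min_size=2):
    """Recursively split clusters; the nested dicts form the hierarchy."""
    if len(ids) <= min_size:
        return {"docs": ids}
    left, right = split(vectors, ids)
    if not left or not right:  # degenerate split: keep as a leaf
        return {"docs": ids}
    return {"docs": ids,
            "children": [build_tree(vectors, left, min_size),
                         build_tree(vectors, right, min_size)]}

vocab, vectors = tf_vectors(DOCS)
tree = build_tree(vectors, list(range(len(DOCS))))
```

On this toy collection the top-level split separates the two Bayesian-network documents from the two hypertext documents. Labelling each node with its distinctive words, as the abstract describes, is left out of the sketch.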
Using Default Logic for Lexical Knowledge
In Qualitative and Quantitative Practical Reasoning, 1997
Cited by 8 (6 self)
Lexical knowledge is knowledge about the morphology, grammar, and semantics of words. This knowledge is increasingly important in language engineering, and more generally in information retrieval, information filtering, intelligent agents and knowledge management. Here we present a framework, based on default logic, called Lexica, for capturing lexical knowledge. We show how we can use contextual information about a given word to identify relations such as synonyms, antonyms, specializations, and meronyms for the word. We also show how we can use machine learning techniques to facilitate engineering a Lexica knowledge base.
1 Introduction
Lexical knowledge is knowledge about the semantics, morphology, and usage of words. Handling words is central to many reasoning activities, and as a result lexical knowledge is increasingly important in language engineering, and more generally in information retrieval, information filtering, intelligent agents and knowledge management. Lexical kno...
The Uncertainty Principle in Software Engineering
1996
Cited by 6 (1 self)
This paper makes two contributions to software engineering research. First, we observe that uncertainty permeates software development but is rarely captured explicitly in software models. We remedy this situation by presenting the Uncertainty Principle in Software Engineering (UPSE), which states that uncertainty is inherent and inevitable in software development processes and products. We substantiate UPSE by providing examples of uncertainty in select software engineering domains. We present three common sources of uncertainty in software development, namely human participation, concurrency, and problem-domain uncertainties. We explore in detail uncertainty in software testing, including test planning, test enactment, error tracing, and quality estimation. Second, we present a technique for modeling uncertainty, called Bayesian belief networks, and justify its applicability to software systems. We apply the Bayesian approach to a simple network of software artifacts based on an elev...
Constructing Bayesian-network Models of Software Testing and Maintenance Uncertainties
University of California, 1997
Cited by 5 (1 self)
The lifetime of many software systems is surprisingly long, often far exceeding initial plans and expectations. During development and maintenance of long-lived software, requirements are analyzed and specified, designs and code modules are developed, testing is planned, and code is tested many times. Consequently, developers and managers frequently lose or gain confidence in software artifacts, especially when existing uncertainties are relieved or when new uncertainties are encountered. Fluctuations in developers' confidences may in turn affect process actions or decisions, for instance determining the impact of change, the need for regression testing, or when to stop testing. In this paper, we present an approach that allows for developers' confidences or "beliefs" regarding software components to be modeled and updated directly. This approach is part of an overall strategy that calls for explicit modeling of software engineering uncertainties using an established technique fo...
Mediators over Taxonomy-based Information Sources
VLDB Journal, 2004
Cited by 4 (4 self)
We propose a mediator model for providing integrated and unified access to multiple taxonomy-based sources. Each source comprises a taxonomy and a database that indexes objects under the terms of the taxonomy. A mediator comprises a taxonomy and a set of relations between the mediator’s and the sources’ terms, called articulations. By combining different modes of query evaluation at the sources and the mediator with different types of query translation, we obtain a flexible, efficient scheme of mediator operation that can accommodate various application needs and levels of answer quality. We adopt a simple conceptual modeling approach (taxonomies and inter-taxonomy mappings) and illustrate its advantages in terms of ease of use, uniformity, scalability, and efficiency. These characteristics make this proposal appropriate for a large-scale network of sources and mediators.
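To make the vocabulary of this abstract concrete, here is a minimal sketch, assuming the simplest possible reading: a taxonomy maps each term to its narrower terms, a source answers a term with every object indexed under that term or a narrower one, and an articulation maps a mediator term to source terms it subsumes. The class and function names (`Source`, `Mediator`, `narrower`) and the example data are hypothetical; the paper's evaluation modes and translation types are not modelled.

```python
def narrower(taxonomy, term):
    """Return `term` plus every term below it in the taxonomy."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(taxonomy.get(t, []))
    return seen

class Source:
    def __init__(self, taxonomy, index):
        self.taxonomy = taxonomy  # broader term -> list of narrower terms
        self.index = index        # term -> set of object identifiers

    def answer(self, term):
        """Objects indexed under `term` or any narrower term."""
        result = set()
        for t in narrower(self.taxonomy, term):
            result |= self.index.get(t, set())
        return result

class Mediator:
    def __init__(self, taxonomy, articulations, sources):
        self.taxonomy = taxonomy
        # mediator term -> list of (source name, source term) pairs
        self.articulations = articulations
        self.sources = sources    # source name -> Source

    def answer(self, term):
        """Translate a mediator term into source queries and merge answers."""
        result = set()
        for t in narrower(self.taxonomy, term):
            for name, source_term in self.articulations.get(t, []):
                result |= self.sources[name].answer(source_term)
        return result

# Hypothetical example data.
s1 = Source({"DB": ["SQL"]}, {"DB": {"o1"}, "SQL": {"o2"}})
s2 = Source({}, {"MachineLearning": {"o3"}, "Robotics": {"o4"}})
mediator = Mediator(
    {"Computing": ["Databases", "AI"]},
    {"Databases": [("s1", "DB")], "AI": [("s2", "MachineLearning")]},
    {"s1": s1, "s2": s2},
)
```

Here `mediator.answer("Computing")` collects objects from both sources through the articulations, while `"Robotics"` in the second source stays unreachable because no articulation mentions it.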
Rating the Impact of Logical Representations on Retrieval Performance
In Proc. LUMIS 2001, DEXA 2001 Int. Workshop on Logical and Uncertainty Models for Information Systems, 2001
Cited by 4 (3 self)
Logic provides a rich and uniform framework in which Information Retrieval can be modeled. The ability of logical approaches to give rise to more general Information Retrieval models is promising. However, little experimentation has been carried out on the real impact of logical representations on retrieval performance. This work is an attempt to fill this gap. Our work is based on a recent logical model of Information Retrieval in which some classical representations can be modeled and matched. We ran experiments on both the basic retrieval task and the task of retrieval with feedback. The results obtained using expressive representations are encouraging for applying the model in real IR systems.
Bayesian-network Confirmation of Software Testing Uncertainties
1997
Cited by 3 (1 self)
In this paper, we claim that software development would benefit from explicitly modeling its uncertainties using existing uncertainty modeling techniques. We first state the Maxim of Uncertainty in Software Engineering (MUSE) and then give a detailed presentation of uncertainty in software testing. We then propose that a specific technique, known as Bayesian Belief Networks, be used to model software testing uncertainties. We demonstrate the use of Bayesian networks to confirm beliefs in the validity of software artifacts and relations in an elevator control system. We describe a prototype implementation that allows for such "software belief networks" to be defined and updated. We conclude with a discussion of issues, concerns, and future prospects for modeling software uncertainties.
Keywords: Uncertainty modeling, Bayesian networks, Software testing, Software maxims.
1 Introduction
Future prospects for software development appear promising. New ...
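As an illustration of what updating a belief in a software artifact can look like, the sketch below applies Bayes' rule to the smallest possible network: one node for "the artifact is valid" and one child node for "a test passes". This is an assumption-laden toy, not the paper's prototype; the function name `update_belief` and all probabilities are hypothetical.

```python
def update_belief(prior, p_pass_if_valid, p_pass_if_invalid, passed):
    """Posterior probability that an artifact is valid after one test result.

    prior             -- belief that the artifact is valid before the test
    p_pass_if_valid   -- assumed chance a test passes when it is valid
    p_pass_if_invalid -- assumed chance a test passes when it is not
    passed            -- the observed test outcome
    """
    like_valid = p_pass_if_valid if passed else 1.0 - p_pass_if_valid
    like_invalid = p_pass_if_invalid if passed else 1.0 - p_pass_if_invalid
    evidence = like_valid * prior + like_invalid * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under the model")
    return like_valid * prior / evidence

# Hypothetical numbers: start undecided, observe a pass, then a failure.
belief = 0.5
belief_after_pass = update_belief(belief, 0.9, 0.3, passed=True)
belief_after_fail = update_belief(belief_after_pass, 0.9, 0.3, passed=False)
```

With these made-up probabilities a passing test raises the belief from 0.5 to about 0.75, and a subsequent failure lowers it to about 0.3. A full belief network would propagate such updates across many related artifacts rather than a single pair of nodes.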
An Integrated Approach to the Electronic Library of the Future: Connecting a Document Retrieval System with a Hypertext System
In Proceedings of the Hypermedia 94, 1994
Cited by 1 (1 self)
To improve today's online document retrieval systems, it seems sensible to aim at a direct display of complete documents instead of bibliographic references only. Doing this, it is also desirable to enhance the system environment in order to support various types of search strategies within the documents themselves. Advanced hypertext systems implement a large portion of such search tools. However, the manual effort to transfer a conventional linear document into hypertext form makes the hypertext conversion of large libraries look unrealistic. To overcome this problem, we propose the automatic conversion of given linear documents into a hypertext web. This can be done by combining hypertext generation methods with information retrieval methods in order to decrease the complexity of the generation process. After the hypertext generation, the hypertext web may be entered by means of a document retrieval system and examined either as a whole or restricted to regions defined by the user's...

