Glean: using syntactic information in document filtering
- Inf. Process. Manage., 1998
Cited by 3 (0 self)
In the networked world of the information age, we are exposed to inordinate amounts of information. Search engines and information retrieval systems seek to discern the relevant from the irrelevant information given the context of a user's query. In this paper, we describe a system named Glean, which is based on the idea that coherent text contains significant latent information, such as syntactic structure and patterns of language use, which can be used to enhance the performance of information retrieval systems. We propose a trainable approach that makes use of syntactic information to increase the precision of information retrieval systems. We present results on these improvements to precision under different scenarios: using syntactic information at different granularity, and different sizes of syntactic contexts.
Catering to the needs of Web users: Integrating Retrieval and Browsing
Cited by 1 (1 self)
We propose a new approach to querying hypermedia documents on the Web based on information retrieval (IR), browsing, and database techniques so as to provide maximum flexibility to the user. We present a model based on object representation where an identity does not correspond to a source HTML page but to a fragment of it. A fragment is identified using the explicit structure provided by the HTML tags as well as the implicit structure extracted using IR techniques. Our fragmentation provides access to different heterogeneous components (text, image, audio, video, etc.) of a given document, and to their relationships (implicit or explicit through hyperlinks). Our language expresses browsing and restructuring based on IR techniques in a unified framework. All these are integral components of the AKIRA system, currently under development.
Keywords: multimedia, hypermedia, Web, views, data model, query language, information retrieval, agents
1 Introduction The Web invades our lives. Whi...

