Using Linear Algebra for Intelligent Information Retrieval
- SIAM Review, 1995
Cited by 450 (14 self)
Currently, most approaches to retrieving textual materials from scientific databases depend on a lexical match between words in users' requests and those in or assigned to documents in a database. Because of the tremendous diversity in the words people use to describe the same document, lexical methods are necessarily incomplete and imprecise. Using the singular value decomposition (SVD), one can take advantage of the implicit higher-order structure in the association of terms with documents by determining the SVD of large sparse term-by-document matrices. Terms and documents represented by 200-300 of the largest singular vectors are then matched against user queries. We call this retrieval method Latent Semantic Indexing (LSI) because the subspace represents important associative relationships between terms and documents that are not evident in individual documents. LSI is a completely automatic yet intelligent indexing method, widely applicable, and a promising way to improve users...
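The matching step described in this abstract can be illustrated with a minimal sketch. It assumes NumPy, a made-up five-term, four-document collection, and k = 2 retained dimensions (far smaller than the 200-300 the abstract mentions). The query is folded into the reduced space with one common convention, q_hat = q^T U_k S_k^-1; none of the names or data below come from the paper.

```python
import numpy as np

# Toy term-by-document matrix (rows = terms, columns = documents).
# Vocabulary and documents are invented for illustration only.
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 0],  # petal
], dtype=float)


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


# Thin SVD: A = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2  # retained dimensions (assumed; real collections use far more)
U_k, s_k = U[:, :k], s[:k]
V_k = Vt[:k, :].T  # each row holds one document's k-dimensional coordinates

# Query containing only the term "car".
q = np.zeros(len(terms))
q[terms.index("car")] = 1.0

# Fold the query into the reduced space: q_hat = q^T U_k S_k^-1.
q_hat = (q @ U_k) / s_k

n_docs = A.shape[1]
lexical_scores = [cosine(q, A[:, j]) for j in range(n_docs)]
lsi_scores = [cosine(q_hat, V_k[j]) for j in range(n_docs)]

for j in range(n_docs):
    print(f"doc {j}: lexical={lexical_scores[j]:.3f}  lsi={lsi_scores[j]:.3f}")
```

In this toy collection, document 1 contains "automobile" and "engine" but not "car", so a purely lexical match scores it zero. In the reduced space it lines up with the query because both share the "engine" association, which is the kind of indirect term-document relationship the abstract refers to.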
Adaptive filtering of multilingual document streams
- In Fifth RIAO Conference on Computer Assisted Information Searching on the Internet, 1997
Cited by 11 (9 self)
The increasingly ubiquitous global information structure makes it possible to examine high-volume text streams that contain documents written in a variety of languages. Present monolingual adaptive filtering techniques learn profiles which reflect user preferences and then apply those profiles to reduce the volume of new documents that must be examined by the user to manageable levels. This paper presents three techniques for extending adaptive monolingual text filtering techniques to manage multilingual document streams. Experimental results are given which demonstrate that dictionary-based and corpus-based techniques achieve similar performance in this application. This observation motivates our development of a translation technique designed specifically for vector space text representations which can in principle exploit both dictionary-based and corpus-based techniques. Results of initial experiments with this technique are given and the potential advantages of the new technique are discussed. The paper concludes with a discussion of future directions for adaptive multilingual text filtering.
Personalized Web-Document Filtering Using Reinforcement Learning
- Applied Artificial Intelligence, 2001
Cited by 7 (0 self)
Document filtering is increasingly deployed in Web environments to reduce information overload of users. We formulate online information filtering as a reinforcement learning problem, i.e. TD(0). The goal is to learn user profiles that best represent the user's information needs and thus maximize the expected value of user relevance feedback. A method is then presented that acquires reinforcement signals automatically by estimating the user's implicit feedback from direct observations of browsing behaviors. This "learning by observation" approach is contrasted with conventional relevance feedback methods, which require explicit user feedback. Field tests have been performed which involved 10 users reading a total of 18,750 HTML documents over 45 days. Compared with existing document filtering techniques, the proposed learning method showed superior performance in information quality and adaptation speed to user preferences in online filtering.
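For readers unfamiliar with TD(0), the generic tabular update it refers to can be sketched as follows. This is the textbook rule V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)), not the paper's filtering system. The topic-like states, the reward values standing in for estimated implicit feedback, and the parameters alpha and gamma are all hypothetical.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular TD(0) update in place and return the TD error.

    values: dict mapping state -> estimated value (missing states count as 0.0)
    alpha:  learning rate (assumed value)
    gamma:  discount factor (assumed value)
    """
    v = values.get(state, 0.0)
    v_next = values.get(next_state, 0.0)
    td_error = reward + gamma * v_next - v
    values[state] = v + alpha * td_error
    return td_error


values = {}

# Hypothetical transitions: (state, reward, next_state). The reward stands in
# for an implicit-feedback estimate such as one derived from browsing behavior.
episode = [
    ("sports", 1.0, "finance"),
    ("finance", 0.0, "sports"),
]

for state, reward, next_state in episode:
    td0_update(values, state, reward, next_state)

print(values)
```

After the first transition only "sports" has a nonzero estimate. The second transition earns no reward of its own, yet "finance" still gains value because it leads to "sports". That bootstrapping from the next state's estimate is what distinguishes TD(0) from waiting for a final outcome.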
Agents in Cyberspace -- Towards a Framework for Multi-Agent Systems in Information Discovery
- In Proceedings of the 20th BCS Colloquium on Information Retrieval, IRSG98, 1998
Cited by 5 (3 self)
This article proposes a formal framework for Multi-Agent Systems in the context of Information Discovery. Information Discovery is a synthesis of Information Retrieval and Information Filtering. The Information Discovery Paradigm is given. In addition, the different types of agents needed in Information Discovery applications are described in terms of the operations they support and the knowledge and information they use. A correct filtering topology, consisting of sound filter paths, is identified. Three fields are identified in which Information Retrieval and Information Filtering benefit from their synthesis: query expansion, query generation or autonomous IR, and profile adaptation.
Text categorization: the assignment of subject descriptors to magazine articles
- Information Processing and Management, 2000
PEA - a personal email assistant with evolutionary adaption
- International Journal of Information Technology, 1999
Cited by 3 (0 self)
In this paper we present PEA, a Personal Email Assistant, which filters incoming emails and ranks them according to their relevance. We provide tools for the acquisition of individual user models, which may consist of several profiles to map various interest domains of the user. In order to respond promptly to shifts in a user's interests, we apply evolutionary algorithms to support an adaptive environment that constantly adjusts the user model to improve the quality of relevance assessment. As a second adaptive component we make use of a monitoring module that records all activities of the user. By means of a classifier system we model the behavior of the user to predict future actions, which results first in suggestions to the user and later in automatically performed tasks. Additional features of the system include the segmentation of lengthy emails, efficient treatment of duplicate or new versions of messages, cross-language filtering, and the extraction of relevant information by using templates learned from examples.
Glean: using syntactic information in document filtering
- Inf. Process. Manage., 1998
Cited by 3 (0 self)
In the networked world of the information age, we are exposed to inordinate amounts of information. Search engines and information retrieval systems seek to discern the relevant from the irrelevant information given the context of a user's query. In this paper, we describe a system named Glean, which is based on the idea that coherent text contains significant latent information, such as syntactic structure and patterns of language use, which can be used to enhance the performance of information retrieval systems. We propose a trainable approach that makes use of syntactic information to increase the precision of information retrieval systems. We present results on these improvements to precision under different scenarios: using syntactic information at different granularity, and different sizes of syntactic contexts.
Experimental investigation of high performance cognitive and interactive text filtering
- In Conference Proceedings, 1995 IEEE International Conference on Systems, Man and Cybernetics, 1995
Cited by 2 (1 self)
Text filtering has become increasingly important as the volume of networked information has exploded in recent years. This paper reviews recent progress in that field and reports on the development of a testbed for experimental investigation of cognitive and interactive text selection based on a history of user evaluations. An interactive filtering system model is presented and a new cognitive filtering technique which we call the Gaussian User Model is described. Because development of analytic measures of text selection effectiveness has proven intractable, we have modified the Cornell SMART text retrieval system to create a flexible text filtering testbed for experimental determination of filtering effectiveness. The paper concludes with a description of the design of this testbed system.
Association Index Architecture for Information Brokers
- 1998
Cited by 1 (1 self)
Information Discovery (ID) is the synthesis of Information Retrieval (IR) and Information Filtering (IF). In ID, broker agents act as intermediaries between user agents and source agents. Information about user interests and documents in sources can be modeled by 2-level hypermedia representations. These representations allow navigational mechanisms which have proven their effectiveness in IR applications. Broker agents should thus combine two 2-level hypermedia representations to obtain the overall information structure necessary for the synthesis of IR and IF. For this, we propose the so-called Association Index Architecture (AIA), which consists of two 2-level hypermedia representations connected through a third level called the association index. The AIA thus forms a 3-level hypermedia representation. Broker agents can perform actions in the AIA to implement their IR- and IF-related tasks. The AIA is shown to be a general symbolic architecture for combining knowled...

