Using collaborative filtering to weave an information tapestry
Communications of the ACM, 1992
Cited by 577 (3 self)
predicated on the belief that information filtering can be more effective when humans are involved in the filtering process. Tapestry was designed to support both content-based filtering and collaborative filtering, which entails people collaborating to help each other perform filtering by recording their reactions to documents they read. The reactions are called annotations; they can be accessed by other people’s filters. Tapestry is intended to handle any incoming stream of electronic documents and serves both as a mail filter and repository; its components are the indexer, document store, annotation store, filterer, little box, remailer, appraiser and reader/browser. Tapestry’s client/server architecture, its various components, and the Tapestry query language are described.
Bursty and Hierarchical Structure in Streams
2002
Cited by 196 (2 self)
A fundamental problem in text data mining is to extract meaningful structure from document streams that arrive continuously over time. E-mail and news articles are two natural examples of such streams, each characterized by topics that appear, grow in intensity for a period of time, and then fade away. The published literature in a particular research field can be seen to exhibit similar phenomena over a much longer time scale. Underlying much of the text mining work in this area is the following intuitive premise: that the appearance of a topic in a document stream is signaled by a "burst of activity," with certain features rising sharply in frequency as the topic emerges.
An Open Agent Architecture
1994
Cited by 171 (28 self)
The goal of this ongoing project is to develop an open agent architecture and accompanying user interface for networked desktop and handheld machines. The system we are building should support distributed execution of a user's requests, interoperability of multiple application subsystems, addition of new agents, and incorporation of existing applications. It should also be transparent; users should not need to know where their requests are being executed, nor how. Finally, in order to facilitate the user's delegating tasks to agents, the architecture will be served by a multimodal interface, including pen, voice, and direct manipulation. Design considerations taken to support this functionality will be discussed below. INTRODUCTION Agents are all the rage. "Visioneering" videos, such as Apple Computer's Knowledge Navigator, have helped to popularize the notion that programs endowed with agency, if not intelligence, are just around the corner. Soon, users need not themselves...
A Conceptual Framework for Text Filtering
1996
Cited by 34 (1 self)
This report develops a conceptual framework for text filtering practice and research, and reviews present practice in the field. Text filtering is an information seeking process in which documents are selected from a dynamic text stream to satisfy a relatively stable and specific information need. A model of the information seeking process is introduced and specialized to define information filtering. The historical development of text filtering is then reviewed and case studies of recent work are used to highlight important design characteristics of modern text filtering systems. Specific techniques drawn from information retrieval, user modeling, machine learning and other related fields are described, and the report concludes with observations on the present state of the art and implications for future research on text filtering.
Index Structures for Information Filtering Under the Vector Space Model
In Proc. International Conference on Data Engineering, 1993
Cited by 31 (4 self)
With the ever-increasing volume of electronically generated information, users of information systems are facing an information overload. It is desirable to support information filtering as a complement to traditional retrieval mechanisms. The number of users, and thus profiles (representing users' long-term interests), handled by an information filtering system is potentially huge, and the system has to process a constant stream of incoming information in a timely fashion. The efficiency of the filtering process is thus an important issue. In this paper, we study what data structures and algorithms can be used to efficiently perform large-scale information filtering under the vector space model, a retrieval model established as being effective. We apply the idea of the standard inverted index to index user profiles. We devise an alternative to the standard inverted index in which, instead of indexing every term in a profile, we select only the significant ones to index. We evaluate thei...
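The idea of indexing profiles rather than documents can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the sample profiles, and the use of an unnormalized dot product with a fixed threshold are all assumptions made for illustration, and the selective-indexing variant mentioned in the abstract is not shown.

```python
from collections import defaultdict


def build_profile_index(profiles):
    """Map each term to the (profile_id, weight) pairs whose profile contains it."""
    index = defaultdict(list)
    for profile_id, weights in profiles.items():
        for term, weight in weights.items():
            index[term].append((profile_id, weight))
    return index


def match_document(index, doc_weights, threshold):
    """Return the ids of profiles whose dot product with the document meets the threshold."""
    scores = defaultdict(float)
    # Only postings for terms that occur in the document are visited,
    # so profiles sharing no term with the document are never scored.
    for term, doc_weight in doc_weights.items():
        for profile_id, profile_weight in index.get(term, ()):
            scores[profile_id] += doc_weight * profile_weight
    return sorted(pid for pid, score in scores.items() if score >= threshold)


# Hypothetical profiles: term weights standing in for users' long-term interests.
profiles = {
    "alice": {"filtering": 0.8, "email": 0.6},
    "bob": {"database": 0.9, "index": 0.4},
}
index = build_profile_index(profiles)
matched = match_document(index, {"email": 1.0, "filtering": 0.5}, threshold=0.5)
```

Each incoming document touches only the posting lists of its own terms, which is the reason an inverted index over profiles can scale to a large number of users.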
Ishmail: Immediate Identification of Important Information
AT&T Labs, 1995
Cited by 13 (1 self)
This paper describes Ishmail, a program designed for people who get a lot of electronic mail. Most email programs do not address the main problem experienced by people who get a lot of email: information overload. Given a deluge of email, how does one maintain control over incoming message traffic and reduce the time required to find important messages? Some email programs support classification of messages into separate mailboxes, but this is only a partial solution. Ishmail is unique in that it not only sorts messages into mailboxes, but it orders mailboxes by a combination of user-specified priorities and alarms. While most mail programs only alert users about unread messages, Ishmail supports independent alarms on each mailbox with customizable thresholds and filters. Users control their alarms, mailboxes, and messages through customizable summaries that act as both views and interactive controls. Three additional unique features of Ishmail are 1) the ability to read messages safel...
Combining email models for false positive reduction
In KDD ’05: Proceeding of the 11th ACM SIGKDD international conference on Knowledge discovery in data mining, 2005
Cited by 8 (2 self)
Machine learning and data mining can be effectively used to model, classify and discover interesting information for a wide variety of data including email. The Email Mining Toolkit, EMT, has been designed to provide a wide range of analyses for arbitrary email sources. Depending upon the task, one can usually achieve very high accuracy, but with some amount of false positive tradeoff. Generally, false positives are prohibitively expensive in the real world. In the case of spam detection, for example, even a single misclassified email may be unacceptable if it is a very important one. Much work has been done to improve specific algorithms for the task of detecting unwanted messages, but less work has been reported on leveraging multiple algorithms and correlating models in this particular domain of email analysis. EMT has been updated with new correlation functions allowing the analyst to integrate a number of EMT’s user behavior models available in the core technology. We present results of combining classifier outputs for both improving accuracy and reducing false positives for the problem of spam detection. We apply these methods to a very large email data set and show results of different combination methods on these corpora. We introduce a new method to compare multiple and combined classifiers, and show how it differs from past work. The method analyzes the relative gain and maximum possible accuracy that can be achieved for certain combinations of classifiers to automatically choose the best combination. Categories & Subject Descriptors: H.3.3 [Information Search and Retrieval]: Retrieval models,
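As a rough illustration of how combining classifier outputs can lower the false positive rate, the sketch below averages the spam scores of two models and compares false positive rates at a fixed threshold. It does not reproduce EMT's correlation functions or its comparison method: the weighted average, the function names, the scores, and the labels are all hypothetical.

```python
def combine_scores(scores, weights):
    """Weighted average of per-classifier spam scores (an assumed combination rule)."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


def false_positive_rate(predictions, labels):
    """Fraction of legitimate messages (label False) that were flagged as spam."""
    flagged_legit = [p for p, is_spam in zip(predictions, labels) if not is_spam]
    return sum(flagged_legit) / len(flagged_legit) if flagged_legit else 0.0


# Made-up data: True marks spam; each list holds one model's score per message.
labels = [True, True, False, False]
model_a = [0.9, 0.6, 0.7, 0.1]
model_b = [0.8, 0.7, 0.2, 0.3]

combined = [combine_scores(pair, (1.0, 1.0)) for pair in zip(model_a, model_b)]

threshold = 0.5
fpr_a = false_positive_rate([s >= threshold for s in model_a], labels)
fpr_combined = false_positive_rate([s >= threshold for s in combined], labels)
```

In this toy data, model A alone flags one of the two legitimate messages, while the averaged score keeps both below the threshold and still flags both spam messages.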
Virtual Folders: Database Support for Electronic Messages Classification
In: International Symposium on Cooperative Database Systems for Advanced Applications, Heian Shrine, Kyoto, 1996
Cited by 6 (3 self)
Every regular electronic mail user is well aware of the problems of managing the increasing number of messages in his/her mailbox. In this paper, the concept of virtual folder is presented as an alternative and flexible approach to the classification of large volumes of electronic messages. Automatic folders are a particular type of virtual folder, very similar to the concept of a view in database systems. Using a query language developed to support information retrieval in electronic mail environments, an automatic folder is defined as a query that retrieves a set of messages meeting a specified criterion. Automatic folders help with the automatic classification and reorganization of messages, taking charge of maintaining consistency between a folder's intent and the set of messages it represents. 1 Introduction Electronic mail (email) has become an essential form of communication, and the size of messaging communities is increasing at an astonishing speed [26, 27]. Email is r...
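The notion of an automatic folder as a stored query can be sketched as follows. This is only an illustration under stated assumptions: the paper's query language is not shown in the abstract, so a Python predicate stands in for it, and the `Message` and `AutomaticFolder` classes and the sample messages are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Message:
    sender: str
    subject: str


@dataclass
class AutomaticFolder:
    """A folder defined by a criterion rather than by a stored list of messages."""
    name: str
    criterion: Callable[[Message], bool]

    def messages(self, store: Iterable[Message]) -> List[Message]:
        # Contents are recomputed from the store on every access, so the
        # folder stays consistent with its criterion as messages arrive.
        return [m for m in store if self.criterion(m)]


store = [
    Message("ann@example.org", "Budget review"),
    Message("bob@example.org", "Lunch"),
]
budget = AutomaticFolder("budget", lambda m: "budget" in m.subject.lower())

before = budget.messages(store)
store.append(Message("cat@example.org", "Budget update"))
after = budget.messages(store)
```

Because the folder holds a criterion and not a copy of the messages, the newly arrived message appears in it without any explicit refiling, much as a database view reflects changes to its underlying tables.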
Message management systems at work: prototypes for business communication
Journal of Organizational Computing, 1995
Cited by 5 (2 self)
We describe two applications based on a system for office communication that is more flexible and expressive than other systems. This system allows the computerization of tasks that previously required manual intervention because of each task’s complexity. The applications, one automating office tasks and the other simulating a bicycle industry, highlight the system’s ability to accommodate changes to the communication language. They also highlight the utility of both the formal language used by the system and the inferential model of communications used to interpret the messages.
Issues When Designing Filters in Messaging Systems
1993
Cited by 4 (1 self)
The increasing size of messaging communities increases the risk of information overload, especially when group communication tools like mailing lists or asynchronous conferencing systems (like Usenet News) are used. Future messaging systems will require more capable filters to aid users in the selection of what to read. The increasing use of networks by non-computer professionals requires filters that are easier to use and manage than most filtering software today. Filters might use evaluations of messages made by certain users as an aid to filtering these messages for other users. Keywords: Electronic Mail, Message Handling Systems, MHS, Computer conferencing, Bulletin Board systems, Filters, Mail filtering, Netnews, Usenet News, Information Retrieval Systems. Author's personal address: Skeppargatan 73, S-115 30 Stockholm, Sweden. Phone: +46-8-16 16 67. Internet mail: jpalme@dsv.su.se. University address: Department of Computer and Systems Sciences, Stockholm University and Kung...

