Results 1 - 7 of 7
Athena: Mining-based interactive management of text databases
- International Conference on Extending Database Technology, 2000
Cited by 27 (2 self)
Abstract. We describe Athena: a system for creating, exploiting, and maintaining a hierarchy of textual documents through interactive mining-based operations. Requirements of any such system include speed and minimal end-user effort. Athena satisfies these requirements through linear-time classification and clustering engines which are applied interactively to speed the development of accurate models. Naive Bayes classifiers are recognized to be among the best for classifying text. We show that our specialization of the Naive Bayes classifier is considerably more accurate (7 to 29% absolute increase in accuracy) than a standard implementation. Our enhancements include using Lidstone's law of succession instead of Laplace's law, under-weighting long documents, and over-weighting author and subject. We also present a new interactive clustering algorithm, C-Evolve, for topic discovery. C-Evolve first finds highly accurate cluster digests (partial clusters), gets user feedback to merge and correct these digests, and then uses the classification algorithm to complete the partitioning of the data. By allowing this interactivity in the clustering process, C-Evolve achieves considerably higher clustering accuracy (10 to 20% absolute increase in our experiments) than the popular K-Means and agglomerative clustering methods.
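As a rough illustration of the smoothing choice mentioned in the Athena abstract, the sketch below trains a multinomial Naive Bayes text classifier with Lidstone smoothing, where a constant `lam` is added to every word count (setting `lam = 1.0` gives Laplace's law). It is not the Athena implementation: the function names `train_nb` and `classify`, the default `lam = 0.1`, and the toy documents are all assumptions made for this example, and the abstract's other enhancements (under-weighting long documents, over-weighting author and subject) are not modeled.

```python
import math
from collections import Counter


def train_nb(docs, lam=0.1):
    """Train multinomial Naive Bayes with Lidstone smoothing.

    docs: list of (tokens, label) pairs.
    lam:  additive smoothing constant (hypothetical default; 1.0 = Laplace).
    Returns (log_prior, log_like, vocab).
    """
    vocab = {t for tokens, _ in docs for t in tokens}
    label_counts = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in label_counts}
    for tokens, label in docs:
        word_counts[label].update(tokens)

    log_prior = {c: math.log(n / len(docs)) for c, n in label_counts.items()}
    log_like = {}
    for c, counts in word_counts.items():
        # Lidstone estimate: (count + lam) / (total + lam * |V|)
        denom = sum(counts.values()) + lam * len(vocab)
        log_like[c] = {w: math.log((counts[w] + lam) / denom) for w in vocab}
    return log_prior, log_like, vocab


def classify(tokens, model):
    """Return the label with the highest posterior log-score."""
    log_prior, log_like, vocab = model
    scores = {
        c: log_prior[c] + sum(log_like[c][t] for t in tokens if t in vocab)
        for c in log_prior
    }
    return max(scores, key=scores.get)


# Toy usage with made-up documents.
toy_docs = [
    (["ball", "goal", "team"], "sports"),
    (["vote", "election", "party"], "politics"),
    (["goal", "match"], "sports"),
]
toy_model = train_nb(toy_docs, lam=0.1)
toy_label = classify(["goal", "team"], toy_model)
```

Training makes a single pass over the tokens, and classifying a document is linear in its length, which is in the spirit of the linear-time requirement the abstract states.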
Technical Paper Recommendation: A Study in Combining Multiple Information Sources
- Journal of Artificial Intelligence Research, 2001
Cited by 16 (0 self)
The growing need to manage and exploit the proliferation of online data sources is opening
Personalized Web-Document Filtering Using Reinforcement Learning
- Applied Artificial Intelligence, 2001
Cited by 7 (0 self)
Document filtering is increasingly deployed in Web environments to reduce users' information overload. We formulate online information filtering as a reinforcement learning problem, i.e. TD(0). The goal is to learn user profiles that best represent each user's information needs and thus maximize the expected value of user relevance feedback. A method is then presented that acquires reinforcement signals automatically by estimating the user's implicit feedback from direct observations of browsing behavior. This "learning by observation" approach is contrasted with conventional relevance feedback methods, which require explicit user feedback. Field tests were performed with 10 users reading a total of 18,750 HTML documents over 45 days. Compared to existing document filtering techniques, the proposed learning method showed superior performance in information quality and in speed of adaptation to user preferences in online filtering.
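For readers unfamiliar with TD(0), the following is a minimal sketch of the standard tabular TD(0) value update. It is not the filtering method from the paper above: the abstract does not say how states, rewards, or profiles are represented, so the string state names, the reward of 1.0 standing in for an implicit-feedback signal, and the defaults for `alpha` and `gamma` are all hypothetical.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular TD(0) update in place and return the TD error.

    values:     dict mapping state -> estimated value (missing states count as 0.0)
    reward:     scalar reinforcement signal observed after leaving `state`
    alpha:      learning rate (hypothetical default)
    gamma:      discount factor (hypothetical default)
    """
    current = values.get(state, 0.0)
    # TD error: reward + gamma * V(next_state) - V(state)
    td_error = reward + gamma * values.get(next_state, 0.0) - current
    values[state] = current + alpha * td_error
    return td_error


# Toy usage: two updates for the same made-up transition.
estimates = {}
first_error = td0_update(estimates, "doc_shown", 1.0, "doc_read")
second_error = td0_update(estimates, "doc_shown", 1.0, "doc_read")
```

Each update moves the estimate for `state` a fraction `alpha` of the way toward the one-step target, so the estimate adapts incrementally as new feedback arrives.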
Recommending Papers by Mining the Web
- In: Proceedings of the IJCAI99 Workshop on Learning about Users, 1999
Cited by 7 (0 self)
The problem of assigning conference paper submissions to suitable reviewers can be viewed as a variant of the general problem of technical paper recommendation. In both cases one would ideally like to direct only those papers that are of the greatest interest to the appropriate set of people. Current attempts to automate the conference reviewing process have typically converted it into a task that requires reviewers to rate keywords and sift through long lists of abstracts to find those that are appropriate for their interests and background. In this paper, we propose an automated method for recommending small focused sets of papers to reviewers. We show how intelligent paper recommendation can be performed by combining techniques from information retrieval and database technology, and by mining multiple information sources from the Web. We use abstracts of papers submitted to AAAI-98 and data mined from the home pages of its program committee members, and we evaluate our approach based on actual reviewing preferences given by the committee members.
Text Categorization Through Probabilistic Learning: Applications to Recommender Systems
- 1998
Cited by 1 (0 self)
Author: Paul N. Bennett Title: Text Categorization Through Probabilistic Learning: Applications to Recommender Systems Supervising Professor: Raymond J. Mooney, Ph.D. With the growth of the World Wide Web, recommender systems have received an increasing amount of attention. Many recommender systems in use today are based on collaborative filtering. This project has focused on LIBRA, a content-based book recommending system. By utilizing text categorization methods and the information available for each book, the system determines a user profile which is used as the basis of recommendations made to the user. Instead of the bag-of-words approach used in many other statistical text categorization approaches, LIBRA parses each text sample into a semi-structured representation. We have used standard Machine Learning techniques to analyze the performance of several algorithms on this learning task. In addition, we analyze the utility of several methods of feature construction and selection (...
A Framework Analysis for Managing Explicit Feedback of Visitors of a Web Site
In this paper we present a framework analysis for managing the feedback explicitly given by visitors of a Web site. We introduce the concepts of scope, filtering, and relevance profiles for managing users' feedback, and show their applicability by using Gugubarra as a reference system, a prototype developed by DBIS at the Goethe University of Frankfurt, for creating and managing user profiles of Web visitors.
CALVIN: A Personalized Web-Search Agent based on Monitoring User Actions
In this paper we describe Calvin, an intelligent agent that learns user interests by monitoring user activities while he/she searches and browses the Web. The user profile is created and maintained from a content-based and event-based analysis of the visited pages using Inductive Logic Programming. The user submits queries which are expanded considering the information represented in her/his profile. Once the expanded query is submitted to and answered by a search engine, the agent performs a relevance ranking of the results based on the user interests. After some experiments, Calvin has demonstrated to be capable of learning and adapting user interests without any explicit feedback from her/him.

