Evaluation of Item-Based Top-N Recommendation Algorithms
2000
Cited by 86 (3 self)
The explosive growth of the world-wide-web and the emergence of e-commerce have led to the development of recommender systems, a personalized information filtering technology used to identify a set of N items that will be of interest to a certain user. User-based collaborative filtering is the most successful technology for building recommender systems to date, and is extensively used in many commercial recommender systems. Unfortunately, the computational complexity of these methods grows linearly with the number of customers, which in typical commercial applications can grow to several million. To address these scalability concerns, item-based recommendation techniques have been developed that analyze the user-item matrix to identify relations between the different items, and use these relations to compute the list of recommendations. In this paper we present one such class of item-based recommendation algorithms that first determine the similarities between the various ite...
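The abstract above is truncated, so the following is only a minimal sketch of the general item-based idea it describes, not the paper's algorithm. It assumes a binary user-item matrix, cosine similarity between item columns, and scoring by summed similarity; the function names `item_similarities` and `top_n` and the toy data are hypothetical.

```python
import math

def item_similarities(rows):
    """Cosine similarity between item columns of a binary user-item matrix."""
    n_items = len(rows[0])
    cols = [[row[j] for row in rows] for j in range(n_items)]
    norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
    sim = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(n_items):
            if i != j and norms[i] > 0 and norms[j] > 0:
                dot = sum(a * b for a, b in zip(cols[i], cols[j]))
                sim[i][j] = dot / (norms[i] * norms[j])
    return sim

def top_n(basket, sim, n):
    """Rank items outside the basket by summed similarity to basket items."""
    scores = {}
    for cand in range(len(sim)):
        if cand in basket:
            continue
        scores[cand] = sum(sim[item][cand] for item in basket)
    # Sort by descending score; break ties by item index for determinism.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:n]

# Toy data (assumed): rows are customers, columns are items 0..3.
purchases = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
]
sim = item_similarities(purchases)
recommended = top_n({0}, sim, 2)
```

Because the item-item similarities depend only on the matrix columns, they can be computed once offline and reused for every customer, which is the scalability argument the abstract makes against user-based methods.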
Predicting Web Actions from HTML Content
In Proceedings of the Thirteenth ACM Conference on Hypertext and Hypermedia (HT’02), 2002
Cited by 32 (5 self)
This paper examines the accuracy of predicting a user's next action based on analysis of the content of the pages requested recently by the user. Predictions are made using the similarity of a model of the user's interest to the text in and around the hypertext anchors of recently requested Web pages. This approach can make predictions of actions that have never been taken by the user and potentially make predictions that reflect current user interests. We evaluate this technique using data from a full-content log of Web activity and find that textual similarity-based predictions outperform simpler approaches.
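As a rough illustration of similarity-based prediction, the sketch below scores each candidate link by the cosine similarity between a bag-of-words interest model and the link's anchor text. The whitespace tokenizer, the unweighted term counts, and the names `bag_of_words`, `cosine`, and `predict_next_link` are assumptions for this example; the abstract does not specify the paper's actual model.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Lowercased term counts; a deliberately simple tokenizer."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_next_link(recent_page_texts, anchors):
    """Pick the URL whose anchor text best matches the interest model.

    anchors maps URL -> text in and around the hypertext anchor.
    """
    interest = Counter()
    for text in recent_page_texts:
        interest.update(bag_of_words(text))
    return max(anchors, key=lambda url: cosine(interest, bag_of_words(anchors[url])))

# Toy data (assumed).
recent = ["hiking trails and mountain maps", "mountain weather forecast"]
anchors = {"/gear": "mountain hiking gear", "/recipes": "quick dinner recipes"}
predicted = predict_next_link(recent, anchors)
```

Since the score depends only on text, a link the user has never followed can still be ranked first, which matches the abstract's point about predicting actions that have never been taken.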
Learning Implicit User Interest Hierarchy for Context in Personalization
In Proc. of International Conference on Intelligent User Interface (IUI), 2003
Cited by 32 (4 self)
To provide a more robust context for personalization, we desire to extract a continuum of general (long-term) to specific (short-term) interests of a user. Our proposed approach is to learn a user interest hierarchy (UIH) from a set of web pages visited by a user. We devise a divisive hierarchical clustering (DHC) algorithm to group words (topics) into a hierarchy where more general interests are represented by a larger set of words. Each web page can then be assigned to nodes in the hierarchy for further processing in learning and predicting interests. This approach is analogous to building a subject taxonomy for a library catalog system and assigning books to the taxonomy. Our approach does not need user involvement and learns the UIH "implicitly." Furthermore, it allows the original objects, web pages, to be assigned to multiple topics (nodes in the hierarchy). In this paper, we focus on learning the UIH from a set of visited pages. We propose a few similarity functions and dynamic threshold-finding methods, and evaluate the resulting hierarchies according to their meaningfulness and shape.
Inferring User Interest
IEEE Internet Computing, 2001
Cited by 25 (0 self)
Recommender systems provide personalized suggestions about items that users will find interesting. Typically, recommender systems require a user interface that can determine the interest of a user and use this information to make suggestions. The common solution, explicit ratings, where users tell the system what they think about a piece of information, is well-understood and fairly precise. However, having to stop to enter explicit ratings can alter normal patterns of browsing and reading. A less intrusive method is to use implicit ratings, where a rating is obtained by a method other than obtaining it directly from the user. This research studies the correlation between various implicit ratings and the explicit rating for a single Web page, and the impact of implicit interest indicators on user privacy. We developed a Web browser that records a user's actions (implicit ratings) and the explicit rating for each page visited. The browser was used by over 70 people who browsed more than 2500 Web pages. We find that the time spent on a page, the amount of scrolling on a page, and the combination of time and scrolling have a strong correlation with explicit interest, while individual scrolling methods and mouse-clicks are ineffective in predicting explicit interest.
Personalization from Incomplete Data: What You Don't Know Can Hurt
In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2001), 2001
Cited by 22 (3 self)
Clickstream data collected at any web site (site-centric data) is inherently incomplete, since it does not capture users' browsing behavior across sites (user-centric data). Hence, models learned from such data may be subject to limitations, the nature of which has not been well studied. Understanding the limitations is particularly important since most current personalization techniques are based on site-centric data only. In this paper, we empirically examine the implications of learning from incomplete data in the context of two specific problems: (a) predicting if the remainder of any given session will result in a purchase and (b) predicting if a given user will make a purchase at any future session. For each of these problems we present new algorithms for fast and accurate data preprocessing of clickstream data. Based on a comprehensive experiment on user-level clickstream data gathered from 20,000 users' browsing behavior, we demonstrate that models built on user-centric data outperform models built on site-centric data for both prediction tasks.
Persona: A Contextualized and Personalized Web Search
In Proc. of the 35th Annual Hawaii International Conference on System Sciences, 2001
Cited by 14 (0 self)
Recent advances in graph-based search techniques derived from Kleinberg's work [1] have been impressive. This paper further improves the graph-based search algorithm in two dimensions. Firstly, variants of Kleinberg's techniques do not take into account the semantics of the query string nor of the nodes being searched. As a result, polysemy of query words cannot be resolved. This paper presents an interactive query scheme utilizing the simple web ontology provided by the Open Directory Project to resolve meanings of a user query. Secondly, we extend a recently proposed personalized version of the Kleinberg algorithm [3]. Simulation results are presented to illustrate the sensitivity of our technique. We outline the implementation of our algorithm in the Persona personalized web search system.
Cross-sell: A Fast Promotion-Tunable Customer-item Recommendation Method Based on Conditionally Independent Probabilities
In Proceedings of ACM SIGKDD International Conference, 2000
Cited by 13 (0 self)
We develop a method for recommending products to customers with applications to both on-line and surface mail promotional offers. Our method differs from previous work in collaborative filtering [8] and imputation [18], in that we assume probabilities are conditionally independent. This assumption, which is also made in Naive Bayes [5], enables us to pre-compute probabilities and store them in main memory, enabling very fast performance on millions of customers. The algorithm supports a variety of tunable parameters so that the method can address different promotional objectives. We tested the algorithm at an on-line hardware retailer, with 17,400 customers divided randomly into control and experimental groups. In the experimental group, clickthrough increased by +40% (p<0.01), revenue by +38% (p<0.07), and units sold by +61% (p<0.01). By changing the algorithm's parameter settings we found that these results could be improved even further. This work demonstrates the considerable potent...
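To make the conditional-independence assumption concrete, here is a small Naive-Bayes-style sketch, not the Cross-sell method itself, whose details the truncated abstract does not give. It precomputes item and item-pair counts once, then scores a candidate as P(candidate) times the product of P(owned item | candidate). The function names, the Laplace smoothing parameter `alpha`, and the toy baskets are all assumptions.

```python
from collections import Counter
from itertools import permutations

def precompute(baskets):
    """Count single items and ordered item pairs once, up front."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)
        item_counts.update(items)
        pair_counts.update(permutations(items, 2))
    return item_counts, pair_counts, len(baskets)

def score(candidate, owned, item_counts, pair_counts, n_baskets, alpha=1.0):
    """P(candidate) * product over owned items of P(item | candidate)."""
    result = item_counts[candidate] / n_baskets
    for item in owned:
        # Laplace smoothing (alpha) keeps unseen pairs from zeroing the product.
        joint = pair_counts[(candidate, item)] + alpha
        result *= joint / (item_counts[candidate] + 2 * alpha)
    return result

# Toy purchase histories (assumed).
baskets = [
    ["drill", "bits"],
    ["drill", "bits", "gloves"],
    ["paint", "brush"],
    ["drill", "gloves"],
]
item_counts, pair_counts, n = precompute(baskets)
bits_score = score("bits", ["drill"], item_counts, pair_counts, n)
brush_score = score("brush", ["drill"], item_counts, pair_counts, n)
```

Scoring touches only the precomputed counters, never the raw transactions, which is what lets such tables be held in memory and queried quickly.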
PVA: A Self-Adaptive Personal View Agent
2002
Cited by 13 (0 self)
In this paper, we present PVA, an adaptive personal view information agent system for tracking, learning and managing user interests in Internet documents. PVA consists of three parts: a proxy, personal view constructor, and personal view maintainer. The proxy logs the user's activities and extracts the user's interests without user intervention. The personal view constructor mines user interests and maps them to a class hierarchy (i.e., personal view). The personal view maintainer synchronizes user interests and the personal view periodically. When user interests change, in PVA, not only the contents, but also the structure of the user profile are modified to adapt to the changes. In addition, PVA considers the aging problem of user interests. The experimental results show that modulating the structure of the user profile increases the accuracy of a personalization system.
Interactive Path Analysis of Web Site Traffic
Proc. of the Seventh ACM SIGKDD Int. Conf. in Knowledge Discovery and Data Mining, 2001
Cited by 12 (0 self)
The goal of Path Analysis is to understand visitors' navigation of a Web site. The fundamental analysis component is a path. A path is a finite sequence of elements, typically representing URLs or groups of URLs. A full path is an abstraction of a visit or a session, which can contain attributes described below. Subpaths represent interesting subsequences of the full paths.
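The path and subpath definitions above can be illustrated with a short sketch. It assumes that each full path is a list of URLs for one session and that a subpath is a contiguous run of a fixed length; the abstract does not say whether the paper's subpaths must be contiguous, and `count_subpaths` is a hypothetical helper.

```python
from collections import Counter

def count_subpaths(paths, length):
    """Count every contiguous subpath of the given length across all paths."""
    counts = Counter()
    for path in paths:
        for start in range(len(path) - length + 1):
            counts[tuple(path[start:start + length])] += 1
    return counts

# Toy sessions (assumed): each list is one visit's full path.
sessions = [
    ["/home", "/products", "/cart", "/checkout"],
    ["/home", "/products", "/cart"],
    ["/home", "/about"],
]
subpath_counts = count_subpaths(sessions, 2)
```

Calling `subpath_counts.most_common()` then lists the most frequently traversed two-step transitions first, a simple way to surface candidate "interesting" subpaths.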
The Design and Evaluation of Web Prefetching and Caching Techniques
2002
Cited by 7 (2 self)
User-perceived retrieval latencies in the World Wide Web can be improved by pre-loading a local cache with resources likely to be accessed. A user requesting content that can be served by the cache is able to avoid the delays inherent in the Web, such as congested networks and slow servers. The difficulty, then, is to determine what content to prefetch into the cache. This work explores machine learning algorithms for user sequence prediction, both in general and specifically for sequences of Web requests. We also consider information retrieval techniques to allow the use of the content of Web pages to help predict future requests. Although history-based mechanisms can provide strong performance in predicting future requests, performance can be improved by including predictions from additional sources. While past researchers have used a variety of techniques for evaluating caching algorithms and systems, most of those methods were not applicable to the evaluation of prefetching algorithms or systems. Therefore, two new mechanisms for evaluation are introduced. The first is a detailed trace-based simulator, built from scratch,

