Web mining for web personalization
ACM Transactions on Internet Technology, 2003
Cited by 100 (4 self)
Web personalization is the process of customizing a Web site to the needs of specific users, taking advantage of the knowledge acquired from the analysis of the user’s navigational behavior (usage data) in correlation with other information collected in the Web context, namely, structure, content and user profile data. Due to the explosive growth of the Web, the domain of Web personalization has gained great momentum both in the research and commercial areas. In this article we present a survey of the use of Web mining for Web personalization. More specifically, we introduce the modules that comprise a Web personalization system, emphasizing the Web usage mining module. A review of the most common methods that are used as well as technical issues that occur is given, along with a brief overview of the most popular tools and applications available from software vendors. Moreover, the most important research initiatives in the Web usage mining and personalization areas are presented.
Discovery and Evaluation of Aggregate Usage Profiles for Web Personalization
Data Mining and Knowledge Discovery, 2002
Cited by 78 (14 self)
Web usage mining, possibly used in conjunction with standard approaches to personalization such as collaborative filtering, can help address some of the shortcomings of these techniques, including reliance on subjective user ratings, lack of scalability, and poor performance in the face of high-dimensional and sparse data. However, the discovery of patterns from usage data by itself is not sufficient for performing the personalization tasks. The critical step is the effective derivation of good quality and useful (i.e., actionable) "aggregate usage profiles" from these patterns. In this paper we present and experimentally evaluate two techniques, based on clustering of user transactions and clustering of pageviews, in order to discover overlapping aggregate profiles that can be effectively used by recommender systems for real-time Web personalization. We evaluate these techniques both in terms of the quality of the individual profiles generated, as well as in the context of providing recommendations as an integrated part of a personalization engine. In particular, our results indicate that using the generated aggregate profiles, we can achieve effective personalization at early stages of users' visits to a site, based only on anonymous clickstream data and without the benefit of explicit input by these users or deeper knowledge about them.
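As a rough illustration of the transaction-clustering idea in this abstract, the sketch below derives one weighted profile from an already-formed cluster of transactions and scores it against an active session. The function names, the `min_weight` threshold, and the cosine-style match score are assumptions made for this example, not details taken from the paper.

```python
import math
from collections import defaultdict

def aggregate_profile(cluster, min_weight=0.5):
    """Build a weighted pageview profile from one cluster of transactions.

    cluster: list of transactions, each a set of pageview identifiers.
    A pageview's weight is the fraction of transactions that contain it;
    pageviews below min_weight (an assumed threshold) are dropped.
    """
    if not cluster:
        return {}
    counts = defaultdict(int)
    for transaction in cluster:
        for page in set(transaction):
            counts[page] += 1
    n = len(cluster)
    return {page: c / n for page, c in counts.items() if c / n >= min_weight}

def match_score(profile, session):
    """Cosine similarity between a profile and a binary session vector."""
    session = set(session)
    if not profile or not session:
        return 0.0
    overlap = sum(w for page, w in profile.items() if page in session)
    norm = math.sqrt(len(session)) * math.sqrt(sum(w * w for w in profile.values()))
    return overlap / norm

cluster = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}, {"a", "b"}]
profile = aggregate_profile(cluster)
score = match_score(profile, {"a"})
```

Because pageviews can pass the threshold in more than one cluster, profiles built this way may overlap, which is the property the abstract highlights.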
Effective Personalization Based on Association Rule Discovery from Web Usage Data
In Proceedings of the 3rd ACM Workshop on Web Information and Data Management (WIDM01), 2001
Cited by 58 (9 self)
To engage visitors to a Web site at a very early stage (i.e., before registration or authentication), personalization tools must rely primarily on clickstream data captured in Web server logs. The lack of explicit user ratings, as well as the sparse nature and the large volume of data in such a setting, poses serious challenges to standard collaborative filtering techniques in terms of scalability and performance. Web usage mining techniques such as clustering that rely on offline pattern discovery from user transactions can be used to improve the scalability of collaborative filtering; however, this is often at the cost of reduced recommendation accuracy. In this paper we propose effective and scalable techniques for Web personalization based on association rule discovery from usage data. Through detailed experimental evaluation on real usage data, we show that the proposed methodology can achieve better recommendation effectiveness, while maintaining a computational advantage over direct approaches to collaborative filtering such as the k-nearest-neighbor strategy.
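The general idea of recommending from association rules can be sketched as follows. This is a deliberately small version restricted to rules between single pages, with assumed support and confidence thresholds and hypothetical function names; it is not the algorithm evaluated in the paper.

```python
from collections import Counter
from itertools import combinations

def pair_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Mine rules lhs -> rhs between single pages from page-set transactions."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = {}
    for (a, b), count in pair_counts.items():
        if count / n < min_support:
            continue  # the pair is not frequent enough
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules[(lhs, rhs)] = confidence
    return rules

def recommend(rules, session, top_n=3):
    """Rank unseen pages by the best confidence of any rule the session fires."""
    seen = set(session)
    scores = {}
    for (lhs, rhs), confidence in rules.items():
        if lhs in seen and rhs not in seen:
            scores[rhs] = max(scores.get(rhs, 0.0), confidence)
    return sorted(scores, key=lambda page: (-scores[page], page))[:top_n]

transactions = [
    ["home", "products", "cart"],
    ["home", "products"],
    ["home", "about"],
    ["products", "cart"],
]
rules = pair_rules(transactions)
```

Rule mining runs offline; at recommendation time only the lookup in `recommend` is needed, which is where the computational advantage over a nearest-neighbor scan of all user records comes from.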
A Taxonomy of Recommender Agents on the Internet
Artificial Intelligence Review, 2003
Cited by 44 (1 self)
Recently, Artificial Intelligence techniques have proved useful in helping users to handle the large amount of information on the Internet. The idea of personalized search engines, intelligent software agents, and recommender systems has been widely accepted among users who require assistance in searching, sorting, classifying, filtering and sharing this vast quantity of information. In this paper, we present a state-of-the-art taxonomy of intelligent recommender agents on the Internet. We have analyzed 37 different systems and their references and have sorted them into a list of 8 basic dimensions. These dimensions are then used to establish a taxonomy under which the systems analyzed are classified. Finally, we conclude this paper with a cross-dimensional analysis with the aim of providing a starting point for researchers to construct their own recommender system.
Towards semantic web mining
In International Semantic Web Conference (ISWC), 2002
Cited by 44 (9 self)
Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. The idea is to improve, on the one hand, the results of Web Mining by exploiting the new semantic structures in the Web; and to make use of Web Mining, on the other hand, for building up the Semantic Web. This paper gives an overview of where the two areas meet today, and sketches ways of how a closer integration could be profitable.
Insight and Perspective for Content Delivery Networks
In Communications of the ACM, 2006
Cited by 40 (7 self)
Striking a balance between the costs for Web content providers and the quality of service for Web customers.
More efficient content delivery over the Web has become an important element of improving Web performance. Content Delivery Networks (CDNs) have been proposed to maximize bandwidth, improve accessibility, and maintain correctness through content replication [11]. With CDNs, content is distributed to cache servers located close to users, resulting in fast, reliable applications and Web services for the users. More specifically, CDNs maintain multiple Points of Presence (PoP) with clusters of (the so-called surrogate) servers that store copies of identical content, such that users' requests are satisfied by the most appropriate site (see the figure here). Typically, a CDN topology involves:
• A set of surrogate servers (distributed around the world) that cache the origin servers' content;
• Routers and network elements that deliver
Using Ontologies to Discover Domain-Level Web Usage Profiles
2002
Cited by 30 (7 self)
Usage patterns discovered through Web usage mining are effective in capturing item-to-item and user-to-user relationships and similarities at the level of user sessions. Without the benefit of deeper domain knowledge, such patterns provide little insight into the underlying reasons for which such items or users are grouped together. This can lead to a number of important shortcomings in personalization systems based on Web usage mining or collaborative filtering. For example, if a new item is recently added to the Web site, it is not likely that the pages associated with the item would be a part of any of the discovered patterns, and thus these pages cannot be recommended. Keyword-based content-filtering approaches have been used to enhance the effectiveness of collaborative filtering systems by focusing on content similarity among items or pages. These approaches, however, are incapable of capturing more complex relationships at a deeper semantic level based on different types of attributes associated with structured objects. This paper represents work-in-progress towards creating a general framework for using domain ontologies to automatically characterize usage profiles containing a set of structured Web objects. Our motivation is to use this framework in the context of Web personalization, going beyond page- or item-level constructs, and using the full semantic power of the underlying ontology.
Web Usage Mining Based on Probabilistic Latent Semantic Analysis
2004
Cited by 29 (5 self)
The primary goal of Web usage mining is the discovery of patterns in the navigational behavior of Web users. Standard approaches, such as clustering of user sessions and discovering association rules or frequent navigational paths, do not generally provide the ability to automatically characterize or quantify the unobservable factors that lead to common navigational patterns. It is, therefore, necessary to develop techniques that can automatically identify the users' underlying navigational objectives and to discover hidden semantic relationships among users as well as between users and Web objects. Probabilistic Latent Semantic Analysis (PLSA) is particularly useful in this context, since it can uncover latent semantic associations among users and pages based on the co-occurrence patterns of these pages in user sessions. In this paper, we develop a unified framework for the discovery and analysis of Web navigational patterns based on PLSA. We show the flexibility of this framework in characterizing various relationships among users and Web objects. Since these relationships are measured in terms of probabilities, we are able to use probabilistic inference to perform a variety of analysis tasks such as user segmentation, page classification, as well as predictive tasks such as collaborative recommendations. We demonstrate the effectiveness of our approach through experiments performed on several real-world data sets.
Using Sequential and Non-Sequential Patterns in Predictive Web Usage Mining Tasks
In Proceedings of the IEEE International Conference on Data Mining, 2002
Cited by 27 (2 self)
We describe an efficient framework for Web personalization based on sequential and non-sequential pattern discovery from usage data. Our experiments, performed on real usage data, indicate that more restrictive patterns, such as contiguous sequential patterns (e.g., frequent navigational paths), are more suitable for predictive tasks such as Web prefetching (which involves predicting which item is accessed next by a user), while less constrained patterns, such as frequent itemsets or general sequential patterns, are more effective alternatives in the context of Web personalization and recommender systems.
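The difference between the two pattern families can be made concrete with a small sketch. The counting functions below are hypothetical helpers written for this illustration: one counts contiguous navigational paths, the other counts unordered page sets, each at most once per session.

```python
from collections import Counter
from itertools import combinations

def contiguous_patterns(sessions, length=2):
    """Count contiguous page sequences (navigational paths), once per session."""
    counts = Counter()
    for s in sessions:
        paths = {tuple(s[i:i + length]) for i in range(len(s) - length + 1)}
        counts.update(paths)
    return counts

def itemset_patterns(sessions, size=2):
    """Count unordered page sets, ignoring order and adjacency."""
    counts = Counter()
    for s in sessions:
        counts.update(combinations(sorted(set(s)), size))
    return counts

sessions = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
paths = contiguous_patterns(sessions)
itemsets = itemset_patterns(sessions)
```

In this toy data every session contains the pair {a, b}, yet the exact path a then b occurs in only one of them. The stricter pattern says more about the next click but applies to fewer sessions, which is the trade-off the abstract describes.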
The impact of site structure and user environment on session reconstruction in web usage analysis
2002
Cited by 27 (4 self)
The analysis of user behavior on the Web presupposes a reliable reconstruction of the users' navigational activities. Cookies and server-generated session identifiers have been designed to allow a faithful session reconstruction. However, in the absence of reliable methods, analysts must rely on heuristic methods (a) to identify unique visitors to a site, and (b) to distinguish among the activities of such users during independent sessions. The characteristics of the site, such as the site structure, as well as the methods used for data collection (e.g., the existence of cookies and reliable synchronization across multiple servers) may necessitate the use of different types of heuristics. In this study, we extend our work on the reliability of sessionizing mechanisms, by investigating the impact of site structure on the quality of constructed sessions. Specifically, we juxtapose sessionizing on a frame-based and a frame-free version of a site. We investigate the behavior of cookies, server-generated session identification, and heuristics that exploit session duration, page stay time and page linkage. Different measures of session reconstruction quality, as well as experiments on the impact on the prediction of frequent entry and exit pages, show that different reconstruction heuristics can be recommended depending on the characteristics of the site. We also present first results on the impact of session reconstruction heuristics on predictive applications such as Web personalization.
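A minimal sketch of the two time-oriented heuristics mentioned here (session duration and page stay time) might look like the following. The function name and the threshold values of 30 minutes per session and 10 minutes per page are assumptions chosen for illustration, not values reported in the study, and the page-linkage heuristic is not covered.

```python
def sessionize(timestamps, max_session=1800, max_page_stay=600):
    """Split one visitor's sorted request times (in seconds) into sessions.

    A request joins the current session only if the gap since the previous
    request is at most max_page_stay and the session would span at most
    max_session; otherwise it opens a new session. Both limits are assumed.
    """
    sessions = []
    for t in timestamps:
        if (sessions
                and t - sessions[-1][-1] <= max_page_stay
                and t - sessions[-1][0] <= max_session):
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions

by_stay = sessionize([0, 100, 500, 1500, 1700, 2000])
by_duration = sessionize([0, 500, 1000, 1500, 2000])
```

In the first call the long pause before the request at 1500 seconds splits the log; in the second, no single gap is too long, but the total duration limit forces a new session at 2000 seconds.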

