Effective Personalization Based on Association Rule Discovery from Web Usage Data
- In Proceedings of the 3rd ACM Workshop on Web Information and Data Management (WIDM01)
, 2001
Cited by 58 (9 self)
To engage visitors to a Web site at a very early stage (i.e., before registration or authentication), personalization tools must rely primarily on clickstream data captured in Web server logs. The lack of explicit user ratings as well as the sparse nature and the large volume of data in such a setting poses serious challenges to standard collaborative filtering techniques in terms of scalability and performance. Web usage mining techniques such as clustering that rely on offline pattern discovery from user transactions can be used to improve the scalability of collaborative filtering; however, this is often at the cost of reduced recommendation accuracy. In this paper we propose effective and scalable techniques for Web personalization based on association rule discovery from usage data. Through detailed experimental evaluation on real usage data, we show that the proposed methodology can achieve better recommendation effectiveness, while maintaining a computational advantage over direct approaches to collaborative filtering such as the k-nearest-neighbor strategy.
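As a rough illustration of the idea in this abstract, the sketch below mines page-to-page association rules from anonymous sessions and uses them to rank unseen pages for an active session. It is a minimal sketch, not the paper's method: the function names, the thresholds, the toy sessions, and the restriction to single-page antecedents are all assumptions made for brevity.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(sessions, min_support=0.2, min_confidence=0.5):
    """Return {antecedent_page: [(consequent_page, confidence), ...]}.

    Only single-page antecedents are mined, a simplification assumed here;
    it is not the rule-discovery procedure evaluated in the paper.
    """
    n = len(sessions)
    page_count = Counter()
    pair_count = Counter()
    for session in sessions:
        pages = sorted(set(session))  # ignore order and repeated visits
        page_count.update(pages)
        pair_count.update(combinations(pages, 2))

    rules = {}
    for (a, b), count in pair_count.items():
        if count / n < min_support:
            continue  # the pair is too rare to trust
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / page_count[lhs]
            if confidence >= min_confidence:
                rules.setdefault(lhs, []).append((rhs, confidence))
    return rules


def recommend(rules, active_session, top_n=3):
    """Rank unseen pages by the best confidence of any rule that fires."""
    seen = set(active_session)
    scores = {}
    for page in seen:
        for rhs, confidence in rules.get(page, []):
            if rhs not in seen:
                scores[rhs] = max(scores.get(rhs, 0.0), confidence)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [page for page, _ in ranked[:top_n]]


# Hypothetical clickstream sessions, one list of page names per visitor.
SESSIONS = [
    ["home", "products", "cart"],
    ["home", "products"],
    ["home", "about"],
    ["products", "cart"],
]
RULES = mine_pair_rules(SESSIONS)
```

With these toy sessions, `recommend(RULES, ["cart"])` ranks `products` ahead of `home`, because every session containing `cart` also contains `products`. The rules are mined offline and only looked up at recommendation time, which reflects the computational advantage over a nearest-neighbor search that the abstract claims.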
A Data Mining Algorithm for Generalized Web Prefetching
- IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
, 2003
Cited by 51 (16 self)
Predictive Web prefetching refers to the mechanism of deducing the forthcoming page accesses of a client based on its past accesses. In this paper, we present a new context for the interpretation of Web prefetching algorithms as Markov predictors. We identify the factors that affect the performance of Web prefetching algorithms. We propose a new algorithm called WMo, which is based on data mining and is proven to be a generalization of existing ones. It was designed to address their specific limitations and its characteristics include all the above factors. It compares favorably with previously proposed algorithms. Further, the algorithm efficiently addresses the increased number of candidates. We present a detailed performance evaluation of WMo with synthetic and real data. The experimental results show that WMo can provide significant improvements over previously proposed Web prefetching algorithms.
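To make the "Markov predictor" view concrete, here is a minimal sketch of the simplest case: a first-order predictor that proposes prefetch candidates from observed page-to-page transition frequencies. This is not the WMo algorithm from the paper; the function names, the probability threshold, and the sample access history are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def fit_first_order(sessions):
    """Count page-to-page transitions observed in past access sequences."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current_page, next_page in zip(session, session[1:]):
            transitions[current_page][next_page] += 1
    return transitions


def predict_next(transitions, current_page, min_probability=0.3):
    """Return (page, probability) prefetch candidates, most likely first."""
    counts = transitions.get(current_page)
    if not counts:
        return []  # the page was never seen as a predecessor
    total = sum(counts.values())
    candidates = [(page, c / total) for page, c in counts.items()]
    candidates = [(p, prob) for p, prob in candidates if prob >= min_probability]
    return sorted(candidates, key=lambda item: (-item[1], item[0]))


# Hypothetical access history of several clients.
HISTORY = [
    ["index", "news", "sports"],
    ["index", "news", "weather"],
    ["index", "news", "sports"],
    ["index", "mail"],
]
MODEL = fit_first_order(HISTORY)
```

Here `predict_next(MODEL, "index")` returns only `news` with probability 0.75, since `mail` falls below the threshold. Raising `min_probability` trades prefetch coverage for less wasted bandwidth.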
Model-based clustering and visualization of navigation patterns on a web site
- Data Mining and Knowledge Discovery
, 2003
Cited by 36 (0 self)
We present a new methodology for exploring and analyzing navigation patterns on a web site. The patterns that can be analyzed consist of sequences of URL categories traversed by users. In our approach, we first partition site users into clusters such that users with similar navigation paths through the site are placed into the same cluster. Then, for each cluster, we display these paths for users within that cluster. The clustering approach we employ is model-based (as opposed to distance-based) and partitions users according to the order in which they request web pages. In particular, we cluster users by learning a mixture of first-order Markov models using the Expectation-Maximization algorithm. The runtime of our algorithm scales linearly with the number of clusters and with the size of the data, and our implementation easily handles hundreds of thousands of user sessions in memory. In the paper, we describe the details of our method and a visualization tool based on it called WebCANVAS. We illustrate the use of our approach on user-traffic data from msnbc.com. Keywords: Model-based clustering, sequence clustering, data visualization, Internet, web
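The clustering step described above can be sketched as EM for a mixture of first-order Markov chains. The code below is a toy illustration under stated assumptions, not the WebCANVAS implementation: states are small integers standing in for URL categories, `alpha` is an additive smoothing constant, and the deterministic round-robin initialization is a convenience where real code would more likely use random restarts.

```python
import math


def fit_markov_mixture(sessions, n_states, n_clusters, n_iter=20, alpha=0.1):
    """EM for a mixture of first-order Markov chains over integer states.

    Returns (weights, initials, transitions, responsibilities).
    """
    n = len(sessions)
    # Assumed start: hard round-robin assignment of sessions to clusters.
    resp = [[1.0 if k == i % n_clusters else 0.0 for k in range(n_clusters)]
            for i in range(n)]

    for _ in range(n_iter):
        # M-step: responsibility-weighted counts, smoothed and normalized.
        weights, initials, transitions = [], [], []
        for k in range(n_clusters):
            mass = sum(resp[i][k] for i in range(n))
            weights.append((mass + alpha) / (n + alpha * n_clusters))
            init = [alpha] * n_states
            trans = [[alpha] * n_states for _ in range(n_states)]
            for i, seq in enumerate(sessions):
                init[seq[0]] += resp[i][k]
                for a, b in zip(seq, seq[1:]):
                    trans[a][b] += resp[i][k]
            initials.append([v / sum(init) for v in init])
            transitions.append([[v / sum(row) for v in row] for row in trans])

        # E-step: posterior cluster membership of each session (log space).
        for i, seq in enumerate(sessions):
            logp = []
            for k in range(n_clusters):
                lp = math.log(weights[k]) + math.log(initials[k][seq[0]])
                for a, b in zip(seq, seq[1:]):
                    lp += math.log(transitions[k][a][b])
                logp.append(lp)
            top = max(logp)
            unnorm = [math.exp(lp - top) for lp in logp]
            total = sum(unnorm)
            resp[i] = [u / total for u in unnorm]

    return weights, initials, transitions, resp


# Two hypothetical navigation styles over two URL categories (0 and 1).
ALTERNATING = [0, 1, 0, 1, 0, 1]
STICKY = [0, 0, 0, 1, 1, 1]
SESSIONS = [ALTERNATING, ALTERNATING, STICKY, STICKY, ALTERNATING, STICKY]
WEIGHTS, INITIALS, TRANSITIONS, RESP = fit_markov_mixture(
    SESSIONS, n_states=2, n_clusters=2)
LABELS = [row.index(max(row)) for row in RESP]
```

Both session types visit the same categories equally often, so only the order of requests separates them, which is the property the abstract emphasizes. Each iteration touches every transition once per cluster, so the cost per iteration grows linearly with the number of clusters and with the total session length.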
Using Ontologies to Discover Domain-Level Web Usage Profiles
, 2002
Cited by 30 (7 self)
Usage patterns discovered through Web usage mining are effective in capturing item-to-item and user-to-user relationships and similarities at the level of user sessions. Without the benefit of deeper domain knowledge, such patterns provide little insight into the underlying reasons for which such items or users are grouped together. This can lead to a number of important shortcomings in personalization systems based on Web usage mining or collaborative filtering. For example, if a new item is recently added to the Web site, it is not likely that the pages associated with the item would be a part of any of the discovered patterns, and thus these pages cannot be recommended. Keyword-based content-filtering approaches have been used to enhance the effectiveness of collaborative filtering systems by focusing on content similarity among items or pages. These approaches, however, are incapable of capturing more complex relationships at a deeper semantic level based on different types of attributes associated with structured objects. This paper represents work-in-progress towards creating a general framework for using domain ontologies to automatically characterize usage profiles containing a set of structured Web objects. Our motivation is to use this framework in the context of Web personalization, going beyond page- or item-level constructs, and using the full semantic power of the underlying ontology.
A Hybrid Web Personalization Model Based on Site Connectivity
- In Proc. of WebKDD
, 2003
Cited by 24 (1 self)
Web usage mining has been used effectively as an underlying mechanism for Web personalization and recommender systems. A variety of recommendation frameworks have been proposed, including some based on non-sequential models, such as association rules and clusters, and some based on sequential models, such as sequential or navigational patterns. Our recent studies have suggested that the structural characteristics of Web sites, such as the site topology and the degree of connectivity, have a significant impact on the relative performance of recommendation models based on association rules and on contiguous and non-contiguous sequential patterns. In this paper, we present a framework for a hybrid Web personalization system that can intelligently switch among different recommendation models, based on the degree of connectivity and the current location of the user within the site. We have conducted a detailed evaluation based on real Web usage data from three sites with different structural characteristics. Our results show that the hybrid system selects less constrained models such as frequent itemsets when the user is navigating portions of the site with a higher degree of connectivity, while sequential recommendation models are chosen for deeper navigational depths and lower degrees of connectivity. The comparative evaluation also indicates that the overall performance of the hybrid system in terms of precision and coverage is better than that of recommendation systems based on any of the individual models.
Evaluation of Techniques for Classifying Biological Sequences
, 2001
Cited by 19 (0 self)
In recent years we have witnessed an exponential increase in the amount of biological information, either DNA or protein sequences, that has become available in public databases. This has been followed by an increased interest in developing computational techniques to automatically classify these large volumes of sequence data into various categories corresponding to their role in the chromosomes, their structure, and/or their function. In this paper we evaluate some of the widely used sequence classification algorithms and develop a framework for modeling sequences in such a way that traditional machine learning algorithms, such as support vector machines, can be applied easily.
Analysis of Topic Dynamics in Web Search
Cited by 16 (4 self)
We report on a study of topic dynamics for pages visited by a sample of people using MSN Search. We examine the predictive accuracies of probabilistic models of topic transitions for individuals and groups of users. We explore temporal dynamics by comparing the accuracy of the models for predicting topic transitions at increasingly distant times in the future. Finally, we discuss directions for applying models of search topic dynamics.
Effective Prediction of Web-user Accesses: A Data Mining Approach
, 2001
Cited by 16 (1 self)
The problem of predicting web-user accesses has recently attracted significant attention. Several algorithms have been proposed, which find important applications, like user profiling, recommender systems, web prefetching, design of adaptive web sites, etc. In all these applications the core issue is the development of an effective prediction algorithm. In this paper, we focus on web prefetching, because of its importance in reducing the user-perceived latency present in every Web-based application. The proposed method can be easily extended to the other aforementioned applications.
Context-Sensitive Modeling of Web-Surfing Behaviour using Concept Trees
- In Proceedings of the 5th WEBKDD Workshop
, 2003
Cited by 14 (0 self)
Early approaches to mathematically abstracting web-surfing behavior were largely based on first-order Markov models. Most humans, however, do not surf in a "memoryless" fashion; rather, they are guided by their time-dependent situational context and associated information needs. This belief is corroborated by the non-exponential revisit times observed in many site-centric weblogs. In this paper, we propose a general framework for modeling users whose surfing behavior is dynamically governed by their current topic of interest. This allows a modeled surfer to behave differently on the same page, depending on his situational context. The proposed methodology involves mapping each visited page to a topic or concept, (conceptually) imposing a tree hierarchy on these topics, and then estimating the parameters of a semi-Markov process defined on this tree based on the observed transitions among the underlying visited pages. The semi-Markovian assumption imparts additional flexibility by allowing for non-exponential state revisit times, and the concept hierarchy provides a natural way of capturing context and user intent. Our approach is computationally much less demanding than the alternative approach of using higher-order Markov models for capturing history-sensitive surfing behavior. Several practical applications are described. The application of better predicting which outlink a surfer may take is illustrated using web-log data from a rich community portal, www.sulekha.com, as an example, though the focus of the paper is on forming a plausible generative model rather than solving any specific task.
Data Mining for Web Personalization
- The Adaptive Web: Methods and Strategies of Web Personalization. Lecture
, 2006
Cited by 13 (0 self)
Abstract. In this chapter we present an overview of the Web personalization process viewed as an application of data mining requiring support for all the phases of a typical data mining cycle. These phases include data collection and preprocessing, pattern discovery and evaluation, and finally applying the discovered knowledge in real-time to mediate between the user and the Web. This view of the personalization process provides added flexibility in leveraging multiple data sources and in effectively using the discovered models in an automatic personalization system. The chapter provides a detailed discussion of a host of activities and techniques used at different stages of this cycle, including the preprocessing and integration of data from multiple sources, as well as pattern discovery techniques that are typically applied to this data. We consider a number of classes of data mining algorithms used particularly for Web personalization, including techniques based on clustering, association rule discovery, sequential pattern mining, Markov models, and probabilistic mixture and hidden (latent) variable models. Finally, we discuss hybrid data mining frameworks that leverage data from a variety

