CubeSVD: A Novel Approach to Personalized Web Search
- In Proc. of the 14th International World Wide Web Conference (WWW
, 2005
Cited by 47 (3 self)
As competition in the Web search market increases, there is a high demand for personalized Web search that conducts retrieval incorporating Web users' information needs. This paper focuses on utilizing clickthrough data to improve Web search. Since millions of searches are conducted every day, a search engine accumulates a large volume of clickthrough data, which records who submits queries and which pages he/she clicks on. The clickthrough data is highly sparse and contains different types of objects (user, query and Web page), and the relationships among these objects are also very complicated. By performing analysis on these data, we attempt to discover Web users' interests and the patterns by which users locate information. In this paper, a novel approach, CubeSVD, is proposed to improve Web search. The clickthrough data is represented by a 3-order tensor, on which we perform 3-mode analysis using the higher-order singular value decomposition technique to automatically capture the latent factors that govern the relations among these multi-type objects: users, queries and Web pages. A tensor reconstructed based on the CubeSVD analysis reflects both the observed interactions among these objects and the implicit associations among them. Therefore, Web search activities can be carried out based on CubeSVD analysis. Experimental evaluations using a real-world data set collected from an MSN search engine show that CubeSVD achieves encouraging search results in comparison with some standard methods.
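The 3-mode analysis described in this abstract can be illustrated with a truncated higher-order SVD. The following is a minimal sketch assuming NumPy; the toy `clicks` tensor, the function names, and the chosen ranks are hypothetical, and the paper's own preprocessing, weighting, and rank selection are not reproduced here.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize a tensor: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` of shape (J, I_mode)."""
    moved = np.moveaxis(tensor, mode, -1)   # shape (..., I_mode)
    result = moved @ matrix.T               # shape (..., J)
    return np.moveaxis(result, -1, mode)

def truncated_hosvd(tensor, ranks):
    """Return (core, factors, approximation) for the given per-mode ranks."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Left singular vectors of each unfolding span that mode's subspace.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)      # project onto the subspaces
    approx = core
    for mode, u in enumerate(factors):
        approx = mode_product(approx, u, mode)    # map back to original size
    return core, factors, approx

# Hypothetical toy data: clicks[user, query, page] = number of clicks.
clicks = np.zeros((3, 3, 4))
clicks[0, 0, 0] = 2
clicks[0, 1, 1] = 1
clicks[1, 0, 0] = 1
clicks[1, 2, 3] = 3
clicks[2, 1, 1] = 2
clicks[2, 1, 2] = 1

core, factors, approx = truncated_hosvd(clicks, ranks=(2, 2, 2))
```

In this sketch, entries of `approx` that are nonzero where `clicks` is zero play the role of the implicit associations mentioned in the abstract; how they would be ranked or thresholded is left open.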
Mining Temporally Evolving Graphs
, 2004
Cited by 15 (0 self)
Web mining has been explored to a vast degree and different techniques have been proposed for a variety of applications that include Web Search, Web Classification, Web Personalization etc. Most research on Web mining has been from a ‘data-centric’ point of view. The focus has been primarily on developing measures and applications based on data collected from the content, structure and usage of the Web up to a particular time instance. In this project we examine another dimension of Web Mining, namely the temporal dimension. Web data has been evolving over time, reflecting the ongoing trends. These changes in data in the temporal dimension reveal new kinds of information. This information has not captured the attention of the Web mining research community to a large extent. In this paper, we highlight the significance of studying the evolving nature of Web graphs. We have classified the approach to such problems at three levels of analysis: single node, sub-graphs and whole graphs. We provide a framework to approach problems of this kind and have identified interesting problems at each level. Our experiments verify the significance of such analysis and also point to future directions in this area. The approach we take is generic and can be applied to other domains where data can be modeled as a graph, such as network intrusion detection or social networks.
Data Mining for Web Personalization
- The Adaptive Web: Methods and Strategies of Web Personalization. Lecture
, 2006
Cited by 13 (0 self)
Abstract. In this chapter we present an overview of the Web personalization process viewed as an application of data mining requiring support for all the phases of a typical data mining cycle. These phases include data collection and preprocessing, pattern discovery and evaluation, and finally applying the discovered knowledge in real-time to mediate between the user and the Web. This view of the personalization process provides added flexibility in leveraging multiple data sources and in effectively using the discovered models in an automatic personalization system. The chapter provides a detailed discussion of a host of activities and techniques used at different stages of this cycle, including the preprocessing and integration of data from multiple sources, as well as pattern discovery techniques that are typically applied to this data. We consider a number of classes of data mining algorithms used particularly for Web personalization, including techniques based on clustering, association rule discovery, sequential pattern mining, Markov models, and probabilistic mixture and hidden (latent) variable models. Finally, we discuss hybrid data mining frameworks that leverage data from a variety
Robustness of Collaborative Recommendation Based On Association Rule Mining
Cited by 11 (1 self)
Standard memory-based collaborative filtering algorithms, such as k-nearest neighbor, are quite vulnerable to profile injection attacks. Previous work has shown that some model-based techniques are more robust than k-nn. Model abstraction can inhibit certain aspects of an attack, providing an algorithmic approach to minimizing attack effectiveness. In this paper, we examine the robustness of a recommendation algorithm based on the data mining technique of association rule mining. Our results show that the Apriori algorithm offers a large improvement in stability and robustness compared to k-nearest neighbor and other model-based techniques we have studied. Furthermore, our results show that Apriori can achieve recommendation accuracy comparable to k-nn.
Using Probabilistic Latent Semantic Analysis for Personalized Web Search
- Springer-Verlag Berlin Heidelberg, LNCS
Cited by 5 (0 self)
Abstract. Web users use search engines to find useful information on the Internet. However, current web search engines return answers to a query independent of the specific user's information need. Since web users with similar web behaviors tend to acquire similar information when they submit the same query, these unseen factors can be used to improve search results. In this paper we present an approach that mines these unseen factors from web logs to personalize web search. Our approach is based on probabilistic latent semantic analysis, a model-based technique that is used to analyze co-occurrence data. Experimental results on real data collected by the MSN search engine show improvements over traditional web search.
Task-oriented web user modeling for recommendation
- In: Proceedings of the 10th International Conference on User Modeling (UM’05
, 2005
Cited by 5 (0 self)
Abstract. We propose an approach for modeling the navigational behavior of Web users based on task-level patterns. The discovered “tasks” are characterized probabilistically as latent variables, and represent the underlying interests or intended navigational goals of users. The ability to measure the probabilities by which pages in user sessions are associated with various tasks allows us to track task transitions and modality shifts within (or across) user sessions, and to generate task-level navigational patterns. We also propose a maximum entropy recommendation system which combines the page-level statistics about users’ navigational activities with our task-level usage patterns. Our experiments show that the task-level patterns provide better interpretability of Web users’ navigation, and improve the accuracy of recommendations.
Semantic Web Mining and the Representation, Analysis, and Evolution of Web Space
- IN PROCEEDINGS OF RAWS 2005 WORKSHOP
, 2005
Cited by 3 (0 self)
Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. This survey analyzes the convergence of trends from both areas: Growing numbers of researchers work on improving the results of Web Mining by exploiting semantic structures in the Web, and they use Web Mining techniques for building the Semantic Web. Last but not least, these techniques can be used for mining the Semantic Web itself. The second aim of this paper is to use these concepts to circumscribe what Web space is, what it represents and how it can be represented and analyzed. This is used to sketch the role that Semantic Web Mining and the software agents and human agents involved in it can play in the evolution of Web space.
Frequent Pattern Mining in Web Log Data
Cited by 3 (0 self)
Abstract: Frequent pattern mining is a heavily researched area in the field of data mining with a wide range of applications. One of them is the use of frequent pattern discovery methods on Web log data. Discovering hidden information from Web log data is called Web usage mining. The aim of discovering frequent patterns in Web log data is to obtain information about the navigational behavior of the users. This can be used for advertising purposes, for creating dynamic user profiles etc. In this paper three pattern mining approaches are investigated from the Web usage mining point of view. The different patterns in Web log mining are page sets, page sequences and page graphs.
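As an illustration of the simplest of the three pattern types, page sets, the sketch below counts how many sessions contain each set of pages and keeps those that meet a support threshold. It is a brute-force illustration, not the algorithm investigated in the paper; the session data, the function name `frequent_page_sets`, and the `min_support` and `max_size` parameters are hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_page_sets(sessions, min_support, max_size=3):
    """Return {page_set: session_count} for sets seen in >= min_support sessions."""
    counts = Counter()
    for session in sessions:
        # Page sets ignore visit order and repeated visits within a session.
        pages = sorted(set(session))
        for size in range(1, max_size + 1):
            for combo in combinations(pages, size):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}

# Hypothetical sessions extracted from a Web server log.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products"],
    ["/home", "/about"],
    ["/products", "/cart", "/home"],
]

frequent = frequent_page_sets(sessions, min_support=2)
```

Enumerating every subset is only practical for short sessions; an Apriori-style search would prune candidates whose subsets are already infrequent. Page sequences and page graphs would additionally need the visit order and link structure that this sketch discards.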
A mixture model for expert finding
- Proc. of PAKDD’2008
Cited by 2 (1 self)
Abstract. This paper addresses the issue of identifying persons with expert knowledge on a given topic. Traditional methods usually estimate the relevance between the query and the support documents of candidate experts using, for example, a language model. However, the language model lacks the ability to identify semantic knowledge, so some relevant experts cannot be found because the query terms do not occur in their support documents. In this paper, we propose a mixture model based on Probabilistic Latent Semantic Analysis (PLSA) to estimate a hidden semantic theme layer between the terms and the support documents. The hidden themes are used to capture the semantic relevance between the query and the experts. We evaluate our mixture model in a real-world system, ArnetMiner. Experimental results indicate that the proposed model outperforms the language models.
Different Aspects of Web Log Mining
- 6th International Symposium of Hungarian Researchers on Computational Intelligence
, 2005
Cited by 2 (0 self)
Abstract: The expansion of the World Wide Web has resulted in a large amount of data that is now freely available for user access. The data have to be managed and organized in such a way that the user can access them efficiently. For this reason the application of data mining techniques on the Web is now the focus of an increasing number of researchers. One key issue is the investigation of user navigational behavior from different aspects. For this reason different types of data mining techniques can be applied to the log files collected on the servers. In this paper three of the most important approaches to web log mining are introduced. All three methods are based on the frequent pattern mining approach. The three types of patterns that can be used to obtain useful information about the navigational behavior of the users are page sets, page sequences and page graphs.

