Results 1 - 10 of 11
Data Mining for Web Personalization
- The Adaptive Web: Methods and Strategies of Web Personalization. Lecture
, 2006
Cited by 13 (0 self)
In this chapter we present an overview of the Web personalization process, viewed as an application of data mining that requires support for all the phases of a typical data mining cycle. These phases include data collection and preprocessing, pattern discovery and evaluation, and finally applying the discovered knowledge in real time to mediate between the user and the Web. This view of the personalization process provides added flexibility in leveraging multiple data sources and in effectively using the discovered models in an automatic personalization system. The chapter provides a detailed discussion of the activities and techniques used at different stages of this cycle, including the preprocessing and integration of data from multiple sources, as well as pattern discovery techniques that are typically applied to this data. We consider a number of classes of data mining algorithms used particularly for Web personalization, including techniques based on clustering, association rule discovery, sequential pattern mining, Markov models, and probabilistic mixture and hidden (latent) variable models. Finally, we discuss hybrid data mining frameworks that leverage data from a variety
Blogrank: ranking weblogs based on connectivity and similarity features
- In AAA-IDEA ’06: Proceedings of the 2nd international workshop on Advanced architectures and
, 2006
Cited by 9 (1 self)
A large part of the hidden web resides in weblog servers. New content is produced on a daily basis, and the work of traditional search engines turns out to be insufficient due to the nature of weblogs. This work summarizes the structure of the blogosphere and highlights the special features of weblogs. In this paper we present a method for ranking weblogs based on the link graph and on several similarity characteristics between weblogs. First, we create an enhanced graph of connected weblogs and add new types of edges and weights utilising many weblog features. Then, we assign a ranking to each weblog using our algorithm, BlogRank, which is a modified version of PageRank. To validate our method, we run experiments on a weblog dataset, which we process and adapt to our search engine.
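As a rough illustration of the idea in this abstract (ranking over a link graph enriched with similarity edges), the sketch below runs a weighted PageRank by power iteration. It is not the authors' BlogRank algorithm: the function names, the `similarity_weight` parameter, the damping factor, and the toy data are all assumptions made for the example.

```python
def combine_edges(hyperlinks, similarity, similarity_weight=0.5):
    """Merge hyperlink edges (weight 1.0 each) with similarity edges.

    `similarity` maps (src, dst) to a score in [0, 1]; the scaling by
    `similarity_weight` is a hypothetical choice, not taken from the paper.
    """
    weights = {}
    for edge in hyperlinks:
        weights[edge] = weights.get(edge, 0.0) + 1.0
    for edge, score in similarity.items():
        weights[edge] = weights.get(edge, 0.0) + similarity_weight * score
    return weights


def weighted_pagerank(edge_weights, damping=0.85, iterations=50):
    """Power-iteration PageRank where rank flows in proportion to edge weight."""
    if any(w < 0 for w in edge_weights.values()):
        raise ValueError("edge weights must be non-negative")
    nodes = {node for edge in edge_weights for node in edge}
    n = len(nodes)
    # Total outgoing weight per node, used to normalise each node's out-flow.
    out_total = {node: 0.0 for node in nodes}
    for (src, _dst), w in edge_weights.items():
        out_total[src] += w
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Nodes without outgoing weight spread their rank evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if out_total[node] == 0.0)
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for (src, dst), w in edge_weights.items():
            if w > 0:
                new_rank[dst] += damping * rank[src] * w / out_total[src]
        rank = new_rank
    return rank


# Toy data (hypothetical): three weblogs linked in a cycle, plus a
# symmetric similarity edge between "a" and "c".
hyperlinks = [("a", "b"), ("b", "c"), ("c", "a")]
similarity = {("a", "c"): 0.8, ("c", "a"): 0.8}
ranks = weighted_pagerank(combine_edges(hyperlinks, similarity))
```

In this toy graph the similarity edge diverts part of the rank leaving "a" toward "c", so "c" ends up ranked above "b" even though each has one incoming hyperlink.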
Web path recommendations based on page ranking and markov models
- In WIDM ’05: Proceedings of the 7th annual ACM international workshop on Web information and data management, 2–9
, 2005
Cited by 6 (0 self)
Markov models have been widely used for modelling users' navigational behaviour in the Web graph, using the transition probabilities between web pages, as recorded in the web logs. The recorded users' navigation is used to extract popular web paths and predict current users' next steps. Such purely usage-based probabilistic models, however, present certain shortcomings. Since the prediction of users' navigational behaviour is based solely on the usage data, structural properties of the Web graph are ignored. Thus, paths that are important in terms of PageRank authority score may be underrated. In this paper we present a hybrid probabilistic predictive model extending the properties of Markov models by incorporating link analysis methods. More specifically, we propose the use of a PageRank-style algorithm for assigning prior probabilities to the web pages based on their importance in the web site's graph. We prove, through experimentation, that this approach results in more objective and representative predictions than the ones produced by the pure usage-based approaches.
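One possible reading of this hybrid model can be sketched as follows: a first-order Markov chain is estimated from logged sessions, and the probability of a path is seeded with a PageRank score of its first page instead of a usage-based start probability. This is a minimal sketch under that assumption, not the paper's actual method; the function names, the damping factor, and the toy site graph and sessions are hypothetical.

```python
from collections import defaultdict


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = set(links) | {q for targets in links.values() for q in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
        rank = new_rank
    return rank


def transition_probabilities(sessions):
    """Estimate first-order transition probabilities from page sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: c / sum(dsts.values()) for dst, c in dsts.items()}
        for src, dsts in counts.items()
    }


def path_probability(path, prior, transitions):
    """Prior of the first page times the product of observed transitions."""
    prob = prior.get(path[0], 0.0)
    for src, dst in zip(path, path[1:]):
        prob *= transitions.get(src, {}).get(dst, 0.0)
    return prob


# Toy site structure and usage sessions (hypothetical data).
site_links = {"home": ["products", "about"], "products": ["home"], "about": ["home"]}
sessions = [["home", "products"], ["home", "products", "home"], ["home", "about"]]

prior = pagerank(site_links)
transitions = transition_probabilities(sessions)
score = path_probability(["home", "products"], prior, transitions)
```

Here the structural prior comes only from the site graph and the transitions only from usage data, so a page that is well linked but rarely used as an entry point in the logs still receives a non-trivial starting probability.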
An algebra for specifying valid compound terms in faceted taxonomies
- Data & Knowledge Engineering
Cited by 5 (4 self)
A faceted taxonomy is a set of taxonomies, each describing a given knowledge domain from a different aspect. The indexing of the domain objects is done using compound terms, i.e. conjunctive combinations of terms from the taxonomies. A faceted taxonomy has several advantages over a single taxonomy, including conceptual clarity, compactness, and scalability. A drawback, however, is the cost of identifying compound terms that are invalid, i.e. terms that do not apply to any object of the domain. This need arises both in indexing and retrieval, and involves considerable human effort for specifying the valid compound terms one by one. In this paper, we propose and present in detail an algebra which can be used to specify the set of valid compound terms in an efficient and flexible manner. It works on the basis of the original simple terms of the facets and a small set of positive and/or negative statements. In each algebraic operation, we adopt a closed-world assumption with respect to the declared positive or negative statements. In this paper we elaborate on the properties of the algebraic operators and we describe application and methodological issues.
The semantics of frequent subgraphs: Mining and navigation pattern analysis
- In Proc. of SIGKDD Explorations WebKDD 2005: KDD Workshop on Web Mining and Web Usage Analysis, in conjunction with the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2005)
, 2005
Cited by 3 (0 self)
The search for frequent subgraphs is a useful extension of common approaches in Web mining. For example, it allows the study of revisitation patterns in Web usage and the discovery of richer navigation structures such as "landmarks" or "hubs" that serve to organize a user's conceptual map of a site or a part of the Web. Any use of graph structures in Web usage mining, however, should also take into account that it is essential to integrate background knowledge into the analysis, and that behaviour must be studied at different levels of abstraction. To capture these needs, we propose to use taxonomies in mining and to extend the standard notions of interestingness frequency/support by the notion of context-induced interestingness. The AP-IP mining problem then consists of finding all frequent abstract patterns and the individual patterns that constitute them and are therefore interesting in this context (even though they may be infrequent). The paper presents the AP-IP algorithm, which uses a taxonomy to search for the abstract and individual patterns. We also show that the search for label-abstracted but isomorphic subgraphs does not always give an accurate image of navigation strategies, and we develop a procedure for mining at the concept level to solve this problem. A case study of a real-life Web site shows the advantages of the proposed solutions.
Using and Learning Semantics in Frequent Subgraph Mining
, 2005
Cited by 2 (1 self)
The search for frequent subgraphs is becoming increasingly important in many application areas including Web mining and bioinformatics.
SEWeP: A Web Mining System supporting Semantic
- in Proc. of the ECML/PKDD 2004 Conference
, 2004
Cited by 2 (1 self)
We present SEWeP, a Web Personalization prototype system that integrates usage data with content semantics, expressed in taxonomy terms, in order to produce a broader yet semantically focused set of recommendations.
An Approach for Identification of User’s Intentions during the Navigation in Semantic Websites
- In Proceedings of the 4th European Semantic Web Conference
, 2007
Cited by 1 (0 self)
The growing need for content customization in websites has fostered the development of systems that try to identify the user's navigation patterns. These can normally be identified by means of log file analysis. However, this solution does not identify the semantic intention behind the user's navigation. This paper provides an approach to incorporating semantic knowledge into the process of identifying the user's intentions in the navigation of a website with semantic support. The capture of the user's intentions is achieved by the semantic enrichment of the log files and the use of an approach that takes into account the linguistic and cognitive aspects in the development of the user model.
An Integrated Technique for Web Site Usage Semantic Analysis: The ORGAN System
- © Rinton Press
, 2007
In this work, a new log analysis system is proposed and implemented, called ORGAN (Ontology-oRiented usaGe ANalysis system). ORGAN aims to enhance and ease log analysis by using semantic knowledge. It offers typical statistical analysis of Web usage logs while also taking the site's underlying semantics into consideration. We evaluated ORGAN using Web site data for different cases to verify and exhibit its promising behavior. The experimental outcomes were encouraging, and valuable conclusions were reached about the usage of the Web site under analysis. Consequently, we believe, and show through examples, that ORGAN could become a useful tool for Web log analysts and assist Web site managers in decision-making for reorganization tasks. Finally, we discuss open problems to motivate further research efforts towards the incorporation of semantic Web technologies into Web site log mining analysis.
Trends in Web Mining for Personalization
Web mining has matured as a field of basic and applied research in computer science in general and e-commerce in particular. In the last decade, the WWW has emerged as an all-encompassing technology that has revolutionized the way people live. Web browsing has become an integral part of their lifestyle. This need has led to the rise of a range of technologies that overcome the information overload problem on the web, an effort termed personalization of the web. In this paper, we present a survey that gives a precise and comprehensive understanding of work done in the last five years in the field of personalization. The paper reviews how each of the approaches caters to user needs and gives examples of projects associated with some of these techniques. The paper also mentions a few issues for further research in this domain, based on the survey. Finally, the paper concludes by citing a promising future for this area of research.

