Mining Temporally Evolving Graphs
, 2004
Cited by 15 (0 self)
Web mining has been explored extensively, and different techniques have been proposed for a variety of applications, including Web search, Web classification and Web personalization. Most research on Web mining has taken a ‘data-centric’ point of view. The focus has been primarily on developing measures and applications based on data collected from the content, structure and usage of the Web up to a particular time instance. In this project we examine another dimension of Web mining, namely the temporal dimension. Web data has been evolving over time, reflecting ongoing trends. These changes in the data along the temporal dimension reveal a new kind of information, which has so far received little attention from the Web mining research community. In this paper, we highlight the significance of studying the evolving nature of Web graphs. We classify approaches to such problems at three levels of analysis: single nodes, sub-graphs and whole graphs. We provide a framework for approaching problems of this kind and identify interesting problems at each level. Our experiments verify the significance of such analysis and also point to future directions in this area. The approach we take is generic and can be applied to other domains where data can be modeled as a graph, such as network intrusion detection or social networks.
Web usage mining: extracting unexpected periods from web logs
- DATA MINING AND KNOWLEDGE DISCOVERY
, 2008
Cited by 12 (3 self)
Existing Web Usage Mining techniques are currently based either on an arbitrary division of the data (e.g. "one log per month") or on presumed results (e.g. "what is the customers' behaviour for the period of Christmas purchases?"). Those approaches have two main drawbacks. First, they depend on this arbitrary organization of the data. Second, they cannot automatically extract seasonal peaks from the stored data. In this paper, we propose to perform a specific data mining process (in particular, extracting frequent behaviours) in order to automatically discover the densest periods. Our method extracts, among the whole set of possible combinations, the frequent sequential patterns related to the extracted periods. A period is considered dense if it contains at least one frequent sequential pattern for the set of users connected to the Web site in that period. Our experiments show that the extracted periods are relevant and that our approach is able to extract both frequent sequential patterns and the associated dense periods.
Using Markov Models for Web Site Link Prediction
, 2002
Cited by 8 (1 self)
Markov models have been extensively used to model Web users' navigation behaviors on Web sites. The link structure of a Web site can be seen as a citation network. By applying bibliographic co-citation and coupling analysis to a Markov model constructed from a Web log file on a Web site, we propose a clustering algorithm called CitationCluster to cluster conceptually related pages. The clustering results are used to construct a conceptual hierarchy of the Web site. Markov model based link prediction is integrated with the hierarchy to assist users' navigation on the Web site.
Web path recommendations based on page ranking and markov models
- In WIDM ’05: Proceedings of the 7th annual ACM international workshop on Web information and data management, 2–9
, 2005
Cited by 6 (0 self)
Markov models have been widely used for modelling users' navigational behaviour in the Web graph, using the transition probabilities between web pages as recorded in the web logs. The recorded users' navigation is used to extract popular web paths and predict current users' next steps. Such purely usage-based probabilistic models, however, present certain shortcomings. Since the prediction of users' navigational behaviour is based solely on the usage data, structural properties of the Web graph are ignored. Thus paths that are important in terms of PageRank authority score may be underrated. In this paper we present a hybrid probabilistic predictive model extending the properties of Markov models by incorporating link analysis methods. More specifically, we propose the use of a PageRank-style algorithm for assigning prior probabilities to the web pages based on their importance in the web site's graph. We prove, through experimentation, that this approach results in more objective and representative predictions than the ones produced by the pure usage-based approaches.
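As an illustration of the kind of PageRank-style score the abstract above mentions, here is a minimal sketch of plain PageRank by power iteration over a site's link graph. It is not the authors' algorithm: the function name `pagerank`, the damping factor of 0.85, the iteration count and the toy site are all assumptions made for this example, and the abstract does not say how the resulting scores are combined with the Markov model.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Returns a dict of scores that sum to 1. All parameter values
    here are illustrative assumptions, not taken from the paper.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the same "teleport" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's damped rank evenly among its out-links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page without out-links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical three-page site: both inner pages link back to "home".
site = {"home": ["news", "about"], "news": ["home"], "about": ["home"]}
scores = pagerank(site)
print(max(scores, key=scores.get))
```

Scores of this kind could, for instance, serve as prior probabilities over pages before any usage data is taken into account; whether the paper uses them in exactly that way is not stated in the abstract.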
Role of Weak Ties in Link Prediction of Complex Networks
Cited by 3 (0 self)
Plenty of algorithms for link prediction have been proposed and applied to various real networks. Among these works, the weights of links are rarely taken into account. In this paper, we use local similarity indices to estimate the likelihood of the existence of links in weighted networks, including the Common Neighbor, Adamic-Adar and Resource Allocation indices and their weighted versions. In both the unweighted and weighted cases, the Resource Allocation index performs best. To our surprise, the weighted indices perform worse, which reminds us of the well-known Weak Tie Theory. Further experimental study shows that weak ties play a significant role in the link prediction problem, and that emphasizing the contribution of weak ties can remarkably enhance prediction accuracy.
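The three unweighted indices named in the abstract above have standard definitions, which the following sketch computes for a pair of nodes: Common Neighbor counts the shared neighbours, Adamic-Adar sums 1/log(degree) over them, and Resource Allocation sums 1/degree over them. The helper names and the toy graph are assumptions made for this example; the weighted variants studied in the paper are not reproduced here because the abstract does not define them.

```python
import math


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, x, y):
    """Number of neighbours shared by x and y."""
    return len(adj[x] & adj[y])


def adamic_adar(adj, x, y):
    """Sum of 1 / log(degree) over the shared neighbours of x and y."""
    # A shared neighbour of two distinct nodes has degree >= 2, so log > 0.
    return sum(1.0 / math.log(len(adj[z])) for z in adj[x] & adj[y])


def resource_allocation(adj, x, y):
    """Sum of 1 / degree over the shared neighbours of x and y."""
    return sum(1.0 / len(adj[z]) for z in adj[x] & adj[y])


# Hypothetical toy graph: "a" and "b" share neighbours "c" (degree 2)
# and "d" (degree 3).
edges = [("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("d", "e")]
adj = build_adjacency(edges)
print(common_neighbors(adj, "a", "b"))
print(adamic_adar(adj, "a", "b"))
print(resource_allocation(adj, "a", "b"))
```

A higher score for an unconnected pair is read as a higher likelihood that a link between the two nodes exists or will appear.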
Hyperlink Analysis: Techniques and Applications
, 2002
Cited by 2 (2 self)
Web Usage Mining: users' navigational patterns extraction from web logs using Ant-based Clustering Method
Web Usage Mining is the process of applying data mining techniques to the discovery of usage patterns from data extracted from Web log files. It mines the secondary data (web logs) derived from the users' interaction with the web pages during a certain period of Web sessions. Web usage mining consists of three phases, namely preprocessing, pattern discovery, and pattern analysis. In this paper, web logs of our university web server logs
Referrer Graph: a low-cost web prediction algorithm
This paper presents the Referrer Graph (RG) web prediction algorithm as a low-cost solution for predicting the next web user accesses. RG is aimed at being used in a real web system with prefetching capabilities without degrading its performance. The algorithm learns from user accesses and builds a Markov model. Algorithms of this kind use the sequence of user accesses to make predictions. Unlike previous Markov model based proposals, the RG algorithm differentiates dependencies between objects of the same page from dependencies between objects of different pages by using the object URI and referrer in each request. This permits us to build a simple data structure that is easier to handle and, consequently, has a lower computational cost than other algorithms. The RG algorithm has been evaluated and compared with the best prediction algorithms proposed in the open literature, and the results show that it achieves similar precision values and page latency savings while requiring far fewer computational and memory resources.
Web User Categorization and Behavior Study Based on Refreshing
- Ratnesh Kumar Jain
, 2009
As the information available on the World Wide Web grows, the usage of web sites is also growing. Since each access to a web page is recorded in the web logs, these logs are becoming a huge data repository which, when mined properly, can provide useful information for decision making. Web site designers, analysts and management executives are interested in extracting this hidden information from web logs. In this research paper we propose a method to categorize users, page by page, as Faithful, Partially Impatient or Completely Impatient, so that the study of user behavior becomes easier. To categorize users, we propose adding one new piece of information to the web log that represents each instance of refreshing. We use a Markov chain model in which clicking the Refresh button is treated as another state, the Refresh State. We derive some theorems to study each type of user behavior and show how the behavior of these user types differs.
Using Association Rules and Markov Model for Predict Next Access on Web Usage Mining
Predicting the next request of a user as they visit Web pages has gained importance as Web-based activity increases. A large amount of research has been done on trying to predict correctly the pages a user will request. This task requires the development of models that can predict a user's next request to a web server. In this paper, we propose a method for constructing first-order and second-order Markov models for Web site access prediction based on past visitor behavior, and compare it with the association rules technique. In these approaches, sequences of user requests are collected by the session identification technique, which distinguishes the requests for the same web page in different browsing sessions. We report experimental studies on a real server log that compare the methods and show their degree of precision. Index Terms: Markov Model, Association Rules, Prediction
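To make the first-order and second-order Markov models mentioned in the abstract above concrete, the following minimal sketch estimates transition probabilities from a list of sessions and predicts the most likely next page. It is a generic order-k construction, not the authors' implementation: the function names `fit_markov` and `predict_next`, the session format (a list of page names per session) and the toy sessions are assumptions made for this example.

```python
from collections import Counter, defaultdict


def fit_markov(sessions, order=1):
    """Estimate an order-`order` Markov model from page sequences.

    Returns a dict mapping each state (a tuple of the last `order`
    pages) to a dict of next-page probabilities.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        for i in range(len(session) - order):
            state = tuple(session[i:i + order])
            counts[state][session[i + order]] += 1
    model = {}
    for state, successors in counts.items():
        total = sum(successors.values())
        model[state] = {page: n / total for page, n in successors.items()}
    return model


def predict_next(model, history, order=1):
    """Return the most probable next page, or None for an unseen state."""
    if len(history) < order:
        return None
    successors = model.get(tuple(history[-order:]))
    if not successors:
        return None
    return max(successors, key=successors.get)


# Hypothetical sessions, as they might come out of session identification.
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "sports"],
    ["home", "news", "weather"],
    ["home", "about"],
]
first = fit_markov(sessions, order=1)
second = fit_markov(sessions, order=2)
print(predict_next(first, ["home"], order=1))
print(predict_next(second, ["home", "news"], order=2))
```

The second-order model conditions on the last two pages instead of one, which makes predictions more specific but leaves more states unseen in a small log; for such states the sketch returns None rather than guessing.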

