The web has become a ubiquitous tool, used in day-to-day work to find information and conduct business, and it is revolutionising the role and availability of information. One problem encountered in web interaction that remains unsolved is the navigation problem, whereby users can "get lost in hyperspace": when following a sequence of links, i.e. a trail of information, users tend to become disoriented, losing track of the goal of their original query and of the relevance to that query of the information they are currently browsing. Herein we build statistical foundations for tackling the navigation problem, based on a formal model of the web as a probabilistic automaton, which can also be viewed as a finite ergodic Markov chain. In our model of the web the probabilities attached to state transitions have two interpretations: they can denote the proportion of times a user followed a link, or alternatively the expected utility of following a link. Using this approach we have developed two techniques for constructing a web view, one for each interpretation of the link probabilities, where a web view is a collection of relevant trails. The first method is concerned with finding frequent user behaviour patterns. A collection of trails is taken as input and an ergodic Markov chain is produced as output, with the transition probabilities corresponding to the frequency with which the user traversed the associated links. The second method is a reinforcement learning algorithm that attaches higher probabilities to links whose expected trail relevance is higher. The user's home page and a query are taken as input and an ergodic Markov chain is produced as output with the probabilities of...
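
To make the first interpretation concrete, the following is a minimal sketch of how transition probabilities might be estimated from a collection of trails by counting link traversals. It is an illustration only, not the construction used in the paper: the function name `transition_probabilities`, the page names, and the sample trails are all hypothetical, and the sketch takes no steps to guarantee that the resulting chain is ergodic.

```python
from collections import defaultdict


def transition_probabilities(trails):
    """Estimate first-order transition probabilities from link-traversal counts.

    `trails` is a list of trails, each a list of page identifiers in the
    order they were visited. Returns a nested dict: probs[src][dst] is the
    proportion of traversals out of `src` that followed the link to `dst`.
    Hypothetical helper; ergodicity of the result is NOT enforced.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for trail in trails:
        # Each consecutive pair of pages in a trail is one link traversal.
        for src, dst in zip(trail, trail[1:]):
            counts[src][dst] += 1

    probs = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        probs[src] = {dst: n / total for dst, n in outgoing.items()}
    return probs


# Hypothetical trails, invented purely for illustration.
trails = [
    ["home", "research", "papers"],
    ["home", "research", "people"],
    ["home", "teaching"],
    ["home", "research", "papers"],
]
chain = transition_probabilities(trails)
for src, outgoing in chain.items():
    print(src, outgoing)
```

With these sample trails, three of the four traversals out of `home` follow the link to `research`, so that link receives probability 0.75. Pages that only ever end a trail, such as `papers`, have no outgoing transitions here; a real construction would need some way of handling such states if the chain is to be ergodic.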