Understanding Temporal Query Dynamics
Cited by 8 (2 self)
Web search is strongly influenced by time. The queries people issue change over time, with some queries occasionally spiking in popularity (e.g., earthquake) and others remaining relatively constant (e.g., youtube). Likewise, the documents indexed by a search engine change, with some documents always being about a particular query (e.g., the Wikipedia page on earthquakes is about the query earthquake) and others being about the query only at a particular point in time (e.g., the New York Times is only about earthquakes following major seismic activity). The relationship between documents and queries can also change as people’s intent changes (e.g., people sought different content for the query earthquake before the Haitian earthquake than they did after). In this paper, we explore how queries, their associated documents, and the query intent change over the course of 10 weeks by analyzing query log data, a daily Web crawl, and periodic human relevance judgments. We identify several interesting features by which changes to query popularity can be classified, and show that the presence of these features, when accompanied by changes in result content, can be a good indicator of change in query intent.
Archiving the Web using Page Changes Pattern: A Case Study
In ACM/IEEE Joint Conference on Digital Libraries (JCDL ’11), 2011
Cited by 3 (2 self)
A pattern is a model or template used to summarize and describe the behavior (or trend) of data that generally contains recurrent events. Patterns have received considerable attention in recent years and have been widely studied in the data mining field. Various pattern mining approaches have been proposed and used for different applications such as network monitoring, moving object tracking, financial or medical data analysis, scientific data processing, etc. In these different contexts, discovered patterns were useful to detect anomalies, to predict data behavior (or trend), or more generally, to simplify data processing or to improve system performance. However, to the best of our knowledge, patterns have never been used in the context of web archiving. Web archiving is the process of continuously collecting and preserving portions of the World Wide Web for future generations. In this paper, we show how patterns of page changes can be useful tools to efficiently archive web sites. We first define our pattern model that describes the changes of pages. Then, we present the strategy used to (i) extract the temporal evolution of page changes, (ii) discover patterns, and (iii) exploit them to improve web archives. We choose the archive of French public TV channels France Télévisions as a case study in order to validate our approach. Our experimental evaluation based on real web pages shows the utility of patterns to improve archive quality and to optimize indexing or storing.
Changing How People View Changes on the Web
Cited by 2 (0 self)
The Web is a dynamic information environment. Web content changes regularly and people revisit Web pages frequently. But the tools used to access the Web, including browsers and search engines, do little to explicitly support these dynamics. In this paper we present DiffIE, a browser plug-in that makes content change explicit in a simple and lightweight manner. DiffIE caches the pages a person visits and highlights how those pages have changed when the person returns to them. We describe how we built a stable, reliable, and usable system, including how we created compact, privacy-preserving page representations to support fast difference detection. Via a longitudinal user study, we explore how DiffIE changed the way people dealt with changing content. We find that much of its benefit came not from exposing expected change, but rather from drawing attention to unexpected change and helping people build a richer understanding of the Web content they frequent.
ACM Classification: H5.2: Information interfaces and
“It’s Simply Integral to What I do”: Enquiries into how the Web is Weaved into Everyday Life
2012
Cited by 2 (0 self)
This paper presents findings from a field study of 24 individuals who kept diaries of their web use, across device and location, for a period of four days. Our focus was on how the web was used for non-work purposes, with a view to understanding how this is intertwined with everyday life. While our initial aim was to update existing frameworks of ‘web activities’, such as those described by Sellen et al. [25] and Kellar et al. [14], our data lead us to suggest that the notion of ‘web activity’ is only partially useful for an analytic understanding of what it is that people do when they go online. Instead, our analysis leads us to present five modes of web use, which can be used to frame and enrich interpretations of ‘activity’. These are respite, orienting, opportunistic use, purposeful use and lean-back internet. We then consider two properties of the web that enable it to be tailored to these different modes, persistence and temporality, and close by suggesting ways of drawing upon these qualities in order to inform design.
Investigations of Continual Computation
Autonomous agents that sense, reason, and act in real-world environments for extended periods often need to solve streams of incoming problems. Traditionally, effort is applied only to problems that have already arrived and have been noted. We examine continual computation methods that allow agents to ideally allocate time to solving current as well as potential future problems under uncertainty. We first review prior work on continual computation. Then, we present new directions and results, including the consideration of shared subtasks and multiple tasks. We present results on the computational complexity of the continual-computation problem and provide approximations for arbitrary models of computational performance. Finally, we review special formulations for addressing uncertainty about the best algorithm to apply, learning about performance, and considering costs associated with delayed use of results.
General Terms Algorithms, Experimentation
Many web documents are dynamic, with content changing in varying amounts at varying frequencies. However, current document search algorithms have a static view of the document content, with only a single version of the document in the index at any point in time. In this paper, we present the first published analysis of using the temporal dynamics of document content to improve relevance ranking. We show that there is a strong relationship between the amount and frequency of content change and relevance. We develop a novel probabilistic document ranking algorithm that allows differential weighting of terms based on their temporal characteristics. By leveraging such content dynamics we show significant performance improvements for navigational queries.
The Dynamics of Personal Territories on the Web
In this paper, we present a long-term study of user-centric Web traffic data collected in 2000-2002 and 2005-2006 from two large representative panels of French Internet users. Our work focuses on the dynamics of personal territories on the Web and their evolution between 2000 and 2006. At the session level, we distinguish four profiles of browsing dynamics in 2005-2006, and point out the growing dichotomy between straight routine sessions and exploratory browsing. At a global level, we observe that although each individual’s corpus of visited sites is permanently growing, his browsing practices are structured around routine, well-known sites which operate as link providers to new sites. We argue that this tension between the known and the unknown is constitutive of Web practices and is a fundamental property of personal Web territories.
The Role of Social Networks in Information Diffusion
Online social networking technologies enable individuals to simultaneously share information with any number of peers. Quantifying the causal effect of these mediums on the dissemination of information requires not only identification of who influences whom, but also of whether individuals would still propagate information in the absence of social signals about that information. We examine the role of social networks in online information diffusion with a large-scale field experiment that randomizes exposure to signals about friends’ information sharing among 253 million subjects in situ. Those who are exposed are significantly more likely to spread information, and do so sooner than those who are not exposed. We further examine the relative role of strong and weak ties in information propagation. We show that, although stronger ties are individually more influential, it is the more abundant weak ties who are responsible for the propagation of novel information. This suggests that weak ties may play a more dominant role in the dissemination of information online than currently believed.
How Random are Online Social Interactions?
, 1207
The massive amounts of data that social media generates have facilitated the study of online human behavior on a scale unimaginable a few years ago. At the same time, the much discussed apparent randomness with which people interact online makes it appear as if these studies cannot reveal predictive social behaviors that could be used for developing better platforms and services. We use two large social databases to measure the mutual information entropy that both individual and group actions generate as they evolve over time. We show that users’ interaction sequences have strong deterministic components, in contrast with existing assumptions and models. In addition, we show that individual interactions are more predictable when users act on their own rather than when attending group activities.

