Turning down the noise in the blogosphere (2009)

by Khalid El-Arini, Gaurav Veda, Dafna Shahaf, Carlos Guestrin
Venue: KDD
Citations: 13 (5 self)

BibTeX

@INPROCEEDINGS{El-arini09turningdown,
    author = {Khalid El-Arini and Gaurav Veda and Dafna Shahaf and Carlos Guestrin},
    title = {Turning down the noise in the blogosphere},
    booktitle = {KDD},
    year = {2009}
}


Abstract

In recent years, the blogosphere has experienced a substantial increase in the number of posts published daily, forcing users to cope with information overload. The task of guiding users through this flood of information has thus become critical. To address this issue, we present a principled approach for picking a set of posts that best covers the important stories in the blogosphere. We define a simple and elegant notion of coverage and formalize it as a submodular optimization problem, for which we can efficiently compute a near-optimal solution. In addition, since people have varied interests, the ideal coverage algorithm should incorporate user preferences in order to tailor the selected posts to individual tastes. We define the problem of learning a personalized coverage function by providing an appropriate user-interaction model and formalizing an online learning framework for this task. We then provide a no-regret algorithm which can quickly learn a user’s preferences from limited feedback. We evaluate our coverage and personalization algorithms extensively over real blog data. Results from a user study show that our simple coverage algorithm does as well as most popular blog aggregation sites, including Google Blog Search, Yahoo! Buzz, and Digg. Furthermore, we demonstrate empirically that our algorithm can successfully adapt to user preferences. We believe that our technique, especially with personalization, can dramatically reduce information overload.
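The abstract does not spell out the coverage function, so the following is only a minimal sketch of greedy selection for a coverage-style submodular objective. It assumes a probabilistic form of coverage in which each post covers each feature (for example, a topic) to some degree in [0, 1], and a set of posts covers a feature unless every post in the set misses it. The names `coverage`, `greedy_select`, `post_cover`, and `weights` are hypothetical and chosen for illustration. For a monotone submodular objective under a cardinality constraint, this kind of greedy procedure is known to reach at least a (1 - 1/e) fraction of the optimal value, which is one sense in which a near-optimal solution can be computed efficiently.

```python
from typing import Dict, List

# Assumed data layout: post id -> {feature -> degree of coverage in [0, 1]}.
PostCover = Dict[str, Dict[str, float]]


def coverage(selected: List[str], post_cover: PostCover,
             weights: Dict[str, float]) -> float:
    """Weighted probabilistic coverage of a set of posts (an assumed objective).

    F(A) = sum_f w_f * (1 - prod_{a in A} (1 - cover_a(f)))
    """
    total = 0.0
    for feature, weight in weights.items():
        miss = 1.0  # probability that no selected post covers this feature
        for post in selected:
            miss *= 1.0 - post_cover[post].get(feature, 0.0)
        total += weight * (1.0 - miss)
    return total


def greedy_select(post_cover: PostCover, weights: Dict[str, float],
                  k: int) -> List[str]:
    """Pick up to k posts, each time adding the post with the largest marginal gain."""
    selected: List[str] = []
    remaining = set(post_cover)
    current = 0.0
    for _ in range(min(k, len(post_cover))):
        best_post, best_gain = None, 0.0
        for post in sorted(remaining):  # sorted for deterministic tie-breaking
            gain = coverage(selected + [post], post_cover, weights) - current
            if gain > best_gain:
                best_post, best_gain = post, gain
        if best_post is None:  # no remaining post improves coverage
            break
        selected.append(best_post)
        remaining.remove(best_post)
        current += best_gain
    return selected


# Toy data, invented purely for illustration.
example_posts: PostCover = {
    "a": {"politics": 0.9, "sports": 0.1},
    "b": {"politics": 0.8},
    "c": {"sports": 0.7, "tech": 0.5},
}
example_weights = {"politics": 1.0, "sports": 1.0, "tech": 1.0}
print(greedy_select(example_posts, example_weights, 2))
```

In this sketch, personalization could be approximated by changing the per-feature `weights` for each user; how those weights would be learned from feedback is not shown here.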
