Newsjunkie: Providing Personalized Newsfeeds via Analysis of Information Novelty (2004)

by Evgeniy Gabrilovich
In WWW2004

Abstract:

We present a principled methodology for filtering news stories by formal measures of information novelty, and show how the techniques can be used to custom-tailor newsfeeds based on information that a user has already reviewed. We review methods for analyzing novelty and then describe Newsjunkie, a system that personalizes news for users by identifying the novelty of stories in the context of stories they have already reviewed. Newsjunkie employs novelty-analysis algorithms that represent articles as words and named entities. The algorithms analyze inter- and intra-document dynamics by considering how information evolves over time from article to article, as well as within individual articles. We review the results of a user study undertaken to gauge the value of the approach over legacy time-based review of newsfeeds, and also to compare the performance of alternative distance metrics that are used to estimate the dissimilarity between candidate new articles and sets of previously reviewed articles.
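As a rough illustration of estimating the dissimilarity between a candidate article and a set of previously reviewed articles, the sketch below scores novelty with a smoothed Kullback-Leibler divergence over word counts. This is an assumption made for illustration only: the abstract does not name the specific distance metrics the system compares, the sketch uses plain whitespace-separated words and ignores named entities, and the function names (`word_counts`, `novelty`, `rank_by_novelty`) and the add-one smoothing are hypothetical choices rather than part of Newsjunkie.

```python
import math
from collections import Counter


def word_counts(text):
    """Bag of words: lowercase, whitespace tokenization (a simplification)."""
    return Counter(text.lower().split())


def smoothed_dist(counts, vocab, alpha=1.0):
    """Additive smoothing so every vocabulary word has nonzero probability."""
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts.get(w, 0) + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def novelty(candidate, reviewed):
    """Dissimilarity of one candidate article from the pooled reviewed set."""
    cand = word_counts(candidate)
    seen = Counter()
    for doc in reviewed:
        seen.update(word_counts(doc))
    vocab = set(cand) | set(seen)
    return kl_divergence(smoothed_dist(cand, vocab), smoothed_dist(seen, vocab))


def rank_by_novelty(candidates, reviewed):
    """Order candidates from most to least novel (assumed ranking policy)."""
    return sorted(candidates, key=lambda c: novelty(c, reviewed), reverse=True)
```

Calling `rank_by_novelty(candidates, reviewed)` with a list of candidate article texts and a list of already-read article texts returns the candidates with the highest-scoring (most dissimilar) ones first. Pooling the reviewed articles into one distribution is a design shortcut; comparing against each reviewed article separately is an equally plausible variant.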

Citations

4923 Elements of Information Theory – Cover, Thomas - 1991
3356 C4.5: Programs for Machine Learning – Quinlan - 1993
1636 Indexing by latent semantic analysis – Deerwester, Dumais, et al. - 1990
1303 WordNet: An Electronic Lexical Database – Fellbaum - 1998
915 Term-weighting approaches in automatic text retrieval – Salton, Buckley - 1988
218 Semantic Similarity in a Taxonomy: An Information-Based Measure and its Applications to Problems of Ambiguity in Natural Language – Resnik
202 The use of MMR, diversity-based reranking for reordering documents and producing summaries – Carbonell, Goldstein - 1998
122 Measures of distributional similarity – Lee - 1999
105 Bursty and hierarchical structure in streams – Kleinberg
98 Individual comparisons by ranking methods – Wilcoxon - 1945
94 Models of attention in computing and communication: From principles to applications. CACM – Horvitz, Paek, et al. - 2003
83 Predicting query performance – Cronen-Townsend, Zhou, et al. - 2002
43 The AT&T Internet Difference Engine: Tracking and Viewing Changes on the Web – Douglis, Ball, et al. - 1998
39 Temporal summaries of news topics – Allan, Gupta, et al.
33 Overview of the TREC 2002 novelty track – Harman - 2002
32 Placing search in context: the concept revisited – Finkelstein, Gabrilovich, et al.
31 A natural law of succession – Ristad - 1995
28 TimeMines: Constructing Timelines with Statistical Models of Word Usage – Swan, Jensen - 2000
24 Topic-conditioned novelty detection – Yang, Zhang, et al. - 2002
23 Explorations in context space: words, sentences, discourse – Burgess, Livesay, et al. - 1998
22 Comparing corpora – Kilgarriff - 2001
20 Quantifying query ambiguity – Cronen-Townsend, Croft - 2002
18 Information filtering, novelty detection, and named-page finding – Collins-Thompson, Ogilvie, et al.
18 Experiments in multidocument summarization – Schiffman, Nenkova, et al. - 2002
14 Combining multiple learning strategies for effective cross validation – Yang, Ault, et al. - 2000