Searching for authors named "Dennis Fetterly" – sorted by Relevance.

9 documents found, showing 1 through 9.
  • Detecting phrase-level duplication on the world wide web  
  • by Dennis Fetterly — 2005 — In Proceedings of the 28th Annual International ACM SIGIR Conference on Research & Development in Information Retrieval
  • …Two years ago, we conducted a study on the evolution of web pages over time. In the course of that study, we discovered a large number of machine-generated “spam” web pages emanating from a handful of web servers in Germany. These spam web pages were dynamically assembled by stitching together gram…
  • Cited by 23 (1 self)
  • Measuring the Search Effectiveness of a Breadth-First Crawl  
  • by Dennis Fetterly, Vishwa Vinay
  • …Abstract. Previous scalability experiments found that early precision improves as collection size increases. However, that was under the assumption that a collection’s documents are all sampled with uniform probability from the same population. We contrast this to a large breadth-first web crawl, an…
  • Spam, Damn Spam, and Statistics: Using statistical analysis to locate spam web pages  
  • by Dennis Fetterly, Mark Manasse, Marc Najork — 2004 — In Proceedings of WebDB
  • …The increasing importance of search engines to commercial web sites has given rise to a phenomenon we call "web spam", that is, web pages that exist only to mislead search engines into (mis)leading users to certain web sites. Web spam is a nuisance to users as well as search engines: users have a ha…
  • Cited by 6 (0 self)
  • On the Evolution of Clusters of Near-Duplicate Web Pages  
  • by Dennis Fetterly, Mark Manasse, Marc Najork — 2003 — In 1st Latin American Web Congress
  • …This paper expands on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million web pages on a weekly basis over the span of 11 weeks. We then determined which of these pages are near-duplicates of one another, and tracked how clust…
  • Cited by 27 (2 self)
  • Kumar Chellapilla Microsoft Live Labs  
  • by Carlos Castillo, Redmond Wa, Dennis Fetterly
  • …Adversarial IR in general, and search engine spam, in particular, are engaging research topics with a real-world impact for Web users, advertisers and publishers. The AIRWeb workshop will bring researchers and practitioners in these areas together, to present and discuss state-of-the-art techniques …
  • The Impact of Crawl Policy on Web Search Effectiveness  
  • by Dennis Fetterly, Nick Craswell, Vishwa Vinay
  • …Crawl selection policy has a direct influence on Web search effectiveness, because a useful page that is not selected for crawling will also be absent from search results. Yet there has been little or no work on measuring this effect. We introduce an evaluation framework, based on relevance judgment…
  • A large-scale study of the evolution of web pages  
  • by Dennis Fetterly, Mark Manasse, Marc Najork, Janet L. Wiener — 2003 — In Proceedings of the 12th International World Wide Web Conference
  • …How fast does the web change? Does most of the content remain unchanged once it has been authored, or are the documents continuously updated? Do pages change a little or a lot? Is the extent of change correlated to any other property of the page? All of these questions are of interest to those who m…
  • Cited by 102 (5 self)