Joke Retrieval: Recognizing the Same Joke Told Differently

by Lisa Friedl, James Allan

BibTeX

@MISC{Friedl_jokeretrieval:,
    author = {Lisa Friedl and James Allan},
    title = {Joke Retrieval: Recognizing the Same Joke Told Differently},
    year = {}
}

Abstract

In a corpus of jokes, a human might judge two documents to be the "same joke" even if characters, locations, and other details are varied. A given joke could be retold with an entirely different vocabulary, while still maintaining its identity. Since most retrieval systems consider documents to be related only when their word content is similar, we propose joke retrieval as a domain where standard language models may fail. In particular, we consider the task of identifying the "same joke" to be a necessary component of any joke retrieval system, and we examine it in both ranking and classification settings. We exploit the structure of jokes to develop two domain-specific alternatives to the "bag of words" document model. In one, only the punch lines, or final sentences, are compared; in the second, certain categories of words (e.g., professions and countries) are marked up and treated as interchangeable. Each technique works well for certain jokes. By combining the methods using machine learning, we create a hybrid that achieves higher performance than any individual approach.
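
The three document views named in the abstract (whole-joke bag of words, punch line only, and category mark-up) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the similarity measure, the sentence splitter, or the category word lists, so this sketch substitutes plain cosine similarity over term counts for the paper's language models, treats the final sentence as the punch line, and uses a tiny hypothetical `CATEGORIES` lexicon. The function names are likewise invented.

```python
import math
import re
from collections import Counter

# Hypothetical category lexicon; the paper's actual word lists are not given.
CATEGORIES = {
    "doctor": "<PROFESSION>", "lawyer": "<PROFESSION>", "engineer": "<PROFESSION>",
    "france": "<COUNTRY>", "germany": "<COUNTRY>", "italy": "<COUNTRY>",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two token lists, using raw term counts."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(count * vb[term] for term, count in va.items())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def punch_line(text):
    """Return the final sentence, assumed here to be the punch line."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return sentences[-1] if sentences else ""


def bag_of_words_sim(joke_a, joke_b):
    """Baseline: compare the full word content of both jokes."""
    return cosine(tokenize(joke_a), tokenize(joke_b))


def punch_line_sim(joke_a, joke_b):
    """Compare only the final sentences."""
    return cosine(tokenize(punch_line(joke_a)), tokenize(punch_line(joke_b)))


def category_sim(joke_a, joke_b):
    """Replace category words with a shared tag so they match each other."""
    def mark_up(text):
        return [CATEGORIES.get(tok, tok) for tok in tokenize(text)]
    return cosine(mark_up(joke_a), mark_up(joke_b))


def features(joke_a, joke_b):
    """Score triple that a learned combiner could take as input."""
    return (
        bag_of_words_sim(joke_a, joke_b),
        punch_line_sim(joke_a, joke_b),
        category_sim(joke_a, joke_b),
    )
```

For a pair of retellings that swap "doctor" for "lawyer" and "France" for "Italy", `category_sim` scores higher than `bag_of_words_sim` because the swapped words collapse to the same tag. The `features` triple is the kind of input a classifier or ranker could be trained on to form a hybrid; the learner itself is omitted because the abstract does not say which one was used.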

Keyphrases

joke retrieval, joke told differently, standard language model, certain category, word content, retrieval system, punch line, classification setting, word document model, individual approach, different vocabulary, machine learning, domain-specific alternative, joke retrieval system, final sentence, necessary component, certain joke
