Matching words and pictures (2003)

by Kobus Barnard, Pinar Duygulu, David Forsyth, Nando De Freitas, David M. Blei, Michael I. Jordan
Venue: Journal of Machine Learning Research
Citations: 390 (33 self)

BibTeX

@ARTICLE{Barnard03matchingwords,
    author = {Kobus Barnard and Pinar Duygulu and David Forsyth and Nando De Freitas and David M. Blei and Michael I. Jordan},
    title = {Matching words and pictures},
    journal = {JOURNAL OF MACHINE LEARNING RESEARCH},
    year = {2003},
    volume = {3},
    pages = {1107--1135}
}

Abstract

We present a new approach for modeling multi-modal data sets, focusing on the specific case of segmented images with associated text. Learning the joint distribution of image regions and words has many applications. We consider in detail predicting words associated with whole images (auto-annotation) and corresponding to particular image regions (region naming). Auto-annotation might help organize and access large collections of images. Region naming is a model of object recognition as a process of translating image regions to words, much as one might translate from one language to another. Learning the relationships between image regions and semantic correlates (words) is an interesting example of multi-modal data mining, particularly because it is typically hard to apply data mining techniques to collections of images. We develop a number of models for the joint distribution of image regions and words, including several which explicitly learn the correspondence between regions and words. We study multi-modal and correspondence extensions to Hofmann’s hierarchical clustering/aspect model, a translation model adapted from statistical machine translation (Brown et al.), and a multi-modal extension to mixture of latent Dirichlet allocation
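The translation view of region naming mentioned in the abstract can be illustrated with a small sketch. This is not the paper's model: it is a generic IBM Model 1 style EM loop, and it assumes that each image region has already been quantized into a discrete token (the `*_blob` names, the toy data, and the function names `train_translation_table` and `name_region` are all hypothetical). It estimates p(word | region token) from images whose captions are not aligned to regions.

```python
def train_translation_table(pairs, iterations=30):
    """Estimate p(word | blob) with IBM Model 1 style EM.

    pairs: list of (blobs, words), one entry per annotated image.
    Returns a dict mapping (blob, word) to a probability.
    """
    vocab = {w for _, words in pairs for w in words}
    cooc = {(b, w) for blobs, words in pairs for b in blobs for w in words}
    # Uniform start over every co-occurring (blob, word) pair.
    t = {bw: 1.0 / len(vocab) for bw in cooc}
    for _ in range(iterations):
        count = dict.fromkeys(cooc, 0.0)
        total = {}
        for blobs, words in pairs:
            for w in words:
                # E-step: soft-assign word w to the blobs of this image.
                norm = sum(t[(b, w)] for b in blobs)
                for b in blobs:
                    frac = t[(b, w)] / norm
                    count[(b, w)] += frac
                    total[b] = total.get(b, 0.0) + frac
        # M-step: renormalize expected counts per blob.
        t = {(b, w): c / total[b] for (b, w), c in count.items()}
    return t


def name_region(t, blob):
    """Region naming: return the most probable word for one blob token."""
    candidates = {w: p for (b, w), p in t.items() if b == blob}
    return max(candidates, key=candidates.get)


# Toy corpus (hypothetical): each image has region tokens and caption words.
toy_pairs = [
    (["sky_blob", "grass_blob"], ["sky", "grass"]),
    (["sky_blob", "water_blob"], ["sky", "water"]),
    (["grass_blob", "tiger_blob"], ["grass", "tiger"]),
]
table = train_translation_table(toy_pairs)
print({b: name_region(table, b)
       for b in ["sky_blob", "grass_blob", "water_blob", "tiger_blob"]})
```

In this toy corpus no caption says which word belongs to which region. EM still separates them, because "sky" co-occurs with `sky_blob` in two images but with `water_blob` in only one. That ambiguity, resolved by co-occurrence statistics, is the correspondence problem the abstract refers to.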

Citations

6231 Maximum likelihood from incomplete data via the EM algorithm - Dempster, Laird - 1977
1824 Normalized cuts and image segmentation - Shi, Malik - 2000
1369 Latent Dirichlet allocation - Blei, Ng, et al.
533 Computer Vision: A Modern Approach - Forsyth, Ponce - 2002
327 Object recognition as machine translation: learning a lexicon for a fixed image vocabulary - Duygulu, Barnard, et al. - 2002
265 Blobworld: Image Segmentation Using Expectation-Maximization and Its Application to Image Querying - Carson, Belongie, et al. - 2002
240 Modeling annotated data - Blei, Jordan - 2003
202 Speech and Language Processing – An Introduction to - Jurafsky, Martin - 2009
185 Pedestrian detection using wavelet templates - Oren, Papageorgiou, et al. - 1997
179 Learning the Semantics of Words and Pictures - Barnard, Forsyth - 2001
137 Multiple-instance learning for natural scene classification - Maron, Ratan - 1998
122 Finding Naked People - Fleck, Forsyth, et al. - 1996
113 The mathematics of Machine Translation: Parameter estimation - Brown, Pietra, et al. - 1993
113 Image-to-word transformation based on dividing and vector quantizing images with words - Mori, Takahashi, et al. - 1999
107 A Shortest Augmenting Path Algorithm for Dense and Sparse - Jonker, Volgenant - 1987
86 Empirical Methods for Exploiting Parallel Texts - Melamed - 1998
79 Statistical models for co-occurrence data - Hofmann, Puzicha - 1998
78 Combining textual and visual cues for content-based image retrieval on the world wide web - Cascia, Sethi, et al. - 1998
67 Clustering art - Barnard, Duygulu, et al. - 2001
66 Analysis of user need in image archives - Armitage, Enser - 1997
49 Query Analysis in a Visual Information Retrieval Context - Enser - 1993
47 End-user searching challenges indexing practices in the digital newspaper photo archive - Markkula, Sormunen - 2000
44 Learning from ambiguity - Maron - 1998
43 Name-It: Association of face and name in video - Satoh, Kanade - 1997
36 Visual semantics: Extracting visual information from text accompanying pictures - Srihari, Burhans - 1994
23 User types and queries: Impact on image access systems - Keister - 1994
21 Learning and Representing Topic, a Hierarchical Mixture Model for Word Occurrences - Hofmann - 1998
18 On stochastic versions of the EM algorithm - Celeux, Chauveau, et al. - 1995
17 Progress in documentation: pictorial information retrieval - Enser - 1995
16 Extracting Visual Information from Text: Using Captions to Label Human Faces - Srihari - 1991
8 View a picture: Theoretical image analysis and empirical user studies on indexing and retrieval - Ornager - 1996
7 Browse and search patterns in a digital image database - Frost, Taylor, et al.
5 Computer vision tools for finding images and video sequences - Forsyth - 1999
4 A statistical approach to 3d object recognition applied to faces and cars - Schneiderman, Kanade - 2000
3 Use of collateral text in image interpretation - Srihari, Chopra, et al. - 1994
1 Multi-modal browsing of images in web documents - Barnard, Forsyth, et al. - 1999