Photo Tourism: Exploring Photo Collections in 3D
- ACM TRANSACTIONS ON GRAPHICS
, 2006
Cited by 232 (20 self)
We present a system for interactively browsing and exploring large unstructured collections of photographs of a scene using a novel 3D interface. Our system consists of an image-based modeling front end that automatically computes the viewpoint of each photograph as well as a sparse 3D model of the scene and image-to-model correspondences. Our photo explorer uses image-based rendering techniques to smoothly transition between photographs, while also enabling full 3D navigation and exploration of the set of images and world geometry, along with auxiliary information such as overhead maps. Our system also makes it easy to construct photo tours of scenic or historic locations, and to annotate image details, which are automatically transferred to other relevant images. We demonstrate our system on several large personal photo collections as well as images gathered from Internet photo sharing sites.
Passive capture and ensuing issues for a personal lifetime store
, 2004
Cited by 60 (6 self)
Passive capture lets people record their experiences without having to operate recording equipment, and without even having to give recording conscious thought. The advantages are increased capture, and improved participation in the event itself. However, passive capture also presents many new challenges. One key challenge is how to deal with the increased volume of media for retrieval, browsing, and organizing. This paper describes the SenseCam device, which combines a camera with a number of sensors in a pendant worn around the neck. Data from SenseCam is uploaded into a MyLifeBits repository, where a number of features, but especially correlation and relationships, are used to manage the data.
Modeling the World from Internet Photo Collections
- INT J COMPUT VIS
, 2007
Cited by 45 (1 self)
There are billions of photographs on the Internet, comprising the largest and most diverse photo collection ever assembled. How can computer vision researchers exploit this imagery? This paper explores this question from the standpoint of 3D scene modeling and visualization. We present structure-from-motion and image-based rendering algorithms that operate on hundreds of images downloaded as a result of keyword-based image search queries like “Notre Dame” or “Trevi Fountain.” This approach, which we call Photo Tourism, has enabled reconstructions of numerous well-known world sites. This paper presents these algorithms and results as a first step towards 3D modeling of the world’s well-photographed sites, cities, and landscapes from Internet imagery, and discusses key open problems and challenges for the research community.
World explorer: Visualizing aggregate data from unstructured text in geo-referenced collections
- In Proceedings of the Seventh ACM/IEEE-CS Joint Conference on Digital Libraries
, 2007
Cited by 39 (4 self)
The availability of map interfaces and location-aware devices makes a growing amount of unstructured, geo-referenced information available on the Web. This type of information can be valuable not only for browsing, finding and making sense of individual items, but also in aggregate form to help understand data trends and features. In particular, over twenty million geo-referenced photos are now available on Flickr, a photo-sharing website – the first major collection of its kind. These photos are often associated with user-entered unstructured text labels (i.e., tags). We show how we analyze the tags associated with the geo-referenced Flickr images to generate aggregate knowledge in the form of “representative tags” for arbitrary areas in the world. We use these tags to create a visualization tool, World Explorer, that can help expose the content of the data, using a map interface to display the derived tags and the original photo items. We perform a qualitative evaluation of World Explorer that outlines the visualization’s benefits in browsing this type of content. We provide insights regarding the aggregate versus individual-item requirements in browsing digital geo-referenced material.
How flickr helps us make sense of the world: context and content in community-contributed media collections
- In Proceedings of the 15th International Conference on Multimedia (MM2007)
, 2007
Cited by 35 (4 self)
The advent of media-sharing sites like Flickr and YouTube has drastically increased the volume of community-contributed multimedia resources available on the web. These collections have a previously unimagined depth and breadth, and have generated new opportunities – and new challenges – to multimedia research. How do we analyze, understand and extract patterns from these new collections? How can we use these unstructured, unrestricted community contributions of media (and annotation) to generate “knowledge”? As a test case, we study Flickr – a popular photo sharing website. Flickr supports photo, time and location metadata, as well as a light-weight annotation model. We extract information from this dataset using two different approaches. First, we employ a location-driven approach to generate aggregate knowledge in the form of “representative tags” for arbitrary areas in the world. Second, we use a tag-driven approach to automatically extract place and event semantics for Flickr tags, based on each tag’s metadata patterns. With the patterns we extract from tags and metadata, vision algorithms can be employed with greater precision. In particular, we demonstrate a location-tag-vision-based approach to retrieving images of geography-related landmarks and features from the Flickr dataset. The results suggest that community-contributed media and annotation can enhance and improve our access to multimedia resources – and our understanding of the world.
Generating Diverse and Representative Image Search Results for Landmarks
Cited by 31 (3 self)
Can we leverage the community-contributed collections of rich media on the web to automatically generate representative and diverse views of the world’s landmarks? We use a combination of context- and content-based tools to generate representative sets of images for location-driven features and landmarks, a common search task. To do that, we use location and other metadata, as well as tags associated with images, and the images’ visual features. We present an approach to extracting tags that represent landmarks. We show how to use unsupervised methods to extract representative views and images for each landmark. This approach can potentially scale to provide better search and representation for landmarks, worldwide. We evaluate the system in the context of image search using a real-life dataset of 110,000 images from the San Francisco area.
Leveraging context to resolve identity in photo albums
- In JCDL ’05: Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries
, 2005
Cited by 30 (1 self)
Our system suggests likely identity labels for photographs in a personal photo collection. Instead of using face recognition techniques, the system leverages automatically available context, like the time and location where the photos were taken. Based on time and location, the system automatically computes event and location groupings of photos. As the user annotates some of the identities of people in their collection, patterns of re-occurrence and co-occurrence of different people in different locations and events emerge. The system uses these patterns to generate label suggestions for identities that were not yet annotated. These suggestions can greatly accelerate the process of manual annotation and improve the quality of retrieval from the collection. We obtained ground-truth identity annotation for four different photo albums, and used them to test our system. The system proved effective, making very accurate label suggestions, even when the number of suggestions for each photo was limited to five names, and even when only a small subset of the photos was annotated.
Context data in geo-referenced digital photo collections
- In Proceedings of the 12th annual ACM International Conference on Multimedia
, 2004
Cited by 27 (3 self)
Given time and location information about digital photographs we can automatically generate an abundance of related contextual metadata, using off-the-shelf and Web-based data sources. Among these are the local daylight status and weather conditions at the time and place a photo was taken. This metadata has the potential of serving as memory cues and filters when browsing photo collections, especially as these collections grow into the tens of thousands and span dozens of years. We describe the contextual metadata that we automatically assemble for a photograph, given time and location, as well as a browser interface that utilizes that metadata. We then present the results of a user study and a survey that together expose which categories of contextual metadata are most useful for recalling and finding photographs. We identify among still unavailable metadata categories those that are most promising to develop next.
Integrating the Web and the World: Contextual trails on the move
- Proceedings of the 15th ACM Hypertext Conference
, 2004
Cited by 18 (9 self)
This paper presents applications of HyCon, a framework for context-aware hypermedia systems. The HyCon architecture encompasses annotations, links, and guided tours associating locations and RFID- or Bluetooth-tagged objects with maps, Web pages, and collections of resources. The user-created annotations, links, and guided tours are represented as XLink structures, and HyCon introduces the use of XLink for the representation of recorded geographical paths with annotations and links. The HyCon framework extends earlier location-based hypermedia systems by supporting authoring in the field and by providing access to browsing and searching information through a novel geo-based search (GBS) interface for the Web. The HyCon prototype uses SVG at the interface level, for graphics as well as for user interface widgets on tablet PCs and mobile phones.
MobShare: Controlled and Immediate Sharing of Mobile Images
- In Proc. of Multimedia 2004
, 2004
Cited by 16 (4 self)
In this paper we describe the design and implementation of a mobile phone picture sharing system MobShare that enables immediate, controlled, and organized sharing of mobile pictures, and the browsing, combining, and discussion of the shared pictures. The design combines research on photography, personal image management, mobile phone camera use, mobile picture publishing, and an interview study we conducted on mobile phone camera users. The system is based on a client-server architecture and uses current mobile phone and web technology. The implementation describes novel solutions in immediate sharing of mobile images to an organized web album, and in providing full control over with whom the images are shared. Also, we describe new ways of promoting discussion in sharing images and enabling the combination and comparison of personal and shared pictures. The system proves that the designed solutions can be implemented with current technology and provides novel approaches to general issues in sharing digital images.

