World explorer: Visualizing aggregate data from unstructured text in geo-referenced collections
In Proceedings of the Seventh ACM/IEEE-CS Joint Conference on Digital Libraries, 2007
Cited by 39 (4 self)
The availability of map interfaces and location-aware devices makes a growing amount of unstructured, geo-referenced information available on the Web. This type of information can be valuable not only for browsing, finding and making sense of individual items, but also in aggregate form to help understand data trends and features. In particular, over twenty million geo-referenced photos are now available on Flickr, a photo-sharing website – the first major collection of its kind. These photos are often associated with user-entered unstructured text labels (i.e., tags). We show how we analyze the tags associated with the geo-referenced Flickr images to generate aggregate knowledge in the form of “representative tags” for arbitrary areas in the world. We use these tags to create a visualization tool, World Explorer, that can help expose the content of the data, using a map interface to display the derived tags and the original photo items. We perform a qualitative evaluation of World Explorer that outlines the visualization’s benefits in browsing this type of content. We provide insights regarding the aggregate versus individual-item requirements in browsing digital geo-referenced material.
How flickr helps us make sense of the world: context and content in community-contributed media collections
In Proceedings of the 15th International Conference on Multimedia (MM2007), 2007
Cited by 35 (4 self)
The advent of media-sharing sites like Flickr and YouTube has drastically increased the volume of community-contributed multimedia resources available on the web. These collections have a previously unimagined depth and breadth, and have generated new opportunities – and new challenges – to multimedia research. How do we analyze, understand and extract patterns from these new collections? How can we use these unstructured, unrestricted community contributions of media (and annotation) to generate “knowledge”? As a test case, we study Flickr – a popular photo sharing website. Flickr supports photo, time and location metadata, as well as a light-weight annotation model. We extract information from this dataset using two different approaches. First, we employ a location-driven approach to generate aggregate knowledge in the form of “representative tags” for arbitrary areas in the world. Second, we use a tag-driven approach to automatically extract place and event semantics for Flickr tags, based on each tag’s metadata patterns. With the patterns we extract from tags and metadata, vision algorithms can be employed with greater precision. In particular, we demonstrate a location-tag-vision-based approach to retrieving images of geography-related landmarks and features from the Flickr dataset. The results suggest that community-contributed media and annotation can enhance and improve our access to multimedia resources – and our understanding of the world.
World-scale Mining of Objects and Events from Community Photo Collections
CIVR'08, 2008
Cited by 26 (0 self)
In this paper, we describe an approach for mining images of objects (such as touristic sights) from community photo collections in an unsupervised fashion. Our approach relies on retrieving geotagged photos from those web-sites using a grid of geospatial tiles. The downloaded photos are clustered into potentially interesting entities through a processing pipeline of several modalities, including visual, textual and spatial proximity. The resulting clusters are analyzed and are automatically classified into objects and events. Using mining techniques, we then find text labels for these clusters, which are used to again assign each cluster to a corresponding Wikipedia article in a fully unsupervised manner. A final verification step uses the contents (including images) from the selected Wikipedia article to verify the cluster-article assignment. We demonstrate this approach on several urban areas, densely covering an area of over 700 square kilometers and mining over 200,000 photos, making it probably the largest experiment of its kind to date.
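The tile-based retrieval step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the tile size, the projection, or the query interface, so the function name `tile_grid`, the degree-based step and the bounding-box layout below are all assumptions made for illustration, not the authors' implementation.

```python
import math
from typing import Iterator, Tuple

# Hypothetical layout: (min_lat, min_lon, max_lat, max_lon) in degrees.
BBox = Tuple[float, float, float, float]


def tile_grid(min_lat: float, min_lon: float,
              max_lat: float, max_lon: float,
              step_deg: float) -> Iterator[BBox]:
    """Yield bounding boxes that cover a region with square tiles.

    Tiles are produced row by row (south to north, west to east).
    Tiles on the northern and eastern edges are clipped to the region.
    """
    if step_deg <= 0:
        raise ValueError("step_deg must be positive")
    # Rounding guards against float noise such as 2.9999999999999996.
    n_rows = math.ceil(round((max_lat - min_lat) / step_deg, 9))
    n_cols = math.ceil(round((max_lon - min_lon) / step_deg, 9))
    for row in range(n_rows):
        lat0 = min_lat + row * step_deg
        lat1 = min(lat0 + step_deg, max_lat)
        for col in range(n_cols):
            lon0 = min_lon + col * step_deg
            lon1 = min(lon0 + step_deg, max_lon)
            yield (lat0, lon0, lat1, lon1)


if __name__ == "__main__":
    # Each tile could then be passed to a photo-search query as its
    # bounding box (the query itself is outside this sketch).
    for bbox in tile_grid(0.0, 0.0, 1.0, 2.0, 0.5):
        print(bbox)
```

Querying per tile keeps each request small and makes it easy to revisit only the tiles that returned many photos.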
Combining image descriptors to effectively retrieve events from visual lifelogs
In MIR ’08: Proceeding of the 1st ACM international conference on Multimedia information retrieval, 2008
Cited by 4 (2 self)
The SenseCam is a wearable camera that passively captures approximately 3,000 images per day, which equates to almost one million images per year. It is used to create a personal visual recording of the wearer’s life and generates information which can be helpful as a human memory aid. For such a large amount of visual information to be of any use, it is accepted that it should be structured into “events”, of which there are about 8,000 in a wearer’s average year. Once SenseCam images have been automatically segmented into events, it will be useful for users to locate other events similar to a given event, e.g., “what other times was I walking in the park?”, “show me other events when I was in a restaurant”. On two datasets of 240k and 1.8M images containing topics with a variety of information needs, we evaluate the fusion of MPEG-7, SIFT, and SURF content-based retrieval techniques to address the event search issue. We have found that our proposed fusion approach of MPEG-7 and SURF offers an improvement on using either of those sources or SIFT individually, and we have also shown that how a lifelog event is modeled has a large effect on the retrieval performance.
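The abstract does not say how the MPEG-7 and SURF results are fused. As a rough illustration of score-level fusion in general, the sketch below min-max normalizes each source's scores and ranks events by a weighted sum; the function names, the equal default weights and the sample scores are assumptions, and the paper's actual fusion scheme may differ.

```python
from typing import Dict, List, Optional, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one source's scores to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so the source carries no ranking signal.
        return {key: 0.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def fuse_scores(sources: Dict[str, Dict[str, float]],
                weights: Optional[Dict[str, float]] = None
                ) -> List[Tuple[str, float]]:
    """Rank events by a weighted sum of normalized per-source scores.

    Events missing from a source contribute 0 for that source.
    With weights=None every source gets the same weight.
    """
    if weights is None:
        weights = {name: 1.0 / len(sources) for name in sources}
    fused: Dict[str, float] = {}
    for name, scores in sources.items():
        for event, value in min_max_normalize(scores).items():
            fused[event] = fused.get(event, 0.0) + weights[name] * value
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up similarity scores for three candidate events.
    ranked = fuse_scores({
        "mpeg7": {"e1": 0.2, "e2": 0.8, "e3": 0.5},
        "surf": {"e1": 10.0, "e2": 30.0, "e3": 40.0},
    })
    print(ranked)
```

Normalizing first matters because the two descriptors produce scores on different scales; without it, the source with the larger raw values would dominate the sum.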
Exploiting Flickr Tags and Groups for Finding Landmark Photos
Cited by 3 (2 self)
Many people take pictures of different city landmarks and post them to photo-sharing systems like Flickr. They also add tags and place photos in Flickr groups, created around particular themes. Using tags, other people can search for representative landmark images of places of interest. Searching for landmarks using tags returns many non-landmark photos and provides a poor landmark summary for a city. In this paper we propose a new method to identify landmark photos using tags and social Flickr groups. In contrast to similar modern systems, our approach is also applicable when GPS coordinates for photos are not available. The presented user study shows that the proposed method outperforms state-of-the-art systems for landmark finding.
Interactive Tag Maps and Tag Clouds for the Multiscale Exploration of Large Spatio-temporal Datasets
Cited by 2 (1 self)
‘Tag clouds’ and ‘tag maps’ are introduced to represent geographically referenced text. In combination, these aspatial and spatial views are used to explore a large structured spatio-temporal data set by providing overviews and filtering by text and geography. Prototypes are implemented using freely available technologies including Google Earth and Yahoo!’s Tag Map applet. The interactive tag map and tag cloud techniques and the rapid prototyping method used are informally evaluated through successes and limitations encountered. Preliminary evaluation suggests that the techniques may be useful for generating insights when visualizing large data sets containing geo-referenced text strings. The rapid prototyping approach enabled the technique to be
Social Multimedia: Highlighting Opportunities for Search and Mining of Multimedia Data in Social Media Applications
Published in Multimedia Tools and Applications, 2010
Cited by 2 (0 self)
In recent years, various Web-based sharing and community services such as Flickr and YouTube have made a vast and rapidly growing amount of multimedia content available online. Uploaded by individual participants, content in these immense pools of content is accompanied by varied types of metadata, such as social network data or descriptive textual information. These collections present, at once, new challenges and exciting opportunities for multimedia research. This article presents an approach for “social multimedia” applications. The approach is based on the experience of building a number of successful applications that are based on mining multimedia content analysis in social multimedia context.
Summarization of Online Image Collections via Implicit Feedback
Cited by 1 (1 self)
The availability of map interfaces and location-aware devices makes a growing amount of unstructured, geo-referenced information available on the Web. In particular, over twelve million geo-referenced photos are now available on Flickr, a popular photo-sharing website. We show a method to analyze the Flickr data and generate aggregate knowledge in the form of “representative tags” for arbitrary areas in the world. We display these tags on a map interface in an interactive web application along with images associated with each tag. We then use the implicit feedback of the aggregate user interactions with the tags and images to learn which images best describe the area shown on the map.
Towards Extracting Flickr Tag Semantics
Cited by 1 (0 self)
We address the problem of extracting semantics of tags – short, unstructured text-labels assigned to resources on the Web – based on each tag’s metadata patterns. In particular, we describe an approach for extracting place and event semantics for tags that are assigned to photos on Flickr, a popular photo sharing website supporting time and location (latitude/longitude) metadata. The approach can be generalized to other domains where text terms can be extracted and associated with metadata patterns, such as geoannotated web pages.
General Terms Algorithms, Experimentation, Measurements
Tag clouds provide an aggregate of tag-usage statistics. They are typically sent as in-line HTML to browsers. However, display mechanisms suited for ordinary text are not ideal for tags, because font sizes may vary widely on a line. As well, the typical layout does not account for relationships that may be known between tags. This paper presents models and algorithms to improve the display of tag clouds that consist of in-line HTML, as well as algorithms that use nested tables to achieve a more general 2-dimensional layout in which tag relationships are considered. The first algorithms leverage prior work in typesetting and rectangle packing, whereas the second group of algorithms leverage prior work in Electronic Design Automation. Experiments show our algorithms can be efficiently implemented and perform well.

