Multimodal fusion for multimedia analysis: a survey, 2010
Cited by 58 (1 self)
Abstract:
This survey aims to provide multimedia researchers with a state-of-the-art overview of the fusion strategies used to combine multiple modalities for various multimedia analysis tasks. The existing literature on multimodal fusion research is presented through several classifications based on the fusion methodology and the level of fusion (feature, decision, and hybrid). The fusion methods are described in terms of their basic concept, advantages, weaknesses, and usage in various analysis tasks as reported in the literature. Moreover, several distinctive issues that influence a multimodal fusion process, such as the use of correlation and independence, confidence level, contextual information, synchronization between different modalities, and optimal modality selection, are also highlighted. Finally, we present open issues for further research in the area of multimodal fusion.
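The feature-level versus decision-level distinction the survey classifies can be sketched in a few lines. The vectors, weights, and averaging rule below are illustrative assumptions, not taken from the survey itself:

```python
# Minimal sketch of the two basic fusion levels: feature-level (early)
# fusion concatenates modality features before a single classifier, while
# decision-level (late) fusion combines per-modality scores afterwards.
# All feature values and scores here are invented for illustration.

def early_fusion(visual_feats, text_feats):
    """Feature-level fusion: build one joint feature vector."""
    return visual_feats + text_feats  # simple concatenation

def late_fusion(scores, weights=None):
    """Decision-level fusion: weighted average of per-modality scores."""
    if weights is None:
        weights = [1.0 / len(scores)] * len(scores)  # uniform weights
    return sum(w * s for w, s in zip(weights, scores))

visual = [0.2, 0.7]            # e.g. colour-histogram features (toy values)
text = [0.9, 0.1, 0.4]         # e.g. tf-idf term weights (toy values)
joint = early_fusion(visual, text)       # 5-dimensional joint vector
fused_score = late_fusion([0.8, 0.6])    # combine two classifier outputs
```

Hybrid fusion, the survey's third category, would mix both: fuse some features early and combine the resulting classifier scores late.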
Image Clustering Based on a Shared Nearest Neighbors Approach for Tagged Collections, Int. Conf. on Content-based Image and Video Retrieval, 2008
Cited by 11 (0 self)
Abstract:
Browsing and finding pictures in large-scale, heterogeneous collections is an important issue, particularly for online photo-sharing applications. Since such services see their databases grow enormously, tag-based indexing and displaying results as a traditional single-file list are not efficient ways to browse and query image collections. Naturally, data clustering appears as a good solution, presenting a summarized view of an image set instead of an exhaustive but unusable list of its elements. We present a new method for image clustering based on a shared nearest neighbors approach that can operate on both content-based features and textual descriptions (tags). We describe, discuss and evaluate the SNN method for image clustering and present experimental results on Flickr collections showing that our approach provides useful representations of an image set.
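The shared-nearest-neighbors idea the abstract builds on can be sketched simply: two images count as similar when their k-nearest-neighbor lists overlap, regardless of whether those lists were ranked by visual features or by tags. The image ids and toy neighbor lists below are invented for illustration:

```python
# Sketch of the shared-nearest-neighbours (SNN) similarity used as the
# basis for clustering: similarity between two items is the number of
# neighbours their k-NN lists have in common. Toy data, not the paper's.

def snn_similarity(neighbors_a, neighbors_b):
    """Number of neighbours shared by two k-NN lists."""
    return len(set(neighbors_a) & set(neighbors_b))

# Toy k-NN lists (image ids), e.g. from a tag-based or content-based ranking.
knn = {
    "img1": ["img2", "img3", "img4"],
    "img2": ["img1", "img3", "img5"],
    "img6": ["img7", "img8", "img9"],
}
sim = snn_similarity(knn["img1"], knn["img2"])  # they share img3
```

A clustering pass would then group images whose SNN similarity exceeds a threshold; because the similarity only counts list overlap, the same machinery works on visual and textual neighbor rankings alike.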
A Study of Query by Semantic Example, 3rd International Workshop on Semantic Learning and Applications in Multimedia, Anchorage, 2008
Cited by 9 (1 self)
Abstract:
In recent years, query-by-semantic-example (QBSE) has become a popular approach to content-based image retrieval [20, 23, 18]. QBSE extends the well-established query-by-example retrieval paradigm to the semantic domain. While various authors have pointed out the benefits of QBSE, several open questions remain, including a lack of precise understanding of how overall performance depends on the various parameters of the system. In this work, we present a systematic experimental study of the QBSE framework, broadly divided into three categories. First, we examine the space of low-level visual features and its effects on retrieval performance. Second, we study the space of learned semantic concepts, herein denoted the “semantic space”, and show that not all semantic concepts are equally informative for retrieval. Finally, we present a study of the intrinsic structure of the semantic space, analyzing the contextual relationships between semantic concepts, and show that this intrinsic structure is crucial for the performance improvements.
Image Retrieval using Markov Random Fields and Global Image Features
A semantic vector space for query by image example, ACM SIGIR Conf. on Research and Development in Information Retrieval, Multimedia Information Retrieval Workshop, 2007
Cited by 6 (1 self)
Abstract:
Content-based image retrieval enables the user to search a database for visually similar images. In these scenarios, the user submits an example that is compared to the images in the database by low-level characteristics such as colour, texture and shape. While visual similarity is essential for a vast number of applications, there are cases where a user needs to search for semantically similar images. For example, the user might want to find all images depicting bears on a river. This might be quite difficult using only low-level features, but using concept detectors for “bear” and “river” will produce results that are semantically closer to what the user requested. Following this idea, this paper studies a novel paradigm: query by semantic multimedia example. In this setting, the user’s query is processed at a semantic level: a vector of concept probabilities is inferred for each image, and a similarity metric computes the distance between the concept vector of the query and the concept vectors of the images in the database. The system is evaluated on a COREL Stock Photo collection.
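The semantic-level retrieval step described above can be sketched in a few lines: map each image to a vector of concept probabilities, then rank database images by distance to the query's vector. The concept names, probabilities, and the choice of L1 distance are assumptions for illustration; the paper's actual detectors and metric may differ:

```python
# Sketch of query by semantic example: images are represented by concept
# probability vectors and ranked by distance to the query vector.
# All concepts and probabilities below are invented toy values.

def l1_distance(p, q):
    """L1 (city-block) distance between two concept vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))

concepts = ["bear", "river", "sky"]       # assumed concept vocabulary
query = [0.8, 0.7, 0.1]                   # detector outputs for the query image
database = {
    "photo_a": [0.7, 0.6, 0.2],           # a bear on a river
    "photo_b": [0.1, 0.1, 0.9],           # mostly sky
}
# Rank database images from semantically closest to farthest.
ranked = sorted(database, key=lambda k: l1_distance(query, database[k]))
```

With these toy values, `photo_a` ranks first, matching the abstract's intuition that concept detectors bring results semantically closer to the query than raw colour or texture would.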
Exploring Multimedia in a Keyword Space
Cited by 5 (2 self)
Abstract:
We address the problem of searching multimedia by semantic similarity in a keyword space. In contrast to previous research, we represent multimedia content by a vector of keywords instead of a vector of low-level features. This vector of keywords can be obtained through manual user annotations or computed by an automatic annotation algorithm. In this setting, we studied the influence of two aspects of the search-by-semantic-similarity process: (1) the accuracy of user keywords versus automatic keywords, and (2) the functions used to compute semantic similarity between the keyword vectors of two multimedia documents. We …
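One natural instance of the second aspect, a similarity function over keyword vectors, is cosine similarity. The vocabulary, the binary manual vector, and the confidence-weighted automatic vector below are assumptions for illustration, not the paper's setup:

```python
# Sketch of comparing two documents in a keyword space: each document is
# a vector over a keyword vocabulary (binary for manual annotations,
# confidence-valued for automatic ones), compared by cosine similarity.
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two keyword vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

vocab = ["beach", "sunset", "car"]    # assumed keyword vocabulary
manual = [1, 1, 0]                    # user-assigned keywords (binary)
auto = [0.9, 0.6, 0.1]                # automatic annotation confidences
sim = cosine_similarity(manual, auto)
```

Other choices the abstract's framing admits (e.g. L1 distance or vector dot products over the same keyword vectors) would slot into the same comparison loop.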
Web News Categorization using a Cross-Media Document Graph
Cited by 3 (0 self)
Abstract:
In this paper we propose a multimedia categorization framework that is able to exploit information across the different parts of a multimedia document (e.g., a Web page, a PDF, a Microsoft Office document). For example, a Web news page is composed of text describing some event (e.g., a car accident) and a picture containing additional information about the real extent of the event (e.g., how damaged the car is) or providing evidence corroborating the text. The framework handles multimedia information by considering not only the document’s text and image data but also the layout structure, which determines how a given text block is related to a particular image. The novelties and contributions of the proposed framework are: (1) support for heterogeneous types of multimedia documents; (2) a document-graph representation method; and (3) the computation of cross-media …
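A document graph of the kind described above can be sketched as nodes for the document's media parts with edges added by layout adjacency, so that categorization evidence can flow between a text block and the image it accompanies. The node names and the adjacency rule are invented for illustration; the paper's actual graph construction may differ:

```python
# Illustrative sketch of a cross-media document graph: nodes are media
# parts (text blocks, images) and edges connect parts that are adjacent
# in the page layout. Toy structure, not the paper's implementation.

class DocumentGraph:
    def __init__(self):
        self.nodes = {}     # node id -> media type ("text" or "image")
        self.edges = set()  # undirected pairs stored as frozensets

    def add_node(self, node_id, media_type):
        self.nodes[node_id] = media_type

    def link(self, a, b):
        """Connect two parts that are adjacent in the page layout."""
        self.edges.add(frozenset((a, b)))

    def neighbors(self, node_id):
        return {x for e in self.edges if node_id in e for x in e if x != node_id}

g = DocumentGraph()
g.add_node("text1", "text")   # paragraph describing the car accident
g.add_node("img1", "image")   # photo showing how damaged the car is
g.link("text1", "img1")       # figure sits next to the paragraph in layout
```

Categorization would then propagate labels or scores along these edges, letting the image evidence corroborate the text block it is laid out with.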
Compound document analysis by fusing evidence across media, In Proceedings of the 7th International Workshop on Content-Based Multimedia Indexing (CBMI 2009), 2009
Cited by 2 (1 self)
Abstract:
In this paper a cross-media analysis scheme for the semantic interpretation of compound documents is presented. It is essentially a late-fusion mechanism that operates on top of the output of single-media extractors, and its main novelty lies in using the evidence extracted from heterogeneous media sources to perform probabilistic inference on a Bayesian network that incorporates knowledge about the domain. Experiments performed on a set of 54 compound documents showed that the proposed scheme is able to exploit the existing cross-media relations and achieve performance improvements.
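The paper performs inference in a domain Bayesian network; as a much simpler stand-in for that machinery, the sketch below combines per-media extractor evidence under a naive independence assumption, which captures the late-fusion spirit of reasoning over heterogeneous extractor outputs. All priors and likelihoods are invented for illustration:

```python
# Simplified late fusion of per-media evidence for a binary label, using
# odds multiplication under a naive independence assumption (a drastic
# simplification of the paper's Bayesian network). Numbers are toy values.

def fuse_evidence(prior, likelihoods):
    """Posterior P(label) given independent per-media evidence.

    prior: P(label) before seeing any media evidence.
    likelihoods: list of (P(evidence | label), P(evidence | not label)),
                 one pair per single-media extractor.
    """
    odds = prior / (1.0 - prior)
    for p_given_label, p_given_not in likelihoods:
        odds *= p_given_label / p_given_not  # multiply in each likelihood ratio
    return odds / (1.0 + odds)               # convert odds back to probability

# A text extractor and an image extractor both weakly support the label.
posterior = fuse_evidence(0.3, [(0.8, 0.4), (0.7, 0.5)])
```

A real Bayesian network would additionally encode dependencies between concepts (domain knowledge), which is precisely what this independence assumption throws away.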
Modeling, classifying and annotating weakly annotated images using Bayesian network
Abstract:
In this paper, we propose a probabilistic graphical model to represent weakly annotated images. We consider an image weakly annotated if the number of keywords defined for it is less than the maximum number defined in the ground truth. This model is used to classify images and to automatically extend existing annotations to new images by taking into account semantic relations between keywords. The proposed method has been evaluated on visual-textual classification and automatic annotation of images. The visual-textual classification is performed using both visual and textual information. The experimental results, obtained from a database of more than 30,000 images, show a 50.5% improvement in recognition rate over classification using visual information alone. Taking into account semantic relations between keywords improves the recognition rate by 10.5%. Moreover, the proposed model can be used to extend existing annotations to weakly annotated images by computing distributions of missing keywords. Semantic relations improve the mean rate of good annotations by 6.9%. Finally, the proposed method is competitive with a state-of-the-art model.
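The annotation-extension step described above can be sketched by scoring candidate missing keywords from their relations to the keywords already present, standing in for the paper's Bayesian-network inference over keyword distributions. The co-occurrence estimates and keyword names are invented for illustration:

```python
# Sketch of extending a weak annotation: rank candidate keywords by their
# average co-occurrence support from the observed keywords. The table of
# toy conditional estimates below is invented, not learned from data.

cooccur = {  # rough stand-in for P(candidate present | observed keyword)
    ("beach", "sea"): 0.9,
    ("beach", "snow"): 0.05,
    ("sunset", "sea"): 0.6,
    ("sunset", "snow"): 0.1,
}

def score_missing(observed, candidate):
    """Average co-occurrence support for adding `candidate` to the annotation."""
    probs = [cooccur.get((obs, candidate), 0.0) for obs in observed]
    return sum(probs) / len(probs) if probs else 0.0

observed = ["beach", "sunset"]   # the image's weak (incomplete) annotation
best = max(["sea", "snow"], key=lambda c: score_missing(observed, c))
```

A full graphical model would replace the pairwise table with joint inference over all keywords at once, which is what lets the paper's semantic relations lift annotation quality.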