@MISC{Do04formalsemantic, author = {Duc Do and Audrey Tam}, title = {Formal Semantic Models for Images and Image Understanding}, year = {2004} }
Abstract
A number of formal models for images [13,27,28] and models for text and image matching [1] have been proposed, but they have not dealt sufficiently with features that carry high-level semantics. While formal models are supposed to be precise, their structures should allow for the subjectivity involved in interpreting the high-level semantics inherent in images. In our earlier work, we showed that by restricting image retrieval to a specific domain, we can use logical reasoning, based on common-sense knowledge bases and on knowledge extracted from text corpora in the same domain, to infer higher-level semantics from lower-level semantics. The interpretation of these lower-level semantics, which usually involve objects in the image, is less subjective, which makes it possible to build an image model that is reasonably objective. Based on these observations, we propose that an effective and feasible way to build high-level semantics into image retrieval is to build semantic models for both the image (the object of meaning) and image understanding (the perception of meaning). The image model will aim to capture image features that are commonly accepted within a certain domain. The image understanding model will include mechanisms for subjective interpretation and will be associated with correspondence functions that measure the similarity between instances of the two models. This measure of similarity, or semantic distance, can be called the semiotic gap. Within this framework, the image retrieval problem can be deemed equivalent to the problem of defining a correspondence function that delivers the theoretically, or empirically, narrowest semiotic gap. We propose to construct the formal image model based on the concepts of semiotic structures, and the image understanding model based on insights into how knowledge inference could assist image retrieval.
In this paper, we present the formal image model and argue why it is suitable for the retrieval of visual data. The image understanding model, which is the subject of ongoing research, is also discussed briefly, along with the results of some preliminary experiments.
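The framing of retrieval as a search for the narrowest semiotic gap can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the class names `ImageModel` and `UnderstandingModel`, the representation of both models as plain sets of labels, and the use of Jaccard distance as a stand-in correspondence function. The paper's actual models are built on semiotic structures and knowledge inference, which this toy example does not attempt to reproduce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageModel:
    """Hypothetical image model: features commonly accepted within a domain."""
    image_id: str
    features: frozenset


@dataclass(frozen=True)
class UnderstandingModel:
    """Hypothetical image understanding model: concepts a viewer has in mind."""
    concepts: frozenset


def semiotic_gap(image: ImageModel, understanding: UnderstandingModel) -> float:
    """Stand-in correspondence function (an assumption): Jaccard distance.

    Returns 0.0 when the two label sets coincide and 1.0 when they are disjoint.
    """
    union = image.features | understanding.concepts
    if not union:
        return 0.0  # two empty models are treated as identical
    shared = image.features & understanding.concepts
    return 1.0 - len(shared) / len(union)


def retrieve(images, understanding: UnderstandingModel) -> ImageModel:
    """Return the image whose model has the narrowest gap to the understanding."""
    return min(images, key=lambda img: semiotic_gap(img, understanding))
```

Under these assumptions, a query understanding of `{"sea", "sun"}` would select an image modelled as `{"sea", "sand", "sun"}` over one modelled as `{"building", "car", "road"}`, because the former shares two of three labels while the latter shares none. Swapping in a different correspondence function changes only `semiotic_gap`, which mirrors the abstract's point that the retrieval problem reduces to the choice of that function.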