Photobook: Content-Based Manipulation of Image Databases
, 1995
Cited by 415 (0 self)
We describe the Photobook system, which is a set of interactive tools for browsing and searching images and image sequences. These query tools differ from those used in standard image databases in that they make direct use of the image content rather than relying on text annotations. Direct search on image content is made possible by use of semantics-preserving image compression, which reduces images to a small set of perceptually significant coefficients. We describe three types of Photobook descriptions in detail: one that allows search based on appearance, one that uses 2-D shape, and a third that allows search based on textural properties. These image content descriptions can be combined with each other and with text-based descriptions to provide a sophisticated browsing and search capability. In this paper we demonstrate Photobook on databases containing images of people, video keyframes, hand tools, fish, texture swatches, and 3-D medical data.
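As a rough illustration of searching on a small set of coefficients, the sketch below ranks database images by the Euclidean distance between coefficient vectors. The coefficient values, the image ids, the function name `rank_by_similarity`, and the choice of Euclidean distance are all assumptions made for this example; the abstract does not specify how Photobook computes or compares its descriptions.

```python
import math

def rank_by_similarity(query, database):
    """Return image ids ordered from most to least similar to the query.

    `database` maps an image id to its coefficient vector. Similarity is
    plain Euclidean distance, chosen here only for illustration.
    """
    return sorted(database, key=lambda image_id: math.dist(query, database[image_id]))

# Hypothetical three-coefficient descriptions of three images.
database = {
    "face_a": [0.90, 0.10, 0.30],
    "face_b": [0.80, 0.20, 0.35],
    "fish": [0.10, 0.90, 0.70],
}
query = [0.88, 0.12, 0.30]
ranked = rank_by_similarity(query, database)
print(ranked)
```

Because each image is reduced to a short vector, a query of this kind only compares a handful of numbers per image rather than full pixel arrays.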
Vision Texture for Annotation
, 1995
Cited by 95 (7 self)
This paper demonstrates a new application of computer vision to digital libraries -- the use of texture for annotation, the description of content. Vision-based annotation assists the user in attaching descriptions to large sets of images and video. If a user labels a piece of an image as "water," a texture model can be used to propagate this label to other "visually similar" regions. However, a serious problem is that no single model has been found to be good enough to reliably match human perception of similarity in pictures. Rather than using one model, the system described here knows several texture models, and is equipped with the ability to choose the one which "best explains" the regions selected by the user for annotating. If none of these models suffices, then it creates new explanations by combining models. Examples are given of annotations propagated by the system on natural scenes. The system provides an average gain of four to one in label prediction over a set of 98 image...
A Fully Automated Content-Based Video Search Engine Supporting Spatiotemporal Queries
- IEEE Transactions on Circuits and Systems for Video Technology
, 1998
Cited by 85 (4 self)
The rapidity with which digital information, particularly video, is being generated has necessitated the development of tools for efficient search of these media. Content-based visual queries have been primarily focused on still image retrieval. In this paper, we propose a novel, interactive system on the Web, based on the visual paradigm, with spatiotemporal attributes playing a key role in video retrieval. We have developed innovative algorithms for automated video object segmentation and tracking, and use real-time video editing techniques while responding to user queries. The resulting system, called VideoQ (demo available at http://www.ctr.columbia.edu/VideoQ/), is the first on-line video search engine supporting automatic object-based indexing and spatiotemporal queries. The system performs well, with the user being able to retrieve complex video clips such as those of skiers and baseball players with ease. Index Terms: content based, information retrieval, object oriented, spat...
Audio Feature Extraction And Analysis For Scene Segmentation And Classification
- Journal of VLSI Signal Processing System
, 1998
Cited by 79 (6 self)
Understanding the scene content of a video sequence is very important for content-based indexing and retrieval of multimedia databases. Research in this area in the past several years has focused on the use of speech recognition and image analysis techniques. As a complementary effort to the prior work, we have focused on using the associated audio information (mainly the nonspeech portion) for video scene analysis. As an example, we consider the problem of discriminating five types of TV programs, namely commercials, basketball games, football games, news reports, and weather forecasts. A set of low-level audio features is proposed for characterizing semantic contents of short audio clips. The linear separability of different classes under the proposed feature space is examined using a clustering analysis. The effective features are identified by evaluating the intracluster and intercluster scattering matrices of the feature space. Using these features, a neural net classifier was...
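To make the idea of comparing intracluster and intercluster scatter concrete, the sketch below scores each feature by the ratio of between-class to within-class scatter. It uses only the diagonal of the two scatter matrices, which is a simplification chosen for this example and not necessarily the criterion used in the paper; the function name `scatter_ratio` and the toy feature values are hypothetical.

```python
def scatter_ratio(samples_by_class):
    """Per-feature ratio of between-class to within-class scatter.

    `samples_by_class` maps a class label to a list of equal-length
    feature vectors. A larger ratio suggests a more discriminative feature.
    """
    all_samples = [v for vectors in samples_by_class.values() for v in vectors]
    dim = len(all_samples[0])
    global_mean = [sum(v[d] for v in all_samples) / len(all_samples) for d in range(dim)]

    within = [0.0] * dim   # diagonal of the intracluster scatter matrix
    between = [0.0] * dim  # diagonal of the intercluster scatter matrix
    for vectors in samples_by_class.values():
        mean = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
        for d in range(dim):
            within[d] += sum((v[d] - mean[d]) ** 2 for v in vectors)
            between[d] += len(vectors) * (mean[d] - global_mean[d]) ** 2

    return [b / w if w > 0 else float("inf") for b, w in zip(between, within)]

# Toy data: feature 0 separates the two classes, feature 1 does not.
clips = {
    "news": [[0.10, 5.0], [0.20, 1.0], [0.15, 3.0]],
    "sports": [[0.90, 4.0], [0.80, 2.0], [0.85, 3.0]],
}
ratios = scatter_ratio(clips)
print(ratios)
```

In this toy data the first feature receives a high score and the second a score of zero, so only the first would be kept as an effective feature.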
VideoQ: An Automated Content Based Video Search System Using Visual Cues
- In Proceedings of ACM Multimedia
, 1997
Cited by 75 (1 self)
The rapidity with which digital information, particularly video, is being generated has necessitated the development of tools for efficient search of these media. Content-based visual queries have been primarily focussed on still image retrieval. In this paper, we propose a novel, real-time, interactive system on the Web, based on the visual paradigm, with spatio-temporal attributes playing a key role in video retrieval. We have developed algorithms for automated video object segmentation and tracking and use real-time video editing techniques while responding to user queries. The resulting system performs well, with the user being able to retrieve complex video clips such as those of skiers and baseball players with ease.
1. Introduction
The ease of capture and encoding of digital images has caused a massive amount of visual information to be produced and disseminated rapidly. Hence efficient tools and systems for searching and retrieving visual information are needed. While there are...
A Real-time Matching System for Large Fingerprint Databases
, 1996
Cited by 68 (12 self)
With the current rapid growth in multimedia technology, there is an imminent need for efficient techniques to search and query large image databases. Because of their unique and peculiar needs, image databases cannot be treated in a similar fashion to other types of digital libraries. The contextual dependencies present in images and the complex nature of two-dimensional image data make the representation issues more difficult for image databases. An invariant representation of an image is still an open research issue. For these reasons, it is difficult to find a universal content-based retrieval technique. Current approaches based on shape, texture, and color for indexing image databases have met with limited success. Further, these techniques have not been adequately tested in the presence of noise and distortions. A given application domain offers stronger constraints for improving the retrieval performance. Fingerprint databases are characterized by their large size as well as nois...
Information retrieval on the Web
- ACM Computing Surveys
, 2000
Cited by 58 (0 self)
In this paper we review studies of the growth of the Internet and technologies that are useful for information search and retrieval on the Web. We present data on the Internet from several different sources, e.g., current as well as projected number of users, hosts, and Web sites. Although numerical figures vary, overall trends cited
Name-It: Naming and Detecting Faces in News Videos
, 1999
Cited by 54 (1 self)
ions. (In the near future, the worldwide trend will be for broadcasts to feature closed captions.) Thus we use closed-caption texts as transcripts for news videos. In addition, we employ video-caption detection and recognition. We used "CNN Headline News" as our primary source of news for our experiments. Given image sequences, transcripts, and video captions as information sources, Name-It associates extracted faces with extracted name candidates using the correlation of their timing information and face similarity information. Video captions are also taken into account as supplementary information. To associate faces and names, Name-It integrates several advanced image processing and natural-language processing techniques: face sequence extraction and similarity evaluation from videos, name extraction from transcripts, and video-caption recognition. Although these technologies aren't always highly accurate, integrating these results will help the system achieve more accurate output
A Statistical Approach to Scene Change Detection
, 1995
Cited by 53 (6 self)
One of the challenging problems in video databases is the organization of video information. Segmenting a video into a number of clips and characterizing each clip has been suggested as one mechanism for organizing video information. This approach requires a suitable method to automatically locate cut points in a video. One way of finding such cut points is to determine the boundaries between consecutive camera shots. In this paper, we address this as a statistical hypothesis testing problem and present three tests to determine cut locations. All three tests can be applied directly to the compressed video. This avoids an unnecessary decompression-compression cycle, since it is common to store and transmit digital video in compressed form. As our experimental results indicate, the statistical approach permits accurate detection of scene changes induced through straight as well as optical cuts.
1 INTRODUCTION
A shot or take in video parlance refers to a contiguous...
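The abstract does not name its three tests, so the sketch below shows only the general shape of a statistical cut detector: compare the intensity histograms of consecutive frames with a chi-square-style statistic and declare a cut when it exceeds a threshold. This is a generic technique applied to decoded pixel values, not the paper's compressed-domain tests; the bin count, the threshold, and all function names are assumptions for illustration.

```python
def histogram(frame, bins=8, max_value=256):
    """Count pixel intensities (0 .. max_value - 1) into equal-width bins."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return counts

def chi_square_distance(h1, h2):
    """Chi-square-style distance between two histograms of equal length."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def detect_cuts(frames, threshold):
    """Return each index i where a cut is declared between frames i-1 and i."""
    hists = [histogram(f) for f in frames]
    return [
        i for i in range(1, len(hists))
        if chi_square_distance(hists[i - 1], hists[i]) > threshold
    ]

# Toy video: three dark frames followed by two bright frames.
dark = [20, 25, 30, 22] * 16
bright = [220, 230, 210, 225] * 16
video = [dark, dark, dark, bright, bright]
cuts = detect_cuts(video, threshold=10.0)
print(cuts)
```

A fixed threshold is the simplest possible decision rule; a hypothesis-testing formulation would instead derive the decision boundary from an assumed distribution of the statistic under the "no cut" hypothesis.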
Audio content analysis for online audiovisual data segmentation and classification
- 62 IEEE SIGNAL PROCESSING MAGAZINE MARCH 2004
, 2001
Cited by 46 (2 self)
While current approaches for audiovisual data segmentation and classification are mostly focused on visual cues, audio signals may actually play a more important role in content parsing for many applications. An approach to automatic segmentation and classification of audiovisual data based on audio content analysis is proposed. The audio signal from movies or TV programs is segmented and classified into basic types such as speech, music, song, environmental sound, speech with music background, environmental sound with music background, silence, etc. Simple audio features including the energy function, the average zero-crossing rate, the fundamental frequency, and the spectral peak tracks are extracted to ensure the feasibility of real-time processing. A heuristic rule-based procedure is proposed to segment and classify audio signals; it is built upon morphological and statistical analysis of the time-varying functions of these audio features. Experimental results show that the proposed scheme achieves an accuracy rate of more than 90% in audio classification. Index Terms: audio analysis, audio indexing, audio segmentation, audiovisual content parsing, information filtering and retrieval, multimedia database management.
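Two of the features named above, the energy function and the zero-crossing rate, are simple enough to sketch directly. The example below computes them per frame for a synthetic signal. The frame size, the sample rate, the test tone, and the function names are assumptions made for this illustration, and the definitions are the common textbook ones rather than the exact formulas used in the paper.

```python
import math

def short_time_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def split_frames(signal, size, hop):
    """Split a signal into fixed-size frames, dropping any ragged tail."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

# Synthetic input: a 440 Hz tone followed by silence, sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1024)]
silence = [0.0] * 1024
features = [
    (short_time_energy(f), zero_crossing_rate(f))
    for f in split_frames(tone + silence, size=256, hop=256)
]
for energy, zcr in features:
    print(f"energy={energy:.3f} zcr={zcr:.3f}")
```

Both features cost one pass over the samples of a frame, which is consistent with the abstract's emphasis on keeping real-time processing feasible.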

