Learning Video Preferences Using Visual Features and Closed Captions
, 2007
Viewers of video now have more choices than ever. As the number of choices increases, the task of searching through these choices to locate video of interest is becoming more difficult. Current methods for learning a viewer's preferences in order to automate the search process rely either on video having content descriptions or on having been rated by other viewers identified as being similar. However, much video exists that does not meet these requirements. To address this need, we use hidden Markov models to learn the preferences of a viewer by combining visual features and closed captions. We validate our approach by testing the learned models on a data set composed of features drawn from movies and user ratings obtained from publicly available data sets.
Feature Article Learning Video Preferences Using Visual Features and Closed Captions
identifying a viewer’s video preferences uses hidden Markov models by combining visual features and closed captions.
supervised by
Despite numerous outstanding results, highly complex and specialized multimedia algorithms have not been able to fulfill the promise of fully automated multimedia interpretation. An essential problem is that they are insufficiently aware of the context they operate in. Algorithms that do take a form of context into consideration often function in a domain-specific environment. The generic framework proposed in this paper stimulates algorithm collaboration on an interpretation task by continuously updating the context of the multimedia item under interpretation. Semantic Web knowledge, combined with reasoning methods, forms the cornerstone of the integration of these various interacting agents. We believe that this framework will enable an advanced interpretation of multimedia data that goes beyond the capabilities of individual algorithms. A basic platform implementation already indicates the potential of the concept, clearing the path for even more complex interpretation scenarios.
USING OBSCENE SOUND ANALYSIS BASED ON A REPEATED CURVE-LIKE SPECTRUM FEATURE
This paper addresses the automatic classification of X-rated videos by analyzing their obscene sounds. In this paper, obscene sounds refer to audio signals generated from sexual moans and screams during sexual scenes. By analyzing various sound samples, we determined the distinguishable characteristics of obscene sounds and propose a repeated curve-like spectrum feature that represents the characteristics of such sounds. We constructed 6,269 audio clips to evaluate the proposed feature, and separately constructed 1,200 X-rated and general videos for classification. The proposed feature has an F1-score, precision, and recall rate of 96.6%, 98.2%, and 95.2%, respectively, for the original dataset, and 92.6%, 97.6%, and 88.0% for a noisy dataset of 5 dB SNR. In classifying videos, the feature has more than a 90% F1-score, 97% precision, and an 84% recall rate. The measured performance shows that X-rated videos can be classified with only the audio features and that the repeated curve-like spectrum feature is suitable for detecting obscene sounds. KEYWORDS X-rated video classification, obscene sound classification, repeated curve-like spectrum feature, audio feature, support vector machine
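The F1-score quoted in the abstract above is, by its standard definition, the harmonic mean of precision and recall. The sketch below only recomputes it from the precision and recall figures given in the abstract; the function name `f1_score` and the dictionary layout are illustrative choices, not anything taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as quoted in the abstract, converted to fractions.
reported = {
    "original dataset": (0.982, 0.952),
    "noisy dataset (5 dB SNR)": (0.976, 0.880),
}

f1_by_dataset = {name: f1_score(p, r) for name, (p, r) in reported.items()}

for name, f1 in f1_by_dataset.items():
    print(f"{name}: F1 = {f1:.3f}")
```

Because the inputs are already rounded to one decimal place, the recomputed values can differ from the reported F1-scores by about a tenth of a percentage point.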
Communication
This paper describes our participation in the Genre Tagging Task of MediaEval 2011, which aims to predict the videos' category label. We use bag-of-words approaches with different features derived from visual content and associated textual information. We perform different experiments in which different constellations of single modalities, classification methods, visual features and their combinations are investigated. Each video of the test set is assigned a single genre label; therefore, the classification accuracy (CA) is a good metric for evaluation. As expected, the metadata contain the most information for distinguishing genres (MAP = 0.2988 / CA = 65%). In combination with visual words the performance can be increased (MAP
Int J Multimed Info Retr DOI 10.1007/s13735-012-0024-2 TRENDS AND SURVEYS High-level event recognition in unconstrained videos
, 2012
The goal of high-level event recognition is to automatically detect complex high-level events in a given video sequence. This is a difficult task, especially when videos are captured under unconstrained conditions by nonprofessionals. Such videos depicting complex events have limited quality control and may therefore include severe camera motion, poor lighting, heavy background clutter, and occlusion. However, due to the fast-growing popularity of such videos, especially on the Web, solutions to this problem are in high demand and have attracted great interest from researchers. In this paper, we review current technologies for complex event recognition in unconstrained videos. While the existing solutions vary, we identify common key modules and provide detailed descriptions along with some insights for each of them, including extraction and representation of low-level features across different modalities, classification strategies, fusion techniques, etc. Publicly available benchmark datasets, performance metrics, and related research forums are also described. Finally, we discuss promising directions for future research. Y.-G. Jiang (B)
VIDEO CLASSIFICATION BASED ON SOCIAL ATTITUDES
Organizing large video databases is a pressing need and a challenging problem. Social attitudes in the form of users' beliefs and evaluations can benefit classification. For instance, news videos do not gather as much user attention as music videos, while sports videos trigger interest mainly during the time of the event. In this paper, we provide an extensive analysis of the role of usage statistics in aiding classification. Towards this, we propose a novel framework motivated by evolutionary biology to characterize growth, persistence and decline of contents in online environments. We then incorporate this information in a nearest neighbor classifier to establish categories. The effectiveness of the approach is demonstrated by comparing against results obtained using principal component analysis followed by nearest-neighbor-based classification.
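As a rough illustration of the nearest neighbor step mentioned in the abstract above, the sketch below assigns a video the category of its closest labeled example under Euclidean distance. Everything specific here is an assumption made for illustration: the three-component feature vector (growth, persistence, decline), the toy values, the category names, and the helper `nearest_neighbor` are hypothetical and are not taken from the paper, whose actual features and distance measure are not given in the abstract.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical feature layout: (growth, persistence, decline) scores in [0, 1].
Example = Tuple[Sequence[float], str]


def nearest_neighbor(query: Sequence[float], examples: List[Example]) -> str:
    """Return the label of the training example closest to `query` (1-NN)."""
    if not examples:
        raise ValueError("at least one labeled example is required")
    best_label, best_dist = examples[0][1], math.inf
    for features, label in examples:
        dist = math.dist(query, features)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy labeled examples, invented purely for illustration.
training: List[Example] = [
    ((0.9, 0.1, 0.8), "sports"),  # sharp rise and fall around an event
    ((0.3, 0.8, 0.1), "music"),   # slow growth, long-lived attention
    ((0.6, 0.2, 0.6), "news"),    # moderate rise, short-lived attention
]

predicted = nearest_neighbor((0.85, 0.15, 0.75), training)
print(predicted)
```

A single nearest neighbor keeps the sketch short; a k-nearest-neighbor vote over several examples would be a natural variant, though the abstract does not say which form the authors used.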

