Learning to tag from open vocabulary labels
- In ECML PKDD ’10, 2010
Cited by 5 (3 self)
Abstract. Most approaches to classifying media content assume a fixed, closed vocabulary of labels. In contrast, we advocate machine learning approaches which take advantage of the millions of free-form tags obtainable via online crowd-sourcing platforms and social tagging websites. The use of such open vocabularies presents learning challenges due to typographical errors, synonymy, and a potentially unbounded set of tag labels. In this work, we present a new approach that organizes these noisy tags into well-behaved semantic classes using topic modeling, and learns to predict tags accurately using a mixture of topic classes. This method can utilize an arbitrary open vocabulary of tags, reduces training time by 94% compared to learning from these tags directly, and achieves comparable performance for classification and superior performance for retrieval. We also demonstrate that on open vocabulary tasks, human evaluations are essential for measuring the true performance of tag classifiers, which traditional evaluation methods will consistently underestimate. We focus on the domain of tagging music clips, and demonstrate our results using data collected with a human computation game called TagATune.
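The "mixture of topic classes" idea in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a topic model has already produced per-topic tag probabilities and that some classifier outputs a topic distribution for a clip. The function names, toy topics, and numbers are hypothetical and are not taken from the paper.

```python
def tag_scores(topic_probs, topic_tag_probs):
    """Score each tag as sum_k P(topic k | clip) * P(tag | topic k).

    topic_probs     : list of floats, assumed classifier output for one clip
    topic_tag_probs : list of dicts, one per topic, mapping tag -> probability
    """
    scores = {}
    for k, p_topic in enumerate(topic_probs):
        for tag, p_tag in topic_tag_probs[k].items():
            scores[tag] = scores.get(tag, 0.0) + p_topic * p_tag
    return scores


def top_tags(scores, n=3):
    """Return the n highest-scoring tags (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:n]]


# Hypothetical topics: each groups related free-form tags together.
toy_topics = [
    {"rock": 0.6, "guitar": 0.4},
    {"classical": 0.7, "piano": 0.3},
]
# Hypothetical topic distribution predicted for one music clip.
clip_topic_probs = [0.8, 0.2]

print(top_tags(tag_scores(clip_topic_probs, toy_topics), n=2))
```

Because the classifier only has to predict a small number of topics rather than every distinct tag, a design like this would plausibly explain the reduced training time the abstract reports, though the abstract does not spell out the mechanism.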
A Simple Word Trigger Method for Social Tag Suggestion
It is popular for users in the Web 2.0 era to freely annotate online resources with tags. To ease the annotation process, there has been great interest in automatic tag suggestion. We propose a method to suggest tags according to the text description of a resource. By considering both the description and tags of a given resource as summaries of the resource written in two languages, we adopt word alignment models in statistical machine translation to bridge their vocabulary gap. Based on the translation probabilities between the words in descriptions and the tags estimated on a large set of description-tag pairs, we build a word trigger method (WTM) to suggest tags according to the words in a resource description. Experiments on real-world datasets show that WTM is effective and robust compared with other methods. Moreover, WTM is relatively simple and efficient, which makes it practical for Web applications.
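The word-trigger idea can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses an IBM Model 1 style EM procedure (without a NULL word) as the word alignment model, weights all description words equally when scoring, and uses hypothetical function names and toy data.

```python
from collections import defaultdict


def train_trigger_table(pairs, iterations=10):
    """Estimate t[word][tag] ~ P(tag | word) from (description_words, tags) pairs
    with IBM Model 1 style EM (an assumed choice of alignment model)."""
    tag_vocab = {tag for _, tags in pairs for tag in tags}
    uniform = 1.0 / max(len(tag_vocab), 1)
    t = defaultdict(lambda: defaultdict(lambda: uniform))
    for _ in range(iterations):
        count = defaultdict(lambda: defaultdict(float))
        total = defaultdict(float)
        # E-step: distribute each tag over the words that may have triggered it.
        for words, tags in pairs:
            for tag in tags:
                norm = sum(t[w][tag] for w in words)
                for w in words:
                    frac = t[w][tag] / norm
                    count[w][tag] += frac
                    total[w] += frac
        # M-step: renormalise so each word's tag probabilities sum to one.
        t = defaultdict(lambda: defaultdict(float))
        for w, tag_counts in count.items():
            for tag, c in tag_counts.items():
                t[w][tag] = c / total[w]
    return t


def suggest_tags(t, description_words, top_k=3):
    """Rank tags by the average trigger probability over the description words."""
    scores = defaultdict(float)
    for w in description_words:
        if w not in t:  # unseen words trigger nothing
            continue
        for tag, p in t[w].items():
            scores[tag] += p / len(description_words)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_k]]


# Hypothetical description-tag pairs.
toy_pairs = [
    (["guitar", "solo"], ["rock"]),
    (["guitar", "riff"], ["rock"]),
    (["piano", "sonata"], ["classical"]),
    (["piano", "concerto"], ["classical"]),
]
model = train_trigger_table(toy_pairs)
print(suggest_tags(model, ["guitar", "melody"], top_k=1))
```

Uniform word weighting keeps the sketch short; a weighting scheme that emphasises more informative description words would be a natural refinement.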

