Describable Visual Attributes for Face Images
Abstract
We introduce the use of describable visual attributes for face images. Describable visual attributes are labels that can be given to an image to describe its appearance. This thesis focuses mostly on images of faces and the attributes used to describe them, although the concepts also apply to other domains. Examples of face attributes include gender, age, jaw shape, and nose size. An attribute-based representation has several advantages for vision tasks: attributes can be composed to create descriptions at various levels of specificity; they are generalizable, as they can be learned once and then applied to recognize new objects or categories without any further training; and they are efficient, possibly requiring exponentially fewer attributes (and training data) than explicitly naming each category. We show how one can create and label large datasets of real-world images to train classifiers that measure the presence, absence, or degree to which an attribute is expressed in images. These classifiers can then automatically label new images. We demonstrate the current effectiveness and explore the future potential of using attributes for image search, automatic face replacement in images, and face verification, via both human and computational experiments. To aid other researchers in studying these
Midge: Generating Image Descriptions From Computer Vision Detections
Abstract
This paper introduces a novel generation system that composes humanlike descriptions of images from computer vision detections. By leveraging syntactically informed word co-occurrence statistics, the generator filters and constrains the noisy detections output from a vision system to generate syntactic trees that detail what the computer vision system sees. Results show that the generation system outperforms state-of-the-art systems, automatically generating some of the most natural image descriptions to date.
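
As a rough illustration of the filtering idea in the abstract above, the toy sketch below drops detected labels that never co-occur with any other detected label in a small text corpus. It is not the Midge system: the function names (`cooccurrence_counts`, `filter_detections`), the `min_count` threshold, the corpus, and the detections are all hypothetical, and plain sentence-level co-occurrence stands in for the syntactically informed statistics the paper describes.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(corpus):
    """Count how often each unordered word pair appears in the same sentence.

    Assumption: plain sentence-level co-occurrence, not syntactic statistics.
    """
    counts = Counter()
    for sentence in corpus:
        for a, b in combinations(sorted(set(sentence)), 2):
            counts[(a, b)] += 1
    return counts


def filter_detections(detections, counts, min_count=1):
    """Keep a detection only if its label co-occurs with another detected label.

    `detections` is a list of (label, score) pairs; `min_count` is a
    hypothetical threshold chosen for illustration.
    """
    labels = [label for label, _ in detections]
    kept = []
    for label, score in detections:
        supported = any(
            counts[tuple(sorted((label, other)))] >= min_count
            for other in labels
            if other != label
        )
        if supported:
            kept.append((label, score))
    return kept


# Made-up corpus and detector output, for illustration only.
corpus = [
    ["dog", "grass", "ball"],
    ["dog", "ball"],
    ["person", "dog", "grass"],
]
detections = [("dog", 0.9), ("grass", 0.7), ("toaster", 0.4)]

result = filter_detections(detections, cooccurrence_counts(corpus))
print(result)
```

Here "toaster" is discarded because it never appears alongside "dog" or "grass" in the toy corpus, while the two mutually supported labels are kept. A real system would also need to use the detector scores and syntactic context, which this sketch ignores.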

