SIMPLIcity: Semantics-Sensitive Integrated Matching for Picture LIbraries
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2001
Cited by 307 (28 self)
The need for efficient content-based image retrieval has increased tremendously in many application areas such as biomedicine, military, commerce, education, and Web image classification and searching. We present here SIMPLIcity (Semantics-sensitive Integrated Matching for Picture LIbraries), an image retrieval system, which uses semantics classification methods, a wavelet-based approach for feature extraction, and integrated region matching based upon image segmentation. As in other region-based retrieval systems, an image is represented by a set of regions, roughly corresponding to objects, which are characterized by color, texture, shape, and location. The system classifies images into semantic categories, such as textured-nontextured and graph-photograph. Potentially, the categorization enhances retrieval by permitting semantically-adaptive searching methods and narrowing down the searching range in a database. A measure for the overall similarity between images is developed using a region-matching scheme that integrates properties of all the regions in the images. Compared with retrieval based on individual regions, the overall similarity approach 1) reduces the adverse effect of inaccurate segmentation, 2) helps to clarify the semantics of a particular region, and 3) enables a simple querying interface for region-based image retrieval systems. The application of SIMPLIcity to several databases, including a database of about 200,000 general-purpose images, has demonstrated that our system performs significantly better and faster than existing ones. The system is fairly robust to image alterations.
Survey of clustering data mining techniques
, 2002
Cited by 177 (0 self)
Accrue Software, Inc. Clustering is a division of data into groups of similar objects. Representing the data by fewer clusters necessarily loses certain fine details, but achieves simplification. It models data by its clusters. Data modeling puts clustering in a historical perspective rooted in mathematics, statistics, and numerical analysis. From a machine learning perspective clusters correspond to hidden patterns, the search for clusters is unsupervised learning, and the resulting system represents a data concept. From a practical perspective clustering plays an outstanding role in data mining applications such as scientific data exploration, information retrieval and text mining, spatial database applications, Web analysis, CRM, marketing, medical diagnostics, computational biology, and many others. Clustering is the subject of active research in several fields such as statistics, pattern recognition, and machine learning. This survey focuses on clustering in data mining. Data mining adds to clustering the complications of very large datasets with very many attributes of different types. This imposes unique
Image Categorization by Learning and Reasoning with Regions
- Journal of Machine Learning Research
, 2004
Cited by 98 (7 self)
Designing computer programs to automatically categorize images using low-level features is a challenging research topic in computer vision. In this paper, we present a new learning technique, which extends Multiple-Instance Learning (MIL), and its application to the problem of region-based image categorization. Images are viewed as bags, each of which contains a number of instances corresponding to regions obtained from image segmentation. The standard MIL problem assumes that a bag is labeled positive if at least one of its instances is positive; otherwise, the bag is negative.
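The standard MIL labeling rule stated in this abstract can be sketched in a few lines. This is only an illustration of the bag-level assumption, not the paper's learning technique; the feature layout and the `is_bright` instance concept are hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical representation: each instance is a region's feature vector.
Instance = Sequence[float]


def bag_label(bag: Sequence[Instance],
              instance_is_positive: Callable[[Instance], bool]) -> bool:
    """Standard MIL assumption: a bag is positive if at least one
    of its instances is positive; otherwise it is negative."""
    return any(instance_is_positive(x) for x in bag)


def is_bright(region: Instance) -> bool:
    """Hypothetical instance-level concept: first feature exceeds 0.5."""
    return region[0] > 0.5


# Two hypothetical images, each a bag of two region feature vectors.
image_a = [(0.1, 0.3), (0.9, 0.2)]  # second region satisfies the concept
image_b = [(0.2, 0.4), (0.3, 0.8)]  # no region satisfies the concept
```

Under these assumptions `bag_label(image_a, is_bright)` is true and `bag_label(image_b, is_bright)` is false. An empty bag is negative.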
IRM: Integrated Region Matching for Image Retrieval
, 2000
Cited by 75 (12 self)
Content-based image retrieval using region segmentation has been an active research area. We present IRM (Integrated Region Matching), a novel similarity measure for region-based image similarity comparison. The targeted image retrieval systems represent an image by a set of regions, roughly corresponding to objects, which are characterized by features reflecting color, texture, shape, and location properties. The IRM measure for evaluating overall similarity between images incorporates properties of all the regions in the images by a region-matching scheme. Compared with retrieval based on individual regions, the overall similarity approach reduces the influence of inaccurate segmentation, helps to clarify the semantics of a particular region, and enables a simple querying interface for region-based image retrieval systems. The IRM has been implemented as a part of our experimental SIMPLIcity image retrieval system. The application to a database of about 200,000 general-purpose images ...
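The abstract does not spell out the region-matching scheme, so the following is only a rough sketch of one way an overall distance could integrate all regions of two images. It assumes each region carries a weight (for example, its area fraction, summing to 1 per image) and that matching credit is handed out greedily to the most similar region pairs first. The function name, the weighting, and the greedy order are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Feature = Sequence[float]
Region = Tuple[Feature, float]  # (feature vector, weight); weights sum to 1


def integrated_distance(regions_a: List[Region],
                        regions_b: List[Region],
                        region_distance: Callable[[Feature, Feature], float]) -> float:
    """Weighted sum of region-to-region distances over all matched pairs.

    Assumption: credit is assigned greedily, closest pairs first, until
    every region's weight has been used up.
    """
    # Enumerate every cross-image region pair, closest first.
    pairs = sorted(
        (region_distance(fa, fb), i, j)
        for i, (fa, _) in enumerate(regions_a)
        for j, (fb, _) in enumerate(regions_b)
    )
    left_a = [w for _, w in regions_a]  # unassigned weight per region in A
    left_b = [w for _, w in regions_b]  # unassigned weight per region in B
    total = 0.0
    for dist, i, j in pairs:
        credit = min(left_a[i], left_b[j])
        if credit > 0:
            total += credit * dist
            left_a[i] -= credit
            left_b[j] -= credit
    return total


def abs_diff(x: Feature, y: Feature) -> float:
    """Hypothetical region distance on one-dimensional features."""
    return abs(x[0] - y[0])


two_regions = [((0.0,), 0.5), ((1.0,), 0.5)]
one_region = [((0.0,), 1.0)]
```

Because every region contributes in proportion to its weight, a single badly segmented region shifts the result only by its share, which is the robustness property the abstract claims for whole-image matching.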
Content-Based Image Retrieval Using Multiple-Instance Learning
, 2002
Cited by 64 (5 self)
We explore the application of machine learning techniques to the problem of content-based image retrieval (CBIR). Unlike most existing CBIR systems in which only global information is used or in which a user must explicitly indicate what part of the image is of interest, we apply the multiple-instance (MI) learning model to use a small number of training images to learn what images from the database are of interest to the user.
A Region-Based Fuzzy Feature Matching Approach to Content-Based Image Retrieval
, 2002
Cited by 62 (11 self)
This paper proposes a fuzzy logic approach, UFM (unified feature matching), for region-based image retrieval. In our retrieval system, an image is represented by a set of segmented regions each of which is characterized by a fuzzy feature (fuzzy set) reflecting color, texture, and shape properties. As a result, an image is associated with a family of fuzzy features corresponding to regions. Fuzzy features naturally characterize the gradual transition between regions (blurry boundaries) within an image, and incorporate the segmentation-related uncertainties into the retrieval algorithm. The resemblance of two images is then defined as the overall similarity between two families of fuzzy features, and quantified by a similarity measure, UFM measure, which integrates properties of all the regions in the images. Compared with similarity measures based on individual regions and on all regions with crisp-valued feature representations, the UFM measure greatly reduces the influence of inaccurate segmentation, and provides a very intuitive quantification. The UFM has been implemented as a part of our experimental SIMPLIcity image retrieval system. The performance of the system is illustrated using examples from an image database of about 60,000 general-purpose images.
Towards a benchmark for Semantic Web reasoners - an analysis of the DAML ontology library
, 2003
Cited by 31 (4 self)
Benchmarks are one important aspect of performance evaluation. This paper concentrates on the development of a representative benchmark for Semantic Web reasoners. To this extent we perform a statistical analysis of available Semantic Web ontologies, in our case the DAML ontology library, and derive parameters that can be used for the generation of synthetic ontologies. These synthetic ontologies can be used as workloads in benchmarks. Naturally, performance evaluation can also be performed using a real workload, viz. a workload that is observed on a reasoner being used for normal operations. However, such workloads can usually not be applied repeatedly in a controlled manner. Therefore synthetic workloads are typically used in performance evaluations. Synthetic workloads should be a representation or model of the real workload. Hence it is necessary to measure and characterize the workload on existing reasoners to produce meaningful synthetic workloads. This should allow us to systematically evaluate different reasoners ...
The fuzzy correlation between code and performance predictability
- In Proceedings of the 37th International Symposium on Microarchitecture (MICRO
, 2004
Cited by 18 (1 self)
Recent studies have shown that most SPEC CPU2K benchmarks exhibit strong phase behavior, and the Cycles per Instruction (CPI) performance metric can be accurately predicted based on a program’s control-flow behavior, by simply observing the sequencing of the program counters, or extended instruction pointers (EIPs). One motivation of this paper is to see if server workloads also exhibit such phase behavior. In particular, can EIPs effectively predict CPI in server workloads? We propose using regression trees to measure the theoretical upper bound on the accuracy of predicting the CPI using EIPs, where accuracy is measured by the explained variance of CPI with EIPs. Our results show that for most server workloads and, surprisingly, even for CPU2K benchmarks, the accuracy of predicting CPI from EIPs varies widely. We classify the benchmarks into four quadrants based on their CPI variance and predictability of CPI using EIPs. Our results indicate that no single sampling technique can be broadly applied to a large class of applications. We propose a new methodology that selects the best-suited sampling technique to accurately capture the program behavior.
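The accuracy measure named here, the explained variance of CPI given EIPs, can be illustrated with a much simpler stand-in for a regression tree: predict each sample's CPI by the mean CPI of all samples sharing its EIP. This per-EIP mean predictor is an assumption made for illustration and is not the paper's method. Evaluated in-sample, it gives an optimistic figure, in the spirit of an upper bound.

```python
from collections import defaultdict
from statistics import fmean
from typing import Hashable, Sequence


def explained_variance(eips: Sequence[Hashable], cpis: Sequence[float]) -> float:
    """Fraction of CPI variance explained by predicting each sample
    with the mean CPI of its EIP group (1 - residual SS / total SS)."""
    groups = defaultdict(list)
    for eip, cpi in zip(eips, cpis):
        groups[eip].append(cpi)

    overall_mean = fmean(cpis)
    total_ss = sum((c - overall_mean) ** 2 for c in cpis)
    if total_ss == 0:
        # Constant CPI: there is no variance left to explain.
        return 1.0

    residual_ss = sum(
        (c - fmean(group)) ** 2
        for group in groups.values()
        for c in group
    )
    return 1.0 - residual_ss / total_ss


# Hypothetical samples: CPI is fully determined by the EIP.
predictable = explained_variance([0x10, 0x10, 0x20, 0x20], [1.0, 1.0, 3.0, 3.0])
# Hypothetical samples: CPI varies identically within each EIP.
unpredictable = explained_variance([0x10, 0x10, 0x20, 0x20], [1.0, 3.0, 1.0, 3.0])
```

In the first hypothetical trace the EIP determines the CPI and the explained variance is 1.0. In the second, the EIP carries no information about CPI and the value is 0.0.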
Latent Variable Models for Neural Data Analysis
, 1999
Cited by 17 (3 self)
The brain is perhaps the most complex system to have ever been subjected to rigorous scientific investigation. The scale is staggering: over 10^11 neurons, each making an average of 10^3 synapses, with computation occurring on scales ranging from a single dendritic spine, to an entire cortical area. Slowly, we are beginning to acquire experimental tools that can gather the massive amounts of data needed to characterize this system. However, to understand and interpret these data will also require substantial strides in inferential and statistical techniques. This dissertation attempts to meet this need, extending and applying the modern tools of latent variable modeling to problems in neural data analysis. It is divided
On Generalized Multiple-Instance Learning
- International Journal of Computational Intelligence and Applications
, 2003
Cited by 17 (4 self)
We describe a generalization of the multiple-instance learning model in which a bag's label is not based on a single instance's proximity to a single target point. Rather, a bag is positive if and only if it contains a collection of instances, each near one of a set of target points. We list potential applications of this model (robot vision, content-based image retrieval, protein sequence identification, and drug discovery) and describe target concepts for these applications that cannot be represented in the conventional multiple-instance learning model. We then adapt a learning-theoretic algorithm for learning in this model and present empirical results.
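One reading of the generalized labeling rule described in this abstract is sketched below: a bag is positive only when every target point has some instance near it. The Euclidean distance and the single `radius` threshold are assumptions made for illustration. The abstract does not fix how "near" is measured, and this is not the paper's learning algorithm.

```python
import math
from typing import Sequence

Point = Sequence[float]


def generalized_bag_label(bag: Sequence[Point],
                          targets: Sequence[Point],
                          radius: float) -> bool:
    """Positive iff each target point has at least one instance
    within `radius` (Euclidean distance, an assumed notion of 'near')."""
    return all(
        any(math.dist(instance, target) <= radius for instance in bag)
        for target in targets
    )


# Hypothetical target concept made of two target points.
targets = [(0.0, 0.0), (1.0, 1.0)]

covers_both = [(0.1, 0.0), (0.9, 1.0), (5.0, 5.0)]  # near both targets
covers_one = [(0.1, 0.0), (5.0, 5.0)]               # near the first target only
```

With `radius=0.2`, `covers_both` is labeled positive and `covers_one` negative. Under the conventional model with the single target `(0.0, 0.0)`, `covers_one` would be positive, which is the kind of concept the conventional model cannot distinguish.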

