Preference Mining: A Novel Approach on Mining User Preferences for Personalized Applications
, 2003
Cited by 22 (2 self)
Advanced personalized e-applications require comprehensive knowledge about their users’ likes and dislikes in order to provide individual product recommendations, personal customer advice and custom-tailored product offers. In our approach we model such preferences as strict partial orders with “A is better than B” semantics, which has been proven to be very suitable in various e-applications. In this paper we present novel Preference Mining techniques for detecting strict partial order preferences in user log data. The main advantage of our approach is the semantic expressiveness of the Preference Mining results. Experimental evaluations prove the effectiveness and efficiency of our algorithms. Since the Preference Mining implementation uses sophisticated SQL statements to execute all data-intensive operations on the database layer, our algorithms scale well even for large log data sets. With our approach personalized e-applications can gain valuable knowledge about their customers’ preferences, which is essential for a qualified customer service.
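As a rough illustration of the "A is better than B" semantics mentioned in this abstract, the sketch below checks whether a set of (better, worse) pairs forms a strict partial order once transitivity is applied. It is not the paper's mining algorithm; the function names and the colour example are hypothetical.

```python
from itertools import product

def transitive_closure(pairs):
    """Return the transitive closure of a set of (better, worse) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Snapshot the relation so we can add pairs while scanning it.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

def is_strict_partial_order(pairs):
    """True if the closed relation is irreflexive and asymmetric."""
    closure = transitive_closure(pairs)
    irreflexive = all(a != b for a, b in closure)
    asymmetric = all((b, a) not in closure for a, b in closure)
    return irreflexive and asymmetric

# Hypothetical preference statements: red > blue, blue > green.
prefs = {("red", "blue"), ("blue", "green")}
closed = transitive_closure(prefs)  # also contains ("red", "green")
```

A cyclic input such as `{("a", "b"), ("b", "a")}` fails the check, because closing it under transitivity yields the reflexive pair `("a", "a")`.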
Fast Accurate Fuzzy Clustering through Data
- IEEE Transactions on Fuzzy Systems
, 2003
Cited by 13 (7 self)
Clustering is a useful approach in image segmentation, data mining and other pattern recognition problems for which unlabeled data exist. Fuzzy clustering using fuzzy c-means or variants of it can provide a data partition that is both better and more meaningful than hard clustering approaches. The clustering process can be quite slow when there are many objects or patterns to be clustered. This paper discusses an algorithm, brFCM, which is able to reduce the number of distinct patterns which must be clustered without adversely affecting partition quality. The reduction is done by aggregating similar examples and then using a weighted exemplar in the clustering process. The reduction in the amount of clustering data allows a partition of the data to be produced faster. The algorithm is applied to the problem of segmenting 32 magnetic resonance images into different tissue types and the problem of segmenting 172 infrared images into trees, grass and target. Average speed-ups of as much as 59 to 290 times a traditional implementation of fuzzy c-means were obtained using brFCM, while producing partitions that are equivalent to those produced by fuzzy c-means.
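The idea of aggregating similar examples into weighted exemplars before clustering can be sketched as follows. This is a simplified one-dimensional illustration and not the published brFCM implementation: the choice of dropping low-order bits as the aggregation rule, the function names, and the toy intensities are all assumptions made for the example.

```python
def reduce_by_quantization(values, bits_dropped=2):
    """Aggregate integer intensities that agree after dropping low-order bits.

    Each bin is represented by its mean (the exemplar) and its count (the weight).
    """
    bins = {}
    for v in values:
        bins.setdefault(v >> bits_dropped, []).append(v)
    exemplars = [sum(b) / len(b) for b in bins.values()]
    weights = [len(b) for b in bins.values()]
    return exemplars, weights

def memberships(x, centers, m):
    """Standard fuzzy c-means memberships of x with fuzzifier m."""
    d = [abs(x - c) for c in centers]
    if min(d) == 0.0:
        hit = d.index(0.0)
        return [1.0 if i == hit else 0.0 for i in range(len(centers))]
    inv = [di ** (-2.0 / (m - 1.0)) for di in d]
    total = sum(inv)
    return [v / total for v in inv]

def weighted_fcm(xs, ws, centers, m=2.0, iters=50):
    """Fuzzy c-means over exemplars xs, each counted ws[i] times."""
    centers = list(centers)
    for _ in range(iters):
        num = [0.0] * len(centers)
        den = [0.0] * len(centers)
        for x, w in zip(xs, ws):
            for i, u in enumerate(memberships(x, centers, m)):
                num[i] += w * (u ** m) * x
                den[i] += w * (u ** m)
        centers = [n / d if d > 0 else c for n, d, c in zip(num, den, centers)]
    return centers

# Toy data: 800 pixel intensities collapse to 3 weighted exemplars.
pixels = [10, 11, 12, 13, 200, 201, 202, 203] * 100
exemplars, weights = reduce_by_quantization(pixels)
centers = sorted(weighted_fcm(exemplars, weights, centers=[0.0, 255.0]))
```

The clustering loop then touches 3 exemplars per iteration instead of 800 pixels, which is where the speed-up in this kind of scheme comes from.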
A Fast and Robust General Purpose Clustering Algorithm
- In Pacific Rim International Conference on Artificial Intelligence
, 2000
Cited by 12 (2 self)
General purpose and highly applicable clustering methods are usually required during the early stages of knowledge discovery exercises. k-Means has been adopted as the prototype of iterative model-based clustering because of its speed, simplicity and capability to work within the format of very large databases. However, k-Means has several disadvantages derived from its statistical simplicity. We propose an algorithm that remains very efficient, generally applicable, multi-dimensional but is more robust to noise and outliers. We achieve this by using the discrete median rather than the mean as the estimator of the center of a cluster. Comparison with k-Means, Expectation Maximization and Gibbs sampling demonstrates the advantages of our algorithm.
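To see why a discrete median is more robust than a mean as a cluster center, consider the minimal sketch below. It uses a brute-force search over the cluster's own points; the paper's efficient algorithm is not reproduced here, and the helper names and sample points are hypothetical.

```python
import math

def discrete_median(points):
    """Return the data point with the smallest total distance to all points."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))

def mean(points):
    """Coordinate-wise arithmetic mean of the points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

# Four nearby points plus one distant outlier.
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (100.0, 100.0)]
center_median = discrete_median(cluster)
center_mean = mean(cluster)
```

Here the discrete median stays at `(1.0, 1.0)`, inside the dense group, while the mean is dragged to roughly `(20.4, 20.4)` by the single outlier.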
Hybrid Optimization for Clustering in Data Mining
, 2000
Cited by 1 (0 self)
The need to analyze data is growing exponentially and the field of Data Mining has emerged for the development of techniques to obtain information and knowledge from vast amounts of micro-data. To this end, clustering is a fundamental task. Formulations using representatives have received much more attention because these provide prototypical items for a class. They also offer general purpose and highly applicable clustering methods. We review the optimization criteria that define the inductive principle in representative-based clustering and then discuss the algorithms for their approximate solution. We propose a hybrid of iterative and genetic algorithms that obtains more cost-effective optimization of the medoid clustering criteria.
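The medoid clustering criterion referred to here is commonly stated as the sum, over all points, of the distance to the nearest representative. The sketch below evaluates that criterion and improves it with a plain swap-based local search. It covers only an iterative component under that assumed formulation; the genetic part of the proposed hybrid is not shown, and all names and data are hypothetical.

```python
import math

def medoid_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)

def swap_local_search(points, medoids):
    """Apply improving medoid/non-medoid swaps until none remains."""
    medoids = list(medoids)
    best = medoid_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(len(medoids)):
            for p in points:
                if p in medoids:
                    continue
                trial = medoids[:i] + [p] + medoids[i + 1:]
                cost = medoid_cost(points, trial)
                if cost < best:  # strict decrease guarantees termination
                    medoids, best, improved = trial, cost, True
    return medoids, best

# Two well-separated groups; the start places both medoids in the first group.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
found, cost = swap_local_search(data, [(0, 1), (1, 0)])
```

A local search of this kind can stall in poor local optima on harder inputs, which is one motivation for combining it with a global strategy such as a genetic algorithm.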
Fast Approximation of Sums of Distances
We show how to preprocess a set $S$ of $n$ points in $\mathbb{R}^d$ ($d$ constant) using $O(kn \log^{d-1} n)$ time and space so that the sum of distances of points in $S$ to a query point $q$ can be approximated to within a factor of $O(\epsilon)$ in $O(k \log^{d-1} n)$ time, where $\epsilon$ is an arbitrarily small constant and $k$ is a constant dependent only on $\epsilon$ and $d$. We also give applications of this technique to approximation algorithms for clustering and facility location problems. 1 Introduction Let $S = \{p_1, \ldots, p_n\}$ be a set of points in $\mathbb{R}^d$, with $d$ constant. For a query point $q$ we define the weight of $q$ as $w(q) = \sum_{i=1}^{n} d(q, p_i)$ (1), where $d(x, y)$ denotes the Euclidean distance between $x$ and $y$. This function appears frequently as the objective function in facility location and clustering problems [2, 3, 5, 6, 10, 18]. Unfortunately, even with preprocessing, it appears that little can be done in order to speed up the evaluation of $w(q)$ for an arbitrary query point $q$, and the only known result is...
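For reference, the weight function in equation (1) can be evaluated exactly by a linear scan, which is the baseline that a preprocessing scheme of this kind aims to beat. The sketch below is only that brute-force baseline, not the approximate data structure from the paper; the function name is hypothetical.

```python
import math

def weight(q, points):
    """Exact sum of Euclidean distances from q to every point, in O(n) time."""
    return sum(math.dist(q, p) for p in points)

# Distances from the origin are 5, 1 and 10.
total = weight((0, 0), [(3, 4), (0, 1), (-6, 8)])
```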
Fast Approximations for Sums of Distances,
- Computational Geometry: Theory and Applications
, 2002
We describe two data structures that preprocess a set $S$ of $n$ points in $\mathbb{R}^d$ ($d$ constant) so that the sum of Euclidean distances of points in $S$ to a query point $q$ can be quickly approximated to within a factor of $\epsilon$. This preprocessing technique has several applications in clustering and facility location. Using it, we derive an $O(n \log n)$ time deterministic and $O(n)$ time randomized $\epsilon$-approximation algorithm for the so-called Fermat-Weber problem in any fixed dimension.
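The Fermat-Weber problem asks for the point minimizing the sum of Euclidean distances to a given set. The sketch below uses the classical Weiszfeld iteration, which is a different and much older technique than the approximation algorithms derived in the paper; it is included only to make the objective concrete. The early return when an iterate lands on a data point is a simplification, and the names are hypothetical.

```python
import math

def sum_of_distances(x, points):
    """Fermat-Weber objective: total Euclidean distance from x to the points."""
    return sum(math.dist(x, p) for p in points)

def weiszfeld(points, iters=200, eps=1e-12):
    """Approximate the geometric median by iteratively reweighted averaging."""
    dim = len(points[0])
    # Start from the centroid.
    x = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    for _ in range(iters):
        num = [0.0] * dim
        den = 0.0
        for p in points:
            d = math.dist(x, p)
            if d < eps:
                return tuple(x)  # simplification: stop if we hit a data point
            w = 1.0 / d
            den += w
            for k in range(dim):
                num[k] += w * p[k]
        x = [v / den for v in num]
    return tuple(x)

sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
median = weiszfeld(sites)
centroid = (2.75, 2.75)
```

For these four sites, which form a convex quadrilateral, the minimizer is the intersection of the diagonals at `(0.5, 0.5)`, far from the centroid.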
Clustering With Obstacle Entities
, 1999
With a large amount of data stored in spatial databases, one may like to find groups of data which share similar features, such as spatial proximity. Thus cluster analysis has become an active area of research in data mining. However, most of the clustering algorithms developed so far have ignored the fact that physical obstacles exist in the real world, and these obstacles may substantially affect clustering methods as well as clustering results. This thesis studies the problem of Clustering with Obstacle Entities (COE), and develops a k-medoid algorithm, COE-Clarans (Clustering with Obstacle Entities based on CLARANS), which performs effective clustering by taking obstacle entities into consideration. Some computational geometry techniques were selected for optimizing the speed and scalability of COE-Clarans. Our experimental and performance studies have demonstrated the effectiveness and efficiency of the algorithm. In addition, the complexity of the COE problem has been analyzed in our st...
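One basic geometric test that obstacle-aware clustering needs is whether the straight line between two points is blocked by an obstacle. The sketch below checks for a proper crossing between the connecting segment and any edge of a polygonal obstacle. It is an illustration only and is not taken from COE-Clarans: it ignores degenerate touching cases and points lying inside the obstacle, and all names are hypothetical.

```python
def orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1, 0 or -1."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)

def segments_cross(p, q, a, b):
    """True if segment pq properly crosses segment ab (touching excluded)."""
    return (orient(p, q, a) * orient(p, q, b) < 0
            and orient(a, b, p) * orient(a, b, q) < 0)

def is_blocked(p, q, obstacle):
    """True if segment pq crosses any edge of the polygon `obstacle`.

    `obstacle` is a list of vertices in boundary order.
    """
    n = len(obstacle)
    return any(
        segments_cross(p, q, obstacle[i], obstacle[(i + 1) % n])
        for i in range(n)
    )

# A rectangular wall standing between x = 1 and x = 2.
wall = [(1, -1), (2, -1), (2, 1), (1, 1)]
across = is_blocked((0, 0), (3, 0), wall)    # path runs through the wall
alongside = is_blocked((0, 0), (0, 3), wall)  # path stays left of the wall
```

When a pair is blocked, an obstacle-aware method would replace the Euclidean distance with the length of a detour around the obstacle, which this sketch does not compute.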

