Results 1-10 of 41
How many clusters? Which clustering method? Answers via model-based cluster analysis
 THE COMPUTER JOURNAL
, 1998
Variable neighborhood search: Principles and applications
, 2001
Abstract

Cited by 94 (9 self)
Systematic change of neighborhood within a possibly randomized local search algorithm yields a simple and effective metaheuristic for combinatorial and global optimization, called variable neighborhood search (VNS). We present a basic scheme for this purpose, which can easily be implemented using any local search algorithm as a subroutine. Its effectiveness is illustrated by solving several classical combinatorial or global optimization problems. Moreover, several extensions are proposed for solving large problem instances: using VNS within the successive approximation method yields a two-level VNS, called variable neighborhood decomposition search (VNDS); modifying the basic scheme to explore easily valleys far from the incumbent solution yields an efficient skewed VNS (SVNS) heuristic. Finally, we show how to stabilize column generation algorithms with the help of VNS and discuss various ways to use VNS in graph theory, i.e., to suggest, disprove or give hints on how to prove conjectures, an area where metaheuristics do not appear
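The basic scheme described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names (`basic_vns`, `toy_cost`, `toy_shake`, `toy_descent`), the stopping rule (`max_iter`), and the one-dimensional toy problem are all assumptions chosen for the example. The loop perturbs the incumbent in neighborhood k, runs a local search from the perturbed point, and either moves to the improved solution (resetting k) or widens the neighborhood.

```python
import random

def basic_vns(x0, cost, shake, local_search, k_max, max_iter=200, seed=0):
    """Basic VNS loop: shake in neighborhood k, descend, then move or widen."""
    rng = random.Random(seed)
    best, best_cost = x0, cost(x0)
    for _ in range(max_iter):
        k = 1
        while k <= k_max:
            # Perturb the incumbent, then improve the perturbed point locally.
            candidate = local_search(shake(best, k, rng))
            candidate_cost = cost(candidate)
            if candidate_cost < best_cost:
                best, best_cost = candidate, candidate_cost
                k = 1          # improvement: restart from the smallest neighborhood
            else:
                k += 1         # no improvement: try a larger neighborhood
    return best, best_cost

# Hypothetical toy problem: every even integer is a local minimum under +/-1 moves.
def toy_cost(x):
    return abs(x - 7) + 5 * (x % 2)

def toy_shake(x, k, rng):
    # Neighborhood k: integers within distance 2*k of x (an assumed structure).
    return x + rng.randint(-2 * k, 2 * k)

def toy_descent(x):
    # Steepest descent over the two adjacent integers.
    while True:
        neighbor = min((x - 1, x + 1), key=toy_cost)
        if toy_cost(neighbor) >= toy_cost(x):
            return x
        x = neighbor

best, best_cost = basic_vns(20, toy_cost, toy_shake, toy_descent, k_max=3)
print(best, best_cost)
```

Plain descent from the start point 20 would stop immediately, since both adjacent integers are worse; the shaking step is what lets the search escape such local minima.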
An Analysis of Recent Work on Clustering Algorithms
, 1999
Abstract

Cited by 73 (0 self)
This paper describes four recent papers on clustering, each of which approaches the clustering problem from a different perspective and with different goals. It analyzes the strengths and weaknesses of each approach and describes how a user could decide which algorithm to use for a given clustering application. Finally, it concludes with ideas that could make the selection and use of clustering algorithms for data analysis less difficult.
Fast Training Algorithms For Multi-Layer Neural Nets
, 1993
Abstract

Cited by 29 (0 self)
Training a multilayer neural net by backpropagation is slow and requires arbitrary choices regarding the number of hidden units and layers. This paper describes an algorithm which is much faster than backpropagation and for which it is not necessary to specify the number of hidden units in advance. The relationship with other fast pattern recognition algorithms, such as algorithms based on k-d trees, is mentioned. The algorithm has been implemented and tested on artificial problems such as the parity problem and on real problems arising in speech recognition. Experimental results, including training times and recognition accuracy, are given. Generally, the algorithm achieves accuracy as good as or better than nets trained using backpropagation, and the training process is much faster than backpropagation. Accuracy is comparable to that for the "nearest neighbour" algorithm, which is slower and requires more storage space. Comments: Only the Abstract is given here. The full paper ap...
Toward a measurement-based geographic location service
 in Proc. of PAM’2004, Antibes Juan-les-Pins
, 2004
Abstract

Cited by 20 (5 self)
Location-aware applications require a geographic location service of Internet hosts. We focus on a measurement-based service for the geographic location of Internet hosts. Host locations are inferred by comparing delay patterns of geographically distributed landmarks, which are hosts with a known geographic location, with the delay pattern of the target host to be located. Results show a significant correlation between geographic distance and network delay that can be exploited for a coarse-grained geographic location of Internet hosts.
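The idea of inferring a location by comparing delay patterns can be illustrated with a small sketch. Everything specific here is an assumption made for the example: the landmark names, coordinates, and delay values are invented, and Euclidean distance between delay vectors is just one plausible similarity measure (the abstract does not say which one the paper uses). Each delay vector holds round-trip times in milliseconds from the same ordered set of probe hosts.

```python
import math

def locate(target_delays, landmarks):
    """Return (name, location) of the landmark with the most similar delay pattern."""
    def distance(a, b):
        # Euclidean distance between two delay vectors (an assumed measure).
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    name = min(landmarks, key=lambda n: distance(target_delays, landmarks[n]["delays"]))
    return name, landmarks[name]["location"]

# Hypothetical landmarks: known (latitude, longitude) and measured delays in ms.
landmarks = {
    "lm_a": {"location": (48.9, 2.4), "delays": [5.0, 80.0, 150.0]},
    "lm_b": {"location": (40.7, -74.0), "delays": [80.0, 6.0, 90.0]},
    "lm_c": {"location": (35.7, 139.7), "delays": [150.0, 90.0, 8.0]},
}

# Delay pattern measured for the host to be located (made-up values).
estimate = locate([78.0, 12.0, 95.0], landmarks)
print(estimate)
```

The estimate is only as fine-grained as the landmark set: the target is assigned the location of its closest-matching landmark, which matches the coarse-grained character of the service described above.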
An interior point algorithm for minimum sum of squares clustering
 SIAM J. Sci. Comput
, 1997
Abstract

Cited by 20 (8 self)
An exact algorithm is proposed for minimum sum-of-squares nonhierarchical clustering, i.e., for partitioning a given set of points from a Euclidean m-space into a given number of clusters in order to minimize the sum of squared distances from all points to the centroid of the cluster to which they belong. This problem is expressed as a constrained hyperbolic program in 0-1 variables. The resolution method combines an interior point algorithm, i.e., a weighted analytic center column generation method, with branch-and-bound. The auxiliary problem of determining the entering column (i.e., the oracle) is an unconstrained hyperbolic program in 0-1 variables with a quadratic numerator and linear denominator. It is solved through a sequence of unconstrained quadratic programs in 0-1 variables. To accelerate resolution, variable neighborhood search heuristics are used both to get a good initial solution and to solve quickly the auxiliary problem as long as global optimality is not reached. Estimated bounds for the dual variables are deduced from the heuristic solution and used in the resolution process as a trust region. Proved minimum sum-of-squares partitions are determined for the first time for several fairly large data sets from the literature, including Fisher’s 150 iris. Key words. classification and discrimination, cluster analysis, interior-point methods, combinatorial optimization
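To make the objective in this abstract concrete, the following sketch evaluates the sum of squared distances from each point to the centroid of its cluster for a given partition. It only computes the criterion being minimized; it does not implement the interior point, column generation, or branch-and-bound machinery of the paper. The function name `sum_of_squares` and the four sample points are illustrative assumptions.

```python
def sum_of_squares(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid: coordinate-wise mean of the cluster's points.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total

# Hypothetical data: two well-separated pairs of points in the plane.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = sum_of_squares(points, [0, 0, 1, 1])   # groups nearby points together
bad = sum_of_squares(points, [0, 1, 0, 1])    # pairs each point with a distant one
print(good, bad)
```

An exact method such as the one described has to certify that no other partition into the given number of clusters yields a smaller value of this function.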
Heuristic Methods for Large Centroid Clustering Problems
, 1996
Abstract

Cited by 16 (5 self)
This article presents new heuristic methods for solving a class of hard centroid clustering problems including the p-median, the sum-of-squares clustering and the multi-source Weber problems. Centroid clustering is to partition a set of entities into a given number of subsets and to find the location of a centre for each subset in such a way that a dissimilarity measure between the entities and the centres is minimized. The first method proposed is a candidate list search that produces good solutions in a short amount of time if the number of centres in the problem is not too large. The second method is a general local optimization approach that finds very good solutions. The third method is designed for problems with a large number of centres; it decomposes the problem into subproblems that are solved independently. Numerical results show that these methods are efficient (dozens of best solutions known for problem instances from the literature have been improved) and fast, handling problem instances with more than 85'000 entities and 15'000 centres, much larger than those solved in the literature. The expected complexity of these new procedures is discussed and shown to be comparable to that of an existing method which is known to be very fast.
Data Resource Selection in Distributed Visual Information Systems
 IEEE Transactions on Knowledge and Data Engineering
, 1998
Abstract

Cited by 15 (4 self)
With the advances in multimedia databases and the popularization of the Internet, it is now possible to access large image and video repositories distributed throughout the world. One of the challenging problems in such an access is how the information in the respective databases can be summarized to enable an intelligent selection of relevant database sites based on visual queries. This paper presents an approach to solve this problem based on image content-based indexing of a metadatabase at a query distribution server. The metadatabase records a summary of the visual content of the images in each database through image templates and statistical features characterizing the similarity distributions of the images. The selection of the databases is done by searching the metadatabase using a ranking algorithm that uses query similarity to a template and the features of the databases associated with the template. Two selection approaches, termed mean-based and histogram-based approaches, ...
SemQuery: Semantic Clustering and Querying on Heterogeneous Features for Visual data
 IEEE Trans. Knowledge and Data Engineering
, 1998
Abstract

Cited by 14 (0 self)
The effectiveness of content-based image retrieval can be enhanced using the heterogeneous features embedded in the images. However, since the features in texture, color, and shape are generated using different computation methods and thus may require different similarity measurements, the integration of the retrieval on heterogeneous features is a nontrivial task. In this paper, we present a semantics-based clustering and indexing approach, termed SemQuery, to support visual queries on heterogeneous features of images. Using this approach, the database images are classified based on their heterogeneous features. Each semantic image cluster contains a set of subclusters that are represented by the heterogeneous features that the images contain. A database image is included into a feature subcluster only if the image contains all the features under the same cluster. We also design a multilayer model to merge the results of basic queries on individual features. A visual query proc...
Concept Hierarchy in Data Mining: Specification, Generation and Implementation
, 1997
Abstract

Cited by 11 (0 self)
Data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. As one of the most important kinds of background knowledge, concept hierarchy plays a fundamentally important role in data mining. It is the purpose of this thesis to study some aspects of concept hierarchy, such as automatic generation and encoding techniques, in the context of data mining. After a discussion of the basic terminology and categorization, automatic generation of concept hierarchies is studied for both nominal and numerical hierarchies. One algorithm is designed for determining the partial order on a given set of nominal attributes. The resulting partial order is a useful guide for users to finalize the concept hierarchy for their particular data mining tasks. Based on hierarchical and partitioning clustering methods, two algorithms are proposed for the automatic generation of numerical hierarchies. The quality and performance comparisons indicate that the ...