Low-complexity fuzzy relational clustering algorithms for web mining
 IEEE TRANSACTIONS ON FUZZY SYSTEMS
, 2001
Cited by 34 (2 self)
This paper presents new algorithms, fuzzy c-medoids (FCMdd) and robust fuzzy c-medoids (RFCMdd), for fuzzy clustering of relational data. The objective functions are based on selecting c representative objects (medoids) from the data set in such a way that the total fuzzy dissimilarity within each cluster is minimized. A comparison of FCMdd with the well-known relational fuzzy c-means algorithm (RFCM) shows that FCMdd is more efficient. We present several applications of these algorithms to Web mining, including Web document clustering, snippet clustering, and Web access log analysis.
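The medoid-selection idea in this abstract can be illustrated with a minimal sketch. It assumes the usual alternating scheme (fuzzy memberships from inverse dissimilarities, then a medoid update that minimizes the membership-weighted dissimilarity); the abstract does not spell out the update rules, so the function name `fuzzy_c_medoids`, the fuzzifier `m`, the `init` parameter, and the stopping test are illustrative assumptions rather than the authors' exact algorithm.

```python
import random

def fuzzy_c_medoids(R, c, m=2.0, init=None, max_iter=100, seed=0):
    """Hypothetical sketch: alternate membership and medoid updates.

    R is an n x n list of lists of nonnegative dissimilarities, c is the
    number of medoids, m > 1 is the fuzzifier (an assumed parameter).
    Returns (medoid indices, memberships u[i][j] of object j in cluster i).
    """
    n = len(R)
    eps = 1e-12  # avoids division by zero when an object is its own medoid

    def memberships(medoids):
        u = [[0.0] * n for _ in range(c)]
        for j in range(n):
            # Weight of each cluster is an inverse power of the dissimilarity
            # between object j and that cluster's medoid.
            w = [(1.0 / max(R[v][j], eps)) ** (1.0 / (m - 1.0)) for v in medoids]
            total = sum(w)
            for i in range(c):
                u[i][j] = w[i] / total
        return u

    if init is not None:
        medoids = list(init)
    else:
        medoids = random.Random(seed).sample(range(n), c)

    for _ in range(max_iter):
        u = memberships(medoids)
        new_medoids = []
        for i in range(c):
            # Candidate k's cost is the fuzzy total dissimilarity of cluster i
            # if object k were its medoid; pick the cheapest candidate.
            cost = [sum((u[i][j] ** m) * R[k][j] for j in range(n)) for k in range(n)]
            new_medoids.append(min(range(n), key=cost.__getitem__))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, memberships(medoids)

# Usage on a toy relational data set: only pairwise dissimilarities are used.
points = [0, 1, 2, 10, 11, 12]
R = [[abs(a - b) for b in points] for a in points]
medoids, u = fuzzy_c_medoids(R, c=2, init=[0, 5])
```

Only the matrix `R` enters the computation, which is what makes the approach relational. Like other alternating schemes, this sketch can stop in a local minimum, so the result depends on the initial medoids.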
A Fuzzy Relative of the k-Medoids Algorithm with Application to Web Document and Snippet Clustering
 Proc. IEEE Intl. Conf. Fuzzy Systems - FUZZ-IEEE99, Korea
, 1999
Cited by 17 (2 self)
This paper presents new algorithms (Fuzzy c-Medoids, or FCMdd, and Fuzzy c-Trimmed Medoids, or FCTMdd) for fuzzy clustering of relational data. The objective functions are based on selecting c representative objects (medoids) from the data set in such a way that the total dissimilarity within each cluster is minimized. A comparison of FCMdd with the Relational Fuzzy c-Means algorithm (RFCM) shows that FCMdd is much faster. We present examples of applications of these algorithms to Web document and snippet clustering. 1. Introduction. Object data refers to the situation where the objects to be clustered are represented by vectors x_i ∈ R^p. Relational data refers to the situation where we have only numerical values representing the degrees to which pairs of objects in the data set are related. Algorithms that generate partitions of relational data are usually referred to as relational (or sometimes pairwise) clustering algorithms. Relational clustering is more general in the sense tha...
Automatic Web User Profiling and Personalization Using Robust Fuzzy Relational Clustering
, 2002
Cited by 14 (4 self)
The proliferation of information on the World Wide Web has made the personalization of this information space a necessity. Personalization of content returned from a Web site is a desired feature that can enhance server performance, improve system design, and lead to wise marketing decisions in electronic commerce. Mining typical user profiles from the vast amount of historical data stored in access logs is an important component of Web personalization. In the absence of a priori knowledge, unsupervised or clustering methods seem to be ideally suited to categorize the usage behavior of Web surfers. In this chapter, we present a framework for mining typical user profiles from server access logs based on robust fuzzy relational clustering. As a byproduct of the clustering process that generates robust profiles, associations between different URL addresses on a given site can easily be inferred. In general, the URLs that are present in the same profile tend to be visited together in the same session or form a large itemset. Finally, we present a personalization system that uses previously mined profiles to automatically generate a Web page containing URLs the user might be interested in. Our personalization approach is based on profiles computed from the prior traversal patterns of users on the website and does not require users to provide any declarative private information or to log in.
An Introduction to Symbolic Data Analysis and the Sodas Software
 Journal of Symbolic Data Analysis
, 2003
Multidimensional Scaling of Interval-Valued Dissimilarity Data
 Pattern Recognition Letters
, 2000
Cited by 4 (1 self)
Multidimensional scaling is a well-known technique for representing measurements of dissimilarity among objects as points in a p-dimensional space. In this paper, this method is extended to the case where dissimilarities are only known to lie within certain intervals. Each object is then no longer represented as a point, but as a region of R^p, in such a way that the minimum and maximum distances between two regions approximate the lower and upper bounds of the dissimilarity interval between the two objects. Experiments with real data demonstrate the ability of this method to represent both the structure and the precision of dissimilarity measurements. Keywords: Multidimensional scaling, Interval-valued data, Exploratory data analysis, Data visualization.
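To make the region-based representation concrete, here is a small sketch of the minimum and maximum distances between two regions. The abstract only says "region"; modelling each region as a ball (a center plus a radius) and scoring the fit with a sum of squared errors are assumptions made for illustration, and the names `region_distance_bounds` and `interval_stress` are hypothetical.

```python
import math

def region_distance_bounds(c1, r1, c2, r2):
    """Min and max Euclidean distance between two balls (assumed region shape)."""
    d = math.dist(c1, c2)          # distance between the centers
    d_min = max(0.0, d - r1 - r2)  # zero when the balls overlap
    d_max = d + r1 + r2
    return d_min, d_max

def interval_stress(centers, radii, lower, upper):
    """Assumed squared-error loss between region distances and interval bounds."""
    total = 0.0
    n = len(centers)
    for i in range(n):
        for j in range(i + 1, n):
            d_min, d_max = region_distance_bounds(centers[i], radii[i],
                                                  centers[j], radii[j])
            total += (d_min - lower[i][j]) ** 2 + (d_max - upper[i][j]) ** 2
    return total

# Two unit balls whose centers are 5 apart reproduce the interval [3, 7] exactly.
bounds = region_distance_bounds((0.0, 0.0), 1.0, (5.0, 0.0), 1.0)
stress = interval_stress([(0.0, 0.0), (5.0, 0.0)], [1.0, 1.0],
                         [[0, 3], [3, 0]], [[0, 7], [7, 0]])
```

In this picture, wider dissimilarity intervals call for larger regions, which is how a configuration can show the precision of the measurements as well as their structure.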
Multidimensional scaling of interval-valued dissimilarity data
 Pattern Recognition Letters
, 2000
Cited by 2 (1 self)
Multidimensional scaling is a well-known technique for representing measurements of dissimilarity among objects as points in a p-dimensional space. In this paper, this method is extended to the case where dissimilarities are only known to lie within certain intervals. Each object is then no longer represented as a point, but as a region of R^p, in such a way that the minimum and maximum distances between two regions approximate the lower and upper bounds of the dissimilarity interval between the two objects. Experiments with real data demonstrate the ability of this method to represent both the structure and the precision of dissimilarity measurements. Keywords: Multidimensional scaling, Interval-valued data, Exploratory data analysis, Data visualization.
Minimization Subproblems and Heuristics for an Applied Clustering Problem
, 2001
Cited by 2 (1 self)
A practical problem that requires classifying a set of points of R^n using a criterion that is not sensitive to bounded outliers is studied in this paper. A fixed-point (k-means) algorithm is defined that uses an arbitrary distance function. Finite convergence is proved. A robust distance defined by Boente, Fraiman and Yohai is selected for applications. Smooth approximations of this distance are defined and suitable heuristics are introduced to enhance the probability of finding global optimizers. A real-life example is presented and discussed.
Attribute Analysis in Biomedical Text Classification
Cited by 1 (0 self)
Text Classification tasks are becoming increasingly popular in the field of Information Access. Although they are approached as Machine Learning problems, suitable attributes for each task are defined in an ad hoc way. We believe that a more principled framework is required, and we present initial insights on attribute engineering for Text Classification, along with a software library that allows experiment definition and fast prototyping of classification systems. The library is currently being used and evaluated in Information Access projects in the biomedical domain.
Knowledge Discovery From Symbolic Data And The Sodas Software
 Conf. on Principles and Practice of Knowledge Discovery in Databases, PPKDD2000
, 2000
Cited by 1 (0 self)
The data descriptions of the units are called "symbolic" when they are more complex than the standard ones because they contain internal variation and are structured. Symbolic data arise from many sources, for instance when huge Relational Data Bases are summarised by their underlying concepts. "Extracting knowledge" means getting explanatory results; that is why "symbolic objects" are introduced and studied in this paper. They model concepts and constitute an explanatory output for data analysis. Moreover, they can be used to define queries of a Relational Data Base and to propagate concepts between Data Bases. We define "Symbolic Data Analysis" (SDA) as the extension of standard Data Analysis to symbolic data tables as input in order to find symbolic objects as output. In this paper we give an overview of recent developments in SDA. We present some tools and methods of SDA and introduce the SODAS software prototype (the result of the work of 17 teams from nine countries involved in a European project of EUROSTAT).
Adaptive Concept Learning through Clustering and Aggregation of Relational Data
We introduce a new approach for Clustering and Aggregating Relational Data (CARD). We assume that data is available in a relational form, where we only have information about the degrees to which pairs of objects in the data set are related. Moreover, we assume that the relational information is represented by multiple dissimilarity matrices. These matrices could have been generated using different sensors, features, or mappings. CARD is designed to aggregate pairwise distances from multiple relational matrices, partition the data into clusters, and learn a relevance weight for each matrix in each cluster simultaneously. We introduce two versions of CARD. The first one is completely unsupervised (UCARD). The second version is semi-supervised (SSCARD) and uses partial supervision information that consists of a small set of must-link and cannot-link constraints. The performance of the proposed algorithms is illustrated by using them to categorize a collection of 500 color images. We represent the pairwise image dissimilarities by six different relational matrices that encode color, texture, and structure information. The results are compared with those obtained by three other relational clustering methods.
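The aggregation step described in this abstract can be pictured with a short sketch. The abstract does not give the aggregation formula or say how the relevance weights are learned, so the convex combination below, the fixed example weights, and the name `aggregate_dissimilarities` are assumptions for illustration only, not the CARD update equations.

```python
def aggregate_dissimilarities(matrices, weights):
    """Combine K dissimilarity matrices (each n x n) into one for a cluster.

    Assumed form: a weighted sum, with one nonnegative relevance weight per
    matrix and the weights of a cluster summing to 1.
    """
    if len(matrices) != len(weights):
        raise ValueError("one weight per matrix is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    n = len(matrices[0])
    return [[sum(w * M[a][b] for w, M in zip(weights, matrices))
             for b in range(n)]
            for a in range(n)]

# Two hypothetical relational matrices over the same two objects,
# e.g. one derived from color and one from texture.
color = [[0.0, 2.0], [2.0, 0.0]]
texture = [[0.0, 4.0], [4.0, 0.0]]

# A cluster that relies mostly on texture weights that matrix more heavily.
combined = aggregate_dissimilarities([color, texture], [0.25, 0.75])
```

Because each cluster carries its own weight vector, the same pair of objects can look close in one cluster's aggregated matrix and far apart in another's.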