Results 1 - 9 of 9
Clustering Gene Expression Patterns
, 1999
Abstract

Cited by 332 (11 self)
Recent advances in biotechnology allow researchers to measure expression levels for thousands of genes simultaneously, across different conditions and over time. Analysis of data produced by such experiments offers potential insight into gene function and regulatory mechanisms. A key step in the analysis of gene expression data is the detection of groups of genes that manifest similar expression patterns. The corresponding algorithmic problem is to cluster multi-condition gene expression patterns. In this paper we describe a novel clustering algorithm that was developed for analysis of gene expression data. We define an appropriate stochastic error model on the input, and prove that under the conditions of the model, the algorithm recovers the cluster structure with high probability. The running time of the algorithm on an n-gene dataset is $O(n^2 (\log n)^c)$. We also present a practical heuristic based on the same algorithmic ideas. The heuristic was implemented and its p...
Identifying Distinctive Subsequences in Multivariate Time Series by Clustering
 PROC. ACM SIGKDD
, 1999
Abstract

Cited by 36 (2 self)
Most time series comparison algorithms attempt to discover what the members of a set of time series have in common. We investigate a different problem, determining what distinguishes time series in that set from other time series obtained from the same source. In both cases the goal is to identify shared patterns, though in the latter case those patterns must be distinctive as well. An efficient incremental algorithm for identifying distinctive subsequences in multivariate, real-valued time series is described and evaluated with data from two very different sources: the response of a set of bandpass filters to human speech and the sensors of a mobile robot.
Performance Criteria for Graph Clustering and Markov Cluster Experiments
 NATIONAL RESEARCH INSTITUTE FOR MATHEMATICS AND COMPUTER SCIENCE IN THE
, 2000
Abstract

Cited by 32 (1 self)
In [6] a cluster algorithm for graphs was introduced called the Markov cluster algorithm or MCL algorithm. The algorithm is based on simulation of (stochastic) flow in graphs by means of alternation of two operators, expansion and inflation. The results in [8] establish an intrinsic relationship between the corresponding algebraic process (MCL process) and cluster structure in the iterands and the limits of the process. Several kinds of experiments conducted with the MCL algorithm are described here. Test cases with varying homogeneity characteristics are used to establish some of the particular strengths and weaknesses of the algorithm. In general the algorithm performs well, except for graphs which are very homogeneous (such as weakly connected grids) and for which the natural cluster diameter (i.e. the diameter of a subgraph induced by a natural cluster) is large. This can be understood in terms of the flow characteristics of the MCL algorithm and the heuristic on which the...
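The alternation of expansion and inflation described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the added self-loops, the fixed iteration count, and the `1e-6` threshold used to read clusters from the final matrix are all assumptions made for illustration.

```python
def normalize_columns(m):
    """Scale each column to sum to 1, giving a column-stochastic matrix."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion: square the matrix (two steps of the random walk)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation: raise every entry to the power r, then renormalize columns."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(adjacency, inflation=2.0, iterations=50):
    """Illustrative MCL loop; parameter defaults are assumptions."""
    n = len(adjacency)
    # Assumption: add a self-loop to every node before normalizing.
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    # Rows that retain mass act as attractors; the columns they reach
    # are read off as one cluster. Identical member sets are merged.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Usage: two triangles joined by a single bridge edge (2-3).
graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(mcl(graph))
```

Larger inflation values sharpen the contrast between strong and weak flow and so tend to produce finer clusterings. The value 2.0 here is only a common starting point, not a recommendation from the paper.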
Reinterpreting the Category Utility Function
, 2001
Abstract

Cited by 7 (0 self)
The category utility function is a partition quality scoring function applied in some clustering programs of machine learning. We reinterpret this function in terms of the data variance explained by a clustering, or, equivalently, in terms of the square-error classical clustering criterion that administers the K-Means and Ward methods. This analysis suggests extensions of the scoring function to situations with differently standardized and mixed scale data. Keywords: Clustering, data standardization, contingency coefficient, correlation ratio, weighting features, mixed-scale data. Author: Boris Mirkin.
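The link between the square-error criterion and "variance explained" mentioned in this abstract can be made concrete with a small sketch. It shows only the classical decomposition of total scatter into within-cluster and between-cluster parts. It does not reproduce the paper's treatment of category utility, and the function names are hypothetical.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def scatter(points, center):
    """Sum of squared Euclidean distances from the points to a center."""
    return sum(sum((p[d] - center[d]) ** 2 for d in range(len(center)))
               for p in points)


def explained_variance(points, labels):
    """Return (within-cluster square error, share of scatter explained).

    Assumes the points are not all identical, so total scatter is nonzero.
    """
    total = scatter(points, centroid(points))
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    within = sum(scatter(members, centroid(members))
                 for members in clusters.values())
    return within, (total - within) / total


# Usage: two tight, well-separated pairs of points.
points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
labels = [0, 0, 1, 1]
within, share = explained_variance(points, labels)
print(within, share)
```

Minimizing the within-cluster square error is the same as maximizing the explained share, since the total scatter does not depend on the partition.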
Least-Squares Structuring, Clustering, and Data Processing Issues
Abstract

Cited by 2 (0 self)
Approximation structuring clustering is an extension of what is usually called "square-error clustering" onto various cluster structures and data formats. It appears to be not only a mathematical device to support, specify and extend many clustering techniques, but also a framework for mathematical analysis of interrelations among the techniques and their relations to other concepts and problems in data analysis, statistics, machine learning, data compression and decompression, and design and use of multiresolution hierarchies. Based on the results found, a number of methods for solving data processing problems are described.
Combinatorial Optimization in Clustering
Abstract
Contents
1 Introduction
2 Types of Data
3 Cluster Structures
4 Clustering Criteria
5 Single Cluster Clustering
  5.1 Clustering Approaches
    5.1.1 Definition-based Clusters
    5.1.2 Direct Algorithms
    5.1.3 Optimal Clusters
  5.2 Single and Monotone Linkage Clusters
    5.2.1 MST and Single Linkage Clustering
    5.2.2 Monotone Linkage Clusters
    5.2.3 Modeling Skeletons in Digital Image Processing
    5.2.4 Linkage-based Convex Criteria
  5.3 Moving Center and Approximation Clusters
    5.3.1 Criteria for Moving Center Methods
    5.3.2 Principal Cluster
    5.3.3 Additive Cluster
    5.3.4 Seriation with Returns
6 Partitioning
Approximation Clustering: a Mine of Semidefinite Programming Problems
Abstract
Clustering is a discipline devoted to finding homogeneous groups of data entities. In contrast to conventional clustering, which involves data processing in terms of either entities or variables, approximation clustering is aimed at processing of the data matrices as they are. Currently, approximation clustering is a set of clustering models and methods based on approximate decomposition of the data table into scalar product matrices representing weighted subsets, partitions or hierarchies as the sought clustering structures. Some of the problems involved are of semidefinite programming; the others seem quite similar.
1 Introduction
Clustering models may differ depending on the nature of data. We distinguish here among three types of data: column-conditional, similarity and aggregable ones. The first two are those usually considered in clustering: a column-conditional data set is represented by an entity-to-variable matrix so that the entries within any column (variable) can be c...
Three Approaches to Aggregation of Interaction Tables
Abstract
An interaction table is a summable square matrix emerging in analysis of intercitation, international trade, brand-switching, mobility, or input-output industrial data. Three approaches to aggregation of interaction data are theoretically compared: (i) log-linear modeling, (ii) aggregation of Markov chains, and (iii) principle of equivalence in the correspondence analysis. This way an empirical clustering algorithm, developed in the framework (iii), is justified and amended by substantively modeling the interaction processes.
RICE UNIVERSITY Architecture and Algorithms for Scalable Mobile
Abstract
Supporting Quality of Service is an important objective for future mobile systems, and requires resource reservation and admission control to achieve. In this thesis, we introduce a scalable scheme for admission control termed Virtual Bottleneck Cell. Our approach is designed to scale to many users and handoffs, while simultaneously controlling "hot spots". The key technique is to hierarchically control the virtual system, ensuring QoS objectives are satisfied without requiring accurate predictions of the users' future locations. We develop a simple analytical model to study the system and illustrate several key components of the approach. We formulate the problem of how to group the cells to form the virtual system as an optimization problem and propose a heuristic adaptive clustering algorithm as its solution. Finally, we perform simulations in a two-dimensional network to compare the performance obtained with VBC and adaptive clustering with alternate schemes, including the optimal offline