Results 11-20 of 1,625
QMR: a Quasi-Minimal Residual Method for Non-Hermitian Linear Systems
1991
Cited by 334 (26 self)
... In this paper, we present a novel BCG-like approach, the quasi-minimal residual (QMR) method, which overcomes the problems of BCG. An implementation of QMR based on a look-ahead version of the nonsymmetric Lanczos algorithm is proposed. It is shown how BCG iterates can be recovered stably from the QMR process. Some further properties of the QMR approach are given and an error bound is presented. Finally, numerical experiments are reported.
Blobworld: A System for Region-Based Image Indexing and Retrieval
In Third International Conference on Visual Information Systems, 1999
Cited by 306 (4 self)
Blobworld is a system for image retrieval based on finding coherent image regions which roughly correspond to objects. Each image is automatically segmented into regions ("blobs") with associated color and texture descriptors. Querying is based on the attributes of one or two regions of interest, rather than a description of the entire image. In order to make large-scale retrieval feasible, we index the blob descriptions using a tree. Because indexing in the high-dimensional feature space is computationally prohibitive, we use a lower-rank approximation to the high-dimensional distance. Experiments show encouraging results for both querying and indexing.
1 Introduction
From a user's point of view, the performance of an information retrieval system can be measured by the quality and speed with which it answers the user's information need. Several factors contribute to overall performance: the time required to run each individual query, the quality (precision/recall) of each i...
Concept Decompositions for Large Sparse Text Data using Clustering
Machine Learning, 2000
Cited by 303 (28 self)
Unlabeled document collections are becoming increasingly common and available; mining such data sets represents a major contemporary challenge. Using words as features, text documents are often represented as high-dimensional and sparse vectors: a few thousand dimensions and a sparsity of 95 to 99% is typical. In this paper, we study a certain spherical k-means algorithm for clustering such document vectors. The algorithm outputs k disjoint clusters each with a concept vector that is the centroid of the cluster normalized to have unit Euclidean norm. As our first contribution, we empirically demonstrate that, owing to the high-dimensionality and sparsity of the text data, the clusters produced by the algorithm have a certain "fractal-like" and "self-similar" behavior. As our second contribution, we introduce concept decompositions to approximate the matrix of document vectors; these decompositions are obtained by taking the least-squares approximation onto the linear subspace spanned...
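The clustering step described in this abstract can be illustrated with a minimal sketch. It assumes cosine similarity on unit-normalized vectors, random initialization from the data, and a fixed iteration count; none of these details are given in the abstract, and the function names (`spherical_kmeans`, `normalize`, `dot`) and the toy data are hypothetical rather than taken from the paper.

```python
import math
import random

def normalize(v):
    """Scale v to unit Euclidean norm (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def spherical_kmeans(docs, k, iters=20, seed=0):
    """Cluster vectors by cosine similarity.

    Returns (labels, concepts), where concepts[j] is the centroid of
    cluster j normalized to unit Euclidean norm (its "concept vector").
    """
    rng = random.Random(seed)
    docs = [normalize(d) for d in docs]
    # Assumption: start from k distinct documents chosen at random.
    concepts = [list(d) for d in rng.sample(docs, k)]
    labels = [0] * len(docs)
    for _ in range(iters):
        # Assignment step: each document joins its most similar concept vector.
        labels = [max(range(k), key=lambda j: dot(d, concepts[j])) for d in docs]
        # Update step: recompute each centroid and rescale it to unit norm.
        for j in range(k):
            members = [d for d, lab in zip(docs, labels) if lab == j]
            if members:  # an emptied cluster keeps its previous concept vector
                centroid = [sum(col) / len(members) for col in zip(*members)]
                concepts[j] = normalize(centroid)
    return labels, concepts

# Toy term-count vectors: two documents about one topic, two about another.
docs = [[3, 0, 1], [4, 0, 0], [0, 2, 5], [0, 1, 4]]
labels, concepts = spherical_kmeans(docs, k=2)
```

On this toy input the first two documents end up in one cluster and the last two in the other; which cluster receives index 0 depends on the random initialization.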
Consistency of spectral clustering
2004
Cited by 286 (15 self)
Consistency is a key property of statistical algorithms when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of a popular family of spectral clustering algorithms, which cluster the data with the help of eigenvectors of graph Laplacian matrices. We show that one of the two major classes of spectral clustering (normalized clustering) converges under some very general conditions, while the other (unnormalized) is only consistent under strong additional assumptions, which, as we demonstrate, are not always satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering in practical applications. We believe that methods used in our analysis will provide a basis for future exploration of Laplacian-based methods in a statistical setting.
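As a rough illustration of what clustering "with the help of eigenvectors of graph Laplacian matrices" can look like in the normalized case, the sketch below splits a small weighted graph in two using the second eigenvector of the symmetrically normalized affinity matrix D^{-1/2} W D^{-1/2}. The power iteration with deflation, the sign-based split, the function names, and the toy graph are all assumptions chosen for a dependency-free example; the abstract does not prescribe an implementation.

```python
import math

def normalized_affinity(W):
    """Return M = D^{-1/2} W D^{-1/2} and the degree vector (all degrees must be > 0)."""
    deg = [sum(row) for row in W]
    n = len(W)
    M = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    return M, deg

def second_eigenvector(M, deg, iters=200):
    """Power iteration on M + I with the known top eigenvector projected out.

    The top eigenvector of M is D^{1/2} 1 (eigenvalue 1). The next one
    corresponds to the second-smallest eigenvalue of the normalized
    Laplacian I - M.
    """
    n = len(M)
    total = sum(deg)
    top = [math.sqrt(d / total) for d in deg]  # unit-norm top eigenvector
    v = [float(i + 1) for i in range(n)]       # arbitrary deterministic start
    for _ in range(iters):
        # Multiply by (M + I): shifts the spectrum into [0, 2].
        w = [v[i] + sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        proj = sum(a * b for a, b in zip(w, top))
        w = [a - proj * b for a, b in zip(w, top)]  # deflate the top eigenvector
        norm = math.sqrt(sum(a * a for a in w))
        v = [a / norm for a in w]
    return v

def bipartition(W):
    """Split the vertices in two by the sign of the rescaled second eigenvector."""
    M, deg = normalized_affinity(W)
    v = second_eigenvector(M, deg)
    return [1 if x / math.sqrt(d) > 0 else 0 for x, d in zip(v, deg)]

# Two triangles joined by one weak edge (weight 0.1) between vertices 2 and 3.
W = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
labels = bipartition(W)
```

The split separates the two triangles. For more than two clusters, a common choice is to run k-means on the rows of the first k eigenvectors, which this sketch does not attempt.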
Simulation of Simplicity: A Technique to Cope with Degenerate Cases in Geometric Algorithms
ACM Trans. Graph., 1990
Cited by 277 (21 self)
This paper describes a general-purpose programming technique, called the Simulation of Simplicity, which can be used to cope with degenerate input data for geometric algorithms. It relieves the programmer of the task of providing a consistent treatment for every single special case that can occur. The programs that use the technique tend to be considerably smaller and more robust than those that do not use it. We believe that this technique will become a standard tool in writing geometric software.
Missing value estimation methods for DNA microarrays
2001
Cited by 275 (20 self)
Motivation: Gene expression microarray experiments can generate data sets with multiple missing expression values. Unfortunately, many algorithms for gene expression analysis require a complete matrix of gene array values as input. For example, methods such as hierarchical clustering and K-means clustering are not robust to missing data, and may lose effectiveness even with a few missing values. Methods for imputing missing data are needed, therefore, to minimize the effect of incomplete data sets on analyses, and to increase the range of data sets to which these algorithms can be applied. In this report, we investigate automated methods for estimating missing data.
Visual Simulation of Smoke
2001
Cited by 265 (20 self)
In this paper, we propose a new approach to numerical smoke simulation for computer graphics applications. The method proposed here exploits physics unique to smoke in order to design a numerical method that is both fast and efficient on the relatively coarse grids traditionally used in computer graphics applications (as compared to the much finer grids used in the computational fluid dynamics literature). We use the inviscid Euler equations in our model, since they are usually more appropriate for gas modeling and less computationally intensive than the viscous Navier-Stokes equations used by others. In addition, we introduce a physically consistent vorticity confinement term to model the small-scale rolling features characteristic of smoke that are absent on most coarse grid simulations. Our model also correctly handles the interaction of smoke with moving objects.
Keywords: Smoke, computational fluid dynamics, Navier-Stokes equations, Euler equations, Semi-Lagrangian methods, stable fluids, vorticity confinement, participating media
Pajek - Program for Large Network Analysis
Connections, 1998
Cited by 254 (11 self)
Large networks, having thousands of vertices and lines, can be found in many different areas, e.g.: genealogies, flow graphs of programs, molecule, computer networks, transportation networks, social networks, intra/inter organisational networks ... Many standard network algorithms are very time and space consuming and therefore unsuitable for analysis of such networks. In the article we present some approaches to analysis and visualisation of large networks implemented in program Pajek. Some typical examples are also given.
1 Introduction
Pajek (Slovene word for Spider) is a program, for Windows (32 bit), for analysis of large networks. It is freely available, for noncommercial use, at its homepage: http://vlado.fmf.uni-lj.si/pub/networks/pajek/ Large networks can be found in many different areas. Usually they are produced automatically, using computers, from different data sources that are already available in computer readable form. For example: large genealogies (genea...
Latent semantic indexing: A probabilistic analysis
1998
Cited by 248 (8 self)
Latent semantic indexing (LSI) is an information retrieval technique based on the spectral analysis of the term-document matrix, whose empirical success had heretofore been without rigorous prediction and explanation. We prove that, under certain conditions, LSI does succeed in capturing the underlying semantics of the corpus and achieves improved retrieval performance. We also propose the technique of random projection as a way of speeding up LSI. We complement our theorems with encouraging experimental results. We also argue that our results may be viewed in a more general framework, as a theoretical basis for the use of spectral methods in a wider class of applications such as collaborative filtering.
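The random projection idea mentioned in this abstract can be sketched as follows: document vectors over a large vocabulary are mapped to a much smaller number of dimensions by a random linear map before any spectral analysis is run. The abstract does not say how the projection is constructed, so the Gaussian matrix with 1/sqrt(target_dim) scaling used here is an assumed, commonly used variant, and the names `random_projection_matrix` and `project` are hypothetical.

```python
import math
import random

def random_projection_matrix(n_terms, target_dim, seed=0):
    """Build a target_dim x n_terms matrix of scaled Gaussian entries.

    Assumption: i.i.d. N(0, 1) entries scaled by 1/sqrt(target_dim), so that
    squared lengths are preserved in expectation.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(target_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_terms)]
            for _ in range(target_dim)]

def project(R, doc):
    """Map one document vector (length n_terms) to target_dim dimensions."""
    return [sum(r * x for r, x in zip(row, doc)) for row in R]

# Toy term-count vectors over a 6-term vocabulary, one list per document.
docs = [
    [2, 1, 0, 0, 0, 0],
    [1, 2, 0, 0, 0, 0],
    [0, 0, 0, 1, 2, 1],
]
R = random_projection_matrix(n_terms=6, target_dim=3)
projected = [project(R, d) for d in docs]
```

In a full pipeline, a truncated SVD would then be applied to the projected matrix instead of the original one; that step is omitted here to keep the sketch free of external dependencies.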