Dynamic social network analysis using latent space models
 SIGKDD Explorations, Special Issue on Link Mining
Cited by 68 (3 self)
This paper explores two aspects of social network modeling. First, we generalize a successful static model of relationships into a dynamic model that accounts for friendships drifting over time. Second, we show how to make it tractable to learn such models from data, even as the number of entities n gets large. The generalized model associates each entity with a point in p-dimensional Euclidean latent space. The points can move as time progresses, but large moves in latent space are improbable. Observed links between entities are more likely if the entities are close in latent space. We show how to make such a model tractable (subquadratic in the number of entities) by the use of appropriate kernel functions for similarity in latent space; the use of low-dimensional KD-trees; a new efficient dynamic adaptation of multidimensional scaling for a first pass of approximate projection of entities into latent space; and an efficient conjugate gradient update rule for nonlinear local optimization in which amortized time per entity during an update is O(log n). We use both synthetic and real-world data on up to 11,000 entities, which indicate near-linear scaling in computation time and improved performance over four alternative approaches. We also illustrate the system operating on twelve years of NIPS co-authorship data.
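A minimal sketch of the modeling idea in this abstract, not the paper's implementation. It assumes a logistic link function of latent distance and a Gaussian random-walk transition; both forms, and the names `link_probability`, `step`, `radius`, and `sigma`, are illustrative assumptions, since the abstract does not state the kernel or the transition model.

```python
import math
import random

def link_probability(x, y, radius=1.0):
    """Probability of a link between latent points x and y.

    Assumed form: logistic in (radius - distance), so the probability is
    0.5 at distance == radius and falls as the points move apart.
    """
    d = math.dist(x, y)
    return 1.0 / (1.0 + math.exp(d - radius))

def step(x, sigma=0.1, rng=random):
    """Move a latent point by Gaussian noise, so large moves are improbable."""
    return tuple(xi + rng.gauss(0.0, sigma) for xi in x)

# Three entities in a 2-dimensional latent space (hypothetical positions).
a, b, c = (0.0, 0.0), (0.5, 0.0), (3.0, 0.0)

# The closer pair (a, b) is more likely to be linked than the distant pair (a, c).
print(link_probability(a, b) > link_probability(a, c))

# One time step of drift for entity a.
a_next = step(a)
```

Evaluating every pair this way is quadratic in the number of entities; the abstract's kernel functions and KD-trees are what avoid that cost, and they are not reproduced here.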
Graph Drawing by High-Dimensional Embedding
 In GD02, LNCS
, 2002
Cited by 59 (10 self)
We present a novel approach to the aesthetic drawing of undirected graphs. The method has two phases: first embed the graph in a very high dimension and then project it into the 2D plane using PCA. Experiments we have carried out show the ability of the method to draw graphs of 10^5 nodes in a few seconds. The new method appears to have several advantages over classical methods, including a significantly better running time, a useful inherent capability to exhibit the graph in various dimensions, and an effective means for interactive exploration of large graphs.
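The two phases described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the high-dimensional coordinates are taken to be breadth-first-search distances to a few caller-chosen pivot nodes (the abstract does not specify the embedding), and PCA is done by plain power iteration with deflation. All function names are hypothetical.

```python
from collections import deque
import math

def bfs_distances(adj, source):
    """Hop distances from source in an unweighted, connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def high_dim_embedding(adj, pivots):
    """Phase 1 (assumed form): coordinate i of a node is its distance to pivot i."""
    tables = [bfs_distances(adj, p) for p in pivots]
    nodes = sorted(adj)
    return nodes, [[float(t[v]) for t in tables] for v in nodes]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def leading_direction(rows, iters=200):
    """Power iteration for the top eigenvector of rows^T rows."""
    m = len(rows[0])
    vec = [1.0 / (j + 1) for j in range(m)]  # arbitrary asymmetric start
    for _ in range(iters):
        scores = [dot(r, vec) for r in rows]
        vec = [sum(s * r[j] for s, r in zip(scores, rows)) for j in range(m)]
        norm = math.sqrt(dot(vec, vec))
        if norm == 0.0:
            break  # no variance left in the data
        vec = [x / norm for x in vec]
    return vec

def pca_2d(rows):
    """Phase 2: project the centered rows onto their top two principal axes."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    pc1 = leading_direction(centered)
    # Deflate: remove the pc1 component so the next direction is orthogonal to it.
    deflated = [[x - dot(r, pc1) * p for x, p in zip(r, pc1)] for r in centered]
    pc2 = leading_direction(deflated)
    return [(dot(r, pc1), dot(r, pc2)) for r in centered]

# Usage on a small 3x3 grid graph with three corner pivots.
side = 3
adj = {(r, c): [] for r in range(side) for c in range(side)}
for (r, c) in adj:
    for dr, dc in ((1, 0), (0, 1)):
        nb = (r + dr, c + dc)
        if nb in adj:
            adj[(r, c)].append(nb)
            adj[nb].append((r, c))

nodes, rows = high_dim_embedding(adj, pivots=[(0, 0), (2, 2), (0, 2)])
layout = dict(zip(nodes, pca_2d(rows)))  # node -> (x, y) drawing coordinates
```

With only three pivots the "high" dimension here is 3; a real use would pick more pivots, and the choice of pivots is left to the caller in this sketch.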
Distributed Clustering Using Collective Principal Component Analysis
 Knowledge and Information Systems
, 1999
Cited by 49 (9 self)
This paper considers distributed clustering of high-dimensional heterogeneous data using a distributed Principal Component Analysis (PCA) technique called the Collective PCA. It presents the Collective PCA technique, which can be used independently of the clustering application. It shows a way to integrate the Collective PCA with a given off-the-shelf clustering algorithm in order to develop a distributed clustering technique. It also presents experimental results using different test data sets, including an application for web mining.
Updating a Rank-Revealing ULV Decomposition
, 1991
Cited by 48 (4 self)
A ULV decomposition of a matrix A of order n is a decomposition of the form A = ULV^H, where U and V are orthogonal matrices and L is a lower triangular matrix. When A is approximately of rank k, the decomposition is rank revealing if the last n - k rows of L are small. This paper presents algorithms for updating a rank-revealing ULV decomposition. The algorithms run in O(n²) time, and can be implemented on a linear array of processors to run in O(n) time.
A Multistage Representation of the Wiener Filter Based on Orthogonal Projections
 IEEE Transactions on Information Theory
, 1998
Cited by 42 (3 self)
The Wiener filter is analyzed for stationary complex Gaussian signals from an information-theoretic point of view. A dual-port analysis of the Wiener filter leads to a decomposition based on orthogonal projections and results in a new multistage method for implementing the Wiener filter using a nested chain of scalar Wiener filters. This new representation of the Wiener filter provides the capability to perform an information-theoretic analysis of previous, basis-dependent, reduced-rank Wiener filters. This analysis demonstrates that the recently introduced cross-spectral metric is optimal in the sense that it maximizes mutual information between the observed and desired processes. A new reduced-rank Wiener filter is developed based on this new structure, which evolves a basis using successive projections of the desired signal onto orthogonal, lower-dimensional subspaces. The performance is evaluated using a comparative computer analysis model, and it is demonstrated that the low-complexity multistage reduced-rank Wiener filter is capable of outperforming the more complex eigendecomposition-based methods.
A Parallel Implementation of the Nonsymmetric QR Algorithm for Distributed Memory Architectures
 SIAM J. SCI. COMPUT
, 2002
Cited by 36 (3 self)
One approach to solving the nonsymmetric eigenvalue problem in parallel is to parallelize the QR algorithm. Not long ago, this was widely considered to be a hopeless task. Recent efforts have led to significant advances, although the methods proposed up to now have suffered from scalability problems. This paper discusses an approach to parallelizing the QR algorithm that greatly improves scalability. A theoretical analysis indicates that the algorithm is ultimately not scalable, but the nonscalability does not become evident until the matrix dimension is enormous. Experiments on the Intel Paragon system, the IBM SP2 supercomputer, the SGI Origin 2000, and the Intel ASCI Option Red supercomputer are reported.
Randomized Load Balancing for Tree-structured Computation
 In Scalable High Performance Computing Conference
, 1994
Cited by 30 (7 self)
In this paper, we study the performance of a randomized algorithm for balancing load across a multiprocessor executing a dynamic irregular task tree. Specifically, we show that the time taken to explore a task tree is likely to be within a small constant factor of an inherent lower bound for the tree instance. Our model permits arbitrary task times and overlap between computation and load balance, and thus extends earlier work which assumed fixed-cost tasks and used a bulk-synchronous style in which the system alternated between distinct computing and load balancing steps. Our analysis is supported by experiments with application codes, demonstrating that the efficiency is high enough to make this method practical. 1 Introduction In this paper we study a popular randomized strategy for load balancing dynamic tree-structured task graphs on large-scale message-passing multiprocessors. First, we show analytically that with high probability, the randomized strategy results in parallel run...
Drawing Huge Graphs by Algebraic Multigrid Optimization
 Multiscale Modeling and Simulation
, 2003
Cited by 30 (3 self)
We present an extremely fast graph drawing algorithm for very large graphs, which we term ACE (for Algebraic multigrid Computation of Eigenvectors). ACE exhibits a vast improvement over the fastest algorithms we are currently aware of; using a serial PC, it draws graphs of millions of nodes in less than a minute. ACE finds an optimal drawing by minimizing a quadratic energy function. The minimization problem is expressed as a generalized eigenvalue problem, which is solved rapidly using a novel algebraic multigrid technique. The same generalized eigenvalue problem seems to come up in other fields as well, hence ACE appears to be applicable outside graph drawing too.
Perturbations of orthogonal polynomials with periodic recursion coefficients
, 2007
Cited by 26 (15 self)
We extend the results of Denisov–Rakhmanov, Szegő–Shohat–Nevai, and Killip–Simon from asymptotically constant orthogonal polynomials on the real line (OPRL) and unit circle (OPUC) to asymptotically periodic OPRL and OPUC. The key tool is a characterization of the isospectral torus that is well adapted to the study of perturbations.
Gaussian mixture clustering and imputation of microarray data
 Bioinformatics
, 2004
Cited by 19 (0 self)
Motivation: In microarray experiments, missing entries arise from blemishes on the chips. In large-scale studies, virtually every chip contains some missing entries and more than 90% of the genes are affected. Many analysis methods require a full set of data. Either those genes with missing entries are excluded, or the missing entries are filled with estimates prior to the analyses. This study compares methods of missing value estimation. Results: Two evaluation metrics of imputation accuracy are employed. First, the root mean squared error measures the difference between the true values and the imputed values. Second, the number of misclustered genes measures the difference between clustering with true values and that with imputed values; it examines the bias introduced by imputation to clustering. The Gaussian mixture clustering with model averaging imputation is superior to all other imputation methods, according to both evaluation metrics, on both time-series (correlated) and non-time-series (uncorrelated) data sets. Availability: Matlab code is available on request from the authors.
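The first evaluation metric in this abstract, root mean squared error between true and imputed values, can be illustrated with a short sketch. The imputation used here is a plain row-mean fill, chosen only as a stand-in baseline; it is not the Gaussian mixture method of the paper, and the function names and toy data are hypothetical.

```python
import math

def row_mean_impute(matrix):
    """Fill None entries with the mean of the observed values in the same row."""
    filled = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        mean = sum(observed) / len(observed) if observed else 0.0
        filled.append([mean if v is None else v for v in row])
    return filled

def imputation_rmse(truth, masked, imputed):
    """RMSE over exactly the entries missing in `masked`.

    Assumes at least one entry of `masked` is None.
    """
    errors = [
        (imputed[i][j] - truth[i][j]) ** 2
        for i, row in enumerate(masked)
        for j, v in enumerate(row)
        if v is None
    ]
    return math.sqrt(sum(errors) / len(errors))

# Toy expression matrix (rows = genes, columns = chips) with two entries hidden.
truth = [[1.0, 5.0, 3.0], [4.0, 4.0, 4.0]]
masked = [[1.0, None, 3.0], [4.0, None, 4.0]]

imputed = row_mean_impute(masked)
rmse = imputation_rmse(truth, masked, imputed)
```

Restricting the error to the hidden entries keeps the observed values, which every method reproduces exactly, from diluting the comparison between imputation methods.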