Results 1-10 of 49
A fast multigrid algorithm for mesh deformation
ACM Trans. Graph., 2006
Cited by 55 (2 self)
Figure 1: The idle CAMEL becomes a boxer with the help of MOCAP data and our mesh deformation system. In this paper, we present a multigrid technique for efficiently deforming large surface and volume meshes. We show that a previous least-squares formulation for distortion minimization reduces to a Laplacian system on a general graph structure for which we derive an analytic expression. We then describe an efficient multigrid algorithm for solving the relevant equations. Here we develop novel prolongation and restriction operators used in the multigrid cycles. Combined with a simple but effective graph coarsening strategy, our algorithm can outperform other multigrid solvers and the factorization stage of direct solvers in both time and memory costs for large meshes. It is demonstrated that our solver can trade off accuracy for speed to achieve greater interactivity, which is attractive for manipulating large meshes. Our multigrid solver is particularly well suited for a mesh editing environment that does not permit extensive precomputation. Experimental evidence of these advantages is provided on a number of meshes with a wide range of sizes. With our mesh deformation solver, we also successfully demonstrate that visually appealing mesh animations can be generated from both motion capture data and a single base mesh even when they are inconsistent.
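The reduction described in this abstract ends in a Laplacian system on a graph. A minimal sketch of that end point (a toy chain graph with soft positional anchors, solved with a direct sparse solver rather than the paper's multigrid cycles) might look like:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Toy mesh graph: a 1-D chain of 6 vertices (edges between neighbors).
n = 6
edges = [(i, i + 1) for i in range(n - 1)]

# Assemble the combinatorial graph Laplacian L = D - A from the edge list.
rows, cols, vals = [], [], []
for i, j in edges:
    rows += [i, j, i, j]
    cols += [j, i, i, j]
    vals += [-1.0, -1.0, 1.0, 1.0]
L = sp.csr_matrix((vals, (rows, cols)), shape=(n, n))

# Anchor two handle vertices (0 and 5) with a soft penalty so the
# otherwise singular Laplacian system has a unique solution.
w = 1e3
P = sp.diags([w if i in (0, 5) else 0.0 for i in range(n)])
b = np.zeros(n)
b[0], b[5] = w * 0.0, w * 1.0   # target positions of the two handles

# Deformed coordinates: interior vertices interpolate the handles harmonically.
x = spla.spsolve((L + P).tocsc(), b)
```

For a chain, the harmonic solution is (up to the soft-constraint error of order 1/w) a straight line between the two handle positions.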
Dynamic supernodes in sparse Cholesky update/downdate and triangular solves
ACM Trans. Math. Software, 2006
Cited by 30 (10 self)
The supernodal method for sparse Cholesky factorization represents the factor L as a set of supernodes, each consisting of a contiguous set of columns of L with identical nonzero pattern. A conventional supernode is stored as a dense submatrix. While this is suitable for sparse Cholesky factorization, where the nonzero pattern of L does not change, it is not suitable for methods that modify a sparse Cholesky factorization after a low-rank change to A (an update/downdate, A = A ± WW^T). Supernodes merge and split apart during an update/downdate. Dynamic supernodes are introduced, which allow a sparse Cholesky update/downdate to obtain performance competitive with conventional supernodal methods. A dynamic supernodal solver is shown to exceed the performance of the conventional (BLAS-based) supernodal method for solving triangular systems. These methods are incorporated into CHOLMOD, a sparse Cholesky factorization and update/downdate package, which forms the basis of x = A\b in MATLAB when A is sparse and symmetric positive definite.
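To make the update/downdate operation concrete, here is the textbook dense rank-1 Cholesky update in NumPy (not the paper's supernodal/sparse machinery; `chol_update` is a name chosen here for illustration):

```python
import numpy as np

def chol_update(L, w):
    """Given lower-triangular L with A = L L^T, return the Cholesky factor
    of A + w w^T via Givens-style rotations, without refactorizing A."""
    L = L.copy()
    w = w.astype(float).copy()
    n = len(w)
    for k in range(n):
        r = np.hypot(L[k, k], w[k])         # new diagonal entry
        c, s = r / L[k, k], w[k] / L[k, k]  # rotation that zeroes w[k]
        L[k, k] = r
        if k + 1 < n:
            L[k + 1:, k] = (L[k + 1:, k] + s * w[k + 1:]) / c
            w[k + 1:] = c * w[k + 1:] - s * L[k + 1:, k]
    return L

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)     # symmetric positive definite test matrix
w = rng.standard_normal(5)
L = np.linalg.cholesky(A)
L_up = chol_update(L, w)        # factor of A + w w^T in O(n^2) work
```

The update costs O(n^2) versus O(n^3) for refactorization; the sparse supernodal version in the paper gets the analogous savings while tracking how supernodes merge and split.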
Weighted matchings for preconditioning symmetric indefinite linear systems
SIAM J. Sci. Comput., 2006
Cited by 24 (6 self)
Maximum weight matchings have become an important tool for solving highly indefinite unsymmetric linear systems, especially in direct solvers. In this study we investigate the benefit of reorderings and scalings based on symmetrized maximum weight matchings as a preprocessing step for incomplete LDL^T factorizations. The reorderings are constructed such that the matched entries form 1×1 or 2×2 diagonal blocks in order to increase the diagonal dominance of the system. During the incomplete factorization, only tridiagonal pivoting is used. We report results for this approach and comparisons with other solution methods for a diverse set of symmetric indefinite matrices, ranging from nonlinear elasticity to interior point optimization.
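The maximum-weight-matching step this preconditioner builds on can be illustrated with SciPy (assuming SciPy >= 1.6 for `min_weight_full_bipartite_matching`; this shows only the unsymmetric permutation that moves large entries onto the diagonal, not the paper's symmetrized 1×1/2×2 blocking):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import min_weight_full_bipartite_matching

# A small matrix whose largest entries sit off the diagonal.
A = np.array([[0.1, 4.0, 0.0],
              [3.0, 0.2, 0.0],
              [0.0, 0.0, 5.0]])

# Maximizing the product of matched magnitudes |a_{i,sigma(i)}| is the
# same as minimizing the sum of c_ij = const - log|a_ij| over the stored
# entries; the constant shift keeps all edge weights strictly positive.
B = sp.csr_matrix(A)
C = B.copy()
C.data = (np.log(np.abs(B.data).max()) + 1.0) - np.log(np.abs(B.data))
rows, cols = min_weight_full_bipartite_matching(C)

# Permute the columns so each matched entry lands on the diagonal.
perm = np.empty(A.shape[1], dtype=int)
perm[rows] = cols
A_perm = A[:, perm]
```

For this example the matching pairs row 0 with column 1 and row 1 with column 0, so the permuted matrix has the dominant entries 4, 3, 5 on its diagonal.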
An Out-of-Core Sparse Cholesky Solver, 2009
Cited by 23 (8 self)
Direct methods for solving large sparse linear systems of equations are popular because of their generality and robustness. Their main weakness is that the memory they require usually increases rapidly with problem size. We discuss the design and development of the first release of a new symmetric direct solver that aims to circumvent this limitation by allowing the system matrix, intermediate data, and the matrix factors to be stored externally. The code, which is written in Fortran and called HSL_MA77, implements a multifrontal algorithm. The first release is for positive-definite systems and performs a Cholesky factorization. Special attention is paid to the use of efficient dense linear algebra kernel codes that handle the full-matrix operations on the frontal matrix and to the input/output operations. The input/output operations are performed using a separate package that provides a virtual-memory system and allows the data to be spread over many files; for very large problems these may be held on more than one device. Numerical results are presented for a collection of 30 large real-world problems, all of which were solved successfully.
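HSL_MA77's virtual-memory layer is Fortran-specific, but the underlying idea of keeping factor data on disk and solving against it later can be caricatured in Python with `numpy.memmap` (a loose analogy using a dense factor, assuming nothing about MA77's actual file layout):

```python
import numpy as np
import tempfile, os

# Illustrative only: spill a (here, dense) Cholesky factor to disk the way
# an out-of-core solver spills factor columns, then solve from the file.
n = 200
rng = np.random.default_rng(1)
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)     # symmetric positive definite

path = os.path.join(tempfile.mkdtemp(), "factor.bin")
L_disk = np.memmap(path, dtype=np.float64, mode="w+", shape=(n, n))
L_disk[:] = np.linalg.cholesky(A)   # write the factor through the memmap
L_disk.flush()

# Later (possibly in another process): solve A x = b from the on-disk factor.
L = np.memmap(path, dtype=np.float64, mode="r", shape=(n, n))
b = np.ones(n)
y = np.linalg.solve(np.asarray(L), b)     # forward-solve stand-in: L y = b
x = np.linalg.solve(np.asarray(L).T, y)   # back-solve stand-in: L^T x = y
```

A real out-of-core solver streams panels of the factor through a bounded in-memory buffer instead of mapping the whole file, which is the part MA77's separate I/O package provides.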
Algorithm 8xx: CHOLMOD, supernodal sparse Cholesky factorization and update/downdate, 2006
Interactive Vector Field Feature Identification
Cited by 10 (1 self)
Fig. 1. A sequence of interactions performed during an exploration session in our system (also demonstrated in our accompanying video). Our interactive framework allows the user to specify representative control points (insets) of desired feature types. These control points guide a mapping of the vector field points to the interactive texture canvas, where distances between the projected points encode similarities between their localized neighborhoods. Feature-based visualizations are generated through a painting interface, performed on this canvas.
Abstract: We introduce a flexible technique for interactive exploration of vector field data through classification derived from user-specified feature templates. Our method is founded on the observation that, while similar features within the vector field may be spatially disparate, they share similar neighborhood characteristics. Users generate feature-based visualizations by interactively highlighting well-accepted and domain-specific representative feature points. Feature exploration begins with the computation of attributes that describe the neighborhood of each sample within the input vector field. Compilation of these attributes forms a representation of the vector field samples in the attribute space. We project the attribute points onto the canonical 2D plane to enable interactive exploration of the vector field using a painting interface. The projection encodes the similarities between vector field points within the distances computed between their associated attribute points. The proposed method is performed at interactive rates for enhanced user experience and is completely flexible, as showcased by the simultaneous identification of diverse feature types.
Index Terms: Vector field, data clustering, feature classification, high-dimensional data, user interaction
Algorithmic Performance Studies on Graphics Processing Units, 2007
Cited by 10 (0 self)
We report on our experience with integrating and using graphics processing units (GPUs) as fast parallel floating-point co-processors to accelerate two fundamental computational scientific kernels on the GPU: sparse direct factorization and nonlinear interior-point optimization. Since a full reimplementation of these complex kernels is typically not feasible, we identify the matrix-matrix multiplication as a first natural entry point for a minimally invasive integration of GPUs. We investigate the performance on the NVIDIA GeForce 8800 multi-core chip, initially architected for intensive gaming applications. We exploit the architectural features of the GeForce 8800 GPU to design an efficient GPU-parallel sparse matrix solver. A prototype approach to leveraging the bandwidth and computing power of GPUs for these matrix kernel operations is demonstrated, resulting in an overall performance of over 110 GFlops/s on the desktop for large matrices. We use our GPU algorithm for PDE-constrained optimization problems and demonstrate that the commodity GPU is a useful co-processor for scientific applications.
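Why matrix-matrix multiplication is the natural entry point is easiest to see in a blocked right-looking Cholesky, where the trailing-submatrix update is one large GEMM/SYRK-style call; in the NumPy sketch below the `@` product stands in for the kernel one would offload to the GPU:

```python
import numpy as np
from scipy.linalg import solve_triangular

def blocked_cholesky(A, bs=32):
    """Right-looking blocked Cholesky. Almost all flops land in the
    trailing update, a single matrix-matrix product per block step."""
    A = A.copy()
    n = A.shape[0]
    for k in range(0, n, bs):
        e = min(k + bs, n)
        # Factor the small diagonal block on the "host".
        A[k:e, k:e] = np.linalg.cholesky(A[k:e, k:e])
        if e < n:
            # Triangular solve for the panel below the diagonal block.
            A[e:, k:e] = solve_triangular(
                A[k:e, k:e], A[e:, k:e].T, lower=True).T
            # GEMM/SYRK trailing update: the GPU-friendly bulk of the work.
            A[e:, e:] -= A[e:, k:e] @ A[e:, k:e].T
    return np.tril(A)

rng = np.random.default_rng(2)
M = rng.standard_normal((100, 100))
A = M @ M.T + 100 * np.eye(100)   # symmetric positive definite
L = blocked_cholesky(A)
```

Swapping only that one product for a GPU GEMM is exactly the kind of minimally invasive integration the abstract describes; the rest of the factorization stays on the CPU.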
Balanced Incomplete Factorization
Cited by 8 (0 self)
In this paper we present a new incomplete factorization of a square matrix into triangular factors, in which we get standard LU/LDL^T factors (direct factors) and their inverses (inverse factors) at the same time. Algorithmically, we derive this method from the approach based on the Sherman-Morrison formula [16]. In contrast to the RIF algorithm [9], the direct and inverse factors here directly influence each other throughout the computation. Consequently, the algorithm to compute the approximate factors may mutually balance dropping in the factors and control their conditioning in this way. Although we describe the theory behind the factorization for general nonsymmetric matrices, in the implementation and experiments we restrict ourselves, for clarity and conciseness, to the case when the system matrix is symmetric and positive definite. In this case, we call the new approximate LDL^T factorization Balanced Incomplete Factorization (BIF). Our experimental results confirm that this factorization is very robust and may be useful in solving difficult ill-conditioned problems by preconditioned iterative methods. Moreover, the internal coupling of the computation of direct and inverse factors results in much shorter setup times (times to compute the approximate decomposition) than RIF, a method of a similar and very high level of robustness.
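For reference, the Sherman-Morrison formula the factorization builds on states that (A + uv^T)^{-1} = A^{-1} - A^{-1}uv^T A^{-1} / (1 + v^T A^{-1}u). A quick NumPy check of the identity (not the BIF algorithm itself):

```python
import numpy as np

# Random well-conditioned A; positive u, v keep the denominator away from 0.
rng = np.random.default_rng(3)
A = 4.0 * np.eye(4) + 0.1 * rng.standard_normal((4, 4))
u = np.abs(rng.standard_normal(4))
v = np.abs(rng.standard_normal(4))

Ainv = np.linalg.inv(A)
Au = Ainv @ u          # A^{-1} u
vA = v @ Ainv          # v^T A^{-1}
# Sherman-Morrison: rank-1 correction to the inverse, no new inversion.
updated_inv = Ainv - np.outer(Au, vA) / (1.0 + v @ Au)
```

The formula turns a rank-1 change to the matrix into a rank-1 change to its inverse, which is what lets a method carry direct and inverse factors along together.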
General-Purpose Sparse Matrix Building Blocks using the NVIDIA CUDA Technology Platform
Cited by 8 (0 self)
We report on our experience with integrating and using graphics processing units (GPUs) as fast parallel floating-point co-processors to accelerate two fundamental computational scientific kernels on the GPU: sparse direct factorization and nonlinear interior-point optimization. Since a full reimplementation of these complex kernels is typically not feasible, we identify the matrix-matrix multiplication, for example, as a first natural entry point for a minimally invasive integration of GPUs. We investigate the performance on the NVIDIA GeForce 8800 multi-core chip. We exploit the architectural features of the GeForce 8800 GPU to design an efficient GPU-parallel sparse matrix solver. A prototype approach to leveraging the bandwidth and computing power of GPUs for these matrix kernel operations is demonstrated, resulting in an overall performance of over 110 GFlops/s on the desktop for large matrices. We use our GPU algorithm for PDE-constrained optimization problems and demonstrate that the commodity GPU is a useful co-processor for scientific applications.
Index Terms: GPGPU, graphical processing units, sparse matrix decomposition, sparse direct solvers, large-scale nonlinear optimization
Updated Sparse Cholesky Factors for Corotational Elastodynamics
To appear in ACM Transactions on Graphics, 2012
Cited by 6 (1 self)
We present warp-canceling corotation, a nonlinear finite element formulation for elastodynamic simulation that achieves fast performance by making only partial or delayed changes to the simulation's linearized system matrices. Coupled with an algorithm for incremental updates to a sparse Cholesky factorization, the method realizes the stability and scalability of a sparse direct method without the need for expensive refactorization at each time step. This finite element formulation combines the widely used corotational method with stiffness warping so that changes in the per-element rotations are initially approximated by inexpensive per-node rotations. When the errors of this approximation grow too large, the per-element rotations are selectively corrected by updating parts of the matrix chosen according to locally measured errors. These changes to the system matrix are propagated to its Cholesky factor by incremental updates that are much faster than refactoring the matrix from scratch. A nested dissection ordering of the system matrix gives rise to a hierarchical factorization in which changes to the system matrix cause limited, well-structured changes to the Cholesky factor. We show examples of simulations demonstrating that the proposed formulation produces results visually comparable to those produced by a standard corotational formulation. Because our method requires computing only partial updates of the Cholesky factor, it is substantially faster than full refactorization and outperforms widely used iterative methods such as preconditioned conjugate gradients. Our method supports a controlled trade-off between accuracy and speed, and unlike most iterative methods its performance does not degrade for stiffer materials; it actually improves.