RIOT: I/O-Efficient Numerical Computing without SQL
Cited by 9 (1 self)
R is a numerical computing environment that is widely popular for statistical data analysis. Like many such environments, R performs poorly on large datasets whose sizes exceed that of physical memory. We present our vision of RIOT (R with I/O Transparency), a system that makes R programs I/O-efficient in a way that is transparent to users. We describe our experience with RIOT-DB, an initial prototype that uses a relational database system as a backend. Despite the overhead and inadequacy of generic database systems in handling array data and numerical computation, RIOT-DB significantly outperforms R in many large-data scenarios, thanks to a suite of high-level, inter-operation optimizations that integrate seamlessly into R. While many techniques in RIOT are inspired by databases (and, for RIOT-DB, realized by a database system), RIOT users are insulated from anything database-related. Compared with previous approaches that require users to learn new languages and rewrite their programs to interface with a database, we believe RIOT will be easier for the majority of R users to adopt.
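The abstract does not spell out which inter-operation optimizations RIOT applies, so the following is only a minimal sketch of one generic idea in that family: evaluating a chain of element-wise operations one block at a time so that no full-length intermediate vector is ever materialized. The function names, the chunk size, and the use of Python lists as a stand-in for disk-resident blocks are all assumptions made for illustration; none of this is RIOT's actual mechanism or API.

```python
from typing import Iterator, List


def chunks(data: List[float], size: int) -> Iterator[List[float]]:
    """Yield successive fixed-size pieces of data (a stand-in for disk blocks)."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def naive_sum_sq_diff(x: List[float], y: List[float]) -> float:
    """Operation-at-a-time evaluation: builds two full-length intermediates."""
    d = [a - b for a, b in zip(x, y)]   # intermediate 1: x - y
    sq = [v * v for v in d]             # intermediate 2: (x - y)^2
    return sum(sq)


def fused_sum_sq_diff(x: List[float], y: List[float], chunk_size: int = 4) -> float:
    """Same expression evaluated block by block; intermediates stay block-sized."""
    total = 0.0
    for cx, cy in zip(chunks(x, chunk_size), chunks(y, chunk_size)):
        total += sum((a - b) ** 2 for a, b in zip(cx, cy))
    return total


if __name__ == "__main__":
    x = [float(i) for i in range(10)]
    y = [0.0] * 10
    print(naive_sum_sq_diff(x, y), fused_sum_sq_diff(x, y))
```

Both functions compute the same value; the difference is only in how much temporary storage the evaluation needs, which is the quantity that matters once the vectors no longer fit in memory.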
Applying Automated Memory Analysis to Improve Iterative Algorithms
- SIAM J. Sci. Comput., 2007
Cited by 3 (1 self)
Historically, iterative solvers have been designed to minimize the number of floating-point operations. We propose instead that iterative solvers should be designed to minimize the amount of data that must be loaded from the memory hierarchy to the CPU. In this paper, we describe automated memory analysis, a technique to improve the memory efficiency of a sparse linear iterative solver. Our automated memory analysis uses a language processor to predict the data movement required for an iterative algorithm based upon a Matlab implementation. We demonstrate how automated memory analysis is used to reduce the execution time of a component of a global parallel ocean model. In particular, code modifications identified or evaluated through automated memory analysis enable a 46% reduction in execution time for the conjugate gradient solver on a small serial problem. Further, we achieve a 9% reduction in total execution time for the full model on 64 processors. The predictive capabilities of our automated memory analysis can be used to simplify the development of memory-efficient numerical algorithms or software.
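To make the idea of counting data movement rather than floating-point operations concrete, here is a small hand-instrumented sketch. It compares two equivalent ways of performing a residual update followed by a dot product, a step of the kind found in conjugate gradient solvers, and tallies element loads under a deliberately crude cost model in which every vector element read counts as one load and nothing is cached. The cost model and function names are assumptions for illustration; the paper's tool derives its predictions automatically from Matlab source, which this sketch does not attempt.

```python
from typing import List, Tuple


def update_then_dot(r: List[float], q: List[float], alpha: float) -> Tuple[List[float], float, int]:
    """Two passes: r <- r - alpha*q, then dot(r, r). Returns (new_r, dot, loads)."""
    loads = 0
    new_r = []
    for ri, qi in zip(r, q):
        new_r.append(ri - alpha * qi)
        loads += 2              # read r[i] and q[i]
    dot = 0.0
    for ri in new_r:
        dot += ri * ri
        loads += 1              # re-read the updated r[i]
    return new_r, dot, loads


def fused_update_dot(r: List[float], q: List[float], alpha: float) -> Tuple[List[float], float, int]:
    """One pass: update r[i] and accumulate the dot product while the value is in hand."""
    loads = 0
    new_r = []
    dot = 0.0
    for ri, qi in zip(r, q):
        v = ri - alpha * qi
        new_r.append(v)
        dot += v * v
        loads += 2              # read r[i] and q[i]; no second read of r
    return new_r, dot, loads


if __name__ == "__main__":
    r = [1.0, 2.0, 3.0, 4.0]
    q = [1.0, 1.0, 1.0, 1.0]
    print(update_then_dot(r, q, 0.5))
    print(fused_update_dot(r, q, 0.5))
```

Both variants perform the same arithmetic; under this model the fused loop needs 2n loads instead of 3n for vectors of length n, which is the kind of difference a memory-oriented analysis is meant to expose.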
Automated memory analysis: improving the design and implementation of iterative algorithms
- 2005
Cited by 2 (0 self)
has been examined by the signatories, and we find that both the content and the form meet acceptable presentation standards of scholarly work in the above mentioned discipline.
PTask: Operating system abstractions to manage GPUs as compute devices
- Carnegie Mellon University, 2011
Cited by 2 (0 self)
We propose a new set of OS abstractions to support GPUs and other accelerator devices as first-class computing resources. These new abstractions, collectively called the PTask API, support a dataflow programming model. Because a PTask graph consists of OS-managed objects, the kernel has sufficient visibility and control to provide system-wide guarantees like fairness and performance isolation, and can streamline data movement in ways that are impossible under current GPU programming models. Our experience developing the PTask API, along with a gestural interface on Windows 7 and a FUSE-based encrypted file system on Linux, shows that the PTask API can provide important system-wide guarantees where there were previously none, and can enable significant performance improvements, for example a 5× improvement in maximum throughput for the gestural interface.
Categories and Subject Descriptors: D.4.8 [Operating systems]: [Performance]; D.4.7 [Operating systems]: [Organization and Design]
Slice-hoisting for Array-size Inference in MATLAB
- In 16th International Workshop on Languages and Compilers for Parallel Computing, Lecture Notes in Computer Science, 2003
Cited by 1 (1 self)
Inferring variable types precisely is very important to be able to compile MATLAB libraries effectively in the context of the telescoping languages framework being developed at Rice. Past studies have demonstrated the value of type information in optimizing MATLAB [4].

