Results 1 - 10
of
65
File server scaling with network-attached secure disks
- In Proceedings of the 1997 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems
, 1997
"... By providing direct data transfer between storage and client, net-work-attached storage devices have the potential to improve scal-ability for existing distributed file systems (by removing the server as a bottleneck) and bandwidth for new parallel and distributed file systems (through network strip ..."
Abstract
-
Cited by 129 (10 self)
- Add to MetaCart
By providing direct data transfer between storage and client, net-work-attached storage devices have the potential to improve scal-ability for existing distributed file systems (by removing the server as a bottleneck) and bandwidth for new parallel and distributed file systems (through network striping and more efficient data paths). Together, these advantages influence a large enough fraction of the storage market to make commodity network-attached storage fea-sible. Realizing the technology’s full potential requires careful consideration across a wide range of file system, networking and security issues. This paper contrasts two network-attached storage architectures-(l) Networked SCSI disks (NetSCSI) are network-attached storage devices with minimal changes from the familiar SCSI interface, while (2) Network-Attached Secure Disks (NASD) are drives that support independent client access to drive object services. To estimate the potential performance benefits of these architectures, we develop an analytic model and perform trace-driven replay experiments based on AFS and NFS traces. Our results suggest that NetSCSI can reduce tile server load during a burst of NFS or AFS activity by about 30%. With the NASD archi-tecture, server load (during burst activity) can be reduced by a fac-tor of up to five for AFS and up to ten for NFS. 1
Effects of communication latency, overhead, and bandwidth in a cluster architecture
- In Proceedings of the 24th Annual International Symposium on Computer Architecture
, 1997
"... This work provides a systematic study of the impact of communication performance on parallel applications in a high performance network of workstations. We develop an experimental system in which the communication latency, overhead, and bandwidth can be independently varied to observe the effects on ..."
Abstract
-
Cited by 98 (5 self)
- Add to MetaCart
This work provides a systematic study of the impact of communication performance on parallel applications in a high performance network of workstations. We develop an experimental system in which the communication latency, overhead, and bandwidth can be independently varied to observe the effects on a wide range of applications. Our results indicate that current efforts to improve cluster communication performance to that of tightly integrated parallel machines results in significantly improved application performance. We show that applications demonstrate strong sensitivity to overhead, slowing down by a factor of 60 on 32 processors when overhead is increased from 3 to 103 s. Applications in this study are also sensitive to per-message bandwidth, but are surprisingly tolerant of increased latency and lower per-byte bandwidth. Finally, most applications demonstrate a highly linear dependence to both overhead and per-message bandwidth, indicating that further improvements in communication performance will continue to improve application performance. 1
Chameleon: A Software Infrastructure For Adaptive Fault Tolerance In Distributed Systems
, 1998
"... The project has benefited through some demonstrations and presentations given to people from the computer industry, and the discussions they generated. Of them, mention must be made of Pankaj Mehra of Tandem Labs, Roger Lee, Robert Ferraro and Jagdish Patel of the Jet Propulsion Laboratory, Larry Ja ..."
Abstract
-
Cited by 80 (11 self)
- Add to MetaCart
The project has benefited through some demonstrations and presentations given to people from the computer industry, and the discussions they generated. Of them, mention must be made of Pankaj Mehra of Tandem Labs, Roger Lee, Robert Ferraro and Jagdish Patel of the Jet Propulsion Laboratory, Larry Jack and Chet Markiewicz of Honeywell Inc., and Haim Levendel of Motorola. iii TABLE OF CONTENTS CHAPTER PAGE 1 INTRODUCTION : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 1 2 RELATED WORK : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 3 3 CHAMELEON OVERVIEW : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 14 3.1 Behavioral Overview : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 15 3.1.1 Initialization of the Chameleon Environment : : : : : : : : : : : : : : : : 15 3.1.2 Interpreting User-Specified Depen
Active Disks - Remote Execution for Network-Attached Storage
, 1997
"... The principal trend in the design of computer systems is the expectation of much greater computational power in future generations of microprocessors. This trend applies to embedded systems as well as host processors. As a result, devices such as storage controllers have excess capacity and growing ..."
Abstract
-
Cited by 46 (1 self)
- Add to MetaCart
The principal trend in the design of computer systems is the expectation of much greater computational power in future generations of microprocessors. This trend applies to embedded systems as well as host processors. As a result, devices such as storage controllers have excess capacity and growing computational capabilities. Storage system designers are exploiting this trend with higher-level interfaces to storage and increased intelligence inside storage devices. One development in this direction is Network-Attached Secure Disks (NASD) which attaches storage devices directly to the network and raises the storage interface above the simple (fixed-size block) memory abstraction of SCSI. This allows devices more freedom to provide efficient operations; promises more scalable subsystems by offloading file system and storage management functionality from dedicated servers; and reduces latency by executing common case requests directly at storage devices. In this paper, we push this increa...
Commercial Fault Tolerance: A Tale of Two Systems
- IEEE Transactions on Dependable and Secure Computing
, 2004
"... Abstract—This paper compares and contrasts the design philosophies and implementations of two computer system families: the IBM S/360 and its evolution to the current zSeries line, and the Tandem (now HP) NonStop1 Server. Both systems have a long history; the initial IBM S/360 machines were shipped ..."
Abstract
-
Cited by 37 (0 self)
- Add to MetaCart
Abstract—This paper compares and contrasts the design philosophies and implementations of two computer system families: the IBM S/360 and its evolution to the current zSeries line, and the Tandem (now HP) NonStop1 Server. Both systems have a long history; the initial IBM S/360 machines were shipped in 1964, and the Tandem NonStop System was first shipped in 1976. They were aimed at similar markets, what would today be called enterprise-class applications. The requirement for the original S/360 line was for very high availability; the requirement for the NonStop platform was for single fault tolerance against unplanned outages. Since their initial shipments, availability expectations for both platforms have continued to rise and the system designers and developers have been challenged to keep up. There were and still are many similarities in the design philosophies of the two lines, including the use of redundant components and extensive error checking. The primary difference is that the S/360-zSeries focus has been on localized retry and restore to keep processors functioning as long as possible, while the NonStop developers have based systems on a loosely coupled multiprocessor design that supports a “fail-fast ” philosophy implemented through a combination of hardware and software, with workload being actively taken over by another resource when one fails. Index Terms—Computer systems implementation, fault tolerance, high availability. 1
Security for a High Performance Commodity Storage Subsystem
, 1999
"... and the United States Postal Service. The views and conclusions in this document are my own and should not be interpreted as representing the official policies, either expressed or implied, of any supporting organization or the U.S. Government. ..."
Abstract
-
Cited by 36 (1 self)
- Add to MetaCart
and the United States Postal Service. The views and conclusions in this document are my own and should not be interpreted as representing the official policies, either expressed or implied, of any supporting organization or the U.S. Government.
Highly Concurrent Shared Storage
- In Proceedings of the 20th International Conference on Distributed Computing Systems
, 2000
"... . Shared storage arrays enable thousands of storage devices to be shared and directly accessed by end hosts over switched system-area networks, promising databases and filesystems highly scalable, reliable storage. In such systems, hosts perform access tasks (read and write) and management tasks ..."
Abstract
-
Cited by 34 (6 self)
- Add to MetaCart
. Shared storage arrays enable thousands of storage devices to be shared and directly accessed by end hosts over switched system-area networks, promising databases and filesystems highly scalable, reliable storage. In such systems, hosts perform access tasks (read and write) and management tasks (migration and reconstruction of data on failed devices.) Each task translates into multiple phases of low-level device I/Os, so that concurrent host tasks can span multiple shared devices and access overlapping ranges potentially leading to inconsistencies for redundancy codes and for data read by end hosts. Highly scalable concurrency control and recovery protocols are required to coordinate on-line storage management and access tasks. While expressing storage-level tasks as ACID transactions ensures proper concurrency control and recovery, such an approach imposes high performance overhead, results in replication of work and does not exploit the available knowledge about storage le...
The Direct Access File System
- In Proceedings of Second USENIX Conference on File and Storage Technologies (FAST ’03
, 2003
"... Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein. ..."
Abstract
-
Cited by 31 (1 self)
- Add to MetaCart
Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein.
Fine-Grain Distributed Shared Memory on Clusters of Workstations
, 1997
"... Shared memory, one of the most popular models for programming parallel platforms, is becoming ubiquitous both in low-end workstations and high-end servers. With the advent of low-latency networking hardware, clusters of workstations strive to offer the same processing power as high-end servers for a ..."
Abstract
-
Cited by 30 (8 self)
- Add to MetaCart
Shared memory, one of the most popular models for programming parallel platforms, is becoming ubiquitous both in low-end workstations and high-end servers. With the advent of low-latency networking hardware, clusters of workstations strive to offer the same processing power as high-end servers for a fraction of the cost. In such environments, shared memory has been limited to page-based systems that control access to shared memory using the memory's page protection to implement shared memory coherence protocols. Unfortunately, false sharing and fragmentation problems force such systems to resort to weak consistency shared memory models that complicate the shared memory programming model.
High performance virtual machines (HPVM): Clusters with supercomputing APIs and performance
- in: Proceedings of the 8th SIAM Conference on Parallel Processing for Scientific Computing
, 1997
"... The HPVM project provides software which enables high-performance computing on clusters of PCs and workstations using standard supercomputing APIs such as MPI, SHMEM Put/Get, and Global Arrays. HPVMs—High-Performance Virtual Machines—are surprisingly competitive with MPP systems, such as the IBM SP2 ..."
Abstract
-
Cited by 29 (4 self)
- Add to MetaCart
The HPVM project provides software which enables high-performance computing on clusters of PCs and workstations using standard supercomputing APIs such as MPI, SHMEM Put/Get, and Global Arrays. HPVMs—High-Performance Virtual Machines—are surprisingly competitive with MPP systems, such as the IBM SP2 and Cray T3D. The Illinois HPVM achieves impressive low-level communication performance across the cluster: one-way latencies of around 11 µsec and bandwidths> 50 MBytes/sec—even for small packets (< 256 bytes). Performance at higher levels, such as MPI, is expected to be approximately 17 µsec latency and also> 50 MByte/sec bandwidth.

