Eliminating receive livelock in an interrupt-driven kernel
- ACM Transactions on Computer Systems
, 1997
Cited by 241 (4 self)
Most operating systems use interface interrupts to schedule network tasks. Interrupt-driven systems can provide low overhead and good latency at low offered load, but degrade significantly at higher arrival rates unless care is taken to prevent several pathologies. These are various forms of receive livelock, in which the system spends all its time processing interrupts, to the exclusion of other necessary tasks. Under extreme conditions, no packets are delivered to the user application or the output of the system. To avoid livelock and related problems, an operating system must schedule network interrupt handling as carefully as it schedules process execution. We modified an interrupt-driven networking implementation to do so; this eliminates receive livelock without degrading other aspects of system performance. We present measurements demonstrating the success of our approach.
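The livelock effect described in this abstract can be illustrated with a toy per-tick CPU model. This is only a sketch under stated assumptions, not the authors' implementation: the function name, the cost parameters, and the `rx_quota` limit on receive-side work are all hypothetical, chosen to show why bounding receive processing leaves CPU time for delivery.

```python
def delivered_per_tick(arrivals, cpu_budget=100, rx_cost=2, app_cost=3, rx_quota=None):
    """Packets delivered to the application in one tick of a toy model.

    All parameters are illustrative assumptions:
    cpu_budget - CPU units available per tick
    rx_cost    - CPU units spent receiving one packet (interrupt-level work)
    app_cost   - CPU units spent delivering one packet to the application
    rx_quota   - optional cap on packets accepted per tick (None = no cap)
    """
    # Receive-side work runs first, modelling interrupt priority.
    accepted = arrivals if rx_quota is None else min(arrivals, rx_quota)
    rx_packets = min(accepted, cpu_budget // rx_cost)
    remaining = cpu_budget - rx_packets * rx_cost
    # Whatever CPU is left goes to delivering received packets.
    return min(rx_packets, remaining // app_cost)


if __name__ == "__main__":
    for load in (10, 20, 30, 50, 1000):
        print(load, delivered_per_tick(load), delivered_per_tick(load, rx_quota=20))
```

In this model, uncapped receive processing consumes the whole budget once the load reaches 50 packets per tick, so nothing is delivered; with the hypothetical quota of 20, throughput stays flat under overload instead of collapsing.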
Potential benefits of delta encoding and data compression for HTTP (Corrected version)
, 1997
Network Behavior of a Busy Web Server and its Clients
, 1995
Cited by 92 (2 self)
research relevant to the design and application of high performance scientific computers. We test our ideas by designing, building, and using real systems. The systems we build are research prototypes; they are not intended to become products. There are two other research laboratories located in Palo Alto, the Network Systems
Scalable kernel performance for Internet servers under realistic loads
, 1998
Cited by 86 (9 self)
UNIX Internet servers with an event-driven architecture often perform poorly under real workloads, even if they perform well under laboratory benchmarking conditions. We investigated the poor performance of event-driven servers. We found that the delays typical in wide-area networks cause busy servers to manage a large number of simultaneous connections. We also observed that the select system call implementation in most UNIX kernels scales poorly with the number of connections being managed by a process. The UNIX algorithm for allocating file descriptors also scales poorly. These algorithmic problems lead directly to the poor performance of event-driven servers. We implemented scalable versions of the select system call and the descriptor allocation algorithm. This led to an improvement of up to 58% in Web proxy and Web server throughput, and dramatically improved the scalability of the system.
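The abstract does not say how the descriptor allocation algorithm was made scalable. POSIX requires `open()` to return the lowest-numbered descriptor not currently open, and a linear scan for that number costs time proportional to the table size. As one possible approach (an assumption, not necessarily the one used in the paper), a min-heap of released numbers finds the lowest free descriptor in logarithmic time. The class name `FdTable` and its methods are hypothetical.

```python
import heapq


class FdTable:
    """Toy descriptor table that always hands out the lowest free number."""

    def __init__(self):
        self._free = []   # min-heap of descriptor numbers that were released
        self._next = 0    # smallest number that has never been handed out

    def alloc(self):
        # Every released number is smaller than _next, so the heap minimum
        # (when the heap is non-empty) is the lowest free descriptor.
        if self._free:
            return heapq.heappop(self._free)
        fd = self._next
        self._next += 1
        return fd

    def release(self, fd):
        # Assumes fd is currently allocated; no double-release check here.
        heapq.heappush(self._free, fd)
```

Allocation and release are O(log n) in the number of released descriptors, compared with a scan that grows with the number of open connections.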
Efficient Procedure Mapping using Cache Line Coloring
- IN PROCEEDINGS OF THE SIGPLAN'97 CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION
, 1997
Cited by 67 (12 self)
As the gap between memory and processor performance continues to widen, it becomes increasingly important to exploit cache memory effectively. Both hardware and software approaches can be explored to optimize cache performance. Hardware designers focus on cache organization issues, including replacement policy, associativity, line size and the resulting cache access time. Software writers use various optimization techniques, including software prefetching, data scheduling and code reordering. Our focus is on improving memory usage through code reordering compiler techniques. In this
Memory-System Design Considerations For Dynamically-Scheduled Microprocessors
, 1997
Cited by 66 (4 self)
Keith Istvan Farkas. Doctor of Philosophy dissertation, Graduate Department of Electrical and Computer Engineering, University of Toronto, 1997.
Dynamically-scheduled processors challenge hardware and software architects to develop designs that balance hardware complexity and compiler technology against performance targets. This dissertation presents a first thorough look at some of the issues introduced by this hardware complexity. The focus of the investigation of these issues is the register file and the other components of the data memory system. These components are: the lockup-free data cache, the stream buffers, and the interface to the lower levels of the memory system. The investigation is based on software models. These models incorporate the features of a dynamically-scheduled processor that affect the design of the data-memory components. The models represent a balance between accuracy and generality, and ar...
Operating system support for busy internet servers
- In Proceedings of the Fifth Workshop on Hot Topics in Operating Systems (HotOS-V), Orcas Island
, 1995
Cited by 50 (2 self)
mogul@wrl.dec.com
The Internet has experienced exponential growth in the use of the World-Wide Web, and rapid growth in the use of other Internet services such as USENET news and electronic mail. These applications qualitatively differ from other network applications in the stresses they impose on busy server systems. Unlike traditional distributed systems, Internet servers must cope with huge user communities, short interactions, and long network latencies. Such servers require different kinds of operating system features to manage their resources effectively.
The Predictability of Branches in Libraries
- IN 28TH INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE
, 1995
Cited by 33 (6 self)
Profile-based optimizations are being used with increasing frequency. Profile
Attribute Caches
- Kathy
, 1995
Cited by 8 (0 self)
Workloads generate a variety of disk I/O requests to access file information, execute programs, and perform computation. I/O caches capture many of these requests, reducing execution time, providing high I/O rates, and decreasing the disk bandwidth needed by each workload. Workload component characterization shows file type and size information can be used to group requests with similar reuse rates and access patterns. Attribute caches have various partitions to capture the statistically distinct component behavior of the workload, each tailored to cache files with certain properties or attributes. Information about an I/O request becomes an attribute that determines how best to cache a request. Using attributes, cache resources are allocated to capture specific types of I/O data locality. The paper develops an attribute cache scheme to improve total I/O cache performance. The scheme relies on workload characteristics to determine the appropriate cache configuration for a given cache s...
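The partitioning idea in this abstract can be sketched as a cache split into independent regions, one per request attribute. This is a minimal illustration only: the class name, the per-partition LRU replacement policy, and the attribute labels are assumptions, since the abstract does not specify how each partition is managed.

```python
from collections import OrderedDict


class PartitionedCache:
    """Toy attribute-partitioned cache with a separate LRU list per attribute."""

    def __init__(self, capacities):
        # capacities: hypothetical mapping of attribute -> max cached blocks
        self._caps = dict(capacities)
        self._parts = {attr: OrderedDict() for attr in capacities}

    def access(self, attr, block):
        """Return True on a hit; on a miss, cache the block (LRU eviction)."""
        part = self._parts[attr]
        if block in part:
            part.move_to_end(block)      # mark as most recently used
            return True
        part[block] = None
        if len(part) > self._caps[attr]:
            part.popitem(last=False)     # evict least recently used block
        return False
```

Because each attribute has its own partition, a burst of requests in one class (for example, a scan of large files) cannot evict blocks cached for another class.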
Efficient Dynamic Procedure Placement
, 1998
Cited by 8 (0 self)
Commercial applications such as database servers often have very large instruction footprints and consequently are frequently stalled due to instruction cache misses. A large fraction of the i-cache misses are typically due to conflicts in the relatively small direct-mapped on-chip instruction caches. A variety of tools have been developed to try to order the procedures of an application to minimize these conflicts. Such tools often make use of profile information to place procedures so that procedures that frequently call each other do not conflict in the i-cache. However, users often avoid using any kind of tool that requires them to do extra profiling and linking steps to optimize their application. In addition, any tool that does a static layout of procedures (whether using profiling information or not) cannot adapt to varying application workloads that cause very different application behavior. We have developed a method called DPP (dynamic procedure placement) for pl...
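The conflicts mentioned in this abstract arise because a direct-mapped cache maps each memory line to exactly one cache line, so two procedures whose addresses differ by a multiple of the cache size compete for the same lines. The sketch below checks whether two procedures overlap in such a cache. The function names, the 32-byte line size, and the 256-line cache are illustrative assumptions and are not taken from the paper; this is not the DPP method itself.

```python
def cache_lines(start, size, line_size=32, num_lines=256):
    """Set of direct-mapped cache line indices occupied by a procedure.

    start and size are in bytes; line_size and num_lines are assumed values.
    """
    first = start // line_size
    last = (start + size - 1) // line_size
    return {line % num_lines for line in range(first, last + 1)}


def conflicts(proc_a, proc_b, **cache):
    """True if two (start, size) procedures share any cache line index."""
    return bool(cache_lines(*proc_a, **cache) & cache_lines(*proc_b, **cache))
```

With these assumed parameters the cache holds 8192 bytes, so `conflicts((0, 100), (8192, 100))` is true, while moving the second procedure to address 8320 removes the overlap. A placement tool would aim to arrange procedures that call each other frequently so that this check fails.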

