Eliminating receive livelock in an interrupt-driven kernel
- ACM Transactions on Computer Systems
, 1997
Cited by 241 (4 self)
Most operating systems use interface interrupts to schedule network tasks. Interrupt-driven systems can provide low overhead and good latency at low offered load, but degrade significantly at higher arrival rates unless care is taken to prevent several pathologies. These are various forms of receive livelock, in which the system spends all its time processing interrupts, to the exclusion of other necessary tasks. Under extreme conditions, no packets are delivered to the user application or the output of the system. To avoid livelock and related problems, an operating system must schedule network interrupt handling as carefully as it schedules process execution. We modified an interrupt-driven networking implementation to do so; this eliminates receive livelock without degrading other aspects of system performance. We present measurements demonstrating the success of our approach.
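The livelock effect described in this abstract can be illustrated with a toy model. The sketch below is not the authors' implementation: the cost model (one work unit to receive a packet, one to deliver it), the `simulate` function, and the `rx_quota` parameter are all hypothetical, chosen only to show how unbounded receive-side work can starve delivery and how a per-tick cap on receive work restores throughput.

```python
def simulate(arrivals_per_tick, capacity=10, ticks=100, rx_quota=None, queue_limit=50):
    """Return the number of packets delivered to the application.

    Hypothetical cost model: receiving a packet costs 1 work unit,
    delivering it costs 1 work unit, and each tick offers `capacity` units.
    """
    queue = 0
    delivered = 0
    for _ in range(ticks):
        budget = capacity
        # Receive work runs first, the way interrupt handling preempts other work.
        limit = budget if rx_quota is None else min(rx_quota, budget)
        received = min(arrivals_per_tick, limit)
        budget -= received
        # Packets that overflow the queue are dropped after the work was already spent.
        queue = min(queue + received, queue_limit)
        # Only the leftover budget is available for delivery.
        done = min(queue, budget)
        queue -= done
        delivered += done
    return delivered


if __name__ == "__main__":
    for rate in (4, 8, 20):
        print(rate, simulate(rate), simulate(rate, rx_quota=5))
```

In this model, delivered throughput falls as the arrival rate approaches the capacity and reaches zero once receive work alone consumes every tick, while capping receive work leaves a fixed share of each tick for delivery.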
Potential benefits of delta encoding and data compression for HTTP (Corrected version)
, 1997
Scalable kernel performance for Internet servers under realistic loads
, 1998
Cited by 86 (9 self)
UNIX Internet servers with an event-driven architecture often perform poorly under real workloads, even if they perform well under laboratory benchmarking conditions. We investigated the poor performance of event-driven servers. We found that the delays typical in wide-area networks cause busy servers to manage a large number of simultaneous connections. We also observed that the select system call implementation in most UNIX kernels scales poorly with the number of connections being managed by a process. The UNIX algorithm for allocating file descriptors also scales poorly. These algorithmic problems lead directly to the poor performance of event-driven servers. We implemented scalable versions of the select system call and the descriptor allocation algorithm. This led to an improvement of up to 58% in Web proxy and Web server throughput, and dramatically improved the scalability of the system.
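To make the descriptor-allocation scaling problem concrete, here is a minimal sketch comparing two ways of finding the lowest free descriptor, which is what UNIX semantics require. Neither class is taken from the paper: `LinearFdTable` and `HeapFdTable` are hypothetical names, and the heap-based variant is just one possible way to avoid a linear scan, not necessarily the algorithm the authors implemented.

```python
import heapq


class LinearFdTable:
    """Lowest-free-descriptor allocation by scanning the table: O(n) per call."""

    def __init__(self):
        self.used = []  # used[fd] is True while fd is open

    def alloc(self):
        for fd, in_use in enumerate(self.used):
            if not in_use:
                self.used[fd] = True
                return fd
        self.used.append(True)
        return len(self.used) - 1

    def free(self, fd):
        self.used[fd] = False


class HeapFdTable:
    """Same result using a min-heap of freed descriptors: O(log n) per call.

    Assumes callers never free a descriptor twice.
    """

    def __init__(self):
        self.free_heap = []   # descriptors that were closed and can be reused
        self.next_unused = 0  # smallest descriptor never handed out

    def alloc(self):
        if self.free_heap:
            # Every freed descriptor is smaller than next_unused,
            # so the heap minimum is the lowest free descriptor.
            return heapq.heappop(self.free_heap)
        fd = self.next_unused
        self.next_unused += 1
        return fd

    def free(self, fd):
        heapq.heappush(self.free_heap, fd)


def trace(table):
    """Run a small open/close sequence and return the descriptors handed out."""
    out = [table.alloc() for _ in range(5)]
    table.free(3)
    table.free(1)
    out += [table.alloc() for _ in range(3)]
    return out
```

Both tables hand out the same descriptors for the same sequence of calls; they differ only in how the cost of `alloc` grows with the number of open descriptors.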
Memory-System Design Considerations For Dynamically-Scheduled Microprocessors
, 1997
Cited by 66 (4 self)
Keith Istvan Farkas, Doctor of Philosophy, Graduate Department of Electrical and Computer Engineering, University of Toronto, 1997.
Dynamically-scheduled processors challenge hardware and software architects to develop designs that balance hardware complexity and compiler technology against performance targets. This dissertation presents a first thorough look at some of the issues introduced by this hardware complexity. The focus of the investigation of these issues is the register file and the other components of the data memory system. These components are: the lockup-free data cache, the stream buffers, and the interface to the lower levels of the memory system. The investigation is based on software models. These models incorporate the features of a dynamically-scheduled processor that affect the design of the data-memory components. The models represent a balance between accuracy and generality, and ar...
Memory Consistency Models for Shared-Memory Multiprocessors
- WRL RESEARCH REPORT
, 1995
Cited by 61 (1 self)
The memory consistency model for a shared-memory multiprocessor specifies the behavior of memory with respect to read and write operations from multiple processors. As such, the memory model influences many aspects of system design, including the design of programming languages, compilers, and the underlying hardware. Relaxed models that impose fewer memory ordering constraints offer the potential for higher performance by allowing hardware and software to overlap and reorder memory operations. However, fewer ordering guarantees can compromise programmability and portability. Many of the previously proposed models either fail to provide reasonable programming semantics or are biased toward programming ease at the cost of sacrificing performance. Furthermore, the lack of consensus on an acceptable model hinders software portability across different systems. This dissertation focuses on providing a balanced solution that directly addresses the trade-off between programming ease and performance. To address programmability, we propose an alternative method for specifying memory behavior that presents a higher level abstraction to the programmer. We show that with only a few types of information supplied by the
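The effect of relaxing memory ordering constraints can be illustrated with the classic two-processor "store buffering" litmus test. The sketch below is only an illustration, not the specification method proposed in the dissertation: the `outcomes` helper is hypothetical, and reordering is modeled crudely by swapping each processor's write and read in program order before enumerating interleavings.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's internal order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def outcomes(prog0, prog1):
    """Return the set of (r0, r1) register values over all interleavings.

    Each program is a list of ("w", var) or ("r", var) operations;
    writes store 1, reads load into the issuing processor's register.
    """
    tagged0 = [(0, kind, var) for kind, var in prog0]
    tagged1 = [(1, kind, var) for kind, var in prog1]
    results = set()
    for order in interleavings(tagged0, tagged1):
        memory = {"x": 0, "y": 0}
        regs = {}
        for proc, kind, var in order:
            if kind == "w":
                memory[var] = 1
            else:
                regs[proc] = memory[var]
        results.add((regs[0], regs[1]))
    return results


# Program order preserved: P0 writes x then reads y, P1 writes y then reads x.
sc = outcomes([("w", "x"), ("r", "y")], [("w", "y"), ("r", "x")])

# Assumed relaxation: each read is allowed to complete before the earlier write.
relaxed = outcomes([("r", "y"), ("w", "x")], [("r", "x"), ("w", "y")])
```

With program order preserved, no interleaving lets both processors read 0; once the read may pass the earlier write, that outcome becomes possible, which is the kind of behavior a programmer must reason about under a relaxed model.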
Operating system support for busy internet servers
- In Proceedings of the Fifth Workshop on Hot Topics in Operating Systems (HotOS-V), Orcas Island
, 1995
Cited by 50 (2 self)
mogul@wrl.dec.com
The Internet has experienced exponential growth in the use of the World-Wide Web, and rapid growth in the use of other Internet services such as USENET news and electronic mail. These applications qualitatively differ from other network applications in the stresses they impose on busy server systems. Unlike traditional distributed systems, Internet servers must cope with huge user communities, short interactions, and long network latencies. Such servers require different kinds of operating system features to manage their resources effectively.
Register File Design Considerations in Dynamically Scheduled Processors
- In Proceedings of the Second IEEE Symposium on High-Performance Computer Architecture
, 1995
Cited by 40 (1 self)
We have investigated the register file requirements of dynamically scheduled processors using register renaming and dispatch queues running the SPEC92 benchmarks. We looked at processors capable of issuing either four or eight instructions per cycle and found that in most cases implementing precise exceptions requires a relatively small number of additional registers compared to imprecise exceptions. Systems with aggressive non-blocking load support were able to achieve performance similar to processors with perfect memory systems at the cost of some additional registers. Given our machine assumptions, we found that the performance of a four-issue machine with a 32-entry dispatch queue tends to saturate around 80 registers. For an eight-issue machine with a 64-entry dispatch queue performance does not saturate until about 128 registers. Assuming the machine cycle time is proportional to the register file cycle time, the 8-issue machine yields only 20% higher performance than the 4-issue machine due in part...
The Predictability of Branches in Libraries
- IN 28TH INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE
, 1995
Cited by 33 (6 self)
Profile-based optimizations are being used with increasing frequency. Profile
The Trianus System and its Application to Custom Computing
- 6th Intl. Workshop on Field-Programmable Logic and Applications. LNCS 1142
, 1996
Cited by 19 (3 self)
We describe the Trianus software system which consists of a suite of tightly integrated tools for the efficient design and implementation of algorithms using a custom computing machine. The software is built upon a generic framework for FPGA circuit design and comprises a compiler for the Lola hardware description language, a layout editor, a circuit checker, a technology mapper, a placer, a router, and a bit-stream generator and loader for the Xilinx XC6200 architecture. We argue that a tight coupling of design tools provides a base for fast iterative and interactive circuit design, a feature which current systems provide only in a very limited form.
Drip: A Schematic Drawing Interpreter
- WRL Research Report 95/1
, 1995
Cited by 11 (0 self)
This paper presents a design capture system in which schematics are translated into a procedural netlist specification language. The circuit designer draws schematics with a standard structured graphics editor that knows nothing about netlists or schematics. The translator program analyzes the structured graphics output file and translates it into a procedural netlist specification.
Digital Western Research Laboratory, 250 University Avenue, Palo Alto, California 94301 USA

