Verifying Quantitative Reliability for Programs That Execute on Unreliable Hardware
Cited by 25 (3 self)
Emerging high-performance architectures are anticipated to contain unreliable components that may exhibit soft errors, which silently corrupt the results of computations. Full detection and masking of soft errors is challenging, expensive, and, for some applications, unnecessary. For example, approximate computing applications (such as multimedia processing, machine learning, and big data analytics) can often naturally tolerate soft errors. We present Rely, a programming language that enables developers to reason about the quantitative reliability of an application – namely, the probability that it produces the correct result when executed on unreliable hardware. Rely allows developers to specify the reliability requirements for each value that a function produces. We present a static quantitative reliability analysis that verifies quantitative requirements on the reliability of an application, enabling a developer to perform sound and verified reliability engineering. The analysis takes a Rely program with a reliability specification and a hardware specification that characterizes the reliability of the underlying hardware components and verifies that the program satisfies its reliability specification when executed on the underlying unreliable hardware platform. We demonstrate the application of quantitative reliability analysis on six computations implemented
SAGE: Self-tuning approximation for graphics engines
In Proc. of the 46th Annual International Symposium on Microarchitecture, 2013
Cited by 16 (4 self)
Approximate computing, where computation accuracy is traded off for better performance or higher data throughput, is one solution that can help data processing keep pace with the current and growing overabundance of information. For particular domains such as multimedia and learning algorithms, approximation is commonly used today. We consider automation to be essential to provide transparent approximation and we show that larger benefits can be achieved by constructing the approximation techniques to fit the underlying hardware. Our target platform is the GPU because of its high performance capabilities and difficult programming challenges that can be alleviated with proper automation. Our approach, SAGE, combines a static compiler that automatically generates a set of CUDA kernels with varying levels of approximation with a runtime system that iteratively selects among the available kernels to achieve speedup while adhering to a target output quality set by the user. The SAGE compiler employs three optimization techniques to generate approximate kernels that exploit the GPU microarchitecture: selective discarding of atomic operations, data packing, and thread fusion. Across a set of machine learning and image processing kernels, SAGE’s approximation yields an average of 2.5x speedup with less than 10% quality loss compared to the accurate execution on an NVIDIA GTX 560 GPU.
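The runtime selection step described in this abstract can be illustrated with a small sketch. It is not SAGE's implementation: the "kernels" here are plain Python functions that average a strided subset of the input (a stand-in for the CUDA variants), and the names `make_strided_mean`, `quality`, and `select_kernel` are hypothetical. The sketch assumes the variants are ordered from most to least aggressive and that quality is measured as relative agreement with the exact result, and it does a single calibration pass rather than the iterative selection the abstract mentions.

```python
from typing import Callable, List, Sequence, Tuple

Kernel = Callable[[Sequence[float]], float]


def exact_mean(xs: Sequence[float]) -> float:
    """Accurate reference kernel."""
    return sum(xs) / len(xs)


def make_strided_mean(stride: int) -> Kernel:
    """Approximate kernel (assumed for illustration): average every stride-th element."""
    def kernel(xs: Sequence[float]) -> float:
        sample = xs[::stride]
        return sum(sample) / len(sample)
    return kernel


def quality(exact: float, approx: float) -> float:
    """Output quality in percent, as relative agreement with the exact result."""
    if exact == 0:
        return 100.0 if approx == 0 else 0.0
    return max(0.0, 100.0 * (1.0 - abs(approx - exact) / abs(exact)))


def select_kernel(kernels: List[Tuple[str, Kernel]],
                  data: Sequence[float],
                  toq: float) -> str:
    """Return the most aggressive kernel whose quality meets the target.

    `kernels` must be ordered from most to least aggressive, ending with
    the exact kernel, which always satisfies the target.
    """
    reference = kernels[-1][1](data)
    for name, kernel in kernels:
        if quality(reference, kernel(data)) >= toq:
            return name
    return kernels[-1][0]


kernels = [
    ("stride8", make_strided_mean(8)),
    ("stride4", make_strided_mean(4)),
    ("stride2", make_strided_mean(2)),
    ("exact", exact_mean),
]
data = [float(i) for i in range(1, 101)]
chosen = select_kernel(kernels, data, toq=90.0)
```

A real runtime would repeat this check periodically, since the quality of an approximate kernel can drift as the input changes.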
Paraprox: pattern-based approximation for data parallel applications
In 19th International Conference on Architectural Support for Programming Languages and Operating Systems, 2014
Cited by 12 (3 self)
Approximate computing is an approach where reduced accuracy of results is traded off for increased speed, throughput, or both. Loss of accuracy is not permissible in all computing domains, but there are a growing number of data-intensive domains where the output of programs need not be perfectly correct to provide useful results or even noticeable differences to the end user. These soft domains include multimedia processing, machine learning, and data mining/analysis. An important challenge with approximate computing is transparency to insulate both software and hardware developers from the time, cost, and difficulty of using approximation. This paper proposes a software-only system, Paraprox, for realizing transparent approximation of data-parallel programs that operates on commodity hardware systems. Paraprox starts with a data-parallel kernel implemented using OpenCL or CUDA and creates a parameterized approximate kernel that is tuned at runtime to maximize performance subject to a target output quality (TOQ) that is supplied by the user. Approximate kernels are created by recognizing common computation idioms found in data-parallel programs (e.g., Map, Scatter/Gather, Reduction, Scan, Stencil, and Partition) and substituting approximate implementations in their place. Across a set of 13 soft data-parallel applications with at most 10% quality degradation, Paraprox yields an average performance gain of 2.7x on an NVIDIA GTX 560 GPU and 2.5x on an Intel Core i7 quad-core processor compared to accurate execution on each platform.
Uncertain<T>: A first-order type for uncertain data
In ASPLOS, 2014
Cited by 11 (4 self)
Emerging applications increasingly use estimates such as sensor data (GPS), probabilistic models, machine learning, big data, and human data. Unfortunately, representing this uncertain data with discrete types (floats, integers, and booleans) encourages developers to pretend it is not probabilistic, which causes three types of uncertainty bugs. (1) Using estimates as facts ignores random error in estimates. (2) Computation compounds that error. (3) Boolean questions on probabilistic data induce false positives and negatives. This paper introduces Uncertain〈T〉, a new programming language abstraction for uncertain data. We implement a Bayesian network semantics for computation and conditionals that improves program correctness. The runtime uses sampling and hypothesis tests to evaluate computation and conditionals lazily and efficiently. We illustrate with sensor and machine learning applications that Uncertain〈T〉 improves expressiveness and accuracy. Whereas previous probabilistic programming languages focus on experts, Uncertain〈T〉 serves a wide range of developers. Experts still identify error distributions. However, both experts and application writers compute with distributions, improve estimates with domain knowledge, and ask questions with conditionals. The Uncertain〈T〉 type system and operators encourage developers to expose and reason about uncertainty explicitly, controlling false positives and false negatives. These benefits make Uncertain〈T〉 a compelling programming model for modern applications facing the challenge of uncertainty.
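To make the idea of computing with distributions and asking conditional questions concrete, here is a minimal Python sketch. It is not the Uncertain〈T〉 API: the class names, the `pr` method, and the `gaussian` helper are hypothetical, and the conditional is decided with a fixed number of samples rather than the hypothesis tests the abstract describes. It also assumes operands are independent, so it does not model shared dependencies the way a Bayesian network would.

```python
import random
from typing import Callable


class UncertainBool:
    """A Boolean question about uncertain data, evaluated only on demand."""

    def __init__(self, sampler: Callable[[random.Random], bool]):
        self.sampler = sampler

    def pr(self, prob: float = 0.5, n: int = 2000, seed: int = 0) -> bool:
        """True if the sampled probability of the condition exceeds `prob`.

        Fixed-size sampling is a simplification assumed for this sketch.
        """
        rng = random.Random(seed)
        hits = sum(1 for _ in range(n) if self.sampler(rng))
        return hits / n > prob


class Uncertain:
    """A value represented by a sampling function instead of a single number."""

    def __init__(self, sampler: Callable[[random.Random], float]):
        self.sampler = sampler

    def __add__(self, other):
        # Lift plain constants so they can be combined with distributions.
        rhs = other if isinstance(other, Uncertain) else Uncertain(lambda rng: other)
        # Lazy: nothing is sampled until a conditional is evaluated.
        return Uncertain(lambda rng: self.sampler(rng) + rhs.sampler(rng))

    def __gt__(self, threshold: float) -> UncertainBool:
        return UncertainBool(lambda rng: self.sampler(rng) > threshold)


def gaussian(mean: float, sd: float) -> Uncertain:
    """Hypothetical helper: a normally distributed estimate."""
    return Uncertain(lambda rng: rng.gauss(mean, sd))


# A speed estimate with random error (mean 4.0, standard deviation 1.0).
speed = gaussian(4.0, 1.0)

# Ask for strong evidence instead of treating the estimate as a fact.
likely_above_2 = (speed > 2.0).pr(0.9)
likely_above_4 = (speed > 4.0).pr(0.9)
```

Comparing the mean directly against 4.0 would hide the fact that the estimate exceeds that threshold only about half the time; requiring 90% evidence makes the uncertainty explicit.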
Memristor-based approximated computation
In ISLPED 2013
Cited by 6 (5 self)
The cessation of Moore’s Law has limited further improvements in power efficiency. In recent years, the physical realization of the memristor has demonstrated a promising solution to ultra-integrated hardware realization of neural networks, which can be leveraged for better performance and power efficiency gains. In this work, we introduce a power-efficient framework for approximated computations by taking advantage of memristor-based multilayer neural networks. A programmable memristor approximated computation unit (Memristor ACU) is introduced first to accelerate approximated computation, and a scalable memristor-based approximated computation framework is proposed on top of the Memristor ACU. We also introduce a parameter configuration algorithm for the Memristor ACU and a feedback state tuning circuit to program the Memristor ACU effectively. Our simulation results show that the maximum error of the Memristor ACU for 6 common complex functions is only 1.87% while the state tuning circuit can achieve 12-bit precision. The implementation of the HMAX model atop our proposed memristor-based approximated computation framework demonstrates a 22× power efficiency improvement over its pure digital implementation counterpart. Index Terms: memristor, approximated computation, power efficiency, neuromorphic
Expressing and Verifying Probabilistic Assertions
Cited by 6 (3 self)
Traditional assertions express correctness properties that must hold on every program execution. However, many applications have probabilistic outcomes and consequently their correctness properties are also probabilistic (e.g., they identify faces in images, consume sensor data, or run on unreliable hardware). Traditional assertions do not capture these correctness properties. This paper proposes that programmers express probabilistic correctness properties with probabilistic assertions and describes a new probabilistic evaluation approach to efficiently verify these assertions. Probabilistic assertions are Boolean expressions that express the probability that a property will be true in a given execution rather than asserting that the property must always be true. Given either specific inputs or distributions on the input space, probabilistic evaluation verifies probabilistic assertions by first performing distribution extraction to represent the program as a Bayesian network. Probabilistic evaluation then uses statistical properties to simplify this representation to efficiently compute assertion probabilities directly or with sampling. Our approach is a mix of both static and dynamic analysis: distribution extraction statically builds and optimizes the Bayesian network representation and sampling dynamically interprets this representation. We implement our approach in a tool called MAYHAP for C and C++ programs. We evaluate expressiveness, correctness, and performance of MAYHAP on programs that use sensors, perform approximate computation, and obfuscate data for privacy. Our case studies demonstrate that probabilistic assertions describe useful correctness properties and that MAYHAP efficiently verifies them.
A General Constraint-centric Scheduling Framework for Spatial Architectures
Cited by 4 (1 self)
Specialized execution using spatial architectures provides energy-efficient computation, but requires effective algorithms for spatially scheduling the computation. Generally, this has been solved with architecture-specific heuristics, an approach that suffers from poor compiler/architect productivity, offers little insight into optimality, and inhibits migration of techniques between architectures. Our goal is to develop a scheduling framework usable for all spatial architectures. To this end, we express spatial scheduling as a constraint satisfaction problem using Integer Linear Programming (ILP). We observe that architecture primitives and scheduler responsibilities can be related through five abstractions: placement of computation, routing of data, managing event timing, managing resource utilization, and forming the optimization objectives. We encode these responsibilities as 20 general ILP constraints, which are used to create schedulers for the disparate TRIPS, DySER, and PLUG architectures. Our results show that a general declarative approach using ILP is implementable, practical, and typically matches or outperforms specialized schedulers.
Post-compiler Software Optimization for Reducing Energy
Cited by 4 (3 self)
Modern compilers typically optimize for executable size and speed, rarely exploring non-functional properties such as power efficiency. These properties are often hardware-specific, time-intensive to optimize, and may not be amenable to standard dataflow optimizations. We present a general post-compilation approach called Genetic Optimization Algorithm (GOA), which targets measurable non-functional aspects of software execution in programs that compile to x86 assembly. GOA combines insights from profile-guided optimization, superoptimization, evolutionary computation and mutational robustness. GOA searches for program variants that retain required functional behavior while improving non-functional behavior, using characteristic workloads and predictive modeling to guide the search. The resulting optimizations are validated using physical performance measurements and a larger held-out test suite. Our experimental results on PARSEC benchmark programs show average energy reductions of 20%, both for a large AMD system and a small Intel system, while maintaining program functionality on target workloads.
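The search described in this abstract, which keeps variants that preserve functional behavior while lowering a non-functional cost, can be sketched on a toy instruction list. This is a deliberately simplified, mutation-only illustration and not GOA itself: the two-opcode accumulator "machine", the instruction count used as a proxy for energy, and every function name below are assumptions made for the example.

```python
import random
from typing import List, Tuple

Instr = Tuple[str, int]


def run(program: List[Instr], x: int) -> int:
    """Interpret a toy accumulator program (stand-in for real assembly)."""
    acc = x
    for op, arg in program:
        if op == "add":
            acc += arg
        elif op == "mul":
            acc *= arg
    return acc


def passes_tests(program: List[Instr], tests: List[Tuple[int, int]]) -> bool:
    """Functional check: the variant must reproduce every expected output."""
    return all(run(program, x) == expected for x, expected in tests)


def cost(program: List[Instr]) -> int:
    """Assumed proxy for energy: number of executed instructions."""
    return len(program)


def mutate(program: List[Instr], rng: random.Random) -> List[Instr]:
    """Apply one random delete, swap, or copy to a copy of the program."""
    variant = list(program)
    if not variant:
        return variant
    kind = rng.choice(["delete", "swap", "copy"])
    i = rng.randrange(len(variant))
    j = rng.randrange(len(variant))
    if kind == "delete":
        del variant[i]
    elif kind == "swap":
        variant[i], variant[j] = variant[j], variant[i]
    else:
        variant.insert(j, variant[i])
    return variant


def optimize(original: List[Instr], tests: List[Tuple[int, int]],
             generations: int = 200, pop_size: int = 16,
             seed: int = 0) -> List[Instr]:
    """Keep only variants that pass the tests; return the cheapest one found."""
    rng = random.Random(seed)
    population = [list(original)]
    for _ in range(generations):
        child = mutate(rng.choice(population), rng)
        if passes_tests(child, tests):
            population.append(child)
            population.sort(key=cost)
            del population[pop_size:]
    return population[0]


# Computes 2*x + 3 with two redundant instructions (add 0, mul 1).
original = [("add", 0), ("mul", 2), ("add", 3), ("mul", 1)]
tests = [(x, run(original, x)) for x in range(-3, 4)]
best = optimize(original, tests)
```

Because acceptance depends only on the test workload, a variant can pass while differing on untested inputs, which is why the abstract validates results against a larger held-out test suite.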
Expectation-Oriented Framework for Automating Approximate Programming
Cited by 3 (1 self)
This paper describes ExpAX, a framework for automating approximate programming based on programmer-specified error expectations. Three components constitute ExpAX: (1) a programming model based on a new kind of program specification, which we refer to as expectations. Our programming model enables programmers to implicitly relax the accuracy constraints without explicitly marking operations approximate; (2) a novel approximation safety analysis that automatically identifies a safe-to-approximate subset of the program operations; and (3) an optimization that automatically marks a subset of the safe-to-approximate operations as approximate while considering the error expectation. Further, we formulate the process of automatically marking operations as approximate as an optimization problem and provide a genetic algorithm to solve it. We evaluate ExpAX using a diverse set of applications and show that it can provide significant energy savings while improving the quality-of-result degradation. ExpAX automatically excludes the safe-to-approximate operations that, if approximated, lead to significant quality degradation.
SNNAP: approximate computing on programmable SoCs via neural acceleration
In International Symposium on High-Performance Computer Architecture (HPCA), 2015
Cited by 3 (3 self)
Many applications that can take advantage of accelerators are amenable to approximate execution. Past work has shown that neural acceleration is a viable way to accelerate approximate code. In light of the growing availability of on-chip field-programmable gate arrays (FPGAs), this paper explores neural acceleration on off-the-shelf programmable SoCs. We describe the design and implementation of SNNAP, a flexible FPGA-based neural accelerator for approximate programs. SNNAP is designed to work with a compiler workflow that configures the neural network’s topology and weights instead of the programmable logic of the FPGA itself. This approach enables effective use of neural acceleration in commercially available devices and accelerates different applications without costly FPGA reconfigurations. No hardware expertise is required to accelerate software with SNNAP, so the effort required can be substantially lower than custom hardware design for an FPGA fabric and possibly even lower than current “C-to-gates” high-level synthesis (HLS) tools. Our measurements on a Xilinx Zynq FPGA show that SNNAP yields a geometric mean of 3.8× speedup (as high as 38.1×) and 2.8× energy savings (as high as 28×) with less than 10% quality loss across all applications but one. We also compare SNNAP with designs generated by commercial HLS tools and show that SNNAP has similar performance overall, with better resource-normalized throughput on 4 out of 7 benchmarks.