Results 1 - 10 of 14
Rio: A System Solution for Sharing I/O between Mobile Systems
Abstract
Cited by 5 (1 self)
Mobile systems are equipped with a diverse collection of I/O devices, including cameras, microphones, sensors, and modems. There exist many novel use cases for allowing an application on one mobile system to utilize I/O devices from another. This paper presents Rio, an I/O sharing solution that supports unmodified applications and exposes all the functionality of an I/O device for sharing. Rio’s design is common to many classes of I/O devices, thus significantly reducing the engineering effort to support new I/O devices. Our implementation of Rio on Android consists of about 7100 total lines of code and supports four I/O classes with fewer than 500 class-specific lines of code. Rio also supports I/O sharing between mobile systems of different form factors, including smartphones and tablets. We show that Rio achieves performance close to that of local I/O for audio devices, sensors, and modem, but suffers noticeable performance degradation for camera due to network throughput limitations between the two systems, which is likely to be alleviated by emerging wireless standards.
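Rio's approach of forwarding I/O operations from one system to a device owned by another can be illustrated with a small sketch. All names below are hypothetical illustrations with a fake device, not Rio's implementation, which intercepts real device-file operations inside the Android kernel and ships them over the network:

```python
# Sketch of forwarding device operations from a client system to the
# system that owns the device. Hypothetical names; not Rio's code.

class FakeSensorDevice:
    """Stands in for a real device file on the serving system."""
    def read(self):
        return b"accel:0.0,0.0,9.8"

class RemoteIOServer:
    """Runs on the system that owns the device; executes forwarded ops."""
    def __init__(self, device):
        self.device = device

    def handle(self, op):
        if op == "read":
            return self.device.read()
        raise NotImplementedError(op)

class RemoteDeviceProxy:
    """Runs on the client system; looks like a local device to the app,
    so unmodified applications need not know the device is remote."""
    def __init__(self, transport):
        self.transport = transport   # a direct call here; a socket in reality

    def read(self):
        return self.transport("read")

server = RemoteIOServer(FakeSensorDevice())
proxy = RemoteDeviceProxy(server.handle)
print(proxy.read())  # b'accel:0.0,0.0,9.8'
```

Because the boundary sits at generic device operations rather than per-device logic, one forwarding layer can serve many device classes, which is the source of the low class-specific line counts the abstract reports.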
Popcorn: Bridging the Programmability Gap in Heterogeneous-ISA Platforms
Abstract
Cited by 1 (0 self)
The recent possibility of integrating multiple-OS-capable, high-core-count, heterogeneous-ISA processors in the same platform poses a question: given the tight integration between system components, can a shared memory programming model be adopted, enhancing programmability? If this can be done, an enormous amount of existing code written for shared memory architectures would not have to be rewritten to use a new programming paradigm (e.g., code offloading) that is often very expensive and error prone. We propose a new software architecture that is composed of an operating system and a compiler framework to run ordinary shared memory applications, written for homogeneous machines, on OS-capable heterogeneous-ISA machines. Applications run transparently amongst different ISA processors while exploiting the most optimized instruction set for each code block. We have implemented and tested our system, called Popcorn, on a multi-core Intel Xeon machine with a PCIe Intel Xeon Phi to demonstrate the viability of our approach. Application execution on Popcorn is up to 52% faster than the most performant native execution on Linux, on either Xeon or Xeon Phi, while removing the burden of the programmer having to adopt a different programming model than shared memory on a heterogeneous system. When compared to an offloading programming model, Popcorn is shown to be up to 6.2 times faster.
Enhancing Mobility Apps To Use Sensor Hubs Without Programmer Effort
Abstract
Cited by 1 (0 self)
Emerging sensor hubs for smartphones can run long-lived sensing tasks efficiently, but they require application developers to consider how to divide functionality between the main processor and sensor hub. We implement MobileHub for Android to let developers and users automatically rewrite mobility applications to leverage the sensor hub. The key to our approach is to use implicit and explicit information flow tracking during representative application usage to track the flow of sensor data in an application. MobileHub then learns when the application can be idle, and buffers sensor readings at the sensor hub until the application needs to be awakened. Accordingly, we rewrite the application bytecode (with no access to source code) to interface with the sensor hub. In experiments on five applications from the Android marketplace, we achieved power gains of up to 80%.
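The buffering behavior the abstract describes can be illustrated with a small sketch, using hypothetical names rather than MobileHub's actual interfaces: a hub-side buffer accumulates sensor readings and wakes the application only when a learned wake predicate fires, delivering the whole backlog at once.

```python
# Sketch of hub-side sensor buffering behind a learned wake predicate.
# All names here are hypothetical illustrations, not MobileHub's API.

class SensorHubBuffer:
    def __init__(self, wake_predicate):
        self.buffer = []                  # readings held while the app sleeps
        self.wake_predicate = wake_predicate
        self.wakeups = 0

    def on_reading(self, reading):
        """Called for every raw sample; returns the buffered batch only
        when the application actually needs to be awakened."""
        self.buffer.append(reading)
        if self.wake_predicate(reading):
            self.wakeups += 1
            delivered, self.buffer = self.buffer, []
            return delivered              # one wakeup delivers the backlog
        return None                       # main processor stays asleep


# Example: wake the app only when acceleration spikes above 2.0.
hub = SensorHubBuffer(wake_predicate=lambda accel: accel > 2.0)
for sample in [0.1, 0.2, 0.3, 2.5, 0.1]:
    batch = hub.on_reading(sample)
    if batch is not None:
        print(f"app woken with {len(batch)} buffered samples")
```

The power savings come from the wakeup count: five samples cost one application wakeup instead of five, and MobileHub's contribution is learning the predicate automatically from observed information flows.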
Autonomic Thread Scaling Library for QoS Management *
Abstract
Over the last years, the embedded-system industry has been facing a revolution with the introduction of multicore and heterogeneous devices. The availability of these new platforms opens new paths for these devices, which can nowadays be used for more demanding tasks, exploiting the parallelism made available by multicore processors. Nonetheless, progress in hardware technology is not backed up by improvements on the software side, and runtime mechanisms to manage resource allocation and contention on resources still lack the proper effectiveness. This paper tackles the problem of dynamic resource management from the application point of view and presents a user-space library to control application performance. The control knob exploited by the library is the number of threads used by an application, and the library seamlessly integrates with OpenMP. A case study illustrates the benefits that this library has in a classic embedded-system scenario, introducing an overhead of less than 0.3%.
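The thread-count control knob can be sketched as a simple feedback loop. This is a minimal illustration under assumed names, not the paper's library, which applies the same idea through OpenMP's thread-count controls:

```python
# Minimal sketch of an autonomic thread-scaling feedback step.
# Hypothetical illustration; the paper's library integrates with OpenMP.

def scale_threads(current_threads, measured_rate, target_rate,
                  min_threads=1, max_threads=8):
    """One control step: add a thread when below the QoS target,
    remove one only when well above it (30% hysteresis band
    to avoid oscillating around the target)."""
    if measured_rate < target_rate and current_threads < max_threads:
        return current_threads + 1
    if measured_rate > 1.3 * target_rate and current_threads > min_threads:
        return current_threads - 1
    return current_threads


# Simulated run: each thread contributes ~100 ops/s; QoS target 350 ops/s.
threads = 1
for _ in range(10):
    measured = threads * 100
    threads = scale_threads(threads, measured, target_rate=350)
print(threads)  # settles at 4 threads (400 ops/s, inside the hysteresis band)
```

A real implementation would measure application progress (e.g., heartbeats per second) rather than simulate it, and would apply the decision with `omp_set_num_threads` at parallel-region boundaries.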
The Case For Heterogeneous HTAP
Abstract
Modern database engines balance the demanding requirements of mixed, hybrid transactional and analytical processing (HTAP) workloads by relying on i) global shared memory, ii) system-wide cache coherence, and iii) massive parallelism. Thus, database engines are typically deployed on multi-socket multi-cores, which have been the only platform to support all three aspects. Two recent trends, however, indicate that these hardware assumptions will be invalidated in the near future. First, hardware vendors have started exploring alternate non-cache-coherent shared-memory multi-core designs due to escalating complexity in maintaining coherence across hundreds of cores. Second, as GPGPUs overcome programmability, performance, and interfacing limitations, they are being increasingly adopted by emerging servers to expose heterogeneous parallelism. It is thus necessary to revisit database engine design because current engines can neither deal with the lack of cache coherence nor exploit heterogeneous parallelism. In this paper, we make the case for Heterogeneous-HTAP (H2TAP), a new architecture explicitly targeted at emerging hardware. H2TAP engines store data in shared memory to maximize data freshness, pair workloads with ideal processor types to exploit heterogeneity, and use message passing with explicit processor cache management to circumvent the lack of cache coherence. Using Caldera, a prototype H2TAP engine, we show that the H2TAP architecture can be realized in practice and can offer performance competitive with specialized OLTP and OLAP engines.
Jouler: A Policy Framework Enabling Effective and Flexible Smartphone Energy Management
Abstract
Smartphone energy management is a complex challenge. Considerable energy-related variation exists between devices, apps, and users; and while over-allocating energy can strand the user with an empty battery, over-conserving energy can unnecessarily degrade performance. But despite this complexity, current smartphone platforms include "one-size-fits-all" energy management policies that cannot satisfy the diverse needs of all users. To address this problem we present Jouler, a framework enabling effective and flexible smartphone energy management by cleanly separating energy control mechanisms from management policies. Jouler provides both imperative mechanisms that can control all apps, and cooperative mechanisms that allow modified apps to adapt to the user's energy management goals. We have implemented Jouler for Android and used it to provide three new energy management policies to 203 smartphone users. Results from our deployment indicate that users appreciate more flexible smartphone energy management and that Jouler policies can help users achieve their energy management goals.
Key words: Smartphone energy management; Smartphone platforms
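The mechanism/policy separation at the core of Jouler can be illustrated with a small sketch, using hypothetical interfaces rather than Jouler's actual ones: a fixed mechanism layer enforces limits on any app, while interchangeable policies decide what those limits should be.

```python
# Sketch of separating energy-control mechanisms from management policies.
# All interfaces are hypothetical illustrations, not Jouler's API.

class ThrottleMechanism:
    """Imperative mechanism: can enforce a budget on any app, modified or not."""
    def __init__(self):
        self.limits = {}

    def set_cpu_limit(self, app, fraction):
        self.limits[app] = fraction


def fair_share_policy(mechanism, apps, battery_level):
    """One pluggable policy: split the remaining budget evenly,
    tightening all apps as the battery drains."""
    share = battery_level / len(apps)
    for app in apps:
        mechanism.set_cpu_limit(app, share)


def foreground_first_policy(mechanism, apps, battery_level, foreground):
    """A different policy driving the SAME mechanism: favor the
    foreground app and throttle everything in the background."""
    for app in apps:
        mechanism.set_cpu_limit(app, battery_level if app == foreground else 0.1)


mech = ThrottleMechanism()
fair_share_policy(mech, ["maps", "music"], battery_level=0.5)
print(mech.limits)  # {'maps': 0.25, 'music': 0.25}
```

Because policies only talk to the mechanism interface, swapping the user's policy never touches enforcement code, which is what lets a deployment offer several policies to different users on one platform.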
Operating Systems
Abstract
My research interest is to understand and build energy-efficient mobile systems, especially architecture and operating systems support for a future of efficient personal computing. Over the past five years, I have been driving toward a vision of continuous mobile vision services, which invoke frequent vision sensing, computation, and offload to understand a user's real-world environment, providing vision-based services that relieve users' memory and attention. Continuous mobile vision will revolutionize personal computing: immersive interaction will empower consumer applications; sight assistance will aid the memory- and vision-impaired; hazard detection will alert military units; and visual localization will guide robotic drones. However, systems hosting continuous mobile vision face a fundamental challenge: energy efficiency. Despite recent advances in vision algorithms, current systems are severely limited by the substantial energy consumption of always-on vision sensing and processing. Conventional vision systems drain a small wearable battery in 40 minutes and raise the device's surface temperature to over 55 °C [APSys '14]. This efficiency challenge is symptomatic of a key fact: conventional mobile systems are provisioned for on-demand user interaction, not for continuous service. Systems must be redesigned at all levels to meet the demands of periodically and/or sporadically interpreting visual information. My research takes on this challenge by innovating mobile system support, using an experimental, prototype-driven approach that combines domain knowledge from software systems, hardware architecture, and machine learning.
Dissertation Work: Provisioning Mobile Systems for Continuous Mobile Vision
My dissertation research ventures through multiple levels of the vision system stack, designing solutions in: (i) application support, (ii) operating systems, and (iii) sensor hardware, as shown in the figure. The principal objective has been to enable energy proportionality: energy consumption should be proportional to the quantity and quality of the capture and compute needed to complete a set of tasks.
GPUnet: Networking Abstractions for GPU Programs
Sangman Kim
Abstract
Despite the popularity of GPUs in high-performance and scientific computing, and despite increasingly general-purpose hardware capabilities, the use of GPUs in network servers or distributed systems poses significant challenges. GPUnet is a native GPU networking layer that provides a socket abstraction and high-level networking APIs for GPU programs. We use GPUnet to streamline the development of high-performance, distributed applications like in-GPU-memory MapReduce and a new class of low-latency, high-throughput GPU-native network services such as a face verification server.
Scheduling of Multiserver System Components on Over-provisioned Multicore Systems
Abstract
Until recently, microkernel-based multiserver systems could not match the performance of monolithic designs due to their architectural choices, which target high reliability rather than high performance. With the advent of multicore processors and heterogeneous, over-provisioned architectures, it is possible to employ multiple cores to run individual components of the system, avoiding expensive context switching and streamlining the system's operations. Thus, multiserver systems can overcome their performance issues without compromising reliability. However, while resources are becoming abundant, it is important to use them efficiently and to select and tune the resources carefully for the best performance and energy efficiency depending on the current workload. Most of the prior work focused solely on scheduling and placement of the applications. In multiserver systems, the operating system itself needs to be scheduled in time and space as the demand changes. Therefore, the system servers must no longer be opaque processes, and the scheduler must understand the system's workload to make good decisions.
Hare: a file system for non-cache-coherent multicores
Abstract
Hare is a new file system that provides a POSIX-like interface on multicore processors without cache coherence. Hare allows applications on different cores to share files, directories, and file descriptors. The challenge in designing Hare is to support the shared abstractions faithfully enough to run applications that run on traditional shared-memory operating systems, with few modifications, and to do so while scaling with an increasing number of cores. To achieve this goal, Hare must support features (such as shared file descriptors) that traditional network file systems don't support, as well as implement them in a way that scales (e.g., shard a directory across servers to allow concurrent operations in that directory). Hare achieves this goal through a combination of new protocols (including a 3-phase commit protocol to implement directory operations correctly and scalably) and leveraging properties of non-cache-coherent multiprocessors (e.g., atomic low-latency message delivery and shared DRAM). An evaluation on a 40-core machine demonstrates that Hare can run many challenging Linux applications (including a mail server and a Linux kernel build) with minimal or no modifications. The results also show these applications achieve good scalability on Hare, and that Hare's techniques are important to achieving scalability.
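The directory-sharding idea mentioned in the abstract can be illustrated with a small sketch (hypothetical structure, not Hare's actual protocol): entries of one directory are spread across several servers by hashing the entry name, so concurrent creates in the same directory land on different servers instead of serializing on one.

```python
# Sketch of sharding one directory's entries across file servers by name hash.
# Hypothetical illustration of the idea only; Hare's real design also needs a
# 3-phase commit so multi-shard operations like rename stay correct.

import hashlib

NUM_SERVERS = 4
shards = [dict() for _ in range(NUM_SERVERS)]   # one entry table per server

def server_for(name):
    """Deterministically pick the server that owns this directory entry."""
    digest = hashlib.sha256(name.encode()).digest()
    return digest[0] % NUM_SERVERS

def create(name, inode):
    """A create touches only the owning shard, so creates of different
    names in the same directory proceed concurrently."""
    shards[server_for(name)][name] = inode

def lookup(name):
    return shards[server_for(name)].get(name)

def readdir():
    """A full listing is the expensive case: it must contact every shard."""
    entries = {}
    for shard in shards:
        entries.update(shard)
    return sorted(entries)

for i, name in enumerate(["a.txt", "b.txt", "c.txt", "d.txt"]):
    create(name, inode=i)
print(readdir())  # ['a.txt', 'b.txt', 'c.txt', 'd.txt']
```

The trade-off is visible even at this scale: per-entry operations scale with the number of servers, while whole-directory operations must aggregate across shards, which is why correctness protocols for them are the hard part.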