ClubEnsayos.com - Ensayos de Calidad, Tareas y Monografias
Buscar

Informatica Italiana


Enviado por   •  25 de Julio de 2014  •  2.787 Palabras (12 Páginas)  •  255 Visitas

Página 1 de 12

Copyright ©2014 by Filippo Sironi. All right reserved.

Re-distributed by Politecnico di Milano under license with the author.

This work is licensed under a Creative Commons

Attribution-NonCommercial-ShareAlike 4.0 International License. All trademarks cited here are the property of their respective owners.

A B S T R A C T

Computer architecture crossed a critical juncture at the beginning of the last decade. Single-thread performance stopped scaling due to technology lim- itations and complexity constraints. Therefore, chip manufacturers started relying on multi-threading and multicore processors to scale-up performance efficiently while keeping other figures of merit like energy and power con- sumption under control. In fact, whenever parallel software is available, a multicore processor harnessing Thread-Level Parallelism (TLP) can outper- form a massive superscalar processor exploiting Instruction-Level Parallelism (ILP) within the same power budget. As a consequence, on-chip parallel archi- tectures, which once were rare, are now commodity across all domains, from embedded and mobile computing systems to large-scale installations. Nev- ertheless, achieving efficient performance accounting for energy and power consumption progressively became increasingly complex requiring significant innovation across the hardware/software execution stack, even for commodity solutions.

At a high level, two challenges arise that hinder multicore processors efficiency. First, it must be possible to effectively partition hardware resources among co- runner applications within multi-program workloads and avoid the negative effects of sharing when hardware resource cannot be partitioned. Hardware resource partitioning is necessary because most multi-threaded applications do not fully exploit the parallelism available in commodity multicore processors due to the major difficulties of fine-grain parallelism. Among the hardware resources worth partitioning, there are: compute bandwidth and cores, and possibly others depending on the workload. Ideally, the system software layer of the hardware/software execution stack should act on hardware resource partitioning to attain fair application performance and provide Quality of Ser-

vii

vice (QoS) guarantees while respecting system constraints. Second, the system software layer should operate in a transparent fashion without burdening application programmers with all the complexities of the hardware/software execution stack.

The focus of this dissertation is twofold. First, support efficient hardware re- source partitioning for commodity multicore processors through a system soft- ware layer, which operate transparently for applications. To this end, I present solutions to attain fair application performance and provide QoS guarantees for co-runner applications within a multi-program workload accounting for application-specific performance measurements and performance goals. Sec- ond, support efficient Dynamic Thermal Management (DTM) for commodity multicore processors through a low-level system software layer. For this pur- pose, I present a solution to constrain temperature when a multi-program work- load of single-threaded applications runs on a Chip-Multiprocessor (CMP). The resulting artifact is a set of changes, runtimes, and libraries for the GNU’s not Unix (GNU)/Linux operating system.

On the performance side, I present the Heart Rate Monitor (HRM), Metronome, and Metronome++. First, HRM is a split-design subsystem consisting of an extension of the Linux kernel and a user-space library to attach applications to the subsystem. HRM addresses the impedance-mismatch problem by pro- viding application-specific performance measurements that are meaningful to both programmers and users and, at the same time, useful to the system software layer of the hardware/software execution stack. libhrm provides pro- grammers a simple Application Programming Interface (API) to instrument applications so as to define performance measurements and allow users to specify performance goals. HRM and libhrm make the operations of the sys- tem software layer I developed transparent to application programmers, which just exploit their knowledge of the application domain to define meaningful performance measurements. Second, Metronome is a kernel-space runtime introducing the notion of performance-aware fair scheduling by extending one of the scheduling classes of the Linux kernel. Metronome exploits HRM and the performance measurements it provides to drive application performance

viii

towards performance goals for co-runner applications within a multi-program workload. Metronome achieves its goal by implementing simple compute bandwidth partitioning mechanism and policy. Third, Metronome++ is a leap ahead with respect to Metronome; it adopts a split-design across the kernel- and user-space. A user-space runtime drives the kernel-space extension of the scheduling infrastructure of the Linux kernel to provide QoS guarantees for co-runner applications within a multi-program workload by harnessing application characteristics like speedup and execution phases. Metronome++ achieves its goal by implementing compute core partitioning mechanism and policy. This dissertation additionally presents a set of minor achievements harnessing different decision-making techniques other than the heuristics Metronome and Metronome++ make use of.

On the temperature side, I present ThermOS, an extension of the Linux kernel providing DTM through formal feedback control and idle cycle injection. ThermOS addresses a shortcoming of commodity CMPs, which do not allow different cores to run at different clock frequencies when they operate in the same state. ThermOS avoids the negative effects depending on the lack of fine-grain control over hardware facilities like Dynamic Voltage and Frequency Scaling (DVFS) and improves upon state of the art.

On the performance/temperature side, I present preliminary results regarding joint adaptive performance and thermal management combining some of the aforementioned approaches.

S O M M A R I O

L’architettura dei calcolatori ha attraversato un momento critico all’inizio dello scorso decennio. Le prestazioni dei processori dotati di un singolo core e capaci di

...

Descargar como (para miembros actualizados) txt (17 Kb)
Leer 11 páginas más »
Disponible sólo en Clubensayos.com