Introduction to Power-Aware HPC

Lecture in the winter-term 2020/21
Prof. Dr. D. Kranzlmüller,
Dr. Hayk Shoukourian

This course will be held in English!

Welcome to the course webpage for Introduction to Power-Aware HPC, winter term 2020/21, at LMU Munich. Here you will find the details of the lecture and the accompanying practical project.

News

26.10.2020
Connection details have been communicated to all registered students via the Uni2Work framework.
26.10.2020
Due to the current COVID-19 situation, the lecture will be held online. We will be using the Zoom remote conferencing service. Please ensure that the Zoom client is installed on your device before joining the meeting. The connection details will be announced soon to all registered participants. Please note the Rules for Online Teaching.
12.10.2020
Please note that the lecture room has changed. The lectures will be held in room A 120, Hauptgebaeude/Main Building.
20.08.2020
Welcome to the course webpage for Introduction to Power-Aware HPC, winter term 2020/21, at LMU Munich.
  • Registration will be opened on the 3rd of September via UNI2WORK
    (NOTE: registration closes on 11.10.2020 at 18:00)
  • The lectures are scheduled for Wednesdays from 10:00 to 12:00. The first lecture will take place on 04.11.2020 at 10:00. The room number is 220 (Amalienstr. 73A)

Contents of the lecture

Some of the current High Performance Computing (HPC) systems already consume more than 15 MW of power - a sufficient amount of power for sustaining a small city. Energy consumption is becoming a dominating factor for the Total Cost of Ownership of many HPC systems, making high-performance design and energy-efficient design in many ways synonymous.

Apart from the high power bills, power consumption of this magnitude acts as a limiting factor in building and operating Exascale systems, i.e. the next generation of HPC systems, capable of performing 10^18 floating point operations per second. It could already cause a data center's power delivery and cooling infrastructure to breach its safety limits, and it affects environmental sustainability by producing a large carbon footprint. Therefore, it is important to be proactive in improving the energy and power efficiency of HPC data centers.
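To give a feeling for these magnitudes, the following minimal sketch works through the arithmetic. It assumes a constant 15 MW load over a non-leap year and, purely for illustration, asks what efficiency an Exascale system would need if it had to fit into that same 15 MW. The power budget and the function names are assumptions made for this example, not figures from the lecture.

```python
HOURS_PER_YEAR = 24 * 365  # non-leap year


def annual_energy_mwh(power_mw: float) -> float:
    """Energy in MWh drawn by a constant load of `power_mw` megawatts over one year."""
    return power_mw * HOURS_PER_YEAR


def required_gflops_per_watt(flops: float, power_mw: float) -> float:
    """Efficiency in GFLOP/s per watt needed to sustain `flops` within `power_mw`."""
    watts = power_mw * 1e6
    return flops / watts / 1e9


if __name__ == "__main__":
    # Assumed budget: 15 MW, the consumption figure quoted in the text.
    print(f"Annual energy: {annual_energy_mwh(15.0):.0f} MWh")
    print(f"Required efficiency: {required_gflops_per_watt(1e18, 15.0):.1f} GFLOP/s per W")
```

Under these assumptions the load amounts to roughly 131 GWh per year, and an Exascale system would need to deliver about 67 GFLOP/s per watt.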

This course explores different energy consumption issues in modern HPC data centers, discusses their impacts on the design of new computing systems and presents different strategies that aim to reduce the overall power consumption.

The lecture will cover the main concepts of energy consumption paradigms that should remain valid despite the continuous technological changes in the area.

Upon completion of this course the participants should acquire knowledge on:

  • the importance of power/energy-efficiency for modern data centers
  • the theory behind a variety of impacts that power dissipation in a CMOS chip has on HPC data centers
  • contemporary tools for monitoring different power consumption related metrics
  • diverse techniques on energy-efficiency tuning
  • power-related challenges for next generation HPC systems
  • contemporary resource management and scheduling techniques that are tuned for energy-efficiency
  • power variation in homogeneous HPC systems and the potential of possible cost savings
  • Intel's Model Specific Registers (MSRs) used for power management support
  • principles of various machine learning techniques and their applications for intelligent power management
  • high-frequency data collection techniques
  • datacenter basics (understand the building blocks of modern datacenters and learn about possible architectures)
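As an illustration of the MSR-based power monitoring listed above, here is a minimal sketch of how two raw samples of Intel's RAPL energy counter can be turned into an average power value. It assumes the layout described in the Intel Software Developer's Manual: the energy status unit sits in bits 12:8 of MSR_RAPL_POWER_UNIT, and MSR_PKG_ENERGY_STATUS is a 32-bit counter that wraps around. Reading the registers themselves is left out, the sample values are made up, and the tools used in the lecture may differ.

```python
def energy_unit_joules(rapl_power_unit: int) -> float:
    """Decode the energy status unit (bits 12:8): one count equals 1 / 2**ESU joules."""
    esu = (rapl_power_unit >> 8) & 0x1F
    return 1.0 / (1 << esu)


def average_power_watts(raw_start: int, raw_end: int,
                        seconds: float, unit_joules: float) -> float:
    """Average power between two raw counter samples taken `seconds` apart.

    The modulo handles a single wraparound of the 32-bit counter.
    """
    delta_counts = (raw_end - raw_start) % (1 << 32)
    return delta_counts * unit_joules / seconds


if __name__ == "__main__":
    # Hypothetical register value and counter samples, chosen for illustration.
    unit = energy_unit_joules(0xA0E03)
    power = average_power_watts(4294328896, 1000000, 1.0, unit)
    print(f"energy unit: {unit} J, average power: {power} W")
```

The sampling interval has to be short enough that the counter wraps at most once between two reads; otherwise the modulo correction underestimates the energy.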

Audience

The course is intended for master's students of computer science and related fields. The lecture and the project work together carry 6 ECTS credits.

The number of students is limited to 20. Registration opens on 03.09.2020 at 08:00 via UNI2WORK and closes on 11.10.2020 at 18:00.

Prerequisites:

  • Python knowledge
  • Interest in energy-efficient supercomputing
  • Interest in developing machine-learning frameworks

Dates

  • Lecture: Wednesdays, 10:15 to 11:45, online. The connection details will be communicated to all registered participants. The first lecture will be held on 4 November 2020 (online, via Zoom).
  • Guided Tour at Leibniz Supercomputing Centre (LRZ): TBA. Meeting point: LRZ, 85748 Garching bei Muenchen.
  • Exam: TBA (see the exam section for more details)
  • Repeat Exam: TBA (see the exam section for more details)

Project: "Increasing Cooling Efficiency of a Data Center"

This project aims at building Machine-Learning (ML) based models for predicting the power consumption of an HPC data center's cooling loop. Participants will form groups, and each group will be assigned one year of operational data obtained at the Leibniz Supercomputing Centre (LRZ).

The provided data will contain various sensor measurements from LRZ's building infrastructure.

Each group of students will analyze the data, then design and develop an ML-based model capable of predicting the power consumption of LRZ's warm-water cooling loop.
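To make the task more concrete, the sketch below shows what a simple baseline for such a prediction could look like. Everything in it is an assumption for illustration: the data is synthetic, the feature names (`it_power_kw`, `outdoor_temp_c`) are hypothetical and not taken from the LRZ data set, and the model is an ordinary least-squares fit with NumPy rather than the approach expected in the project.

```python
import numpy as np

# Synthetic stand-in for sensor data; the real project data is provided by LRZ.
rng = np.random.default_rng(0)
n = 500
it_power_kw = rng.uniform(500.0, 2000.0, n)    # hypothetical IT load sensor
outdoor_temp_c = rng.uniform(-5.0, 30.0, n)    # hypothetical outdoor temperature sensor
noise = rng.normal(0.0, 1.0, n)
cooling_power_kw = 0.05 * it_power_kw + 1.5 * outdoor_temp_c + 20.0 + noise

# Design matrix with a constant column for the intercept.
X = np.column_stack([it_power_kw, outdoor_temp_c, np.ones(n)])

# Hold out the last 20% of the samples for evaluation.
split = int(0.8 * n)
X_train, X_test = X[:split], X[split:]
y_train, y_test = cooling_power_kw[:split], cooling_power_kw[split:]

# Ordinary least-squares fit on the training part.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Mean absolute error on the held-out part.
pred = X_test @ coef
mae = float(np.mean(np.abs(pred - y_test)))
print("coefficients:", coef)
print("held-out MAE (kW):", mae)
```

With real time-series measurements, holding out the most recent period in this way is usually a more honest check than a random split, since neighbouring samples tend to be strongly correlated.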

During this project, students will gain experience that can be applied not only to HPC data centers but also to other domains involving ML-based modeling.

The detailed description of the project assignment will follow during the lecture.

The training data can be found in the Project Section.

Exam

There will be a written examination (closed book) which will be held in February 2021. The exact time and room will be published as soon as possible.

DATE: TBA

TIME: TBA

ROOM: TBA


The retake of the exam is scheduled for:

DATE: TBA

TIME: TBA

ROOM: TBA

Scripts

The lecture notes are available in the Download Section.

Literature

  • CMOS VLSI Design: A Circuits and Systems Perspective (4th Edition) by Neil Weste, David Harris
  • Computer Organization and Design RISC-V Edition: The Hardware Software Interface by David A. Patterson, John L. Hennessy
  • Energy-Efficient Distributed Computing Systems by Albert Y. Zomaya, Young Choon Lee
  • Machine Learning: A Probabilistic Perspective by Kevin P. Murphy
  • Machine Learning: An Algorithmic Perspective (2nd Edition) by Stephen Marsland
  • Introduction to Apache Flink: Stream Processing for Real Time and Beyond by Ellen Friedman, Kostas Tzoumas
  • The Data Center as a Computer by Luiz André Barroso, Jimmy Clidaras, Urs Hölzle
  • Additional scholarly articles: sources will be indicated in the course slides






Rules for Online Teaching

Most teaching currently takes place online. As lecturers, we ask for your patience if things do not work perfectly right away, and we hope for your constructive participation. In this situation, we would also like to explicitly point out some rules that would be self-evident in real life:

  • In live meetings, we ask you to responsibly deal with audio (off by default) and bandwidth (video as needed).
  • Recording or redirecting streams by participants is not allowed.
  • Distributing content (video, audio, images, PDFs, etc.) through channels other than those intended by the author is not allowed.

If you violate one of these rules, you can expect to be expelled from the respective course, and we reserve the right for further action. With all others, we are looking forward to the joint experiment of an "online semester".

Contact

Via email, by appointment, or after lectures.