
TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments
Shinpei Kato*, Karthik Lakshmanan*, Raj Rajkumar*, and Yutaka Ishikawa**
* Carnegie Mellon University ** The University of Tokyo
USENIX Annual Technical Conference 2011


SLIDE 1

TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments

Shinpei Kato*, Karthik Lakshmanan*, Raj Rajkumar*, and Yutaka Ishikawa** * Carnegie Mellon University ** The University of Tokyo

SLIDE 2

USENIX Annual Technical Conference 2011 – Shinpei Kato (CMU), June 15, 2011

Graphics Applications

SLIDE 3

Graphics Processing Unit (GPU)

NVIDIA GPU GeForce GTX 480

[Figure: GPU block diagram showing L1 caches, a shared L2 cache, device memory, host memory, and the CPU]

480 simple cores

SLIDE 4

[Figure: peak performance (GFLOPS, 0 to 1600) over time, 2006 to 2011. NVIDIA GPU: 7900 GTX, 8800 GTX, 9800 GTX, GTX 280, GTX 285, GTX 480, GTX 580. Intel CPU: E4300, E6850, Q9650, X7460, 980 XE.]

Peak Performance

SLIDE 5

[Figure: peak performance per watt (GFLOPS / Watt, 0 to 7) over time, 2006 to 2011. NVIDIA GPU: 7900 GTX, 8800 GTX, 9800 GT, GTX 280, GTX 285, GTX 480, GTX 580. Intel CPU: E4300, E6850, Q9650, X7460, 980 XE.]

Peak Performance “per Watt”

SLIDE 6

General-Purpose Computing on GPU (GPGPU)

  • 3-D Interface
  • 3-D On-line Game
  • Virtual Reality
  • Computer Vision
  • Scientific Simulation
  • Autonomous Driving

SLIDE 7

Outline

  • 1. Introduction
  • 2. What’s the Problem
  • 3. Our Solution – “TimeGraph”
  • 4. Evaluation
  • 5. Summary

SLIDE 8

GPU Is Command-Driven

[Figure: host memory and device memory shown across four commands]

  • CMD_HtoD: copy the GPU code from host memory to device memory
  • CMD_HtoD: copy the input data from host memory to device memory
  • CMD_LAUNCH: execute the GPU code, producing output data in device memory
  • CMD_DtoH: copy the output data from device memory back to host memory
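
The command sequence on this slide can be mimicked in a few lines of Python. This is only an illustrative sketch: `run_commands`, the dictionary-based memories, and the `"code"`, `"input"`, and `"output"` keys are hypothetical names chosen here and do not correspond to any real GPU driver interface. Only the command names CMD_HtoD, CMD_LAUNCH, and CMD_DtoH come from the slide.

```python
def run_commands(host, device, commands):
    """Apply (command, buffer_name) pairs in submission order."""
    for op, name in commands:
        if op == "CMD_HtoD":
            # Copy a buffer from host memory to device memory.
            device[name] = host[name]
        elif op == "CMD_LAUNCH":
            # Run the GPU code that is already resident in device memory.
            # Assumption: the code reads "input" and writes "output".
            device["output"] = device[name](device["input"])
        elif op == "CMD_DtoH":
            # Copy a buffer from device memory back to host memory.
            host[name] = device[name]
        else:
            raise ValueError(f"unknown command: {op}")


# Toy "GPU code": double every element of the input.
host = {"code": lambda xs: [2 * x for x in xs], "input": [1, 2, 3]}
device = {}
run_commands(host, device, [
    ("CMD_HtoD", "code"),
    ("CMD_HtoD", "input"),
    ("CMD_LAUNCH", "code"),
    ("CMD_DtoH", "output"),
])
```

The point of the sketch is that the device does nothing on its own: every copy and every launch is an explicit command, which is what makes the command stream a natural place to schedule.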

SLIDE 9

Multi-Tasking Problem

[Figure: CPU and GPU timelines showing GPU commands issued by a high-priority task and a low-priority task through the GPU driver. Two intervals are marked "Blocked".]

SLIDE 10

Impact of Interference

Observe Frame Rate of OpenArena (3-D Game) on Linux

[Figure: frame rate relative to standalone execution (0 to 1) for the NVIDIA and Nouveau drivers on a GeForce 9500 and a GeForce GTX 285. Series: Compete w/ Widget (low GPU workload); Compete w/ Bomb (high GPU workload).]

  • NVIDIA: proprietary driver
  • Nouveau: open-source driver

SLIDE 11

Outline

  • 1. Introduction
  • 2. What’s the Problem
  • 3. Our Solution – “TimeGraph”
  • 4. Evaluation
  • 5. Summary

SLIDE 12

TimeGraph Architecture

Software Approach

[Figure: TimeGraph architecture across user space, kernel space, and device space]

  • User space: Applications, OpenGL/CUDA Library, User-space GPU Driver
  • Kernel space: Kernel-space GPU Driver (Submission Interface, IRQ Handler) and TimeGraph (GPU Command Scheduler, GPU Reserve Manager, GPU Command Profiler)
  • Device space: Graphics Processing Unit (GPU)
  • Other labels: GPU Command Group, GPU Command Queue (High-Priority), Notification, Interrupt, GPU resource control, GPU exec. time prediction

SLIDE 13

Priority Support – Predictable Response Time (PRT) Policy

  • When the GPU is not idle, GPU commands are queued
  • When the GPU becomes idle, GPU commands are dispatched

[Figure: CPU and GPU timelines for a high-priority task and a low-priority task, showing GPU commands, the GPU driver, and interrupts. Annotations: "Prioritized correctly" and "Overhead".]
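
The two PRT rules can be sketched as a small queueing model. This is a simplified illustration rather than the driver's actual code: the class name `PrtScheduler`, its methods, and the rule that the highest-priority waiting command goes first (FIFO among equal priorities) are assumptions made for the example.

```python
import heapq
import itertools


class PrtScheduler:
    """Toy model of PRT: queue while the GPU is busy, dispatch on idle."""

    def __init__(self):
        self.gpu_busy = False
        self.dispatched = []            # order in which tasks reached the GPU
        self._queue = []                # min-heap of (-priority, seq, task)
        self._seq = itertools.count()   # tie-breaker: FIFO within a priority

    def submit(self, task, priority):
        if self.gpu_busy:
            # Rule 1: the GPU is not idle, so the command is queued.
            heapq.heappush(self._queue, (-priority, next(self._seq), task))
        else:
            self._dispatch(task)

    def on_completion_interrupt(self):
        # Rule 2: the GPU became idle, so the next command is dispatched.
        self.gpu_busy = False
        if self._queue:
            _, _, task = heapq.heappop(self._queue)
            self._dispatch(task)

    def _dispatch(self, task):
        self.gpu_busy = True
        self.dispatched.append(task)


sched = PrtScheduler()
sched.submit("low-1", priority=1)    # GPU idle: dispatched immediately
sched.submit("low-2", priority=1)    # GPU busy: queued
sched.submit("high", priority=9)     # GPU busy: queued ahead of low-2
sched.on_completion_interrupt()      # low-1 done: "high" is dispatched next
sched.on_completion_interrupt()      # high done: "low-2" is dispatched
```

Because at most one command is outstanding at a time in this model, every dispatch after the first waits for an interrupt, which corresponds to the "Overhead" annotation on the slide.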

SLIDE 14

Priority Support – High Throughput (HT) Policy

  • When the GPU is not idle, GPU commands are queued only if their priority is lower than that of the current GPU context
  • When the GPU becomes idle, GPU commands are dispatched

[Figure: CPU and GPU timelines for a high-priority task and a low-priority task, showing GPU commands, the GPU driver, and interrupts. Annotation: "Overhead reduced".]
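
The HT queueing rule reduces to a single predicate. The sketch below only restates the bullet above in code; the function name `ht_must_queue` and its parameters are hypothetical, and larger numbers are assumed to mean higher priority.

```python
def ht_must_queue(gpu_busy, incoming_priority, current_priority):
    """Return True if an incoming GPU command has to wait in the queue.

    Under HT, a command is queued only when the GPU is busy AND the
    command's priority is lower than that of the current GPU context.
    Otherwise it is submitted right away, without waiting for an interrupt.
    """
    if not gpu_busy:
        return False
    return incoming_priority < current_priority


# A high-priority context (priority 9) currently owns the GPU.
same_task_waits = ht_must_queue(True, incoming_priority=9, current_priority=9)
low_task_waits = ht_must_queue(True, incoming_priority=1, current_priority=9)
idle_gpu_waits = ht_must_queue(False, incoming_priority=1, current_priority=0)
```

Here commands from the context that already holds the GPU skip the queue, which avoids the per-command interrupt wait; lower-priority commands are still held back.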

SLIDE 15

Reservation Support – Posterior Enforcement (PE) Policy

  • Enforce GPU resource usage optimistically
  • Specify capacity (C) and period (P) per task (/proc/GPU/$TASK)

[Figure: CPU timeline, GPU timeline, and execution budget over time, with capacity C and period P marked. Annotation: "Enforced".]
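
Posterior enforcement can be sketched as simple budget bookkeeping. The details below are assumptions made for illustration and are not stated on the slide: the budget is charged with the measured GPU time after a command completes, a task whose budget is exhausted must wait, and each replenishment adds C but caps the budget at C so that an overrun is paid back in the next period. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PeReserve:
    capacity_us: int                      # C: GPU time allowed per period
    period_us: int                        # P: replenishment interval
    budget_us: int = field(init=False)    # remaining budget in this period

    def __post_init__(self):
        self.budget_us = self.capacity_us

    def may_submit(self):
        # Optimistic: any positive budget lets the next command through,
        # even if that command turns out to cost more than what is left.
        return self.budget_us > 0

    def on_completion(self, measured_us):
        # Charge the actual cost after the fact; the budget may go negative.
        self.budget_us -= measured_us

    def replenish(self):
        # Assumed to be called once every period_us by a timer.
        self.budget_us = min(self.capacity_us, self.budget_us + self.capacity_us)


reserve = PeReserve(capacity_us=1000, period_us=10000)
first_ok = reserve.may_submit()       # budget 1000: allowed
reserve.on_completion(1500)           # overrun: budget is now -500
blocked = not reserve.may_submit()    # task must wait for replenishment
reserve.replenish()                   # budget becomes 500 (overrun paid back)
```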

SLIDE 16

Reservation Support – Apriori Enforcement (AE) Policy

  • Enforce GPU resource usage pessimistically
  • Specify capacity (C) and period (P) per task (/proc/GPU/$TASK)

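[Figure: CPU timeline, GPU timeline, and execution budget over time, with capacity C and period P marked. Annotations: "Predict" before each GPU command and "Enforced" where a command is held back.]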
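
The difference from posterior enforcement is that the check happens before submission, using a predicted cost. The sketch below is an assumption-laden illustration: the admission test (predicted cost must fit in the remaining budget), charging the measured time on completion, and all class and method names are choices made here, not details taken from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class AeReserve:
    capacity_us: int                      # C: GPU time allowed per period
    period_us: int                        # P: replenishment interval
    budget_us: int = field(init=False)    # remaining budget in this period

    def __post_init__(self):
        self.budget_us = self.capacity_us

    def may_submit(self, predicted_us):
        # Pessimistic: refuse a command whose predicted cost would not fit.
        return predicted_us <= self.budget_us

    def on_completion(self, measured_us):
        # Account for the GPU time that was actually consumed.
        self.budget_us -= measured_us

    def replenish(self):
        # Assumed to be called once every period_us by a timer.
        self.budget_us = min(self.capacity_us, self.budget_us + self.capacity_us)


reserve = AeReserve(capacity_us=1000, period_us=10000)
first_ok = reserve.may_submit(predicted_us=600)    # 600 <= 1000: allowed
reserve.on_completion(600)                         # budget is now 400
second_ok = reserve.may_submit(predicted_us=600)   # 600 > 400: held back
reserve.replenish()                                # budget is back to 1000
third_ok = reserve.may_submit(predicted_us=600)    # allowed again
```

In this model the budget can only overrun by the prediction error, at the price of needing a prediction for every command.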

SLIDE 17

GPU Execution Time Prediction

  • History-based approach

– Search records of previous sequences of GPU commands that match the incoming sequences of GPU commands
– Works for 2-D but needs investigation for 3-D and Compute

  • Please see the paper for details
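
A history-based predictor of this kind can be sketched as a lookup table keyed by the command sequence. This is a guess at the general shape, not the paper's method: exact-match keys, averaging the recorded times, the fallback value for unseen sequences, and every name below are assumptions made for the example.

```python
from collections import defaultdict


class HistoryPredictor:
    """Predict GPU execution time from records of matching command sequences."""

    def __init__(self, default_us=0.0):
        self.default_us = default_us          # returned when nothing matches
        self._records = defaultdict(list)     # command sequence -> measured times

    def record(self, commands, measured_us):
        # Store the measured time under the exact sequence of commands.
        self._records[tuple(commands)].append(measured_us)

    def predict(self, commands):
        # Look up previous runs of the same sequence and average them.
        history = self._records.get(tuple(commands))
        if not history:
            return self.default_us
        return sum(history) / len(history)


predictor = HistoryPredictor()
predictor.record(["SET_STATE", "DRAW_RECT"], 100)   # hypothetical command names
predictor.record(["SET_STATE", "DRAW_RECT"], 300)
seen = predictor.predict(["SET_STATE", "DRAW_RECT"])
unseen = predictor.predict(["DRAW_TRIANGLES"])
```

Exact matching suits workloads that repeat the same command sequences; workloads whose cost depends on the data behind the commands would match poorly under this scheme.
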

SLIDE 18

Outline

  • 1. Introduction
  • 2. What’s the Problem
  • 3. Our Solution – “TimeGraph”
  • 4. Evaluation
  • 5. Summary

SLIDE 19

Experimental Setup

  • GPU: NVIDIA GeForce 9800 GT
  • CPU: Intel Xeon E5504
  • OS: Linux Kernel 2.6.36
    – Nouveau open-source driver
  • Benchmark:
    – Phoronix Test Suite (http://www.phoronix-test-suite.com/)
      • Including OpenGL 3-D game programs
    – Gallium3D Demo Suite (http://www.mesa3d.org/)
      • Including OpenGL 3-D widget and graphics-bomb programs

SLIDE 20

Performance Protection

Frame Rate of 3-D Game competing with Graphics Bomb in background

[Figure: average frame rate (fps, 0 to 60) per 3-D game application: OpenArena, World of Padman, Urban Terror, Unreal Tournament]

  • No TimeGraph Support
  • Priority Support (High Priority -> 3-D Game)
  • Priority & PE Reservation Support (GPU Util. 10% -> Graphics Bomb)
  • Priority & AE Reservation Support (GPU Util. 10% -> Graphics Bomb)

SLIDE 21

Interference on Time

[Figure: three panels plotting frames per second (0 to 200) against elapsed time (0 to 120 seconds) for Engine #1, Engine #2, and Engine #3, with "Widget" markers along the time axis]

  • No TimeGraph Support
  • Priority Support (PRT)
  • Priority Support (PRT) + Reservation Support (PE)

SLIDE 22

Standalone Performance

[Figure: average frame rate (fps, 0 to 70) per 3-D game application: OpenArena, World of Padman, Urban Terror, Unreal Tournament]

  • No TimeGraph Support
  • Priority Support (HT)
  • Priority Support (PRT)
  • Priority & Reservation Support (PRT & PE)
  • Priority & Reservation Support (PRT & AE)

X server is assigned PRT policy

Overhead is acceptable for protecting GPU

SLIDE 23

Outline

  • 1. Introduction
  • 2. What’s the Problem
  • 3. Our Solution – “TimeGraph”
  • 4. Evaluation
  • 5. Summary

SLIDE 24

Concluding Remarks

  • TimeGraph enables prioritization and isolation for GPU applications in multi-tasking environments

– Device-driver solution: no modification to user-space
– Scheduling of GPU commands
– Reservation of GPU resource usage

  • http://rtml.ece.cmu.edu/projects/timegraph/

SLIDE 25

Current Status

  • GPGPU support (collaboration with PathScale Inc.)

– Visit http://github.com/pathscale/pscnv

  • Making open-source fast and reliable

– It’s becoming competitive with the proprietary driver!
– Some results from our OSPERT’11 paper (*) below:

[Figure: execution time (ms, log scale from 0.01 to 100) of matrix multiplication on an NVIDIA GeForce GTX 480 for matrix sizes 16 x 16, 32 x 32, 64 x 64, 128 x 128, 256 x 256, 512 x 512, and 1024 x 1024, comparing "NVIDIA" and "Ours", broken down into Launch, HtoD, and DtoH]

* Available at http://www.contrib.andrew.cmu.edu/~shinpei/papers/ospert11.pdf

SLIDE 26

Thank you for your attention! Questions?