

  1. POWER EFFICIENT VISUAL COMPUTING ON MOBILE PLATFORMS. Brant Zhao, NVIDIA; Max Lv, NVIDIA

  2. • Performance • Energy Efficiency

  3. Power Efficient GPU Programming - Case Studies & Findings

  4. Case study #1: Image Pyramid Blending

  5. Image Pyramid Blending. [Figure: the blended result is reconstructed level by level: up-sample the coarser level, then add the next band.]

  6. Image Pyramid Blending - A naïve CUDA implementation. [Diagram: the CPU issues a cudaMalloc for each set of pyramids immediately before the GPU stage that needs it (create Laplacian pyramids for the left and right images, create a Gaussian pyramid for the mask, blend the Laplacian pyramids, reconstruct the blended image), so CPU and GPU work interleave over time and the CPU frequency stays high for the whole run.]

  7. Image Pyramid Blending - Power optimized: avoid CPU<->GPU interleaving. [Diagram: all cudaMalloc calls for the pyramids are issued by the CPU up front, then the create/blend/reconstruct stages run back to back on the GPU; the CPU goes idle while the GPU works and its frequency can drop.]
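The win from batching comes from letting the CPU race to idle instead of staying clocked up to service interleaved stages. A toy Python model of the two schedules (all power and timing numbers are hypothetical, chosen only to illustrate the effect, not taken from the slides) makes this concrete:

```python
# Toy energy model for CPU<->GPU interleaving vs. batched GPU work.
# All power/time numbers below are hypothetical illustration values.

CPU_ACTIVE_W = 2.0   # CPU power while busy or held at high frequency
CPU_IDLE_W = 0.2     # CPU power once it can drop to idle
GPU_ACTIVE_W = 3.0   # GPU power while a stage runs
STAGE_S = 0.01       # duration of each GPU stage, in seconds
N_STAGES = 5         # create pyramids x3, blend, reconstruct

def interleaved_energy():
    # The CPU wakes up between every stage, so it never drops to idle:
    # both processors burn active power for the whole run.
    t = N_STAGES * STAGE_S
    return (CPU_ACTIVE_W + GPU_ACTIVE_W) * t

def batched_energy():
    # All stages are enqueued up front; the CPU idles while the GPU works.
    t = N_STAGES * STAGE_S
    return CPU_IDLE_W * t + GPU_ACTIVE_W * t

if __name__ == "__main__":
    print(f"interleaved: {interleaved_energy():.4f} J")
    print(f"batched:     {batched_energy():.4f} J")
```

The GPU term is identical in both schedules; only the CPU term shrinks, which is consistent with the Perf/Watt plot on the next slide showing similar performance at lower total power.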

  8. Image Pyramid Blending - Perf/Watt comparison. [Plot: normalized performance vs. normalized CPU+GPU power; the non-interleaved version reaches comparable performance at lower CPU+GPU power than the CPU<->GPU interleaved version.]

  9. Case study #2: 2D Convolution

  10. 2D Convolution. [Figure: a 3x3 input neighborhood is multiplied element-wise by the 3x3 kernel weights and summed to produce one output pixel.]

  11. 2D Convolution. [Figure: the same example with the weighted sum evaluated for one output pixel, giving 8.]
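As a plain-Python reference for the operation in the figure (the image and kernel values here are illustrative stand-ins, since the slide's exact numbers are not fully recoverable):

```python
# Reference 3x3 2D convolution (really a correlation, as is common in
# image processing): each output pixel is the element-wise product of a
# 3x3 input neighborhood and the kernel weights, summed.

def conv2d_3x3(image, kernel):
    h, w = len(image), len(image[0])
    out = [[0.0] * (w - 2) for _ in range(h - 2)]
    for y in range(h - 2):
        for x in range(w - 2):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    acc += image[y + ky][x + kx] * kernel[ky][kx]
            out[y][x] = acc
    return out

if __name__ == "__main__":
    # Hypothetical 4x4 image and diagonal kernel.
    img = [[1, 2, 1, 2],
           [0, 2, 3, 1],
           [2, 0, 1, 10],
           [1, 1, 1, 1]]
    k = [[0.25, 0, 0],
         [0, 0.5, 0],
         [0, 0, 0.75]]
    print(conv2d_3x3(img, k))
```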

  12. 2D Convolution - 3x3 2D convolution with FP16. [Figure: neighboring input pixels are packed into FP16 pairs, pack0 .. pack8.]
  • Basic operations for 2 output pixels: 9 packed FP16 MADs
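The packing trick can be sketched in Python by modeling a half2 register as a pair and a packed MAD as a pairwise fused multiply-add; real code would use CUDA's `__half2` type and `__hfma2()` intrinsic. The image and kernel values below are hypothetical:

```python
# Sketch of the packed-FP16 idea: one "half2" MAD operates on a pair of
# values at once, so two horizontally adjacent output pixels of a 3x3
# convolution cost 9 packed MADs instead of 18 scalar ones.

def hfma2(a, b, c):
    """Packed multiply-add: (a.x*b.x + c.x, a.y*b.y + c.y)."""
    return (a[0] * b[0] + c[0], a[1] * b[1] + c[1])

def conv3x3_pair(image, kernel, y, x):
    """Compute out[y][x] and out[y][x+1] together with 9 packed MADs."""
    acc = (0.0, 0.0)
    n_mads = 0
    for ky in range(3):
        for kx in range(3):
            w = kernel[ky][kx]
            # Two horizontally adjacent input pixels share one packed MAD.
            pair = (image[y + ky][x + kx], image[y + ky][x + kx + 1])
            acc = hfma2(pair, (w, w), acc)
            n_mads += 1
    return acc, n_mads

if __name__ == "__main__":
    # Hypothetical 4x4 image and diagonal kernel.
    img = [[1, 2, 1, 2],
           [0, 2, 3, 1],
           [2, 0, 1, 10],
           [1, 1, 1, 1]]
    k = [[0.25, 0, 0],
         [0, 0.5, 0],
         [0, 0, 0.75]]
    print(conv3x3_pair(img, k, 0, 0))
```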

  13. 2D Convolution - Perf/Watt comparison. [Plot: normalized performance vs. normalized GPU power; FP16 achieves higher performance at lower GPU power than FP32.]

  14. Case study #3: Sparse Lucas-Kanade Optical Flow (SparseLK)

  15. SparseLK. [Figure: a feature in the first frame $I$ is tracked into the second frame $I_{next}$ through iterative updates $\Delta p_0, \Delta p_1, \Delta p_2$.] Each iteration solves
  $$\Delta p = \begin{bmatrix} \sum_{x \in W} I_x^2 & \sum_{x \in W} I_x I_y \\ \sum_{x \in W} I_x I_y & \sum_{x \in W} I_y^2 \end{bmatrix}^{-1} \sum_{x \in W} \big( I(x) - I_{next}(x + \Delta p_{prev}) \big) \begin{bmatrix} I_x \\ I_y \end{bmatrix}$$
  where $I_x$, $I_y$ are the spatial gradients and $W$ is the window around the feature point.
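One Gauss-Newton step of this update, sketched in plain Python for a single feature point (the per-pixel gradient and residual samples are hypothetical; a real kernel would compute them from the image pair with finite differences and bilinear sampling):

```python
# One iteration of the Lucas-Kanade update for a single feature point:
# dp = G^-1 * b, where G is the 2x2 structure tensor of the window and
# b accumulates the temporal residual weighted by the spatial gradient.

def lk_update(samples):
    """samples: list of (Ix, Iy, r) per window pixel, where
    r = I(x) - I_next(x + dp_prev) is the warped temporal residual."""
    g00 = g01 = g11 = b0 = b1 = 0.0
    for ix, iy, r in samples:
        g00 += ix * ix
        g01 += ix * iy
        g11 += iy * iy
        b0 += r * ix
        b1 += r * iy
    det = g00 * g11 - g01 * g01
    if abs(det) < 1e-12:
        return (0.0, 0.0)  # degenerate window: no reliable update
    # Closed-form 2x2 inverse applied to b.
    return ((g11 * b0 - g01 * b1) / det,
            (g00 * b1 - g01 * b0) / det)

if __name__ == "__main__":
    # Hypothetical window samples: (Ix, Iy, residual)
    win = [(1.0, 0.0, 0.5), (0.0, 1.0, -0.25), (1.0, 1.0, 0.25)]
    print(lk_update(win))
```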

  16. SparseLK - Solution #1. [Figure: the update equation from slide 15, with threads T0, T1, ..., T5 cooperating on the window of one feature point.]
  • Multiple threads per feature point
  • Share data via shared memory or shuffle
  • A reduction is needed to get the final results
  • High thread-level parallelism (TLP), but more instructions needed

  17. SparseLK - Solution #2. [Figure: the update equation from slide 15, with each thread (T0, T1, ...) owning one feature point.]
  • Each thread handles one feature point
  • No need to shuffle data
  • No need to do a reduction
  • Needs more registers to hold data
  • High instruction-level parallelism (ILP), but low occupancy
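A schematic instruction count shows why solution #1 issues more instructions even though both mappings do the same arithmetic. The counts below are illustrative, not profiled, and the 6-thread / 36-pixel window sizes are assumptions:

```python
# Toy instruction-count comparison of the two SparseLK mappings.
# Counts are schematic illustrations, not measurements of real kernels.
import math

def multi_thread_insts(window_pixels, threads_per_feature):
    # Solution #1: each thread accumulates an equal share of the window
    # (padded up if the window does not divide evenly), then the five
    # partial sums (g00, g01, g11, b0, b1) are combined with a
    # log2-step shuffle reduction across the cooperating threads.
    per_thread = math.ceil(window_pixels / threads_per_feature)
    accumulate = per_thread * threads_per_feature   # MADs issued in total
    reduction_steps = math.ceil(math.log2(threads_per_feature))
    shuffle = 5 * reduction_steps * threads_per_feature  # shfl+add pairs
    return accumulate + shuffle

def single_thread_insts(window_pixels):
    # Solution #2: one thread walks the whole window; no shuffles,
    # no reduction, at the cost of more live registers per thread.
    return window_pixels

if __name__ == "__main__":
    print(multi_thread_insts(36, 6), single_thread_insts(36))
```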

  18. SparseLK - Instruction# and Perf/Watt
  $$\mathrm{Perf} = \frac{\mathrm{Workload}}{\mathrm{Sec}}, \qquad \mathrm{Watt} = \frac{\mathrm{Energy}}{\mathrm{Sec}} \quad\Rightarrow\quad \frac{\mathrm{Perf}}{\mathrm{Watt}} = \frac{\mathrm{Workload}}{\mathrm{Energy}}$$
  $$\mathrm{Energy} = \mathrm{EnergyPerInst}_{shuffle} \cdot \mathrm{Instructions}_{shuffle} + \mathrm{EnergyPerInst}_{reduction} \cdot \mathrm{Instructions}_{reduction} + \mathrm{EnergyPerInst}_{other} \cdot \mathrm{Instructions}_{other} + \mathrm{Power}_{wasted} \cdot \mathrm{Time}$$
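Plugging illustrative numbers into this model shows how a mapping with fewer shuffle and reduction instructions can come out ahead in Perf/Watt. Every per-instruction energy, instruction count, and power value below is hypothetical:

```python
# Evaluate the slide's energy model:
#   Energy = E_shuffle*N_shuffle + E_reduction*N_reduction
#          + E_other*N_other + P_wasted*T
#   Perf/Watt = Workload / Energy
# All per-instruction energies (joules) and counts are hypothetical.

def energy(n_shuffle, n_reduction, n_other, wasted_w, t_s,
           e_shuffle=2e-9, e_reduction=1.5e-9, e_other=1e-9):
    return (e_shuffle * n_shuffle + e_reduction * n_reduction
            + e_other * n_other + wasted_w * t_s)

def perf_per_watt(workload, e):
    return workload / e

if __name__ == "__main__":
    workload = 1000.0  # feature points processed
    # Multi-thread-per-feature: extra shuffle + reduction instructions.
    e1 = energy(n_shuffle=90000, n_reduction=30000, n_other=36000,
                wasted_w=0.05, t_s=1e-3)
    # Single-thread-per-feature: only the core arithmetic instructions.
    e2 = energy(n_shuffle=0, n_reduction=0, n_other=36000,
                wasted_w=0.05, t_s=1e-3)
    print(perf_per_watt(workload, e1), perf_per_watt(workload, e2))
```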

  19. SparseLK - Perf/Watt comparison. [Plot: normalized performance vs. normalized GPU power, comparing the multiple-threads-per-feature and single-thread-per-feature variants.]

  20. Summary
  • Analyze the whole pipeline at the system level
  • Use energy-efficient features on the target platform
  • Balance between TLP and ILP

  21. THANK YOU brantz@nvidia.com
