Atlas Tracking Optimization on GPU

Master Thesis
Atlas Tracking Optimization on GPU
Luis Domingues
Professor: Frédéric Bapst
Supervisors: Paolo Calafiura, Wim Lavrijsen
Expert: Mathieu Monney
02/25/2015

Target
Luis Domingues - January 2015

Code we started from
● Demonstrator of the ATLAS trigger on GPUs
● Basic host side
– Take data
– Send the data to the GPU and compute on it
– Sleep while waiting for the response

Overlapping pixels and SCT
● The pixel and SCT processing are done in sequence
● Same event, but sequential processing...
[Figure: timeline with the pixel kernels followed by the SCT kernels, each bracketed by time stamps]

CUDA Streams
● A stream is a queue of operations that execute in order
● Non-default streams can be executed in parallel
[Figure: Stream1, Stream2 and Stream3 each run H2D, Kernel, D2H along the time axis]
H2D = Host to device transfer
D2H = Device to host transfer

Overlapping pixels and SCT
● Use CUDA Streams
● Start processing the SCT before the pixel processing ends
[Figure: timeline with a pixel stream and an SCT stream, each running its kernels between time stamps]

Overlapping pixels and SCT
● For 2000 events, without overlapping
– Avg Pixel: 2.03 ms
– Avg SCT: 1.95 ms
– Total avg: 3.98 ms
● For 2000 events, overlapping
– Avg Pixel: 2.3 ms
– Avg SCT: 2.5 ms
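The slide gives no per-event total for the overlapped run. As a rough check, the time to finish one event must lie between the longer of the two stages (both start together) and their sum (no overlap at all). The sketch below computes those bounds; the function and struct names are illustrative and not taken from the thesis code.

```cpp
#include <algorithm>
#include <cassert>

// Bounds on the time to finish one event when two stages may overlap.
struct Bounds {
    double best;   // both stages start together: the longer one dominates
    double worst;  // no overlap at all: the stages run back to back
};

// Hypothetical helper: takes the average stage times in milliseconds.
Bounds overlapBounds(double pixelMs, double sctMs) {
    assert(pixelMs >= 0.0 && sctMs >= 0.0);
    return {std::max(pixelMs, sctMs), pixelMs + sctMs};
}
```

With the overlapped averages (2.3 ms and 2.5 ms) the bounds are 2.5 ms and 4.8 ms, to be compared with the 3.98 ms measured without overlapping. Each stage is slower when overlapped, so the gain depends on how much of the two stages actually runs concurrently.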

Overlapping pixels and SCT
● Total execution
– Without overlapping: 8.65 s
– With overlapping: 6.53 s

Multi-thread server side
● Huge amount of “small” data
– They do not fill the GPU
● Parallelize the “event” level processing with streams

Multi-thread server side
[Figure: eight clients connected to a FIFO]

Multi-thread server side
● Life of a thread
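The diagram for this slide is not recoverable, so the following is only a guess at the general shape: worker threads that block on a shared FIFO, take one event, process it, and loop until the queue is closed. Everything here (`Event`, `EventFifo`, `processOnGpu`, `runServer`) is a hypothetical name, and the GPU work is a stand-in; in a real server each thread would presumably enqueue its copies and kernels on its own CUDA stream.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical event type; the real request format is not shown in the slides.
struct Event { int id; };

// Thread-safe FIFO shared by the clients (producers) and the workers (consumers).
class EventFifo {
public:
    void push(Event e) {
        { std::lock_guard<std::mutex> lock(m_); q_.push(e); }
        cv_.notify_one();
    }
    // Signals that no more events will arrive.
    void close() {
        { std::lock_guard<std::mutex> lock(m_); closed_ = true; }
        cv_.notify_all();
    }
    // Blocks until an event is available; returns false once closed and drained.
    bool pop(Event& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        out = q_.front();
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Event> q_;
    bool closed_ = false;
};

// Stand-in for the GPU work (assumption): no CUDA call is made here.
int processOnGpu(const Event& e) { return e.id * 2; }

// Starts workerCount threads, feeds them eventCount events, returns how many were processed.
int runServer(int workerCount, int eventCount) {
    assert(workerCount > 0 && eventCount >= 0);
    EventFifo fifo;
    std::atomic<int> processed{0};
    std::vector<std::thread> workers;
    for (int w = 0; w < workerCount; ++w) {
        workers.emplace_back([&fifo, &processed] {
            Event e{};
            while (fifo.pop(e)) {   // life of a thread: wait, take, process, repeat
                processOnGpu(e);
                ++processed;
            }
        });
    }
    for (int i = 0; i < eventCount; ++i) fifo.push(Event{i});
    fifo.close();
    for (auto& t : workers) t.join();
    return processed.load();
}
```

Calling `runServer(4, 2000)` pushes 2000 events through four workers and returns 2000. Closing the FIFO rather than joining immediately lets the workers drain whatever is still queued before they exit.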

Multi-thread server side
● Execution times
– Without overlapping: 8.65 s
– With overlapping: 6.53 s
– Multi-threading server side: 4.7 s
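To put the three totals on a common footing, the sketch below computes the speedup over the starting point and the average wall time per event. It assumes the totals refer to the same 2000-event run used elsewhere in the slides; the helper names are illustrative.

```cpp
#include <cassert>

// Speedup of an optimized run relative to a baseline run (both in seconds).
double speedup(double baselineSeconds, double optimizedSeconds) {
    assert(baselineSeconds > 0.0 && optimizedSeconds > 0.0);
    return baselineSeconds / optimizedSeconds;
}

// Average wall time per event in milliseconds, given a total in seconds.
double perEventMs(double totalSeconds, int events) {
    assert(events > 0);
    return totalSeconds * 1000.0 / events;
}
```

With the reported figures, overlapping gives roughly a 1.32x speedup (8.65 s / 6.53 s) and the multi-threaded server roughly 1.84x (8.65 s / 4.7 s), or about 2.35 ms per event if the 4.7 s does cover 2000 events.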

CUDA Occupancy
● A good choice of grid and block size on the card can have a significant effect on performance
● CUDA offers an API to maximize the occupancy of the kernels
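The API mentioned is presumably the occupancy family of the CUDA runtime (for example `cudaOccupancyMaxPotentialBlockSize`), although the slides do not name it. To show what occupancy measures, here is a deliberately simplified host-side model: it only considers the thread and block limits of one multiprocessor and ignores registers and shared memory, which the real calculation also takes into account. The limits used in the usage note are illustrative values, not the card used in the thesis.

```cpp
#include <algorithm>
#include <cassert>

// Per-multiprocessor limits (illustrative; real values depend on the GPU).
struct SmLimits {
    int maxThreadsPerSm;
    int maxBlocksPerSm;
};

// Number of blocks of a given size that can be resident on one multiprocessor,
// considering only the thread and block limits (simplifying assumption).
int residentBlocksPerSm(int threadsPerBlock, const SmLimits& sm) {
    assert(threadsPerBlock > 0 && threadsPerBlock <= sm.maxThreadsPerSm);
    return std::min(sm.maxBlocksPerSm, sm.maxThreadsPerSm / threadsPerBlock);
}

// Fraction of the multiprocessor's thread slots that are in use.
double occupancy(int threadsPerBlock, const SmLimits& sm) {
    int residentThreads = residentBlocksPerSm(threadsPerBlock, sm) * threadsPerBlock;
    return static_cast<double>(residentThreads) / sm.maxThreadsPerSm;
}
```

With limits of 2048 threads and 16 blocks per multiprocessor, blocks of 1024 threads reach an occupancy of 1.0 with only 2 resident blocks, blocks of 768 threads reach 0.75, and blocks of 64 threads reach 0.5 with 16 resident blocks. The same occupancy figure can therefore hide very different numbers of resident blocks.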

CUDA Occupancy
[Figure: GPU hierarchy with labels CUDA core, multiprocessor, GPU]

CUDA Occupancy
● Bad block size setup
[Figure: Kernel 1 and Kernel 2 placed on the CUDA cores and multiprocessors of the GPU, with intra-block synchronization]

CUDA Occupancy
● Better block setup
[Figure: Kernel 1 and Kernel 2 placed on the CUDA cores and multiprocessors of the GPU, with intra-block synchronization]

CUDA Occupancy
● Maximizing the occupancy hurts overall performance
● Run results for 2000 events
– Big block size: 10.88 s
– Original configuration: 4.7 s
– Small block size: 4.4 s

CUDA Occupancy
● Maximizing the occupancy hurts overall performance
● Run results for 2000 events
– Big block size: 3 kernels in parallel (max 5)
– Small block size: 4 kernels in parallel (max 7)

Conclusion
● Important points when using a GPU
– Port of an algorithm to the GPU
– Communicate with the GPU
– Host side design
● Keep the GPU busy
● Big occupancy does not allow the GPU to schedule its tasks efficiently
