BLINK: A GPU-Enabled Image Processing Framework Mark Davey Lead - PowerPoint PPT Presentation

BLINK: A GPU-Enabled Image Processing Framework Mark Davey Lead HPC Engineer The Foundry

The Foundry and HPC • The Foundry – Founded in 1996 – We develop award-winning visual effects, computer graphics and design software used globally by leading artists and designers • HPC – We create frameworks to make best use of all available compute devices – “make things go faster” – Initial target: 2D Image Processing

2D Image Processing • A fundamental component in many Foundry products. Used in such effects as: • Noise reduction • Keying • Motion and disparity estimation • Colour correction/grading • Panoramic stitching • 3D texture creation Need to make it as fast as possible!

Moving to GPUs • Traditionally used the CPU for image processing • Lots of legacy code • GPUs are great at image processing • Our customers often have powerful GPUs but not always (e.g. render farms) • Need a fallback CPU path • Do not want to write same code multiple times (debugging, maintenance, new hardware, etc.)

The Solution - BLINK • “Write once, deploy everywhere” • Image processing algorithms expressed as kernels • Kernels written in a C++ like, domain-specific language • Kernels run over an iteration space • Metadata expresses access patterns, image formats, boundary conditions, etc. • Kernels are translated into different back-ends • JIT Compilation for many paths

BLINK - Features • Multiple back-ends supported • Consistent results across devices • Range of image formats and layouts available • Kernel execution strategy left to framework • Profiling (execute and transfer)

BLINK Back-ends • CUDA (4.2, Compute Capability 2.0) • OpenCL (1.1) • GLSL (1.2) • x86 (Scalar, SSE2, SSE4.1, AVX, AVX2)

BLINK Example class GainImage: ImageComputationKernel<eComponentWise> { param: Image<eRead, ePoint> src; Image<eWrite, ePoint> dst; float gain; void define(){ defineParam (gain, “ myGain ” , 1.0f); } void process(){ dst() = src() * gain; } };

BLINK - The Foundry Nuke – Post Production Compositing Software • Many key plug-ins written using BLINK • BlinkScript – Customers can create kernels within Nuke for GPU and CPU – Multi-GPU support on selected configurations • OCULA 4 – Stereoscopic Toolset Projects • ASAP – A Scalable 2D/3D Architecture for Cross Media Virtual Production • Dreamspace – Advancements in Virtual Production Frameworks

OCULA • A collection of Nuke tools to handle stereoscopic imagery • Vector Disparity Generator at its heart – Correct colour and focus, automatically correct alignment, retime • Latest version (4) written using BLINK • Over 12K kernel calls per frame!

OCULA 4 – Disparity Generation

OCULA 4 – Different Devices

Numerical Identity I • Our customers need visually identical results when processing on different devices. • Some algorithms are extremely sensitive to small differences in mathematical results (e.g. OCULA!) • Need to ensure numerical identity to guarantee visual identity

Numerical Identity – General Overview • Disable fast math - to prevent compiler from reordering math operations. • Force floating point literals to single precision - different compilers treat double literals differently giving inconsistent results. • Disable Fused-Multiply-Add (FMA) • Implement unified math library for all code paths – Algebraic functions sqrt, hypot … – Transcendental functions sin, exp … – Integral rounding functions ceil, floor … – IEEE standard functions fmod, fabs … – Matrices and operators transpose, inverse … – Vectors and operators dot, cross … – Others min, max …

Numerical Identity – Platform Specifics CUDA (nvcc flags) • Disable “Flush Denormals To Zero” (--ftz=false) • Disable “Fused M ultiply Add” (--fmad=false) • Enable precise square root and divide (--prec-sqrt=true --prec-div=true) CPU: • Precisely control FPU control register for rounding, denormal handing, etc ( using _mm_setcsr intrinsic ) • Implement vector types (float1..float4, int1..int4,...) Also supported for OpenCL (NVIDIA GPUs only)

OCULA 4 - Results OCULA 4 -Disparity - 3.3MPixel - Unified Math • Disparity generation 9 8 • 3.3MPixel (2560x1350) frames 7 6 Time (s) • End-to-end processing cost 5 4 • 3 Only 5% overhead for Numerical Identity 2 • 1 Many kernels are memory bound 0 K5000 - Unified Math K5000 - Optimised

OCULA 4 - Results Ocula 4 Disparity - 3.3MPixel Stereo 35 30 ~ 5 times faster on the GPU 25 … and more speed to come! Time (s) 20 15 10 5 0 CPU - 2x 6-Core Xeon K5000 - Unified Math K5000 - Optimised

Under Development…Examples • Heterogeneous Compute – Run graphs of kernels using scheduler – Target all available compute devices – Target data parallelism • BLINK for Real-time – Export BLINK graphs from Nuke to run in BLINKPlayer – Kernels can be modified in BLINKPlayer – Parameters can be introspected from kernels and presented as GUI widgets – Composite live and rendered imagery

Thank You Questions?

BLINK: A GPU-Enabled Image Processing Framework Mark Davey Lead - PowerPoint PPT Presentation

BLINK: A GPU-Enabled Image Processing Framework Mark Davey Lead HPC Engineer The Foundry The Foundry and HPC The Foundry Founded in 1996 We develop award-winning visual effects, computer graphics and design software used globally

Lessons from Contributng to WebKit and Blink Bruno de Oliveira Abinader WebKit, Chromium/Blink

PROGRAMMING ADVANCED FEEDBACK WITH ARDUINO Part 02 - Led blink fastled FASTLED LIBRARY

Introduction: What is Image Processing? CS 4640: Image Processing Basics January 10, 2012 What

Status of GPU offloading on Wayland Axel Davy FOSDEM 2014 Status of GPU offloading on Wayland

Motivation to Learn GPGPU Julius Parulek Why to Learn About GPU? Computational power of GPU vs.

Image Restoration Image Enhancement and Image Restoration both deal with improving images. Image

Blink: 3D Display Multiplexing for Virtualized Applications Jacob Gorm Hansen, University of

Image Processing Todays Class Image Representations: Matrices Image Representations: RGB,

Real-Time GPU Management Heechul Yun 1 This Week Topic: General Purpose Graphic Processing

UNIFIED MEMORY ON PASCAL AND VOLTA Nikolay Sakharnykh - May 10, 2017 1 HETEROGENEOUS

Advancements in V-Ray RT GPU Vlado Koylazov, CTO & Co-founder Blagovest Taskov, RT GPU Team

Image Processing Tricks in Image Processing Tricks in OpenGL OpenGL Simon Green Simon Green

Image Processing CS 110 Why Image Processing? Medical Images

Unprivileged GPU containers on a LXD cluster GPU-enabled system containers at scale Stphane

GPU Enabled Spark MLlib Lingyun Li & Lei Yao CS 848 University of Waterloo Outline

Rapidmind Background on GPGPU GPU Tutorial Goal: 3-D image -> 2-D image 2 main stages:

Third Quarter 2019 Earnings Presentation October 30, 2019 Legal Disclaimer Forward-Looking

Phonemic Awareness Phonological Awareness Phonological awareness is ones sensitivity

RBS Morgans S o ga s 2013 Institutional Conference Presentation Blended Travel by Andrew

Paper No. 256 Optimization of Naphtha Feedstock Blending for Integrated Olefins-Aromatics Plant

COM300 Book Review Presentation Book: Blink The Power of Thinking Without Thinking Author:

Data Landing Page Current State All users can currently do on Users struggle with the terms our

LM001.00 02 PAGE DOB JOB NUMBER 1 EXISTING EXTERIOR - 4TH AVENUE NONE NEW LOCATION eclipse

occam on the Arduino Adam T. Sampson School of Computing, University of Kent Matt C. Jadud

BLINK: A GPU-Enabled Image Processing Framework Mark Davey Lead - PowerPoint PPT Presentation

BLINK: A GPU-Enabled Image Processing Framework Mark Davey Lead HPC Engineer The Foundry The Foundry and HPC The Foundry Founded in 1996 We develop award-winning visual effects, computer graphics and design software used globally

Lessons from Contributng to WebKit and Blink Bruno de Oliveira Abinader WebKit, Chromium/Blink

PROGRAMMING ADVANCED FEEDBACK WITH ARDUINO Part 02 - Led blink fastled FASTLED LIBRARY

Introduction: What is Image Processing? CS 4640: Image Processing Basics January 10, 2012 What

Status of GPU offloading on Wayland Axel Davy FOSDEM 2014 Status of GPU offloading on Wayland

Motivation to Learn GPGPU Julius Parulek Why to Learn About GPU? Computational power of GPU vs.

Image Restoration Image Enhancement and Image Restoration both deal with improving images. Image

Blink: 3D Display Multiplexing for Virtualized Applications Jacob Gorm Hansen, University of

Image Processing Todays Class Image Representations: Matrices Image Representations: RGB,

Real-Time GPU Management Heechul Yun 1 This Week Topic: General Purpose Graphic Processing

UNIFIED MEMORY ON PASCAL AND VOLTA Nikolay Sakharnykh - May 10, 2017 1 HETEROGENEOUS

Advancements in V-Ray RT GPU Vlado Koylazov, CTO &amp; Co-founder Blagovest Taskov, RT GPU Team

Image Processing Tricks in Image Processing Tricks in OpenGL OpenGL Simon Green Simon Green

Image Processing CS 110 Why Image Processing? Medical Images

Unprivileged GPU containers on a LXD cluster GPU-enabled system containers at scale Stphane

GPU Enabled Spark MLlib Lingyun Li &amp; Lei Yao CS 848 University of Waterloo Outline

Rapidmind Background on GPGPU GPU Tutorial Goal: 3-D image -&gt; 2-D image 2 main stages:

Third Quarter 2019 Earnings Presentation October 30, 2019 Legal Disclaimer Forward-Looking

Phonemic Awareness Phonological Awareness Phonological awareness is ones sensitivity

RBS Morgans S o ga s 2013 Institutional Conference Presentation Blended Travel by Andrew

Paper No. 256 Optimization of Naphtha Feedstock Blending for Integrated Olefins-Aromatics Plant

COM300 Book Review Presentation Book: Blink The Power of Thinking Without Thinking Author:

Data Landing Page Current State All users can currently do on Users struggle with the terms our

LM001.00 02 PAGE DOB JOB NUMBER 1 EXISTING EXTERIOR - 4TH AVENUE NONE NEW LOCATION eclipse

occam on the Arduino Adam T. Sampson School of Computing, University of Kent Matt C. Jadud

Advancements in V-Ray RT GPU Vlado Koylazov, CTO & Co-founder Blagovest Taskov, RT GPU Team

GPU Enabled Spark MLlib Lingyun Li & Lei Yao CS 848 University of Waterloo Outline

Rapidmind Background on GPGPU GPU Tutorial Goal: 3-D image -> 2-D image 2 main stages: