Computational Glass-Free Displays Zhuolun He and Guojie Luo Peking - PowerPoint PPT Presentation

FPGA Acceleration for Computational Glass-Free Displays Zhuolun He and Guojie Luo Peking University FPGA, Feb. 2017

Motivation: Hyperopia/Myopia Issues

Background Technology: Glass-Free Display
• Light-field display
  – [Huang and Wetzstein, SIGGRAPH 2014]
• Correcting for visual aberrations
  – Display: predistorted content
  – Retina: desired image
(Figure: display and retina, with the desired perception and the target light field)

Related Technologies: Light Field Camera

Related: Near-eye Light-field Display
Source: NVIDIA, SIGGRAPH Asia 2013

Pinhole Array vs. Microlens
• One 75 µm pinhole in every 390 µm, manufactured using lithography

In this Paper…
• Analyze the computational kernels
• Accelerate using FPGAs
• Propose several optimizations

Computational Glass-Free Display
(Figure: display and retina, with the desired perception and the target light field)
$x^T P^T = u^T$

Casting as a Model Fitting Problem
$x^T P^T = u^T$
minimize $f(x) = \| u - Px \|^2$
subject to $0 \le x \le 1$
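The box-constrained least-squares fit on this slide can be illustrated with a small sketch. It evaluates the squared residual and its gradient for a sparse matrix stored row by row, and clips a vector into the interval [0, 1]. The names `P_rows`, `x`, `u`, `objective`, `gradient`, and `project` are hypothetical, and clipping is only one simple way to respect the bounds; the slides do not say how the authors enforce them.

```python
def residual(P_rows, x, u):
    """r = u - P x, with P stored as one list of (column, value) pairs per row."""
    return [u_i - sum(v * x[j] for j, v in row) for row, u_i in zip(P_rows, u)]


def objective(P_rows, x, u):
    """Squared 2-norm of the residual."""
    return sum(r * r for r in residual(P_rows, x, u))


def gradient(P_rows, x, u):
    """Gradient of the squared residual: -2 P^T (u - P x)."""
    r = residual(P_rows, x, u)
    g = [0.0] * len(x)
    for row, r_i in zip(P_rows, r):
        for j, v in row:
            g[j] -= 2.0 * v * r_i
    return g


def project(x):
    """Clip every entry into [0, 1] (an assumed way to handle the bounds)."""
    return [min(1.0, max(0.0, x_i)) for x_i in x]


# Toy 2 x 3 problem (made-up numbers, not from the slides).
P_rows = [[(0, 1.0), (2, 0.5)], [(1, 2.0)]]
u = [1.0, 1.0]
x0 = [0.0, 0.0, 0.0]
f0 = objective(P_rows, x0, u)
g0 = gradient(P_rows, x0, u)
```

Both kernels are dominated by sparse matrix-vector products, one with the matrix and one with its transpose.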

Background of the L-BFGS Algorithm
• L-BFGS: a widely used convex optimization algorithm
• Iteration (flowchart):
  – Calculate gradient $\nabla f(x_k)$
  – Converged? If yes, done; if no, continue
  – Calculate direction $p_k$
  – Search for step length $\alpha_k$
  – Update $x_{k+1} = x_k + \alpha_k p_k$ and repeat

Background of the L-BFGS Algorithm
• L-BFGS algorithm
  – Input: (history size = m)
    $x_{k-m+1}, \cdots, x_k$ and $\nabla f(x_{k-m+1}), \cdots, \nabla f(x_k)$
    ▫ $s_j = x_{j+1} - x_j$
    ▫ $y_j = \nabla f(x_{j+1}) - \nabla f(x_j)$
  – Output: direction $p_k$
• Computational kernels
  – dot prod
  – vector updates

  $p_k = -\nabla f(x_k)$
  for $i = k-1$ to $k-m$ do
    $\alpha_i = (s_i \cdot p_k) / (s_i \cdot y_i)$   # some work
    $p_k = p_k - \alpha_i y_i$
  end for
  $p_k = p_k \cdot (s_{k-1} \cdot y_{k-1}) / (y_{k-1} \cdot y_{k-1})$
  for $i = k-m$ to $k-1$ do
    $\beta_i = (y_i \cdot p_k) / (s_i \cdot y_i)$   # more work
    $p_k = p_k + (\alpha_i - \beta_i) s_i$
  end for
  return direction $p_k$
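The two loops on this slide can be written as a short function. This is a minimal sketch in plain Python, not the authors' implementation; `lbfgs_direction`, `s_hist`, and `y_hist` are hypothetical names, and the history lists are assumed to be ordered from oldest to newest pair.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(a_i * b_i for a_i, b_i in zip(a, b))


def lbfgs_direction(grad, s_hist, y_hist):
    """Two-loop recursion: returns the search direction for the current gradient.

    s_hist and y_hist hold at most m difference vectors, oldest first.
    """
    p = [-g for g in grad]
    alphas = []
    # First loop: newest pair to oldest pair.
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        a = dot(s, p) / dot(s, y)
        alphas.append(a)
        p = [p_i - a * y_i for p_i, y_i in zip(p, y)]
    # Scale by the newest pair.
    if s_hist:
        scale = dot(s_hist[-1], y_hist[-1]) / dot(y_hist[-1], y_hist[-1])
        p = [scale * p_i for p_i in p]
    # Second loop: oldest pair to newest pair.
    for s, y, a in zip(s_hist, y_hist, reversed(alphas)):
        b = dot(y, p) / dot(s, y)
        p = [p_i + (a - b) * s_i for p_i, s_i in zip(p, s)]
    return p


# Toy check on a quadratic with Hessian diag(1, 4): the Newton direction
# for gradient (2, 8) is (-2, -2).
direction = lbfgs_direction(
    [2.0, 8.0],
    s_hist=[[1.0, 0.0], [0.0, 1.0]],
    y_hist=[[1.0, 0.0], [0.0, 4.0]],
)
```

Every iteration of either loop costs one or two full-length dot products and one full-length vector update, which is why the slide singles out these two kernels.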

Vector-free L-BFGS Algorithm
• Original idea
  – [NIPS 2014]
• Observation
  – $p_k$ is a linear combination of some basis in $\{s_j\}$ and $\{y_j\}$
• Techniques
  – dot prod ⇒ lookup + scalar op.
  – vector update ⇒ coeff. update

  $p_k = -\nabla f(x_k)$
  for $i = k-1$ to $k-m$ do
    $\alpha_i = (s_i \cdot p_k) / (s_i \cdot y_i)$   # some work
    $p_k = p_k - \alpha_i y_i$
  end for
  $p_k = p_k \cdot (s_{k-1} \cdot y_{k-1}) / (y_{k-1} \cdot y_{k-1})$
  for $i = k-m$ to $k-1$ do
    $\beta_i = (y_i \cdot p_k) / (s_i \cdot y_i)$   # more work
    $p_k = p_k + (\alpha_i - \beta_i) s_i$
  end for
  return direction $p_k$
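The two techniques on this slide can be sketched as follows: the direction is tracked only as coefficients over a basis, every dot product becomes a lookup in a table of pairwise basis dot products, and every vector update becomes a coefficient update. The sketch places the current gradient in the basis as the last entry, since the recursion starts from it. The names `vector_free_direction`, `combine`, `basis`, and `delta` are hypothetical, and the table is built by brute force here for brevity, whereas a real design would update it incrementally.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(a_i * b_i for a_i, b_i in zip(a, b))


def vector_free_direction(basis, m):
    """Return coefficients of the search direction over the basis.

    basis = [s_0, ..., s_{m-1}, y_0, ..., y_{m-1}, grad], oldest pair first.
    """
    n = 2 * m + 1
    # Dot-product table: the only place where full-length vectors are touched.
    table = [[dot(basis[i], basis[j]) for j in range(n)] for i in range(n)]
    delta = [0.0] * n
    delta[2 * m] = -1.0                      # direction starts as -grad
    alpha = [0.0] * m
    for i in range(m - 1, -1, -1):           # newest pair to oldest pair
        alpha[i] = sum(delta[j] * table[i][j] for j in range(n)) / table[i][m + i]
        delta[m + i] -= alpha[i]             # coefficient update on y_i
    if m > 0:
        scale = table[m - 1][2 * m - 1] / table[2 * m - 1][2 * m - 1]
        delta = [scale * c for c in delta]
    for i in range(m):                       # oldest pair to newest pair
        beta = sum(delta[j] * table[m + i][j] for j in range(n)) / table[i][m + i]
        delta[i] += alpha[i] - beta          # coefficient update on s_i
    return delta


def combine(basis, delta):
    """Materialize the direction as the linear combination of basis vectors."""
    return [sum(c * b[t] for c, b in zip(delta, basis)) for t in range(len(basis[0]))]


# Toy check: same quadratic as a Hessian diag(1, 4) with gradient (2, 8).
basis = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 4.0], [2.0, 8.0]]
coeffs = vector_free_direction(basis, m=2)
direction = combine(basis, coeffs)
```

Inside the two loops only scalars and table entries are read, so full-length vectors are needed just for refreshing the table and for the final combination.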

(Figure: dataflow of Original L-BFGS vs. Vector-free L-BFGS. Original: from $\nabla f(x_k)$, repeated dotprod(), s-v mult., and v-v add. produce $p_k$. Vector-free: {coeff.} are updated through dotprod table lookups and scalar operations; s-v mult. and v-v add. then produce $p_k$.)

Updating the Dot Product Table
| | Scenario | Focus |
|---|---|---|
| [NIPS 2014] | Distributed computing using MapReduce | minimize #syncs |
| Ours | FPGA acceleration with small on-chip BRAM | minimize data transfers |
• Similar idea to reduce data transfers
  – dot prod ⇒ lookup + scalar op.
  – vector update ⇒ coeff. update

Distributed vs. FPGA-based
| | Scenario | Focus | Data transfer |
|---|---|---|---|
| [NIPS 2014] | Distributed computing using MapReduce | minimize #syncs | 8md |
| Ours | FPGA acceleration with small on-chip BRAM | minimize data transfers | (4m+4)d |
– m: history size (e.g., 10)
– d: image size
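As a quick worked example of the two data-transfer expressions, the sketch below plugs in m = 10 (the example value on the slide) and d = 490000. Using 490000 for d is an assumption borrowed from the variable size quoted on the sparse matrix-vector slide, and the unit (vector elements per iteration) is also assumed, since the slide does not state it.

```python
def transfers_distributed(m, d):
    """Data transfer of the distributed scheme, as given on the slide: 8md."""
    return 8 * m * d


def transfers_fpga(m, d):
    """Data transfer of the FPGA-oriented scheme, as given on the slide: (4m+4)d."""
    return (4 * m + 4) * d


m, d = 10, 490000          # d is an assumed value, see the note above
baseline = transfers_distributed(m, d)
ours = transfers_fpga(m, d)
reduction = baseline / ours
print(baseline, ours, round(reduction, 2))
```

The ratio 8m / (4m+4) does not depend on d and approaches 2 as the history size grows.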

Sparse Matrix-Vector Multiplication
minimize $f(x) = \| u - Px \|^2$
• Size of matrix/vector
  – Sparse matrix $P$: 16384 × 490000
  – Variable $x$: 490000

Sparse Matrix-Vector Multiplication
minimize $f(x) = \| u - Px \|^2$
• Problem: storage of $P$
• Solution:
  – Sparsity ⇒ compressed row storage (CRS)
  – Range of indices ⇒ bitwidth reduction
  – #unique values ⇒ look-up table (LUT)
    ▫ ~810K non-zero entries
    ▫ ~600 unique values

| Format | Storage (MB) |
|---|---|
| flat | 32112.64 |
| COO | 6.63 |
| CRS | 5.24 |
| CRS+LUT | 2.90 |
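The three compression steps can be illustrated with a small sketch: row pointers and column indices give the compressed row storage, each non-zero stores an index into a table of unique values instead of the value itself, and the index ranges determine how many bits each field needs. The function names and the exact layout are hypothetical; the slides do not describe the authors' on-chip format in this detail.

```python
def to_crs_lut(dense):
    """Compress a dense matrix into CRS arrays plus a look-up table of unique values."""
    lut, lut_index = [], {}
    row_ptr, col_idx, val_idx = [0], [], []
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                if v not in lut_index:
                    lut_index[v] = len(lut)
                    lut.append(v)
                col_idx.append(j)
                val_idx.append(lut_index[v])   # store a small index, not the value
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, val_idx, lut


def spmv(row_ptr, col_idx, val_idx, lut, x):
    """Multiply the compressed matrix by a dense vector x."""
    return [
        sum(lut[val_idx[k]] * x[col_idx[k]] for k in range(row_ptr[r], row_ptr[r + 1]))
        for r in range(len(row_ptr) - 1)
    ]


def bits_needed(n_values):
    """Bits required to index n_values distinct items."""
    return max(1, (n_values - 1).bit_length())


# Toy matrix (made-up numbers).
row_ptr, col_idx, val_idx, lut = to_crs_lut([[0, 0.5, 0], [0.25, 0, 0.5]])
result = spmv(row_ptr, col_idx, val_idx, lut, [1.0, 2.0, 3.0])

# Field widths for the sizes quoted on the slides.
col_bits = bits_needed(490000)   # column index
lut_bits = bits_needed(600)      # index into ~600 unique values
```

With these widths a non-zero entry needs 19 + 10 bits rather than a 32-bit index plus a 32-bit value, which is the kind of saving the bitwidth-reduction and LUT bullets point to.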

Sparse Matrix-Vector Multiplication
minimize $f(x) = \| u - Px \|^2$
• Problem: partitioning vector $x$
• “Solution”:
  – Matrix $P$ is irregular but constant
  – ⇒ access pattern is non-affine but statistically analyzable
  – ⇒ enumerate factors of $|x|$ as partitioning factors

| Factor N | Method | Min cycle/row | Max cycle/row | Total cycle |
|---|---|---|---|---|
| 980 | cyclic | 1 | 1 | 16384 |
| 1225 | cyclic | 1 | 1 | 16384 |
| 1250 | cyclic | 1 | 2 | 19840 |
| … | … | … | … | … |
| 1400 | block | 4 | 18 | 188564 |
| 1250 | block | 5 | 18 | 193276 |
| … | … | … | … | … |
| 1 | N/A | 37 | 54 | 816272 |
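Because the matrix is constant, the cost of any partitioning can be measured offline by replaying its column indices. The sketch below enumerates the factors of the vector length and, for each factor and method, counts cycles per row as the largest number of accesses that land in the same bank. That cost model (one access per bank per cycle) and all names here are assumptions for illustration, not the authors' tool.

```python
from collections import Counter
from math import isqrt


def factors(n):
    """All positive divisors of n in ascending order."""
    return sorted(f for i in range(1, isqrt(n) + 1) if n % i == 0 for f in {i, n // i})


def row_cycles(cols, n_banks, length, method):
    """Cycles to read one row's vector entries, assuming one access per bank per cycle."""
    if method == "cyclic":
        banks = [j % n_banks for j in cols]
    else:  # "block": consecutive indices share a bank
        block = length // n_banks
        banks = [j // block for j in cols]
    return max(Counter(banks).values(), default=0)


def evaluate(rows, length, n_banks, method):
    """Return (min cycle/row, max cycle/row, total cycle) for one configuration."""
    per_row = [row_cycles(cols, n_banks, length, method) for cols in rows]
    return min(per_row), max(per_row), sum(per_row)


# Toy access pattern: column indices of each matrix row, vector length 12.
rows = [[0, 1, 2], [0, 4, 8], [3, 9]]
length = 12
table = {
    (n, method): evaluate(rows, length, n, method)
    for n in factors(length)
    for method in ("cyclic", "block")
}
best = min(table, key=lambda key: table[key][2])
```

With a single bank the per-row cycle count equals the number of non-zeros in the row, so the total equals the number of non-zero entries.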

Overall Design of the Accelerator
• [Li et al., FPGA '15]
• Maximize performance
• Subject to resources

Experimental Evaluation
Runtime Comparison

| Configuration | Time (s) |
|---|---|
| Baseline | 124.5 |
| SpMV optimization | 65.49 |
| L-BFGS enhancement | 47.47 |
| Parameter tuning in L-BFGS | 25.26 |
| Overall result after other fine tunings | 9.74 |

+: 12.78X speedup
−: Peak memory bandwidth < 800 MB/s

Conclusions
• Summary
  – Bandwidth-friendly L-BFGS algorithm
  – Application-specific sparse matrix compression
  – Memory partitioning for non-affine access
• Future work
  – Possibility of real-time processing
  – Construct transformation matrix by eyeball tracking
  – A demonstrative system

Questions?

Runtime Profiling of a 2-min L-BFGS
(Figure: runtime breakdown per procedure and per operation)
