Perceptually-Driven Video Coding with the Daala Video Codec Timothy B. Terriberry The Xiph.Org Foundation & The Mozilla Corporation

Summary
● Daala is an attempt to completely avoid royalty-bearing technologies
● Used many unconventional tools
● Some worked well, others were more challenging
  – We think the challenges are more interesting
● Many lessons learned that can inform AV1 development
  – Only a few presented here, see paper for more

Challenge 1: Lapped Transforms with Variable Block Sizes

Original Lapping Strategy
● Filter size chosen based on size of smallest block on an edge (to prevent overlap)
● Filter order chosen to mimic a loop filter’s
  – Horizontal edges first
  – Then vertical
  – Maximal parallelism, minimum buffering
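The size rule above can be sketched in a few lines of Python. This is only an illustration, not Daala's code: the function name `original_lap_size` is hypothetical, and the sketch assumes the filter length in points equals the size of the smallest block touching the edge.

```python
def original_lap_size(block_sizes_on_edge):
    """Return the lapping filter length (in points) for one edge.

    block_sizes_on_edge: sizes of every block touching the edge.
    Assumption: the filter length equals the smallest of those sizes,
    so filters on opposite edges of that block cannot overlap.
    """
    if not block_sizes_on_edge:
        raise ValueError("an edge must touch at least one block")
    return min(block_sizes_on_edge)


# Re-splitting one neighbor changes the filter on the shared edge,
# so a block's filtered input depends on its neighbors' block sizes.
before = original_lap_size([16, 16])
after = original_lap_size([16, 4])
```

Here `before` and `after` differ even though the first block never changed, which is the dependency on neighbors' block sizes that the rule creates.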

Problem #1: Basis Weirdness

Problem #2: Block size decision
● Have to know neighbors’ block sizes to compute lapping size
● Used a heuristic based on the estimated visibility of ringing to pick block sizes up front
  – Worked “okay” for still images (at least not obviously broken)
  – Was not making good decisions for inter frames
● Wanted to try explicit block size RDO (like other encoders)...
  – But lapping dependency makes this infeasible

“Fixed Lapping”: Remove the Dependency
● Always use 8-point lapping (4 pixels on either side of an edge)
  – Except on 4×4 blocks (details in a few slides)
  – Always use 4-point lapping for chroma (because of subsampling)

New Filter Order
● Filter top/bottom superblock (64×64) edges first
● Filter left/right superblock (64×64) edges next
● Splitting: Filter interior edges
  – 4×4 blocks:
    ● Exterior edges use 8-point filter (from previous levels)
    ● Interior edges use 4-point filter (overlaps 8-point filter)
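A minimal sketch of the fixed-lapping size rule, as read from these slides. The function name `fixed_lap_size` and its parameters are hypothetical, and "interior edge" here means an edge created by splitting down to 4×4 blocks.

```python
def fixed_lap_size(plane, block_size, interior_edge):
    """Return the lapping filter length (in points) under fixed lapping.

    plane: "luma" or "chroma".
    block_size: size of the block being split out (4, 8, 16, ...).
    interior_edge: True if the edge was created by splitting down to
        this block size, False if it was filtered at a previous level.
    """
    if plane == "chroma":
        return 4  # chroma always uses 4-point lapping (subsampling)
    if block_size == 4 and interior_edge:
        return 4  # interior edges between 4x4 blocks
    return 8      # every other luma edge


sizes = [
    fixed_lap_size("luma", 32, True),
    fixed_lap_size("luma", 4, False),
    fixed_lap_size("luma", 4, True),
    fixed_lap_size("chroma", 16, True),
]
```

Unlike the original scheme, the result never depends on a neighbor's block size, so an encoder can evaluate block size choices independently.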

Results
● Big boost in metrics
  – Almost all from decision
  – Used fixed lapping decision with old lapping scheme and got almost all of the gains

  Metric      RATE (%)     DSNR (dB)
  PSNR        -10.36612    0.40904
  PSNRHVS      -4.48956    0.25806
  SSIM        -12.32547    0.38397
  FASTSSIM     -5.20467    0.17350

● Smaller lapping means less ringing but more blockiness (especially on gradients)
  – Didn’t save much on ringing: 4×4 blocks have 12-pixel support instead of 8
  – Eventually dropped to 4-point lapping everywhere

Challenge 2: Frequency Domain Intra Prediction

Frequency Domain Intra Prediction
● Perform prediction in transform domain
  – Shorter pipeline dependency for hardware
● Multiple (linear) prediction matrices trained from large dataset (approx. equiv. to spatial directions)
● Computational complexity controlled by enforcing “sparsity” (4 muls per output coefficient)
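To make the sparsity constraint concrete, here is a hedged Python sketch of a sparse linear predictor. The names `predict_sparse` and `MAX_TAPS` and the tap layout are illustrative assumptions, and the weights below are made up rather than trained values.

```python
MAX_TAPS = 4  # sparsity budget: at most 4 multiplies per output coefficient


def predict_sparse(neighbor_coeffs, taps):
    """Predict transform coefficients from neighbors' coefficients.

    neighbor_coeffs: flat list of already-decoded neighbor coefficients.
    taps: one entry per output coefficient, each a list of
        (neighbor_index, weight) pairs. This is the sparse form of one
        row of a prediction matrix (hypothetical layout).
    """
    prediction = []
    for row in taps:
        if len(row) > MAX_TAPS:
            raise ValueError("row exceeds the sparsity budget")
        prediction.append(sum(w * neighbor_coeffs[j] for j, w in row))
    return prediction


# Toy example with made-up weights: two outputs, three neighbor inputs.
neighbors = [1.0, 2.0, 3.0]
toy_taps = [[(0, 0.5), (2, 1.0)], []]
predicted = predict_sparse(neighbors, toy_taps)
```

A dense matrix would cost one multiply per input per output; capping each row at four taps bounds the cost regardless of how many neighbor coefficients are available.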

Frequency Domain Intra Prediction
● Variable block sizes make this worse
  – Best results: convert all neighbors to 4×4 with “TF”
● Most multiplies spent on predicting DC
● A simpler approach:
  – Haar DC: combine DCs from smaller blocks with Haar transform (down to one DC per 64×64 block)
    ● Hugely effective, no multiplies
  – Use first row/column of neighbors’ coefficients as sole AC predictor (only when block sizes match)
    ● Works just as well as orig. FDIP (not very), much simpler
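The Haar DC idea can be sketched as a small pyramid over a grid of block DCs. This is an assumption-laden illustration rather than Daala's implementation: it assumes a square grid of equal-size blocks whose side is a power of two, it uses an unnormalized Haar step (adds and subtracts only), and the function names are hypothetical. A real codec would also need to handle scaling and mixed block sizes.

```python
def haar_2x2(a, b, c, d):
    """Unnormalized 2x2 Haar step on four DCs laid out as [[a, b], [c, d]].

    Returns (sum, horizontal, vertical, diagonal) using only adds
    and subtracts.
    """
    return (a + b + c + d, a - b + c - d, a + b - c - d, a - b - c + d)


def haar_dc_pyramid(dcs):
    """Reduce a square grid of block DCs to a single top-level DC.

    dcs: list of lists, side length a power of two.
    Returns (top_dc, details), where details holds the (horizontal,
    vertical, diagonal) triple from every 2x2 merge.
    """
    details = []
    level = dcs
    while len(level) > 1:
        half = len(level) // 2
        merged = [[0] * half for _ in range(half)]
        for y in range(half):
            for x in range(half):
                s, h, v, d = haar_2x2(
                    level[2 * y][2 * x], level[2 * y][2 * x + 1],
                    level[2 * y + 1][2 * x], level[2 * y + 1][2 * x + 1],
                )
                merged[y][x] = s
                details.append((h, v, d))
        level = merged
    return level[0][0], details


# A flat region: every detail is zero, so only the top DC carries energy.
flat_top, flat_details = haar_dc_pyramid([[1] * 4 for _ in range(4)])
```

For smooth content most detail triples are near zero, which gives one intuition for why combining DCs this way can work well without any multiplies.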

Things We Did Not Try
● Spatial prediction from outside lapping region
  – Very complicated with original lapping scheme
  – Feasible with fixed lapping scheme
● Correcting for biorthogonal basis function scales
  – Intractable with original lapping
● “Smart” factorization of prediction matrices
  – Only improves up to the limit of non-sparse predictors

Directions for AV1
● Directional Deringing
  – Fully SIMDable, good perceptual improvements
● Non-binary Arithmetic Coding
  – Small effective parallelism in entropy coding
● Perceptual Vector Quantization
  – Already showing small gains vs. scalar on PSNR
  – Potential for large perceptual improvements
  – Enables frequency-domain Chroma-from-Luma, others
● Rate control improvements

Daala Progress (Fast MS-SSIM): January 2014 to April 2016
[Figure: progress plot; up and left is better. Marked ranges: HQ, YouTube, LQ, Video Conference. Curves labeled by month (Jan, May, Jun, Apr, Nov, Feb) and H.265.]

Daala Progress (PSNR-HVS): January 2014 to April 2016
[Figure: progress plot; up and left is better. Marked ranges: HQ, YouTube, LQ, Video Conference. Curves labeled by month (Jan, May, Jun, Apr, Nov, Feb) and H.265.]

Questions?
