Compression with Flows via Local Bits-Back Coding
Jonathan Ho, Evan Lohn, Pieter Abbeel
Background
• Lossless compression with a likelihood-based generative model p(x)
  [figure: encode/decode pipeline turning data into a bitstream]
• Information theory: a uniquely decodable code exists with codelengths ≈ −log p(x)
• Training by maximum likelihood optimizes the expected codelength
• But what about the computational efficiency of coding?
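As a toy illustration of the codelength principle (the three-symbol distribution below is assumed for the example, not from the talk):

    # A minimal sketch of the codelength principle: under a model p(x), a
    # symbol x can be coded in about -log2 p(x) bits (e.g. with arithmetic
    # coding). The toy distribution here is an assumption for illustration.
    import math

    p = {"a": 0.5, "b": 0.25, "c": 0.25}
    for x, px in p.items():
        print(f"{x}: {-math.log2(px):.1f} bits")   # a: 1.0, b: 2.0, c: 2.0
    # Expected codelength = entropy = 1.5 bits/symbol; maximum-likelihood
    # training minimizes exactly this expectation over the data distribution.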
Existing compression algorithms
• The naive algorithm enumerates all possible data, requiring resources exponential in the data dimension
• Coding efficiently requires harnessing the structure of p(x)
• Autoregressive models: code one dimension at a time
• Latent variable models trained with variational inference: bits-back coding (see the accounting sketch below)
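For reference, the standard bits-back accounting the last bullet refers to, written as a hypothetical helper (not the talk's code):

    def bitsback_net_bits(log2_q_z_given_x, log2_p_x_given_z, log2_p_z):
        """Net bits to transmit x with bits-back coding.

        The sender decodes z ~ q(z|x) from auxiliary bits (a refund of
        -log2 q(z|x) bits, recovered by the receiver at the end), then
        encodes x with p(x|z) and z with the prior p(z). In expectation
        the net cost is the negative ELBO, an upper bound on -log2 p(x).
        """
        return -log2_p_x_given_z - log2_p_z + log2_q_z_given_x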
<latexit sha1_base64="3cAYUCWdTOTUmFKx3vHhcEIG6Y=">ACMnicbVDLSgMxFM3UVx1foy7dBEulBS0zIuhGKLqxG6lgH9CpJZNm2tDMgyQj1mG+yY1fIrjQhSJu/QgzbUWtHgczjmX3HuckFEhTfNJy8zMzs0vZBf1peWV1TVjfaMugohjUsMBC3jTQYIw6pOapJKRZsgJ8hxGs7gNPUb14QLGviXchiStod6PnUpRlJHaNie0j2HTe+TeCOLagHRwJGLD5PCuYurBShbev5r9iNih1D9yres5LC92wRwo6RM0vmCPAvsSYkByaodowHuxvgyCO+xAwJ0bLMULZjxCXFjCS6HQkSIjxAPdJS1EceEe14dHIC80rpQjfg6vkSjtSfEzHyhBh6jkqmS4pLxX/81qRdI/aMfXDSBIfjz9yIwZlANP+YJdygiUbKoIwp2pXiPuIyxVy7oqwZo+S+p75cs2RdHOTKJ5M6smALbIMCsMAhKIMzUAU1gMEdeAQv4FW71561N+19HM1ok5lN8Avaxyci26g1</latexit> <latexit sha1_base64="3cAYUCWdTOTUmFKx3vHhcEIG6Y=">ACMnicbVDLSgMxFM3UVx1foy7dBEulBS0zIuhGKLqxG6lgH9CpJZNm2tDMgyQj1mG+yY1fIrjQhSJu/QgzbUWtHgczjmX3HuckFEhTfNJy8zMzs0vZBf1peWV1TVjfaMugohjUsMBC3jTQYIw6pOapJKRZsgJ8hxGs7gNPUb14QLGviXchiStod6PnUpRlJHaNie0j2HTe+TeCOLagHRwJGLD5PCuYurBShbev5r9iNih1D9yres5LC92wRwo6RM0vmCPAvsSYkByaodowHuxvgyCO+xAwJ0bLMULZjxCXFjCS6HQkSIjxAPdJS1EceEe14dHIC80rpQjfg6vkSjtSfEzHyhBh6jkqmS4pLxX/81qRdI/aMfXDSBIfjz9yIwZlANP+YJdygiUbKoIwp2pXiPuIyxVy7oqwZo+S+p75cs2RdHOTKJ5M6smALbIMCsMAhKIMzUAU1gMEdeAQv4FW71561N+19HM1ok5lN8Avaxyci26g1</latexit> <latexit sha1_base64="3cAYUCWdTOTUmFKx3vHhcEIG6Y=">ACMnicbVDLSgMxFM3UVx1foy7dBEulBS0zIuhGKLqxG6lgH9CpJZNm2tDMgyQj1mG+yY1fIrjQhSJu/QgzbUWtHgczjmX3HuckFEhTfNJy8zMzs0vZBf1peWV1TVjfaMugohjUsMBC3jTQYIw6pOapJKRZsgJ8hxGs7gNPUb14QLGviXchiStod6PnUpRlJHaNie0j2HTe+TeCOLagHRwJGLD5PCuYurBShbev5r9iNih1D9yres5LC92wRwo6RM0vmCPAvsSYkByaodowHuxvgyCO+xAwJ0bLMULZjxCXFjCS6HQkSIjxAPdJS1EceEe14dHIC80rpQjfg6vkSjtSfEzHyhBh6jkqmS4pLxX/81qRdI/aMfXDSBIfjz9yIwZlANP+YJdygiUbKoIwp2pXiPuIyxVy7oqwZo+S+p75cs2RdHOTKJ5M6smALbIMCsMAhKIMzUAU1gMEdeAQv4FW71561N+19HM1ok5lN8Avaxyci26g1</latexit> <latexit sha1_base64="3cAYUCWdTOTUmFKx3vHhcEIG6Y=">ACMnicbVDLSgMxFM3UVx1foy7dBEulBS0zIuhGKLqxG6lgH9CpJZNm2tDMgyQj1mG+yY1fIrjQhSJu/QgzbUWtHgczjmX3HuckFEhTfNJy8zMzs0vZBf1peWV1TVjfaMugohjUsMBC3jTQYIw6pOapJKRZsgJ8hxGs7gNPUb14QLGviXchiStod6PnUpRlJHaNie0j2HTe+TeCOLagHRwJGLD5PCuYurBShbev5r9iNih1D9yres5LC92wRwo6RM0vmCPAvsSYkByaodowHuxvgyCO+xAwJ0bLMULZjxCXFjCS6HQkSIjxAPdJS1EceEe14dHIC80rpQjfg6vkSjtSfEzHyhBh6jkqmS4pLxX/81qRdI/aMfXDSBIfjz9yIwZlANP+YJdygiUbKoIwp2pXiPuIyxVy7oqwZo+S+p75cs2RdHOTKJ5M6smALbIMCsMAhKIMzUAU1gMEdeAQv4FW71561N+19HM1ok5lN8Avaxyci26g1</latexit> Flow models z ∼ N (0 , I ) • Flow model: smooth invertible map between noise and data • They are likelihood-based, so coding algorithm must exist • This work: computationally e ffi cient coding for flows
Local approximations of flows
• Coding strategy: locally approximate the flow as a VAE, then apply bits-back coding
• The flow maps data to latents: z = f(x)
• Construct a VAE where f acts as the encoder q(z|x) and f⁻¹ as the decoder p(x|z)
  [figure: f mapping between x and z]
• The VAE's variational bound closely matches the flow's log likelihood (numerical sketch below)
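A numerical sketch of why the bound is tight, using a linear flow z = Ax, for which the local Gaussians are exact (the dimensions, σ, and accounting below are illustrative assumptions; in the actual algorithm σ is tied to the discretization precision):

    import numpy as np

    rng = np.random.default_rng(0)
    D, sigma = 3, 1e-3
    A = rng.standard_normal((D, D)) + 3.0 * np.eye(D)   # flow Jacobian (invertible)
    Ainv = np.linalg.inv(A)                             # Jacobian of f^{-1}
    x = rng.standard_normal(D)

    def log_gauss(v, mean, cov):
        d = v - mean
        return -0.5 * (d @ np.linalg.solve(cov, d)
                       + np.linalg.slogdet(cov)[1] + len(v) * np.log(2 * np.pi))

    z = A @ x + sigma * rng.standard_normal(D)  # decode z ~ q(z|x) = N(f(x), sigma^2 I)
    net = (-log_gauss(x, Ainv @ z, sigma**2 * Ainv @ Ainv.T)   # encode x with p(x|z)
           - log_gauss(z, np.zeros(D), np.eye(D))              # encode z with prior p(z)
           + log_gauss(z, A @ x, sigma**2 * np.eye(D)))        # bits back from q(z|x)
    exact = -log_gauss(A @ x, np.zeros(D), np.eye(D)) + np.linalg.slogdet(Ainv)[1]
    print(net, exact)   # net bits-back codelength matches -log p(x) up to O(sigma)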
Local bits-back coding
• Our algorithm is bits-back coding on this local VAE approximation of the flow
• A straightforward black-box implementation needs cubic time in the data dimension, making no assumptions on the flow's structure (see below)
• Better than exponential, but not fast enough
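A sketch of where the cubic cost comes from (the map f below is a stand-in smooth invertible function, not a real flow from the paper):

    import torch

    # The black-box coder needs the flow's full D x D Jacobian (roughly D
    # backward passes) plus dense Gaussian linear algebra such as a Cholesky
    # factorization of the local covariance, which is O(D^3).
    D = 8
    W = torch.randn(D, D) + 3.0 * torch.eye(D)
    f = lambda x: torch.tanh(x @ W)                      # stand-in invertible map
    x = torch.randn(D)
    J = torch.autograd.functional.jacobian(f, x)         # D x D Jacobian
    L = torch.linalg.cholesky(J @ J.T + 1e-8 * torch.eye(D))   # O(D^3) step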
Specializing local bits-back coding
• Making extra assumptions on the flow lets us speed up compression
• For the RealNVP family: linear-time, fully parallelizable compression, by exploiting the structure of coupling layers and composition (layer sketch below)
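A minimal sketch of the coupling-layer structure being exploited (names here are assumed; the actual Flow++ layers are more elaborate):

    import torch

    def coupling_forward(x1, x2, net):
        """RealNVP-style affine coupling: x1 passes through unchanged; x2 is
        transformed elementwise with a scale and shift computed from x1.
        The Jacobian is triangular with diagonal exp(log_s), so the local
        Gaussians factorize per dimension: coding is linear time and
        parallel across dimensions."""
        log_s, t = net(x1).chunk(2, dim=-1)
        z2 = x2 * torch.exp(log_s) + t
        return x1, z2, log_s.sum(dim=-1)    # log |det Jacobian| of the layer

A full flow composes many such layers; coding one layer at a time keeps every step linear and parallelizable.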
Results
• Implemented for Flow++, a RealNVP-type flow model

  Codelength (bits/dim)       CIFAR10   ImageNet 32x32   ImageNet 64x64
  Theoretical                 3.116     3.871            3.701
  Local bits-back (ours)      3.118     3.875            3.703

• State-of-the-art fully parallelizable compression on these datasets
• Requires "auxiliary bits" for bits-back coding; codelength can degrade if they are unavailable
Results: speed
• Specializing local bits-back to the RealNVP structure speeds up compression by orders of magnitude

  Coding time (seconds)             Batch size   CIFAR10        ImageNet 32x32   ImageNet 64x64
  Black box (Algorithm 1)           1            64.37 ± 1.05   534.74 ± 5.91    1349.65 ± 2.30
  Compositional (Section 3.4.3)     1            0.77 ± 0.01    0.93 ± 0.02      0.69 ± 0.02
  Compositional (Section 3.4.3)     64           0.09 ± 0.00    0.17 ± 0.00      0.18 ± 0.00
  Neural net only, without coding   1            0.50 ± 0.03    0.76 ± 0.00      0.44 ± 0.00
  Neural net only, without coding   64           0.04 ± 0.00    0.13 ± 0.00      0.05 ± 0.00
Conclusion • Local bits-back coding: compression with flow models • Naive algorithm: exponential time in data dimension • Our algorithm for general flows: polynomial time • Our algorithm for RealNVP family: linear time and parallelizable • For algorithm details and comparisons to other types of models, come to our poster! • Open source: github.com/hojonathanho/localbitsback