Information Theory and Coding: Image, Video and Audio Compression

Information Theory and Coding – Image, Video and Audio Compression Markus Kuhn Lent 2003 – Part II Computer Laboratory http://www.cl.cam.ac.uk/Teaching/2002/InfoTheory/

Structure of modern audiovisual communication systems

Signal → Sensor + sampling → Perceptual coding → Entropy coding → Channel coding → Channel (with added noise) → Channel decoding → Entropy decoding → Perceptual decoding → Display → Human senses

The dashed box marks the focus of the main part of this course as taught by Neil Dodgson.

Sampling, aliasing and Nyquist limit

[Figure: spectrum of a sampled signal; a component at frequency f reappears at i·f_s ± f, frequency axis marked from −3f_s to 3f_s]

A wave $\cos(2\pi t f)$ sampled with frequency $f_s$ cannot be distinguished from $\cos(2\pi t (i f_s \pm f))$ for any $i \in \mathbb{Z}$; therefore ensure $|f| < f_s/2$.
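The aliasing statement can be checked numerically. The following is a minimal sketch, not part of the original slides; the sampling rate, the test frequency and the helper name `sample_cosine` are illustrative assumptions.

```python
import math

def sample_cosine(freq_hz, sample_rate_hz, n_samples):
    """Return samples of cos(2*pi*t*f) taken at t = n / sample_rate_hz."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

fs = 8000.0   # sampling frequency f_s (assumed value)
f = 1000.0    # signal frequency, below f_s / 2 (assumed value)

original = sample_cosine(f, fs, 16)

# Frequencies of the form i*f_s + f and i*f_s - f for a few integers i.
alias_freqs = [i * fs + sign * f for i in (1, 2, -1) for sign in (+1, -1)]

# Largest difference between any alias and the original, sample by sample.
max_error = max(abs(a - b)
                for freq in alias_freqs
                for a, b in zip(sample_cosine(freq, fs, 16), original))
print("largest sample difference:", max_error)
```

All the listed frequencies produce the same sample values up to floating-point rounding, which is why the input has to be band-limited to below f_s/2 before sampling.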

Quantization

Uniform: [Figure: staircase quantization function with evenly spaced steps]

Non-uniform (e.g., logarithmic): [Figure: quantization function with logarithmically spaced steps]

Example for non-uniform quantization: digital telephone network

[Figure: signal voltage plotted against byte value (−128 to 128) for µ-law (US) and A-law (Europe)]

Simple logarithm fails for values ≤ 0 → apply µ-law compression

$$y = \frac{V \log(1 + \mu |x| / V)}{\log(1 + \mu)} \operatorname{sgn}(x)$$

before uniform quantization (µ = 255, V maximum value).

Lloyd's algorithm: finds least-square-optimal non-uniform quantization function for a given probability distribution of sample values.

S.P. Lloyd: Least Squares Quantization in PCM. IEEE Trans. on Information Theory, Vol. 28, March 1982, pp. 129–137.
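The µ-law formula above can be sketched in a few lines. This is an illustration only: the function names, the normalisation V = 1 and the quantizer step of 1/127 are assumptions, and the code does not reproduce the exact bit layout used in telephone codecs.

```python
import math

MU = 255.0  # value of mu given on the slide

def mu_law_compress(x, v_max=1.0, mu=MU):
    """y = V * log(1 + mu*|x|/V) / log(1 + mu) * sgn(x)."""
    magnitude = v_max * math.log1p(mu * abs(x) / v_max) / math.log1p(mu)
    return math.copysign(magnitude, x)

def mu_law_expand(y, v_max=1.0, mu=MU):
    """Inverse of mu_law_compress."""
    magnitude = v_max * math.expm1(abs(y) / v_max * math.log1p(mu)) / mu
    return math.copysign(magnitude, y)

def quantize_uniform(y, step):
    """Round y to the nearest multiple of step."""
    return round(y / step) * step

step = 1.0 / 127   # assumed step size of the uniform quantizer
x = 0.01           # a quiet sample (1% of full scale)

direct = quantize_uniform(x, step)
companded = mu_law_expand(quantize_uniform(mu_law_compress(x), step))

print("error without companding:", abs(direct - x))
print("error with mu-law:       ", abs(companded - x))
```

For quiet samples the companded path gives a much smaller absolute error with the same number of quantization steps, at the cost of a larger error near full scale.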

Psychophysics of perception

Sensation limit (SL) = lowest intensity stimulus that can still be perceived

Difference limit (DL) = smallest perceivable stimulus difference at given intensity level

Weber's law
Difference limit $\Delta\varphi$ is proportional to the intensity $\varphi$ of the stimulus (except for a small correction constant $a$ that describes the deviation of experimental results near SL):

$$\Delta\varphi = c \cdot (\varphi + a)$$

Fechner's scale
Define a perception intensity scale $\psi$ using the sensation limit $\varphi_0$ as the origin and the respective difference limit $\Delta\varphi = c \cdot \varphi$ as a unit step. The result is a logarithmic relationship between stimulus intensity and scale value:

$$\psi = \log_c \frac{\varphi}{\varphi_0}$$

Fechner's scale matches older subjective intensity scales that follow differentiability of stimuli, e.g. the astronomical magnitude numbers for star brightness introduced by Hipparchos (≈ 150 BC).

Stevens' law
A sound that is 20 DL over SL is perceived as more than twice as loud as one that is 10 DL over SL, i.e. Fechner's scale does not describe perceived intensity well. A rational scale attempts to reflect subjective relations perceived between different values of stimulus intensity $\varphi$. Stevens observed that such rational scales $\psi$ follow a power law:

$$\psi = k \cdot (\varphi - \varphi_0)^a$$

Example coefficients $a$: temperature 1.6, weight 1.45, loudness 0.6, brightness 0.33.

Decibel

Communications engineers love logarithmic units:
→ Quantities often vary over many orders of magnitude → difficult to agree on a common SI prefix
→ Quotient of quantities (amplification/attenuation) usually more interesting than difference
→ Signal strength usefully expressed as field quantity (voltage, current, pressure, etc.) or power, but quadratic relationship between these two ($P = U^2/R = I^2 R$) rather inconvenient
→ Weber/Fechner: perception is logarithmic

Plus: Using magic special-purpose units has its own odd attractions (→ typographers, navigators)

Neper (Np) denotes the natural logarithm of the quotient of a field quantity $F$ and a reference value $F_0$.

Bel (B) denotes the base-10 logarithm of the quotient of a power $P$ and a reference power $P_0$.

Common prefix: 10 decibel (dB) = 1 bel.

Where $P$ is some power and $P_0$ a 0 dB reference power, or $F$ is a field quantity and $F_0$ the reference:

$$10\,\mathrm{dB} \cdot \log_{10} \frac{P}{P_0} = 20\,\mathrm{dB} \cdot \log_{10} \frac{F}{F_0}$$

Common reference values indicated with additional letter after dB:
0 dBW = 1 W
0 dBm = 1 mW = −30 dBW
0 dBµV = 1 µV
0 dB SPL = 20 µPa (sound pressure level)
0 dB SL = perception threshold (sensation level)

3 dB ≈ double power, 6 dB ≈ double pressure/voltage/etc.
10 dB = 10 × power, 20 dB = 10 × pressure/voltage/etc.
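The decibel rules of thumb follow directly from the two logarithm formulas. A small sketch for checking them; the function names are illustrative and not taken from the slides.

```python
import math

def power_ratio_to_db(p, p0):
    """Level of power p relative to reference power p0, in dB."""
    return 10.0 * math.log10(p / p0)

def field_ratio_to_db(f, f0):
    """Level of field quantity f (voltage, pressure, ...) relative to f0, in dB."""
    return 20.0 * math.log10(f / f0)

def dbm_to_watts(dbm):
    """Convert a level in dBm (reference 1 mW) to watts."""
    return 1e-3 * 10.0 ** (dbm / 10.0)

print("double power:    ", power_ratio_to_db(2.0, 1.0), "dB")
print("double voltage:  ", field_ratio_to_db(2.0, 1.0), "dB")
print("1 mW in dBW:     ", power_ratio_to_db(1e-3, 1.0), "dBW")
print("30 dBm in watts: ", dbm_to_watts(30.0), "W")
```

Doubling a voltage quadruples the power into the same resistance, so `field_ratio_to_db(2, 1)` and `power_ratio_to_db(4, 1)` give the same level. This is the reason for the factor 20 in the field-quantity formula.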

RGB video colour coordinates

Hardware interface (VGA): red, green, blue signals with 0–0.7 V

Electron-beam current and photon count of cathode-ray display are proportional to $(v - v_0)^\gamma$, where $v$ is the video-interface or screen-grid voltage and $\gamma$ is usually in the range 1.5–3.0. CRT non-linearity is compensated electronically in TV cameras and approximates Stevens scale.

Software interfaces map RGB voltage linearly to {0, 1, ..., 255} or 0–1

Mapping of numeric RGB values to colour and luminosity is at present still highly hardware and sometimes even operating-system or device-driver dependent. New specification "sRGB" aims to fix meaning of RGB with $\gamma = 2.2$ and standard primary colour coordinates.

http://www.w3.org/Graphics/Color/sRGB
http://www.srgb.com/
IEC 61966

YCrCb video colour coordinates

Human eye processes colour and luminosity at different resolutions, therefore use colour space with luminance coordinate

$$Y = 0.3R + 0.6G + 0.1B$$

and colour components

$$V = R - Y = 0.7R - 0.6G - 0.1B$$
$$U = B - Y = -0.3R - 0.6G + 0.9B$$

Since $-0.7 \le V \le 0.7$ and $-0.9 \le U \le 0.9$, a more convenient normalized encoding of chrominance is:

$$Cb = \frac{U}{2.0} + 0.5$$
$$Cr = \frac{V}{1.6} + 0.5$$

Modern image compression techniques operate on Y, Cr, Cb channels separately, using half the resolution of Y for storing Cr, Cb.
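The colour-space mapping above can be written out directly. This sketch uses the rounded coefficients from the slide (0.3, 0.6, 0.1) and assumes R, G, B values in the range 0–1; the function names are illustrative, and real standards use slightly different constants.

```python
def rgb_to_ycbcr(r, g, b):
    """Map R, G, B in [0, 1] to (Y, Cb, Cr) using the slide's coefficients."""
    y = 0.3 * r + 0.6 * g + 0.1 * b
    u = b - y                      # U = B - Y, in [-0.9, 0.9]
    v = r - y                      # V = R - Y, in [-0.7, 0.7]
    return y, u / 2.0 + 0.5, v / 1.6 + 0.5

def ycbcr_to_rgb(y, cb, cr):
    """Invert rgb_to_ycbcr."""
    u = (cb - 0.5) * 2.0
    v = (cr - 0.5) * 1.6
    r = v + y
    b = u + y
    g = (y - 0.3 * r - 0.1 * b) / 0.6   # solve the Y equation for G
    return r, g, b

for name, rgb in [("grey", (0.5, 0.5, 0.5)),
                  ("red", (1.0, 0.0, 0.0)),
                  ("blue", (0.0, 0.0, 1.0))]:
    print(name, rgb_to_ycbcr(*rgb))
```

Any grey value maps to Cb = Cr = 0.5, so all colour information sits in the two chrominance channels, which can then be stored at reduced resolution.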

Correlation of neighbour pixels

[Figure: four scatter plots showing the values of neighbour pixels at distances 1, 2, 4 and 8; both axes range from 0 to 250]

Karhunen-Loève transform (KLT)

Two random variables $x$, $y$ are not correlated if their covariance

$$\operatorname{cov}(x, y) = E\{(x - E\{x\}) \cdot (y - E\{y\})\} = 0.$$

Take an image (or in practice a small 8 × 8 pixel block) as a random-variable vector $\mathbf{b}$. The components of a random-variable vector $\mathbf{b} = (b_1, \ldots, b_k)$ are decorrelated if the covariance matrix $\operatorname{cov}(\mathbf{b})$ with

$$(\operatorname{cov}(\mathbf{b}))_{i,j} = E\{(b_i - E\{b_i\}) \cdot (b_j - E\{b_j\})\} = \operatorname{cov}(b_i, b_j)$$

is a diagonal matrix.

The Karhunen-Loève transform of $\mathbf{b}$ is the matrix $A$ with which $\operatorname{cov}(A\mathbf{b})$ is diagonal.

Since $\operatorname{cov}(\mathbf{b})$ is symmetric, its eigenvectors are orthogonal. Using these eigenvectors as the rows of $A$ and the corresponding eigenvalues as the diagonal elements of the diagonal matrix $D$, we obtain the decomposition $\operatorname{cov}(\mathbf{b}) = A^T D A$, and therefore $\operatorname{cov}(A\mathbf{b}) = D$.

The Karhunen-Loève transform is the orthogonal matrix of the singular-value decomposition of the covariance matrix of its input.
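For just two random variables the eigenvectors of the covariance matrix can be written in closed form as a rotation, which makes the KLT easy to try without a linear-algebra library. The sketch below is an illustration under that simplifying assumption (pairs of neighbouring pixels rather than 8 × 8 blocks); the sample values and function names are made up for the example.

```python
import math

def covariance(xs, ys):
    """Sample estimate of E{(x - E{x}) * (y - E{y})}."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def klt_matrix_2d(xs, ys):
    """Return the 2x2 matrix A whose rows are eigenvectors of cov((x, y))."""
    cxx, cyy, cxy = covariance(xs, xs), covariance(ys, ys), covariance(xs, ys)
    # Rotation angle that zeroes the off-diagonal covariance term.
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def apply_matrix(a, xs, ys):
    """Compute A * (x, y)^T for every sample pair."""
    us = [a[0][0] * x + a[0][1] * y for x, y in zip(xs, ys)]
    ws = [a[1][0] * x + a[1][1] * y for x, y in zip(xs, ys)]
    return us, ws

# Hypothetical values of horizontally adjacent pixel pairs.
left = [52, 60, 75, 90, 110, 130, 160, 200]
right = [55, 58, 80, 88, 115, 128, 165, 195]

A = klt_matrix_2d(left, right)
first, second = apply_matrix(A, left, right)

print("covariance before:", covariance(left, right))
print("covariance after: ", covariance(first, second))
print("variances after:  ", covariance(first, first), covariance(second, second))
```

After the transform the covariance between the two components is zero up to rounding, and almost all of the variance ends up in the first component. For larger blocks the same idea needs a general eigenvector routine.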

Discrete cosine transform (DCT)

The forward and inverse discrete cosine transform

$$S(u) = \frac{C(u)}{\sqrt{N/2}} \sum_{x=0}^{N-1} s(x) \cos\frac{(2x+1)u\pi}{2N}$$

$$s(x) = \sum_{u=0}^{N-1} \frac{C(u)}{\sqrt{N/2}} S(u) \cos\frac{(2x+1)u\pi}{2N}$$

with

$$C(u) = \begin{cases} \frac{1}{\sqrt{2}} & u = 0 \\ 1 & u > 0 \end{cases}$$

is an orthonormal transform:

$$\sum_{x=0}^{N-1} \frac{C(u)}{\sqrt{N/2}} \cos\frac{(2x+1)u\pi}{2N} \cdot \frac{C(u')}{\sqrt{N/2}} \cos\frac{(2x+1)u'\pi}{2N} = \begin{cases} 1 & u = u' \\ 0 & u \ne u' \end{cases}$$
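The two sums above translate directly into code. The following is a deliberately naive O(N²) sketch for checking the formulas, not one of the fast algorithms; the function names and the sample values are illustrative.

```python
import math

def c(u):
    """Normalisation factor C(u)."""
    return 1.0 / math.sqrt(2.0) if u == 0 else 1.0

def basis(u, x, n):
    """Element x of DCT base vector u for block length n."""
    return c(u) / math.sqrt(n / 2.0) * math.cos((2 * x + 1) * u * math.pi / (2 * n))

def dct(s):
    """Forward transform S(u)."""
    n = len(s)
    return [sum(s[x] * basis(u, x, n) for x in range(n)) for u in range(n)]

def idct(coeffs):
    """Inverse transform s(x)."""
    n = len(coeffs)
    return [sum(coeffs[u] * basis(u, x, n) for u in range(n)) for x in range(n)]

samples = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]  # made-up values
coeffs = dct(samples)
restored = idct(coeffs)

print("coefficients:", [round(v, 3) for v in coeffs])
print("max round-trip error:", max(abs(a - b) for a, b in zip(samples, restored)))
```

Because the base vectors are orthonormal, the inverse uses the same `basis` values as the forward transform, and the sum of squares of the coefficients equals the sum of squares of the samples.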

The 2-dimensional variant of the DCT applies the 1-D transform on both rows and columns of an image:

$$S(u, v) = \frac{C(u)}{\sqrt{N/2}} \cdot \frac{C(v)}{\sqrt{N/2}} \cdot \sum_{y=0}^{N-1} \sum_{x=0}^{N-1} s(y, x) \cos\frac{(2x+1)u\pi}{2N} \cos\frac{(2y+1)v\pi}{2N}$$

Breakthrough: Ahmed/Natarajan/Rao discovered the DCT as an excellent approximation of the KLT for typical photographic images, but far more efficient to calculate.

Ahmed, Natarajan, Rao: Discrete Cosine Transform. IEEE Transactions on Computers, Vol. 23, January 1974, pp. 90–93.

A range of fast algorithms have been found for calculating 1-D and 2-D DCTs (e.g., Ligtenberg/Vetterli).
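The row-and-column description and the double sum describe the same transform, which a short sketch can confirm. The code is a naive illustration rather than a fast algorithm; the helper names and the synthetic test block are assumptions, and the result is stored as `result[v][u]` for S(u, v).

```python
import math

def c(u):
    return 1.0 / math.sqrt(2.0) if u == 0 else 1.0

def dct_1d(s):
    """1-D DCT of a list of numbers."""
    n = len(s)
    return [c(u) / math.sqrt(n / 2.0)
            * sum(s[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                  for x in range(n))
            for u in range(n)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows_done = [dct_1d(row) for row in block]
    return transpose([dct_1d(col) for col in transpose(rows_done)])

def dct_2d_direct(block):
    """2-D DCT evaluated directly from the double sum, block[y][x] = s(y, x)."""
    n = len(block)
    return [[c(u) * c(v) / (n / 2.0)
             * sum(block[y][x]
                   * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                   * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                   for y in range(n) for x in range(n))
             for u in range(n)]
            for v in range(n)]

# Synthetic 8 x 8 block, used only to compare the two implementations.
block = [[float((3 * y + 5 * x) % 17) for x in range(8)] for y in range(8)]
separable = dct_2d(block)
direct = dct_2d_direct(block)

print("max difference:", max(abs(a - b)
                             for row_a, row_b in zip(separable, direct)
                             for a, b in zip(row_a, row_b)))
```

The separable form needs 2N one-dimensional transforms of length N instead of N² sums over N² samples, which is why practical implementations work on rows and columns.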

Whole-image DCT

[Figure: original image and its 2D discrete cosine transform (log10 scale, colour bar from −4 to 4)]

Whole-image DCT, 80% coefficient cutoff

[Figure: 80% truncated 2D DCT (log10 scale, colour bar from −4 to 4) and the image reconstructed from the 80% truncated DCT]

Base vectors of 8 × 8 DCT

[Figure: the 64 base vectors of the 8 × 8 DCT]
