TSBK01 Image Coding and Data Compression
Lecture 10
Jörgen Ahlberg

Outline
I. Colour coding
II. Moving images: From 2D to 3D?
III. Hybrid coding
IV. Video coding standards

Part I: Colour coding

The base colours of colour television are
– Red: 700 nm
– Green: 546 nm
– Blue: 435 nm

Three base colours are enough to synthesize any visible colour!

[Figure: RGB colour space with axes R, G and B. In the plane shown, the luminance Y = R + G + B = 1.]

[Figure: a matrix converts R, G and B into Y, R-Y and B-Y.]

Y = 0.30R + 0.59G + 0.11B
Cr = R - Y = 0.70R - 0.59G - 0.11B
Cb = B - Y = -0.30R - 0.59G + 0.89B

Y: luminance; Cr, Cb: chrominance

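The colour matrix can be sketched in a few lines of Python. This is only an illustration: it assumes R, G and B are floats in [0, 1], uses the luminance weights 0.30, 0.59 and 0.11 for R, G and B, and the function names are hypothetical.

```python
def rgb_to_ycrcb(r, g, b):
    """Convert R, G, B (assumed floats in [0, 1]) to (Y, Cr, Cb)."""
    y = 0.30 * r + 0.59 * g + 0.11 * b   # luminance
    cr = r - y                           # = 0.70R - 0.59G - 0.11B
    cb = b - y                           # = -0.30R - 0.59G + 0.89B
    return y, cr, cb


def ycrcb_to_rgb(y, cr, cb):
    """Invert the matrix: recover R and B directly, then solve Y for G."""
    r = y + cr
    b = y + cb
    g = (y - 0.30 * r - 0.11 * b) / 0.59
    return r, g, b
```

For a grey input (R = G = B) both chrominance components vanish, so all the information sits in Y.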
• Change basis to YUV (almost the same as YCrCb).
  – For more info on colour spaces, see the colour FAQ at www.poynton.com/Poynton-color.html
• The Human Visual System perceives the luminance in higher resolution than the chrominance!
• Subsample the colour components.

[Figure: Y, U and V sample grids for 4:2:0 and 4:2:2 subsampling.]

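A minimal sketch of chrominance subsampling follows. It assumes each chrominance plane is a list of rows with even dimensions and that subsampling is done by plain averaging; real systems may use other filters and sample positions, and the function names are hypothetical.

```python
def subsample_422(plane):
    """4:2:2: halve the horizontal chrominance resolution (pairwise average)."""
    return [[(row[x] + row[x + 1]) / 2 for x in range(0, len(row), 2)]
            for row in plane]


def subsample_420(plane):
    """4:2:0: halve both resolutions by averaging each 2x2 neighbourhood."""
    h, w = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def samples_per_pixel(scheme):
    """Average number of samples (Y + U + V) stored per pixel."""
    return {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5}[scheme]
```

With 4:2:0 the data volume is halved compared with full-resolution colour (1.5 instead of 3 samples per pixel) before any actual coding takes place.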
Part II: Moving images: From 2D to 3D?

Principle I: Extend known methods to 3D

Method            Performance (bpp)   Coding complexity   Decoding complexity
PCM               6 – 8               Low                 Low
VQ                0.5 – 2             Very high           Low
Predictive        2 – 5               Low                 Low
Transform         0.5 – 1.5           High                High
Subband/Wavelet   0.1 – 1.0           High                High
Fractal           0.1 – 0.5           Very high           Low

• Predictive coding → 3D predictors
  – Motion compensated predictors
• Transform coding → 3D transforms
• Subband coding → 3D subband filters

BUT! The properties of the image signal are different in the temporal and the spatial domain!

Principle II: Hybrid methods

Hybrid predictive/transform coding is very popular.

Part III: Hybrid coding

• Combine predictive coding and transform coding.
• Use predictive coding to predict the next frame in the sequence.
• Use transform coding to code the prediction error.

Transform coding

[Block diagram: T → Q → VLC]

T: Transform
Q: Quantizer
VLC: Variable length coder

Predictive coding

[Block diagram: Q → VLC, with Q⁻¹ and P in the feedback loop]

Q: Quantizer
Q⁻¹: Inverse quantizer (reconstructor)
P: Predictor

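The predictive coder can be sketched as a simple DPCM loop. The sketch assumes a uniform quantizer with step size `step` and the simplest possible predictor P (the previous reconstructed sample); both are illustrative choices, not taken from the slide, and the VLC stage is left out.

```python
def dpcm_encode(samples, step):
    """Quantize the prediction error; predict from the reconstructed signal."""
    prediction = 0
    indices = []
    for s in samples:
        q = round((s - prediction) / step)    # Q: quantize the prediction error
        indices.append(q)                     # these indices would go to the VLC
        prediction = prediction + q * step    # Q^-1 and P: same value as the decoder
    return indices


def dpcm_decode(indices, step):
    """Mirror of the encoder's feedback loop."""
    prediction = 0
    reconstructed = []
    for q in indices:
        prediction = prediction + q * step
        reconstructed.append(prediction)
    return reconstructed
```

Because the encoder predicts from the reconstructed samples rather than from the originals, encoder and decoder stay in step and the quantization error does not accumulate: every reconstructed sample is within half a step of the original.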
Hybrid coding

[Block diagram: T → Q → VLC, with Q⁻¹, T⁻¹ and P in the feedback loop]

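A toy version of the hybrid loop is sketched below. It makes several simplifying assumptions that are not in the slides: each "frame" is a short 1-D list of samples, T is an orthonormal DCT, Q is a uniform quantizer with step `step`, the predictor P is simply the previous reconstructed frame (no motion compensation), and the VLC is omitted. All function names are hypothetical.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of samples (T)."""
    n = len(x)
    return [(math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
            * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]


def idct(c):
    """Inverse of dct() (T^-1)."""
    n = len(c)
    return [math.sqrt(1 / n) * c[0]
            + sum(math.sqrt(2 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                  for k in range(1, n))
            for i in range(n)]


def reconstruct(prev, levels, step):
    """Q^-1, T^-1 and adding the prediction: shared by encoder and decoder."""
    residual = idct([level * step for level in levels])
    return [p + r for p, r in zip(prev, residual)]


def encode_sequence(frames, step):
    prev = [0.0] * len(frames[0])             # P: previous reconstructed frame
    coded = []
    for frame in frames:
        error = [f - p for f, p in zip(frame, prev)]      # prediction error
        levels = [round(c / step) for c in dct(error)]    # T followed by Q
        coded.append(levels)
        prev = reconstruct(prev, levels, step)            # decoder inside the encoder
    return coded


def decode_sequence(coded, step):
    prev = [0.0] * len(coded[0])
    frames = []
    for levels in coded:
        prev = reconstruct(prev, levels, step)
        frames.append(prev)
    return frames
```

If two consecutive frames are identical, the prediction error of the second one is just the small quantization error of the first, so its quantized coefficients are typically all zero and cost almost nothing to transmit.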
[Figure: an intra-coded I-frame followed by predictively coded P-frames.]

The prediction is better if it can compensate for motion!

Motion compensated hybrid coding

[Block diagram: TQ → VLC, with TQ⁻¹, P and ME in the feedback loop; the output of ME goes to a second VLC]

TQ: Transform + quantization
ME: Motion estimation

Motion compensation

• Typically one motion vector per macroblock (4 transform blocks)
• Motion estimation is a time-consuming process
  – Hierarchical motion estimation
  – Maximum length of motion vectors
  – Clever search strategies
• Motion vector accuracy:
  – Integer, half or quarter pixel
  – Bilinear interpolation

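The simplest form of motion estimation, an exhaustive block-matching search with integer-pixel accuracy, can be sketched as follows. The sketch assumes frames are lists of rows of integers, uses the sum of absolute differences (SAD) as matching criterion (a common choice, not stated on the slide), and takes the motion vector to point from the current block to its match in the reference frame. The names are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block and a displaced block."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def full_search(cur, ref, bx, by, n, max_disp):
    """Try every integer displacement up to max_disp pixels in each direction."""
    h, w = len(ref), len(ref[0])
    best_vector = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + n > h or bx + dx < 0 or bx + dx + n > w:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best_vector, best_cost = (dx, dy), cost
    return best_vector, best_cost
```

The search evaluates up to (2 · max_disp + 1)² candidates per block, which is why a maximum vector length, hierarchical estimation and clever search strategies are used to cut the cost.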
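Part IV: Video coding standards

[Figure: applications and standards placed along a bitrate axis.
Axis ticks: 8, 16, 64, 384 kbit/s; 1.5, 5, 20 Mbit/s.
Ranges: very low bitrate, low bitrate, medium bitrate, high bitrate.
Applications: mobile videophone, videophone over PSTN, ISDN videophone, Video CD, digital TV, HDTV.
Standards: MPEG-4, H.263, H.261, MPEG-1, MPEG-2.]
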
Standards

• H.26x
  – Standards for real time communication like video telephony and video conferencing.
  – Standardized by ITU.
• MPEG
  – Standards for stored video data like movies on CDs, DVDs, etc.
  – Standardized by ISO.

H.261

• Standard for ISDN picture phones in 1990.
• Motion compensation:
  – One motion vector per macroblock.
  – One macroblock = four 8 × 8 luminance blocks + two chrominance blocks (one U and one V).
  – Motion vectors max 15 pixels long in each direction.
• Format:
  – CIF (352 × 288) or QCIF (176 × 144)
  – 7.5 – 30 frames/s.
• Bitrate: Multiple of 64 kbit/s (= ISDN), including audio.
• Quality: Acceptable for small motion at 128 kbit/s.

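To get a feeling for the numbers, the uncompressed bitrate of these formats can be worked out as below. The calculation assumes 8 bits per sample (not stated on the slide) and 1.5 samples per pixel, which follows from the macroblock structure of four luminance blocks plus two chrominance blocks. The function names are hypothetical.

```python
def macroblocks_per_frame(width, height):
    """A macroblock covers 16 x 16 luminance pixels (four 8 x 8 blocks)."""
    return (width // 16) * (height // 16)


def raw_bitrate(width, height, fps, bits_per_sample=8):
    """Uncompressed bitrate in bit/s; 8 bits per sample is an assumption."""
    samples_per_pixel = 1.5    # 4 Y blocks + 1 U + 1 V per macroblock
    return width * height * samples_per_pixel * bits_per_sample * fps


cif = raw_bitrate(352, 288, 30)       # about 36.5 Mbit/s
ratio_at_128k = cif / 128_000         # roughly 285 times compression needed
```

CIF has 396 macroblocks per frame and QCIF 99. Even ignoring the audio share of the channel, CIF at 30 frames/s over 128 kbit/s needs a compression ratio of almost 300, which is consistent with quality being acceptable only for small motion.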
H.263

• Standard for picture telephones over analog subscriber lines in 1995.
• Format:
  – CIF, QCIF or Sub-QCIF.
  – Usually less than 10 frames/s.
• Bitrate: Typically 20 – 30 kbit/s.
• Quality: With new options as good as H.261 (at half the bitrate).

MPEG

• Moving Picture Experts Group: a committee under ISO and IEC.
• Original plan:
  – MPEG-1 for 1.5 Mbit/s (Video CD)
  – MPEG-2 for 10 Mbit/s (Digital TV)
  – MPEG-3 for 40 Mbit/s (HDTV)
• What happened:
  – MPEG-1 for 1.5 Mbit/s (Video CD)
  – MPEG-2 for 2 – 60 Mbit/s (TV and HDTV)
  – MPEG-4, -7 and -21 for other things.

MPEG-1

• ISO/IEC standard in 1991.
• Target bitrate around 1.5 Mbit/s (Video CD).
• Properties:
  – Bi-directionally predictively coded frames ("B-frames", see next slide).
  – More flexible than H.261.
  – Almost JPEG for intra frames.
• Format:
  – CIF
  – No interlace.
  – 24 – 30 frames/s.

[Figure: a group of frames (GOF) in display order:

I B B P B B P B B P B B I

I: intra-coded I-frame
P: predictively coded P-frames
B: bi-directionally predictively coded B-frames]

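A B-frame can only be decoded once both of its reference frames are available, so the coder has to send the following I- or P-frame before the B-frames that precede it in display order. The reordering is not shown on the slide; the sketch below illustrates it under the assumption that each B-frame refers to the nearest I- or P-frame on either side. The function name is hypothetical.

```python
def coding_order(display_types):
    """Return display indices in the order the frames would be coded."""
    order = []
    waiting_b = []
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            waiting_b.append(index)      # must wait for the next I- or P-frame
        else:
            order.append(index)          # the reference frame is coded first...
            order.extend(waiting_b)      # ...then the B-frames that depend on it
            waiting_b = []
    order.extend(waiting_b)              # trailing B-frames without a later reference
    return order
```

For the group I B B P B B P B B P B B I this gives the coding order 0, 3, 1, 2, 6, 4, 5, 9, 7, 8, 12, 10, 11, so the decoder must buffer frames and B-frames add delay.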
MPEG-1: Coding I-frames

• Intracoded
• 8 × 8 DCT
• Arbitrary weighting matrix for coefficients
• Predictive coding of DC-coefficients
• Uniform quantization
• Zig-zag, run-level, entropy coding

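The zig-zag scan and run-level step can be sketched as follows, assuming the block has already been transformed and quantized into a square list of rows of integers. The DC coefficient is returned separately since it is coded predictively, the end-of-block marker is left implicit, and the entropy coder is omitted. The function names are hypothetical.

```python
def zigzag_indices(n=8):
    """(row, column) pairs in zig-zag order, from low to high frequency."""
    def key(rc):
        r, c = rc
        diagonal = r + c
        # Odd anti-diagonals are walked downwards, even ones upwards.
        return (diagonal, r if diagonal % 2 else -r)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def run_level(block):
    """Return the DC value and (zero run, level) pairs for the AC coefficients."""
    scan = [block[r][c] for r, c in zigzag_indices(len(block))]
    dc, ac = scan[0], scan[1:]
    pairs = []
    run = 0
    for value in ac:
        if value == 0:
            run += 1                   # count zeros in front of the next level
        else:
            pairs.append((run, value))
            run = 0
    return dc, pairs                   # trailing zeros are implied by end-of-block
```

After quantization most high-frequency coefficients are zero, so the zig-zag order gathers them at the end of the scan, where they cost nothing beyond the end-of-block marker.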
MPEG-1: Coding P-frames

• Motion compensated prediction from I- or P-frame.
• Half-pixel accuracy of motion vectors, bilinear interpolation.
• Predictive coding of motion vectors.
• Prediction error coded as I-frame.

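Two of these points are sketched below: reading the reference frame at half-pixel positions with bilinear interpolation, and coding motion vectors as differences. The integer rounding rule and the choice of the previous macroblock's vector as predictor are assumptions made for the illustration, and the function names are hypothetical.

```python
def sample_half_pel(ref, x2, y2):
    """Reference sample at position (x2 / 2, y2 / 2), given in half-pixel units.

    Coordinates are assumed non-negative and inside the frame.
    """
    x, y = x2 // 2, y2 // 2
    fx, fy = x2 % 2, y2 % 2            # 1 if the position lies between two pixels
    a = ref[y][x]
    b = ref[y][x + fx]
    c = ref[y + fy][x]
    d = ref[y + fy][x + fx]
    return (a + b + c + d + 2) // 4    # bilinear average, rounded to nearest


def vector_differences(vectors):
    """Code each vector as the difference from the previous one (assumed predictor)."""
    previous = (0, 0)
    differences = []
    for vx, vy in vectors:
        differences.append((vx - previous[0], vy - previous[1]))
        previous = (vx, vy)
    return differences
```

At integer positions the interpolation returns the pixel itself; between two pixels it returns their rounded mean. Neighbouring macroblocks often move alike, so the vector differences are mostly small and cheap to entropy code.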
MPEG-1: Coding B-frames

• Motion compensated prediction from two consecutive I- or P-frames.
  – Forward prediction only (1 vector/macroblock).
  – Backward prediction only (1 vector/macroblock).
  – Average of fwd and bwd (2 vectors/macroblock).
• Otherwise as P-frames.

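The three prediction modes can be sketched as below. The sketch assumes the forward and backward blocks have already been motion compensated and flattened into lists of integers, and that the encoder picks the mode with the smallest sum of absolute differences; the selection rule and the rounding of the average are assumptions, and the names are hypothetical.

```python
def predict_block(fwd, bwd, mode):
    """Form the prediction for one block from its two reference blocks."""
    if mode == "forward":
        return list(fwd)
    if mode == "backward":
        return list(bwd)
    if mode == "average":
        return [(f + b + 1) // 2 for f, b in zip(fwd, bwd)]
    raise ValueError(f"unknown mode: {mode}")


def best_mode(cur, fwd, bwd):
    """Pick the mode with the smallest prediction error (SAD)."""
    def cost(mode):
        prediction = predict_block(fwd, bwd, mode)
        return sum(abs(c - p) for c, p in zip(cur, prediction))
    return min(("forward", "backward", "average"), key=cost)
```

Averaging helps when the current frame lies between its two references, for instance during a fade, while the single-direction modes handle content that is visible in only one of the references.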
MPEG-2

• ISO/IEC standard in 1994.
• Properties:
  – Handles interlace (optimized for TV)
  – Even more flexible than MPEG-1
• Format:
  – 352 × 288
  – 704 × 576 (25 frames/s) or 720 × 480 (30 frames/s)
  – 1440 × 1152 or 1920 × 1080 (HDTV)
• Bitrate:
  – 2 – 60 Mbit/s
  – ~4 Mbit/s: Image quality similar to PAL / NTSC / SECAM.
  – 18 – 20 Mbit/s: HDTV.

MPEG-2 (cont.)

• Profiles:
  – Simple profile without B-frames.
  – Scalable profiles.
• Experience shows that:
  – At 1.5 – 2 Mbit/s, MPEG-2 is not better than MPEG-1.
  – With manual interaction during coding, good quality can be achieved at 3 – 4 Mbit/s.
  – Problems with implementing the full standard have caused compatibility problems.
  – Buffering and rate control are hard problems.

MPEG-4

• ISO/IEC standard in 1998, version 2 in 1999.
• Instead of frames as coding units, MPEG-4 uses audio-visual objects.
• The focus is not primarily on compression, but on content-based functionality.
• Contains definitions of:
  – Media object types (video, audio, text, graphics, ...)
  – Parameters for describing the objects
  – Bitstream syntax for the (compressed) parameters
  – Scene description, file format, streaming, synchronization, ...
• Allows mixing of media objects.

• Part 1, Systems, contains
  – The bitstream syntax and the binary "language" for scene description
  – Computer graphics object descriptions
  – Multiplexing, transport, ...
• Part 2, Visual, contains
  – Video coding
  – Still image coding
  – Texture coding, ...
• Part 3, Audio, contains a toolbox of audio coders for different applications
• ...

• Instead of frames: Video Object Planes
• Coded with Shape Adaptive DCT

[Block diagram: TQ → VLC with TQ⁻¹ in the feedback loop; two further VLC branches feed a multiplexer (Mux). The remaining labels are not recoverable.]