MPEG 1&2 Video Compression

Idea:
• Predict the current frame based on what was in the previous frame (or possibly what is in the frames to either side).
• Subtract the predicted pixels from the actual values.
• Employ (essentially) JPEG compression on what is left.

Useful textbooks:
B. Haskell, A. Puri, and A. N. Netravali [1997]. Digital Video: An Introduction to MPEG-2, Kluwer Academic Publishers.
J. Mitchell, W. Pennebaker, C. Fogg, and D. LeGall [1997]. MPEG Video Compression Standard, Chapman and Hall.
V. Bhaskaran and K. Konstantinides [1995]. Image and Video Compression Standards, Kluwer Academic Press.
Step 1
• Use a YUV or equivalent color space.
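As an illustration of this step, here is a minimal sketch that converts one RGB pixel to a luminance/chrominance triple. The function name `rgb_to_ycbcr` and the full-range BT.601-style weights are assumptions chosen for the example; the slides only ask for "YUV or equivalent", and a real encoder would also round and clamp the results.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr).

    Assumption: BT.601 luma weights, full range, chroma centered on 128.
    No rounding or clamping is done here.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # gray scale (luminance) value
    cb = 128 + 0.564 * (b - y)              # blue color difference
    cr = 128 + 0.713 * (r - y)              # red color difference
    return y, cb, cr
```

A gray pixel (equal R, G, and B) maps to a luminance equal to that gray level and to chroma values at the neutral point 128, which is why the color planes tolerate heavier compression.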
Step 2
• Divide the current frame into macroblocks of 16 by 16 pixels.
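The partition into macroblocks can be sketched as follows. The helper `macroblocks` is hypothetical, represents a plane as a list of rows, and assumes the frame dimensions are multiples of 16 (a real encoder would pad the frame otherwise).

```python
def macroblocks(plane, size=16):
    """Yield (top, left, block) for each size-by-size macroblock of a 2-D list.

    Assumption: the plane's height and width are multiples of `size`.
    """
    height, width = len(plane), len(plane[0])
    for top in range(0, height, size):
        for left in range(0, width, size):
            # Slice `size` rows, then `size` columns out of each row.
            block = [row[left:left + size] for row in plane[top:top + size]]
            yield top, left, block
```

For example, a 32 by 48 plane yields six macroblocks, visited row by row from the top-left corner.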
Step 3
• "Down sample" the two color planes by a factor of two in each direction.
• So now we have a gray scale plane of n by n pixels and two color component planes of dimension n/2 by n/2.
• That is, each macroblock has four 8 by 8 blocks for the gray scale plane and one 8 by 8 block for each color plane.
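One simple way to down sample a color plane is to average each 2 by 2 group of pixels. The sketch below assumes that filter and assumes even plane dimensions; the slides do not say which filter is used, and `downsample_2x` is a hypothetical name.

```python
def downsample_2x(plane):
    """Halve a plane in each direction by averaging 2x2 pixel groups.

    Assumptions: `plane` is a list of rows with even height and width;
    plain averaging is one possible filter, not necessarily the one
    an actual encoder uses.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]
```

Applied to the 16 by 16 color samples of a macroblock, this leaves one 8 by 8 block per color plane, matching the block count above.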
Step 4
• The first frame to be encoded is always an I-frame ("intra-frame"), whose 8x8 blocks are encoded in a fashion similar to JPEG.
• P-frames ("predicted frames") can be encoded as follows: For each macroblock X of the current frame, determine the macroblock Y of the previous frame that is most similar to X (e.g., least mean squared error) and send the decoder a pair of numbers (a motion vector) giving the difference between the coordinates of Y and those of X.
• B-frames ("bidirectional" frames) are encoded by predicting from the frame to the left, from the frame to the right, or by averaging the predictions from both.
Note: B-frames cannot be predicted from other B-frames; that is, a sequence of B-frames must be preceded and followed by an I-frame or a P-frame. P-frames are always predicted from the closest previous P-frame or I-frame.
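The search for the most similar macroblock can be sketched as an exhaustive block-matching search using mean squared error. The names `block_mse` and `best_motion_vector`, the search range, and the exhaustive strategy are assumptions for illustration; practical encoders typically use faster searches.

```python
def block_mse(cur, ref, cy, cx, ry, rx, size):
    """Mean squared error between the block of `cur` at (cy, cx)
    and the block of `ref` at (ry, rx)."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            diff = cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx]
            total += diff * diff
    return total / (size * size)


def best_motion_vector(cur, ref, cy, cx, size=16, search=7):
    """Return (dy, dx, error) for the best match within +/- `search` pixels.

    (dy, dx) is the position of the matching reference block minus the
    position of the block being encoded.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = cy + dy, cx + dx
            # Skip candidates that fall partly outside the reference frame.
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue
            err = block_mse(cur, ref, cy, cx, ry, rx, size)
            if best is None or err < best[0]:
                best = (err, dy, dx)
    return best[1], best[2], best[0]


# Tiny demo with 4x4 blocks: every reference pixel is distinct, so the
# match is unambiguous. The current block at (4, 4) is a copy of the
# reference block at (5, 2).
ref = [[y * 12 + x for x in range(12)] for y in range(12)]
cur = [[0] * 12 for _ in range(12)]
for dy in range(4):
    for dx in range(4):
        cur[4 + dy][4 + dx] = ref[5 + dy][2 + dx]
demo_vector = best_motion_vector(cur, ref, 4, 4, size=4, search=3)
```

Only the pair (dy, dx) needs to be transmitted for the prediction; the remaining error is handled by the difference coding described in Step 5.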
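Step 4 continued
Typically I-frames appear at regular intervals in a pattern such as IBBPBBPBBI..., where the B-frames are transmitted slightly out of order, since a B-frame can be decoded only after the I- or P-frames on both sides of it have arrived (see Figure 8.6 of the Haskell, Puri, and Netravali book).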
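The reordering can be sketched as follows: each run of B-frames is held back until the I- or P-frame that follows it has been sent. The function `transmission_order` and the frame labels (a type letter followed by a display index) are hypothetical, and the handling of trailing B-frames is an assumption.

```python
def transmission_order(display):
    """Reorder frame labels from display order to transmission order.

    Each label starts with its frame type: "I", "P", or "B".
    """
    out, pending_b = [], []
    for frame in display:
        if frame[0] == "B":
            pending_b.append(frame)   # hold until the next I- or P-frame
        else:
            out.append(frame)         # send the I- or P-frame first
            out.extend(pending_b)     # then the B-frames that depend on it
            pending_b = []
    # Assumption: trailing B-frames with no later I- or P-frame go out as-is.
    out.extend(pending_b)
    return out


display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6", "B7", "B8", "I9"]
sent = transmission_order(display)
# sent is ["I0", "P3", "B1", "B2", "P6", "B4", "B5", "I9", "B7", "B8"]
```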
Step 5
For each block of an I-frame, encode the block itself; for each predicted block in a P-frame or B-frame, compute the difference block X between the original block and its prediction. In both cases, encode the result in a fashion similar to the JPEG standard (DCT, followed by quantization, followed by a combination of run-length and Huffman coding). The degree of quantization can be varied from frame to frame to accommodate a desired channel rate.
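The first two stages of this step (difference, DCT, quantization) can be sketched as below. This is a direct, unoptimized 8 by 8 DCT with a single uniform quantizer step; the uniform step and the names `residual`, `dct_2d`, and `quantize` are simplifying assumptions, and run-length and Huffman coding are omitted.

```python
import math

N = 8  # block size


def residual(actual, predicted):
    """Difference block: actual pixels minus predicted pixels."""
    return [[a - p for a, p in zip(row_a, row_p)]
            for row_a, row_p in zip(actual, predicted)]


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an 8x8 block (direct O(N^4) evaluation)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Uniform quantization; a larger step gives fewer bits and more loss."""
    return [[round(c / step) for c in row] for row in coeffs]


# Demo: the prediction is off by a constant 4 everywhere, so only the
# DC coefficient of the difference block is nonzero.
actual = [[104] * N for _ in range(N)]
predicted = [[100] * N for _ in range(N)]
levels = quantize(dct_2d(residual(actual, predicted)), step=8)
```

Raising `step` is the knob that corresponds to varying the degree of quantization to meet a channel rate.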
MPEG1 vs. MPEG2:
We have described what is common to MPEG1 and MPEG2. MPEG2 adds many new features, including:
• Interlaced video (frames can be divided into two fields consisting of the odd rows and the even rows of pixels).
• A number of new prediction modes that involve fields (and in some cases 8 by 16 blocks).
• Prediction at 1/2-pixel resolution (by interpolation).
• A rich system-level syntax that allows several video packet streams to be mixed together.
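Two of these features are easy to illustrate: splitting a frame into fields and sampling a reference frame at half-pixel positions. The sketch below uses hypothetical helpers `split_fields` and `half_pel_sample`, assumes non-negative coordinates that stay inside the frame, and uses plain averaging without the integer rounding a real codec would apply.

```python
def split_fields(frame):
    """Split a frame (list of rows) into two fields: rows 0, 2, 4, ...
    and rows 1, 3, 5, ..."""
    return frame[0::2], frame[1::2]


def half_pel_sample(ref, y2, x2):
    """Sample `ref` at (y2 / 2, x2 / 2), with coordinates given in
    half-pixel units.

    Integer positions return the pixel itself; half positions return the
    average of the two or four neighboring pixels.
    """
    y0, x0 = y2 // 2, x2 // 2
    y1 = y0 + (y2 % 2)   # next row only when y2 is a half position
    x1 = x0 + (x2 % 2)   # next column only when x2 is a half position
    return (ref[y0][x0] + ref[y0][x1] + ref[y1][x0] + ref[y1][x1]) / 4
```

For a 2 by 2 reference with rows [10, 20] and [30, 40], the sample halfway between all four pixels is their mean, 25.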