First Parts of H.264 Decoder
Chun-Chieh Lin
Contents
• H.264 Overview
• NAL Unit Unwrapping Details
• Entropy Decoding Details
• Hardware Design
• Design Explorations
• Benchmark Results
H.264 Overview
• Works on blocks of 4x4 to 16x16 pixels
• Encoder picks a way to approximate current block using previous data
• Residual data transformed in 4x4 blocks
• Almost everything is entropy coded
• Units of encoded data wrapped in Network Abstraction Layer (NAL)
NAL Unit Unwrapping
• Units separated by a 3-byte combination, the “start code prefix”
• End of a unit might be padded with bytes of value 0
• Encoder inserts bytes to prevent the start code prefix from appearing inside units
• Unwrapper reverses these effects
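The three unwrapping steps above can be sketched in software. This is a minimal whole-buffer sketch, not the hardware design from these slides. It assumes the byte-stream format of the H.264 standard, where the start code prefix is 0x000001 and the inserted byte is 0x03 following two zero bytes. The function names are hypothetical.

```python
START_CODE = b"\x00\x00\x01"  # 3-byte start code prefix (assumed: H.264 byte-stream format)


def split_nal_units(stream: bytes) -> list:
    """Split a byte stream into escaped NAL units at each start code prefix."""
    units = []
    pos = stream.find(START_CODE)
    while pos != -1:
        begin = pos + len(START_CODE)
        nxt = stream.find(START_CODE, begin)
        end = len(stream) if nxt == -1 else nxt
        # Zero bytes at the end of a unit are padding, not payload.
        unit = stream[begin:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        pos = nxt
    return units


def remove_inserted_bytes(unit: bytes) -> bytes:
    """Drop each 0x03 that the encoder inserted after two zero bytes."""
    out = bytearray()
    zeros = 0  # consecutive zero bytes seen so far
    for b in unit:
        if zeros >= 2 and b == 0x03:
            zeros = 0  # inserted byte: skip it and restart the count
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def unwrap(stream: bytes) -> list:
    """Return the payload of every NAL unit in the stream."""
    return [remove_inserted_bytes(u) for u in split_nal_units(stream)]
```

For example, `unwrap` applied to a stream holding two units returns two byte strings with the padding and inserted bytes removed.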
Entropy Decoding
• First checks the type of a NAL unit
• Parses the unit accordingly
• Most syntax elements coded with Exp-Golomb codes
• Transformed residual data coded with Context-based Adaptive Variable Length Coding (CAVLC)
Exp-Golomb Codes
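An Exp-Golomb codeword consists of n leading zero bits, a one bit, and n further information bits. The sketch below decodes the unsigned and signed variants as the H.264 standard defines them. The `BitReader` class and the function names are hypothetical helpers, not part of the design in these slides.

```python
class BitReader:
    """Reads bits from a byte string, most significant bit first."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # position of the next bit to read

    def read_bit(self) -> int:
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def read_bits(self, n: int) -> int:
        value = 0
        for _ in range(n):
            value = (value << 1) | self.read_bit()
        return value


def read_ue(reader: BitReader) -> int:
    """Unsigned Exp-Golomb: value = 2^n - 1 + (n information bits)."""
    leading_zeros = 0
    while reader.read_bit() == 0:
        leading_zeros += 1
    return (1 << leading_zeros) - 1 + reader.read_bits(leading_zeros)


def read_se(reader: BitReader) -> int:
    """Signed Exp-Golomb: code numbers 0, 1, 2, 3, 4 map to 0, 1, -1, 2, -2."""
    k = read_ue(reader)
    magnitude = (k + 1) >> 1
    return magnitude if k & 1 else -magnitude
```

The bit strings `1`, `010`, `011` and `00100` decode to the unsigned values 0, 1, 2 and 3.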
CAVLC
• Data encoded in several components
• Each component has a set of tables
• A table is chosen based on context
• Decoded result from neighboring blocks used as context for one component
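As an illustration of context-based table choice, the sketch below follows the rule the H.264 standard uses for the coefficient-count component: the number of nonzero coefficients in the left and upper neighboring blocks predicts which table to use. That this is the component the slide refers to is an assumption, the special chroma DC case is left out, and the function names and table indices are hypothetical.

```python
def predict_nc(left_total, top_total):
    """Predict the context value from the neighbors' nonzero coefficient counts.

    Each argument is a count, or None when that neighbor is unavailable.
    """
    if left_total is not None and top_total is not None:
        return (left_total + top_total + 1) >> 1  # rounded average
    if left_total is not None:
        return left_total
    if top_total is not None:
        return top_total
    return 0


def select_table(nc: int) -> int:
    """Map the context value to one of four tables (indices are illustrative)."""
    if nc < 2:
        return 0
    if nc < 4:
        return 1
    if nc < 8:
        return 2
    return 3  # in the standard this range uses fixed-length codes
```

A block whose left neighbor has 3 nonzero coefficients and whose upper neighbor has 4 gets the context value 4 and therefore the third table.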
Hardware Design
NAL Unwrapper Module States
• Three byte buffer
• Counter for number of bytes in buffer
• Counter for number of consecutive bytes with value 0
NAL Unwrapper Module Rules
• A rule fills the buffer
• A rule checks for start code prefix
• A rule removes extra bytes that prevent start code prefix from appearing in data
• A rule for normal operation
• A rule for end of file case
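The rule structure can be modeled loosely in software as a byte-at-a-time state machine in which one branch fires per input byte. This is a sketch under assumptions, not the actual hardware rules: it compares the three-byte buffer directly against the patterns instead of keeping a separate zero counter, it assumes the 0x000001 start code and 0x03 inserted byte of the H.264 standard, and the class and method names are hypothetical.

```python
class NalUnwrapperModel:
    """Software model of a streaming unwrapper with a three-byte buffer."""

    def __init__(self):
        self.buf = []         # pending bytes, at most three
        self.units = []       # one bytearray per NAL unit seen so far
        self.in_unit = False  # stays False until the first start code

    def _emit(self, byte):
        if self.in_unit:
            self.units[-1].append(byte)

    def push(self, byte):
        self.buf.append(byte)            # fill the buffer
        if len(self.buf) < 3:
            return
        if self.buf == [0, 0, 1]:        # start code prefix: begin a new unit
            self.units.append(bytearray())
            self.in_unit = True
            self.buf = []
        elif self.buf == [0, 0, 3]:      # inserted byte: keep the zeros, drop 0x03
            self._emit(0)
            self._emit(0)
            self.buf = []
        elif self.buf == [0, 0, 0]:      # zero padding: drop one zero
            self.buf.pop(0)
        else:                            # normal operation: pass one byte through
            self._emit(self.buf.pop(0))

    def finish(self):
        """End of file: discard trailing zeros, flush the rest."""
        while self.buf and self.buf[-1] == 0:
            self.buf.pop()
        for b in self.buf:
            self._emit(b)
        self.buf = []
        return [bytes(u) for u in self.units]
```

Feeding the model one byte per call to `push` and then calling `finish` yields the unescaped payload of each unit.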
Entropy Decoder States
• Parsing state register
• 77-bit input buffer
• Input buffer counter
• 16-element FIFO for intermediate results of CAVLC
• Registers for decoded syntax elements that are needed for parsing
Entropy Decoding Rules
• A rule for initialization
• A rule for checking the NAL unit type
• A rule for filling the input buffer
• A rule for parsing the data
• The parsing rule is essentially a large finite state machine
Design Exploration A
• Residual data (output of CAVLC) usually contains many consecutive zeros
• Original: outputs zeros one by one
• Change: outputs the number of consecutive zeros
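The difference between the two output schemes can be illustrated in software: instead of emitting one output per zero, each run of zeros becomes a single count. The token names and the function name below are hypothetical and do not describe the actual interface of the design.

```python
def run_length_tokens(coeffs):
    """Yield ("coeff", value) for nonzero values and ("zeros", n) for zero runs."""
    run = 0  # length of the current run of zeros
    for c in coeffs:
        if c == 0:
            run += 1
            continue
        if run:
            yield ("zeros", run)
            run = 0
        yield ("coeff", c)
    if run:
        yield ("zeros", run)  # trailing zeros at the end of the block
```

A 16-coefficient block with two nonzero values then takes four outputs instead of sixteen, which is consistent with the large drop in cycle count reported for change A.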
Design Exploration B
• Most Exp-Golomb syntax elements decode to at most 16 bits
• Some infrequent ones are up to 32 bits
• Original: use the same decoder function for both
• Change: two versions of the decoder
    - 1-cycle 16-bit decoder function
    - 32-bit decoder split into 2 parts (2 cycles)
Design Exploration C
• The input buffer filler and parser rules of the entropy decoder conflict
• Original: buffer filled one byte at a time
• Change: an extra 32-bit buffer is used
    - An extra rule adds bytes into the extra buffer
    - 32 bits inserted into the main buffer each time
Benchmarks
• Small clips of three different files
    - 5 frames with 176x144 resolution
    - 15 frames with 176x144 resolution
    - 5 frames with 352x288 resolution
Benchmark Results
Design     Total Cycles   Cycle Delay   Total Time   Area (mm^2)
Original   654290         6.468 ns      4.232 ms     0.3378
A          251524         6.405 ns      1.611 ms     0.3283
A+B        251552         5.955 ns      1.498 ms     0.2820
A+C        230750         6.400 ns      1.477 ms     0.3690
A+B+C      230712         6.184 ns      1.427 ms     0.2932
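The total time in each row is the cycle count multiplied by the cycle delay. The short check below recomputes it from the figures in the table and derives the overall speedup. The dictionary and function names are hypothetical.

```python
# (total cycles, cycle delay in ns) copied from the results table
RESULTS = {
    "Original": (654290, 6.468),
    "A": (251524, 6.405),
    "A+B": (251552, 5.955),
    "A+C": (230750, 6.400),
    "A+B+C": (230712, 6.184),
}


def total_time_ms(design: str) -> float:
    """Total time = cycles x cycle delay, converted from ns to ms."""
    cycles, delay_ns = RESULTS[design]
    return cycles * delay_ns / 1e6


def speedup(baseline: str, design: str) -> float:
    """How many times faster `design` is than `baseline`."""
    return total_time_ms(baseline) / total_time_ms(design)


if __name__ == "__main__":
    for name in RESULTS:
        print(f"{name:9s} {total_time_ms(name):.3f} ms")
    print(f"Original vs A+B+C: {speedup('Original', 'A+B+C'):.2f}x")
```

Rounded to three decimals, the recomputed times match the Total Time column, and the combined design A+B+C comes out a little under three times faster than the original.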