First Parts of H.264 Decoder
Chun-Chieh Lin

Contents
- H.264 Overview
- NAL Unit Unwrapping Details
- Entropy Decoding Details
- Hardware Design
- Design Explorations
- Benchmark Results

H.264 Overview
- Works on blocks of 4x4 to 16x16 pixels
- Encoder picks a way to approximate the current block using previous data
- Residual data is transformed in 4x4 blocks
- Almost everything is entropy coded
- Units of encoded data are wrapped in the Network Abstraction Layer (NAL)

NAL Unit Unwrapping
- Units are separated by a 3-byte combination, the “start code prefix”
- The end of a unit might be padded with bytes of value 0
- Encoder inserts bytes to prevent the start code prefix from appearing inside units
- Unwrapper reverses these effects

Entropy Decoding
- First checks the type of a NAL unit
- Parses the unit accordingly
- Most syntax elements are coded with Exp-Golomb codes
- Transformed residual data is coded with Context-based Adaptive Variable Length Coding (CAVLC)

Exp-Golomb Codes

CAVLC
- Data is encoded in several components
- Each component has a set of tables
- A table is chosen based on context
- Decoded results from neighboring blocks are used as context for one component

Hardware Design

NAL Unwrapper Module States
- Three-byte buffer
- Counter for the number of bytes in the buffer
- Counter for the number of consecutive bytes with value 0

NAL Unwrapper Module Rules
- A rule fills the buffer
- A rule checks for the start code prefix
- A rule removes the extra bytes that prevent the start code prefix from appearing in data
- A rule for normal operation
- A rule for the end-of-file case

Entropy Decoder States
- Parsing state register
- 77-bit input buffer
- Input buffer counter
- 16-element FIFO for intermediate results of CAVLC
- Registers for decoded syntax elements that are needed for parsing

Entropy Decoding Rules
- A rule for initializing
- A rule for checking the NAL unit type
- A rule for filling the input buffer
- A rule for parsing the data
  - Basically a large finite state machine

Design Exploration A
- Residual data (output of CAVLC) usually contains many consecutive zeros
- Original: outputs zeros one by one
- Change: outputs the number of consecutive zeros

Design Exploration B
- Most Exp-Golomb syntax elements decode to at most 16 bits
- Some infrequent ones are up to 32 bits
- Original: the same decoder function is used for both
- Change: two versions of the decoder
  - 1-cycle 16-bit decoder function
  - 32-bit decoder split into 2 parts (2 cycles)

Design Exploration C
- The input buffer filler and parser rules of the entropy decoder conflict
- Original: buffer filled one byte at a time
- Change: an extra 32-bit buffer is used
  - An extra rule adds bytes into the extra buffer
  - 32 bits are inserted into the main buffer each time

Benchmarks
- Small clips of three different files
  - 5 frames with 176x144 resolution
  - 15 frames with 176x144 resolution
  - 5 frames with 352x288 resolution

Benchmark Results

Design     Total     Cycle Delay   Total Time   Area (mm^2)
Original   654290    6.468 ns      4.232 ms     0.3378
A          251524    6.405 ns      1.611 ms     0.3283
A+B        251552    5.955 ns      1.498 ms     0.2820
A+C        230750    6.400 ns      1.477 ms     0.3690
A+B+C      230712    6.184 ns      1.427 ms     0.2932
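
The NAL unit unwrapping and Exp-Golomb decoding steps described in the slides can be illustrated in software. The Python below is a minimal sketch, not the hardware design presented here. It assumes the standard H.264 byte-stream conventions (start code prefix 0x000001, inserted byte 0x03 after two zero bytes) and the unsigned Exp-Golomb code; the function names are hypothetical and chosen only for this example.

```python
from typing import List, Tuple

START_CODE = b"\x00\x00\x01"  # assumed 3-byte start code prefix


def remove_inserted_bytes(unit: bytes) -> bytes:
    """Drop the 0x03 bytes the encoder inserted after two consecutive zero bytes."""
    out = bytearray()
    zeros = 0  # number of consecutive zero bytes seen so far
    for b in unit:
        if zeros >= 2 and b == 0x03:
            zeros = 0  # inserted byte: skip it and restart the zero count
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def split_nal_units(stream: bytes) -> List[bytes]:
    """Split a byte stream at start code prefixes and unwrap each unit."""
    units = []
    pos = stream.find(START_CODE)
    while pos != -1:
        start = pos + len(START_CODE)
        nxt = stream.find(START_CODE, start)
        raw = stream[start:] if nxt == -1 else stream[start:nxt]
        raw = raw.rstrip(b"\x00")  # remove zero padding at the end of the unit
        units.append(remove_inserted_bytes(raw))
        pos = nxt
    return units


def decode_ue(data: bytes, bit_pos: int = 0) -> Tuple[int, int]:
    """Decode one unsigned Exp-Golomb code; return (value, next bit position)."""

    def bit(i: int) -> int:
        # Bits are read most significant bit first within each byte.
        return (data[i >> 3] >> (7 - (i & 7))) & 1

    leading_zeros = 0
    while bit(bit_pos + leading_zeros) == 0:
        leading_zeros += 1
    bit_pos += leading_zeros + 1  # skip the zeros and the terminating 1 bit

    suffix = 0
    for _ in range(leading_zeros):
        suffix = (suffix << 1) | bit(bit_pos)
        bit_pos += 1
    return (1 << leading_zeros) - 1 + suffix, bit_pos
```

The `zeros` variable plays the same role as the counter of consecutive zero bytes listed among the unwrapper states. The number of leading zeros in `decode_ue` fixes the code length at `2 * leading_zeros + 1` bits, which is why a short decoder path can cover most syntax elements and a longer one is needed only for the infrequent wide ones.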