Low complexity AVS-M by implementing machine learning algorithm C4.5
By: Ramolia Pragnesh R.
Guided by: Dr. K. R. Rao
Term: Spring 2011

Motivation

- Demand for multimedia content over the internet and wireless networks is increasing.
- Bandwidth is too expensive a resource to increase in proportion to the growth in data demand.
- Video codecs play an important role here by compressing the data with high-efficiency tools.
- High efficiency in codecs comes with added complexity.
- Implementing hardware solutions for low-end devices such as mobile phones is very expensive and also creates overheating and power-consumption problems.

Brief overview of the thesis

Figure 1: Proposed encoder with C4.5

Table of contents

- Overview of AVS-M
- Complexity calculation in AVS-M
- Various approaches to reduce complexity
- Introduction to machine learning and algorithm C4.5
- Proposed encoder
- Results
- Future work
- References

Introduction to AVS-M [24]

- AVS-M is the seventh part of the video coding standard developed by the AVS working group of China, targeting mobile applications.
- It has 9 different levels for different formats [16].
- It supports only progressive video coding and hence codes frames only [22].
- It uses only the 4:2:0 chroma sub-sampling format [22].
- It uses only I and P frames [22].

Different parts of AVS [10]

Part  Name
1     System
2     Video
3     Audio
4     Conformance test
5     Reference software
6     Digital media rights management
7     Mobile video
8     Transmit AVS via IP network
9     AVS file format
10    Mobile speech and audio coding

Table 1: Different parts of AVS

Key tools of AVS-M [31]

- Network abstraction layer (NAL)
- Supplemental enhancement information (SEI)
- Transform: 4x4 integer transform
- Adaptive quantization, with step size varying from 0 to 63
- Intra prediction: 9 modes (Fig. 5), simple 4x4 intra prediction and direct intra prediction [25]
- Inter prediction: 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4 block sizes for ME/MC (Fig. 7)
- Quarter-pixel accuracy in motion estimation
- Simplified in-loop de-blocking filter
- Entropy coding
- Error resilience

Layered Data Structure

- Sequence
- Picture
- Slice
- Macroblock
- Block

Figure 2: Layered data structure of AVS-M (GOP, sequence, picture, slice, macroblock, block)

AVS-M Codec [10]

- Each MB needs to be intra or inter predicted.
- Switch S0 (Fig. 3) is used to decide between inter and intra prediction based on the type of MB.
- The unit size for intra prediction is a 4x4 block, and predictions are derived from the left and upper blocks.
- Inter predictions are derived from blocks of varying sizes (16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4) from locally reconstructed frames.
- Transform coefficients are coded by VLC.
- The deblocking filter is applied on the reconstructed image.

Encoder

Figure 3: Encoder of AVS-M [10]

Decoder

Figure 4: Decoder of AVS-M [10]

Intra adaptive directional prediction [25]

Figure 5: Intra adaptive directional prediction

Intra prediction

- The intra prediction scheme in AVS-M is much simpler than that of the H.264 baseline profile.
- It uses a 4x4 block as the unit for intra prediction.
- It uses 2 modes of intra prediction: intra 4x4 and direct intra prediction.
- Intra 4x4 uses a content-based most probable intra mode decision, as shown in Table 2, to save bits, where U and L represent the upper and left blocks as shown in Fig. 6.
- Direct intra prediction provides much of the compression, based on a trade-off decision.

Fig. 6: Current block and neighboring block representation (upper block U, left block L, current block) [16]

Intra prediction

U, L   -1   0   1   2   3   4   5   6   7   8
 -1     8   8   8   8   8   8   8   8   8   8
  0     8   0   0   2   0   0   0   2   0   2
  1     8   2   1   2   2   2   2   2   2   2
  2     8   2   2   2   2   2   2   2   2   2
  3     8   2   1   2   3   4   5   2   7   2
  4     8   4   4   2   4   4   4   6   4   4
  5     8   5   5   2   5   5   5   6   5   5
  6     8   6   6   6   6   6   6   6   6   6
  7     8   7   7   2   7   7   7   6   7   7
  8     8   0   1   2   3   4   5   6   7   8

Table 2: Content-based most probable mode decision table [25]. The row and column indices are the modes of the neighboring blocks U and L. Mode -1 is assigned to L or U when the current block does not have a left or upper block, respectively.

Inter frame prediction

- The size of the blocks in inter frame prediction can be 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 or 4x4, depending on the amount of information present within the macroblock [9].
- Motion is predicted
  up to quarter-pixel accuracy. If the half-pixel mv flag is 1, then it is up to half-pixel accuracy.
- Half-pixel and quarter-pixel accurate motion vectors are calculated by interpolating the reference frame with filters (Fig. 8).

Inter frame block sizes [9]

- 7 block sizes are present in AVS-M for inter frame prediction.

Figure 7: Inter frame prediction block sizes

Sub-pixel motion estimation by interpolation [15], [16]

Figure 8: Interpolation of sub-pixels. Hatched lines show half pixels, empty circles are quarter pixels, and capital letters represent full pixels.

Complexity calculation for AVS-M

- Inter mode has 7 variable block sizes.
- It supports 9 intra 4x4 modes and 1 direct intra prediction mode.
- Full search for motion estimation gives the optimum result, but at the cost of implementation complexity.
- For example, assuming full search (FS), M block types, N reference frames, and a search range equal to W for each reference frame and block type, the encoder must check $N \times M \times (2W + 1)^2$ positions to find the inter prediction mode and its motion vector, and that only at integer-pixel accuracy.

Continued

- 7 inter prediction modes, because of the 7 different block sizes.
- 9 intra 4x4 modes and 1 direct intra prediction mode.
- Quarter-pixel accuracy in motion vector estimation.

Various techniques to reduce complexity

- Intra mode selection algorithm [26]
- Only intra spatial prediction scheme [27]
- Fast mode decision algorithm for intra prediction for H.264/AVC [28]
- Dynamic control of motion estimation search parameters for low complexity H.264 [29]
- Adaptive algorithm for fast motion estimation [30]
- Adaptive algorithm for fast motion estimation in H.264/MPEG-4 AVC [4]

Introduction to machine learning [32]

- It is a branch of science that develops algorithms allowing computers to evolve their behavior based on data.
- Machine learning algorithms are applied in a large number of fields; machine vision, medical diagnostics, fraud transaction detection, image processing, wireless communication and market analysis are just a few among them.

Machine learning algorithm C4.5 [33]
- It was developed by J. R. Quinlan.
- It is a descendant of ID3 and CLS (C4.5 doc).
- It uses a divide-and-conquer approach to develop a tree.
- It uses two possible criteria to carry out a test at each node of the tree: information gain and gain ratio.
- The initial tree is pruned to avoid overfitting, which introduces errors in prediction.

Proposed encoder

Implementation steps [2]

- Select a number of frames of a video sequence in QCIF as training sequences.
- Obtain the required attributes …
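
The two split criteria named on the C4.5 slide, information gain and gain ratio, can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the attribute values ("low"/"high") and the class labels ("inter"/"intra") are hypothetical, and only a single categorical attribute is handled (C4.5 itself also supports continuous attributes and pruning).

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy, in bits, of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def info_gain_and_ratio(values, labels):
    """Return (information gain, gain ratio) for splitting `labels`
    by one categorical attribute whose per-sample values are `values`."""
    total = len(labels)
    # Partition the class labels by attribute value.
    subsets = {}
    for v, y in zip(values, labels):
        subsets.setdefault(v, []).append(y)
    # Expected entropy remaining after the split.
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the partition sizes themselves.
    split_info = entropy(values)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


# Hypothetical toy data: one attribute value and one mode label per macroblock.
attribute = ["low", "low", "high", "high"]
mode = ["inter", "inter", "intra", "intra"]
print(info_gain_and_ratio(attribute, mode))
```

Gain ratio divides the gain by the split information, which penalizes attributes that break the training set into many small subsets; this is the usual reason for preferring it over plain information gain.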