System Modeling and Software Implementation of MPEG-4 Encoder
Chen He
Shi Zhong
May 4, 2000

Problem Statement
• Real-time implementation of MPEG-4 encoder
– Computation-intensive
– Inherent parallelism
– Precedence preservation
– Flexible configuration

Our Approach
• System modeling using Computational Process Networks
– Deterministic concurrent model
– Precedence-preserving
• Software implementation
– C++, POSIX Threads
– Allen's CPN framework

PN Model of the Core Encoder
[Figure: block diagram of the core encoder. Nodes: Fork1, Sub_Pred, DCT, Quant, VLC, Inv. Quant, Inv. DCT, Add_Pred, Fork2, Fork3, ME, MC, Fork4, delay. Labels: Input frame, Motion Vector, Output bitstream.]

Finer Hierarchical Model of Motion Estimation Node
[Figure: block diagram of the ME Node. Blocks: Segmentation, ME1, ME2, MEn, Composition. Labels: Previous frame, Current frame.]

Software Implementation
• Node and queue design
– Data type and structure for node input, node output and tokens
• Code generation (time-consuming!)
– Based on existing C source code on the web
• Simulation
– Frame-based top level core encoder
– Platform: Single Intel Pentium III Xeon (733MHz?) processor, Linux, 256MB memory

Example of Nodes Execution
...
Encoding frame 0 ...
ForkNode starting .
ForkNode processed 1 frame(s).
ForkNode starting .
ForkNode processed 1 frame(s).
MENode starting.
MENode processed 1 frame(s).
MCNode starting.
MCNode processed 1 frame(s).
ForkNode starting .
ForkNode processed 1 frame(s).
SUBPredNode starting.
SUBPredNode processed 1 frame(s).
DCTNode starting.
DCTNode processed 1 frame(s).
QvlcNode starting.
QvlcNode processed 1 frame(s).
IQUANTNode starting.
IQUANTNode processed 1 frame(s).
IDCTNode starting.
IDCTNode processed 1 frame(s).
ADDPredNode starting .
ADDPredNode processed 1 frame(s).
...

Simulation Results
• Successful encoding results
– On test sequences (128*128, color format 4:2:0)
– Decodable and playable by existing MPEG player
• Faster than the original sequential encoder
– Even on a single processor!
– Benefits from concurrent model and Pthread implementation outweigh thread overheads
– Benefit margin may depend on the inherent parallelism exposed by the designed model and node granularity

Performance Evaluation
[Figure: "Comparison of Encoding Time". x-axis: Number of Frames Encoded; y-axis: Encoding Time (Seconds); series: Sequential, Proposed.]
[Figure: "Encoding Time Difference". x-axis: Number of Frames Encoded; y-axis: Encoding Time Improvement (%).]

Conclusion
• Our approach is
– Scalable to multi-processor environment (expected to have approximate linear speedup thus potentially feasible for real-time implementation)
– Faster due to concurrent execution (Pthread implementation of PN nodes)
• Future work
– Profiling the computation load of each node
– Evaluation on multi-processor
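
Illustrative sketch: process-network nodes as threads
The approach in these slides runs each process-network node as its own thread, with nodes exchanging tokens over FIFO queues. The following is a minimal sketch of that idea, not the authors' code and not the API of Allen's CPN framework. It assumes a bounded queue with blocking reads and writes, integer tokens standing in for frames, and C++ standard library threads, which are typically implemented on top of POSIX Threads on Linux. The names Channel, run_pipeline and kEndOfStream are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical bounded FIFO queue between two nodes.
// get() blocks while the queue is empty; put() blocks while it is full.
template <typename T>
class Channel {
public:
    explicit Channel(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);  // a zero-capacity queue could never accept a token
    }

    void put(T token) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return queue_.size() < capacity_; });
        queue_.push(std::move(token));
        not_empty_.notify_one();
    }

    T get() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !queue_.empty(); });
        T token = std::move(queue_.front());
        queue_.pop();
        not_full_.notify_one();
        return token;
    }

private:
    std::size_t capacity_;
    std::queue<T> queue_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
};

// Hypothetical end-of-stream marker; real tokens here are non-negative.
const int kEndOfStream = -1;

// Three nodes in a chain: source -> scale -> sink, one thread per node.
// The "scale" node doubles each token as a stand-in for real per-frame work.
std::vector<int> run_pipeline(int frame_count, std::size_t queue_capacity) {
    Channel<int> source_to_scale(queue_capacity);
    Channel<int> scale_to_sink(queue_capacity);
    std::vector<int> results;  // written only by the sink thread, read after join

    std::thread source([&] {
        for (int i = 0; i < frame_count; ++i) source_to_scale.put(i);
        source_to_scale.put(kEndOfStream);
    });

    std::thread scale([&] {
        for (;;) {
            int token = source_to_scale.get();
            if (token == kEndOfStream) {
                scale_to_sink.put(kEndOfStream);  // forward the marker downstream
                break;
            }
            scale_to_sink.put(token * 2);
        }
    });

    std::thread sink([&] {
        for (;;) {
            int token = scale_to_sink.get();
            if (token == kEndOfStream) break;
            results.push_back(token);
        }
    });

    source.join();
    scale.join();
    sink.join();
    return results;
}
```

Calling run_pipeline(5, 1) returns the tokens 0, 2, 4, 6, 8 in that order. Because every node only blocks on its queues and each queue preserves FIFO order, the result does not depend on how the threads are scheduled or on the queue capacity, which is the deterministic, precedence-preserving behaviour the slides attribute to the process-network model.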