Mixed Raster Content for Compound Image Compression
Final Project Presentation
EE 5359, Spring 2009
Submitted to: Dr. K. R. Rao
Submitted by: Pritesh Shah (1000555858)

MOTIVATION

In today's world, it is impossible to imagine a day without information or information exchange in digital form over the internet. With new advances in data processing systems and scanning devices, documents are present in a wide variety of printing systems. Documents in digital form are easy to store and edit, and can be transmitted within seconds. These documents may contain text, graphics, and pictures, so their storage requires huge memory, and their transmission requires high compression and high bit rates to avoid expense and delay.

INTRODUCTION

Compound images are documents containing both binary text and continuous-tone images. JPEG can be used for documents containing only pictures and graphics, but when compressing a compound document, MRC is found to have the upper hand. MRC uses a multi-layer, multi-resolution representation of a compound document. Instead of a single algorithm, it uses multiple compression algorithms, including ones specifically developed for text and for images. It can therefore combine the best of new or existing algorithms and provide various quality/compression-ratio tradeoffs.

Mixed Raster Content (MRC) Imaging Model

[Figure panels: FG, MASK, BG]
Fig. 1: MRC 3-plane configuration. The Foreground (FG) plane is poured into the Background (BG) through the Mask layer [1].

The 3-layer MRC model contains two colored image layers, foreground (FG) and background (BG), and one binary image layer, the mask [1]. The mask layer is the decisive layer in reconstructing the image from the FG and BG layers: when the pixel value in the mask layer is 1, the corresponding pixel from the FG is selected, and when it is 0, the corresponding pixel from the BG is selected. The final image is reconstructed as shown in Fig. 1.

MRC Framework for Scanned Data [4]

Fig. 2: MRC framework

In MRC, after the original single-resolution image is decomposed into layers, they are processed and
compressed using different algorithms. Fig. 2 shows how the original document X is input to the pre-processor, which yields the output Y (the 3-layer MRC model). The pre-processor also constructs an edge-sharpening map and estimates the original edge softness; this information is termed side information. The data is then encoded and sent to the decoder, where the reconstructed data Y, along with the side information, is used by the post-processor to assemble the reconstructed version X of the original document.

Decomposition approaches yielding the same reconstructed image

This figure describes the decomposition process. The two basic approaches are region classification (RC) and transition identification (TI).

In RC decomposition, the regions containing graphics and text are represented in a separate FG plane. Everything in those regions is represented in the FG plane, including the spaces between letters and other blank spaces. The mask, as shown, is uniform, with large patches clearly differentiating the text and graphics regions, and the background contains the document background, complex graphics, and/or continuous-tone pictures [1].

TI decomposition is quite similar to RC decomposition, as can be seen from the same figure. In TI, however, the mask and FG planes represent graphics and text in a different manner. The FG plane pours the ink of text and graphics through the mask onto the BG plane, so the mask must carry the necessary text contours. Hence the mask layer contains text characters, line art, and filled regions, while the FG layer contains the colors of the text letters and graphics.

In RC decomposition, the mask layer is very uniform and compresses very well. The FG, however, can contain edges and continuous-tone details, so it cannot be compressed well with typical continuous-tone coders such as JPEG. Because the mask layer in TI contains text objects and edges, it can be efficiently encoded using standard binary coders such as JBIG and JBIG2. The FG plane can be very efficiently coded even with coders
such as JPEG because it contains large uniform patches. The FG plane can also be sub-sampled without much loss in image quality [1].

RD plot modification in multiple MRC layers [1]

Fig. 3: RD plot modification

The benefit of using the MRC model for compression can be observed by analyzing its rate-distortion (RD) characteristics. As shown in Fig. 3, if image (a) is compressed with a generic coder A whose parameters are fixed except for a compression parameter, it will operate along an RD plot as shown in (b). Another coder B, under the same circumstances, performs better than coder A if its RD plot is shifted to the left, as shown in (c). The logic of MRC is to split the image into multiple planes, as shown in (d), and to apply coders C, D, and E to the planes, with RD plots similar to that of coder B. The equivalent coder may thus have a better RD plot than A, but there is an overhead associated with the multi-plane representation.

List of existing MRC-based encoders [2]

RER: Resolution Enhanced Rendering [2]

In RER, an adaptive error diffusion method is used to encode edge detail into the binary mask layer of the MRC document. The MRC document is then decoded using a nonlinear predictor to determine the relative amount of foreground and background color to be applied to each pixel. To minimize document image distortion, a method for jointly optimizing the parameters of the RER encoder and decoder is proposed. Simulation results indicate that the RER method can reduce document image distortion to a great extent at a fixed bit rate. The RER method is also fully compatible with the MRC standard and can be efficiently implemented in standard MRC encoders and decoders.

Training model of the optimized encoder and decoder

Fig. 4: Comparison of (a) MRC and (b) RER (Resolution Enhanced Rendering) encoders [2]

RER Encoder

In Fig. 4(b), the RER encoding module creates the dithered mask Ds as its output, taking the FG, the BG, the binary mask layer, and the original document as input. The Ds, FG
and BG are separately compressed using the binary image encoder and the continuous-tone encoder that are used for the MRC encoder. In the encoding module, an edge detection procedure is performed on the binary mask layer to determine the pixels that lie on the boundary between FG and BG, in order to compute the dithered mask. If Ks is 1, then s is a boundary pixel; otherwise it is a non-boundary pixel. s is obtained by Linear Projection. Then, at each boundary pixel, Ds is generated from s by switching on the error diffusion
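
As a rough illustration of two steps described in this presentation (the mask-driven selection of FG or BG pixels in the 3-layer model, and the marking of boundary pixels on the binary mask), a minimal Python sketch follows. The function names mrc_reconstruct and boundary_map, the 4-neighbour boundary test, and the toy data are assumptions made for illustration only; they are not taken from [1] or [2], and the sketch does not implement the RER error diffusion or the linear projection.

```python
def mrc_reconstruct(fg, bg, mask):
    """Pick the FG pixel where the mask is 1 and the BG pixel where it is 0."""
    rows, cols = len(mask), len(mask[0])
    return [
        [fg[r][c] if mask[r][c] == 1 else bg[r][c] for c in range(cols)]
        for r in range(rows)
    ]


def boundary_map(mask):
    """Return K, where K[r][c] = 1 if pixel (r, c) touches the FG/BG boundary.

    Assumption: a pixel is a boundary pixel when at least one of its
    4-connected neighbours has a different mask value. The source only
    says that an edge detection procedure is run on the mask.
    """
    rows, cols = len(mask), len(mask[0])
    k = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if inside and mask[rr][cc] != mask[r][c]:
                    k[r][c] = 1
                    break
    return k


# Toy 2x4 page: the right half is "ink" (mask = 1), the left half is paper.
RED, WHITE = (255, 0, 0), (255, 255, 255)
mask = [[0, 0, 1, 1],
        [0, 0, 1, 1]]
fg = [[RED] * 4 for _ in range(2)]    # FG layer: uniform ink colour
bg = [[WHITE] * 4 for _ in range(2)]  # BG layer: uniform paper colour

recon = mrc_reconstruct(fg, bg, mask)
k = boundary_map(mask)
print(recon[0])  # [(255, 255, 255), (255, 255, 255), (255, 0, 0), (255, 0, 0)]
print(k[0])      # [0, 1, 1, 0]
```

In this toy case the FG layer is a single uniform colour, which matches the TI-style observation that the FG plane consists of large uniform patches while the mask carries the contours. Only the two columns adjacent to the mask transition are flagged as boundary pixels.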