DREXEL CS 536 - Multiresolution Video - D2579275




Multiresolution Video
By Adam Finkelstein, Charles Jacobs, David Salesin

Video Data
  Time-varying image data
  Can be saved as a sequence of images
  All at the same resolution

New Structure
  Multiresolution Video
  Provides a means of capturing time-varying image data produced at multiple scales, both spatially and temporally.

Video Format
  Sparse binary tree of sparse quadtrees
  Binary tree encodes the flow of time
  Quadtrees encode the spatial decomposition of a frame

The Time Tree
  Refers to the binary tree
  Nodes correspond to a single image or frame of the video sequence at some resolution.
  Leaves of the time tree correspond to the frames at the highest resolution.

Time Tree Continued
  Internal nodes of the time tree correspond to a box-filtered average of their two children.
  Visually these frames appear as motion-blurred versions of their children.
  The time tree can grow to different depths.

Image Tree
  A quadtree is called an image tree
  Represents the multiresolution image content of a single frame of a video sequence.
  Leaves of the image tree correspond to pixels at the highest spatial resolution.

Image Tree Continued
  The internal nodes of the image tree correspond to a box-filtered average of their children.
  The image tree can grow to different depths.
  Data structures are provided in the paper.

The tree of quadtrees
  Binary Tree

1.2 Overview
The rest of this paper is organized as follows. Section 2 describes our representation for multiresolution video, and Section 3 describes how it is created and displayed. Section 4 describes a variety of applications of the multiresolution video representation, and Section 5 provides some concrete examples. Finally, Section 6 outlines some areas for future work. The appendix provides additional low-level operations useful for editing multiresolution video.

2 Representation
Our goals in designing a multiresolution video representation were fivefold.
We wanted it to:
  support varying spatial and temporal resolutions;
  require overall storage proportional only to the detail present (with a small constant of proportionality);
  efficiently support a variety of primitive operations for creating, viewing, and editing the video;
  permit lossy compression; and
  require only a small “working storage” overhead, so that video could be streamed in from disk as it is needed.

The rest of this section describes the multiresolution video format we chose and an analysis of the storage required.

2.1 The basic multiresolution video format
Perhaps the most obvious choice for a multiresolution video format would be a sparse octree [14], whose three dimensions were used to encode the two spatial directions and time. Indeed, such a representation was our first choice, but we found that it did not adequately address a number of the goals enumerated above. Put briefly, the problem with such a representation is that it couples the dimensions of space and time too tightly. In an octree structure, each node would correspond to a “cube” with a fixed extent in space and time. Thus, it would be efficient to rescale a video to, say, twice the spatial resolution only if it were equally rescaled in time, that is, played at half the speed. We therefore needed to develop a representation that, while still making it possible to take advantage of temporal and spatial coherence, could couple space and time more loosely.

The structure we ultimately chose is a sparse binary tree of sparse quadtrees. The binary tree encodes the flow of time, and each quadtree encodes the spatial decomposition of a frame (Figure 1). In the binary tree, called the Time Tree, each node corresponds to a single image, or frame, of the video sequence at some temporal resolution. The leaves of the Time Tree correspond to the frames at the highest temporal resolution for which information is present in the video.
Internal nodes of the Time Tree correspond to box-filtered averages of their two child frames. Visually, these frames appear as motion-blurred versions of their children. Note that this representation supports video sequences with varying degrees of temporal resolution simply by allowing the Time Tree to grow to different depths in different parts of the sequence. For convenience, we will call the child nodes of the Time Tree child time nodes and their parents parent time nodes. We will use capitalized names for any time node.

Time Tree nodes are represented by the following data structure:

  type TimeNode = record
    frame: pointer to ImageNode
    Half1, Half2: pointer to TimeNode
  end record

Figure 1 Binary tree of quadtrees. (Time Tree nodes are labeled with the time intervals they span, from [0,8] at the root down to [4,5] and [5,6]; each node points to an image tree.)

Each node of the Time Tree points to a sparse quadtree, called an image tree, which represents the multiresolution image content of a single frame of the video sequence. In analogy to the Time Tree, leaves of an image tree correspond to pixels at the highest spatial resolution for which information is present in the particular frame being represented. Internal nodes of an image tree correspond, once again, to box-filtered averages of their children, in this case to a 2×2 block of higher-resolution pixels. Note that the image tree supports varying spatial resolution simply by allowing the quadtree to reach different depths in different parts of the frame. We will call the child nodes of an image tree child image nodes and their parents parent image nodes. In our pseudocode we will use lower-case names for any image node.
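
The Time Tree described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a frame is simplified to a flat list of pixel values instead of a pointer to an image tree, and the helper names box_average, build_time_tree, and frame_at are hypothetical. Only the TimeNode fields (frame, Half1, Half2) come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: a frame is a flat list of pixel values, standing in for
# the 'pointer to ImageNode' of the TimeNode record in the text.
Frame = List[float]


@dataclass
class TimeNode:
    frame: Frame                        # image content for this time span
    half1: Optional["TimeNode"] = None  # first half of the span (Half1)
    half2: Optional["TimeNode"] = None  # second half of the span (Half2)


def box_average(a: Frame, b: Frame) -> Frame:
    """Box-filtered average of two equally sized frames."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def build_time_tree(frames: List[Frame]) -> TimeNode:
    """Build a complete Time Tree over a power-of-two number of frames.
    Each internal node stores the average of its two children."""
    if len(frames) == 1:
        return TimeNode(frame=frames[0])
    mid = len(frames) // 2
    first = build_time_tree(frames[:mid])
    second = build_time_tree(frames[mid:])
    return TimeNode(box_average(first.frame, second.frame), first, second)


def frame_at(node: TimeNode, t: float, t0: float, t1: float,
             max_depth: int) -> Frame:
    """Return the frame covering time t within [t0, t1), descending at
    most max_depth levels and stopping early where the tree is sparse."""
    if max_depth == 0 or node.half1 is None or node.half2 is None:
        return node.frame
    mid = (t0 + t1) / 2.0
    if t < mid:
        return frame_at(node.half1, t, t0, mid, max_depth - 1)
    return frame_at(node.half2, t, mid, t1, max_depth - 1)
```

Calling frame_at with a smaller max_depth returns a coarser, motion-blurred frame, while a larger value returns the finest frame stored for that part of the sequence. Because the descent stops wherever a node has no children, the same lookup works when the tree reaches different depths in different places.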
Figure 4 shows a frame from a video clip, where leaf nodes of the image tree are boxed in yellow.

Specifically, here is how we encode each node in the image tree:

  type ImageNode = record
    type: TREE | COLOR
    uplink: UpLinkInfo
    union
      tree: pointer to ImageSubtree
      color: PixelRGBA
    end union
  end record

  type ImageSubtree = record
    avgcolor: PixelRGBA
    child[0..1, 0..1]: array of ImageNode
  end record

Each subtree contains both the average color for a region of the image, stored as an RGBA pixel, and also image nodes for the four quadrants of that region. We compute the average of the pixels as if each color channel were premultiplied by alpha, as prescribed by Porter and Duff [12], but we do not actually represent the pixels that way in our image nodes, in order to preserve color fidelity in highly-transparent regions. Each image node generally contains a pointer to a subtree for each quadrant. However, if a given quadrant only has a single pixel’s worth of data, then the color of the pixel is stored in the node directly, in place of the pointer. (This trick works nicely, since an RGBA pixel value is
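
As a rough illustration of the image-node encoding and the alpha-weighted averaging described above, here is a minimal Python sketch. It rests on several assumptions: the TREE/COLOR tag and union are modeled with a Python Union type checked by isinstance, the uplink field is omitted because UpLinkInfo is not defined in this excerpt, channels are floats in [0, 1], and the helpers color_of, average_premultiplied, and make_subtree are hypothetical names. Dividing the averaged color by the averaged alpha to get back to a non-premultiplied value is one plausible reading of the text, not a confirmed detail of the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# Non-premultiplied color with channels in [0, 1] (assumed range).
RGBA = Tuple[float, float, float, float]


@dataclass
class ImageSubtree:
    avgcolor: RGBA                  # average color of the whole region
    child: List[List["ImageNode"]]  # 2x2 grid of quadrant nodes


# A node is either a bare pixel color (COLOR) or a subtree (TREE).
ImageNode = Union[RGBA, ImageSubtree]


def color_of(node: ImageNode) -> RGBA:
    """Color representing a node: its average if it is a subtree."""
    return node.avgcolor if isinstance(node, ImageSubtree) else node


def average_premultiplied(pixels: List[RGBA]) -> RGBA:
    """Average colors as if premultiplied by alpha, then convert the
    result back to a non-premultiplied RGBA value for storage."""
    n = len(pixels)
    alpha = sum(p[3] for p in pixels) / n
    if alpha == 0.0:
        return (0.0, 0.0, 0.0, 0.0)  # fully transparent region
    r = sum(p[0] * p[3] for p in pixels) / n / alpha
    g = sum(p[1] * p[3] for p in pixels) / n / alpha
    b = sum(p[2] * p[3] for p in pixels) / n / alpha
    return (r, g, b, alpha)


def make_subtree(c00: ImageNode, c01: ImageNode,
                 c10: ImageNode, c11: ImageNode) -> ImageSubtree:
    """Group four quadrant nodes and compute their box-filtered average."""
    children = [[c00, c01], [c10, c11]]
    colors = [color_of(c) for row in children for c in row]
    return ImageSubtree(avgcolor=average_premultiplied(colors), child=children)
```

With this weighting, fully transparent pixels contribute nothing to the averaged color: one opaque red pixel averaged with three transparent pixels of any color yields red at one quarter opacity. Storing single pixels directly as colors, and only larger regions as subtrees, mirrors the pointer-versus-color choice in the ImageNode record.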
