Unformatted text preview:

Parallel Volume Rendering Using Binary Swap Image Composition Presented by Jin Ding Spring 2002 Visualization and Advanced Computer Graphics Introduction Observations Existing volume rendering methods are very computationally intensive and therefore fail to achieve interactive rendering rates for large data sets Volume data sets can be too large for a single processor to machine to hold in memory at once Trading memory for time results in another order of magnitude increase in memory use An animation sequence of volume visualization normally takes hours to days to generate Massively parallel computers and hundreds of high performance workstations are available Goals Ways Distributing the increasing amount of data as well as the timeconsuming rendering process to the tremendous distributed computing resources available to us Parallel ray tracing Parallel compositing Implementations Hardware CM 5 and networked workstations The parallel volume renderer evenly distributes data to the computing resources available Each volume is ray traced locally and generates a partial image No need to communicate with other processing units The parallel compositing process then merges all resulting partial images in depth order to produce the complete image The compositing algorithm always makes use of all processing units by repeatedly subdividing partial images and distributing them to the appropriate processing units Background Background The major algorithmic strategy for paralleling volume rendering is the divide and conquer paradigm Data space subdivision assigns the computation associated with particular subvolumes to processors usually applied to a distributed memory parallel computing environment Image space subdivision distributes the computation associated with particular portions of the image space often used in shared memory multiprocessing environments The method to introduce here is a hybrid because it subdivides both data space during rendering and image space during compositing A Divide and Conquer Algorithm Starting Point The volume ray tracing technique presented by Levoy The data volume is evenly sampled along the ray usually at a rate of one or two samples per pixel The composting is front to back based on Porter and Duff s over operator which is associative a over b over c a over b over c The associativity allows breaking a ray up into segments processing the sampling and compositing of segment independently and combining the results from each segment via a final compositing step Data Subdivision Load Balancing The divide and conquer algorithm requires that we partition the input data into subvolumes Ideally each subvolumes requires about the same amount of computation Minimize the amount of data which must be communicated between processors during compositing Data Subdivision Load Balancing The simplest method is probably to partition the volume along planes parallel to the coordinate planes of the data If the viewpoint is fixed and known the data can be subdivided into slices orthogonal to the coordinate plane most nealy orthogonal to the view direction Produce subimages with little overlap and therefore little communications during compositing when orthographic projection is used When the view point is unknown a priori or perspective projection is used it is better to partition the volume equally along all coordinate planes Known as block data distribution and can be accomplished by gridding the data equally along each dimension Data Subdivision Load Balancing This method instead use a k D tree structure for data subdivision Alternates binary subdivision of the coordinate planes at each level in the tree When trilinear interpolation is used the data lying on the boundary between two subvolumes must be replicated and stored with both subvolumes Parallel Rendering Local rendering is performed on each processor independently Data communication is not required Only rays within the image region covering the corresponding subvolumes are cast and sampled Parallel Rendering Some care needs to be taken to avoid visible artifacts where subvolumes meet Consistent sampling locations must be ensured S2 1 S1 n S1 n S1 n 1 Or the subvolume boundary can become visible as an artifact in the final image Image Composition The final step is to merge ray segments and thus all partial images into the final total image Need to store both color and opacity The rule for merging subimages is based on the over compositing operator Subimages are composited in a front to back order This order can be determined hierarchically by a recursive traversal of the k D tree structure A natural way to achieve parallelism in compostion sibling nodes in the tree may be processed concurrently Image Composition A na ve approach is binary compositing Each disjoint pair of processors produces a new subimage N 2 subimages are left after the first stage Half the number of the original processors are paired up for the next level of compositing hence another half would be idle More parallelism must be exploited to acquire subsecond times The binary swap compositing method makes sure that every processor participates in all the stages of the process The key idea at each compositing stage the two processors involved in a composite operation split the image plane into two pieces Image Composition In early phases each processor is responsible for a large portion of the image area In later phases the portion of each processor gets smaller and smaller At the top of the tree all processors have complete information for a small rectangle of the image The final image can be constructed by tiling these subimages onto the display Two forces affect the size of the bounding rectangle as we move up the compositing tree the bounding rectangle grows due to the contributions from other processors but shrinks due to the further subdivision of image plane Image Composition The binary swap compositing algorithm for four processors Communication Costs At the end of local rendering each processor holds a subimage of size approximately p n 2 3 pixels where p is the number of pixels in the final image and n is the number of processors The total number of pixels over all n processors is therefore p n1 3 Half of these pixels are communicated in the first phase and some reduction in the total number of pixels will occur due to the depth overlap resolved in this compositing stage The k D tree partitioning splits each of the coordinate planes in


View Full Document

UTK CS 594 - Parallel Volume Rendering Using Binary-Swap Image Composition

Documents in this Course
Load more
Loading Unlocking...
Login

Join to view Parallel Volume Rendering Using Binary-Swap Image Composition and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Parallel Volume Rendering Using Binary-Swap Image Composition and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?