UW-Madison ME 964 - Midterm Progress


Slide titles:
Collision Detection Design & Final Project Topic
contact_data Allocation
Kernel Call Setup
Collide Kernel: Indexing
Collide Kernel: Contact Testing
Final Project: Monte Carlo Radiation Transport
Example: Fusion Reactor Shielding
Tasks during a Particle’s Life
Existing Fortran Code
Potential for Parallelism
Implementation Challenges
ME 964: Project Proposal Vikalp Mishra
Collision Detection
Final Project: Bone FEA
Why study femur ?
Background
Typical approach
Use of FEA
Bottleneck
GPU based approach
ME 964 – Midterm and Final Projects
CUDA Collision detection
Task Parallelism – pseudo code
Final Project
Example
Goals
Midterm Project
The Task
The Algorithm
Indexing
Final Project - Image Processing on the GPU
Proposed Implementations
Harris Corner Detector
Harris Corner Detector Contd..
Midterm and Final Projects
Slide 36
Slide 37
Slide 38
Final Project
Slide 40
Slide 41
Slide 42
Slide 43
Slide 44
ME 964 Midterm & Final Project
Outline
Efficient collision detection
Overview of method
Determine possible contacts
Three dimensional case
Need to verify collision
Verifying collision
Implementation in CUDA
Extending midterm to final project
References

Collision Detection Design & Final Project Topic
Brandon Smith
November 5, 2008
ME 964

contact_data Allocation
•Possible ways to allocate the contact_data array:
  –Allocate contact_data[ N(N-1)/2 ]
  –Allocate contact_data[ n_contacts ]
•To avoid creating a huge array, I chose the second method:
  –1st Kernel Call
    •Find the number of contacts.
  –2nd Kernel Call
    •Calculate the contact_data for each contact.

Kernel Call Setup
•The total number of contact tests is:
    n_tests = N(N-1)/2
•The total number of concurrent threads is:
    n_concurrent_threads = N_SMs * BLOCKS_PER_SM * THREADS_PER_BLOCK
•Each thread will perform several tests:
    n_test_per_thread = n_tests / n_concurrent_threads + 1

Collide Kernel: Indexing
•Given the block number and thread number, a range of test numbers (ki,kf) is generated:
    thread_id = bx*THREADS_PER_BLOCK + tx;
    ki = tests_per_thread*thread_id + 1;
    kf = ki + tests_per_thread - 1;
•Test numbers k by body pair (rows: body i = 1..4; entries listed in order of increasing j):
    i = 1:  1  2  4  7
    i = 2:  3  5  8
    i = 3:  6  9
    i = 4:
•Given a test number k, the indices (i,j) can be calculated:
    k = ( (j-1)^2 - (j-1) )/2 + i
    k <= ( j^2 - j )/2

Collide Kernel: Contact Testing
•__global__ function calls __device__ test to actually perform the contact test
•In the first pass it simply tests for contact
•In the second pass it calculates contact_data.
•atomicAdd is used to count the number of contacts
  –Keeps one contact tally for all concurrent threads
  –No need for condensation of results from each thread
  –Hassle to compile:
    nvcc.exe -ccbin "C:\Program Files\Microsoft Visual Studio 8\VC\bin" -c -arch sm_11 -D_CONSOLE -Xcompiler "/EHsc /W3 /nologo /Wp64 /O2 /Zi /MT " -I"C:\CUDA\include" -I"C:\Program Files\NVIDIA Corporation\NVIDIA CUDA SDK\common\inc" -o Release\collide.obj collide.cu

Final Project: Monte Carlo Radiation Transport
•Objective:
  –Compute radiation flux or derived quantities over a spatial/temporal domain.
•Method:
  –Follow the life of individual particles through the domain.
[Figure: 1D Half Absorber - Half Scatter Benchmark. Log (Flux) [n/cm^2*s] versus Distance [cm]; series: Diffusion Theory, Monte Carlo.]
•Quality of Results:
  –Statistical error is proportional to 1/sqrt(n_particles)
  –Difficult to get even particle distribution across the domain
  –Many particles are required to achieve low statistical error

Example: Fusion Reactor Shielding
•The GPU Advantage:
  –Increase the number of simulated particles
  –Decrease statistical error

Tasks during a Particle’s Life
•Birth: particles are created at a source
•Ray-cast: the distance to the next surface is calculated
•Collision: the particle interacts with matter
•Next volume: the particle crosses a boundary into another material
•Death: if the particle is absorbed, it is killed.
[Figure: Particle Tracking - No Collisions, No Overlaps. Points a, b, c; regions Volume 1, Volume 2, Complement.]
[Figure: Particle Tracking - Collision at a, No Overlaps. Points a, b, c, d; regions Volume 1, Volume 2, Complement.]

Existing Fortran Code
•Geometry:
  –3-D geometry supporting boxes and spheres
•Physics:
  –Only neutral particles (neutrons, photons)
  –No energy dependence
  –No time dependence
•Materials:
  –Simple materials (only a few isotopes)
•Sources:
  –point, line, area, volume
•Results:
  –mesh tallies and volume tallies

Potential for Parallelism
•Usually we can assume each particle is independent, unless:
  –criticality, weight windows, etc…
•Each thread could calculate independent particle trajectories
  –embarrassingly parallel
•When enough particles are simulated, condense the results from each thread

Implementation Challenges
•Current code is in Fortran 90
  –~1700 lines
  –Has anyone tried F2C?
    •Designed for Fortran 77
•Particles are tracked on a large mesh
  –~1 M mesh elements, accessed once per particle
  –Mesh will need to be in global memory
  –Mesh will be accessed with an atomic function for data sharing?
•Ensure that random numbers are not repeated
  –Use a pseudo-random number generator for each thread
  –Each thread will need a different random seed
  –Check to ensure sufficiently large stride
•Could schedule rendezvous to check for solution convergence
  –Stop simulation once statistical error falls below a set value ( 5% )

ME 964: Project Proposal
Vikalp Mishra

Collision Detection
•Aim
  –Solve collision detection problem given N rigid spheres in 3D space
•Approach
  –Brute Force
  –Compare each sphere with every other sphere
    •O(n^2)
  –If distance between centers is
    •More than sum of radii → No collision
    •Less than sum of radii → Collision
  –When collision detected
    •Compute normal and object IDs

Final Project: Bone FEA
•Title:
  –GPU based Finite Element Analysis of Femur
•Femur
  –Thigh bone: Bone between hip and knee joint
  –Longest/strongest bone in the body

Why study femur ?
•To better understand bone mechanics/properties
  –Across species
•To understand the impact & extent of injury under various loading
  –Use in sports medicine & surgery
•To study impact of DNA change on bone formation/growth
  –Improve the process of cloning to develop better species
•To study effect of nutrition cycle on bone development

Background
•In past
  –Experiments were done to study bone behavior / material properties
•Test performed
  –Fracture test
  –Bending test
  –Torsion test
•Experiments on mouse / pig
  –Costly and time consuming
  –Only one experiment per sample possible
•Alternative
  –Capture bone geometry and material properties
  –Use
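
The "Collide Kernel: Indexing" slide gives k as a function of (i,j) but does not show how a thread recovers (i,j) from k. The following is a minimal host-side C sketch of one way to do it, assuming 1-based body indices with i < j. The names BodyPair, pair_from_test_number, test_number_from_pair, and test_range_for_thread are hypothetical, and the linear search for j is chosen for clarity, not taken from the slides.

```c
#include <assert.h>

/* Hypothetical container for one body pair (1-based, i < j). */
typedef struct { long i; long j; } BodyPair;

/* Forward map from the slide: k = ((j-1)^2 - (j-1))/2 + i. */
long test_number_from_pair(long i, long j) {
    assert(i >= 1 && i < j);
    return ((j - 1) * (j - 1) - (j - 1)) / 2 + i;
}

/* Inverse map: find the smallest j with k <= (j^2 - j)/2, then solve for i.
 * The search is linear in j; a closed form using a square root is an
 * alternative that this sketch does not attempt. */
BodyPair pair_from_test_number(long k) {
    assert(k >= 1);
    long j = 2;
    while (k > j * (j - 1) / 2) {
        ++j;
    }
    BodyPair p;
    p.j = j;
    p.i = k - (j - 1) * (j - 2) / 2;
    return p;
}

/* Test range (ki, kf) for one thread, as written on the slide. */
void test_range_for_thread(long thread_id, long tests_per_thread,
                           long *ki, long *kf) {
    *ki = tests_per_thread * thread_id + 1;
    *kf = *ki + tests_per_thread - 1;
}
```

With 3 tests per thread, thread 2 would cover test numbers 7 through 9. Because each thread's range is rounded up, the last threads can receive k values above N(N-1)/2, so a kernel built this way would presumably need to skip those.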

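The two-pass scheme from the "contact_data Allocation" slide (count the contacts first, then allocate exactly n_contacts entries and fill them) can be illustrated with a sequential C reference. This is a sketch under stated assumptions, not the deck's kernel: the Sphere and Contact types and the function names are hypothetical, indices are 0-based, a plain counter stands in for the atomicAdd tally, and the contact test is the "distance less than sum of radii" rule from the brute-force slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical data layout: sphere center and radius. */
typedef struct { double x, y, z, r; } Sphere;
/* Hypothetical contact record: the two body indices (0-based). */
typedef struct { size_t i, j; } Contact;

/* Contact if the center distance is less than the sum of the radii.
 * Squared quantities are compared to avoid a square root. */
int spheres_touch(const Sphere *a, const Sphere *b) {
    double dx = a->x - b->x;
    double dy = a->y - b->y;
    double dz = a->z - b->z;
    double rsum = a->r + b->r;
    return dx * dx + dy * dy + dz * dz < rsum * rsum;
}

/* One sweep over all N(N-1)/2 pairs. With out == NULL it only counts
 * (first pass); otherwise it also records each contact (second pass). */
size_t sweep_pairs(const Sphere *s, size_t n, Contact *out) {
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = i + 1; j < n; ++j) {
            if (spheres_touch(&s[i], &s[j])) {
                if (out != NULL) {
                    out[count].i = i;
                    out[count].j = j;
                }
                ++count;
            }
        }
    }
    return count;
}

/* Pass 1 sizes the array, pass 2 fills it. The caller frees the result.
 * Returns NULL when there are no contacts or allocation fails. */
Contact *find_contacts(const Sphere *s, size_t n, size_t *n_contacts) {
    assert(s != NULL && n_contacts != NULL);
    *n_contacts = sweep_pairs(s, n, NULL);
    if (*n_contacts == 0) {
        return NULL;
    }
    Contact *c = malloc(*n_contacts * sizeof *c);
    if (c == NULL) {
        *n_contacts = 0;
        return NULL;
    }
    sweep_pairs(s, n, c);
    return c;
}
```

The trade-off is the one the slide names: every pair is tested twice, in exchange for never allocating the full N(N-1)/2 array.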

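The Monte Carlo slides state that statistical error is proportional to 1/sqrt(n_particles) and propose stopping once the error falls below a set value. Under that scaling alone, the particle count needed to reach a target error can be estimated from one measured run. The C sketch below assumes the proportionality holds exactly; the function name and parameters are hypothetical and do not come from the Fortran code described in the slides.

```c
#include <assert.h>

/* Estimate the particle count needed to reach err_target, given that a run
 * with n_current particles produced err_current, assuming the error scales
 * as 1/sqrt(n). The result is rounded up to a whole particle. */
unsigned long long particles_for_target_error(unsigned long long n_current,
                                              double err_current,
                                              double err_target) {
    assert(n_current > 0 && err_current > 0.0 && err_target > 0.0);
    double ratio = err_current / err_target;
    double needed = (double)n_current * ratio * ratio;
    unsigned long long n = (unsigned long long)needed;
    if ((double)n < needed) {
        ++n; /* round up without pulling in math.h */
    }
    return n;
}
```

Halving the error therefore costs roughly four times as many particles, which is the motivation the slides give for moving the simulation to the GPU.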