High Level Programming for GPGPU
February 8, 2008
University of Central Florida

AMD IL

Last time we introduced HLSL and the R600 ISA
AMD IL is a portable intermediate language that sits between high-level languages (Brook+ or HLSL) and the ISA
AMD IL is meant to be compatible across hardware generations, so that future hardware can compile from IL, whereas the ISA is ASIC dependent

Example

Brook+ Kernel

    kernel void sum(float a<>, float b<>, out float c<>)
    {
        c = a + b;
    }

Generated AMD IL

    il_ps_2_0
    dcl_cb cb0[1]
    dcl_resource_id(0)_type(2d,unnorm)_fmtx(float)_fmty(float)_fmtz(float)_fmtw(float)
    dcl_input_generic_interp(linear) v0.xy__
    dcl_resource_id(1)_type(2d,unnorm)_fmtx(float)_fmty(float)_fmtz(float)_fmtw(float)
    dcl_input_generic_interp(linear) v1.xy__
    sample_resource(0)_sampler(0) r0.x, v0.xy00
    sample_resource(1)_sampler(1) r1.x, v1.xy00
    mov r2.x, r0.xxxx
    mov r3.x, r1.xxxx
    call 0
    mov r4.x, r5.xxxx
    dcl_output_generic o0
    mov o0, r4.xxxx
    ret
    func 0
    add r6.x, r2.xxxx, r3.xxxx
    mov r7.x, r6.xxxx
    mov r5.x, r7.xxxx
    ret
    end

IL code generation

DX HLSL
  – Compile to DX asm using fxc (Microsoft HLSL compiler)
  – Compile DX asm to IL using AMD GPU Shader Analyzer
AMD HLSL
  – Compile AMD HLSL to IL using the AMD HLSL compiler
Brook+
  – Compile Brook+ kernels to IL using brcc
Or write it yourself

Writing IL code

IL resembles DX assembly
IL is also used for DirectX and OpenGL shaders
IL code will be optimized by the GPU compiler when it is compiled to ISA
Readability may be more important when writing IL

DX asm vs IL

DX asm

    ps_4_0
    dcl_input linear v0.xy
    dcl_output o0.xyzw
    dcl_sampler s0, mode_default
    dcl_sampler s1, mode_default
    dcl_resource_texture2d ( float , float , float , float ) t0
    dcl_resource_texture2d ( float , float , float , float ) t1
    dcl_temps 2
    sample r0.xyzw, v0.xyxx, t0.xyzw, s0
    sample r1.xyzw, v0.xyxx, t1.xyzw, s1
    add o0.xyzw, r0.xyzw, r1.xyzw
    ret

AMD IL

    il_ps_2_0
    dcl_input_interp(linear) v0.xy__
    dcl_output_generic o0
    dcl_resource_id(0)_type(2d)_fmtx(float)_fmty(float)_fmtz(float)_fmtw(float)
    dcl_resource_id(1)_type(2d)_fmtx(float)_fmty(float)_fmtz(float)_fmtw(float)
    sample_resource(0)_sampler(0) r0, v0.xyxx
    sample_resource(1)_sampler(1) r1, v0.xyxx
    add o0, r0, r1
    ret_dyn
    end

Instruction syntax

    <instr>[_<ctrl>][_<ctrl(val)>] [<dst>[_<mod>][.<write-mask>]] [, <src>[_<mod>][.<swizzle-mask>]]...

Broken down:
  – <instr>[_<ctrl>][_<ctrl(val)>]: instruction with control specifiers
  – [<dst>[_<mod>][.<write-mask>]]: destination register with modifier and write mask
  – [, <src>[_<mod>][.<swizzle-mask>]]...: source registers with modifier and swizzle mask

Registers

Registers are four-component vectors
  – v#: input registers
  – o#: output registers
  – r#: general purpose registers
There are also other special enumerated registers
Registers are typeless. Integer instructions can operate on float data. The user must keep track of register types and convert where needed

Register modifiers

Destination modifiers apply extra operations to the destination register after the instruction runs
Examples:
  – <dst>_x2 multiplies by 2
  – <dst>_d4 divides by 4
Source modifiers apply extra operations to the source register before the instruction runs
Examples:
  – <src>_abs returns the absolute value
  – <src>_sign returns the sign

Write masks

Element-wise write masks on destination registers control which components are written
Syntax: reg.{x|_|0|1}{y|_|0|1}{z|_|0|1}{w|_|0|1}
  – x, y, z, w are components
  – underscore "_" means don't write (can also be left blank)
  – 0 or 1 replaces the component with 0 or 1
Mask is position dependent
Examples:
  – mov r0.x___, r1 ; move r1.x to r0.x and leave the rest unchanged
  – mov r0.x, r1 ; same as previous
  – mov r0.y_w, r1 ; only write to r0.y and r0.w and leave the rest
  – mov r0.0000, r1 ; zero out all components
  – mov r0.xyz1, r1 ; write all except w, which changes to 1

Swizzle mask

Controls how the register components are used
Syntax: reg.{x|y|z|w|0|1}{x|y|z|w|0|1}{x|y|z|w|0|1}{x|y|z|w|0|1}
Mask is position independent
Blanks mean use the default component
Examples:
  – mov r0, r1.yxzw ; move r1.y->r0.x, r1.x->r0.y, r1.z->r0.z, r1.w->r0.w
  – mov r0, r1.yx ; same as previous
  – mov r0, r1.xyz0 ; standard move except force r0.w to zero

Instructions

Declaration and initialization
Input (memory fetches)
Conversion
General ALU instructions (Math/Trig/Special)
Flow control
Bitwise
Double
Comparison

Declaration and initialization

Inputs and outputs must be declared
  – constant buffers
  – input interpolators
  – resources (textures/streams)
  – literals

Constant buffers

In the past, constants were passed to the shader individually
Now, constants are passed together in constant buffers
Constant buffer elements are four-component vectors
Declaration: dcl_cb cb<#>[<size>]
Buffer elements are addressed like C arrays

Constant literals

Literals are constant values written in the source code, for example the 5 in int a = 5;
Previously, literals had to be passed in like constant variables
Syntax: dcl_literal l#, <x-bits>, <y-bits>, <z-bits>, <w-bits>
Example:
    dcl_literal l1, 0x40A00000, 0x3F800000, 0x3E99999A, 0x3E0F5C29
  – float literal float4(5, 1, 0.3, 0.14)
    dcl_literal l2, 0x00000003, 0xFFFFFFFF, 0xFFFFFFFB, 0x00000007
  – integer literal int4(3, -1, -5, 7)

Declaring memory

Syntax: dcl_resource_id(n)_type(pixtexusage[,unnorm])_fmtx(fmt)_fmty(fmt)_fmtz(fmt)_fmtw(fmt)
Example:
    dcl_resource_id(0)_type(2d)_fmtx(float)_fmty(float)_fmtz(float)_fmtw(float)
  – 2D texture of float4
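
The write-mask and swizzle rules described in the slides can be checked with a small model. The Python sketch below is not AMD tooling and is not taken from the slides; the function names `swizzle` and `mov` are hypothetical, and the way short masks such as `y_w` are interpreted (letters select components by name, `0` and `1` apply to the position where they appear) is an assumption chosen so that the slide examples come out as described.

```python
COMPONENTS = "xyzw"


def swizzle(src, mask=""):
    """Return the four values an instruction would read from `src`.

    Each mask position selects x, y, z, w, or the constant 0 or 1.
    Missing positions fall back to the default component for that slot.
    """
    out = []
    for i in range(4):
        sel = mask[i] if i < len(mask) else COMPONENTS[i]
        if sel in "01":
            out.append(float(sel))
        else:
            out.append(src[COMPONENTS.index(sel)])
    return out


def mov(dst, write_mask, src, swizzle_mask=""):
    """Toy model of `mov dst.<write_mask>, src.<swizzle_mask>`.

    Returns the new contents of dst as a list of four floats.
    Assumption: an empty write mask writes all four components.
    """
    value = swizzle(src, swizzle_mask)
    if write_mask == "":
        return value
    result = list(dst)
    for pos, ch in enumerate(write_mask):
        if ch == "_":
            continue                      # leave this component unchanged
        if ch in "01":
            result[pos] = float(ch)       # force the constant 0 or 1
        else:
            i = COMPONENTS.index(ch)
            result[i] = value[i]          # copy the matching source component
    return result


r0 = [9.0, 9.0, 9.0, 9.0]
r1 = [1.0, 2.0, 3.0, 4.0]

for wmask, smask in [("x___", ""), ("y_w", ""), ("0000", ""),
                     ("xyz1", ""), ("", "yxzw"), ("", "yx"), ("", "xyz0")]:
    print(f"mov r0.{wmask or 'xyzw'}, r1.{smask or 'xyzw'} ->",
          mov(r0, wmask, r1, smask))
```

Running it with r0 = (9, 9, 9, 9) and r1 = (1, 2, 3, 4) makes it easy to see which components each slide example touches; the real assembler may reject or interpret some mask spellings differently.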
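
Because dcl_literal takes raw 32-bit patterns rather than typed values, it can help to generate the hex words programmatically. The following Python sketch reproduces the two literal declarations from the slides; the helper names `float_bits`, `int_bits`, and `dcl_literal` are hypothetical, and the sketch assumes floats are IEEE 754 single precision and integers are 32-bit two's complement, which matches the bit patterns shown on the slide.

```python
import struct


def float_bits(value):
    """Return the IEEE 754 single-precision bit pattern of `value` as an int."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def int_bits(value):
    """Return the 32-bit two's-complement bit pattern of `value` as an int."""
    return value & 0xFFFFFFFF


def dcl_literal(index, bits):
    """Format four 32-bit words as an IL literal declaration string."""
    words = ", ".join("0x{:08X}".format(b) for b in bits)
    return "dcl_literal l{}, {}".format(index, words)


float_decl = dcl_literal(1, [float_bits(v) for v in (5.0, 1.0, 0.3, 0.14)])
int_decl = dcl_literal(2, [int_bits(v) for v in (3, -1, -5, 7)])

print(float_decl)
print(int_decl)
```

Since registers are typeless, the same declaration syntax serves both cases; only the encoding step differs, and the shader author has to remember which interpretation each literal is meant to have.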



UCF CAP 6938 - AMD IL
