Graphics acceleration

An example of line drawing by the ATI Radeon's 2D graphics engine

Bresenham's algorithm

Recall this iterative algorithm for doing a scanline conversion for a straight line.
It required five parameters:
    The starting endpoint coordinates: (X0, Y0)
    The ending endpoint coordinates: (X1, Y1)
    The foreground color for the solid-color line
It begins by initializing a decision variable, where deltaX = X1 - X0 and deltaY = Y1 - Y0:
    errorTerm = 2*deltaY - deltaX;

Algorithm's main loop

    for (int y = Y0, x = X0; x <= X1; ++x)
        {
        drawPixel( x, y, color );
        if ( errorTerm < 0 ) errorTerm += 2*deltaY;
        else { y += 1; errorTerm += 2*(deltaY - deltaX); }
        }

How much work for CPU?

Example: To draw the longest visible line (in 1024x768 graphics mode) will require approximately 10,000 CPU instructions.
The loop gets executed once for each of the 1024 horizontal pixels, and each pass through that loop requires about ten CPU operations (moves, compares, branches, adds and subtracts), plus the function calls.

Is acceleration possible?

The IBM 8514/A appeared in the late 1980s.
It could do line drawing (and some other common graphics operations) if just a few parameters were supplied.
So instead of requiring the CPU to do ten thousand operations, the CPU could do maybe ten operations, then let the 8514/A graphics engine do the rest of the work.

8514/A Block Diagram

[Block diagram. Labels: Graphics processor; RAMDAC (LUT, DAC); Display Monitor; VRAM memory; Display processor; CRT controller; Drawing engine; CPU; ROM; PC Bus Interface; PC Bus]

ATI improved on IBM's 8514/A

Various OEM vendors soon introduced their own graphics accelerator designs.
Because IBM had not released details of its design, others had to create their own programming interfaces, and all are different.
Early PC graphics software was therefore NOT portable between hardware platforms.

How does X300 draw lines?

To demonstrate the line-drawing ability of our classroom's Radeon X300 graphics processors, we wrote the 'drawline.cpp' demo.
We did not have access to ATI's official Radeon programming manual, but we had several such manuals from other vendors, and we found clues in source-code files for the
Linux Radeon device driver.

Programming concepts

Our demo program must first verify that it is running on a Radeon-equipped machine.
It must determine how it can communicate with the Radeon's graphics accelerator.
Normal VGA registers are at standard I/O port addresses, but the graphics engine is outside the scope of established standards.

Peripheral Component Interconnect

An industry committee led by Intel has established a standard mechanism that PC device drivers can use to identify the peripheral devices that a workstation has, and their mechanisms for communication.
To simplify the pre-boot execution code, modern PCs provide ROM-BIOS routines that can be called to identify peripherals.

PCI Configuration Space

Each peripheral device has a set of nonvolatile memory locations which store information about that device using a standard layout.

[Diagram. The 256-byte configuration space consists of the PCI CONFIGURATION HEADER (the first 64 bytes) followed by ADDITIONAL PCI CONFIGURATION DATA]

This device information is accessed via I/O port addresses 0x0CF8-0x0CFF.

PCI Configuration Header

Sixteen longword entries (64 bytes):

    offset 0x00    DEVICE ID | VENDOR ID
    ...
    offset 0x10    BASE ADDRESS for RESOURCE 0
    offset 0x14    BASE ADDRESS for RESOURCE 1
    offset 0x18    BASE ADDRESS for RESOURCE 2
    offset 0x1C    BASE ADDRESS for RESOURCE 3
    ...

VENDOR-ID = 0x1002: ATI Technologies Inc.
DEVICE-ID = 0x5B60: ATI Radeon X300 graphics processor
BASE ADDRESS for RESOURCE 1 is the 2D engine's I/O port.

Our 'findsvga.cpp' utility will show you the PCI Configuration Space for any peripheral devices of Class 0x030000 (i.e., VGA-compatible graphics cards).

Interface to PCI BIOS

Our 'dosio.c' device driver (and 'int86.cpp' companion code) allow us access to the BIOS.
The PCI BIOS services are accessible (in the Pentium's virtual-8086 mode) using function 0xB1 of software interrupt 0x1A.
There are several subfunctions; you can find documentation online, for example in Professor Ralf Brown's Interrupt List.

return radeon port address

Our demo invokes these PCI ROM-BIOS subfunctions to discover which I/O port our Radeon's 2D graphics engine uses:
    Subfunction 1: Detect BIOS presence
    Subfunction 3: Find Device in a Class
    Subfunction A: Read Configuration Dword
The Configuration Dword at offset 0x14 holds the I/O port address for the 2D graphics engine.

The ATI I/O Port Interface

    iobase + 0    MM_INDEX
    iobase + 4    MM_DATA

You output a register's index to the 'iobase + 0' address.
Then you have read or write access to that register at the 'iobase + 4' address.

Many 2D engine registers

You can peruse the 'radeon.h' header file to see names and register-index numbers for the Radeon 2D graphics accelerator.
You could also write a programming loop to input the contents from various offsets, and thereby get some idea of which ones appear to hold 'live' values.
There are many such registers (i.e., hundreds), but only a small number are used in line drawing.

Main Line-Drawing registers

    DP_GUI_MASTER_CNTL
    DP_BRUSH_FRGD_COLOR
    DP_BRUSH_BKGD_COLOR
    DP_WRITE_MSK
    DST_LINE_START
    DST_LINE_END

Others that affect drawing

    RB2D_DSTCACHE_MODE
    MC_FB_LOCATION
    DEFAULT_PITCH_OFFSET
    DST_PITCH_OFFSET
    SRC_PITCH_OFFSET
    DP_DATATYPE
    DEFAULT_SC_TOP_LEFT
    DEFAULT_SC_BOTTOM_RIGHT

CPU/GPU synchronization

[Diagram. Labels: Intel Pentium CPU; ATI Radeon GPU]

When the CPU off-loads the work of drawing lines (and doing other common graphical operations) to the Graphics Processing Unit, this frees up the CPU to execute other instructions.
But it opens up the possibility that the CPU will send more drawing commands to the GPU, even before the GPU has finished doing earlier commands.
Some mechanism is needed to prevent the GPU from becoming overwhelmed by the work the CPU sends it.
The solution is a FIFO for pending commands, plus a Status Register.

Engine has 64 FIFO slots

Before the CPU initiates a new drawing command, it checks to see if there are enough free slots in the command FIFO for storing that command's parameters.
The CPU can do busy-waiting until the GPU reports that enough FIFO slots are ready to accept new command arguments.
An alternative is interrupt-driven drawing.

Testing 'drawline.cpp'

We developed our 'drawline.cpp' demo on a Radeon 7000 graphics card, then tested it on a newer and faster Radeon
9250. Our code worked fine.
Tonight we shall try it on the Radeon X300.
If these various models of the Radeon are fully compatible with one another, we can expect our demo to work fine on the X300.

Hardware changes?

But if any significant differences exist in the various Radeon design generations, then we may discover that our 'drawline' fails to perform properly on an X300.
We would then have to explore the ways in which Radeon designs have changed, and try to devise fixes for any flaws that we have found in our
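
As an illustration of the Bresenham main loop recalled at the start of these notes, here is a minimal C++ sketch of the CPU-only approach. It is not the code from the demo: the `Framebuffer` type, its `at` helper, and the `drawLine` wrapper are hypothetical names introduced only for this example, and the sketch assumes a line whose slope lies between 0 and 1 with X0 <= X1 (other octants would need the usual coordinate swaps).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software framebuffer standing in for video memory.
struct Framebuffer {
    int width, height;
    std::vector<std::uint32_t> pixels;

    Framebuffer(int w, int h)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, 0) {}

    // Store one pixel, silently ignoring coordinates outside the screen.
    void drawPixel(int x, int y, std::uint32_t color) {
        if (x >= 0 && x < width && y >= 0 && y < height)
            pixels[static_cast<std::size_t>(y) * width + x] = color;
    }

    std::uint32_t at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Bresenham scan conversion for slopes between 0 and 1.
// Returns the number of passes through the loop (one per column).
int drawLine(Framebuffer& fb, int x0, int y0, int x1, int y1,
             std::uint32_t color) {
    const int deltaX = x1 - x0;
    const int deltaY = y1 - y0;
    // Assumption of this sketch: 0 <= deltaY <= deltaX.
    assert(deltaX >= 0 && deltaY >= 0 && deltaY <= deltaX);

    int errorTerm = 2 * deltaY - deltaX;   // the decision variable
    int passes = 0;
    for (int y = y0, x = x0; x <= x1; ++x) {
        fb.drawPixel(x, y, color);
        ++passes;
        if (errorTerm < 0) {
            errorTerm += 2 * deltaY;             // stay on the same row
        } else {
            y += 1;                              // step up one row
            errorTerm += 2 * (deltaY - deltaX);
        }
    }
    return passes;
}
```

Calling `drawLine(fb, 0, 0, 1023, 767, color)` on a 1024x768 `Framebuffer` makes 1024 passes through the loop, which is the per-pixel CPU work that a drawing engine takes over.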
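
The index/data register interface and the FIFO busy-wait described in these notes can be sketched in C++ as follows. This is only a sketch under stated assumptions, not the demo's code: `PortIO`, `IndexedRegs`, `waitForFifo` and `FakeEngine` are hypothetical names, the status-register index and free-slot mask are assumed values (the real ones should be taken from 'radeon.h'), and a simulated engine stands in for the hardware so the example can run without privileged port access.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Abstract 32-bit port I/O, so the sketch can run without real hardware.
// On a real machine these would wrap privileged port in/out instructions.
struct PortIO {
    virtual void out32(std::uint16_t port, std::uint32_t value) = 0;
    virtual std::uint32_t in32(std::uint16_t port) = 0;
    virtual ~PortIO() = default;
};

// ASSUMED values, for illustration only; consult radeon.h for real ones.
constexpr std::uint32_t kStatusIndex   = 0x0E40;  // status register index
constexpr std::uint32_t kFifoCountMask = 0x007F;  // free-slot count field

// Indexed access: write the register index to iobase+0 (MM_INDEX),
// then read or write the register itself at iobase+4 (MM_DATA).
class IndexedRegs {
public:
    IndexedRegs(PortIO& io, std::uint16_t iobase) : io_(io), iobase_(iobase) {}

    void write(std::uint32_t index, std::uint32_t value) {
        io_.out32(iobase_, index);
        io_.out32(dataPort(), value);
    }
    std::uint32_t read(std::uint32_t index) {
        io_.out32(iobase_, index);
        return io_.in32(dataPort());
    }

private:
    std::uint16_t dataPort() const {
        return static_cast<std::uint16_t>(iobase_ + 4);
    }
    PortIO& io_;
    std::uint16_t iobase_;
};

// Busy-wait until at least `needed` FIFO slots are free.
// Gives up after maxPolls status reads and returns false.
bool waitForFifo(IndexedRegs& regs, unsigned needed, unsigned maxPolls) {
    for (unsigned i = 0; i < maxPolls; ++i)
        if ((regs.read(kStatusIndex) & kFifoCountMask) >= needed)
            return true;
    return false;
}

// Simulated engine: each status read pretends one more slot has drained,
// up to the 64 slots the notes mention.
class FakeEngine : public PortIO {
public:
    FakeEngine(std::uint16_t iobase, unsigned freeSlots)
        : iobase_(iobase), freeSlots_(freeSlots) {}

    void out32(std::uint16_t port, std::uint32_t value) override {
        if (port == iobase_) index_ = value;
        else if (port == iobase_ + 4) regs_[index_] = value;
    }
    std::uint32_t in32(std::uint16_t port) override {
        if (port != iobase_ + 4) return 0;
        if (index_ == kStatusIndex) {
            if (freeSlots_ < 64) ++freeSlots_;
            return freeSlots_;
        }
        return regs_[index_];
    }

private:
    std::uint16_t iobase_;
    unsigned freeSlots_;
    std::uint32_t index_ = 0;
    std::map<std::uint32_t, std::uint32_t> regs_;
};
```

A caller would typically invoke `waitForFifo(regs, n, limit)` with `n` equal to the number of register writes it is about to make, and only then issue those writes. The poll limit is a design choice of this sketch so that a hung engine cannot stall the CPU forever.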