
Graphics acceleration
Bresenham’s algorithm
Algorithm’s main loop
How much work for CPU?
Is acceleration possible?
8514/A Block Diagram
ATI improved on IBM’s 8514/A
How does X300 draw lines?
Programming concepts
Peripheral Component Interconnect
PCI Configuration Space
PCI Configuration Header
Interface to PCI BIOS
return_radeon_port_address();
The ATI I/O Port Interface
Many 2D engine registers!
Main Line-Drawing registers
Others that affect drawing
CPU/GPU synchronization
Engine has 64 FIFO slots
Testing ‘drawline.cpp’
Hardware changes?
In-class exercises

Graphics acceleration
An example of line-drawing by the ATI Radeon’s 2D graphics engine

Bresenham’s algorithm
• Recall this iterative algorithm for doing a ‘scanline conversion’ for a straight line
• It required five parameters:
  – The starting endpoint coordinates: (X0,Y0)
  – The ending endpoint coordinates: (X1,Y1)
  – The foreground color for the solid-color line
• It begins by initializing a decision-variable, where deltaX = X1 - X0, deltaY = Y1 - Y0, and 0 <= deltaY <= deltaX:
  – errorTerm = 2*deltaY - deltaX;

Algorithm’s main loop
    for (int y = Y0, x = X0; x <= X1; x++)
        {
        drawPixel( x, y, color );
        if ( errorTerm >= 0 ) { y += 1; errorTerm += 2*(deltaY - deltaX); }
        else { errorTerm += 2*deltaY; }
        }

How much work for CPU?
• Example: Drawing the longest visible line (in 1024x768 graphics mode) requires approximately 10,000 CPU instructions
• The loop gets executed once for each of the 1024 horizontal pixels, and each pass through that loop requires about ten CPU operations: moves, compares, branches, adds and subtracts, plus the function-calls

Is acceleration possible?
• The IBM 8514/A appeared in the late 1980s
• It could do line-drawing (and some other common graphics operations) if just a few parameters were supplied
• So instead of requiring the CPU to do ten thousand operations, the CPU could do maybe ten operations, then let the 8514/A graphics engine do the rest of the work!

8514/A Block Diagram
[Block diagram. Labels: CPU, PC Bus, PC Bus Interface, Graphics processor, Display processor, Drawing engine, CRT controller, ROM, VRAM memory, LUT, DAC, RAMDAC, Display Monitor]

ATI improved on IBM’s 8514/A
• Various OEM vendors soon introduced their own graphics accelerator designs
• Because IBM had not released details of its design, others had to create their own programming interfaces, and all of them are different
• Early PC graphics software was therefore NOT portable between hardware platforms

How does X300 draw lines?
• To demonstrate the line-drawing ability of our classroom’s Radeon X300 graphics processors, we wrote the ‘drawline.cpp’ demo
• We did not have access to ATI’s official Radeon programming manual, but we had several such manuals from other vendors, and we found ‘clues’ in source-code files for the Linux Radeon device-driver

Programming concepts
• Our demo-program must first verify that it is running on a Radeon-equipped machine
• It must determine how it can communicate with the Radeon’s graphics accelerator
• Normal VGA registers are at ‘standard’ I/O port-addresses, but the graphics engine is outside the scope of established standards

Peripheral Component Interconnect
• An industry committee (led by Intel) has established a standard mechanism that PC device-drivers can use to identify the peripheral devices that a workstation has, and their mechanisms for communication
• To simplify the Pre-Boot Execution code, modern PCs provide ROM-BIOS routines that can be called to identify peripherals

PCI Configuration Space
[Diagram: PCI CONFIGURATION HEADER followed by ADDITIONAL PCI CONFIGURATION DATA; size labels shown: 1024 bytes, 256 bytes]
Each peripheral device has a set of nonvolatile memory-locations which store information about that device using a standard layout
This device-information is accessed via I/O Port-Addresses 0x0CF8-0x0CFF

PCI Configuration Header
[Diagram: header fields include VENDOR ID, DEVICE ID, and BASE-ADDRESS RESOURCE 0 through BASE-ADDRESS RESOURCE 3]
VENDOR-ID = 0x1002: ATI Technologies Inc.
DEVICE-ID = 0x5B60: ATI Radeon X300 graphics processor
BASE-ADDRESS for RESOURCE 1 is the 2D engine’s I/O port
Sixteen longword entries (64 bytes)
Our ‘findsvga.cpp’ utility will show you the PCI Configuration Space for any peripheral devices of Class 0x030000 (i.e., VGA-compatible graphics cards)

Interface to PCI BIOS
• Our ‘dosio.c’ device-driver (and ‘int86.cpp’ companion code) allow us access to the BIOS
• The PCI BIOS services are accessible (in the Pentium’s virtual-8086 mode) using function 0xB1 of software interrupt 0x1A
• There are several subfunctions; you can find documentation online, for example in Professor Ralf Brown’s Interrupt List

return_radeon_port_address();
• Our demo invokes these PCI ROM-BIOS subfunctions to discover which I/O Port our Radeon’s 2D graphics engine uses
  – Subfunction 1: Detect BIOS presence
  – Subfunction 3: Find Device in a Class
  – Subfunction A: Read Configuration Dword
• Configuration Dword at offset 0x14 holds the I/O Port-Address for the 2D graphics engine

The ATI I/O Port Interface
    MM_INDEX    iobase + 0
    MM_DATA     iobase + 4
You output a register’s index to the iobase + 0 address
Then you have read or write access to that register at the iobase + 4 address

Many 2D engine registers!
• You can peruse the ‘radeon.h’ header-file to see names and register-index numbers for the Radeon 2D graphics accelerator
• You could also write a programming loop to input the contents from various offsets and thereby get some idea of which ones appear to hold ‘live’ values (i.e., hundreds!)
• Only a small number are used in line-drawing

Main Line-Drawing registers
• DP_GUI_MASTER_CNTL
• DP_BRUSH_FRGD_COLOR
• DP_BRUSH_BKGD_COLOR
• DP_WRITE_MSK
• DST_LINE_START
• DST_LINE_END

Others that affect drawing
• RB2D_DSTCACHE_MODE
• MC_FB_LOCATION
• DEFAULT_PITCH_OFFSET
• DST_PITCH_OFFSET
• SRC_PITCH_OFFSET
• DP_DATATYPE
• DEFAULT_SC_TOP_LEFT
• DEFAULT_SC_BOTTOM_RIGHT

CPU/GPU synchronization
[Diagram: Intel Pentium CPU and ATI Radeon GPU]
When the CPU off-loads the work of drawing lines (and doing other common graphical operations) to the Graphics Processing Unit, this frees up the CPU to execute other instructions, but it opens up the possibility that the CPU will send more drawing commands to the GPU even before the GPU has finished doing earlier commands.
Some mechanism is needed to prevent the GPU from becoming overwhelmed by work the CPU sends it.
The solution is a FIFO for pending commands, plus a Status Register

Engine has 64 FIFO slots
• Before the CPU initiates a new drawing command, it checks to see if there are enough free slots in the command FIFO for storing that
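
The MM_INDEX/MM_DATA register pair and the FIFO free-slot check can be sketched together as follows. This is a minimal sketch under stated assumptions, not the code from ‘drawline.cpp’. The names PortIO, FakeRadeon, read_reg, write_reg and wait_for_fifo are hypothetical. The status-register index 0x0E40 and the free-slot count in its low 7 bits are assumptions borrowed from the Linux Radeon driver’s RBBM_STATUS definitions and do not come from the slides. The fake device stands in for real port input/output instructions so that the logic can run without hardware.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical abstraction over 32-bit port I/O so the sketch runs without hardware.
struct PortIO {
    virtual void out32(unsigned port, uint32_t value) = 0;
    virtual uint32_t in32(unsigned port) = 0;
    virtual ~PortIO() = default;
};

// Offsets from the engine's I/O base, as given in the slides.
constexpr unsigned MM_INDEX = 0;  // iobase + 0: select a register by its index
constexpr unsigned MM_DATA  = 4;  // iobase + 4: read or write the selected register

// ASSUMED values (Linux Radeon driver naming); the slides do not state them.
constexpr uint32_t RBBM_STATUS       = 0x0E40;
constexpr uint32_t RBBM_FIFOCNT_MASK = 0x007F;
constexpr unsigned FIFO_SLOTS        = 64;    // from the slide title

// Select a register through MM_INDEX, then read it through MM_DATA.
uint32_t read_reg(PortIO& io, unsigned iobase, uint32_t index) {
    io.out32(iobase + MM_INDEX, index);
    return io.in32(iobase + MM_DATA);
}

// Select a register through MM_INDEX, then write it through MM_DATA.
void write_reg(PortIO& io, unsigned iobase, uint32_t index, uint32_t value) {
    io.out32(iobase + MM_INDEX, index);
    io.out32(iobase + MM_DATA, value);
}

// Poll the status register until at least `needed` FIFO slots are free.
// Returns false if that never happens within `max_polls` reads.
bool wait_for_fifo(PortIO& io, unsigned iobase, unsigned needed,
                   unsigned max_polls = 100000) {
    assert(needed <= FIFO_SLOTS);
    for (unsigned i = 0; i < max_polls; ++i) {
        uint32_t free_slots = read_reg(io, iobase, RBBM_STATUS) & RBBM_FIFOCNT_MASK;
        if (free_slots >= needed) return true;
    }
    return false;
}

// Stand-in for the GPU: keeps registers in a map and frees one FIFO slot
// after each read of the status register, imitating commands being consumed.
struct FakeRadeon : PortIO {
    unsigned iobase;
    uint32_t index = 0;
    std::map<uint32_t, uint32_t> regs;

    explicit FakeRadeon(unsigned base) : iobase(base) { regs[RBBM_STATUS] = 0; }

    void out32(unsigned port, uint32_t value) override {
        if (port == iobase + MM_INDEX) index = value;
        else if (port == iobase + MM_DATA) regs[index] = value;
    }
    uint32_t in32(unsigned port) override {
        if (port != iobase + MM_DATA) return 0;
        uint32_t value = regs[index];
        if (index == RBBM_STATUS && regs[index] < FIFO_SLOTS) ++regs[index];
        return value;
    }
};
```

A caller would invoke wait_for_fifo with the number of register writes it is about to issue, and only then call write_reg for each of them. The poll limit keeps a wedged engine from hanging the CPU forever.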
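
For the PCI Configuration Space slides, the following sketch shows how a configuration dword such as the one at offset 0x14 is addressed at the port level. The slides obtain this value through PCI BIOS calls, so this is an alternative view and not the demo’s method. It assumes the standard PCI configuration mechanism #1, in which an address word is written to port 0x0CF8 and the data is then read from port 0x0CFC. The function names and the bus/device numbers in the usage note are hypothetical.

```cpp
#include <cstdint>

// Build the 32-bit address word for PCI configuration mechanism #1.
// Bit 31 enables the access; the offset is forced to a dword boundary.
constexpr uint32_t pci_config_address(uint32_t bus, uint32_t device,
                                      uint32_t function, uint32_t offset) {
    return 0x80000000u
         | ((bus      & 0xFFu) << 16)
         | ((device   & 0x1Fu) << 11)
         | ((function & 0x07u) << 8)
         |  (offset   & 0xFCu);
}

// Offset of BASE-ADDRESS RESOURCE 1 within the configuration header.
constexpr uint32_t PCI_BAR1_OFFSET = 0x14;

// A base-address register describes I/O ports when its bit 0 is set.
constexpr bool bar_is_io(uint32_t bar) { return (bar & 0x1u) != 0; }

// For an I/O base-address register, the port number is in bits 31..2.
constexpr uint32_t bar_io_port(uint32_t bar) { return bar & ~0x3u; }
```

For a hypothetical device at bus 1, device 0, function 0, pci_config_address(1, 0, 0, PCI_BAR1_OFFSET) evaluates to 0x80010014. A raw register value of 0xC001 would decode to I/O port 0xC000.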
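
To make the CPU-side cost of Bresenham’s algorithm concrete, here is a runnable form of the main loop. It is a sketch that covers only lines with 0 <= deltaY <= deltaX. The function name bresenham_first_octant is hypothetical, and the pixels are collected into a vector instead of being passed to drawPixel so that the result can be inspected.

```cpp
#include <utility>
#include <vector>

// Scan-convert a line from (x0,y0) to (x1,y1), assuming 0 <= (y1-y0) <= (x1-x0).
// Returns one pixel per x-coordinate, in drawing order.
std::vector<std::pair<int, int>> bresenham_first_octant(int x0, int y0,
                                                        int x1, int y1) {
    std::vector<std::pair<int, int>> pixels;
    const int deltaX = x1 - x0;
    const int deltaY = y1 - y0;
    int errorTerm = 2 * deltaY - deltaX;   // the decision-variable

    for (int x = x0, y = y0; x <= x1; ++x) {
        pixels.emplace_back(x, y);         // stands in for drawPixel( x, y, color )
        if (errorTerm >= 0) {
            y += 1;                        // step up to the next scanline
            errorTerm += 2 * (deltaY - deltaX);
        } else {
            errorTerm += 2 * deltaY;       // stay on the current scanline
        }
    }
    return pixels;
}
```

For the endpoints (0,0) and (4,2) this yields the pixels (0,0), (1,1), (2,1), (3,2), (4,2). The loop body runs once per x-coordinate, which is where the estimate of roughly ten operations for each of 1024 pixels comes from.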



USF CS 686 - Graphics acceleration
