ISU CPRE 583 - The Density Advantage of Configurable Computing

School name Iowa State University

Course Cpre 583- Reconfig Comptg Sys

The Density Advantage of Configurable Computing

André DeHon
California Institute of Technology

Cover Feature, Computer, April 2000. 0018-9162/00/$10.00 © 2000 IEEE

An examination of processors and FPGAs to characterize and compare their computational capacities reveals how FPGA-based machines achieve greater performance per unit of silicon area. If we can exploit this advantage across applications, configurable architectures can become an important part of general-purpose computer design.

A large and growing community of researchers has successfully used field-programmable gate arrays (FPGAs) to accelerate computing applications. The absolute performance achieved by these configurable machines has been impressive, often one to two orders of magnitude greater than processor-based alternatives. Configurable computers have proved themselves the fastest or most economical way to solve problems such as the following:

• RSA (Rivest-Shamir-Adleman) decryption. The programmable-active-memory (PAM) machine built at INRIA (Informatics and Automation Research Institute, Paris) and Digital Equipment Corporation’s Paris Research Lab achieved the fastest RSA decryption rate of any machine (600 Kbps with 512-bit keys, and 185 Kbps with 970-bit keys).
• DNA sequence matching. The Supercomputer Research Center’s Splash and Splash-2 configurable accelerators ran DNA-sequence-matching routines more than two orders of magnitude faster than contemporary MPPs (massively parallel processors) and supercomputers (CM-2, Cray-2) and three orders of magnitude faster than the attached workstation (Sparcstation I).
• Signal processing. Filters implemented on Xilinx and Altera components outperform digital signal processors (DSPs) and other processors by an order of magnitude. [1]
• Emulation. Chip designers use FPGA-based emulation systems to simulate modern microprocessors. [2]
• Cryptographic attacks. Collections of FPGAs offer the highest-performance, most cost-effective programmable approach to breaking difficult encryption algorithms. For example, Berkeley students showed that an Altera FPGA can search 800,000 keys per second, whereas a contemporary Pentium searches only 41,000 keys per second. [3]

From an operational standpoint, what we see in these examples is a reconfigurable device (typically an FPGA) completing, in one cycle, computations that take processors tens to hundreds of cycles.

Although these achievements are impressive, by themselves they do not tell us why FPGAs were so much more successful than their microprocessor and DSP counterparts. Do FPGA architectures have inherent advantages? Or are these examples just flukes of technology and market pricing? Can we expect the advantages to increase, decrease, or remain the same as technology advances? Can we generalize the factors that account for the advantages in these cases?

To attack these questions, we must quantify the density advantage of configurable architectures over temporal architectures, both empirically and with a simple area model. We must also understand the trade-offs that configurable architectures make to achieve this density advantage. Once we understand the trade-offs involved in using general-purpose computing blocks, we can expand the comparison to include custom hardware and functional units. Taking these effects together, we can see how configurable computing fits into the arsenal of structures we use to build general, programmable computing platforms.

CONFIGURABLE COMPUTING

Computing with FPGAs is called configurable computing because the computation is defined by configuration bits in the device that tell each gate and interconnect how to behave. Like processors, FPGAs are programmed after fabrication to solve virtually any computational task, that is, any task that fits in the device’s finite state and operational resources.
This impermanent, postfabrication customizability distinguishes processors and FPGAs from custom functional blocks, which are operationally set during fabrication and can implement only one function or a very small range of functions. (See the “Field-Programmable Gate Arrays” sidebar.)

Unlike processors, the primitive computing and interconnect elements in an FPGA hold only a single device-wide instruction. (Here, the term instruction broadly refers to the set of bits that control one cycle of operation of the postfabrication programmable device.) Without undergoing a lengthy reconfiguration, FPGA resources can be reused only to perform the same operation from cycle to cycle. In these configurable devices, we implement tasks by spatially composing primitive operators, that is, by linking them together with wires. In contrast, in traditional processors, we temporally compose operations by sequencing them in time, using registers or memory to store intermediate results (see Figure

Sidebar: Field-Programmable Gate Arrays

An FPGA is an array of bit-processing units whose function and interconnection can be programmed after fabrication. Most traditional FPGAs use small lookup tables to serve as programmable computational elements. The lookup tables are wired together with a programmable interconnect, which accounts for most of the area in each FPGA cell (Figure A). Many commercial devices use four-input lookup tables (4-LUTs) for the programmable processing elements because they are area efficient. [1] As their name implies, FPGAs were originally designed as user-programmable alternatives to mask-configured gate arrays: the bit-processing elements implementing the logic gates, and the programmable interconnect replacing selective gate wiring. [2] Increasingly, FPGAs have served as spatial computing devices.

Most of the examples mentioned in the introduction of this article use Xilinx XC4000 or Altera A8000 components as their main computational workhorses. These commercial architectures have several special-purpose features beyond the general model (for example, carry-chains for adders, memory modes, shared bus lines), but they are basically 4-LUT devices.

[Figure A. A three-input lookup table (3-LUT) FPGA. A programmable interconnect wires the lookup tables together to serve as programmable computational elements. Diagram labels: configuration memory, flip-flop, interconnect, action logic, LUT.]

References

1. J. Rose et al., “Architecture of Field-Programmable Gate Arrays: The Effect of Logic Block Functionality on Area Efficiency,” IEEE J. Solid-State Circuits, Oct. 1990, pp. 1217-1225.
2. S. Trimberger, Field Programmable Gate Arrays, Kluwer Academic, Norwell, Mass., 1992.
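
As a rough illustration of the lookup-table model and of spatial versus temporal composition described in this section, the Python sketch below models a k-input LUT as a truth table held in configuration bits. It then builds a one-bit full adder two ways: spatially (two LUTs, each with a fixed configuration, evaluated in one step) and temporally (one LUT reused over two steps, with a new configuration each step and results held in registers). The names make_lut, eval_lut, full_adder_spatial, and full_adder_temporal are hypothetical, and the model is a simplification that ignores interconnect, timing, and silicon area.

```python
from itertools import product

def make_lut(k, func):
    """Return the configuration bits (truth table) of a k-input LUT for func."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=k)]

def eval_lut(config, inputs):
    """Look up the output bit; the inputs form the table address (MSB first)."""
    addr = 0
    for bit in inputs:
        addr = (addr << 1) | bit
    return config[addr]

# Configuration bits for the two 3-input functions of a full adder
# (illustrative choice of functions, not taken from the article).
SUM_CFG = make_lut(3, lambda a, b, c: a ^ b ^ c)
CARRY_CFG = make_lut(3, lambda a, b, c: (a & b) | (a & c) | (b & c))

def full_adder_spatial(a, b, cin):
    """Spatial composition: two LUTs with fixed configurations, one step."""
    steps = 1
    return eval_lut(SUM_CFG, (a, b, cin)), eval_lut(CARRY_CFG, (a, b, cin)), steps

def full_adder_temporal(a, b, cin):
    """Temporal composition: one LUT reused across steps, given a new
    'instruction' (configuration) each step; registers hold the results."""
    registers = {}
    steps = 0
    for name, cfg in (("sum", SUM_CFG), ("carry", CARRY_CFG)):
        registers[name] = eval_lut(cfg, (a, b, cin))
        steps += 1
    return registers["sum"], registers["carry"], steps
```

Both functions return the same sum and carry bits. The spatial version uses two LUTs and one step, while the temporal version uses one LUT and two steps, which is a toy version of the space-time trade-off between configurable and processor-style devices.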
