Berkeley COMPSCI 250 - Lecture 10 – Design Verification

UC Regents Fall 2010 © UCB
CS 250 L10: Design Verification
2010-10-11

CS 250 VLSI System Design
Lecture 10 – Design Verification
John Wawrzynek and Krste Asanović, with John Lazzaro
TA: Yunsup Lee
www-inst.eecs.berkeley.edu/~cs250/


IBM Power 4: 174 Million Transistors

Slide excerpt from J. D. Warnock et al., IBM J. Res. & Dev., Vol. 46, No. 1, January 2002, p. 28:

... multi-site team, necessitating the development of ways to synchronize the design environment and data (as well as the design team).

In the following sections of this paper, the design methodology, clock network, circuits, power distribution, integration, and timing approaches used to meet these challenges for the POWER4 chip are described, and results achieved for POWER4 are presented.

Design methodology

The design methodology for the POWER4 microprocessor featured a hierarchical approach across multiple aspects of the design. The chip was organized physically and logically in a four-level hierarchy, as illustrated in Figure 2. The smallest members of the hierarchy are “macros,” typically containing 50 000 transistors. Units comprise approximately 50 related macros, with the microprocessor core made up of six units. The highest level is the chip, which contains two cores plus the units associated with the on-chip memory subsystem and interconnection fabric. This hierarchy facilitates concurrent design across all four levels. While the macros (blocks such as adders, SRAMs, and control logic) are being designed at the transistor and ...

Figure 1. POWER4 chip photograph showing the principal functional units in the microprocessor core and in the memory subsystem.

Figure 2. Elements in the physical and logical hierarchy used to design the POWER4 chip. Macros, units, core, and chip all generate initial timing and floorplan contracts.

Table 1. Features of the IBM CMOS 8S3 SOI technology.

    Gate Leff        0.09 µm
    Gate oxide       2.3 nm
    Metal layers     Pitch       Thickness
      M1             0.5 µm      0.31 µm
      M2             0.63 µm     0.31 µm
      M3–M5          0.63 µm     0.42 µm
      M6 (MQ)        1.26 µm     0.92 µm
      M7 (LM)        1.26 µm     0.92 µm
    Dielectric εr    ~4.2
    Vdd              1.6 V

Table 2. Characteristics of the POWER4 chip fabricated in CMOS 8S3 SOI.

    Clock frequency (fc)     >1.3 GHz
    Power                    115 W (@ 1.1 GHz, 1.5 V)
    Transistors              174,000,000
    Macros                   Unique      Total
      All                    1015        4341
      Custom                 442         2002
      RLM                    523         2158
      SRAM                   50          181
    Total C4s                6380
    Signal I/Os              2200
    I/O bandwidth            >500 Mb/s
    Bus frequency            1/2 fc
    Engineered wires         35K
    Buffers and inverters    100K
    Decoupling cap           300 nF

A complex design ...
96% of all bugs were caught before first tape-out.
First silicon booted AIX & Linux, on a 16-die system.
How ???


Three main components ...

(1) Specify chip behavior at the RTL level, and comprehensively simulate it.
(2) Use formal verification to show equivalence between Verilog RTL and circuit schematic RTL.
(3) Technology layer: do the electrons implement the RTL, at speed and power?

Today, we focus on (1).


Lecture Focus: Functional Design Test

Slide excerpt from IEEE Journal of Solid-State Circuits, Vol. 36, No. 11, November 2001, p. 1600:

Fig. 1. Process SEM cross section.

The process was raised from [1] to limit standby power. Circuit design and architectural pipelining ensure low voltage performance and functionality. To further limit standby current in handheld ASSPs, a longer poly target takes advantage of the versus dependence and source-to-body bias is used to electrically limit transistor in standby mode. All core nMOS and pMOS transistors utilize separate source and bulk connections to support this. The process includes cobalt disilicide gates and diffusions. Low source and drain capacitance, as well as 3-nm gate-oxide thickness, allow high performance and low-voltage operation.

III. ARCHITECTURE

The microprocessor contains 32-kB instruction and data caches as well as an eight-entry coalescing writeback buffer. The instruction and data cache fill buffers have two and four entries, respectively. The data cache supports hit-under-miss operation and lines may be locked to allow SRAM-like operation. Thirty-two-entry fully associative translation lookaside buffers (TLBs) that support multiple page sizes are provided for both caches. TLB entries may also be locked. A 128-entry branch target buffer improves branch performance in a pipeline deeper than earlier high-performance ARM designs [2], [3].

A. Pipeline Organization

To obtain high performance, the microprocessor core utilizes a simple scalar pipeline and a high-frequency clock. In addition to avoiding the potential power waste of a superscalar approach, functional design and validation complexity is decreased at the expense of circuit design effort. To avoid circuit design issues, the pipeline partitioning balances the workload and ensures that no one pipeline stage is tight. The main integer pipeline is seven stages, memory operations follow an eight-stage pipeline, and when operating in thumb mode an extra pipe stage is inserted after the last fetch stage to convert thumb instructions into ARM instructions. Since thumb mode instructions [11] are 16 b, two instructions are fetched in parallel while executing thumb instructions. A simplified diagram of the processor pipeline is shown in Fig. 2, where the state boundaries are indicated by gray. Features that allow the microarchitecture to achieve high speed are as follows.

Fig. 2. Microprocessor pipeline organization.

The shifter and ALU reside in separate stages. The ARM instruction set allows a shift followed by an ALU operation in a single instruction. Previous implementations limited frequency by having the shift and ALU in a single stage. Splitting this operation reduces the critical ALU bypass path by approximately 1/3. The extra pipeline hazard introduced when an instruction is immediately followed by one requiring that the result be shifted is infrequent.

Decoupled Instruction Fetch. A two-instruction deep queue is implemented between the second fetch and instruction decode pipe stages. This allows stalls generated later in the pipe to be deferred by one or more cycles in the earlier pipe stages, thereby allowing instruction fetches to proceed when the pipe is stalled, and also relieves stall speed paths in the instruction fetch and branch prediction units.

Deferred register dependency stalls. While register dependencies are checked in the RF stage, stalls due to these hazards are deferred until the X1 stage. All the
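
The idea behind component (1), simulating an RTL-level description and checking it against expected behavior, can be illustrated with a small sketch. It is written in Python rather than the Verilog the lecture refers to, and it is not taken from the slides. It models a shift-then-ALU instruction two ways: a one-step reference model, and a cycle-level model with the shifter and ALU in separate stages, as in the pipeline excerpt above. Random stimulus is applied and the two are compared. All names (`golden`, `TwoStagePipe`, `run_random_test`), the 32-bit width, the left-shift-only shifter, and the three ALU operations are assumptions made for illustration.

```python
import random

MASK = 0xFFFFFFFF  # assumed 32-bit datapath


def shift_left(value, amount):
    """Logical left shift, truncated to the datapath width."""
    return (value << amount) & MASK


def alu(op, a, b):
    """A tiny ALU supporting three hypothetical operations."""
    if op == "add":
        return (a + b) & MASK
    if op == "sub":
        return (a - b) & MASK
    if op == "and":
        return a & b
    raise ValueError(f"unknown op {op!r}")


def golden(op, a, b, shamt):
    """Reference model: shift and ALU evaluated in a single step."""
    return alu(op, a, shift_left(b, shamt))


class TwoStagePipe:
    """Cycle-level model with the shifter and ALU in separate stages."""

    def __init__(self):
        # Pipeline register between the shift stage and the ALU stage.
        self.stage_reg = None

    def clock(self, instr):
        """Advance one cycle. Returns the ALU-stage result, or None for a bubble."""
        result = None
        if self.stage_reg is not None:
            op, a, shifted_b = self.stage_reg
            result = alu(op, a, shifted_b)
        if instr is None:
            self.stage_reg = None
        else:
            op, a, b, shamt = instr
            self.stage_reg = (op, a, shift_left(b, shamt))
        return result


def run_random_test(num_instrs, seed=0):
    """Drive random instructions through the pipe and compare with the reference."""
    rng = random.Random(seed)
    stimulus = [
        (rng.choice(["add", "sub", "and"]),
         rng.getrandbits(32), rng.getrandbits(32), rng.randrange(32))
        for _ in range(num_instrs)
    ]
    stimulus.append(None)  # one idle cycle to drain the last instruction

    dut = TwoStagePipe()
    expected = []  # reference results awaiting comparison, in issue order
    checked = mismatches = 0
    for instr in stimulus:
        out = dut.clock(instr)
        if out is not None:
            checked += 1
            if out != expected.pop(0):
                mismatches += 1
        if instr is not None:
            expected.append(golden(*instr))
    return checked, mismatches


if __name__ == "__main__":
    checked, mismatches = run_random_test(1000)
    print(f"checked={checked} mismatches={mismatches}")
```

The reference model stays deliberately simple so that it is easy to trust, while the pipelined model carries the timing detail. A mismatch count above zero would point to a bug in one of the two. Random stimulus alone does not make a simulation comprehensive, so a real flow would also include directed tests and coverage measurement.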

