A bit about computer architecture
  CS 147, Fall Semester 2007
  Robert Correll

Overview
  RISC microprocessor design
  Diagnostic testing
  Software development
  Microprocessor features
  System-on-Chip (SoC)

RISC microprocessor design
  12 members on the team:
    o Design Manager (1)
    o ASIC Design Engineers (9)
    o Diagnostics Manager (1)
    o Software Engineer (1)
  Culture:
    o High-tech (Verilog)
    o Very quiet

Embedded 32-bit microprocessor
  Earns Editor's Choice Award
  Microprocessor Report Names IDT’s RC32364 Best Embedded Processor for Price/Performance
  (Volume 12, Number 7, June 1, 1998)

Embedded processor-based applications
  Low-end routers and switches
  Cellular base stations
  Consumer multimedia game systems

Device Overview
  MIPS-II RISC architecture with enhancements
    o Scalar 5-stage pipeline minimizes branch and load delays
    o DSP engine capable of one multiply-accumulate instruction every 2 clock cycles

Device Overview (continued)
  Enhanced instruction set architecture
    o MIPS-IV compatible conditional move instructions
    o MIPS-IV superset PREF (prefetch) instruction
    o Fast multiplier with atomic multiply-add, multiply-sub
    o Count leading zero/one instructions

Device Overview (continued)
  Large, efficient on-chip caches
    o Separate 8KB Instruction cache and 2KB Data cache
    o 2-way set associative
    o Write-back and write-through support on a per-page basis
    o Optional cache locking, with per-line resolution, to facilitate deterministic response
    o Simultaneous instruction and data fetch in each clock cycle achieves over 1 GB/sec bandwidth

Device Overview (continued)
  Flexible MMU with 32-page TLB
    o Variable page size
    o Enhanced write algorithm support
    o Variable number of locked entries
    o No performance penalty for address translation

Device Overview (continued)
  Flexible bus interface allows simple, low-cost designs
    o Bus interface runs at a fraction of pipeline rate
    o Programmable port-width interface (8-, 16-, 32-bit memory and I/O regions)
    o Programmable bus turnaround (BTA) times
    o Supports single datum or burst transactions
    o Selectable system byte-ordering

RC32364 Block Diagram

Diagnostic Testing
  Began with 300 tests and a behavioral model
  Downloaded 10 to 40 new tests per day
  One test per directory
  Build each test
  Run each test on an RTL model
  Debug and track failures
  Finished with more than 3,000 tests

Software Development
  Test Release System
    o Automated regression process
    o Distributed jobs based upon cycle counts
    o Provided customized history reports
  Accumulated load per signal utility
  Test vectors
  Many other value-added scripts
  Diagnostic tests

CPU Instruction Set

Load Link Store Conditional Opcodes
    li    $9, 1
    sw    $9, 0($6)
    .word 0xc0850000      # opcode: ll $5, 0($4)
    bne   $5, $0, Fail    # verify sem = 0
    li    $5, 2
    li    $9, 2
    sw    $9, 0($6)
    .word 0xe0850000      # opcode: sc $5, 0($4)
    bne   $5, $8, Fail    # verify sc indicates success
    li    $8, 2

CPU Pipeline Architecture

CPU Pipeline Stages
  1I - Instruction Fetch, Phase one
    o Instruction address translation begins.
  2I - Instruction Fetch, Phase two
    o Instruction cache fetch begins.
    o Instruction address translation continues.

CPU Pipeline Stages (continued)
  1R - Register Fetch, Phase one
    o The instruction cache fetch finishes.
    o The instruction cache tag is checked against the physical page frame number obtained from the address translation.

CPU Pipeline Stages (continued)
  2R - Register Fetch, Phase two
    o The instruction decoder decodes the instruction.
    o Any required operands are fetched from the register file.
    o A decision is made to either issue or slip (for an interlock condition).
    o For a branch, the branch address is calculated.

CPU Pipeline Stages (continued)
  1A - Execution, Phase one
    o Any result from the A or D stages is bypassed.
    o The arithmetic logic unit (ALU) starts the integer arithmetic, logical, or shift operation.
    o The ALU calculates the data virtual address for load and store instructions.
    o The ALU determines whether the branch condition is true.

CPU Pipeline Stages (continued)
  2A - Execution, Phase two
    o The integer arithmetic, logical, or shift operation completes.
    o A data cache access starts.
    o Store data is shifted to the specified byte position(s).
    o The data virtual-to-physical address translation starts.

CPU Pipeline Stages (continued)
  1D - Data Fetch, Phase one
    o The data cache access continues.
    o The data address translation completes.
  2D - Data Fetch, Phase two
    o The data cache access finishes, and the data is then shifted down and extended.
    o The data cache tag is checked against the physical address for any data cache access.

CPU Pipeline Stages (continued)
  1W - Write Back, Phase one
    o The processor uses this phase internally to resolve all exceptions in preparation for the register file write.
  2W - Write Back, Phase two
    o For register-to-register and load instructions, the result is written back to the register file.
    o Branch instructions perform no operation during this stage.

Activities during each ALU pipeline stage...

...for load, store, and branch instructions.

Stall Conditions
  Detected after the R pipe-stage.
  The processor will resolve the condition.
    o Detect
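
The five pipeline stages (I, R, A, D, W) and their two phases each, as listed on the CPU Pipeline Stages slides, can be pictured with a small timing sketch. This is an idealized illustration only: it assumes one instruction issues per clock cycle and that no stall or slip occurs. The function names (phases, schedule, render) are hypothetical and do not come from the slides or from any IDT tool.

```python
# Stage names follow the slides: I, R, A, D, W, each split into two phases.
STAGES = ["I", "R", "A", "D", "W"]


def phases():
    """Return the ten phases in pipeline order: 1I, 2I, 1R, ..., 2W."""
    return [f"{phase}{stage}" for stage in STAGES for phase in (1, 2)]


def schedule(num_instructions):
    """Map each instruction to {stage: cycle}.

    Assumption: one instruction issues per cycle, with no stalls or slips.
    """
    return [
        {stage: issue + offset for offset, stage in enumerate(STAGES)}
        for issue in range(num_instructions)
    ]


def render(num_instructions):
    """Draw a text pipeline diagram, one row per instruction."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for index, slots in enumerate(schedule(num_instructions)):
        by_cycle = {cycle: stage for stage, cycle in slots.items()}
        cells = [by_cycle.get(cycle, ".") for cycle in range(total_cycles)]
        rows.append(f"instr{index}: " + " ".join(cells))
    return "\n".join(rows)


if __name__ == "__main__":
    print(" ".join(phases()))
    print(render(3))
```

Each row is shifted one cycle to the right of the previous one, so in steady state all five stages are busy with different instructions in the same cycle. A stall or slip would appear as an extra delay in a row, which this sketch deliberately does not model.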
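
The two .word values on the Load Link Store Conditional Opcodes slide are hand-assembled instruction words for ll $5, 0($4) and sc $5, 0($4). A short check, sketched here in Python, shows how those words follow from the standard MIPS I-type field layout. The opcode values 0x30 (LL) and 0x38 (SC) are taken from the MIPS instruction set definition rather than from the slides, and the helper names encode_itype and decode_itype are hypothetical.

```python
# MIPS I-type layout: opcode (6 bits) | base/rs (5) | rt (5) | offset (16)
LL_OPCODE = 0x30  # Load Linked
SC_OPCODE = 0x38  # Store Conditional


def encode_itype(opcode, base, rt, offset):
    """Pack the four I-type fields into one 32-bit instruction word."""
    return (opcode << 26) | (base << 21) | (rt << 16) | (offset & 0xFFFF)


def decode_itype(word):
    """Split a 32-bit word into (opcode, base, rt, offset).

    The offset is returned as the raw unsigned 16-bit field.
    """
    return (
        (word >> 26) & 0x3F,
        (word >> 21) & 0x1F,
        (word >> 16) & 0x1F,
        word & 0xFFFF,
    )


if __name__ == "__main__":
    # ll $5, 0($4): base register 4, target register 5, offset 0
    print(hex(encode_itype(LL_OPCODE, base=4, rt=5, offset=0)))  # 0xc0850000
    # sc $5, 0($4): same operands, different opcode
    print(hex(encode_itype(SC_OPCODE, base=4, rt=5, offset=0)))  # 0xe0850000
```

Encoding the instructions by hand with .word is a common workaround when the assembler in use does not yet recognize a mnemonic; whether that was the reason here is not stated in the slides.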