Multiprocessor Initialization

An introduction to the use of Interprocessor Interrupts

Multiprocessor topology

[Diagram: CPU#0 and CPU#1, each with its own Local APIC, and an I/O APIC; labels: Front Side Bus, Back Side Bus, bridge, system memory, peripheral devices]

Interrupt Command Register

• Each Pentium’s Local-APIC has a 64-bit Interrupt Command Register (ICR)
• It can be programmed by system software to transmit messages (via the Back Side Bus) to one or several other processors
• Each processor has a unique identification number in its APIC Local-ID Register that can be used to direct messages to it

ICR (upper 32-bits)

Memory-mapped register-address: 0xFEE00310

    bits 31-24   Destination field
    bits 23-0    reserved

The Destination Field (8-bits) can be used to specify which processor (or group of processors) will receive the message

ICR (lower 32-bits)

Memory-mapped register-address: 0xFEE00300

    bits 19-18   Destination Shorthand    00 = no shorthand
                                          01 = only to self
                                          10 = all including self
                                          11 = all excluding self
    bit 15       Trigger Mode             0 = Edge, 1 = Level
    bit 14       Level                    0 = De-assert, 1 = Assert
    bit 12       Delivery Status (R/O)    0 = Idle, 1 = Pending
    bit 11       Destination Mode         0 = Physical, 1 = Logical
    bits 10-8    Delivery Mode            000 = Fixed
                                          001 = Lowest Priority
                                          010 = SMI
                                          011 = (reserved)
                                          100 = NMI
                                          101 = INIT
                                          110 = Start Up
                                          111 = (reserved)
    bits 7-0     Vector field

All other bits are reserved.

MP initialization protocol

• Set processor-counter equal to zero
• Step 1: issue an ‘INIT’ IPI to all-except-self
• Delay for ten milliseconds
• Step 2: issue ‘Startup’ IPI to all-except-self
• Delay for 200 microseconds
• Step 3: issue ‘Startup’ IPI to all-except-self
• Delay for 200 microseconds
• Check the value of the processor-counter

Issue ‘INIT’ IPI

    ; broadcast ‘INIT’ IPI to all-except-self
            mov     eax, #0x000C4500
            mov     [0xFEE00300], eax
    .B0:    bt      dword [0xFEE00300], #12     ; Delivery Status still pending?
            jc      .B0                         ; yes, keep waiting

Issue ‘Startup’ IPI
    ; broadcast ‘Startup’ IPI to all-except-self,
    ; using vector 0x11 to specify that the entry-point
    ; is at the memory-address 0x00011000
            mov     eax, #0x000C4611
            mov     [0xFEE00300], eax
    .B1:    bt      dword [0xFEE00300], #12     ; Delivery Status still pending?
            jc      .B1                         ; yes, keep waiting

Delaying for EAX microseconds

    ; We use the 8254 Timer/Counter Channel 2 to generate a
    ; timed delay (expressed in microseconds by value in EAX)
            mov     ecx, eax            ; copy delay-time to ECX
            mov     eax, #1000000       ; #microseconds-per-sec
            xor     edx, edx            ; extended to quadword
            div     ecx                 ; perform dword division
            mov     ecx, eax            ; copy quotient into ECX
            mov     eax, #1193182       ; #input-pulses-per-sec
            xor     edx, edx            ; extended to quadword
            div     ecx                 ; perform dword division
    ; now transfer the quotient from AX to the Channel 2 Latch

Mutual Exclusion

• Shared variables must be accessed by only one processor at a time
• The Pentium’s ‘lock’ prefix assists with this
• Example: every processor adds 1 to count

            lock
            inc     dword [count]

• Example: every processor needs a private stack

            mov     ax, #0x1000
            lock
            xadd    [new_SS], ax
            mov     ss, ax

ROM-BIOS isn’t ‘reentrant’

• The video service-functions in ROM-BIOS that we use to display a message-string at the current cursor-location (and afterward advance the cursor) modify global storage locations (as well as i/o ports), and hence must be called by only one processor at a time
• A shared memory-variable (called ‘mutex’) is used to enforce this mutual exclusion

Implementing a ‘spinlock’

    mutex:  .WORD   1                   ; 1 = free, 0 = held
    spin:   bt      mutex, #0           ; does the lock look free?
            jnc     spin                ; no, keep testing
            lock
            btr     mutex, #0           ; try to claim it atomically
            jnc     spin                ; another CPU got there first
    ;-- CRITICAL SECTION OF CODE GOES HERE --
            lock
            bts     mutex, #0           ; release the lock

In-class exercise

• Include this procedure that multiple CPUs will execute simultaneously (without ‘lock’)

    total:  .WORD   0
    add_one_thousand:
            mov     cx, #1000
    nxinc:  add     word [total], #1
            loop    nxinc
            ret

We need to use a ‘barrier’

• We can use a software construct (known as a ‘barrier’) to delay entry to a block of code until a prescribed number of CPUs are ready to enter it together

    arrived: .WORD  0
    barrier: lock
            inc     word [arrived]
    await:  cmp     word [arrived], #2
            jb      await
            call
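
The command words 0x000C4500 and 0x000C4611 used in the ‘INIT’ and ‘Startup’ slides can be cross-checked against the ICR field layout. The Python sketch below is not part of the original slides: the function and table names are illustrative assumptions, and it only performs arithmetic (it does not touch any hardware). It rebuilds both constants from their fields, derives the Startup entry-point from the vector (the vector supplies bits 19-12 of the address, which matches the slide's 0x11 and 0x00011000), and mirrors the two integer divisions of the delay routine.

```python
# Hypothetical helper names; field positions follow the ICR (lower 32-bits) slide.
DELIVERY_MODE = {
    "fixed": 0b000, "lowest_priority": 0b001, "smi": 0b010,
    "nmi": 0b100, "init": 0b101, "startup": 0b110,
}
SHORTHAND = {
    "none": 0b00, "self": 0b01, "all_incl_self": 0b10, "all_excl_self": 0b11,
}


def icr_low(shorthand, mode, vector=0, level_assert=True,
            level_trigger=False, logical_dest=False):
    """Compose the lower 32 bits of the ICR from its individual fields."""
    value = SHORTHAND[shorthand] << 18      # bits 19-18: destination shorthand
    value |= int(level_trigger) << 15       # bit 15: trigger mode
    value |= int(level_assert) << 14        # bit 14: level (assert/de-assert)
    value |= int(logical_dest) << 11        # bit 11: destination mode
    value |= DELIVERY_MODE[mode] << 8       # bits 10-8: delivery mode
    value |= vector & 0xFF                  # bits 7-0: vector field
    return value


def startup_entry_address(vector):
    """Physical address where a CPU begins executing after a Startup IPI."""
    return (vector & 0xFF) << 12


def timer_count(delay_us):
    """Mirror the delay slide: two truncating divisions, as DIV performs them."""
    rate = 1_000_000 // delay_us            # first DIV: delays-per-second
    return 1_193_182 // rate                # second DIV: input pulses to count


if __name__ == "__main__":
    print(hex(icr_low("all_excl_self", "init")))
    print(hex(icr_low("all_excl_self", "startup", vector=0x11)))
    print(hex(startup_entry_address(0x11)))
    print(timer_count(10_000), timer_count(200))
```

Because `timer_count` reproduces the truncating divisions, the result must still fit in the 16-bit AX register that the slide transfers to the Channel 2 latch, which holds for the ten-millisecond and 200-microsecond delays used in the protocol.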