Multiprocessor Initialization
Multiprocessor topology
The Local-APIC ID register
The Local-APIC EOI register
The Spurious Interrupt register
Interrupt Command Register
ICR (upper 32-bits)
ICR (lower 32-bits)
MP initialization protocol
Issue ‘INIT’ IPI
Issue ‘Startup’ IPI
Timing delays
Mathematical examples
Mathematical theory
Delaying for EAX microseconds
Mutual Exclusion
ROM-BIOS isn’t ‘reentrant’
Implementing a ‘spinlock’
Demo: ‘smphello.s’
In-class exercise
We may need a ‘barrier’

Multiprocessor Initialization
An introduction to the use of Interprocessor Interrupts

Multiprocessor topology
[Diagram: CPU#0 and CPU#1, each with its own Local-APIC; an IO-APIC; peripheral devices; system memory; a bridge; buses labeled ‘Front Side Bus’ and ‘Back Side Bus’]

The Local-APIC ID register
Memory-Mapped Register-Address: 0xFEE00020
Layout: bits 31..24 = APIC ID; bits 23..0 = reserved
This register is initially zero, but its APIC ID Field (8-bits) is programmed by the BIOS during system startup with a unique processor identification-number, which is subsequently used when specifying the processor as a recipient of inter-processor interrupts.

The Local-APIC EOI register
Memory-Mapped Register-Address: 0xFEE000B0
Layout: bits 31..0 = write-only register
This write-only register is used by Interrupt Service Routines to issue an ‘End-Of-Interrupt’ command to the Local-APIC. Any value written to this register will be interpreted by the Local-APIC as an EOI command. The value stored in this register is initially zero (and it will remain unchanged).

The Spurious Interrupt register
Memory-Mapped Register-Address: 0xFEE000F0
Layout: bits 7..0 = spurious vector; the higher bits are reserved, apart from the EN flag
This register is used to Enable/Disable the functioning of the Local-APIC, and when enabled, to specify the interrupt-vector number to be delivered to the processor in case the Local-APIC generates a ‘spurious’ interrupt. (In some processor-models, the vector’s lowest 4-bits are hardwired 1s.)
Bit 8 (EN): Local-APIC is Enabled (1=yes, 0=no)

Interrupt Command Register
• Each Pentium’s Local-APIC has a 64-bit Interrupt Command Register
• It can be programmed by system software to transmit messages (via the Back Side Bus) to one or several other processors
• Each processor has a unique identification number in its Local-APIC ID Register that can be used for directing messages to it

ICR (upper 32-bits)
Memory-Mapped Register-Address: 0xFEE00310
Layout: bits 31..24 = Destination field; bits 23..0 = reserved
The Destination Field (8-bits) can be used to specify which processor (or group of processors) will receive the message

ICR (lower 32-bits)
Memory-Mapped Register-Address: 0xFEE00300
    bits 19..18   Destination Shorthand   00 = no shorthand, 01 = only to self, 10 = all including self, 11 = all excluding self
    bit 15        Trigger Mode            0 = Edge, 1 = Level
    bit 14        Level                   0 = De-assert, 1 = Assert
    bit 12        Delivery Status (R/O)   0 = Idle, 1 = Pending
    bit 11        Destination Mode        0 = Physical, 1 = Logical
    bits 10..8    Delivery Mode           000 = Fixed, 001 = Lowest Priority, 010 = SMI, 011 = (reserved), 100 = NMI, 101 = INIT, 110 = Start Up, 111 = (reserved)
    bits 7..0     Vector field

MP initialization protocol
• Set shared processor-counter equal to 1
• Step 1: issue an ‘INIT’ IPI to all-except-self
• Delay for 10 milliseconds
• Step 2: issue ‘Startup’ IPI to all-except-self
• Delay for 200 microseconds
• Step 3: issue ‘Startup’ IPI to all-except-self
• Delay for 200 microseconds
• Check the value of the processor-counter

Issue ‘INIT’ IPI
    # address Local-APIC via register FS
    mov   $sel_fs, %ax
    mov   %ax, %fs
    # broadcast ‘INIT’ IPI to ‘all-except-self’
    mov   $0x000C4500, %eax
    mov   %eax, %fs:(0xFEE00300)
    # wait until the Delivery Status bit (bit 12) returns to Idle
    .B0: btl   $12, %fs:(0xFEE00300)
    jc    .B0

Issue ‘Startup’ IPI
    # broadcast ‘Startup’ IPI to all-except-self
    # using vector 0x11 to specify entry-point
    # at real memory-address 0x00011000
    mov   $0x000C4611, %eax
    mov   %eax, %fs:(0xFEE00300)
    # wait until the Delivery Status bit (bit 12) returns to Idle
    .B1: btl   $12, %fs:(0xFEE00300)
    jc    .B1

Timing delays
• Intel’s MP Initialization Protocol specifies the use of some timing-delays:
    – 10 milliseconds ( = 10,000 microseconds)
    – 200 microseconds
• We can use the 8254 Timer’s Channel 2
for implementing these timed delays, by programming it for ‘one-shot’ countdown mode, then polling bit #5 at i/o port 0x61

Mathematical examples
EXAMPLE 1
Delaying for 10 milliseconds means delaying for 1/100-th of a second (because 100 times 10 milliseconds = one-thousand milliseconds)
EXAMPLE 2
Delaying for 200 microseconds means delaying for 1/5000-th of a second (because 5000 times 200 microseconds = one-million microseconds)
GENERAL PRINCIPLE
Delaying for x microseconds means delaying for 1/(1000000/x)-th of a second (because (1000000/x) times x microseconds = one-million microseconds)

Mathematical theory
RECALL: the timer’s input Clock-Frequency = 1193182 Hertz (pulses-per-second)
ALSO: One second equals one-million microseconds
PROBLEM: Given the desired delay-time in microseconds, express the desired delay-time in clock pulses and program that number into the PIT’s Latch-Register
APPLYING DIMENSIONAL ANALYSIS
    Delay-in-Clock-Pulses = Delay-in-Microseconds * Pulses-Per-Microsecond
    Pulses-Per-Microsecond = Pulses-Per-Second / Microseconds-Per-Second
CONCLUSION
For a desired time-delay of x microseconds, the number of clock-pulses may be computed as
    x * (1193182 / 1000000) = 1193182 / (1000000 / x)
as dividing by a fraction amounts to multiplying by that fraction’s reciprocal

Delaying for EAX microseconds
    # We use the 8254 Timer/Counter Channel 2 to generate a
    # timed delay (expressed in microseconds by value in EAX)
    mov   %eax, %ecx        # copy delay-time to ECX
    mov   $1000000, %eax    # microseconds-per-sec
    xor   %edx, %edx        # extended to quadword
    div   %ecx              # perform dword division
    mov   %eax, %ecx        # copy quotient into ECX
    mov   $1193182, %eax    # input-pulses-per-sec
    xor   %edx, %edx        # extended to quadword
    div   %ecx              # perform dword division
    # now transfer the quotient from AX to the Channel 2 Latch

Mutual Exclusion
• Shared variables must not be modified by more than one processor at a time (‘mutual exclusion’)
• The Pentium’s ‘lock’ prefix helps enforce this
• Example: every processor adds 1 to count
    lock incl (count)
• Example: all processors need
private stacks
    mov   $0x1000, %ax          # amount by which new_SS advances
    lock xadd %ax, (new_SS)     # AX = old new_SS; new_SS += 0x1000 (one atomic step)
    mov   %ax, %ss              # the old value becomes this processor’s stack segment

ROM-BIOS isn’t ‘reentrant’
• The video service-functions in ROM-BIOS that we use to display a message-string at the current cursor-location (and afterward advance the cursor) modify global storage locations (as well as i/o ports), and hence must be called by only one processor at a time
• A shared memory-variable (called ‘mutex’) is used to enforce this
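
The three locked operations on these slides (the shared counter, the private-stack allocation, and the ‘mutex’) can be imitated on an ordinary host with C11 atomics. This is only a sketch for experimenting with the idea, not the course’s code: the function names are invented here, the starting value 0x2000 for new_SS is an assumption, and the use of an atomic test-and-set for the mutex is one common choice rather than something the slides specify.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Only 'count', 'new_SS' and 'mutex' echo names from the slides;
   every function name and the 0x2000 start value are assumptions. */
static atomic_uint   count  = 1;       /* shared processor-counter, starts at 1 */
static atomic_ushort new_SS = 0x2000;  /* next unclaimed stack segment (assumed) */
static atomic_flag   mutex  = ATOMIC_FLAG_INIT;

/* Analogue of 'lock incl (count)': one indivisible read-modify-write.
   Returns the counter value after this caller's increment. */
static unsigned announce_processor(void)
{
    return atomic_fetch_add(&count, 1) + 1;
}

/* Analogue of 'lock xadd': fetch the old segment value and advance the
   shared variable by 0x1000 in a single atomic step. */
static uint16_t claim_stack_segment(void)
{
    return (uint16_t)atomic_fetch_add(&new_SS, 0x1000);
}

/* Try once to take the mutex: returns 1 on success, 0 if already held. */
static int mutex_try_acquire(void)
{
    return !atomic_flag_test_and_set(&mutex);
}

/* Spin until the mutex has been acquired. */
static void mutex_acquire(void)
{
    while (!mutex_try_acquire())
        ;   /* busy-wait */
}

/* Give the mutex back so another caller's spin can succeed. */
static void mutex_release(void)
{
    atomic_flag_clear(&mutex);
}

/* Single-threaded walk-through of the three primitives. */
static void walkthrough(void)
{
    unsigned n  = announce_processor();
    uint16_t s0 = claim_stack_segment();
    uint16_t s1 = claim_stack_segment();
    assert(n == 2);                        /* counter went from 1 to 2 */
    assert(s0 == 0x2000 && s1 == 0x3000);  /* each claim gets a distinct segment */

    mutex_acquire();
    int second_try = mutex_try_acquire();
    assert(second_try == 0);               /* a held mutex cannot be taken again */
    mutex_release();
}
```

Each caller of claim_stack_segment() receives a different value even if several callers run at once, which is the property the private-stack example relies on.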
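
As a cross-check on the two ICR command words (0x000C4500 and 0x000C4611) and on the timer arithmetic from the earlier slides, the following C sketch rebuilds those numbers from their parts. The function names are hypothetical, and the code only computes values on a host machine; it does not touch the Local-APIC or the 8254. Trigger Mode and Destination Mode are left at 0, as they are in both command words on the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Delivery Mode encodings (ICR bits 10..8) used on the slides. */
enum { DELIVERY_INIT = 5, DELIVERY_STARTUP = 6 };

/* Destination Shorthand encoding (ICR bits 19..18) used on the slides. */
enum { SHORTHAND_ALL_EXCLUDING_SELF = 3 };

/* Assemble the lower 32 bits of the ICR from four of its fields.
   Bits 15 (Trigger Mode) and 11 (Destination Mode) stay 0. */
static uint32_t icr_low(unsigned shorthand, unsigned level_assert,
                        unsigned delivery_mode, unsigned vector)
{
    return ((uint32_t)(shorthand     & 0x3)  << 18)
         | ((uint32_t)(level_assert  & 0x1)  << 14)
         | ((uint32_t)(delivery_mode & 0x7)  <<  8)
         |  (uint32_t)(vector        & 0xFF);
}

/* A Startup IPI's vector selects a 4 KB-aligned real-mode entry point. */
static uint32_t startup_entry_point(unsigned vector)
{
    return (uint32_t)(vector & 0xFF) << 12;
}

/* Clock pulses for a delay of 'usec' microseconds, using the same two
   integer divisions as the slide: 1193182 / (1000000 / usec).
   Valid only for 1 <= usec <= 1000000; remainders are discarded. */
static uint32_t pit_pulses(uint32_t usec)
{
    assert(usec > 0 && usec <= 1000000);
    return 1193182u / (1000000u / usec);
}
```

With these definitions, icr_low(SHORTHAND_ALL_EXCLUDING_SELF, 1, DELIVERY_INIT, 0) gives 0x000C4500, the same call with DELIVERY_STARTUP and vector 0x11 gives 0x000C4611, and startup_entry_point(0x11) gives 0x00011000. The delays come out as pit_pulses(10000) = 11931 and pit_pulses(200) = 238, both of which fit in the 8254’s 16-bit counter.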