W4118 Operating Systems Instructor: Junfeng Yang2Learning goals of this lecture Different flavors of synchronization primitives and when to use them, in the context of Linux kernel How synchronization primitives are implemented for real “Portable” tricks: useful in other context as well (when you write a high performance server)Optimize for common case3Synchronization is complex and subtle Already learned this from the code examples we’ve seen Kernel synchronization is even more complex and subtleHigher requirements: performance, protection …Code heavily optimized, “fast path” often in assembly, fit within one cache lineRecall: Layered approach to synchronizationHardware-provided low-level atomic operationsHigh-level synchronization primitivesProperly synchronized application Hardware provides simple low-level atomic operations, upon which we can build high-level, synchronization primitives, upon which we can implement critical sections and build correct multi-threaded/multi-process programs4Outline Low-level synchronization primitives in LinuxMemory barrierAtomic operationsSynchronize with interruptsSpin locks High-level synchronization primitives in LinuxCompletionSemaphoreFutexMutex5Architectural dependency Implementation of synchronization primitives: highly architecture dependent Hardware provides atomic operations Most hardware platforms provide test-and-set or similar: examine and modify a memory location atomically Some don’t, but would inform if operation attempted was atomic6Memory barrier motivation Evil compiler!Reorder code as long as it correctly maintains data flow dependencies within a function and with called functionsEvil hardware!Reorder instruction execution as long as it correctly maintains register flow dependenciesReorder memory modification as long as it correctly maintains data flow dependencies7Memory barrier definitionMemory Barriers: instructions to compiler and/or hardware to complete all pending accesses before issuing any morePrevent compiler/hardware reorderingRead memory barriers: prevent reordering of read instructionsWrite memory barriers: prevent reordering of write instructions8Linux barrier operationsbarrier – prevent only compiler reorderingmb – prevents load and store reorderingrmb – prevents load reorderingwmb – prevents store reorderingsmp_mb – prevent load and store reordering only in SMP kernelsmp_rmb – prevent load reordering only in SMP kernelssmp_wmb – prevent store reordering only in SMP kernelsset_mb – performs assignment and prevents load and store reorderinginclude/asm-i386/system.h9Outline Low-level synchronization primitives in LinuxMemory barrierAtomic operationsSynchronize with interruptsSpin locks High-level synchronization primitives in LinuxCompletionSemaphoreMutexFutex10Atomic operationsSome instructions not atomic in hardware (smp)Read-modify-write instructions that touch memory twice, e.g., inc, xchg Most hardware provides a way to make these instructions atomicIntel lock prefix: appears to lock the memory busExecute at memory speed11Linux atomic operationsATOMIC_INIT – initialize an atomic_t variableatomic_read – examine value atomicallyatomic_set – change value atomicallyatomic_inc – increment value atomicallyatomic_dec – decrement value atomicallyatomic_add - add to value atomicallyatomic_sub – subtract from value atomicallyatomic_inc_and_test – increment value and test for zero atomic_dec_and_test – decrement from value and test for zeroatomic_sub_and_test – subtract from value and test for zeroatomic_set_mask – mask bits atomicallyatomic_clear_mask – clear bits atomicallyinclude/asm-i386/atomic.h12Outline Low-level synchronization primitives in LinuxMemory barrierAtomic operationsSynchronize with interruptsSpin locks High-level synchronization primitives in LinuxCompletionSemaphoreFutexMutex13Linux interrupt operations local_irq_disable - disables interrupts on the current CPU local_irq_enable - enable interrupts on the current CPU local_save_flags - return the interrupt state of the processor local_restore_flags - restore the interrupt state of the processor Dealing with the full interrupt state of the system is officially discouraged. Locks should be used14Outline Low-level synchronization primitives in LinuxMemory barrierAtomic operationsSynchronize with interruptsSpin locks High-level synchronization primitives in LinuxCompletionSemaphoreMutexFutex15Linux spin lock operationsspin_lock_init – initialize a spin lock before using it for the first timespin_lock – acquire a spin lock, spin waiting if it is not availablespin_unlock – release a spin lockspin_unlock_wait – spin waiting for spin lock to become available, but don't acquire itspin_trylock – acquire a spin lock if it is currently free, otherwise return errorspin_is_locked – return spin lock stateinclude/asm-i386/spinlock.hand kernel/spinlock.c16Spin lock usage rulesSpin locks should not be held for long periods because waiting tasks on other CPUs are spinning, and thus wasting CPU execution time Remember, don’t call blocking operations (any function that may call schedule()) when holding a spin lock1718Linux spin lock implementation__raw_spin_lock_string1: lock; decb %0 # atomically decrementjns 3f # if clear sign bit (>=0) jump forward to 32: rep; nop # waitcmpb $0, %0 # spin – compare to 0jle 2b # go back to 2 if <= 0 (locked)jmp 1b # unlocked; go back to 1 to try again 3: # we have acquired the lock…spin_unlock merely writes 1 into the lock field.Variant of spin locks and operations Spin locks that serialize with interrupts Read-write spin locks (rwlock_t) Read-write spin locks that serialize with interrupts Big reader lock (brlock) Sequential lock (seqlock)19Outline Low-level synchronization primitives in LinuxMemory barrierAtomic operationsSynchronize with interruptsSpin locks High-level synchronization primitives in LinuxCompletionSemaphoreFutexMutex2021Completions Simple way to ensure execution order: wait and wake-up semantics wait_for_complete(struct completion*) – wait for
View Full Document