DOC PREVIEW
Columbia COMS W4118 - Lecture Notes

This preview shows page 1-2-16-17-18-33-34 out of 34 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 34 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 34 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 34 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 34 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 34 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 34 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 34 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 34 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

W4118 Operating Systems Instructor: Junfeng Yang2Learning goals of this lecture Different flavors of synchronization primitives and when to use them, in the context of Linux kernel How synchronization primitives are implemented for real “Portable” tricks: useful in other context as well (when you write a high performance server)Optimize for common case3Synchronization is complex and subtle Already learned this from the code examples we’ve seen Kernel synchronization is even more complex and subtleHigher requirements: performance, protection …Code heavily optimized, “fast path” often in assembly, fit within one cache lineRecall: Layered approach to synchronizationHardware-provided low-level atomic operationsHigh-level synchronization primitivesProperly synchronized application Hardware provides simple low-level atomic operations, upon which we can build high-level, synchronization primitives, upon which we can implement critical sections and build correct multi-threaded/multi-process programs4Outline Low-level synchronization primitives in LinuxMemory barrierAtomic operationsSynchronize with interruptsSpin locks High-level synchronization primitives in LinuxCompletionSemaphoreFutexMutex5Architectural dependency Implementation of synchronization primitives: highly architecture dependent Hardware provides atomic operations Most hardware platforms provide test-and-set or similar: examine and modify a memory location atomically Some don’t, but would inform if operation attempted was atomic6Memory barrier motivation Evil compiler!Reorder code as long as it correctly maintains data flow dependencies within a function and with called functionsEvil hardware!Reorder instruction execution as long as it correctly maintains register flow dependenciesReorder memory modification as long as it correctly maintains data flow dependencies7Memory barrier definitionMemory Barriers: instructions to compiler and/or hardware to complete all pending accesses before issuing any morePrevent compiler/hardware reorderingRead memory barriers: prevent reordering of read instructionsWrite memory barriers: prevent reordering of write instructions8Linux barrier operationsbarrier – prevent only compiler reorderingmb – prevents load and store reorderingrmb – prevents load reorderingwmb – prevents store reorderingsmp_mb – prevent load and store reordering only in SMP kernelsmp_rmb – prevent load reordering only in SMP kernelssmp_wmb – prevent store reordering only in SMP kernelsset_mb – performs assignment and prevents load and store reorderinginclude/asm-i386/system.h9Outline Low-level synchronization primitives in LinuxMemory barrierAtomic operationsSynchronize with interruptsSpin locks High-level synchronization primitives in LinuxCompletionSemaphoreMutexFutex10Atomic operationsSome instructions not atomic in hardware (smp)Read-modify-write instructions that touch memory twice, e.g., inc, xchg Most hardware provides a way to make these instructions atomicIntel lock prefix: appears to lock the memory busExecute at memory speed11Linux atomic operationsATOMIC_INIT – initialize an atomic_t variableatomic_read – examine value atomicallyatomic_set – change value atomicallyatomic_inc – increment value atomicallyatomic_dec – decrement value atomicallyatomic_add - add to value atomicallyatomic_sub – subtract from value atomicallyatomic_inc_and_test – increment value and test for zero atomic_dec_and_test – decrement from value and test for zeroatomic_sub_and_test – subtract from value and test for zeroatomic_set_mask – mask bits atomicallyatomic_clear_mask – clear bits atomicallyinclude/asm-i386/atomic.h12Outline Low-level synchronization primitives in LinuxMemory barrierAtomic operationsSynchronize with interruptsSpin locks High-level synchronization primitives in LinuxCompletionSemaphoreFutexMutex13Linux interrupt operations local_irq_disable - disables interrupts on the current CPU local_irq_enable - enable interrupts on the current CPU local_save_flags - return the interrupt state of the processor local_restore_flags - restore the interrupt state of the processor Dealing with the full interrupt state of the system is officially discouraged. Locks should be used14Outline Low-level synchronization primitives in LinuxMemory barrierAtomic operationsSynchronize with interruptsSpin locks High-level synchronization primitives in LinuxCompletionSemaphoreMutexFutex15Linux spin lock operationsspin_lock_init – initialize a spin lock before using it for the first timespin_lock – acquire a spin lock, spin waiting if it is not availablespin_unlock – release a spin lockspin_unlock_wait – spin waiting for spin lock to become available, but don't acquire itspin_trylock – acquire a spin lock if it is currently free, otherwise return errorspin_is_locked – return spin lock stateinclude/asm-i386/spinlock.hand kernel/spinlock.c16Spin lock usage rulesSpin locks should not be held for long periods because waiting tasks on other CPUs are spinning, and thus wasting CPU execution time Remember, don’t call blocking operations (any function that may call schedule()) when holding a spin lock1718Linux spin lock implementation__raw_spin_lock_string1: lock; decb %0 # atomically decrementjns 3f # if clear sign bit (>=0) jump forward to 32: rep; nop # waitcmpb $0, %0 # spin – compare to 0jle 2b # go back to 2 if <= 0 (locked)jmp 1b # unlocked; go back to 1 to try again 3: # we have acquired the lock…spin_unlock merely writes 1 into the lock field.Variant of spin locks and operations Spin locks that serialize with interrupts Read-write spin locks (rwlock_t) Read-write spin locks that serialize with interrupts Big reader lock (brlock) Sequential lock (seqlock)19Outline Low-level synchronization primitives in LinuxMemory barrierAtomic operationsSynchronize with interruptsSpin locks High-level synchronization primitives in LinuxCompletionSemaphoreFutexMutex2021Completions Simple way to ensure execution order: wait and wake-up semantics wait_for_complete(struct completion*) – wait for


View Full Document

Columbia COMS W4118 - Lecture Notes

Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?