Berkeley COMPSCI 152 - Lecture 23 – Synchronization

CS 152 Computer Architecture and Engineering
Lecture 23 – Synchronization
2005-11-17
John Lazzaro (www.cs.berkeley.edu/~lazzaro)
www-inst.eecs.berkeley.edu/~cs152/
TAs: David Marquardt and Udam Saini
UC Regents Fall 2005 © UCB
CS 152 L23: Synchronization

Last Time: How Routers Work

2. Forwarding engine determines the next hop for the packet, and returns next-hop data to the line card, together with an updated header.

Recall: Two CPUs sharing memory

In earlier lectures, we pretended it was easy to let several CPUs share a memory system.

In fact, it is an architectural challenge. Even letting several threads on one machine share memory is tricky.

Today: Hardware Thread Support

Producer/Consumer: One thread writes A, one thread reads A.
Locks: Two threads share write access to A.
On Tuesday: Multiprocessor memory system design and synchronization issues.
Tuesday is a simplified overview -- graduate-level architecture courses spend weeks on this topic ...

How 2 threads share a queue ...

[Diagram: the queue as words in memory, ordered by address number, with Tail and Head pointers.]

We begin with an empty queue ...
Thread 1 (T1), the “producer” thread, adds data to the tail of the queue.
Thread 2 (T2), the “consumer” thread, takes data from the head of the queue.

Producer adding x to the queue ...

[Diagram: words in memory with Tail and Head pointers. Before: empty queue. After: queue holds x.]

T1 code (producer):

    ORi  R1, R0, xval    ; Load x value into R1
    LW   R2, tail(R0)    ; Load tail pointer into R2
    SW   R1, 0(R2)       ; Store x into queue
    ADDi R2, R2, 4       ; Shift tail by one word
    SW   R2, tail(R0)    ; Update tail pointer in memory

Producer adding y to the queue ...

[Diagram: words in memory with Tail and Head pointers. Before: queue holds x. After: queue holds y x.]

T1 code (producer):

    ORi  R1, R0, yval    ; Load y value into R1
    LW   R2, tail(R0)    ; Load tail pointer into R2
    SW   R1, 0(R2)       ; Store y into queue
    ADDi R2, R2, 4       ; Shift tail by one word
    SW   R2, tail(R0)    ; Update tail pointer in memory

Consumer reading the queue ...

[Diagram: words in memory with Tail and Head pointers. Before: queue holds y x. After: queue holds y.]

T2 code (consumer):

          LW   R3, head(R0)    ; Load head pointer into R3
    spin: LW   R4, tail(R0)    ; Load tail pointer into R4
          BEQ  R4, R3, spin    ; If queue empty, wait
          LW   R5, 0(R3)       ; Read x from queue into R5
          ADDi R3, R3, 4       ; Shift head by one word
          SW   R3, head(R0)    ; Update head pointer

What can go wrong?

[Diagram: queue states "y x" and "y", labeled Before/After, with Tail and Head pointers.]

T1 code (producer):

    ORi  R1, R0, x       ; Load x value into R1
    LW   R2, tail(R0)    ; Load tail pointer into R2
    SW   R1, 0(R2)       ; (1) Store x into queue
    ADDi R2, R2, 4       ; Shift tail by one word
    SW   R2, tail(R0)    ; (2) Update tail pointer

T2 code (consumer):

          LW   R3, head(R0)    ; Load head pointer into R3
    spin: LW   R4, tail(R0)    ; (3) Load tail pointer into R4
          BEQ  R4, R3, spin    ; If queue empty, wait
          LW   R5, 0(R3)       ; (4) Read x from queue into R5
          ADDi R3, R3, 4       ; Shift head by one word
          SW   R3, head(R0)    ; Update head pointer

What if the order is 2, 3, 4, 1? Then x is read before it is written!
The CPU running T1 has no way to know it's bad to delay 1!

Leslie Lamport: Sequential Consistency

Sequential Consistency: As if each thread takes turns executing, and instructions in each thread execute in program order.

Sequentially consistent architectures get the right answer, but give up many optimizations.

T1 code (producer):

    ORi  R1, R0, x       ; Load x value into R1
    LW   R2, tail(R0)    ; Load queue tail into R2
    SW   R1, 0(R2)       ; (1) Store x into queue
    ADDi R2, R2, 4       ; Shift tail by one word
    SW   R2, tail(R0)    ; (2) Update tail pointer in memory

T2 code (consumer):

          LW   R3, head(R0)    ; Load queue head into R3
    spin: LW   R4, tail(R0)    ; (3) Load queue tail into R4
          BEQ  R4, R3, spin    ; If queue empty, wait
          LW   R5, 0(R3)       ; (4) Read x from queue into R5
          ADDi R3, R3, 4       ; Shift head by one word
          SW   R3, head(R0)    ; Update head pointer in memory

Legal orders: 1, 2, 3, 4 or 1, 3, 2, 4 or 3, 4, 1, 2 ... but not 2, 3, 1, 4!

Efficient alternative: Memory barriers

In the general case, the machine is not sequentially consistent.

When needed, a memory barrier (a fence) may be added to the program. All memory operations before the fence complete, then memory operations after the fence begin.

    ORi  R1, R0, x       ;
    LW   R2, tail(R0)    ;
    SW   R1, 0(R2)       ; (1)
    MEMBAR
    ADDi R2, R2, 4       ;
    SW   R2, tail(R0)    ; (2)

Ensures 1 completes before 2 takes effect.

MEMBAR is expensive, but you only pay for it when you use it.

Many MEMBAR variations exist for efficiency (versions that only affect loads or stores, certain memory regions, etc.).

Producer/consumer memory fences

[Diagram: queue states "y x" and "y", labeled Before/After, with Tail and Head pointers.]

T1 code (producer):

    ORi  R1, R0, x       ; Load x value into R1
    LW   R2, tail(R0)    ; Load queue tail into R2
    SW   R1, 0(R2)       ; (1) Store x into queue
    MEMBAR               ;
    ADDi R2, R2, 4       ; Shift tail by one word
    SW   R2, tail(R0)    ; (2) Update tail pointer in memory

T2 code (consumer):

          LW   R3, head(R0)    ; Load queue head into R3
    spin: LW   R4, tail(R0)    ; (3) Load queue tail into R4
          BEQ  R4, R3, spin    ; If queue empty, wait
          MEMBAR               ;
          LW   R5, 0(R3)       ; (4) Read x from queue into R5
          ADDi R3, R3, 4       ; Shift head by one word
          SW   R3, head(R0)    ; Update head pointer in memory

Ensures 1 happens before 2, and 3 happens before 4.

Reminder: Final Checkoff this Friday!

TAs will provide “secret” MIPS machine code tests.
Bonus points if these tests run by end of section. If not, TAs give you test code to use over
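Returning to the producer/consumer fence slide: the same ordering constraints can be sketched in C11, assuming a compiler that provides `<stdatomic.h>`. This is an illustration, not code from the lecture. The names `spsc_queue`, `queue_init`, `queue_put`, `queue_get` and `QUEUE_WORDS` are hypothetical, and the bounded ring buffer with a "queue full" check is an addition (the slides use an unbounded queue). `atomic_thread_fence` stands in for MEMBAR, using release and acquire fences, which are weaker than a full barrier but enough to keep 1 before 2 and 3 before 4.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_WORDS 16  /* hypothetical capacity (power of two), not from the lecture */

typedef struct {
    int slots[QUEUE_WORDS];
    atomic_size_t tail;  /* written only by the producer (T1) */
    atomic_size_t head;  /* written only by the consumer (T2) */
} spsc_queue;

void queue_init(spsc_queue *q)
{
    atomic_init(&q->tail, 0);
    atomic_init(&q->head, 0);
}

/* T1: store the data (1), fence, then publish the new tail (2). */
bool queue_put(spsc_queue *q, int value)
{
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    /* Acquire pairs with the consumer's release store of head, so a slot
       is only reused after the consumer has finished reading it. */
    size_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head == QUEUE_WORDS)
        return false;                                  /* queue full */
    q->slots[tail % QUEUE_WORDS] = value;              /* (1) store x */
    atomic_thread_fence(memory_order_release);         /* plays the role of MEMBAR */
    atomic_store_explicit(&q->tail, tail + 1,
                          memory_order_relaxed);       /* (2) update tail */
    return true;
}

/* T2: load the tail (3), fence, then read the data (4). */
bool queue_get(spsc_queue *q, int *out)
{
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&q->tail,
                                       memory_order_relaxed); /* (3) load tail */
    if (tail == head)
        return false;                                  /* queue empty */
    atomic_thread_fence(memory_order_acquire);         /* plays the role of MEMBAR */
    *out = q->slots[head % QUEUE_WORDS];               /* (4) read x */
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}
```

In use, one thread would call `queue_put` while a second thread loops on `queue_get` until it returns true, which mirrors the `spin:` loop in the T2 code. Removing either fence would allow the 2, 3, 4, 1 order that the "What can go wrong?" slide warns about.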

