DOC PREVIEW
CORNELL CS 3410 - Parallel Programming and Synchronization

This preview shows page 1-2-3-19-20-38-39-40 out of 40 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 40 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Slide 1Slide 2Slide 3Slide 4NUMAShared Memory MultiprocessorsCache Coherence ProblemCoherence DefinedSnoopingIs cache coherence sufficient?Is cache coherence sufficient?Slide 12ProcessesProcess and ProgramRole of the OSRole of the OSSlide 17How to create a process?pstree exampleProcesses Under UNIXExampleInter-process CommunicationInter-process CommunicationSlide 24Processes are heavyweightProcesses and ThreadsMultithreaded ProcessesThreadsThreads versus ForkExample Multi-Threaded ProgramRace ConditionsProgramming with threadsSlide 33GoalsRace conditionsCritical sectionsInterrupt DisablePreemption DisableMutexesSlide 401Parallel Programming and SynchronizationP&H Chapter 2.11Guest Lecture: Kevin WalshCS 3410, Spring 2011Computer ScienceCornell University3Multi-core is a reality…… but how do we write multi-core safe code?4Cache Coherence: Necessary, but not Sufficient5NUMA6Shared Memory MultiprocessorsShared Memory Multiprocessor (SMP)•Typical (today): 2 – 4 processor dies, 2 – 8 cores each•Assume physical addresses (ignore virtual memory)•Assume uniform memory access (ignore NUMA)Core0 Core1 CoreNCacheCacheCacheMemory I/OInterconnect...7Cache Coherence ProblemShared Memory Multiprocessor (SMP)What could possibly go wrong?...x = x+1......while (x==5) { // wait}...Core0 Core1 Core3I/OInterconnect...8Coherence DefinedCache coherence defined...Informal: Reads return most recently written valueFormal: For concurrent processes P1 and P2•P writes X before P reads X (with no intervening writes) read returns written value•P1 writes X before P2 reads X  read returns written value•P1 writes X and P2 writes X all processors see writes in the same order–all see the same final value for X9SnoopingRecall: Snooping for Hardware Cache Coherence•All caches monitor bus and all other caches•Bus read: respond if you have dirty data•Bus write: update/invalidate your copy of dataCore0CacheMemory I/OInterconnectCore1CacheCoreNCache...10Is cache coherence sufficient?Example with cache coherence:P1P2x = x +1 while (x==5) ;11Is cache coherence sufficient?Example with cache coherence:P1P2x = x +1 x = x + 112Software Support for Synchronization and Coordination:Programs and Processes13ProcessesHow do we cope with lots of activity?Simplicity? Separation into processesReliability? IsolationSpeed? Program-level parallelismgccemacsnfsdlprlswwwemacsnfsd lprlswwwOSOS14Process and ProgramProcessOS abstraction of a running computation•The unit of execution•The unit of scheduling•Execution state+ address spaceFrom process perspective•a virtual CPU•some virtual memory•a virtual keyboard, screen, …Program“Blueprint” for a process•Passive entity (bits on disk)•Code + static data15Role of the OSRole of the OSContext Switching•Provides illusion that every process owns a CPUVirtual Memory•Provides illusion that process owns some memoryDevice drivers & system calls•Provides illusion that process owns a keyboard, …To do: How to start a process? How do processes communicate / coordinate?16Role of the OS17Creating Processes:Fork18How to create a process?Q: How to create a process? A: Double clickAfter boot, OS starts the first process…which in turn creates other processes•parent / child  the process tree19pstree example$ pstree | view -init-+-NetworkManager-+-dhclient |-apache2 |-chrome-+-chrome | `-chrome |-chrome---chrome |-clementine |-clock-applet |-cron |-cupsd |-firefox---run-mozilla.sh---firefox-bin-+-plugin-cont |-gnome-screensaver |-grep |-in.tftpd |-ntpd `-sshd---sshd---sshd---bash-+-gcc---gcc---cc1 |-pstree |-vim `-view20Processes Under UNIXInit is a special case. For others…Q: How does parent process create child process?A: fork() system callWait. what? int fork() returns TWICE!21Examplemain(int ac, char **av) {int x = getpid(); // get current process ID from OSchar *hi = av[1]; // get greeting from command lineprintf(“I’m process %d\n”, x);int id = fork();if (id == 0)printf(“%s from %d\n”, hi, getpid());else printf(“%s from %d, child is %d\n”, hi, getpid(), id);}$ gcc -o strange strange.c$ ./strange “Hey”I’m process 23511Hey from 23512Hey from 23511, child is 2351222Inter-process CommunicationParent can pass information to child•In fact, all parent data is passed to child•But isolated after (C-O-W ensures changes are invisible)Q: How to continue communicating?A: Invent OS “IPC channels” : send(msg), recv(), …23Inter-process CommunicationParent can pass information to child•In fact, all parent data is passed to child•But isolated after (C-O-W ensures changes are invisible)Q: How to continue communicating?A: Shared (Virtual) Memory!24Processes and Threads25Processes are heavyweightParallel programming with processes:•They share almost everything code, shared mem, open files, filesystem privileges, …•Pagetables will be almost identical•Differences: PC, registers, stackRecall: process = execution context + address space26Processes and ThreadsProcessOS abstraction of a running computation•The unit of execution•The unit of scheduling•Execution state+ address spaceFrom process perspective•a virtual CPU•some virtual memory•a virtual keyboard, screen, …ThreadOS abstraction of a single thread of control•The unit of scheduling•Lives in one single processFrom thread perspective•one virtual CPU core on a virtual multi-core machine27Multithreaded Processes28Threads#include <pthread.h> int counter = 0;void PrintHello(int arg) {printf(“I’m thread %d, counter is %d\n”, arg, counter++);... do some work ...pthread_exit(NULL); }int main () { for (t = 0; t < 4; t++) { printf(“in main: creating thread %d\n", t); pthread_create(NULL, NULL, PrintHello, t); } pthread_exit(NULL); }29Threads versus Forkin main: creating thread 0I’m thread 0, counter is 0in main: creating thread 1I’m thread 1, counter is 1in main: creating thread 2in main: creating thread 3I’m thread 3, counter is 2I’m thread 2, counter is 3If processes?30Example Multi-Threaded ProgramExample: Apache web servervoid main() {setup();while (c = accept_connection()) {req = read_request(c);hits[req]++;send_response(c, req);}cleanup();}31Race ConditionsExample: Apache web serverEach client request handled by a separate thread (in parallel)•Some shared state: hit counter, ...(look familiar?)Timing-dependent failure  race condition• hard to reproduce  hard to debugThread 52...hits =


View Full Document

CORNELL CS 3410 - Parallel Programming and Synchronization

Documents in this Course
Marra

Marra

43 pages

Caches

Caches

34 pages

ALUs

ALUs

5 pages

Caches!

Caches!

54 pages

Memory

Memory

41 pages

Caches

Caches

32 pages

Caches

Caches

54 pages

Caches

Caches

34 pages

Caches

Caches

54 pages

Load more
Download Parallel Programming and Synchronization
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Parallel Programming and Synchronization and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Parallel Programming and Synchronization 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?