Slide 1Slide 2Slide 3Slide 4NUMAShared Memory MultiprocessorsCache Coherence ProblemCoherence DefinedSnoopingIs cache coherence sufficient?Is cache coherence sufficient?Slide 12ProcessesProcess and ProgramRole of the OSRole of the OSSlide 17How to create a process?pstree exampleProcesses Under UNIXExampleInter-process CommunicationInter-process CommunicationSlide 24Processes are heavyweightProcesses and ThreadsMultithreaded ProcessesThreadsThreads versus ForkExample Multi-Threaded ProgramRace ConditionsProgramming with threadsSlide 33GoalsRace conditionsCritical sectionsInterrupt DisablePreemption DisableMutexesSlide 401Parallel Programming and SynchronizationP&H Chapter 2.11Guest Lecture: Kevin WalshCS 3410, Spring 2011Computer ScienceCornell University3Multi-core is a reality…… but how do we write multi-core safe code?4Cache Coherence: Necessary, but not Sufficient5NUMA6Shared Memory MultiprocessorsShared Memory Multiprocessor (SMP)•Typical (today): 2 – 4 processor dies, 2 – 8 cores each•Assume physical addresses (ignore virtual memory)•Assume uniform memory access (ignore NUMA)Core0 Core1 CoreNCacheCacheCacheMemory I/OInterconnect...7Cache Coherence ProblemShared Memory Multiprocessor (SMP)What could possibly go wrong?...x = x+1......while (x==5) { // wait}...Core0 Core1 Core3I/OInterconnect...8Coherence DefinedCache coherence defined...Informal: Reads return most recently written valueFormal: For concurrent processes P1 and P2•P writes X before P reads X (with no intervening writes) read returns written value•P1 writes X before P2 reads X read returns written value•P1 writes X and P2 writes X all processors see writes in the same order–all see the same final value for X9SnoopingRecall: Snooping for Hardware Cache Coherence•All caches monitor bus and all other caches•Bus read: respond if you have dirty data•Bus write: update/invalidate your copy of dataCore0CacheMemory I/OInterconnectCore1CacheCoreNCache...10Is cache coherence sufficient?Example with cache coherence:P1P2x = x +1 while (x==5) ;11Is cache coherence sufficient?Example with cache coherence:P1P2x = x +1 x = x + 112Software Support for Synchronization and Coordination:Programs and Processes13ProcessesHow do we cope with lots of activity?Simplicity? Separation into processesReliability? IsolationSpeed? Program-level parallelismgccemacsnfsdlprlswwwemacsnfsd lprlswwwOSOS14Process and ProgramProcessOS abstraction of a running computation•The unit of execution•The unit of scheduling•Execution state+ address spaceFrom process perspective•a virtual CPU•some virtual memory•a virtual keyboard, screen, …Program“Blueprint” for a process•Passive entity (bits on disk)•Code + static data15Role of the OSRole of the OSContext Switching•Provides illusion that every process owns a CPUVirtual Memory•Provides illusion that process owns some memoryDevice drivers & system calls•Provides illusion that process owns a keyboard, …To do: How to start a process? How do processes communicate / coordinate?16Role of the OS17Creating Processes:Fork18How to create a process?Q: How to create a process? A: Double clickAfter boot, OS starts the first process…which in turn creates other processes•parent / child the process tree19pstree example$ pstree | view -init-+-NetworkManager-+-dhclient |-apache2 |-chrome-+-chrome | `-chrome |-chrome---chrome |-clementine |-clock-applet |-cron |-cupsd |-firefox---run-mozilla.sh---firefox-bin-+-plugin-cont |-gnome-screensaver |-grep |-in.tftpd |-ntpd `-sshd---sshd---sshd---bash-+-gcc---gcc---cc1 |-pstree |-vim `-view20Processes Under UNIXInit is a special case. For others…Q: How does parent process create child process?A: fork() system callWait. what? int fork() returns TWICE!21Examplemain(int ac, char **av) {int x = getpid(); // get current process ID from OSchar *hi = av[1]; // get greeting from command lineprintf(“I’m process %d\n”, x);int id = fork();if (id == 0)printf(“%s from %d\n”, hi, getpid());else printf(“%s from %d, child is %d\n”, hi, getpid(), id);}$ gcc -o strange strange.c$ ./strange “Hey”I’m process 23511Hey from 23512Hey from 23511, child is 2351222Inter-process CommunicationParent can pass information to child•In fact, all parent data is passed to child•But isolated after (C-O-W ensures changes are invisible)Q: How to continue communicating?A: Invent OS “IPC channels” : send(msg), recv(), …23Inter-process CommunicationParent can pass information to child•In fact, all parent data is passed to child•But isolated after (C-O-W ensures changes are invisible)Q: How to continue communicating?A: Shared (Virtual) Memory!24Processes and Threads25Processes are heavyweightParallel programming with processes:•They share almost everything code, shared mem, open files, filesystem privileges, …•Pagetables will be almost identical•Differences: PC, registers, stackRecall: process = execution context + address space26Processes and ThreadsProcessOS abstraction of a running computation•The unit of execution•The unit of scheduling•Execution state+ address spaceFrom process perspective•a virtual CPU•some virtual memory•a virtual keyboard, screen, …ThreadOS abstraction of a single thread of control•The unit of scheduling•Lives in one single processFrom thread perspective•one virtual CPU core on a virtual multi-core machine27Multithreaded Processes28Threads#include <pthread.h> int counter = 0;void PrintHello(int arg) {printf(“I’m thread %d, counter is %d\n”, arg, counter++);... do some work ...pthread_exit(NULL); }int main () { for (t = 0; t < 4; t++) { printf(“in main: creating thread %d\n", t); pthread_create(NULL, NULL, PrintHello, t); } pthread_exit(NULL); }29Threads versus Forkin main: creating thread 0I’m thread 0, counter is 0in main: creating thread 1I’m thread 1, counter is 1in main: creating thread 2in main: creating thread 3I’m thread 3, counter is 2I’m thread 2, counter is 3If processes?30Example Multi-Threaded ProgramExample: Apache web servervoid main() {setup();while (c = accept_connection()) {req = read_request(c);hits[req]++;send_response(c, req);}cleanup();}31Race ConditionsExample: Apache web serverEach client request handled by a separate thread (in parallel)•Some shared state: hit counter, ...(look familiar?)Timing-dependent failure race condition• hard to reproduce hard to debugThread 52...hits =
View Full Document