NFS & Distributed Systems Issues
Vivek Pai
Dec 6, 2001

Mechanics
  • A few words about Project 5
  • It’s not just another webserver project

The Next Project
  • Behavioral spec
  • Implementation up to you
  • Can assume max of 128 procs/threads
  • Use a simple counter to implement simple counts
  • I may release a tool to test easier

Behavioral Spec
  • The following behavioral spec is important
  • If there aren’t enough free processes/threads, the server should spawn one per second
  • If there are too many free, one should be killed per second
  • This should not depend on any other activity in the system

Caching Mmap
  • Always use mmap
  • Keep cache of active & inactive maps
  • Total cache size in KB should be limited by command-line argument
  • Can only exceed this limit if all mappings are active

Man Pages You May Like
  • mmap, munmap
  • man -k pthread
  • flock
  • sleep
  • signal
  • alarm

Being A Good User
  • Do not fork wildly
  • Try to test on non-shared system

Imagine The Following
  • Everyone has a desktop machine
  • Each machine has a user
  • Each user has a home directory
  • What problems arise?
      • Can’t move between machines
      • Can’t easily share files with others
      • How does this data get backed up?

Was It Always Like This?
  • No
  • Think mainframes:
      • Big, centralized box
      • All disks attached
      • Programs ran on box
      • Only terminals/monitors on each desk

How Did We Get Here?
  • Mainframe killers advocated little boxes
  • Lots of little boxes are a distributed system
  • Distributed systems introduce new problems

Why Use Little Boxes?
  • Little boxes are cheap
      • Easier to order a PC than a mainframe
  • Little boxes are disposable
      • No need for a maintenance contract
  • Economy of scale
      • Design cost amortized over more units

Were Minis Immune?
  • Minicomputers were “department”-sized versus “company”-sized
  • Most information not shared among everyone
  • Administrator per department OK
  • Shared resources only within department OK

Why Not Just Shared Disk?
  • Centralized storage
      • Easier administration/backup
      • Better use of capacity
      • Easier to build large filesystem cache
      • Easier to provide AC/power
  • Problem: compare bandwidth
      • 10 Mbit/sec Ethernet at the time
      • Switched versus shared irrelevant

New Problem
  • Single point of failure
      • Means everything depends on this item
      • In other cases, duplication helps
  • Common failures = reboot
      • But all information (state) lost
      • All clients would have to be told
      • We’d need to keep track of all clients
          • On stable storage!

Toward Statelessness
  • Make server as dumb as possible
  • Shift burdens to client-side
      • Client failure only harms that client
  • Each operation is self-contained
  • Repeating operations permissible
      • Idempotent – repeating causes no change

Idempotency
  • Regular Unix system call
      • write(fd, buf, size)
      • Writes size bytes at current position, moves position forward by size
  • Idempotent version
      • pwrite(fd, buf, size, offset)
  • Idempotent operations in NFS hidden from user programs

Distributed Caching
  • Local filesystems have caches
  • Use caches to offload network traffic
  • Same object replicated in many caches
      • No problem for reads
      • What happens on write/update?
      • Multiple different copies of data?
      • What happens if it’s metadata?

Distributed Write Problem
  • Possible approaches
      • Disallow caching on writes
          • What about emacs?
      • Disallow caching of shared files
          • What happens for really big files?
      • Disallow caching of metadata writes
  • What disk blocks does OS care about?

Sun’s Write Philosophy
  • File block write sharing not an issue
      • Very few programs do it
      • Correctness depends on program
  • Reduce window of opportunity
      • Flush dirty blocks periodically
      • Flush can be asynchronous

Metadata Operations
  • Performed synchronously at server
  • Must be reflected to disk
  • Why: stability
  • Overhead: disk op + network
  • Can we speed up synchronous ops?

New Statelessness Problems
  • Stale file handle problem
      • cd ~vivek/temp1/temp in window A
      • rm -r ~vivek/temp1 in window B
      • “ls” in window A
  • Stale inode problem
      • Machine A gets file for read
      • Filesystem reformatted by admin
      • Machine A modifies file, tries to write

What Slows Down Servers
  • Network overhead
      • Disk DMA in 4KB pieces
      • Network processing in 1500 byte packets + manipulation
  • Multiple CPUs
  • Synchronous operations
  • Nonvolatile memory +
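
The Idempotency slide contrasts write, which depends on the current file position, with pwrite, which names its offset explicitly. The sketch below is one way to see the difference when a request is repeated, as a client might do after a lost reply. It is an illustration only, not NFS code: the function name demo and the temporary file names are made up for the example, and it assumes a Unix system where Python's os.pwrite is available.

```python
import os
import tempfile


def demo():
    """Repeat the same 3-byte write twice with write() and with pwrite().

    Returns the final file contents for each case. The file names and
    this helper are hypothetical, chosen only for the illustration.
    """
    results = {}
    with tempfile.TemporaryDirectory() as tmpdir:
        # Case 1: write() uses and advances the current position,
        # so a repeated request appends a second copy.
        path = os.path.join(tmpdir, "with_write")
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.write(fd, b"abc")
            os.write(fd, b"abc")  # the "retry"
        finally:
            os.close(fd)
        with open(path, "rb") as f:
            results["write"] = f.read()

        # Case 2: pwrite() carries its own offset, so repeating the
        # request rewrites the same bytes and changes nothing.
        path = os.path.join(tmpdir, "with_pwrite")
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.pwrite(fd, b"abc", 0)
            os.pwrite(fd, b"abc", 0)  # the "retry"
        finally:
            os.close(fd)
        with open(path, "rb") as f:
            results["pwrite"] = f.read()
    return results


if __name__ == "__main__":
    for call, contents in demo().items():
        print(call, contents)
```

In this sketch the repeated write leaves six bytes in the file while the repeated pwrite leaves three, which is why a server that may see the same request twice would prefer operations that carry their own offset.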