Outline
- Virtual Memory and I/O
- I/O Systems
- Techniques to Improve I/O Performance
- Other Techniques to Improve I/O Performance
- Summary of First Paper
- Summary of Second Paper
- Features of IO-Lite
- Related work before IO-Lite
- Key Data Structures
- Discussion I
- Impact of Immutable I/O Buffers
- Discussion II
- What does IO-Lite do?
- IO-Lite and IPC
- IO-Lite and Filesystem
- Copy Semantics Illustration 1
- Copy Semantics Illustration 2
- Copy Semantics Illustration 3
- More on File Cache Management & VM Paging
- IO-Lite and Network Subsystem
- A Cross-Subsystem Optimization
- Performance – Competitors
- Performance – Static Content requesting
- Performance – CGI Programs
- Performance – Real Workload
- Performance – WAN Effects
- Performance – Other Applications
- Conclusion on IO-Lite
- Software Prefetching & Caching for TLBs
- Issues in Virtual Memory
- Motivations
- Approach
- Discussion
- Prefetching: What entries to prefetch?
- Prefetching: Details
- Prefetching: Performance
- Slide 37
- Caching: Software Victim Cache
- Caching: Benefits
- Caching: Performance
- Slide 41
- Prefetching + Caching: Performance
- Slide 43

Virtual Memory and I/O
Mingsheng Hong

I/O Systems
- Major I/O hardware
  - Hard disks, network adaptors, …
- Problems related to I/O systems
  - Various types of hardware: device drivers provide the OS with a unified I/O interface
  - Typically much slower than CPU and memory: a system bottleneck
  - Too much CPU involvement in I/O operations

Techniques to Improve I/O Performance
- Buffering
  - e.g. downloading a file from the network
- DMA
- Caching
  - CPU cache, TLB, file cache, ...

Other Techniques to Improve I/O Performance
- Virtual memory page remapping (IO-Lite)
  - Allows (cached) files and memory to be shared by different processes without extra data copies
- Prefetching data (Software Prefetching and Caching for TLBs)
  - Prefetches and caches page table entries

Summary of First Paper
- IO-Lite: A Unified I/O Buffering and Caching System (Pai et al., Best Paper of 3rd OSDI, 1999)
- A unified I/O system
  - Uses immutable data buffers to store all I/O data (only one physical copy)
  - Uses VM page remapping
  - IPC
  - file system (disk files, file cache)
  - network subsystem

Summary of Second Paper
- Software Prefetching and Caching for Translation Lookaside Buffers (Bala et al., 1994)
- A software approach to help reduce TLB misses
- Works well for IPC-intensive systems
- Bigger performance gain for future systems

Features of IO-Lite
- Eliminates redundant data copying
  - Saves CPU work and avoids cache pollution
- Eliminates multiple buffering
  - Saves main memory => improves hit rate of file cache
- Enables cross-subsystem optimizations
  - Caches the Internet checksum
- Supports application-specific cache replacement policies

Related work before IO-Lite
- I/O APIs should preserve copy semantics
- Memory-mapped files
- Copy-on-write
- Fbufs

Key Data Structures
- Immutable buffers and buffer aggregates

Discussion I
- When we pass a buffer aggregate from process A to process B, how do we efficiently do VM page remapping (modify B’s page table entries)?
- Possible approach 1: find any empty entry, and modify the VM address contained in the buffer aggregate
  - Very inefficient
- Possible approach 2: reserve the range of virtual addresses of buffers in the address space of each process
  - This limits the total size of buffers. What about dynamically allocated buffers?

Impact of Immutable I/O Buffers
- Copy-on-write optimization
  - Modified values are stored in a new buffer, as opposed to “in-place modification”
- Three situations when the data object is …
  - Completely modified
    - Allocates a new buffer
  - Partially modified (modification localized)
    - Chains unmodified and modified portions of data
  - Partially modified (modification not localized)
    - Compares the cost of writing an entire object with that of chaining; chooses the cheaper method

Discussion II
- How to measure the two costs?
  - Heuristics needed
  - Fragmented data vs. clustered data
  - Chained data increases reading cost
  - Similar to the shadow page technique used in System R
- Should the cost of retrieving data from the buffer also be considered?

What does IO-Lite do?
- Reduces extra data copies in
  - IPC
  - file system (disk files, file cache)
  - network subsystem
- Makes cross-subsystem optimization possible

IO-Lite and IPC
- Operations on buffers and aggregates
  - When I/O data is transferred
    - Related aggregates are passed by value
    - Associated buffers are passed by reference
  - When a buffer is deallocated
    - Buffer is returned to a memory pool
    - Buffer’s VM page mappings persist
  - When a buffer is reused (by the same process)
    - No further VM map changes required
    - (Temporarily) grant write permission to the associated producer process

IO-Lite and Filesystem
- IO-Lite I/O APIs provided
  - IOL_read(int fd, IOL_Agg **aggr, size_t size)
  - IOL_write(int fd, IOL_Agg **aggr)
  - IOL_write operations are atomic (concurrency support)
  - I/O functions in the stdio library are reimplemented
- Filesystem cache reorganized
  - Buffer aggregates (pointers to data), instead of file data, are stored in the cache
- Copy semantics ensured
  - Suppose a portion of a cached file is read, and then is overwritten

Copy Semantics Illustration 1

Copy Semantics Illustration 2

Copy Semantics Illustration 3

More on File Cache Management & VM Paging
- Cache replacement policy (can be customized)
  - The eviction order is determined by current reference status and time of last file access
  - One entry is evicted when the file cache “appears” to be too large
  - One entry is added on every file cache miss
- When a buffer page is paged out, its data is written back to swap space, and possibly to several other disk locations (for different files)

IO-Lite and Network Subsystem
- Access control and protection for processes
  - ACLs are associated with buffer pools
  - Must determine the ACL of a data object prior to allocating memory for it
- Early demultiplexing technique to determine the ACL for each incoming packet

A Cross-Subsystem Optimization
- Internet checksum caching
  - Cache the computed checksum for each slice of a buffer aggregate
  - Increment the version number when a buffer is reallocated; this can be used to check whether data changed
  - Works well for static files. Also has a big benefit for CGI programs that chain dynamic data with static data

Performance – Competitors
- Flash Web server: a high-performance HTTP server
- Flash-Lite: a modified version of Flash using the IO-Lite API
- Apache 1.3.1: representing the widely used Web server today

Performance – Static Content requesting

Performance – CGI Programs

Performance – Real
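
The "Impact of Immutable I/O Buffers" slide says that a localized modification chains unmodified and modified portions of data instead of editing in place. The following is a minimal Python sketch of that idea, not IO-Lite's actual implementation: the names `Buffer`, `Slice`, `Aggregate`, `sub`, and `overwrite` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Buffer:
    """Hypothetical immutable I/O buffer: contents never change after creation."""
    data: bytes


@dataclass(frozen=True)
class Slice:
    """A contiguous byte range inside one buffer."""
    buf: Buffer
    offset: int
    length: int

    def view(self) -> bytes:
        return self.buf.data[self.offset:self.offset + self.length]


class Aggregate:
    """Ordered list of slices; the aggregate is copied, the buffers are shared."""

    def __init__(self, slices: List[Slice]):
        self.slices = [s for s in slices if s.length > 0]

    @classmethod
    def from_bytes(cls, data: bytes) -> "Aggregate":
        return cls([Slice(Buffer(data), 0, len(data))])

    def __len__(self) -> int:
        return sum(s.length for s in self.slices)

    def to_bytes(self) -> bytes:
        return b"".join(s.view() for s in self.slices)

    def sub(self, start: int, end: int) -> "Aggregate":
        """Aggregate covering [start, end), built without copying any data."""
        out, pos = [], 0
        for s in self.slices:
            lo, hi = max(start, pos), min(end, pos + s.length)
            if lo < hi:
                out.append(Slice(s.buf, s.offset + (lo - pos), hi - lo))
            pos += s.length
        return Aggregate(out)

    def overwrite(self, start: int, new_data: bytes) -> "Aggregate":
        """Localized modification: chain old prefix, a new buffer, old suffix."""
        end = start + len(new_data)
        assert 0 <= start and end <= len(self), "sketch only handles in-range writes"
        patch = Aggregate.from_bytes(new_data)
        return Aggregate(
            self.sub(0, start).slices + patch.slices + self.sub(end, len(self)).slices
        )


original = Aggregate.from_bytes(b"hello world")
modified = original.overwrite(6, b"W")
print(original.to_bytes(), modified.to_bytes())
```

The original aggregate still reads its old contents after the overwrite, which is what preserves copy semantics for a reader that obtained the data earlier. The cost noted in Discussion II is also visible: the modified aggregate now has three slices to walk instead of one.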
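
The checksum caching described under "A Cross-Subsystem Optimization" can be sketched as follows. This is an illustration under stated assumptions rather than IO-Lite's code: `PooledBuffer`, `ChecksumCache`, and the cache key of (buffer id, version, offset, length) are hypothetical. The checksum itself is the standard 16-bit one's-complement Internet checksum, whose partial sums can be combined per slice, with a byte swap for slices that start at an odd offset in the message.

```python
def fold16(total: int) -> int:
    """Fold carries back in, as one's-complement addition requires."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ones_sum(data: bytes) -> int:
    """16-bit one's-complement sum of big-endian words (odd length zero-padded)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    return fold16(total)


def swap16(x: int) -> int:
    return ((x & 0xFF) << 8) | (x >> 8)


class PooledBuffer:
    """Hypothetical pool buffer: contents are fixed until it is reallocated."""

    def __init__(self, buf_id: int, data: bytes):
        self.buf_id, self.version, self.data = buf_id, 0, data

    def reallocate(self, data: bytes) -> None:
        self.version += 1  # a new version invalidates cached sums for this buffer
        self.data = data


class ChecksumCache:
    def __init__(self):
        self._sums = {}
        self.computed = 0  # number of slice sums actually computed

    def partial_sum(self, buf: PooledBuffer, offset: int, length: int) -> int:
        key = (buf.buf_id, buf.version, offset, length)
        if key not in self._sums:
            self._sums[key] = ones_sum(buf.data[offset:offset + length])
            self.computed += 1
        return self._sums[key]


def internet_checksum(cache: ChecksumCache, slices) -> int:
    """Checksum of an aggregate given as a list of (buffer, offset, length)."""
    total, pos = 0, 0
    for buf, offset, length in slices:
        part = cache.partial_sum(buf, offset, length)
        if pos % 2:  # slice starts on an odd byte of the message
            part = swap16(part)
        total = fold16(total + part)
        pos += length
    return ~total & 0xFFFF


static = PooledBuffer(1, b"abc")     # e.g. cached file data
dynamic = PooledBuffer(2, b"defg")   # e.g. CGI-generated data
cache = ChecksumCache()
aggregate = [(static, 0, 3), (dynamic, 0, 4)]
first = internet_checksum(cache, aggregate)
second = internet_checksum(cache, aggregate)  # served entirely from the cache
print(hex(first), first == second, cache.computed)
```

After `dynamic.reallocate(...)`, only the dynamic slice is recomputed; the static slice still hits the cache. This matches the slide's point about CGI programs that chain dynamic data with static data.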