
CS 636 Internetworking
Ramana Kompella
END NODE ALGORITHMICS
Lecture 7: Copying

Recap: Web server sending a file
• Web server reads from disk and writes to the socket
  ◦ Two system calls: read() and write()
• read():
  ◦ From disk to file cache
  ◦ From file cache to application buffer
• write():
  ◦ From application buffer to socket buffer
  ◦ From socket buffer to NIC

Recap: Web server sending a file
[Figure: data path of a file send. The web server application calls read() and write(); in the kernel, the file system holds the file cache buffer and TCP/IP holds the socket buffer; the application holds the server buffer. CPU and memory sit on the memory bus; the disk and the NIC adaptor sit on the I/O bus. The four copies are labeled Copy 1 to Copy 4.]

Local restructuring solutions
• Exploiting adaptor memory
  ◦ Use P13 (degree of freedom)
  ◦ Memory-mapped I/O
• Use copy-on-write
  ◦ Exploit virtual memory
  ◦ Explicit COW bits to raise interrupts
• Use page remapping
  ◦ Reconfigure virtual page mappings
  ◦ Fbufs

Transparent COW (TCOW)
• Preserves copy semantics
  ◦ Allows the application to change a buffer even after passing it to the kernel
• Fbufs makes this illegal
• How to preserve the API and yet apply COW?
• Modify the fault handler
  ◦ After write(), the application buffer is made read-only
  ◦ First write → page fault handler → check outstanding sends → create a new copy
• Too complicated with TCP

Just change the protocol!
• Web requests use GET to a web server
• What if a client could say
  ◦ I want a file X
  ◦ I want you to put that file between addresses A and B in my memory
• The server could say
  ◦ I will schedule a DMA into your memory
• Voila! Transfer done!

Remote DMA
• RDMA is just an extension of DMA over the network
• Originally proposed by a group of architects for VAX clusters (1986)
• Goal: no per-packet mediation by the CPUs
• Two issues:
  ◦ How the receiver knows where to place data
  ◦ How security and integrity are handled

Remote DMA
• Used within SANs and clusters
• Can transfer megabytes without CPU involvement
• Incorporated in modern protocols: Fibre Channel, InfiniBand, iSCSI
• Buffer IDs are random strings that are hard to guess
[Figure: buffer B at the source, made up of pieces B1 and B2, is placed into Page 11 and Page 16 at the destination.]

Where all RDMA is used
• Storage area networks
  ◦ A separate high-speed network for storage
  ◦ Isolated from the messaging network
  ◦ Optimized for moving data between servers and storage devices
  ◦ E.g. Fibre Channel, InfiniBand, iSCSI
• Limitations of traditional architectures
  ◦ Data unavailable if the server fails
  ◦ Bandwidth saturation during backups

What are storage area networks?
[Figure: not recoverable from the extracted text.]

Storage technologies
• Fibre Channel
  ◦ Transfers data between two ports
  ◦ At speeds of 1.0625 Gbit/s
  ◦ For storage, via the SCSI-3 protocol
  ◦ Using optical and copper media
• InfiniBand
  ◦ Merges network, disk, and PCI bus into one
• iSCSI
  ◦ Gigabit Ethernet is cheaper than Fibre Channel

Web server sending a file
• Web server reads from disk and writes to the socket
  ◦ Two system calls: read() and write()
• So far, we have optimized only the write()
• How about merging reads and writes to eliminate copy 2?

Memory-mapped I/O
• Removes copy 2
  ◦ Server buffer is mapped to the same pages as the file cache buffer
• mmap() system call
  ◦ Allows an application to map a file directly into its address space
  ◦ Like a copy of the file in the application cache
  ◦ P4: leverage system components

Example: Flash
• Flash web server
  ◦ Avoids copy 1 and copy 2 by using mmap()
  ◦ Caches frequently used files in the application buffers
  ◦ Limited storage, so use cache replacement
• Duplication of caching
  ◦ File system also caches files
  ◦ Copy 3 not avoided here (oh, man!)

How to avoid copy 2 and 3?
• mmap() + TCOW (copy-on-write)
  ◦ Does nothing for dynamic content and the web server
  ◦ CGI processes transmit data via IPC
  ◦ What about TCP checksums?

IO-Lite: Unified view of buffering
[Figure: the same data path as in the recap figure, with a single IO-Lite buffer taking the place of the separate server buffer, socket buffer, and file cache buffer. A cached checksum and a cached response are kept alongside it. Only Copy 1 and Copy 2 are labeled.]

Key data structures
• Immutable buffers (read-only)
• Composite buffers (aggregates)
• Lazily created cache of buffers

IO-Lite API
• IO-Lite and applications
  ◦ To take full advantage of IO-Lite, application programs can use an extended I/O API that is based on buffer aggregates.
  ◦ IO-Lite I/O API:
      size_t IOL_read(int fd, IOL_Agg **aggr, size_t size)
      size_t IOL_write(int fd, IOL_Agg *aggr)

Approaches: Pass a buffer aggregate from process A to process B
• How to do VM page remapping?
• Possible approach 1
  ◦ Find any empty entry, and modify the VM address contained in the buffer aggregate
• Possible approach 2
  ◦ Reserve the range of virtual addresses of buffers in the address space of each process

IO-Lite optimizations
• Cross-subsystem optimization
  ◦ Optimizations across applications and OS subsystems
  ◦ Not possible in conventional I/O systems
  ◦ E.g. checksum
  ◦ Generation number and address uniquely identify data contents

IO-Lite design
• Operation in a web server
  ◦ IO-Lite's ability to eliminate data copying and multiple buffering can dramatically reduce the cost of serving static and dynamic content
  ◦ The impact is particularly strong when a cached copy of the requested content exists

IO-Lite performance benefits
[Figure: not recoverable from the extracted text.]

Limitations
• A page can be both a VM page and a file page: complex replacement policies
• Must deal with complex sharing patterns; several applications might share a buffer

I/O splicing
• Extend the API with the sendfile() system call
• A full file is sent without ever passing through user space
• The kernel uses the file cache buffer as the socket buffer

I/O splicing
[Figure: the web server application calls sendfile(); the kernel sends directly from the file cache buffer. CPU and memory sit on the memory bus; the disk and the NIC adaptor sit on the I/O bus. Only Copy 1 and Copy 2 remain.]

Broadening beyond copies
• VJ's idea:
  ◦ RISC processors have a delay slot between loads and stores
  ◦ The empty cycle is used for other computation
  ◦ E.g. copy loop → checksum
• Clark & Tenenhouse
  ◦ Generalize VJ's idea to
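The "copy loop → checksum" idea above can be sketched in C. This is a minimal illustration, not code from the lecture: the function name copy_and_checksum is hypothetical, and it assumes the checksum in question is the 16-bit one's-complement Internet checksum used by TCP. Each byte is loaded once, stored to the destination, and added to the running sum in the same pass, so the data is not read from memory a second time just to compute the checksum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (name and signature are assumptions for illustration).
 * Copies len bytes from src to dst and returns the 16-bit one's-complement
 * Internet checksum of the data, computed during the copy itself.
 */
uint16_t copy_and_checksum(uint8_t *dst, const uint8_t *src, size_t len)
{
    uint64_t sum = 0;   /* wide accumulator, so carries are not lost */
    size_t i = 0;

    assert(len == 0 || (dst != NULL && src != NULL));

    /* One pass: load two bytes, store them, add them as a 16-bit word. */
    for (; i + 1 < len; i += 2) {
        uint8_t hi = src[i];
        uint8_t lo = src[i + 1];
        dst[i] = hi;
        dst[i + 1] = lo;
        sum += ((uint32_t)hi << 8) | lo;
    }

    /* Odd trailing byte: copy it and treat it as padded with a zero byte. */
    if (i < len) {
        dst[i] = src[i];
        sum += (uint32_t)src[i] << 8;
    }

    /* Fold the carries back into the low 16 bits (end-around carry). */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A caller would use this in place of a memcpy() followed by a separate checksum pass. For the eight bytes 00 01 f2 03 f4 f5 f6 f7, the 16-bit words sum to 0xddf2 after folding, so the function returns the complement, 0x220d. A production version would work on wider words and unroll the loop; the byte-pair form here only aims to make the single pass visible.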



Purdue CS 63600 - Lecture notes
