Speculative Execution in a Distributed File System

Edmund B. Nightingale, Peter M. Chen, and Jason Flinn
Department of Electrical Engineering and Computer Science
University of Michigan

ABSTRACT

Speculator provides Linux kernel support for speculative execution. It allows multiple processes to share speculative state by tracking causal dependencies propagated through interprocess communication. It guarantees correct execution by preventing speculative processes from externalizing output, e.g., sending a network message or writing to the screen, until the speculations on which that output depends have proven to be correct. Speculator improves the performance of distributed file systems by masking I/O latency and increasing I/O throughput. Rather than block during a remote operation, a file system predicts the operation's result, then uses Speculator to checkpoint the state of the calling process and speculatively continue its execution based on the predicted result. If the prediction is correct, the checkpoint is discarded; if it is incorrect, the calling process is restored to the checkpoint, and the operation is retried. We have modified the client, server, and network protocol of two distributed file systems to use Speculator. For PostMark and Andrew-style benchmarks, speculative execution results in a factor of 2 performance improvement for NFS over local-area networks and an order of magnitude improvement over wide-area networks. For the same benchmarks, Speculator enables the Blue File System to provide the consistency of single-copy file semantics and the safety of synchronous I/O, yet still outperform current distributed file systems with weaker consistency and safety.

Keywords

Distributed file systems, speculative execution, causality

1. INTRODUCTION

Distributed file systems often perform substantially worse than local file systems because they perform synchronous I/O operations for cache coherence and data safety. File systems such as AFS [13] and NFS [3] present users with the abstraction
of a single, coherent namespace shared across multiple clients. Although caching data on local clients improves performance, many file operations still use synchronous message exchanges between client and server to maintain cache consistency and protect against client or server failure. Even over a local-area network, the performance impact of this communication is substantial. As latency increases due to physical distance, middleboxes, and routing delays, the performance cost may become prohibitive.

Many distributed file systems weaken consistency and safety to improve performance. Whereas local file systems typically guarantee that a process that reads data from a file will see all modifications previously completed by other processes, distributed file systems such as AFS and NFS provide no such guarantee. For example, most NFS implementations provide close-to-open consistency, which guarantees only that a client that opens a file will see modifications made by other clients that have previously closed the file. Weaker consistency semantics improve performance by reducing the number of synchronous messages that are exchanged. Nevertheless, as our results show, even these weaker semantics are time-consuming.

We demonstrate that with operating system support for lightweight checkpointing, speculative execution, and tracking of causal interdependencies between processes, distributed file systems can be fast, safe, and consistent. Rather than block a process while waiting for the result of a remote communication with a file server, the operating system checkpoints its state, predicts the result of the communication, and continues to execute the process speculatively. If the prediction is correct, the checkpoint is discarded; if it is false, the application is rolled back to the checkpoint.

Our solution relies on three observations. First, file system clients can correctly predict the result of many operations. For instance, consistency checks seldom fail since concurrent file updates are rare. Second, the
time to take a lightweight checkpoint is often much less than network round-trip time to the server, so substantial work can be done while waiting for a remote request to complete. Finally, modern computers often have spare resources that can be used to execute processes speculatively. Encouraged by these observations and by the many prior successful applications of speculation in processor design, we have added support for speculative execution, which we call Speculator, to the Linux kernel.

Categories and Subject Descriptors

D.4.3 [Operating Systems]: File Systems Management: Distributed file systems; D.4.7 [Operating Systems]: Organization and Design; D.4.8 [Operating Systems]: Performance

General Terms

Performance, Design

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SOSP'05, October 23-26, 2005, Brighton, United Kingdom. Copyright 2005 ACM 1-59593-079-5/05/0010 $5.00.

[Figure 1: Example of speculative execution for NFS. Panels: (a) Unmodified NFS; (b) Speculative NFS.]

[...] In addition, our version of BlueFS provides synchronous I/O, in which all file modifications are safe on the server's disk before an operation is observed to complete. Despite providing these strong guarantees, BlueFS is 66% faster than non-speculative NFS over a LAN and more than 11 times faster with a 30 ms delay.

In our work, the distributed file system controls when speculations start, succeed, and fail. Speculator provides a mechanism for correct execution of speculative
code. It does not allow a process that is executing speculatively to externalize output, e.g., make network transmissions or display output to the screen, until the speculations on which that output depends prove to be correct. If a speculative process tries to execute a potentially unrecoverable operation, e.g., it calls the reboot system call, it is blocked until its speculations are resolved. Speculator tracks causal dependencies between kernel objects in order to share speculative state among multiple
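
The checkpoint, predict, and commit-or-roll-back cycle described in this introduction can be illustrated with a small user-level sketch. This is not Speculator's kernel implementation: the class `SpeculativeProcess`, the function `speculative_call`, and the `use_attributes` continuation are hypothetical names introduced only for illustration. The sketch also runs the remote operation sequentially rather than concurrently, and on a misprediction it re-runs the continuation with the actual result, which is a simplification of retrying the operation.

```python
import copy


class SpeculativeProcess:
    """Toy model of one process running under at most one speculation."""

    def __init__(self, state):
        self.state = state        # process-visible state
        self.released = []        # output that has actually been externalized
        self._checkpoint = None   # saved copy of state while speculating
        self._pending = []        # output held back while speculative

    def speculate(self):
        # Take a checkpoint before continuing on a predicted result.
        self._checkpoint = copy.deepcopy(self.state)

    def emit(self, message):
        # Output is buffered, not externalized, while a speculation is open.
        if self._checkpoint is not None:
            self._pending.append(message)
        else:
            self.released.append(message)

    def commit(self):
        # Prediction was correct: discard checkpoint, release held output.
        self._checkpoint = None
        self.released.extend(self._pending)
        self._pending = []

    def rollback(self):
        # Prediction was wrong: restore checkpoint, drop held output.
        self.state = self._checkpoint
        self._checkpoint = None
        self._pending = []


def speculative_call(proc, predicted, remote_op, continuation):
    """Run `continuation` on a predicted result, then validate it."""
    proc.speculate()
    continuation(proc, predicted)   # speculative execution
    actual = remote_op()            # assumption: stands in for the server reply
    if actual == predicted:
        proc.commit()
    else:
        proc.rollback()
        continuation(proc, actual)  # re-execute non-speculatively
    return actual


def use_attributes(proc, mtime):
    # Hypothetical continuation: consume a file attribute and print it.
    proc.state["mtime"] = mtime
    proc.emit(f"mtime={mtime}")


proc = SpeculativeProcess({"mtime": None})
speculative_call(proc, 100, lambda: 100, use_attributes)  # correct prediction
speculative_call(proc, 100, lambda: 200, use_attributes)  # misprediction
print(proc.released)
```

In the second call, the message produced under the wrong prediction is never released; only the output from the re-execution reaches `proc.released`, mirroring the rule that speculative output is not externalized until its speculations are resolved.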