UW-Madison COMPSCI 838 Topic - Latency Management in Storage Systems

Latency Management in Storage Systems

Rodney Van Meter, Quantum ([email protected])
Minxi Gao, U.C. Berkeley ([email protected])
(Author's current address: Nokia, Santa Cruz, CA.)

Abstract

Storage Latency Estimation Descriptors, or SLEDs, are an API that allows applications to understand and take advantage of the dynamic state of a storage system. By accessing data in the file system cache or high-speed storage first, total I/O workloads can be reduced and performance improved. SLEDs report estimated data latency, allowing users, system utilities, and scripts to make file access decisions based on those retrieval time estimates. SLEDs thus can be used to improve individual application performance, reduce system workloads, and improve the user experience with more predictable behavior.

We have modified the Linux 2.2 kernel to support SLEDs, and several Unix utilities and astronomical applications have been modified to use them. As a result, execution times of the Unix utilities when data file sizes exceed the size of the file system buffer cache have been reduced by amounts ranging from 50% to more than an order of magnitude. The astronomical applications incurred 30-50% fewer page faults and reductions in execution time of 10-35%. Performance of applications which use SLEDs also degrades more gracefully as data file size grows.

1 Introduction

Storage Latency Estimation Descriptors, or SLEDs, abstract the basic characteristics of data retrieval in a device-independent fashion. The ultimate goal is to create a mechanism that reports detailed performance characteristics without being tied to a particular technology.

Storage systems consist of multiple devices with different performance characteristics, such as RAM (e.g., the operating system's file system buffer cache), hard disks, CD-ROMs, and magnetic tapes. These devices may be attached to the machine on which the application is running, or may be attached to a separate server machine. All of these elements communicate via a variety of interconnects, including SCSI buses and Ethernets. As systems and applications create and access data, it moves among the various devices along these interconnects.

Hierarchical storage management (HSM) systems with capacities up to a petabyte currently exist, and systems up to 100 PB are currently being designed [LLJR99, Shi98]. In such large systems, tape will continue to play an important role. Data is migrated to tape for long-term storage and fetched to disk as needed, analogous to movement between disk and RAM in conventional file systems. A CD jukebox or tape library automatically mounts media to retrieve requested data.

Storage systems have a significant amount of dynamic state, a result of the history of accesses to the system. Disks have head and rotational positions, tape drives have seek positions, and autochangers have physical positions as well as a set of tapes mounted on various drives. File systems are often tuned to give cache priority to recently used data, as a heuristic for improving future accesses. As a result of this dynamic state, the latency and bandwidth of access to data can vary dramatically: in disk-based file systems, by four orders of magnitude (from microseconds for cached, unmapped data pages to tens of milliseconds for data retrieved from disk), and in HSM systems, by as much as eleven (from microseconds up to hundreds of seconds for tape mount and seek).

File system interfaces are generally built to hide this variability in latency. A read() system call works the same for data to be read from the file system buffer cache as for data to be read from disk. Only the behavior is different; the semantics are the same, but in the first case the data is obtained in microseconds, and in the second, in tens of milliseconds.

CPU performance is improving faster than storage device performance. It therefore becomes attractive to expend CPU instructions to make more intelligent decisions concerning I/O. However, with the strong abstraction of file system interfaces, applications are limited in their ability to contribute to I/O decisions; only the system has the information necessary to schedule I/Os.

SLEDs are an API that allows applications and libraries to understand both the dynamic state of the storage system and some elements of the physical characteristics of the devices involved, in a device-independent fashion. Using SLEDs, applications can manage their patterns of I/O calls appropriately. They may reorder or choose not to execute some I/O operations. They may also report predicted performance to users or other applications.

SLEDs can be contrasted to file system hints, as shown in Figure 1. Hints are the flow of information down the storage system stack, while SLEDs are the flow of information up the stack. The figure is drawn with the storage devices as well as the storage system software participating. In current implementations of these concepts, the storage devices are purely passive, although their characteristics are measured and presented by proxy for SLEDs.

[Figure 1: SLEDs and hints in the storage system stack. Applications sit in user space above the storage system software in the kernel and the storage devices in hardware; hints flow down this stack, while SLEDs flow up.]

This paper presents the first implementation and measurement of the concept of SLEDs, which we proposed in an earlier paper [Van98]. We have implemented the SLEDs system in kernel and library code under Linux (Red Hat 6.0 and 6.1 with 2.2 kernels), and modified several applications to use SLEDs.

The applications we have modified demonstrate the different uses of SLEDs. wc and grep were adapted to reorder their I/O calls based on SLEDs information. The performance of wc and grep has been improved by 50% or more over a broad range of file sizes, and by more than an order of magnitude under some conditions. find is capable of intelligently choosing not to perform certain I/Os. The GUI file manager gmc reports estimated retrieval times, improving the quality of information users have about the system.

We also modified LHEASOFT, a large, complex suite of applications used by professional astronomers for image processing [NAS00]. One member of the suite, fimhisto, which copies the data file and appends a histogram of the data to the file, showed a reduction in page faults of 30-50% and a 15-25% reduction in execution time for files larger than the file system buffer cache. fimgbin, which rebins an image, showed a reduction of 11-35% in execution time for various parameters. The smaller improvements are due in part to the complexity of these applications relative to wc and grep.
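The preview describes the SLEDs interface only at this high level, so the exact calls are not shown. The sketch below is therefore a hypothetical illustration rather than the paper's API: sled_query(), struct sled_estimate, and the chunk granularity are names invented for this example. It shows the core idea behind the wc and grep adaptations: ask for per-region latency estimates on an open file, then process the low-latency (cached) regions first.

    /*
     * Minimal sketch of the "read the cheap regions first" idea, assuming a
     * hypothetical sled_query() call.  The real SLEDs interface is not shown
     * in this preview, so the names and types here are illustrative only.
     */
    #define _XOPEN_SOURCE 700
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define CHUNK      (64 * 1024)   /* granularity of estimates (assumed) */
    #define MAX_CHUNKS 1024

    struct sled_estimate {
        off_t  offset;               /* start of the region               */
        size_t length;               /* bytes covered by this estimate    */
        double latency_us;           /* estimated time to first byte (us) */
    };

    /*
     * Stand-in for a SLEDs query: carve the file into CHUNK-sized regions
     * with a placeholder latency.  A real implementation would ask the
     * kernel which regions are cached, on disk, or on tape.
     */
    static int sled_query(int fd, struct sled_estimate *est, int max_entries)
    {
        struct stat st;
        off_t nchunks;
        int i, n;

        if (fstat(fd, &st) < 0)
            return 0;
        nchunks = (st.st_size + CHUNK - 1) / CHUNK;
        n = nchunks > max_entries ? max_entries : (int)nchunks;
        for (i = 0; i < n; i++) {
            est[i].offset     = (off_t)i * CHUNK;
            est[i].length     = CHUNK;
            est[i].latency_us = 0.0;  /* placeholder; the kernel would fill this in */
        }
        return n;
    }

    /* Sort regions so the lowest-latency (e.g., cached) ones come first. */
    static int by_latency(const void *a, const void *b)
    {
        const struct sled_estimate *x = a, *y = b;
        return (x->latency_us > y->latency_us) - (x->latency_us < y->latency_us);
    }

    int main(int argc, char **argv)
    {
        static char buf[CHUNK];
        struct sled_estimate est[MAX_CHUNKS];
        int fd, i, n;

        if (argc != 2 || (fd = open(argv[1], O_RDONLY)) < 0)
            return 1;

        n = sled_query(fd, est, MAX_CHUNKS);
        qsort(est, n, sizeof(est[0]), by_latency);

        /* Visit cheap regions first; this only works for order-independent
         * computations such as counting words or lines. */
        for (i = 0; i < n; i++) {
            size_t  len = est[i].length < sizeof(buf) ? est[i].length : sizeof(buf);
            ssize_t got = pread(fd, buf, len, est[i].offset);
            if (got > 0) {
                /* ... per-chunk work goes here ... */
            }
        }
        close(fd);
        return 0;
    }

The same estimates could also drive a find-style decision to skip files whose retrieval cost exceeds some threshold, or simply be reported to the user, as gmc does with estimated retrieval times.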

