Modeling the Relative Fitness of Storage

Michael P. Mesnier,∗ Matthew Wachs, Raja R. Sambasivan, Alice X. Zheng, Gregory R. Ganger
Carnegie Mellon University
Pittsburgh, PA

∗Intel and Carnegie Mellon University

ABSTRACT

Relative fitness is a new black-box approach to modeling the performance of storage devices. In contrast with an absolute model that predicts the performance of a workload on a given storage device, a relative fitness model predicts performance differences between a pair of devices. There are two primary advantages to this approach. First, because a relative fitness model is constructed for a device pair, the application-device feedback of a closed workload can be captured (e.g., how the I/O arrival rate changes as the workload moves from device A to device B). Second, a relative fitness model allows performance and resource utilization to be used in place of workload characteristics. This is beneficial when workload characteristics are difficult to obtain or concisely express (e.g., rather than describe the spatio-temporal characteristics of a workload, one could use the observed cache behavior of device A to help predict the performance of B).

This paper describes the steps necessary to build a relative fitness model, with an approach that is general enough to be used with any black-box modeling technique. We compare relative fitness models and absolute models across a variety of workloads and storage devices. On average, relative fitness models predict bandwidth and throughput within 10–20% and can reduce prediction error by as much as a factor of two when compared to absolute models.

Categories and Subject Descriptors: I.6.5 [Model Development]: Modeling methodologies, D.4.8 [Performance]: Modeling and prediction, D.4.2 [Storage Management].

General Terms: Measurement, Performance.

Keywords: black-box, storage, modeling, CART.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SIGMETRICS'07, June 12–16, 2007, San Diego, California, USA.
Copyright 2007 ACM 978-1-59593-639-4/07/0006 ...$5.00.

1. INTRODUCTION

Relative fitness: the fitness of a genotype compared with another in the same gene system [8].

[Figure 1 (diagram). Step 1: Model differences between devices A and B. Fitness test results from Device A and Device B feed a model learning algorithm, which produces a relative fitness model. Step 2: Use model to predict the performance of B. Model inputs: A's workload characteristics, A's performance, A's resource utilization. Model output: B's relative fitness.]

Figure 1: Relative fitness models predict changes in performance between two devices; an I/O "fitness test" is used to build a model of the performance differences. For new workloads, one inputs into the model the workload characteristics, performance, and resource utilization of a workload on device A to predict the relative fitness of device B.

Storage administration continues to be overly complex and costly. One challenging aspect of administering storage, particularly in large infrastructures, is deciding which application data sets to store on which devices. Among other things, this decision involves balancing loads, matching workload characteristics to device strengths, and ensuring that performance goals are satisfied. Storage administration currently relies on experts who use rules-of-thumb to make educated, but ad hoc, decisions. With a mechanism for predicting the performance of any given workload, one could begin to automate this process [1, 3, 5, 11].

Previous research on such prediction and automation focuses on per-device models that take as input workload characteristics (e.g., request arrival rate and read/write ratio) and output a performance prediction (e.g., throughput). Many envision these device models being constructed automatically in a black-box manner. That is, given pre-deployment measurements on a device, one can train a model to predict the performance of the device as a function of a workload's characteristics [2, 14, 27]. Such black-box models are absolute in the sense that they are trained to predict performance from assumedly static workload characteristics.

Though it sounds simple, the above approach has proven quite difficult to realize in practice, for several reasons. First, workload characterization has been an open problem for decades [9, 16, 27]; describing a complex workload in terms of concise characteristics, without losing necessary information, remains a challenge. Second, and more fundamentally, an absolute model does not capture the connection between a workload and the storage device on which it executes. Generally speaking, application performance may depend on storage performance. If storage performance increases or decreases, the I/O rate of an application could also change. If such feedback is not accounted for, a model's applicability will be limited to environments where workloads are open (not affected by storage performance) or to devices that are similar enough that a workload's feedback will not change significantly when moving between them.

This paper proposes a new black-box approach, called relative fitness modeling, which removes the above constraints. In contrast with an absolute model, a relative fitness model predicts how a workload will change if moved from one device to another by comparing the devices' performance over a large number of workloads. This will naturally capture the effects of workload-device feedback. Further, since relative fitness models are constructed for pairs of devices, rather than one per device, they can take as input performance and resource utilization in addition to basic workload characteristics. In other words, the performance and resource utilization of one device can be used in predicting the performance of another. Often, such observations yield at least as much information as workload characteristics, but are much easier to obtain and describe. For example, although one may not know how to concisely describe access locality, a workload that experiences a high cache hit rate on one device may experience a high hit rate on another.

Relative fitness models can be used to solve storage administration problems in a similar way to absolute models. One would train models
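
The contrast drawn above between an absolute model and a relative fitness model can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' method: the learner is a plain nearest-neighbor regressor standing in for "any black-box modeling technique" (the paper's keywords mention CART), relative fitness is encoded as the ratio of B's bandwidth to A's bandwidth, and the names `knn_predict`, `train_absolute`, `train_relative`, and the `FITNESS_TEST` numbers are all hypothetical.

```python
from math import dist
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], float]  # (feature vector, target value)

def knn_predict(train: List[Row], x: Sequence[float], k: int = 1) -> float:
    """Average the targets of the k training rows nearest to x.

    A stand-in black-box learner; features are left unscaled for brevity.
    """
    nearest = sorted(train, key=lambda row: dist(row[0], x))[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Hypothetical fitness-test results (invented numbers, not measurements).
# Each tuple: (read fraction, cache hit rate on A, MB/s on A, MB/s on B).
FITNESS_TEST = [
    (0.9, 0.80, 40.0, 80.0),
    (0.9, 0.10, 10.0, 12.0),
    (0.1, 0.50, 20.0, 30.0),
]

def train_absolute(samples) -> List[Row]:
    """Absolute model of device B: workload characteristics -> MB/s on B."""
    return [((read_frac,), bw_b) for read_frac, _, _, bw_b in samples]

def train_relative(samples) -> List[Row]:
    """Relative fitness model for the pair (A, B).

    Inputs: workload characteristics plus A's resource utilization and
    performance. Target: the assumed fitness ratio (MB/s on B) / (MB/s on A).
    """
    return [((read_frac, hit_a, bw_a), bw_b / bw_a)
            for read_frac, hit_a, bw_a, bw_b in samples]

def predict_b_absolute(model: List[Row], read_frac: float) -> float:
    return knn_predict(model, (read_frac,))

def predict_b_relative(model: List[Row], read_frac: float,
                       hit_a: float, bw_a: float) -> float:
    ratio = knn_predict(model, (read_frac, hit_a, bw_a))
    return ratio * bw_a  # scale A's observed bandwidth by the predicted ratio

# A new read-heavy workload with poor locality, as observed on device A.
absolute_model = train_absolute(FITNESS_TEST)
relative_model = train_relative(FITNESS_TEST)
print(predict_b_absolute(absolute_model, 0.9))
print(predict_b_relative(relative_model, 0.9, 0.15, 11.0))
```

In this toy data the first two training workloads share the same read fraction but behave very differently, so the absolute model cannot tell them apart from workload characteristics alone. The relative model separates them using A's observed cache hit rate and bandwidth, which is the kind of easily observed input the text argues for.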

