Information and Control in Gray-Box Systems

Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Department of Computer Sciences, University of Wisconsin, Madison
dusseau@cs.wisc.edu, remzi@cs.wisc.edu

ABSTRACT

In modern systems, developers are often unable to modify the underlying operating system. To build services in such an environment, we advocate the use of gray-box techniques. When treating the operating system as a gray box, one recognizes that not changing the OS restricts, but does not completely obviate, both the information one can acquire about the internal state of the OS and the control one can impose on the OS. In this paper, we develop and investigate three gray-box Information and Control Layers (ICLs) for determining the contents of the file cache, controlling the layout of files across local disk, and limiting process execution based on available memory. A gray-box ICL sits between a client and the OS and uses a combination of algorithmic knowledge, observations, and inferences to garner information about or control the behavior of a gray-box system. We summarize a set of techniques that are helpful in building gray-box ICLs and have begun to organize a gray toolbox to ease the construction of ICLs. Through our case studies, we demonstrate the utility of gray-box techniques by implementing three useful OS-like services without the modification of a single line of OS source code.

1 INTRODUCTION

Modern operating systems are large, complex bodies of code in which hundreds of programmer-years have been invested. As a result, modifying an operating system is a difficult, costly, and often impractical endeavor. In an extreme but perhaps realistic view, some researchers have noted that traditional operating systems are so rigid that, to most, the OS is simply hardware masquerading as software [14]. Viewing the operating system as an immutable object is clearly at odds with the bulk of operating systems research, which seeks to
develop and integrate new ideas into operating systems themselves. Thus, to reduce the efforts required to change the OS, a large body of research has investigated how the operating system should be restructured so that it is extensible [8, 13, 15, 35]. In these systems, new functionality or performance improvements can easily be added, often tailored to the desires of particular applications. However, the limitation of these approaches is that they, too, require changes to the operating system; even those efforts that try to minimize OS modifications require that the OS be altered in at least some minor way [17, 22].

To appear in the 18th Symposium on Operating Systems Principles (SOSP 18), October 21-24, 2001, Chateau Lake Louise, Banff, Canada.

Unfortunately, requiring a change to even a single line of OS code can make the deployment of an innovation much less likely. For commercial operating systems, the problem is an obvious one, as many non-technical hurdles must be overcome to persuade a large company to incorporate a new idea. Even if accepted by a single vendor or into an open-source base, without widespread adoption, innovations are likely to go unused, since applications that run cross-platform must use the existing interfaces on other systems. For example, consider a transactional database that manages raw disk to obtain high performance; even if one OS implements an optimized database-oriented file system, there is little incentive to use that file system on the single platform, since doing so complicates the database source code. Thus, only the rare idea gets incorporated widely, and a large range of good ideas are orphaned.

Thus, we believe that a remaining challenge is how to disseminate OS research ideas without requiring any changes to the underlying OS. Some projects, particularly in distributed computing, have addressed building system services on top of unmodified commodity operating systems [18, 26]; however, this approach may appear to be constricting, as it seemingly stifles the implementation
of new functionality.

The thesis of this paper is that a surprisingly large class of OS-like services can be provided to applications without any modification to the OS itself. Specifically, it is often possible to acquire information about the state of the OS and to control its behavior in unexpectedly powerful ways, even when no explicit interfaces to do so exist. With this approach, the OS is treated as a gray box, in which the general characteristics of the algorithms employed by the OS are known. By combining this knowledge with run-time observations of how the OS reacts to various commands and queries, many new services can be implemented.

We term a software layer that provides interfaces to gather information about and to control a gray-box system a gray-box Information and Control Layer (a gray-box ICL). An ICL, residing between clients (e.g., applications) and a gray-box system (e.g., the OS), presents clients with traditional or enhanced interfaces. The interfaces in the ICL allow clients to learn about the state of the underlying system (e.g., what data is in the file cache) and to control its behavior (e.g., place these files near one another on disk). Internally, to obtain information, the ICL may observe the existing client interactions with the gray-box system, or it may itself insert probes into the system; in either case, combining these observations with statistical analyses and a priori knowledge of how the OS behaves may allow the ICL to infer the current state of the OS.

Experienced programmers tend to exploit their knowledge of the behavior of the underlying system; we believe that this knowledge should be encapsulated in ICLs so that these techniques can be used by all programmers. However, gray-box systems go one step further by combining knowledge with measurements and observations, a technique commonly found in microbenchmarks [3, 33, 39, 40, 42]. We believe there exists a strong duality between microbenchmarks and gray-box techniques. First, ICLs often require that underlying components be
benchmarked to configure internal thresholds and parameters. Second, understanding the behavior of ICLs requires understanding the behavior of the OS; thus, ICLs often reveal surprising behavior in the OS, much as a microbenchmark might also do.

In this paper, we explore the challenges of building gray-box ICLs by developing and studying three services. The first is a file-cache content detector (FCCD), which determines the contents of the OS file cache and thus allows applications to re-order file operations to
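The idea of inserting probes and inferring OS state from their timing can be illustrated with a minimal Python sketch. It is not the authors' FCCD implementation: the function names, the one-byte probe size, the probe stride, and the latency threshold are all assumptions chosen for illustration, and a real layer would set the threshold by benchmarking the underlying system, as the introduction suggests.

```python
import os
import time


def probe_read_seconds(fd, offset):
    """Time a one-byte read at `offset` (hypothetical probe; Unix-only os.pread)."""
    start = time.perf_counter()
    os.pread(fd, 1, offset)
    return time.perf_counter() - start


def classify_probes(latencies, threshold_s):
    """Mark a probe True (likely cached) when it finished under the threshold."""
    return [t < threshold_s for t in latencies]


def estimate_cached_fraction(path, stride_bytes=1 << 20, threshold_s=1e-4):
    """Probe one byte per stride and return the fraction inferred to be cached.

    `stride_bytes` and `threshold_s` are assumed defaults, not measured values.
    """
    size = os.path.getsize(path)
    if size == 0:
        return 0.0
    fd = os.open(path, os.O_RDONLY)
    try:
        latencies = [
            probe_read_seconds(fd, offset)
            for offset in range(0, size, stride_bytes)
        ]
    finally:
        os.close(fd)
    hits = classify_probes(latencies, threshold_s)
    return sum(hits) / len(hits)
```

A client might call `estimate_cached_fraction(path)` on several candidate files and read the ones with the highest estimate first. Note that each probe itself reads data and can therefore pull a page into the cache, so the measurement perturbs the state it is trying to observe; keeping probes small and sparse limits that effect.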