COMPUTING PRACTICES Operating System Support for Database Management Michael Stonebraker University of California Berkeley 1 Introduction Database management systems DBMS provide higher level user support than conventional operating systems The DBMS designer must work in the context of the OS he she is faced with Different operating systems are designed for different use In this paper we examine several popular operating system services and indicate whether they are appropriate for support of database management functions Often we will see that the wrong service is provided or that severe performance problems exist When possible we offer some Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage the ACM copyright notice and the title of the publication and its date appear and notice is given that copying is by permission of the Association for Computing Machinery To copy otherwise or to republish requires a fee and or specific permission This research was sponsored by U S Air Force Office of Scientific Research Grant 783596 U S Army Research Offtce Grant DAAG29 76 G 0245 Naval Electronics Systems Command Contract NOOO39 78 G 0013 and National Science Foundation Grant MCS75 03839 AOl Key words and phrases database management operating systems buffer management tile systems scheduling interprocess communication CR Categories 3 50 3 70 4 22 4 33 4 34 4 35 Author s address M Stonebraker Dept of Electrical Engineering and Computer Sciences University of California Berkeley CA 94720 0 1981 ACMOOOl 0782 81 0700 0412 00 75 412 SUMMARY Several operating system services are examined with a view toward their applicability to support of database management functions These services include buffer pool management the file system scheduling process management and interprocess communication and consistency control 2 Buffer Pool Management Many modem operating systems provide a main memory cache for the tile system Figure 1 illustrates this service In brief UNIX provides a buffer pool whose size is set when the operating system is compiled Then all file I O is handled through this cache A file read e g read X in Figure 1 returns data directly from a block in the cache if possible otherwise it causes a block to be pushed to disk and replaced by the desired block In Figure 1 we show block Y being pushed to make room for block X A tile write simply moves data into the cache at some later time the buffer manager writes the block to the disk The UNIX buffer manager used the popular LRU 151replacement strategy Finally when UNIX detects sequential accessto a file it prefetches blocks before they are requested Conceptually this service is desirable because blocks for which there is so called locality of reference 15 181will remain in the cache over repeated reads and writes However the problems enumerated in the following subsectionsarise in using this service for database management Communications of the ACM July 198 1 Volume 24 Number 7 suggestions concerning improvements In the next several sections we look at the services provided by buffer pool management the file system scheduling process management and interprocess communication and consistency control We then conclude with a discussion of the merits of including all files in a paged virtual memory The examples in this paper are drawn primarily from the UNIX operating system 171and the INGRES relational database system 19 201 which was designed for use with UNIX Most of the points made for this environment have general applicability to other operating systems and data managers read X I virtual memory e g Pilot 161 may provide a solution to this problem This topic is examined in detail in Section 6 2 2 LRU Replacement Although the folklore indicates that LRU is a generally good tactic for buffer management it appears to perform only marginally in a database environment Database access in INGRES is a combination oE Although UNIX correctly prefetches pages when sequential access is detected there are important instances in which it fails Except in rare cases INGRES at or very shortly after the beginning of its examination of a block knows Communications of the ACM July 1981 Volume 24 Number 7 2 3 4 of a Cache 2 1 Performance The overhead to fetch a block from the buffer pool manager usually includes that of a system call and a core to core move For UNIX on a PDP 1 l 70 the cost to fetch 5 12 bytes exceeds 5 000 instructions To fetch 1 byte from the buffer pool requires about 1 800 instructions It appears that these numbers are somewhat higher for UNIX than other contemporary operating systems Moreover they can be cut somewhat for VAX 11 780 hardware lo It is hoped that this trend toward lower overhead access will continue However many DBMSs including INGRES 20 and System R 4 choose to put a DBMS managed buffer pool in user space to reduce overhead Hence each of these systems has gone to the trouble of constructing its own buffer pool manager to enhance performance In order for an operating system OS provided buffer pool manager to be attractive the access overhead must be cut to a few hundred instructions The trend toward providing the file system as a part of shared 413 2 4 Crash Recovery An important DBMS service is to provide recovery from hard and soft crashes The desired effect is for a unit of work a transaction which may be quite large and span multiple files to be either completely done or look like it had never started The way many DBMSs provide this service is to maintain an intentions list When the intentions list is complete a commit flag is set The last step of a transaction is to process the intentions list making the actual updates The DBMS makes the last operation idempotent i e it generates the same final outcome no matter how many times the intentions list is processed by careful programming The general procedure is described in 6 131 An alternate process is to do updates as they are found and maintain a log of before images so that backout is possible During recovery from a crash the commit flag is examined If it is set the DBMS recovery utility processes the intentions list to correctly install the changes made by updates in progress at the time of the crash If the flag is not set the utility removes the intentions list thereby backing out the transaction The impact of crash recovery on the buffer pool manager is the following
View Full Document