Recap of Feb 27 Disk Block Access and Buffer Management Major concepts in Disk Block Access covered Disk arm Scheduling Non volatile write buffers Clustering Log disks Fragmentation Buffer Management Overview of Buffers Buffer replacement strategies LRU MRU toss immediate pinning Other buffer details buffer interaction with existing OS variations on buffers etc File Organization The database is stored logically as a collection of files Each file is a sequence of records A record is a sequence of fields Easy so far but as we just finished discussing anything stored on disk is stored in blocks which are a physical constraint unrelated to the storage system used for files So how do we organize a file into blocks and records formatting fields within a record formatting records within a block assigning records to blocks Fixed Length Records Simplest approach We know the length of each record and they are all the same store record i starting from byte n i 1 where n is record length record access is simple Fixed Length Records Problems records may cross blocks normal modification don t permit records to cross block boundaries deletion of record i leaves a gap which requires some way of dealing with the empty space E G record 2 A 215 is deleted from the example block on the right Fixed Length Records Deletion One simple fix is to shift all the records down to fill the gap as shown on the right This involves a lot of work so it might be slow Fixed Length Records more Deletion Another fix would be to shift the last record record n to the deleted position I Much less work Not useful if the records are stored in order sorted Fixed Length Records still more Deletion Another possibility is to not move records at all Maintain a header at the beginning of the file Store a link to the list of addresses of deleted records Use each deleted record to store the link to the next deleted record Essentially a linked list often called a free list Variable Length Records Can occur in a database system in several ways storage of multiple record types in a file record types that allow variable lengths for one or more fields record types that allow repeating fields Several ways to store variable length records attach an end of record symbol delimiter to mark the end of each record store the length of the record at the beginning of each record store header information at the beginning of the file with the location and length of each record these techniques can be applied at the file block or record level Variable Length Records Files delimit each record within the file store a length field at the beginning of each record store header information at the beginning of the file with the location and length of each record within the file Blocks delimit each record within the block store a length field at the beginning of each record store header information at the beginning of the block with the location and length of each record inside the block Records delimit each field within the record store a length field at the beginning of each field store header information at the beginning of the record with the location and length of each field Variable Length Records Two more techniques for storing variable length records use fixed length fields reserve space if there is a maximum space just reserve that and mark unused space with a special null end of record symbol wasteful if the maximum record length is much larger than the average record length list representation represent variable length records by lists of fixed length records chained together by pointers useful for variable length records caused by repeating same length fields we don t want a single field of the variable length record to cross the boundary of two fixed length records in its representation so this can also be wasteful of space Organizing Records in a Block Two major ways we can organize records within a block disk page fixed packed contiguous storage slotted page structure indexed heap 1 fixed packed records are stored contiguously highly inflexible records may span over block boundary fragmentation with deletions and insertions external pointers may prevent internal block reorganization records are pinned to their address in the block Organizing Records in a Block 2 slotted page structure an initial block header storing block address and offset is used to reference the record records are indexed within a block insertions and deletions are easy there are no assumptions about contiguity of records and record address startpoints to deal with records may be rearranged within the block without concerns about external pointers records are not pinned within the block Organizing Records in a Files Given a set of records how do we organize them in a file Three possible methods are 1 Heap no order at all A record can be placed anywhere in the file where there is space 2 Sequential records are stored in a sorted order according to the value of a search key 3 Hashing a hash function computed on some attribute of each record is used to specify in which block of the file the record should be placed Records of each relation are often stored in separate files Sometimes it is useful to use a clustering file organization where records of several relations might be stored in a single file Heap File Organization Heap no order at all A record can be placed anywhere in the file where there is space easy insert easy delete lack of any structure makes queries including finding a particular record very difficult not usually useful for anything except very small relations Sequential File Organization Sequential records are stored in a sorted order according to the value of a search key designed for efficient queries in sorted order very suitable for applications that require sequential processing of the entire file difficult to maintain sorted order with insert delete deletions can use a free list pointer chain to mark empty space as previously described Sequential File Organization Insertions use the following method locate the location to be inserted if there is space there insert with no more work otherwise insert the record in an overflow block in either case the pointer chain must be updated Every so often we need to reorganize the whole file to restore sequential order Clustering File Organization Simple file structure stores each relation in a separate file tuples can be represented as fixed length records easy to implement well suited to small databases Large
View Full Document
Unlocking...