DOC PREVIEW
UW-Milwaukee COMPSCI 557 - Heap Files

This preview shows page 1-2-3-4-5-6 out of 17 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 17 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

TodayDirectory of Slots ExampleHeap FilesHeap File ExampleProg 1 HintsSlide 6Slide 7Slide 8Slide 9Ordered FilesFile of Ordered RecordsHashed FilesHashed Files (contd.)Slide 14Slide 15Hashed Files - Overflow handlingFill in This TableToday•Review of Directory of Slot Block Organizations•Heap Files•Program 1 Hints•Ordered Files & Hash Files•RAIDDirectory of Slots ExampleHeap Files•Heap files are stored as unordered records–the use of “heap” here is unrelated to the “free store” used for dynamic memory allocation•The simplest organizationHeap File ExampleParadise, Sal 231Favor, Sue 123Mach, Chris 401Rodgers, Bill 616Smith, MaryYost, Ned 819Alm, LouisLink, StevePatch, LindaJones, Jim 9832819314183 Ming, YaoTuring, Alan933709block 1 of file block 2 of file block N of fileName IDName ID Name IDassuming N data blocks, R records per block, I/O cost of D and “record processing time” of C what are the costs of:2D+C2D+CN(D+RC)N(D+RC)/2N(D+RC)inserting a record, deleting record given RID (ignore reclaiming space)scan, search for “key” w/ equality selection,search w/ range selection?Prog 1 Hints•check slot numbers to make sure they are valid•sizeof(), memcpy(), memmove()–man is your friend•test patterns – make sure you handle error cases correctly•error reportingclass HFPage { struct slot_t { short offset; short length; }; // equals EMPTY_SLOT if slot is not in use static const int DPFIXED = sizeof(slot_t) + 4 * sizeof(short)+ 3 * sizeof(PageId); short slotCnt; // number of slots in use short usedPtr; // offset of first used byte in data[] short freeSpace; // number of bytes free in data[] short type; // an arbitrary value used by subclasses as needed PageId prevPage; // backward pointer to data page PageId nextPage; // forward pointer to data page PageId curPage; // page number of this page slot_t slot[1]; // first element of slot array. char data[MAX_SPACE - DPFIXED]; // methods ...// **********************************************************// page class constructorvoid HFPage::init(PageId pageNo){ nextPage = prevPage = INVALID_PAGE; slotCnt = 0; // no slots in use curPage = pageNo; usedPtr = sizeof(data); // offset of used space in data array freeSpace = sizeof(data) + sizeof(slot_t); // amount of space available // (initially one unused slot)}•init()•getNextPage(), setNextPage()•getPrevPage(), setPrevPage()•insertRecord(), deleteRecord()•firstRecord(), nextRecord()•getRecord(), returnRecord()•available_space()•empty()int HFPage::available_space(void){ // look for an empty slot. if one exists, then freeSpace // bytes are available to hold a record. int i; for (i=0; i < slotCnt; i++) { if (slot[i].length == EMPTY_SLOT) return freeSpace; } // no empty slot exists. must reserve sizeof(slot_t) bytes // from freeSpace to hold new slot. return freeSpace - sizeof(slot_t);}Ordered Files•Also called a sequential file.•File records are kept sorted by the values of an ordering field.•Insertion is expensive: records must be inserted in the correct order.–It is common to keep a separate unordered overflow (or transaction) file for new records to improve insertion efficiency; this is periodically merged with the main ordered file.•A binary search can be used to search for a record on its ordering field value.–This requires reading and searching log2 of the file blocks on the average, an improvement over linear search.•Reading the records in order of the ordering field is quite efficient.File of Ordered RecordsHashed Files•Hashing for disk files is called External Hashing•The file blocks are divided into M equal-sized buckets, numbered bucket0, bucket1, ..., bucketM-1.–Typically, a bucket corresponds to one (or a fixed number of) disk block.•One of the file fields is designated to be the hash key of the file.•The record with hash key value K is stored in bucket i, where i=h(K), and h is the hashing function.•Search is very efficient on the hash key.•Collisions occur when a new record hashes to a bucket that is already full.–An overflow file is kept for storing such records.–Overflow records that hash to each bucket can be linked together.Hashed Files (contd.)Hashed Files (contd.)•There are numerous methods for collision resolution, including the following:–Open addressing: Proceeding from the occupied position specified by the hash address, the program checks the subsequent positions in order until an unused (empty) position is found. –Chaining: For this method, various overflow locations are kept, usually by extending the array with a number of overflow positions. In addition, a pointer field is added to each record location. A collision is resolved by placing the new record in an unused overflow location and setting the pointer of the occupied hash address location to the address of that overflow location. –Multiple hashing: The program applies a second hash function if the first results in a collision. If another collision results, the program uses open addressing or applies a third hash function and then uses open addressing if necessary.Hashed Files (contd.)•To reduce overflow records, a hash file is typically kept 70-80% full.•The hash function h should distribute the records uniformly among the buckets–Otherwise, search time will be increased because many overflow records will exist.•Main disadvantages of static external hashing:–Fixed number of buckets M is a problem if the number of records in the file grows or shrinks.–Ordered access on the hash key is quite inefficient (requires sorting the records).Hashed Files - Overflow handlingFill in This TableHeapSortedHashedscan equality range insert delete search


View Full Document

UW-Milwaukee COMPSCI 557 - Heap Files

Download Heap Files
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Heap Files and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Heap Files 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?