DOC PREVIEW
UT Dallas CS 6385 - CS-6360 Chapter 17-FileStructuresHashing

This preview shows page 1-2-3-4-5-6-40-41-42-43-44-82-83-84-85-86-87 out of 87 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 87 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

!" #Dr. Chris Irwin Davis !Email: [email protected] Phone: (972) 883-3574 Office: ECSS 4.705Chapter 17: File Structures and HashingCS-6360 Database Design!" #• Placing File Records on Disk!• Operations on Files!• Files of Unordered Records (Heap Files)!• File of Ordered Records (Sorted Files)!• Hashing techniques!• Other Primary File Organizations!• Parallelizing Disk Access Using RAID!• New Storage Systems!2!" #Placing File Records On Disk• Data is usually stored in the form of records. !• Each record consists of a collection of related data values or items, where each value is formed of one or more bytes and corresponds to a particular field of the record.!• Records usually describe entities and their attributes.3!" #Placing File Records On Disk• For example, an EMPLOYEE record represents an employee entity, and each field value in the record specifies some attribute of that employee, such as Name, Birth_date, Salary, or Supervisor.!• A collection of field names and their corresponding data types constitutes a record type or record format definition. !• A data type, associated with each field, specifies the types of values a field can take.4!" #Data Types and Size• The data type of a field is usually one of the standard data types used in programming (e.g. numeric, string, Boolean, sometimes Date and/or Time).!• The number of bytes required for each data type is fixed for a given computer system.!°Software platform dependent!°Hardware platform dependent5!" #Records• For example, an EMPLOYEE record type may be defined—using the C programming language notation—as the following structure:6What if an attribute is not a primitive type?!" #BLOBs• The need may arise for storing data items that consist of large unstructured objects, which represent images, digitized video or audio streams, or free text. !• These are referred to as BLOBs (binary large objects). !• A BLOB data item is typically stored separately from its record in a pool of disk blocks, and a pointer to the BLOB is included in the record7Files, Fixed-Length Records, !and Variable-Length Records!" #Files, Fixed-Length Records, and Variable-Length Records• A file is a sequence of records. !• In many cases, all records in a file are of the same record type. !• If every record in the file has exactly the same size (in bytes), the file is said to be made up of fixed-length records. !• If different records in the file have different sizes, the file is said to be made up of variable-length records. 9!" #Variable-Length Records• A file may have variable-length records for several reasons:!°The file records are of the same record type, but one or more of the fields are of varying size (variable-length fields). For example, the Name field of EMPLOYEE can be a variable-length field.!°The file records are of the same record type, but one or more of the fields may have multiple values for individual records; such a field is called a repeating field and a group of values for the field is often called a repeating group.10!" #Variable-Length Records• The file records are of the same record type, but one or more of the fields are optional; that is, they may have values for some but not all of the file records (optional fields).!• The file contains records of different record types and hence of varying size (mixed file). This would occur if related records of different types were clustered (placed together) on disk blocks; for example, the GRADE_REPORT records of a particular student may be placed following that STUDENT’s record.11!" #Record Storage Formats12!" #Record Storage Formats13!" #Record Storage Formats14!" #Operations on Files• OPEN: Readies the file for access, and associates a pointer that will refer to a current file record at each point in time.!• FIND: Searches for the first file record that satisfies a certain condition, and makes it the current file record.!• FINDNEXT: Searches for the next file record (from the current record) that satisfies a certain condition, and makes it the current file record.!• READ: Reads the current file record into a program variable.!• INSERT: Inserts a new record into the file & makes it the current file record. 15!" #Operations on Files• DELETE: Removes the current file record from the file, usually by marking the record to indicate that it is no longer valid.!• MODIFY: Changes the values of some fields of the current file record.!• CLOSE: Terminates access to the file.!• REORGANIZE: Reorganizes the file records.!°For example, the records marked deleted are physically removed from the file or a new organization of the file records is created.!• FIND_ORDERED: Retrieves all the records in the file in some specified order.!• READ_ORDERED: Read the file blocks in order of a specific field of the file. 16!" #Unordered Records (Heap Files)• Also called a heap or a pile file.!• New records are inserted at the end of the file.!• A linear search through the file records is necessary to search for a record.!°This requires reading and searching half the file blocks on the average, and is hence quite expensive.!• Record insertion is quite efficient.!• Reading the records in order of a particular field requires sorting the file records. 17!" #Ordered Records (Sorted Files)• Also called a sequential file.!• File records are kept sorted by the values of an ordering field.!• Insertion is expensive: records must be inserted in the correct order.!• It is common to keep a separate unordered overflow (or transaction) file for new records to improve insertion efficiency; this is periodically merged with the main ordered file.18!" #Ordered Records (Sorted Files)• A binary search can be used to search for a record on its ordering field value.!• This requires reading and searching log2 of the file blocks on the average, an improvement over linear search.!• Reading the records in order of the ordering field is quite efficient.19Memory Blocks of a Sorted File!" #Other Strategies to Organize Data• What if data were evenly distributed among memory “buckets”?!• Which field to sort / order by?!• How to insure even distribution?21!" #Hashing• Another type of primary file organization is based on hashing, which provides very fast access to records under certain search conditions. !• This organization is usually called a hash file. !• The search condition must be an equality condition on a


View Full Document

UT Dallas CS 6385 - CS-6360 Chapter 17-FileStructuresHashing

Documents in this Course
assn1

assn1

2 pages

38rel2

38rel2

5 pages

Report

Report

3 pages

networks

networks

18 pages

lp2

lp2

44 pages

lp2 (2)

lp2 (2)

27 pages

lp1(1)

lp1(1)

21 pages

integer1

integer1

50 pages

FrankR2

FrankR2

3 pages

duality

duality

28 pages

CMST

CMST

44 pages

hw4

hw4

3 pages

for 1

for 1

11 pages

ENCh02

ENCh02

33 pages

pree

pree

2 pages

new  3

new 3

2 pages

new  2

new 2

2 pages

hw4a

hw4a

2 pages

T2_Sol

T2_Sol

4 pages

ISM3

ISM3

8 pages

hw4_sol

hw4_sol

6 pages

Elm04_06

Elm04_06

11 pages

atn proj2

atn proj2

20 pages

12CUT1

12CUT1

8 pages

09Ford

09Ford

23 pages

08FLOW

08FLOW

6 pages

03LP_su

03LP_su

6 pages

40REL40

40REL40

5 pages

39rel3

39rel3

5 pages

38arel2

38arel2

5 pages

37REL1

37REL1

3 pages

24TABU

24TABU

3 pages

22DYNPR

22DYNPR

3 pages

21B&C

21B&C

2 pages

20BBEX0

20BBEX0

3 pages

19BB

19BB

5 pages

14CAPBUD0

14CAPBUD0

11 pages

35BRXCH

35BRXCH

2 pages

34COMB

34COMB

4 pages

32CAPAS

32CAPAS

4 pages

31QUEUE

31QUEUE

3 pages

Load more
Download CS-6360 Chapter 17-FileStructuresHashing
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view CS-6360 Chapter 17-FileStructuresHashing and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view CS-6360 Chapter 17-FileStructuresHashing 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?