DOC PREVIEW
UW CSE 444 - DBMS Internals Storage

This preview shows page 1-2-14-15-29-30 out of 30 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 30 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 30 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 30 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 30 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 30 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 30 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 30 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

DBMS Internals: StorageRepresenting Data ElementsRecord Formats: Fixed LengthRecord HeaderVariable Length RecordsRecords With Repeating FieldsStoring Records in BlocksStorage and IndexingCost Model for Our AnalysisFile Organizations and AssumptionsCost of OperationsIndexesIndex ClassificationPrimary IndexSlide 15Primary Index with Duplicate KeysSlide 17Slide 18Secondary IndexesClustered/UnclusteredClustered vs. Unclustered IndexSlide 22Applications of Secondary IndexesComposite Search KeysB+ TreesB+ Trees BasicsB+ Tree ExampleB+ Tree DesignSearching a B+ TreeB+ Trees in PracticeDBMS Internals: StorageFebruary 27th, 2004Representing Data Elements•Relational database elements:•A tuple is represented as a recordCREATE TABLE Product (pid INT PRIMARY KEY,name CHAR(20),description VARCHAR(200),maker CHAR(10) REFERENCES Company(name))CREATE TABLE Product (pid INT PRIMARY KEY,name CHAR(20),description VARCHAR(200),maker CHAR(10) REFERENCES Company(name))Record Formats: Fixed Length•Information about field types same for all records in a file; stored in system catalogs.•Finding i’th field requires scan of record.•Note the importance of schema information!Base address (B)L1 L2L3 L4F1 F2F3 F4Address = B+L1+L2Record HeaderL1 L2L3 L4F1 F2F3 F4To schemalengthtimestampNeed the header because:•The schema may changefor a while new+old may coexist•Records from different relations may coexistheaderVariable Length RecordsL1 L2L3 L4F1 F2F3 F4Other header informationlengthPlace the fixed fields first: F1, F2Then the variable length fields: F3, F4Null values take 2 bytes onlySometimes they take 0 bytes (when at the end)headerRecords With Repeating FieldsL1 L2L3F1 F2F3Other header informationlengthheaderNeeded e.g. in Object Relational systems,or fancy representations of many-many relationshipsStoring Records in Blocks•Blocks have fixed size (typically 4k)R1R2R3BLOCKR4Storage and Indexing•How do we store efficiently large amounts of data?•The appropriate storage depends on what kind of accesses we expect to have to the data.•We consider:–primary storage of the data–additional indexes (very very important).Cost Model for Our AnalysisAs a good approximation, we ignore CPU costs:–B: The number of data pages–R: Number of records per page–D: (Average) time to read or write disk page–Measuring number of page I/O’s ignores gains of pre-fetching blocks of pages; thus, even I/O cost is only approximated. –Average-case analysis; based on several simplistic assumptions.File Organizations and Assumptions•Heap Files:–Equality selection on key; exactly one match.–Insert always at end of file.•Sorted Files:–Files compacted after deletions.–Selections on sort field(s).•Hashed Files:–No overflow buckets, 80% page occupancy. •Single record insert and delete.Cost of Operations HeapFileSorted FileHashedFileScan all recsEquality SearchRange SearchInsertDeleteIndexes•An index on a file speeds up selections on the search key fields for the index.–Any subset of the fields of a relation can be the search key for an index on the relation.–Search key is not the same as key (minimal set of fields that uniquely identify a record in a relation).•An index contains a collection of data entries, and supports efficient retrieval of all data entries with a given key value k.Index Classification•Primary/secondary•Clustered/unclustered•Dense/sparse•B+ tree / Hash table / …Primary Index•File is sorted on the index attribute•Dense index: sequence of (key,pointer) pairs10203040506070801020304050607080Primary Index•Sparse index10305070901101301501020304050607080Primary Index with Duplicate Keys•Dense index:10203040506070801010102020203040Primary Index with Duplicate Keys•Sparse index: pointer to lowest search key in each block:•Search for 2010102030101010202020304020 is here......but need to search here too•Better: pointer to lowest new search key in each block:•Search for 20•Search for 15 ? 35 ?Primary Index with Duplicate Keys1020304050607080101010203030405020 is here......ok to search from here3030Secondary Indexes•To index other attributes than primary key•Always dense (why ?)10102020203030302030302010201030Clustered/Unclustered•Primary indexes = usually clustered•Secondary indexes = usually unclusteredClustered vs. Unclustered IndexData entries(Index File)(Data file)Data RecordsData entriesData RecordsCLUSTEREDUNCLUSTEREDSecondary Indexes•Applications:–index other attributes than primary key–index unsorted files (heap files)–index clustered dataApplications of Secondary Indexes•Clustered dataCompany(name, city), Product(pid, maker)Select cityFrom Company, ProductWhere name=maker and pid=“p045”Select cityFrom Company, ProductWhere name=maker and pid=“p045”Select pidFrom Company, ProductWhere name=maker and city=“Seattle”Select pidFrom Company, ProductWhere name=maker and city=“Seattle”Company 1 Company 2 Company 3Products of company 1Products of company 2 Products of company 3Composite Search Keys•Composite Search Keys: Search on a combination of fields.–Equality query: Every field value is equal to a constant value. E.g. wrt <sal,age> index:•age=20 and sal =75–Range query: Some field value is not a constant. E.g.:•age =20; or age=20 and sal > 10sue 13 75bobcaljoe 121020801112name age sal<sal, age><age, sal> <age><sal>12,2012,1011,8013,7520,1210,1275,1380,111112121310207580Data recordssorted by nameData entries in indexsorted by <sal,age>Data entriessorted by <sal>Examples of composite keyindexes using lexicographic order.B+ Trees•Search trees•Idea in B Trees:–make 1 node = 1 block•Idea in B+ Trees:–Make leaves into a linked list (range queries are easier)•Parameter d = the degree•Each node has >= d and <= 2d keys (except root)•Each leaf has >=d and <= 2d keys:B+ Trees Basics30 120 240Keys k < 30Keys 30<=k<120Keys 120<=k<240 Keys 240<=k40 50 604050 60Next leafB+ Tree Example8020 60 100 120 14010 15 18 20 30 40 50 60 65 80 85 9010 15 18 20 30 40 50 60 65 80 85 90d = 2Find the key 4040  8020 < 40  6030 < 40  40B+ Tree Design•How large d ?•Example:–Key size = 4 bytes–Pointer size = 8 bytes–Block size = 4096 byes•2d x 4 + (2d+1) x 8 <= 4096•d = 170Searching a B+ Tree•Exact key values:–Start at the root–Proceed down, to the leaf•Range queries:–As above–Then sequential traversalSelect nameFrom peopleWhere age = 25Select


View Full Document

UW CSE 444 - DBMS Internals Storage

Documents in this Course
XML

XML

48 pages

SQL

SQL

25 pages

SQL

SQL

42 pages

Recovery

Recovery

30 pages

SQL

SQL

36 pages

Indexes

Indexes

35 pages

Security

Security

36 pages

Wrap-up

Wrap-up

6 pages

SQL

SQL

37 pages

More SQL

More SQL

48 pages

SQL

SQL

35 pages

XML

XML

46 pages

Triggers

Triggers

26 pages

Load more
Download DBMS Internals Storage
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view DBMS Internals Storage and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view DBMS Internals Storage 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?