CORNELL CS 632 - Main Memory Database Systems - D1822763

Home> Schools> Cornell University> Computer Science (CS) > CS 632> Main Memory Database Systems

CORNELL CS 632 - Main Memory Database Systems

Pages 63

Download Save

Unformatted text preview:

Main Memory Database SystemsIntroductionQuestions about MMDBDifferences in properties of main memory and diskImpact of memory resident dataConcurrency controlCommit ProcessingAccess MethodsData representationA study of Index Structures for Main Memory Database Management SystemsDisk versus Main MemoryClassic index structuresClassic index structures (cont)Hash-based indexingHash-based indexing (cont)The T treeSlide 17Search algorithm for T treeInsert algorithmInsert algorithm (cont)Delete algorithmDelete algorithm (cont)LL RotationLR RotationSpecial LR RotationConclusionsBut…The Architecture of the Dali Main-Memory Storage ManagerSlide 29Principles in the design of DaliPrinciples in the design of Dali (cont)Architecture of the DaliLayers of abstractionStorage allocation requirementsSegments and chunksThe Page Table and Segment HeadersTransaction management in DaliMulti-level recovery (MLR)System overview - figSystem overviewSystem overview (cont)Transaction and OperationsLogging modelLogging model (cont)Ping-Pong CheckpointingPing-Pong Checkpointing (cont)Abort processingRecoveryRecovery (cont)Post-commit operationsFault ToleranceDetecting Process DeathLow level cleanupCleaning Up TransactionsMemory protectionCodewordsSlide 57Latch implementationLocking SystemCollections and IndexingHeap fileIndexesHigher Level InterfacesMain Memory Database SystemsAdina CosteaIntroductionMain Memory database system (MMDB)•Data resides permanently on main physical memory •Backup copy on diskDisk Resident database system (DRDB)•Data resides on disk•Data may be cached into memory for accessMain difference is that in MMDB, the primary copy lives permanently in memoryQuestions about MMDB•Is it reasonable to assume that the entire database fits in memory?Yes, for some applications!•What is the difference between a MMDB and a DRDB with a very large cache? In DRDB, even if all data fits in memory, the structures and algorithms are designed for disk access.Differences in properties of main memory and disk•The access time for main memory is orders of magnitude less than for disk storage•Main memory is normally volatile, while disk storage is not•The layout of data on disk is much more critical than the layout of data in main memoryImpact of memory resident data•The differences in properties of main-memory and disk have important implications in:–Concurrency control–Commit processing–Access methods–Data representation–Query processing–Recovery–PerformanceConcurrency control•Access to main memory is much faster than disk access, so we can expect that transactions complete more quickly in a MM system•Lock contention may not be as important as it is when the data is disk residentCommit Processing•As protection against media failure, it is necessary to have a backup copy and to keep a log of transaction activity•The need for a stable log threatens to undermine the performance advantages that can be achieved with memory resident dataAccess Methods•The costs to be minimized by the access structures (indexes) are differentData representation•Main memory databases can take advantage of efficient pointer following for data representationA study of Index Structures for Main Memory Database Management SystemsTobin J. LehmanMichael J. CareyVLDB 1986Disk versus Main Memory•Primary goals for a disk-oriented index structure design:–Minimize the number of disk accesses–Minimize disk space •Primary goals of a main memory index design:–Reduce overall computation time–Using as little memory as possibleClassic index structures•Arrays: –A: use minimal space, providing that the size is known in advance–D: impractical for anything but a read-only environment •AVL Trees: –Balanced binary search tree–The tree is kept balanced by executing rotation operations when needed –A: fast search –D: poor storage utilizationClassic index structures (cont)•B trees:–Every node contains some ordered data items and pointers–Good storage utilization–Searching is reasonably fast–Updating is also fastHash-based indexing•Chained Bucket Hashing:–Static structure, used both in memory and disk–A: fast, if proper table size is known–D: poor behavior in a dynamic environment •Extendible Hashing:–Dynamic hash table that grows with data–A hash node contain several data items and splits in two when an overflow occurs–Directory grows in powers of two when a node overflows and has reached the max depth for a particularly directory sizeHash-based indexing (cont)•Linear Hashing:–Uses a dynamic hash table–Nodes are split in predefined linear order–Buckets can be ordered sequentially, allowing the bucket address to be calculated from a base address–The event that triggers a node split can be based on storage utilization•Modified Linear Hashing:–More oriented towards main memory–Uses a directory which grows linearly–Chained single items nodes–Splitting criteria is based on average length of the hash chainsThe T tree•A binary tree with many elements kept in order in a node (evolved from AVL tree and B tree)•Intrinsec binary search nature•Good update and storage characteristics•Every tree has associated a minimum and maximum count•Internal nodes (nodes with two children) keep their occupancy in the range given by min and max countThe T treeSearch algorithm for T tree•Similar to searching in a binary tree•Algorithm–Start at the root of the tree–If the search value is less than the minimum value of the node•Then search down the left subtree•If the search value is greater than the maximum value in the node–Then search the right subtree –Else search the current nodeThe search fails when a node is searched and the item is not found, or when a node that bounds the search value cannot be foundInsert algorithmInsert (x):•Search to locate the bounding node•If a bounding node is found:–Let a be this node–If value fits then insert it into a and STOP–Else •remove min element amin from node•Insert x•Go to the leaf containing greatest lower bound for a and insert amin into this leafInsert algorithm (cont)•If a bounding node is not found–Let a be the last node on the search path–If insert value fits then insert it into the node–Else create a new leaf with x in it•If a new leaf was added–For each node in the search path (from leaf to root)•If the two subtrees heights differ by more than one, then rotate

View Full Document


School:
Email:
New Password:
Confirm Password:

CORNELL CS 632 - Main Memory Database Systems

Sign up for free to view:

Please select your school