UNCP CSC 3800 - Distributed Databases - D1616319

School name University of North Carolina at Pembroke

Course CSC 3800 - Database Management Systems

Pages 21

This preview shows page 1-2-20-21 out of 21 pages.

Slide 1
Distributed Database System
Advantages of Distribution
Architecture Design Considerations
Architecture Alternatives
Types of Distributed Systems
Software Components of DDBMS
DDBMS Functions
Data Placement Alternatives
Factors in Data Placement Decision
Types of Transparency
Transaction Management for DDBMS
Locking Protocols
Global Deadlock Detection
Timestamping Protocols
Recovery-Failures
Handling Node Failure
Commit Protocols
Distributed Query Processing
Steps in Distributed Query-1
Steps in Distributed Query-2

CSC 3800 Database Management Systems
Time: 1:30 to 2:20
Meeting Days: MWF
Location: Oxendine 1237B
Textbook: Databases Illuminated, Author: Catherine M. Ricardo, 2004, Jones & Bartlett Publishers
Fall 2009
Chapter 12
Distributed Databases
Dr. Chuck Lillie

Distributed Database System
• Multiple sites connected by a communications system
• Data at any site available to users at other sites
• Sites may be far apart; linked by telecommunications lines
• May be close together; linked by a local area network

Advantages of Distribution
• Compared to a single, centralized system that provides remote access, distributed system advantages are:
  ◦ Local autonomy
  ◦ Improved reliability
  ◦ Better data availability
  ◦ Increased performance
  ◦ Reduced response time
  ◦ Lower communications costs

Architecture Design Considerations
• Factors the designer considers in choosing an architecture for a distributed system
  ◦ Type of communications system
  ◦ Data models supported
  ◦ Types of applications
  ◦ Data placement alternatives

Architecture Alternatives
• Centralized database with distributed processing
• Client-server system
• Parallel databases
  ◦ Shared memory
  ◦ Shared disk
  ◦ Shared nothing
  ◦ Cluster
• True distributed database: data and processing shared among autonomous sites

Types of Distributed Systems
• Homogeneous
  ◦ All nodes use the same hardware and software
• Heterogeneous
  ◦ Nodes have different hardware or software
  ◦ Require translations of codes and word lengths due to hardware differences
  ◦ Translation of data models and data structures due to software differences

Software Components of DDBMS
• Data communications component (DC)
• Local database management system (DBMS)
• Global data dictionary (GDD)
• Distributed database management system component (DDBMS)
• Not all sites have all these components

DDBMS Functions
• Provides the user interface
  ◦ Needed for location transparency
• Locates the data
  ◦ Directs queries to proper site(s)
• Processes queries
  ◦ Local, remote, compound (global)
• Provides network-wide concurrency control and recovery procedures
• Provides translation in heterogeneous systems

Data Placement Alternatives
• Centralized
  ◦ All data at one site only
• Replicated
  ◦ All data duplicated at all sites
• Partitioned
  ◦ Data divided among sites
  ◦ Fragmentation scheme: horizontal, vertical, mixed fragments
  ◦ Each item appears only once
• Hybrid
  ◦ Combination of the others

Factors in Data Placement Decision
• Locality of reference
• Reliability of data
• Availability of data
• Storage capacities and costs
• Distribution of processing load
• Communications costs

Types of Transparency
• Data distribution transparency
  ◦ Fragmentation transparency
  ◦ Location transparency
  ◦ Replication transparency
• DBMS heterogeneity transparency
• Transaction transparency
  ◦ Concurrency transparency
  ◦ Recovery transparency
• Performance transparency

Transaction Management for DDBMS
• Each site that initiates transactions has a transaction coordinator to manage transactions that originate there
  ◦ For local or remote transactions, transaction manager at data site takes over
  ◦ For global transactions, originating site coordinator
    ▪ Starts execution
    ▪ Uses GDD to form sub-transactions
    ▪ Directs sub-transactions to appropriate sites
    ▪ Receives sub-transaction results
    ▪ Controls transaction end: either commit or abort at all sites
• Additional concurrency control problem
  ◦ Multiple-copy inconsistency problem
• Solutions use locking and timestamping

Locking Protocols
• Extension of two-phase locking protocol
  ◦ Single-site lock manager
    ▪ May use Read-One-Write-All replica handling
    ▪ Site may become a bottleneck
  ◦ Distributed lock manager
    ▪ Can use Read-One-Write-All method
    ▪ Deadlock difficult to determine
  ◦ Primary copy
    ▪ Dominant node for each data item
  ◦ Majority locking

Global Deadlock Detection
• Each site has local wait-for graph, which detects only local deadlock
• Need global wait-for graph
  ◦ Single site can be global deadlock detection coordinator
  ◦ Constructs global graph and checks for cycles
  ◦ Responsibility could be shared among sites

Timestamping Protocols
• One site could issue all timestamps
• Instead, multiple sites could issue them
  ◦ Each timestamp has two parts: the time and the node identifier
  ◦ Guarantees uniqueness of timestamps
  ◦ Difficult to synchronize clocks; to control divergence, can advance clock reading if later timestamp received
  ◦ Can apply basic timestamping, Thomas' Write Rule, multi-version timestamp protocols using unique timestamps

Recovery-Failures
• Must guarantee atomicity and durability of transactions
• Failures include usual types, plus loss of messages, site failure, link failure
• Network partitioning
  ◦ Failure where network splits into groups of nodes that are isolated from other groups, but can communicate with one another

Handling Node Failure
• System flags node as failed
• System aborts and rolls back affected transactions
• System checks periodically to see if node has recovered, or node self-reports
• After restart, failed node does local recovery
• Failed node catches up to current state of DB, using system log of changes made while it was unavailable

Commit Protocols
• Two-phase commit protocol
  ◦ Phase 1: voting phase
    ▪ Coordinator writes <begin commit T> to its log, writes log to disk, sends <prepare T> message to all participants. Each site either does a <ready T> or <abort T> and sends its vote to coordinator
  ◦ Phase 2: resolution phase
    ▪ Coordinator resolves fate of transaction
    ▪ If any abort message received, makes all sites abort
    ▪ Failure of any site to vote generates global
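
The two phases on the Commit Protocols slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the textbook's implementation: the names Vote, Participant, and two_phase_commit are hypothetical, logs are plain in-memory lists rather than disk writes, and messages are ordinary method calls. The preview text is cut off where it describes a site that fails to vote; the sketch assumes the usual two-phase commit convention that a missing vote is treated as an abort.

```python
from enum import Enum


class Vote(Enum):
    READY = "ready"
    ABORT = "abort"


class Participant:
    """Hypothetical participant site; vote=None models a site that never answers."""

    def __init__(self, name, vote):
        self.name = name
        self._vote = vote
        self.log = []
        self.outcome = None

    def prepare(self, txn):
        # Phase 1: record <ready T> or <abort T> locally and return the vote.
        if self._vote is None:
            return None
        self.log.append(f"<{self._vote.value} {txn}>")
        return self._vote

    def finish(self, txn, decision):
        # Phase 2: apply the coordinator's global decision.
        # (A real failed site would only learn this after it recovers.)
        self.log.append(f"<{decision} {txn}>")
        self.outcome = decision


def two_phase_commit(txn, participants, coordinator_log):
    # Phase 1 (voting): log the start, then ask every participant to prepare.
    coordinator_log.append(f"<begin commit {txn}>")
    votes = [p.prepare(txn) for p in participants]

    # Phase 2 (resolution): commit only if every site voted ready.
    # Assumption: a missing vote (None) counts as an abort.
    decision = "commit" if all(v is Vote.READY for v in votes) else "abort"
    coordinator_log.append(f"<{decision} {txn}>")
    for p in participants:
        p.finish(txn, decision)
    return decision


sites = [Participant("A", Vote.READY), Participant("B", Vote.READY)]
coordinator_log = []
print(two_phase_commit("T", sites, coordinator_log))
print(coordinator_log)
```

Changing one participant's vote to Vote.ABORT, or to None, makes every site record an abort, which is the all-or-nothing behavior the slide asks for.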
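
The Timestamping Protocols slide describes timestamps made of two parts, the time and the node identifier, with a site advancing its clock when it sees a later timestamp. A minimal sketch of that idea follows. It assumes a logical counter stands in for the clock reading (the slides do not say which kind of clock is used), and the class name SiteClock and its methods are hypothetical.

```python
class SiteClock:
    """Hypothetical per-site timestamp generator: (counter, node_id) pairs."""

    def __init__(self, node_id, start=0):
        self.node_id = node_id
        self.counter = start  # assumption: a logical counter, not wall-clock time

    def next_timestamp(self):
        # The node identifier breaks ties, so two sites never issue equal timestamps.
        self.counter += 1
        return (self.counter, self.node_id)

    def observe(self, timestamp):
        # Advance the local clock if a later reading arrives from another site.
        remote_counter, _ = timestamp
        if remote_counter > self.counter:
            self.counter = remote_counter


site_a = SiteClock(node_id=1)
site_b = SiteClock(node_id=2)

first_a = site_a.next_timestamp()
first_b = site_b.next_timestamp()
print(first_a != first_b)  # same counter value, different node identifiers

for _ in range(5):
    latest_a = site_a.next_timestamp()

site_b.observe(latest_a)  # site B catches up before issuing its next timestamp
print(site_b.next_timestamp() > latest_a)
```

Because Python compares tuples element by element, the pairs sort by counter first and node identifier second, which gives a total order over all timestamps issued in the system.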
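
The Global Deadlock Detection slide notes that a local wait-for graph finds only local deadlock, so a coordinator must build a global graph and check it for cycles. The sketch below illustrates that step under simple assumptions: each site reports its graph as a dictionary mapping a waiting transaction to the set of transactions it waits for, and the function names merge_wait_for_graphs and has_cycle are hypothetical.

```python
def merge_wait_for_graphs(local_graphs):
    """Union the edges reported by each site into one global wait-for graph."""
    merged = {}
    for graph in local_graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged


def has_cycle(graph):
    """Depth-first search; reaching a node that is still in progress means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, in progress, finished
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(graph))


# At site A, T1 waits for T2; at site B, T2 waits for T1.
site_a = {"T1": {"T2"}}
site_b = {"T2": {"T1"}}

print(has_cycle(site_a), has_cycle(site_b))               # neither site sees a deadlock
print(has_cycle(merge_wait_for_graphs([site_a, site_b])))  # the global graph does
```

Neither local graph contains a cycle on its own; the deadlock only shows up once the edges from both sites are combined, which is why the slide calls for a global graph.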
