Chapter 13: Distributed Databases
Modern Database Management, 6th Edition
Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden
Prentice Hall, 2002

Definitions
- Distributed Database: A single logical database that is spread physically across computers in multiple locations that are connected by a data communications link
- Decentralized Database: A collection of independent databases on non-networked computers
- They are NOT the same thing

Reasons for Distributed Database
- Business unit autonomy and distribution
- Data sharing
- Data communication costs
- Data communication reliability and costs
- Multiple application vendors
- Database recovery
- Transaction and analytic processing

Figure 13-1: Distributed database environments (adapted from Bell and Grimson, 1992)

Distribution on different sites

Distributed Database Options
- Homogeneous: Same DBMS at each node
  - Autonomous: Independent DBMSs
  - Non-autonomous: Central, coordinating DBMS
  - Easy to manage, difficult to enforce
- Heterogeneous: Different DBMSs at different nodes
  - Systems: With full or partial DBMS functionality
  - Gateways: Simple paths are created to other databases without the benefits of one logical database
  - Difficult to manage, preferred by independent organizations

Distributed Database Options
- Systems: Supports some or all functionality of one logical database
  - Full DBMS Functionality: All dist. DB functions
  - Partial Multi-database: Some dist. DB functions
    - Federated: Supports local databases for unique data requests
      - Loose Integration: Local dbs have their own schemas
      - Tight Integration: Local dbs use common schema
    - Unfederated: Requires all access to go through a central coordinating module

Homogeneous, Non-Autonomous Database
- Data is distributed across all the nodes
- Same DBMS at each node
- All data is managed by the distributed DBMS (no exclusively local data)
- All access is through one global schema
- The global schema is the union of all the local schema

Figure 13-2: Homogeneous Database, Identical DBMSs (Source: adapted from Bell and Grimson, 1992)

Typical Heterogeneous Environment
- Data distributed across all the nodes
- Different DBMSs may be used at each node
- Local access is done using the local DBMS and schema
- Remote access is done using the global schema

Figure 13-3: Typical Heterogeneous Environment, Non-identical DBMSs (Source: adapted from Bell and Grimson, 1992)

Major Objectives
- Location Transparency
  - User does not have to know the location of the data
  - Data requests automatically forwarded to appropriate sites
- Local Autonomy
  - Local site can operate with its database when network connections fail
  - Each site controls its own data, security, logging, recovery

Significant Trade-Offs
- Synchronous Distributed Database
  - All copies of the same data are always identical
  - Data updates are immediately applied to all copies throughout network
  - Good for data integrity
  - High overhead, slow response times
- Asynchronous Distributed Database
  - Some data inconsistency is tolerated
  - Data update propagation is delayed
  - Lower data integrity
  - Less overhead, faster response time
NOTE: all this assumes replicated data (to be discussed later)

Advantages of Distributed Database over Centralized Databases
- Increased reliability/availability
- Local control over data
- Modular growth
- Lower communication costs
- Faster response for certain queries

Disadvantages of Distributed Database compared to Centralized databases
- Software cost and complexity
- Processing overhead
- Data integrity exposure
- Slower response for certain queries

Options for Distributing a Database
- Data replication: Copies of data distributed to different sites
- Horizontal partitioning: Different rows of a table distributed to different sites
- Vertical partitioning: Different columns of a table distributed to different sites
- Combinations of the above

Data Replication: Advantages
- Reliability
- Fast response
- May avoid complicated distributed transaction integrity routines (if replicated data is refreshed at scheduled intervals)
- De-couples nodes (transactions proceed even if some nodes are down)
- Reduced network traffic at prime time (if updates can be delayed)

Data Replication: Disadvantages
- Additional requirements for storage space
- Additional time for update operations
- Complexity and cost of updating
- Integrity exposure of getting incorrect data if replicated data is not updated simultaneously
Therefore, better when used for non-volatile (read-only) data

Types of Data Replication
- Push Replication: updating site sends changes to other sites
- Pull Replication: receiving sites control when update messages will be processed

Types of Push Replication
- Snapshot Replication
  - Changes periodically sent to master site
  - Master collects updates in log
  - Full or differential (incremental) snapshots
  - Dynamic vs. shared update ownership
- Near Real-Time Replication
  - Broadcast update orders without requiring confirmation
  - Done through use of triggers
  - Update messages stored in message queue until processed by receiving site

Issues for Data Replication
- Data timeliness: high tolerance for out-of-date data may be required
- DBMS capabilities: if DBMS cannot support multinode queries, replication may be necessary
- Performance implications: refreshing may cause performance problems for busy nodes
- Network heterogeneity: complicates replication
- Network communication capabilities: complete refreshes place heavy demand on telecommunications

Horizontal Partitioning
Different rows of a table at different sites
- Advantages
  - Data stored close to where it is used (efficiency)
  - Local access optimization (better performance)
  - Only relevant data is available (security)
  - Unions across partitions (ease of query)
- Disadvantages
  - Accessing data across partitions (inconsistent access speed)
  - No data replication (backup vulnerability)

Vertical Partitioning
Different columns of a table at different sites
Advantages and disadvantages are the same as for horizontal partitioning except that combining data across
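
The two partitioning options can be made concrete with a small in-memory sketch. The customer table, the `region` column used to assign rows to sites, and all four helper functions are hypothetical and invented for illustration; they do not come from the slides. Horizontal partitions are recombined here with a union, as the slides mention. Recombining vertical fragments with a join on a shared key is an assumption of this sketch, since the slide text is cut off before it says how columns are put back together.

```python
# Hypothetical customer table; all names and values are invented for illustration.
customers = [
    {"id": 1, "name": "Ann", "region": "east", "balance": 120.0},
    {"id": 2, "name": "Bob", "region": "west", "balance": 75.5},
    {"id": 3, "name": "Cho", "region": "east", "balance": 310.0},
]


def horizontal_partition(rows, column):
    """Group whole rows by the value of one column (one group per site)."""
    sites = {}
    for row in rows:
        sites.setdefault(row[column], []).append(row)
    return sites


def vertical_partition(rows, key, column_groups):
    """Split columns into fragments; every fragment keeps the key column."""
    return [
        [{key: row[key], **{c: row[c] for c in cols}} for row in rows]
        for cols in column_groups
    ]


def union(partitions):
    """Recombine horizontal partitions by concatenating their rows."""
    combined = [row for rows in partitions for row in rows]
    return sorted(combined, key=lambda row: row["id"])


def join(fragments, key):
    """Recombine vertical fragments by matching rows on the key column."""
    merged = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row[key], {}).update(row)
    return [merged[k] for k in sorted(merged)]


# Rows split by region (horizontal) and columns split into two fragments (vertical).
by_region = horizontal_partition(customers, "region")
fragments = vertical_partition(customers, "id", [["name", "region"], ["balance"]])
```

In this sketch each horizontal site holds complete rows for its own region, so a local query touches only local data. Each vertical fragment holds every row but only some of its columns, so rebuilding a full row needs the key from every fragment.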


Cal State East Bay CS 6320 - Distributed Databases
