HARVARD CSCI 299r - Introduction to Distributed Computing with BOB

Chapter 1
Introduction to Distributed Computing with BOB
Written by Michael D. Schroeder
Presented by John Cho
February 8, 1999

What is it?
“A distributed system is several computers doing something together.”
- Multiple computers – need at least two computers.
- Interconnections – without a network, it wouldn’t be very interesting.
- Shared state – need to ensure correct and coordinated computing.

Why do it?
A better computing paradigm with the advantages of networked and centralized systems.

Networked system advantages:
- Sharing – Resources shared over a wide geographic and organizational spread.
- Cost – Can use small, cost-effective computers.
- Growth – Can scale in small increments over a large range of sizes.
- Autonomy – Can use a variety of vendors, software, and management policies.
- Reliability – Doesn’t necessarily all crash at once.

Centralized system advantages:
- Accessibility – All information and resources equally accessible.
- Coherence – Functions work the same and objects are named the same everywhere.
- Manageability – Easier to manage.

No clear advantage for either in security or availability.
- Security – All information and resources in one place can be dangerous. The presence of a network introduces many security problems.
- Availability – A central system going down prevents all users from getting work done. In a networked system, a user generally needs several computers to function at once, so the probability of failure increases.

Additional advantages: A distributed system can increase computing power by using otherwise idle processor cycles on other computers, and can increase availability with transparent redundancy. Security is still an issue, although the compromise of one computer does not necessarily compromise the entire system.

What are the issues?
(See Waldo’s notes for extended info)
1. Independent failure: Even if one computer fails, we often want the “system” to keep working.
2. Unreliable communication: Connections are not always available, and messages can be lost or garbled.
3. Insecure communication: Interconnections are exposed to unauthorized eavesdropping and message modification.
4. Costly communication: Interconnections provide lower bandwidth, higher latency, and higher cost.

How do we do it?
The author introduces a best-of-both-worlds (BOB) model of distributed computing with the aforementioned advantages. We can ignore his comments about the “state-of-the-art”, since they are relative to the time of writing.

Properties
- Global names – The mapping from a string to an entity is the same everywhere.
- Global access – If you can access something from somewhere in the system, then you can get to that thing from anywhere in the system.
- Global security – If you have a right to something in the system, you have it everywhere. The same needs to be true if you lack a right.
- Global management – Resources can be managed from anywhere.
- Global availability – The same services work everywhere.
Later in this paper we explore these five properties through models at a general level, without the implementation details.

Services
- Naming – Provides access to a replicated, distributed database of global names and associated values for machines, users, files, distribution lists, access control groups, and services.
- Remote procedure call – Provides a standard way to define and securely invoke service interfaces locally or remotely.
- User registration – Allows users to be registered and authenticated, and issues certificates permitting access to system resources and information.
- Time – Distributes consistent and accurate time globally.
- Files – Provides access to a replicated, distributed, global file system.
- Management – Provides access to management data and operations of each component.

Additional services for the intended applications
- Records – Provides access to records, allowing concurrent reading and writing, with journaling to preserve integrity after a failure.
- Printers – Allows printing throughout the network with job control and scheduling.
- Execution – Allows programs to run on any machine and schedules them efficiently on available machines.
- Mailboxes – Provides a transport service for email.
- Terminals – Provides access to a windowing graphics terminal from anywhere.
- Accounting – Provides access to a system-wide collection of data on resource usage for billing and monitoring.

Interfaces
Interfaces are crucial for coherent communication between the services of a client and server. The ideas of object-oriented programming, such as objects and data encapsulation, are used to provide homogeneity across various models, versions, and vendors.

Naming Model
A hierarchical name space is used because it scales well, allows autonomy in the selection of names, and is malleable enough for a long lifetime. An example of this is the IDNS. A global name is interpreted by traversing the tree starting from a global root, which every node has a way to find. Along the way, a junction (just a node in the tree) between the name service and some other service (like a file service) may be encountered. At this point, the name service passes the junction object to the client. For example, for a client to access the file /com/dec/src/bin/ls, the global root is found, and the name service traverses the name tree to /com/dec/src/bin. The node bin is a junction to a file service, so the content of this junction object, which is stored in the tree, is passed to the client. The client then knows the names of the appropriate servers and the rule for choosing a server (e.g. the first one that responds); it can then get the file from the chosen server.
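
The lookup of /com/dec/src/bin/ls described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the design from the chapter: the class names Directory and Junction, the rule string "first-responder", the server names fs1.example and fs2.example, and the responds callback are all hypothetical, and the name tree is a single in-memory structure rather than a replicated, distributed database.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class Junction:
    """Content stored in the name tree where another service takes over."""
    service: str                   # kind of service behind the junction
    servers: List[str]             # servers that can continue the lookup
    rule: str = "first-responder"  # how the client should choose a server


@dataclass
class Directory:
    """Interior node of the (hypothetical, in-memory) name tree."""
    children: Dict[str, Union["Directory", Junction]] = field(default_factory=dict)


def resolve(root: Directory, path: str) -> Tuple[Junction, List[str]]:
    """Walk `path` from the global root until a junction is reached.

    Returns the junction object plus the unresolved rest of the name,
    which the client passes on to the service behind the junction.
    """
    parts = [p for p in path.split("/") if p]
    node = root
    for i, part in enumerate(parts):
        if part not in node.children:
            prefix = "/" + "/".join(parts[: i + 1])
            raise KeyError(f"no such name: {prefix}")
        child = node.children[part]
        if isinstance(child, Junction):
            return child, parts[i + 1:]
        node = child
    raise ValueError(f"{path} does not reach a junction")


def choose_server(junction: Junction, responds: Callable[[str], bool]) -> str:
    """Apply the junction's rule; only 'first-responder' is modelled here."""
    if junction.rule != "first-responder":
        raise NotImplementedError(junction.rule)
    for server in junction.servers:
        if responds(server):
            return server
    raise RuntimeError("no server responded")


# Build the tree /com/dec/src/bin, where bin is a junction to a file service.
bin_junction = Junction("file", ["fs1.example", "fs2.example"])
root = Directory({"com": Directory({"dec": Directory({"src": Directory({"bin": bin_junction})})})})

junction, rest = resolve(root, "/com/dec/src/bin/ls")
# Pretend that only the second server answers.
server = choose_server(junction, responds=lambda s: s == "fs2.example")
print(junction.service, rest, server)  # file ['ls'] fs2.example
```

The name service stops at the junction and returns it together with the unresolved remainder (here, ls), so the choice of server and the actual file fetch stay on the client side, as in the description above.
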
An illustration of such a tree is shown on page 10 as Figure 1.1. The details of how a client finds the global root, how the name tree is replicated and distributed across the computers, and how integrity is maintained are not mentioned.

Access Model
A user can execute a program from any compatible machine and get the same result. Performance may vary, but the same files will be used, thanks to global names. To accomplish global access, the program must run in the same computing environment remotely as it would locally. The key is the global name service because any object can be accessed

