DOC PREVIEW
UCSD CSE 120 - Distributed Systems

This preview shows page 1-2 out of 5 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 5 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1Lecture 13Distributed SystemsDecember 2, 2003Prof. Joe PasqualeDepartment of Computer Science and EngineeringUniversity of California, San Diego© 2003 by Joseph PasqualeCSE 120: Principles of Operating Systems2Before We Begin …Read Chapters 15, 17 (on Distributed Systems topics)3What is a Distributed System?Computers connected by a network that can cooperateDegree of integration may be loose• messages between inter-machine processes• email, ftpOr it may be medium (network operating system)• remotely execute cmds, cross mount file systemsOr it may be tight (distributed operating system)• process migration, distributed file system4AdvantagesSpeed: more resources, parallelism, less contentionReliability: resource redundancy, fault toleranceScalability: incremental growth, can start cheapEconomy of scale: amortizing cost of shared resourcesGeographic distribution: communication, reliability5DisadvantagesGlobal state uncertainty• no shared memory• no shared clock“Simultaneous” decisions may conflict• mutually conflicting decision problemDistributed algorithms are complex• all interactions via messages (no shared memory)• distributed debugging is difficult6Which is Better?Single fast server withsingle queueMultiple slower serverswith separate queuesMultiple slower serverswith single queueLittle’s Law: N = λ Wµλµ/2µ/2λ/2λ/2µ/2µ/2λ7Performance: Load BalancingLoad: average number of runnable processesKeep load balanced across all nodesWhen process created, consider where to run• node where it was submitted?• node that is least loaded?• random selection?If load becomes unbalanced, migrate processes• may be more trouble than worth8Reliability: File ReplicationMaintain multiple copies of files on separate nodesWhen file requested, obtain from node having copy• primary and backup• all nodes are equalImproves availability (and maybe even performance)Problem: updates• when a file is written, how are copies updated9The Client/Server ModelClient• Short-lived process that makes requests• “User-side” of applicationServer• Exports well-defined requests/response interface• Long-lived process that waits for requests• Upon receiving request, carries it out (may spawn)Client Server10Peer-to-PeerPeer-to-peer is typically dynamic client/server• a node may act as a client or server• other “peer-to-peer” relationships are possibleExample: peer-to-peer file sharing• Client A requests file from server B• Server B provides file to client A• Client C can request file from A or B; say A• A, formerly a client, now acts as server11Distributed File SystemsFile service: file system interface for remote clientsFile server: machine(s) that provides file service• storage devices connected to file serverIssues• Naming: location transparency and independence• Caching: client caches, server caches, consistency• State: stateful vs. stateless file service• Replication: number of replicas, updating12Event OrderingWhat is order of events given no shared clock/memoryHappened-before relation: ->• A, B events of same process and A before B: A -> B• A is a send event, B is a receive event: A -> B• If A -> B, and B -> C, then A -> CImplementation• Timestamp all events based on local clock• Upon receiving a message, advance local clock• Resolve ties by ordering machines13Mutual ExclusionCentralized approach• single process acts as coordinator server• request, reply (to allow entrance), releaseDistributed approach• process sends a request with TS to all processes• waits until it receives all replies (ok to enter)• enter critical section (may get requests, defers)• upon exiting, responds (to release) to all deferred• TS used to order “simultaneous” requests14Atomic TransactionsProgram unit that must be executed atomically• executed to completion, or not at allDistributed transactions• transactions broken into multiple sub-transactions• sub-transactions executed on different machinesTo make distributed transaction atomic• have all sub-transactions execute• run two-phase commit protocol15Two-Phase Commit ProtocolPhase 1• Coordinator logs <prepare T>, sends to all sites• At each site, upon receiving <prepare T>– either {log <no T>, reply} or {log <ready T>, reply}Phase 2• C waits to receive all responses (or times out)• If all responses are <ready T>, commit; else abort• Log decision, and send decision to all sitesLog to stable storage16Stable StorageUses multiple storage devs; independent failure modesEach (logical) block has 2 (or more) physical blocksTo write: write first block; if successful, write second• Write is successful only if all succeedRecovery• If no detectable errors and blocks same, all is ok• If detectable error in one block, copy good to bad• If no errors but blocks different, copy 2nd to 1st17Leader ElectionMany distributed algs rely on a leader (e.g., coordinator)Need to determine whether leader exists; if not, electBully algorithm (elects leader L)• Every process is numbered (priority): P1, P2, …• Pi sends request to L, no reply; tries to elect itself• Pi sends election msg to all Pj, j > i; wait for replies• No replies, Pi becomes L, sends msgs to Pj, j < i• If some Pj replies, then Pi waits for election msg• If no election msg, Pi restarts election algorithm18Byzantine Generals ProblemDivisions of Byzantine army surround enemy camp• generals must agree whether to attack at dawn• all must agree; if only some attack, defeatGenerals can only communicate via messengers• Messengers may get captured (unreliable comm)• Generals may be traitors (faulty processors)Can’t create reliable comm from unreliable partsCan overcome faulty procs if n ≥ 3m + 1 (m of n


View Full Document

UCSD CSE 120 - Distributed Systems

Documents in this Course
Threads

Threads

14 pages

Deadlocks

Deadlocks

19 pages

Processes

Processes

14 pages

Paging

Paging

13 pages

Processes

Processes

18 pages

Threads

Threads

29 pages

Security

Security

16 pages

Paging

Paging

13 pages

Processes

Processes

32 pages

Lecture 2

Lecture 2

13 pages

Paging

Paging

8 pages

Threads

Threads

14 pages

Paging

Paging

13 pages

Paging

Paging

26 pages

Paging

Paging

13 pages

Lecture

Lecture

13 pages

Processes

Processes

14 pages

Paging

Paging

13 pages

Security

Security

17 pages

Threads

Threads

15 pages

Processes

Processes

34 pages

Structure

Structure

10 pages

Lecture 3

Lecture 3

13 pages

Lecture 1

Lecture 1

28 pages

Threads

Threads

15 pages

Paging

Paging

30 pages

Load more
Download Distributed Systems
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Distributed Systems and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Distributed Systems 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?