UW-Madison CS 739 - Lecture Notes

UNIVERSITY of WISCONSIN-MADISON
Computer Sciences Department
CS 739: Distributed Systems
Andrea C. Arpaci-Dusseau

Introduction to CS739: Distributed Systems
  - What are distributed systems?
  - What are the benefits and challenges?
  - How will CS739 be structured?
      - Readings, writeups, presentations, projects

Goals of Course
  - Learn about challenges and existing techniques for building distributed systems and services
      - Read and discuss influential papers from SOSP, OSDI, NSDI
  - Gain some experience programming in a distributed environment
      - Warm-up project
      - Final project

What is a Distributed System?
  - Leslie Lamport says: "You know you have one when the crash of a computer you never heard of stops you from doing any work."
  - More technical definition: "Collection of independent computers that appears to its users as a single coherent system"
  - How are parallel, distributed, and networked systems different?
      - All contain nodes (processing, memory, disk) connected with a network
      - Spectrum from more unified to less unified: parallel, distributed (consider distributed services as well), networked

Benefits of Distributed Systems
  - Great price/performance
      - Leverage commodity components (nodes and networks)
      - Use many, many of them
  - Incremental scalability
      - Can add x new nodes (or disks or memory) to improve performance by x
  - Improved availability
      - Continue operating when some nodes stop working
  - Improved reliability
      - Deliver correct results when some nodes misbehave (corrupt data)
  - Allow geographically distributed individuals to share data or cooperate

Distributed System Challenges
  - Lack of global state information
      - Different nodes have different views of the system
          - What are the contents of file A?
          - How many jobs are running on node X?
          - Which nodes are currently part of the system?
      - See delays, different ordering of messages, lost messages, network partitions
      - Tension with goal of a single coherent system
  - Handling slow, failed, and misbehaving nodes
      - How do you avoid slow nodes?
      - How do you get back data or work from a failed node?
      - When nodes disagree, how do you know who is wrong?
      - Tension with goal of being available and reliable
  - When is it okay to have some centralized components?
      - Simplifies state management, but creates a single point of failure and a performance bottleneck

Content of 739
  - Distributed system courses can be very different
      - Theoretical: distributed algorithms, e.g., to allow nodes to come to consensus or agreement (4 lectures)
      - Practical: distributed programming, e.g., using RPC, Java RMI, CORBA, DCOM, MPI, PVM (warm-up project)
      - Research systems: new ideas for making distributed systems better (focus of course)
  - Implemented systems with new conceptual ideas
  - Recent papers in top systems conferences (SOSP, OSDI, NSDI)

Learning by Reading
  - Intense reading list (assume sophisticated reader: 736)
  - Usually cover 1 fascinating paper per class
  - No exams
  - Three types of classes
      1) Formal lecture
          - Only for 4 theory topics
      2) Discussions
          - Most papers
          - I ask questions; expect everyone to enthusiastically participate (fairly casual)
          - Task 1: Read paper 2-3 times before class
          - Task 2: Email write-up to me BEFORE class
          - Task 3: Take turns being scribe (about 2 times in semester)
              - Write up notes from discussion in LaTeX
              - Post to web page within 72 hours

Learning by Reading (cont.)
  - Types of classes (cont.)
      3) Group-led lectures (4 topics)
          - Small group gives overview of about 3-4 related papers
          - Topics
              - Distributed system analysis
              - Process migration
              - Programming environments
              - Specialized distributed services
          - Advantages
              - Good practice for giving presentations
              - Learn about topic in slightly more depth
          - Tasks
              - Group: finalize related papers 1 week before; present to me 2 days before; use slides
              - Everyone else: skim papers
  - Handout: state preferences by next week

Course Topics (Reading List)
  - Distributed Operating Systems: Survey, Amoeba vs. Sprite
  - Network File Systems: NFS, Coda, LBFS
  - Theory: Time Ordering and Distributed Snapshots (2 Lamport papers)
  - Analysis of Distributed Systems (1 Group Presentation)
  - Programming Environments: DSM, MapReduce, Group
  - Process Migration (1 Group)
  - Specialized Distributed Services: Porcupine, Group
  - SPRING BREAK
  - Theory: Consensus, Byzantine failures and fail-stop processors
  - Cluster-based File Systems: Petal, Frangipani, and GoogleFS
  - Communication Primitives: RPC vs. U-Net
  - P2P Systems: Measurement, CFS, Amazon, Pangaea, LOCKSS
  - Miscellaneous: Trust, Recovery, Mistakes, Speculation, Sensor Networks

Learning by Doing
  - Warm-up Project
      - Goal: Become familiar with existing distributed programming environments
      - Examples: Hadoop (open-source MapReduce), MPI, PVM
      - Task 0: Get environment running
      - Task 1: Implement simple application (e.g., sorting)
      - Task 2: Report sufficient numbers to indicate you did something
  - Final Project
      - Goal 1: Experience with research process in general
          - Work on open-ended project (unknown result)
          - New idea where you don't know if it will work
      - Goal 2: Learn about specific topic in depth
          - Topic from my list or your own choice (work with project partner)
      - Deliverables: 20-minute talk, short research paper

Agenda for Next Class
  - See the cs739 website at www.cs.wisc.edu
  - Read Survey: "Distributed Operating Systems," Andrew S. Tanenbaum and Robbert Van Renesse, ACM Computing Surveys, Volume 17, Issue 4, December 1985, pp. 419-470
      - Long paper: focus on Sections 1 and 2
  - Answer question: What were the goals of distributed systems at this time? Which design issue (i.e., communication primitives, naming and protection, resource management, fault tolerance, services) seems most challenging or interesting? Why?
      - Email answer to me with Subject: cs739 Survey
  - Think about group presentation papers
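
Illustration for the "Distributed System Challenges" slide: the point that different nodes can have different views of the system (e.g., "What are the contents of file A?") when messages arrive in different orders can be shown with a toy example. The sketch below is not from the lecture; the function `apply_writes`, the node names, and the last-write-wins rule are all assumptions made purely for illustration.

```python
# Hypothetical illustration (not from the lecture): two nodes receive the
# same two writes to "file A", but the network delivers them in different
# orders. Each node applies writes in arrival order (last write wins).

def apply_writes(writes):
    """Apply (name, contents) writes in arrival order and return the final state."""
    state = {}
    for name, contents in writes:
        state[name] = contents
    return state

write_1 = ("file A", "hello")
write_2 = ("file A", "goodbye")

# Node X sees write_1 then write_2; node Y sees them in the opposite order.
node_x = apply_writes([write_1, write_2])
node_y = apply_writes([write_2, write_1])

print(node_x["file A"])   # goodbye
print(node_y["file A"])   # hello
print(node_x == node_y)   # False: the two nodes disagree about file A
```

Neither node can tell locally that its view is stale or wrong, which is the tension with the goal of a single coherent system noted on that slide.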

