NYU CSCI-GA 3033 - Course Introduction & Lab Intro

This preview shows page 1-2-3-20-21-22-41-42-43 out of 43 pages.


Slide titles:
Distributed (storage) systems G22.3033-006
Know your staff
Important addresses
This class will teach you …
Who should take this class?
Course readings
Course structure
How are you evaluated?
Questions?
What are distributed systems?
Why distributed systems? for ease-of-use
Why distributed systems? for availability
Why distributed systems? for scalable capacity
Challenges
Challenges (continued)
A word of warning
Performance can be subtle
Starbucks’ throughput
Reliability can be subtle too
Topics in this course
Case Study: Distributed file system
A simple distributed FS design
Topic: System Design
Topic: Consistency
Topic: Fault Tolerance
Topic: Security
Topic: Implementation
Intro to programming Lab: Yet Another File System (yfs)
YFS is inspired by Frangipani
Frangipani Design
Slide 31
Frangipani security
Frangipani server implements FS logic
Concurrent accesses cause inconsistency
Solution: use a lock service to synchronize access
Putting it together
NFS (or AFS) architecture
NFS messages for reading a file
Why use file handles in NFS msg, not file names?
Frangipani vs. NFS
YFS: simplified Frangipani
Lab series
L1: lock server

Distributed (storage) systems
G22.3033-006
Lec 1: Course Introduction & Lab Intro

Know your staff
• Instructor: Prof. Jinyang Li (me)
  – [email protected]
  – Office Hour: Tue 5-6pm (715 Bway Rm 708)
• TA: Yair Sovran
  – [email protected]
  – Office Hour: Tue 3-4pm (715 Bway Rm 705)

Important addresses
• Class webpage: http://www.news.cs.nyu.edu/~jinyang/fa08
  – Check for announcements, reading questions
• Sign up for class mailing list [email protected]
  – We will email announcements using this list
  – You can also email the entire class to ask questions, share information, or find project members
• Staff mailing list includes just me and Yair [email protected]
  – Email us your questions, suggestions

This class will teach you …
• Basic tools of distributed systems
  – Abstractions, algorithms, implementation techniques
  – System designs that worked
• Build a real system!
• Your (and my) goal: address new system challenges

Who should take this class?
• Pre-requisite:
  – Undergrad OS
  – Programming experience in C or C++
• Satisfies M.S. requirement D
  – “large-scale programming project course”

Course readings
• No official textbook
• Lectures are based on research papers
  – Check webpage for schedules
• Useful reference books
  – Distributed Systems (Tanenbaum and van Steen)
  – Advanced Programming in the UNIX Environment (Stevens)
  – UNIX Network Programming (Stevens)

Course structure
• Lectures
  – Read assigned papers before class
  – Answer reading questions, hand in answers in class
  – Participate in class discussion
• Programming Labs
  – Build a networked file system with detailed guidance!
• Project
  – Extend the lab file system in any way you like!

How are you evaluated?
• Class participation 10%
• Labs 40%
• Project 20%
  – In teams of 1-2 people
• Quizzes 30%
  – mid-term and final

Questions?
• Please complete survey questions

What are distributed systems?
• Examples?
[Figure labels: Multiple hosts; A network cloud; Hosts cooperate to provide a unified service]

Why distributed systems? for ease-of-use
• Handle geographic separation
• Provide users (or applications) with location transparency:
  – Web: access information with a few “clicks”
  – Network file system: access files on remote servers as if they are on a local disk, share files among multiple computers

Why distributed systems? for availability
• Build a reliable system out of unreliable parts
  – Hardware can fail: power outage, disk failures, memory corruption, network switch failures…
  – Software can fail: bugs, misconfiguration, upgrade …
  – To achieve 0.999999 availability, replicate data/computation on many hosts with automatic failover

Why distributed systems? for scalable capacity
• Aggregate resources of many computers
  – CPU: Dryad, MapReduce, Grid computing
  – Bandwidth: Akamai CDN, BitTorrent
  – Disk: Frangipani, Google file system

Challenges
• System design
  – What is the right interface or abstraction?
  – How to partition functions for scalability?
• Consistency
  – How to share data consistently among multiple readers/writers?
• Fault Tolerance
  – How to keep system available despite node or network failures?

Challenges (continued)
• Security
  – How to authenticate clients or servers?
  – How to defend against or audit misbehaving servers?
• Implementation
  – How to maximize IO parallelism?
  – How to reduce load on the bottleneck resource?

A word of warning
• Easy to make distributed systems that are less reliable and with worse performance than centralized systems!

Performance can be subtle
• Goal: sustained performance under high load
• Toy “distributed system”:
  – 2 employees run Starbucks
  – Employee 1: takes orders from customers, calls out to employee 2
  – Employee 2:
    • Write down orders (5 seconds per order)
    • Make drinks (10 seconds per order)
• What is Starbucks’ throughput under increasing load?

Starbucks’ throughput
• What is the ideal curve? What design achieves it?
[Figure: drinks per minute (tput) versus orders per minute (offered load); x-axis ticks 4, 8, 12; y-axis ticks 2, 4]

Reliability can be subtle too
“A distributed system is a system in which I can’t do my work because some computer that I’ve never even heard of has failed.”
-- Leslie Lamport

Topics in this course

Case Study: Distributed file system
[Figure: Server(s), Client 1, Client 2, Client 3]
A distributed file system provides:
• location transparent file accesses
• sharing among multiple clients
Commands shown in the figure:
  $ echo "test" > f2
  $ ls /dfs
  f1 f2
  $ ls /dfs
  f1 f2
  $ cat f2
  test

A simple distributed FS design
• A single server stores all data and handles clients’ FS requests.
[Figure: Client 1, Client 2, Client 3]

Topic: System Design
• What is the right interface?
  – possible interfaces of a storage system
    • Disk
    • File system
    • Database
• What if there are more clients than 1 server can handle?
• How to store petabytes of data?
  – Idea: partition users’ home directories across servers

Topic: Consistency
• When C1 moves file f1 from /d1 to /d2, do other clients see intermediate results?
• What if both C1 and C2 want to move f1 to different places?
• To reduce network load, cache data at C1
  – If C1 updates f1 to f1’, how to ensure C2 reads f1’ instead of f1?

Topic: Fault Tolerance
• How to keep the system running when some file server is down?
  – Replicate data at multiple servers
    • How to update replicated data?
    • How to fail-over among replicas?
• How to maintain consistency across reboots?

Topic: Security
• Adversary can manipulate messages
  – How
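
Returning to the Starbucks example from the "Performance can be subtle" slide: the slides pose the throughput question without answering it, so the following is only a rough sketch of one possible reading. It uses the two service times given on the slide (5 seconds to write down an order, 10 seconds to make a drink). Everything else is an assumption made for illustration: the function names are hypothetical, the "ideal" design is assumed to turn away excess orders before spending any time on them, and the "write-first" policy assumes employee 2 always writes down every called-out order before making any drink.

```python
WRITE_SECONDS = 5   # time to write down one order (from the slide)
MAKE_SECONDS = 10   # time to make one drink (from the slide)


def ideal_throughput(offered_per_min: float) -> float:
    """Drinks per minute if excess orders cost employee 2 nothing (assumed ideal)."""
    # One order needs 5 s + 10 s = 15 s of employee 2's time, so at most 4 per minute.
    capacity = 60 / (WRITE_SECONDS + MAKE_SECONDS)
    return min(offered_per_min, capacity)


def write_first_throughput(offered_per_min: float) -> float:
    """Drinks per minute if every called-out order is written down first (assumed policy)."""
    # Writing is served first and can consume at most the whole minute.
    write_time = min(60.0, offered_per_min * WRITE_SECONDS)
    # Only the time left over after writing is available for making drinks.
    make_capacity = (60.0 - write_time) / MAKE_SECONDS
    return min(offered_per_min, make_capacity)


if __name__ == "__main__":
    for load in (2, 4, 8, 12):
        print(load, ideal_throughput(load), write_first_throughput(load))
```

Under these assumptions the ideal curve rises to 4 drinks per minute and then stays flat, while the write-first policy peaks at 4 drinks per minute at an offered load of 4, drops to 2 at a load of 8, and reaches 0 at a load of 12, because all of employee 2's time goes to writing down orders that are never made.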

