CS162 Operating Systems and Systems Programming
Lecture 19
File Systems continued, Distributed Systems

April 9, 2008
Prof. Anthony D. Joseph
http://inst.eecs.berkeley.edu/~cs162

4/9/08, Joseph CS162 ©UCB Spring 2008

Goals for Today
- Data Durability
- Beginning of Distributed Systems Discussion
- Lisp/ML map/fold review
- MapReduce overview

Note: Some slides and/or pictures in the following are adapted from slides ©2005 Silberschatz, Galvin, and Gagne. Many slides generated from my lecture notes by Kubiatowicz.

Important "ilities"
- Availability: the probability that the system can accept and process requests
  - Often measured in "nines" of probability. So, a 99.9% probability is considered "3-nines of availability"
  - Key idea here is independence of failures
- Durability: the ability of a system to recover data despite faults
  - This idea is fault tolerance applied to data
  - Doesn't necessarily imply availability: information on pyramids was very durable, but could not be accessed until discovery of Rosetta Stone
- Reliability: the ability of a system or component to perform its required functions under stated conditions for a specified period of time (IEEE definition)
  - Usually stronger than simply availability: means that the system is not only "up", but also working correctly
  - Includes availability, security, fault tolerance/durability
  - Must make sure data survives system crashes, disk crashes, other problems

How to make file system durable?
- Disk blocks contain Reed-Solomon error correcting codes (ECC) to deal with small defects in disk drive
  - Can allow recovery of data from small media defects
- Make sure writes survive in short term
  - Either abandon delayed writes or use special, battery-backed RAM (called non-volatile RAM or NVRAM) for dirty blocks in buffer cache
- Make sure that data survives in long term
  - Need to replicate! More than one copy of data!
  - Important element: independence of failure
    - Could put copies on one disk, but if disk head fails...
    - Could put copies on different disks, but if server fails...
    - Could put copies on different servers, but if building is struck by lightning...
    - Could put copies on servers in different continents...
- RAID: Redundant Arrays of Inexpensive Disks
  - Data stored on multiple disks (redundancy)
  - Either in software or hardware
    - In hardware case, done by disk controller; file system may not even know that there is more than one disk in use

Log Structured and Journaled File Systems
- Better reliability through use of log
  - All changes are treated as transactions
  - A transaction either happens completely or not at all
  - A transaction is committed once it is written to the log
    - Data forced to disk for reliability
    - Process can be accelerated with NVRAM
  - Although file system may not be updated immediately, data preserved in the log
- Difference between "Log Structured" and "Journaled"
  - Log Structured Filesystem (LFS): data stays in log form
  - Journaled Filesystem: log used for recovery
- For Journaled system:
  - Log used to asynchronously update filesystem
    - Log entries removed after used
  - After crash: remaining transactions in the log performed ("Redo")
- Examples of Journaled File Systems: Ext3 (Linux), XFS (Unix), NTFS (Windows)

Functional Programming Review
- Functional operations do not modify data structures: they always create new ones
  - Original data still exists in unmodified form
- Data flows are implicit in program design
- Order of operations does not matter

    fun foo (l: int list) = sum(l) + mul(l) + length(l)

- Order of sum(), mul(), and length() does not matter since they do not modify l

Functional Updates Do Not Modify Structures

    fun append (x, lst) =
      let val lst' = reverse lst
      in reverse (x :: lst') end

- The append() function above reverses a list, adds a new element to the front, and returns all of that, reversed, which appends an item
- But it never modifies lst!

Functions Can Be Used As Arguments

    fun DoDouble (f, x) = f (f x)

- It does not matter what f does to its argument; DoDouble() will do it twice
- What is the type of this function?

Administrivia
- Midterm #2 is next Wednesday, April 16th
  - 6-7:30pm in 10 Evans
  - Covers projects 1-3, lectures #9 (2/25) to #19 (4/9)
  - Topics: OS History, Services, and Structure; CPU Scheduling; Kernel and Address Spaces; Address Translation, Caching and TLBs; Demand Paging; I/O Systems; Filesystems, Disk Management, Naming, and Directories; Distributed Systems
- TA Review session: TBA

Map

    map f lst: ('a -> 'b) -> ('a list) -> ('b list)

- Creates a new list by applying f to each element of the input list; returns output in order

Fold

    fold f x0 lst: ('a * 'b -> 'b) -> 'b -> ('a list) -> 'b

- Moves across a list, applying f to each element plus an accumulator
- f returns the next accumulator value, which is combined with the next element of the list

[Figure: a chain of f applications across the list, starting from "initial" and ending at "returned"]

fold left vs. fold right
- Order of list elements can be significant
- Fold left moves left-to-right across the list
- Fold right moves from right-to-left
- Standard ML implementation:

    fun foldl f a [] = a
      | foldl f a (x::xs) = foldl f (f (x, a)) xs

    fun foldr f a [] = a
      | foldr f a (x::xs) = f (x, foldr f a xs)

Example

    fun foo (l: int list) = sum(l) + mul(l) + length(l)

- How can we implement this?

    fun sum (lst) = foldl (fn (x, a) => x + a) 0 lst
    fun mul (lst) = foldl (fn (x, a) => x * a) 1 lst
    fun length (lst) = foldl (fn (x, a) => 1 + a) 0 lst

More Complicated Problems
- More complicated fold problem: given a list of numbers, how can we generate a list of partial sums?
  - e.g.: [1, 4, 8, 3, 7, 9] -> [0, 1, 5, 13, 16, 23, 32]
- More complicated map problem: given a list of words, can we reverse the letters in each word, and reverse the whole list, so it all comes out backwards?
  - e.g.: ["my", "happy", "cat"] -> ["tac", "yppah", "ym"]

map Implementation

    fun map f [] = []
      | map f (x::xs) = (f x) :: (map f xs)

- This implementation moves left-to-right across the list, mapping elements one at a time
- But does it need to?

Implicit Parallelism In map
- In a purely functional setting, elements of a list being computed by map cannot see the effects of the computations on other elements
- If order of application of f to elements in list is commutative, we can reorder or parallelize execution
- This is the "secret" that MapReduce exploits

…
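The point about implicit parallelism in map can be illustrated with a small sketch. This is a Python rendering, not code from the lecture (the slides use Standard ML); `square` and `parallel_map` are hypothetical names chosen for illustration, and a thread pool is only one possible way to run the applications of f out of order.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    # A pure function: the result depends only on x and nothing is modified.
    return x * x


def parallel_map(f, items, workers=4):
    # Hypothetical helper (not from the slides): apply f to every element
    # using a pool of worker threads. Executor.map returns results in input
    # order even if the individual calls finish in a different order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(f, items))


if __name__ == "__main__":
    data = [1, 4, 8, 3, 7, 9]
    # Because square has no side effects, both strategies give the same list.
    print(list(map(square, data)))
    print(parallel_map(square, data))
```

Because f cannot observe the other elements, the sequential and parallel versions are interchangeable. In CPython, threads will not speed up CPU-bound work like this; the sketch only shows that the reordering is safe.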
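The fold and map exercises posed on the "Example" and "More Complicated Problems" slides can be worked through as follows. This is one possible set of answers, written in Python rather than the slides' Standard ML; `foldl` and `foldr` here are small hand-written helpers that follow the slides' argument order f(x, acc), and `partial_sums` and `backwards` are hypothetical names.

```python
def foldl(f, acc, lst):
    # Left fold: visit elements left to right, calling f(element, accumulator).
    for x in lst:
        acc = f(x, acc)
    return acc


def foldr(f, acc, lst):
    # Right fold: visit elements right to left, calling f(element, accumulator).
    for x in reversed(lst):
        acc = f(x, acc)
    return acc


def foo(lst):
    # sum, mul and length written as folds, as on the "Example" slide.
    total = foldl(lambda x, a: x + a, 0, lst)
    product = foldl(lambda x, a: x * a, 1, lst)
    length = foldl(lambda x, a: 1 + a, 0, lst)
    return total + product + length


def partial_sums(nums):
    # The accumulator is the list of sums so far; a new list is built at
    # every step, so no existing list is modified.
    return foldl(lambda x, acc: acc + [acc[-1] + x], [0], nums)


def backwards(words):
    # Reverse the letters of each word with map, over the reversed list.
    return list(map(lambda w: w[::-1], reversed(words)))


if __name__ == "__main__":
    print(partial_sums([1, 4, 8, 3, 7, 9]))   # [0, 1, 5, 13, 16, 23, 32]
    print(backwards(["my", "happy", "cat"]))  # ['tac', 'yppah', 'ym']
    # Same combining function, different direction, different result:
    print(foldl(lambda x, a: [x] + a, [], [1, 2, 3]))  # [3, 2, 1]
    print(foldr(lambda x, a: [x] + a, [], [1, 2, 3]))  # [1, 2, 3]
```

The last two lines show why the order of list elements can be significant: prepending in a left fold reverses the list, while the same function in a right fold rebuilds it unchanged.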
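The journaling behavior described on the "Log Structured and Journaled File Systems" slide (commit by writing to the log, update the file system asynchronously, redo after a crash) can be sketched with a toy in-memory model. Everything here is an assumption made for illustration: the class `JournaledStore`, its method names, and the record layout are hypothetical and do not correspond to Ext3, XFS, or NTFS.

```python
class JournaledStore:
    """Toy model: a dict stands in for the file system, a list for the log."""

    def __init__(self):
        self.blocks = {}  # stand-in for on-disk file system blocks
        self.log = []     # stand-in for the on-disk journal

    def commit(self, txn_id, writes):
        # Log every write, then a commit record. The transaction counts as
        # committed only once the commit record is in the log.
        for block, data in writes.items():
            self.log.append(("write", txn_id, block, data))
        self.log.append(("commit", txn_id))

    def checkpoint(self):
        # Asynchronous update: apply committed writes to the file system,
        # then remove the log entries that have been used.
        committed = {rec[1] for rec in self.log if rec[0] == "commit"}
        for rec in self.log:
            if rec[0] == "write" and rec[1] in committed:
                _, _, block, data = rec
                self.blocks[block] = data
        self.log = [rec for rec in self.log if rec[1] not in committed]

    def recover(self):
        # After a crash: redo the committed transactions still in the log and
        # discard writes whose commit record never reached the log.
        self.checkpoint()
        self.log = []


if __name__ == "__main__":
    store = JournaledStore()
    store.commit(1, {"a": "hello"})
    store.log.append(("write", 2, "b", "partial"))  # crash before commit of txn 2
    store.recover()
    print(store.blocks)  # {'a': 'hello'}
```

Redoing a committed write is safe to repeat, which is what lets recovery simply replay whatever committed transactions remain in the log.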


Berkeley COMPSCI 162 - Lecture 19 File Systems continued Distributed Systems
