UH COSC 6360 - Deciding when to forget in the Elephant file system

Deciding when to forget in the Elephant file system
Douglas S. Santry, Michael J. Feeley, Norman C. Hutchinson, Alistair C. Veitch, Ross W. Carton, Jacob Ofir

Key Idea
• Elephant automatically retains all important versions of user files
• Elephant uses file-grain, user-specified retention policies to reclaim storage
• Previous file versions are named by combining a traditional pathname with a time when the desired version of a file or directory existed

INTRODUCTION
• Modern file systems associate
  – Deletion of a file with the immediate release of storage
  – File writes with the irrevocable change of file contents
• Users control what is on disk by explicitly creating, updating and deleting files
• This was the best solution when disk space was at a premium

The problem
• The key problem with the current approach is that user actions have an immediate and irrevocable effect on disk storage
  – Users are not protected against their own mistakes
• This goes against the file system objective of protecting data against failure
• We can do better today

Current solutions (I)
• Cedar protected against accidental overwrites by saving the last few versions of a file
  – Cedar files were immutable: each write created a new version of the file
  – Does nothing for deleted files
• Windows and Mac OS allow users to undelete recently deleted files
  – Does nothing for files that were overwritten

Current solutions (II)
• Many systems are regularly backed up
  – Can restore the state of any file at backup time
• Many users maintain multiple versions of their critical data

Basic issues
• We can maintain multiple versions of user files, but not all versions of all files
  – Need a retention policy
• Should we involve the user in the retention/reclamation decisions? Involving the user means
  – Less protection from user mistakes
  – A retention policy that might be better suited to the users' needs

Not all files are created equal
• Read-only files (like application executables) have no version history
• Derived files (like object files) can be easily reconstituted
• Cached files require no version history
• Temporary files might benefit from a short-term history but not from a long-term history
• User-modified files would benefit most from both a long-term and a short-term history

The two objectives
• Providing users with the ability to undo recent changes
  – Keep the complete history of a file over a short period of time (one hour to one week)
• Maintaining a long-term history of important versions of each file
  – Keep landmark versions of each file forever

Finding the landmark versions
• Could rely on the user
  – User ability to recognize landmark versions of a file degrades with the age of the versions
• Elephant detects landmark versions by looking at the time line of updates to the file
  – Can identify groups of updates separated by long periods of stability
  – The last version of each group of updates is assumed to be a landmark version

User interface
• File versions are
  – Indexed by their creation time
  – Named by combining the file pathname with a date and time
• Versioning is extended to directories
  – Allows recovery of deleted files
• Previous versions of a file or a directory are read-only

Retention policies (I)
• Keep One: keeps only the latest version of the file
• Keep All: keeps all versions of the file
• Keep Safe: keeps all versions of the file during a specific second-chance interval
• Keep Landmarks: keeps all versions of the file during a specific second-chance interval and only landmark versions after that

Retention policies (II)
• The Keep Landmarks policy also allows the user to group files for consideration
  – Important for inter-dependent files, as their consistency requires viewing all files as of the same point in time
  – The grouping policy is quite flexible: the user can specify
    • Individual files
    • Entire directories or subtrees

Implementation (I)
• I-nodes of non-versioned files are stored in a special i-node file
• I-nodes of versioned files are stored in an i-node log
  – Versions are stored as an ordered sequence of i-nodes
  – Changes are detected at the block level
  – Versions of the same file share identical blocks

Implementation (II)
• Elephant uses a different mechanism for versioned directories
  – We did not discuss it in class

Performance
• Somewhat slower than conventional file systems
• Using HP-UX traces collected at HP Labs, one can estimate that Keep Landmarks files would account for 62.4% of files but only 15.2% of the disk
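
The four retention policies and the landmark heuristic can be sketched in a few lines of Python. This is only an illustration of the ideas on the slides, not Elephant's actual code: the names `Policy`, `find_landmarks` and `retained_versions` are hypothetical, the fixed `stability_gap` threshold is an assumption standing in for "long periods of stability", and always keeping the current version under Keep Safe is likewise an assumption.

```python
from datetime import datetime, timedelta
from enum import Enum


class Policy(Enum):
    # Hypothetical names for the four policies listed on the slides.
    KEEP_ONE = 1
    KEEP_ALL = 2
    KEEP_SAFE = 3
    KEEP_LANDMARKS = 4


def find_landmarks(times, stability_gap):
    """Return the last version of each group of updates.

    Assumption: two consecutive versions belong to different groups when
    they are separated by at least `stability_gap` (a fixed threshold).
    """
    times = sorted(times)
    landmarks = []
    for i, t in enumerate(times):
        is_newest = i == len(times) - 1
        if is_newest or times[i + 1] - t >= stability_gap:
            landmarks.append(t)
    return landmarks


def retained_versions(policy, times, now, second_chance, stability_gap):
    """Return the creation times of the versions a policy would keep."""
    times = sorted(times)
    if not times:
        return []
    current = times[-1]
    if policy is Policy.KEEP_ONE:
        return [current]
    if policy is Policy.KEEP_ALL:
        return times
    # Versions still inside the second-chance interval (plus the current one).
    keep = {t for t in times if now - t <= second_chance}
    keep.add(current)
    if policy is Policy.KEEP_LANDMARKS:
        # After the interval expires, only landmark versions survive.
        keep.update(find_landmarks(times, stability_gap))
    return [t for t in times if t in keep]
```

With three bursts of edits (for example, three versions within ten minutes, two more three days later, and one a week after that), Keep Landmarks would retain the final version of each burst indefinitely, while the other versions disappear once they fall outside the second-chance interval.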

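The "User interface" and "Implementation (I)" slides say that versions are indexed by creation time, named by a pathname plus a time, and stored as an ordered sequence of i-nodes that share identical blocks. A minimal in-memory sketch of that lookup follows. The class `VersionLog` and its methods are hypothetical stand-ins for the i-node log, times are plain numbers (for example, seconds), and block numbers are arbitrary integers chosen for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class VersionLog:
    """Hypothetical stand-in for the i-node log of one versioned file."""
    created: list = field(default_factory=list)  # creation times, ascending
    blocks: list = field(default_factory=list)   # block numbers per version

    def append(self, when, block_ids):
        """Record a new version; versions must arrive in time order."""
        if self.created and when <= self.created[-1]:
            raise ValueError("versions must be appended in time order")
        self.created.append(when)
        self.blocks.append(tuple(block_ids))

    def version_at(self, when):
        """Index of the version that existed at time `when`, or None."""
        i = bisect_right(self.created, when)
        return i - 1 if i else None

    def shared_blocks(self, i, j):
        """Block numbers that versions i and j have in common."""
        return set(self.blocks[i]) & set(self.blocks[j])


log = VersionLog()
log.append(100, [1, 2, 3])   # first version uses blocks 1, 2, 3
log.append(200, [1, 2, 4])   # only the last block changed
```

Because the creation times are kept in order, resolving a (pathname, time) name to a version is a binary search, and a version that changes one block only needs new storage for that block.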
