CORNELL CS 614 - Large Scale Sharing GFS and PAST

This preview shows page 1-2-16-17-18-34-35 out of 35 pages.


Outline:
  Large Scale Sharing GFS and PAST
  Distributed File Systems
  The Google File System
  Design Space Coordinates
  GFS Architecture
  Client File Request
  Design Choices: Master
  Slide 8
  Relaxed Consistency Model
  Anatomy of a Mutation
  Connection with Consistency Model
  Special Functionality
  Master Internals
  Dealing with Faults
  Micro-benchmarks
  Storage Data for ‘real’ clusters
  Performance
  Workload Breakdown
  GFS: Conclusion
  PAST
  PAST Operations
  Pastry
  10233102: Routing Table
  PAST operations/security
  Storage Management
  Slide 26
  Slide 27
  Slide 28
  Effect of Storage Management
  Effect of tpri
  Effect of tdiv
  File and Replica Diversions
  Distribution of Insertion Failures
  Caching
  Conclusion

Large Scale Sharing: GFS and PAST
Mahesh Balakrishnan

Distributed File Systems
- Traditional definition: data and/or metadata stored at remote locations, accessed by client over the network.
- Various degrees of centralization: from NFS to xFS.
- GFS and PAST
  - Unconventional, specialized functionality
  - Large-scale in data and nodes

The Google File System
- Specifically designed for Google’s backend needs
- Web spiders append to huge files
- Application data patterns:
  - Multiple producer – multiple consumer
  - Many-way merging
- GFS ≠ Traditional File Systems

Design Space Coordinates
- Commodity components
- Very large files – multi GB
- Large sequential accesses
- Co-design of applications and file system
- Supports small files, random access writes and reads, but not efficiently

GFS Architecture
- Interface:
  - Usual: create, delete, open, close, etc.
  - Special: snapshot, record append
- Files divided into fixed size chunks
- Each chunk replicated at chunkservers
- Single master maintains metadata
- Master, Chunkservers, Clients: Linux workstations, user-level process

Client File Request
- Client finds chunkid for offset within file
- Client sends <filename, chunkid> to Master
- Master returns chunk handle and chunkserver locations

Design Choices: Master
- Single master maintains all metadata …
  - Simple design
  - Global decision making for chunk replication and placement
  - Bottleneck?
  - Single point of failure?

Design Choices: Master
- Single master maintains all metadata … in memory!
  - Fast master operations
  - Allows background scans of entire data
  - Memory limit?
  - Fault tolerance?

Relaxed Consistency Model
- File regions are -
  - Consistent: all clients see the same thing
  - Defined: after mutation, all clients see exactly what the mutation wrote
- Ordering of concurrent mutations –
  - For each chunk’s replica set, Master gives one replica primary lease
  - Primary replica decides ordering of mutations and sends to other replicas

Anatomy of a Mutation
- Steps 1-2: Client gets chunkserver locations from master
- Step 3: Client pushes data to replicas, in a chain
- Step 4: Client sends write request to primary; primary assigns sequence number to write and applies it
- Steps 5-6: Primary tells other replicas to apply write
- Step 7: Primary replies to client

Connection with Consistency Model
- Secondary replica encounters error while applying write (step 5): region Inconsistent.
- Client code breaks up single large write into multiple small writes: region Consistent, but Undefined.

Special Functionality
- Atomic Record Append
  - Primary appends to itself, then tells other replicas to write at that offset
  - If secondary replica fails to write data (step 5), duplicates in successful replicas, padding in failed ones
  - Region defined where append successful, inconsistent where failed
- Snapshot
  - Copy-on-write: chunks copied lazily to same replica

Master Internals
- Namespace management
- Replica placement
- Chunk creation, re-replication, rebalancing
- Garbage collection
- Stale replica detection

Dealing with Faults
- High availability
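
The three steps on the "Client File Request" slide can be sketched in a few lines of Python. This is a toy model, not GFS code: the `Master` class, `ChunkInfo`, `locate`, the file name, and the chunkserver names are all hypothetical, and the 64 MB chunk size is taken from the published GFS paper rather than from these slides, which only say "fixed size chunks".

```python
from dataclasses import dataclass

# Assumption: 64 MB chunks, as reported in the GFS paper (not stated on the slides).
CHUNK_SIZE = 64 * 1024 * 1024


@dataclass(frozen=True)
class ChunkInfo:
    handle: int          # chunk handle returned by the master
    locations: tuple     # names of chunkservers holding a replica


class Master:
    """Toy in-memory master: maps (filename, chunk index) to chunk metadata."""

    def __init__(self):
        self._table = {}

    def add_chunk(self, filename, index, handle, locations):
        self._table[(filename, index)] = ChunkInfo(handle, tuple(locations))

    def lookup(self, filename, index):
        return self._table[(filename, index)]


def chunk_index(offset):
    """Step 1: the client turns a byte offset into a chunk index ("chunkid")."""
    return offset // CHUNK_SIZE


def locate(master, filename, offset):
    """Steps 2-3: send <filename, chunk index>, get handle and locations back."""
    info = master.lookup(filename, chunk_index(offset))
    return info, offset % CHUNK_SIZE  # also return the offset inside the chunk


master = Master()
master.add_chunk("/logs/crawl", 0, handle=101, locations=["cs1", "cs2", "cs3"])
master.add_chunk("/logs/crawl", 1, handle=102, locations=["cs2", "cs4", "cs5"])

info, within = locate(master, "/logs/crawl", 70 * 1024 * 1024)
print(info.handle, info.locations, within)
```

Because the chunk size is fixed, the client computes the index locally and the master only has to answer a table lookup, which is part of what keeps the single master cheap.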
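
To make the "Anatomy of a Mutation" steps and their link to the consistency model concrete, here is a minimal sketch of a primary that assigns sequence numbers and forwards each write to its secondaries. Everything in it is an illustrative assumption: the `Replica` and `Primary` classes, the `fail_on` fault-injection hook, and the dictionary standing in for chunk contents. The data push of step 3 is left out.

```python
class Replica:
    def __init__(self, name, fail_on=()):
        self.name = name
        self.data = {}                # offset -> bytes, toy stand-in for a chunk
        self.applied = []             # sequence numbers applied, in order
        self.fail_on = set(fail_on)   # hypothetical fault injection by sequence number

    def apply(self, seq, offset, payload):
        if seq in self.fail_on:
            return False              # simulated error while applying the write
        self.data[offset] = payload
        self.applied.append(seq)
        return True


class Primary(Replica):
    """Replica holding the lease: it alone decides the order of mutations."""

    def __init__(self, name, secondaries):
        super().__init__(name)
        self.secondaries = secondaries
        self.next_seq = 0

    def write(self, offset, payload):
        seq = self.next_seq           # step 4: assign a sequence number ...
        self.next_seq += 1
        self.apply(seq, offset, payload)   # ... and apply the write locally
        # steps 5-6: every secondary applies the write in the same order
        results = [s.apply(seq, offset, payload) for s in self.secondaries]
        return all(results)           # step 7: reply to the client


def consistent(replicas, offset):
    """A region is consistent if every replica holds the same bytes there."""
    return len({r.data.get(offset) for r in replicas}) == 1


s1 = Replica("s1")
s2 = Replica("s2", fail_on={1})       # s2 fails on the second mutation
primary = Primary("p", [s1, s2])

ok0 = primary.write(0, b"A")
ok1 = primary.write(8, b"B")
print(ok0, consistent([primary, s1, s2], 0))
print(ok1, consistent([primary, s1, s2], 8))
```

The second write reports failure to the client and leaves the region at offset 8 inconsistent, which is the first case listed on the "Connection with Consistency Model" slide.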
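
The atomic record append behaviour (duplicates in successful replicas, padding in failed ones) is easier to see in a small simulation. The sketch below is a simplification under stated assumptions: offsets count whole records rather than bytes, the retry loop lives in a hypothetical `record_append` helper, and `AppendReplica`, `PAD`, and `fail_once` are made-up names for illustration.

```python
PAD = None  # marker for a padded (unused) slot in this toy model


class AppendReplica:
    def __init__(self, fail_once=False):
        self.records = []           # chunk contents; one list slot per record
        self.fail_once = fail_once  # hypothetical fault: fail the first write only

    def write_at(self, offset, record):
        if self.fail_once:
            self.fail_once = False
            return False
        # Pad any gap left by an earlier failed attempt, then write the record.
        self.records.extend([PAD] * (offset - len(self.records)))
        self.records.append(record)
        return True


def record_append(primary, secondaries, record, max_tries=3):
    """Retry until every replica holds the record at the same offset."""
    for _ in range(max_tries):
        offset = len(primary.records)      # primary picks the offset ...
        primary.write_at(offset, record)   # ... and appends to itself first
        # Build the full list so every secondary is attempted before checking.
        if all([s.write_at(offset, record) for s in secondaries]):
            return offset
    raise IOError("record append failed on every attempt")


primary = AppendReplica()
s1 = AppendReplica()
s2 = AppendReplica(fail_once=True)

offset = record_append(primary, [s1, s2], "r")
print(offset, primary.records, s1.records, s2.records)
```

After the retry, the primary and `s1` hold the record twice, `s2` holds padding followed by the record, and all three agree at the returned offset. Readers therefore have to tolerate duplicates and padding, but the region where the append succeeded is defined.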

