Duke CPS 212 - Distributed Storage and Consistency

Contents:
Distributed Storage and Consistency
Storage moves into the net
Storage as a service
Storage Abstractions
Network Block Storage
“NAS vs. SAN”
NAS vs. SAN: Cutting through the BS
Storage Architecture
Duke Mass Storage Testbed
Problems
Duke Storage Testbed, v2.0
Testbed v2.0: pro and con
Sharing Network Storage
File/Block Cache Consistency
Software DSM 101
Page Based DSM (Shared Virtual Memory)
The Sequential Consistency Memory Model
Inside Page-Based DSM (SVM)
Write-Ownership Protocol
Network File System (NFS)
NFS Protocol
File Handles
Consistency for File Systems
NFS as a “Stateless” Service
Recovery in Stateless NFS
Drawbacks of a Stateless Service
Timestamp Validation in NFS [1985]
AFS [1985]
Callback Invalidations in AFS-2
Issues with AFS Callback Invalidations
NQ-NFS Leases
Using NQ-NFS Leases
NQ-NFS Lease Recovery
NQ-NFS Leases and Cache Consistency
The Distributed Lock Lab
Remote Method Invocation (RMI)
Background Slides
Cluster File Systems
Sharing and Coordination
A Typical Unix File Tree
Filesystems
VFS: the Filesystem Switch
Vnodes
Vnode Operations and Attributes
V/Inode Cache
Pathname Traversal
Problem 1: Retransmissions and Idempotency
Solutions to the Retransmission Problem
Problem 2: Synchronous Writes
Speeding Up Synchronous NFS Writes
NFS V3 Asynchronous Writes
NFS V3 Commit

Storage moves into the net
• Storage capacity/volume
• Administrative cost
• Network bandwidth
• Network delays
• Network cost
Shared storage with scalable bandwidth and capacity.
Consolidate — multiplex — decentralize — replicate.
Reconfigure to mix-and-match loads and resources.

Storage as a service
• SSP: Storage Service Provider
• ASP: Application Service Provider
Outsourcing: storage and/or applications as a service.
For ASPs (e.g., Web services), storage is just a component.

Storage Abstractions
• relational database (IBM and Oracle): tables, transactions, query language
• file system: hierarchical name space of files with ACLs; each file is a linear space of fixed-size blocks
• block storage: SAN, Petal, RAID-in-a-box (e.g., EMC); each logical unit (LU) or volume is a linear space of fixed-size blocks
• object storage: object == file, with a flat name space (NASD, DDS, Porcupine); views of object size vary: NASD/OSD/Slice objects may act as large-ish “buckets” that aggregate file system state
• persistent objects: pointer structures, requires transactions (OODB, ObjectStore)

Network Block Storage
One approach to scalable storage is to attach raw block storage to a network.
• Abstraction: the OS addresses storage by <volume, sector>. iSCSI, Petal, FibreChannel: access through a special device driver.
• Dedicated Storage Area Network or general-purpose network: FibreChannel (FC) vs. Ethernet.
• Volume-based administrative tools: backup, volume replication, remote sharing.
• Called “raw” or “block” storage, “storage volumes”, or just “SAN”.
• Least common denominator for any file system or database. (A minimal sketch of this interface appears below.)
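To make the block abstraction concrete, here is a minimal sketch of a volume as a linear array of fixed-size sectors addressed by <volume, sector>. This is illustrative only: the names (BlockVolume, SECTOR_SIZE) are invented for this note and are not the API of iSCSI, Petal, or any real driver.

```python
# Minimal sketch of the block-storage abstraction: a volume is a
# linear array of fixed-size sectors, addressed by <volume, sector>.
# All names here are illustrative, not any real product's API.

SECTOR_SIZE = 512  # bytes; fixed for the life of the volume

class BlockVolume:
    """A logical unit (LU): a flat, numbered space of fixed-size sectors."""

    def __init__(self, num_sectors: int):
        self._store = bytearray(num_sectors * SECTOR_SIZE)
        self.num_sectors = num_sectors

    def read(self, sector: int) -> bytes:
        if not 0 <= sector < self.num_sectors:
            raise ValueError("sector out of range")
        off = sector * SECTOR_SIZE
        return bytes(self._store[off:off + SECTOR_SIZE])

    def write(self, sector: int, data: bytes) -> None:
        if len(data) != SECTOR_SIZE:
            raise ValueError("writes are whole sectors")
        off = sector * SECTOR_SIZE
        self._store[off:off + SECTOR_SIZE] = data

# The "least common denominator" point: a file system or database sees
# only read/write of numbered sectors and builds everything else
# (names, directories, tables) on top.
vol = BlockVolume(num_sectors=1024)
vol.write(7, b"x" * SECTOR_SIZE)
assert vol.read(7)[:1] == b"x"
```

A network block service exposes this same read/write-by-sector interface across a network, with the special device driver hiding the transport from the OS.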
“NAS vs. SAN”
In the commercial sector there is a raging debate today about “NAS vs. SAN”.
• Network-Attached Storage has been the dominant approach to shared storage since NFS. NAS == NFS or CIFS: named files over Ethernet/Internet (e.g., Network Appliance “filers”).
• Proponents of FibreChannel SANs market them as a fundamentally faster way to access shared storage: no “indirection through a file server” (“SAD”), lower overhead on clients, and a network that is better/faster (if not cheaper) and dedicated/trusted. Brocade, HP, and Emulex are some big players.

NAS vs. SAN: Cutting through the BS
• FibreChannel is a high-end technology incorporating NIC enhancements to reduce host overhead... but it is bogged down in interoperability problems.
• Ethernet is getting faster faster than FibreChannel: gigabit, 10-gigabit, plus smarter NICs and smarter/faster switches.
• The future battleground is Ethernet vs. Infiniband.
• The choice of network is fundamentally orthogonal to storage service design. (Well, almost: flow control, RDMA, user-level access (DAFS/VI).)
• The fundamental questions are really about abstractions: shared raw volume vs. shared file volume vs. private disks.

Storage Architecture
Any of these abstractions can be built using any, some, or all of the others. Use the “right” abstraction for your application.
Basic operations: create/remove, open/close, read/write.
The fundamental questions are:
• What is the best way to build the abstraction you want? (division of function between device, network, server, and client; see the sketch below)
• What level of the system should implement the features and properties you want?
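To make the first question concrete, here is a toy sketch that layers one abstraction over another: object storage (flat names, as in NASD or Porcupine) built on the BlockVolume from the sketch above (assumed to be in scope). Everything here, FlatObjectStore and its naive bump allocator included, is invented for illustration; a real system would also keep its name table on the volume and handle free-space reuse and crash recovery.

```python
# Toy illustration of "build one abstraction from another": a flat
# object store (name -> bytes) layered on the BlockVolume sketched
# earlier. Names are illustrative, not how NASD or Porcupine work.

class FlatObjectStore:
    """Object-storage semantics (flat names) over block storage."""

    def __init__(self, volume: BlockVolume):
        self.vol = volume
        self.table = {}      # name -> (first_sector, byte_length)
        self.next_free = 0   # naive bump allocator, no reclamation

    def create(self, name: str, data: bytes) -> None:
        nsectors = -(-len(data) // SECTOR_SIZE)  # ceiling division
        first = self.next_free
        self.next_free += nsectors
        padded = data.ljust(nsectors * SECTOR_SIZE, b"\0")
        for i in range(nsectors):
            self.vol.write(first + i,
                           padded[i * SECTOR_SIZE:(i + 1) * SECTOR_SIZE])
        self.table[name] = (first, len(data))

    def read(self, name: str) -> bytes:
        first, length = self.table[name]
        nsectors = -(-length // SECTOR_SIZE)
        raw = b"".join(self.vol.read(first + i) for i in range(nsectors))
        return raw[:length]

    def remove(self, name: str) -> None:
        del self.table[name]   # leaks the sectors; fine for a sketch

store = FlatObjectStore(BlockVolume(num_sectors=1024))
store.create("exp42/frame001", b"hello, storage")
assert store.read("exp42/frame001") == b"hello, storage"
```

The division-of-function question is visible even in this toy: the name table could live on the client, on a server, or on the device itself, and each choice yields a different storage architecture.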
Duke Mass Storage Testbed
[Diagram: IP LANs at the Med Ctr and the Brain Lab, a campus FC net, and IBM Shark/HSM storage.]
Goal: managed storage on demand for cross-disciplinary research.
Direct SAN access for “power clients” and NAS PoPs; other clients access through NAS.

Problems
Poor interoperability:
• Must have a common volume layout across heterogeneous SAN clients.
Poor sharing control:
• The granularity of access control is an entire volume.
• SAN clients must be trusted.
• SAN clients must coordinate their access.
$$$

Duke Storage Testbed, v2.0
[Diagram: the Med Ctr and the Brain Lab, the campus FC net, IBM Shark/HSM storage, and the campus IP net.]
Each SAN volume is managed by a single NAS PoP.
All access to each volume is mediated by its NAS PoP.

Testbed v2.0: pro and con
Pro:
• Supports resource sharing and data sharing.
Con:
• Does not leverage the Fibre Channel investment.
• Does not scale access to individual volumes.
• Prone to load imbalances.
• Data crosses the campus IP network in the clear.
• Identities and authentication must be centrally