FIU CIS 6612 - The Data Grid


Slide 1
The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets
A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, S. Tuecke
Presented By: Kasturi Chatterjee
Agnostic: Selim Kalayci
01/15/19

Slide 2: Agenda
    Introduction
    Data Grid Design
    Data Grid Services
    Higher-Level Data Grid Components
    Conclusion

Slide 3: Introduction
    Grid: geographically distributed computing resources configured for coordinated use
    Data Grid: database architecture for storing and handling huge amounts of data, supported by a Grid

Slide 4: Introduction
    Scientific disciplines are data intensive as well as computationally demanding
    Terabytes and petabytes of data
    Diverse domains and geographic distribution of users and resources

Slide 5: Data Grid
    Integrate heterogeneous data archives into a distributed data management grid*
    Identify services for high performance, distributed, data intensive computing*
    APIs and components required to implement it efficiently
    *from Globus project slides available at loci.cs.utk.edu/dsi/netstore99/docs/presentations/foster-d-slides.pdf

Slide 6: Agenda (as on Slide 2)

Slide 7: Data Grid Design
    Design Principles
        Mechanism Neutrality: independent of low-level mechanisms
        Policy Neutrality: design decisions are exposed to users
        Compatibility with Computational Grid: integration of storage and computation
        Uniformity of Information Infrastructure: uniform access to information about resource structure and state

Slide 8: Layered Architecture (from the paper)

Slide 9: Core Services
    Storage Systems
        DPSS: Distributed Parallel Storage System
        HPSS: High Performance Storage System
    Metadata Repository
        LDAP: Lightweight Directory Access Protocol
        MCAT: Metadata Catalogue

Slide 10: Agenda (as on Slide 2)

Slide 11: Data Grid Services
    Data Access: mechanisms for accessing, managing and initiating third-party transfers of data
    Metadata Access: mechanisms for accessing and managing information about data

Slide 12: Data Grid Services (from loci.cs.utk.edu/dsi/netstore99/docs/presentations/foster-d-slides.pdf)

Slide 13: Data Grid Services
    Storage Systems and Data Access
        Storage Systems:
            provide functions for creating, destroying, writing and manipulating file instances
            associate a set of properties such as name, size and access restrictions with each file instance
        E.g.: a data grid implementation may use SRB to access data

Slide 14: Data Grid Services
    Data Access
        APIs are defined that describe the possible operations on storage systems and file instances
        The API provides a standard interface to storage systems: create, delete, open, close, read, write and storage-to-storage transfer
        Self-optimizing capability
        Uniform access to heterogeneous systems

Slide 15: Data Grid Services
    Metadata Service
        Application Metadata, Replica Metadata and System Configuration Metadata
        Single interface to access them
            Pros: uniformity
            Cons: complex implementation
        Structured as hierarchical and distributed
            Pros: scalable, no single point of failure, local control

Slide 16: Data Grid Services
    Application Metadata: describes the information content represented by the file, the circumstances under which the data was obtained, and information applications need to process it
    Replica Metadata: data used to manage replication of data objects
    System Configuration Metadata: describes the system, i.e. network connectivity, storage systems, usage policy, etc.

Slide 17: Agenda (as on Slide 2)

Slide 18: Higher-Level Data Grid Components
    Replica Management (from I. Foster slides)
        Collections contain related files
        Logical files describe replicated physical files
        Services for managing replicated file instances
            Create / delete
            Schedule / manage data transfer
            Register in the replica catalog
            Metadata display

Slide 19: Higher-Level Data Grid Components
    How Does a Replica Manager Work?
        Maintains a repository/catalogue
        Entries correspond to logical files or file collections
        Each logical file or collection is associated with one or more physical instances
        The catalogue contains the mapping from logical files to physical instances

Slide 20: Higher-Level Data Grid Components
    The Replica Manager does not:
        determine when or where replicas are created
        determine which replicas are to be used by an application
    Keeping policy separate from the replica manager design makes it generic

Slide 21: Higher-Level Data Grid Components
    Replica Selection
        Process of choosing the replica that will optimize a desired performance criterion
        The selection process may initiate creation of a new replica
        Intelligent scheduling to determine the appropriate replica, site for (re)computation, etc.

Slide 22: Agenda (as on Slide 2)

Slide 23: Conclusion
    Implementation experience led to the adoption of collections of logical files
    Implements a computation- and data-intensive Grid architecture
    APIs provide a standard interface for various utilities
    Replica Management and Metadata services are provided using LDAP

Slide 24: Further Work
    Chervenak et al.
        1. Secure, Efficient Data Transport and Replica Management for High-Performance Data-Intensive Computing: 2001
        2. High-Performance Remote Access to Climate Simulation Data: A Challenge Problem for Data Grid Technologies: 2001
        3. A Replica Location Grid Service Implementation: 2004
        4. Applying Peer-to-Peer Techniques to Grid Replica Location Services: 2006
    Leanne Guy et al.
        Replica Management in Data Grids: 2002 (addressed read/write replicas)
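
Illustration (not part of the original slides): the replica catalogue described under Higher-Level Data Grid Components maps each logical file to one or more physical instances and leaves the choice of replica to a separate policy. The sketch below is a minimal in-memory version of that idea in Python. The class name, method names, host names and latency figures are all hypothetical; the slides state that the actual services are provided using LDAP, which this sketch does not attempt to model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ReplicaCatalog:
    """Hypothetical in-memory catalogue: logical file name -> physical locations."""

    entries: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, logical_name: str, physical_url: str) -> None:
        """Record a physical instance of a logical file (duplicates are ignored)."""
        locations = self.entries.setdefault(logical_name, [])
        if physical_url not in locations:
            locations.append(physical_url)

    def unregister(self, logical_name: str, physical_url: str) -> None:
        """Remove a physical instance; drop the entry once no instances remain."""
        locations = self.entries.get(logical_name, [])
        if physical_url in locations:
            locations.remove(physical_url)
        if not locations:
            self.entries.pop(logical_name, None)

    def lookup(self, logical_name: str) -> List[str]:
        """Return a copy of the physical locations known for a logical file."""
        return list(self.entries.get(logical_name, []))


def select_replica(
    catalog: ReplicaCatalog,
    logical_name: str,
    cost: Callable[[str], float],
) -> str:
    """Replica selection kept outside the catalogue: pick the lowest-cost replica.

    The cost function is the caller's policy (latency, load, and so on).
    """
    candidates = catalog.lookup(logical_name)
    if not candidates:
        raise KeyError(f"no replicas registered for {logical_name!r}")
    return min(candidates, key=cost)


# Usage with made-up locations and made-up latency estimates (milliseconds).
catalog = ReplicaCatalog()
site_a = "gsiftp://site-a.example.org/data/run42.nc"
site_b = "gsiftp://site-b.example.org/data/run42.nc"
catalog.register("climate/run42.nc", site_a)
catalog.register("climate/run42.nc", site_b)

latency_ms = {site_a: 80.0, site_b: 12.0}
best = select_replica(catalog, "climate/run42.nc", latency_ms.__getitem__)
```

Passing the cost function in from outside mirrors the point on Slide 20: the catalogue only stores the mapping, while the decision about which replica to use stays with the application or a higher-level selection service.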

