Information Management
Sergey Koren
CMSC818S, Dr. Alan Sussman

Grid Information Services for Distributed Resource Sharing
Karl Czajkowski, Steven Fitzgerald, Ian Foster, Carl Kesselman

Introduction
  - Grid enables widespread sharing
    - Static and long-lived relationships
    - Highly dynamic relationships
  - Either way, users have little or no knowledge of the resources contributed

Example Applications
  - Service discovery: description of services
  - Superscheduler: routes requests to the best computer
  - Replica selection service
  - Application adaptation agent: monitors an application
  - Troubleshooting service: looks for anomalous behavior
  - Performance diagnosis: to understand the reason for anomalous behavior

Architecture
  - The examples above are all different but have a similar structure
  - It is feasible to treat these different cases within a single consistent framework
  - Large collection of information providers, called GRIS
  - Higher-level services that collect, manage, and respond to information provided by information providers (aggregate directory services), called GIIS

Protocols
  - Between higher-level services and providers
  - Soft-state registration protocol (GRRP): identifies entities participating in the service
  - Enquiry protocol (GRIP): retrieval of information about those entities
  - Integrated with GSI to provide authentication and access control

Benefits
  - Separating information retrieval from discovery and monitoring allows a wide variety of discovery and monitoring strategies
  - Soft-state registration makes the system fault tolerant

Distribution of Providers
  - Any information delivered is necessarily old
    - Producers model confidence
    - Timestamps or time-to-live metadata
  - No consistent view of global state
    - Focus on efficient delivery of state information from a single source

Failure
  - Entities and the networks providing access fail
  - Provide as much partial or inconsistent information as available
  - As distributed and decentralized as possible
  - Failure is the rule, not an exception
  - Need to be able to detect failures

Failure Scenario

Diversity in Components
  - Define common discovery and enquiry mechanisms that any Grid entity supports
  - Information producers restrict who can access pieces of information
  - Aggregates can control membership, defining a policy for which providers can join

Architecture Overview

GRIP
  - Supports search to do discovery
  - Supports enquiry for direct lookup (includes subscription)
  - Uses LDAP as the underlying protocol

GRRP
  - Pushes information about existence (sends an invitation)
  - Includes time of validity
  - Registrations expire after some lack of updates

LDAP Data Model

Aggregate Directory Services
  - Able to define any directory that is needed
  - Use GRIP/GRRP to create a hierarchical discovery service
  - Directories use GRRP to register with other directories
  - Can use alternatives to GRIP/GRRP

Hierarchical Discovery

Extensibility
  - GRIP extension
    - Additional information delivery beyond the standard in GRIP
  - Service publication
    - Use GRIP to permit discovery of other Grid resources and services
    - Describe another protocol that can be used to communicate

Security
  - Aggregate directory must be consistent with underlying providers
    - The provider trusts the directory and they have the same access policies
    - The provider limits information available to an aggregate directory
    - The provider only gives information on its existence
    - The provider places no restrictions on the information provided
  - Also authenticate GRRP messages (signature or secure channel)

Naming
  Two methods:
  1. Naming service responsible for generating unique names in its scope
    - Have to maintain an infrastructure
    - Can be similar to DNS
  2. Assign unique names at random (GUIDs)
    - No structural information
    - Can still organize hierarchically

Configuration
  - Manual: configure providers with addresses of directories, or vice versa
  - Automated discovery based on hierarchical discovery
  - Automated discovery based on another information service

A Directory Service for Configuring High Performance Distributed Computations
Steven Fitzgerald et al.

Introduction
  - Information-rich approach to configuration
  - Configuration decisions are not difficult if the right information is available

Requirements
  - Performance
  - Scalability and cost
  - Uniformity
  - Expressiveness: must be able to represent relevant structure, for example network bandwidth
  - Extensibility
  - Multiple information sources

Requirements (cont.)
  - Dynamic data: must make dynamic data available in a timely manner
  - Flexible access: read and update data as well as search
  - Security
  - Deployability: easily installed and maintained
  - Decentralized maintenance

Representation and Data Access
  - Based on the data representation of LDAP
  - Data organized into entries; each entry is an instance of an object class
  - Each entry contains attributes
  - Unique name for each entry, called a distinguished name
  - Entries organized hierarchically into a tree; the branches specify the name, as in DNS

Object Classes

MDS Data

Data Model
  - Focus on representation of computers and networks
  - Globus Network: information about the physical network
  - Globus Network Interface: contained by hosts, links to a Globus Network
  - Image versions of the above: support multiple logical views of a physical object

Implementation
  - Based on LDAP, but standard LDAP has problems

LDAP problems
  - Single information provider
    - Add ability to specify information providers on a per-attribute basis
  - Client-server architecture
    - LDAP requires at least one round-trip network communication
    - Add TTL to allow caching
  - Scope of data
    - LDAP assumes any data can be used anywhere
    - Add update scope (process, computation, global)

A Grid Monitoring Architecture
B. Tierney et al.

Introduction
  - Performance information has a fixed, often short, lifetime of utility
    - Need rapid access, but long-term storage not necessarily needed
  - Updates are frequent
    - Updated more often than read
  - Performance information is stochastic
    - Carry quality-of-information metrics

Requirements
  - Low latency: transmitted from where it is measured to where it is needed with low latency
  - High data rate
  - Minimal measurement overhead: limit impact and percentage of resource
  - Secure
  - Scalable

Design
  - Separate data discovery from data transfer
  - Directory service used to match a data source with a data sink
  - Majority of data travels directly from the producers of the data to the consumers of the data

Entities
  - Directory service: supports information publication and discovery
  - Producer: makes performance data available (source)
  - Consumer: receives performance data (sink)
  - Time-stamped events: typed collection of data with a structure specified by an event schema

Entities
  - Directory service interactions
  - Producers and consumers
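
The split between data discovery and data transfer in the Grid Monitoring Architecture slides can be illustrated with a minimal sketch. All class and method names here (Event, Producer, Consumer, DirectoryService, register, lookup, attach) are hypothetical and are not taken from the GMA document; in-process function calls stand in for what would be network transfers. The point it tries to show is that the directory only matches a source with a sink, while events travel straight from producer to consumer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    """Time-stamped, typed collection of data (hypothetical schema)."""
    event_type: str
    timestamp: float
    data: Dict[str, float]


class Producer:
    """Source: makes performance data available to subscribed consumers."""

    def __init__(self, name: str, event_type: str) -> None:
        self.name = name
        self.event_type = event_type
        self._subscribers: List[Callable[[Event], None]] = []

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, timestamp: float, data: Dict[str, float]) -> None:
        # Events go directly to consumers; the directory is not involved.
        event = Event(self.event_type, timestamp, data)
        for callback in self._subscribers:
            callback(event)


class DirectoryService:
    """Publication and discovery only: stores which producers exist."""

    def __init__(self) -> None:
        self._by_type: Dict[str, List[Producer]] = {}

    def register(self, producer: Producer) -> None:
        self._by_type.setdefault(producer.event_type, []).append(producer)

    def lookup(self, event_type: str) -> List[Producer]:
        return list(self._by_type.get(event_type, []))


class Consumer:
    """Sink: discovers producers via the directory, then receives data directly."""

    def __init__(self) -> None:
        self.received: List[Event] = []

    def attach(self, directory: DirectoryService, event_type: str) -> int:
        producers = directory.lookup(event_type)
        for producer in producers:
            producer.subscribe(self.received.append)
        return len(producers)  # number of matching sources found


if __name__ == "__main__":
    directory = DirectoryService()
    cpu = Producer("host-a", "cpu_load")  # illustrative names and values
    directory.register(cpu)

    consumer = Consumer()
    consumer.attach(directory, "cpu_load")
    cpu.publish(timestamp=1.0, data={"load": 0.75})
    print(consumer.received)
```

Keeping the directory out of the data path is the design choice the slides emphasize: the directory handles a small amount of metadata, so frequent, short-lived performance updates do not have to pass through it.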