FIU CIS 6612 - Presentation 3 Agnostic Questions - D2100089

School name Florida International University

Course Cis 6612- Adv Software Eng

Presentation 3 Agnostic Questions
Paper: Database Access and Integration Services on the Grid
Presenter: Ariel Cary
Agnostic: Fernando Trigoso

As I was reading the paper I made notes for many “lower level” questions. However, after finishing it I realized that many prototypes still need to be implemented to answer the questions I had. Thus I tried to keep the following questions at a higher level:

1. The need for Grid database services arises from the fact that existing information is spread out across many databases. To use these databases in a distributed fashion, the document mentions wrappers, distributed transaction managers and query processors. It seems to me that every application that intends to use distributed databases would need to provide its own transaction manager and query processors. The view of the data and its dependencies may change across different applications. Is this true?

A1: The idea is to provide High Level Services (HLS) to the community of users through an interface. The interface, which could be a WSDL document, exposes the activities (operations) the Database Service is capable of; for example, a Database Service may support the SQL’92 language and also implement distributed transaction operations. The particular data model (relational, XML repository, specialized storage), which can definitely change, is not relevant to the client.

2. This paper proposes that GDS must be independent of any specific data model or database language. To achieve these traits, do we need a higher level query language able to query different data models in the same transaction? For example, we may need to query a relational database and an XML repository.

A2: A higher level language is not necessarily needed. In a Grid environment, the transaction manager will be in charge of coordinating the transaction, using the language that each Database Service supports (which is passed on to the underlying database) together with any additional operations the Service provides. The supported language and the transaction operations are presented through a consistent interface (WSDL). The paper suggests that a transaction manager may:

   1. Initiate a transaction by executing startTransaction(OUT txHandle, OUT fail) on each participating Database Service.
   2. Execute the specific commands on each Database Service: query, update, etc.
   3. Initiate the end of the transaction with prepareCommit(IN txHandle, OUT fail), using for example a two-phase commit protocol, in which the transaction manager is the coordinator and the participating services are the cohorts that agree or refuse to commit the transaction.

3. Once we start using GDS we are not only querying data on different databases but also updating and creating data. This new data may require integrity constraints. How can we enforce these constraints across different databases?

A3: In this context, during a distributed transaction, each change operation is confined to one database, and there is no integrity constraint checking across the databases. You can execute several update activities on different databases as part of a single transaction, but each operation is governed by the specific database it is executed on. Constraints in general have limitations in distributed environments, not to mention federated databases, in which there is no tight relationship among the data sources.

For example, in an Oracle or DB2 database (with which I have experience), if you partition the database/table, meaning that data segments are physically distributed among nodes/disks, and want to define a (global) Primary Key (PK), you must include the partitioning key as part of the PK definition; otherwise the DBMS will not be able to guarantee uniqueness. So, I think this would be trying to implement something on the Grid that is not supported at a lower level.

4. Let’s assume that the idea of wrappers is actually put in place for GDS. What process should we follow in adding wrappers to newly created databases that are added to the grid? Should we enforce the creation of wrappers with every database added to the grid? Or should they be created on demand when a particular application needs them?

A4: First off, the paper mentions wrappers as one possible way to make DBMSs adhere to a common interface. It seems reasonable, but it is not a strong recommendation. The process for adding a new database will basically be to provide a uniform interface to the DBMS; for example, the OGSA-DAI system uses JDBC to access the database. Whether this practice is enforced really depends on the requirements of the particular Database Service implementation.

On the other hand, database services are independent of the demand of clients, but clearly the development of wrappers, or access interfaces in general, will be dictated by the needs of the end users of the Grid Database Services.

5. If there is a need to use data from different databases, then it is likely that this data is related in some way. Thus, there may be a lot of duplicated data among these databases. For example, if we are going to use two relational databases, one of them may have a table with a field named “Student_DOB”, while the other may have a table with a field named “DOB_Of_Student”. These two fields are semantically the same. These are ideas from the Semantic Web project. Would there be any gain if a Grid database service knew the semantic equivalence or relation of the schemas of different databases?

A5: Yes, that would be a great contribution, in particular for queries that are not “accurately” specified. For example, a user may want to know whether a certain product X is available and at which stores. Suppose the user connects to a Data Service that provides product availability information and has several databases underneath. Here the semantics of each data element in the Data Service are important; we need to identify which columns represent the product ID we are looking for in order to execute the search.

In fact, that information could be included as part of the Database Metadata that the Database Services expose, and it would be used by the query processor.

/********************************************************************/

Fernando Trigoso wrote: I only have comments on the answer for
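
Illustrative sketch for A2: the three-step sequence listed there (start a transaction on each participating Database Service, execute the commands, then end with a prepare/commit round) might look roughly like the Python below. This is only a toy model under stated assumptions, not the paper's interface or the OGSA-DAI API. The class DatabaseService, the function run_distributed_transaction, the commit and rollback operations, and the can_commit flag are hypothetical names introduced for illustration; the OUT fail parameter is modelled simply as a boolean return value.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: services are compared and hashed by identity
class DatabaseService:
    """Hypothetical stand-in for one Grid Database Service."""
    name: str
    can_commit: bool = True  # simulated vote returned in the prepare phase
    log: list = field(default_factory=list)

    def start_transaction(self) -> str:
        # Models startTransaction(OUT txHandle, OUT fail); returns the handle.
        self.log.append("start")
        return f"tx-{self.name}"

    def execute(self, tx_handle: str, statement: str) -> None:
        # The statement is in whatever language this service supports.
        self.log.append(f"execute:{statement}")

    def prepare_commit(self, tx_handle: str) -> bool:
        # Models prepareCommit(IN txHandle, OUT fail); True means "ready".
        self.log.append("prepare")
        return self.can_commit

    def commit(self, tx_handle: str) -> None:      # assumed operation
        self.log.append("commit")

    def rollback(self, tx_handle: str) -> None:    # assumed operation
        self.log.append("rollback")


def run_distributed_transaction(work):
    """Coordinate (service, statement) pairs; return True if committed."""
    # Step 1: start a transaction on each participating service.
    handles = {}
    for service, _ in work:
        if service not in handles:
            handles[service] = service.start_transaction()

    # Step 2: execute each command on the service it targets.
    for service, statement in work:
        service.execute(handles[service], statement)

    # Step 3, phase one: collect a vote from every cohort.
    votes = [svc.prepare_commit(tx) for svc, tx in handles.items()]
    decision = all(votes)

    # Step 3, phase two: apply the coordinator's decision everywhere.
    for svc, tx in handles.items():
        if decision:
            svc.commit(tx)
        else:
            svc.rollback(tx)
    return decision


if __name__ == "__main__":
    relational = DatabaseService("relational")
    xml_repo = DatabaseService("xml")
    committed = run_distributed_transaction([
        (relational, "UPDATE students SET dob = NULL"),
        (xml_repo, "replace node /students/student[1]/dob"),
    ])
    print("committed:", committed)
```

Note that each statement stays in the language of the service it targets, which is the point made in A2: the coordinator only needs the common transaction operations, not a single higher level query language. Failure handling (timeouts, a coordinator crash between the two phases) is deliberately left out of this sketch.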
