
A Token-Based Access Control System for RDF Data in the Clouds
Arindam Khaled, Mohammad Farhan Husain, Latifur Khan, Kevin Hamlen, Bhavani Thuraisingham
Department of Computer Science, University of Texas at Dallas
Research Funded by AFOSR
CloudCom 2010

Outline
- Motivation and Background
- Semantic Web
- Security
- Scalability
- Access control
- Proposed Architecture
- Results

Motivation
- The Semantic Web is gaining immense popularity.
- Resource Description Framework (RDF) is one of the ways to represent data in the Semantic Web.
- But most of the existing frameworks either lack scalability or don't incorporate security.
- Our framework incorporates both.

Semantic Web
- Originally proposed by Sir Tim Berners-Lee, who envisioned it as a machine-understandable web.
- Powerful since it allows relationships between web resources.
- Semantic Web and ontologies are used to represent knowledge.
- Resource Description Framework (RDF) is used for its expressive power, semantic interoperability, and reusability.

Semantic Web Technologies
- Data in machine-understandable format
- Infer new knowledge
- Standards
  - Data representation: RDF
    - Triples (Subject, Predicate, Object)
    - Example: Subject http://test.com/s1, Predicate foaf:name, Object "John Smith"
  - Ontology: OWL, DAML
  - Query language: SPARQL

Current Technologies
- Joseki [15], Kowari [17], 3store [10], and Sesame [5] are a few RDF stores.
- Security is not addressed for these.
- In Jena [14, 20], efforts have been made to incorporate security.
- But Jena lacks scalability: queries over large data often become intractable [12, 13].

Cloud Computing Frameworks
- Proprietary
  - Amazon S3
  - Amazon EC2
  - Force.com
- Open source tool
  - Hadoop: Apache's open source implementation of Google's proprietary GFS file system
  - MapReduce: functional programming paradigm using key-value pairs

Cloud as RDF Stores
- Large RDF graphs can be efficiently stored and queried in the clouds [6, 12, 13, 18].
- These stores lack access control.
- We address this problem by generating tokens for specified access levels.
- Agents are assigned these tokens based on their business requirements and restrictions.

System Architecture
[Figure: system architecture. Labels shown: LUBM Data Generator (RDF/XML); Preprocessor with N-Triples Converter, Predicate Based Splitter, and Object Type Based Splitter; Preprocessed Data; MapReduce Framework with Query Rewriter, Query Plan Generator, Plan Executor, and Access Control; Hadoop Distributed File System / Hadoop Cluster. Flow labels: 1. Query, 2. Jobs, 3. Answer.]

Storage Schema
- Data in N-Triples
- Using namespaces
  - Example: http://utdallas.edu/res1 becomes utd:resource1
- Predicate-based Splits (PS)
  - Split data according to predicates
- Predicate Object-based Splits (POS)
  - Split further according to the rdf:type of objects

Example
Input triples:
  D0U0:GraduateStudent20   rdf:type          lehigh:GraduateStudent
  lehigh:University0       rdf:type          lehigh:University
  D0U0:GraduateStudent20   lehigh:memberOf   lehigh:University0

After PS:
  File rdf_type:
    D0U0:GraduateStudent20   lehigh:GraduateStudent
    lehigh:University0       lehigh:University
  File lehigh_memberOf:
    D0U0:GraduateStudent20   lehigh:University0

After POS:
  File rdf_type_GraduateStudent:
    D0U0:GraduateStudent20
  File rdf_type_University:
    D0U0:University0
  File lehigh_memberOf_University:
    D0U0:GraduateStudent20   lehigh:University0

Space Gain
Example: data size at various steps for LUBM1000

  Steps                          Number of Files   Size (GB)   Space Gain
  N-Triples                      20020             24
  Predicate Split (PS)           17                7.1         70.42%
  Predicate Object Split (POS)   41                6.6         72.5%

SPARQL Query
- SPARQL: SPARQL Protocol And RDF Query Language
- Example:
    SELECT ?x ?y WHERE { ?z foaf:name ?x . ?z foaf:age ?y }
[Figure: a query is evaluated against the data to produce a result. Labels shown: Query, Data, Result.]

SPARQL Query by MapReduce
- Example query:
    SELECT ?p WHERE {
      ?x rdf:type lehigh:Department .
      ?p lehigh:worksFor ?x .
      ?x subOrganizationOf <http://University0.edu>
    }
- Rewritten query:
    SELECT ?p WHERE {
      ?p lehigh:worksFor_Department ?x .
      ?x subOrganizationOf <http://University0.edu>
    }

Inside Hadoop MapReduce Job
[Figure: one MapReduce job answering the rewritten query. Recoverable content:
  Input file subOrganizationOf_University: (Department1, http://University0.edu), (Department2, http://University1.edu)
  Input file worksFor_Department: (Professor1, Department1), (Professor2, Department2)
  Map over subOrganizationOf_University, filtering on Object == http://University0.edu: emits Department1 -> SO#http://University0.edu
  Map over worksFor_Department: emits Department1 -> WF#Professor1 and Department2 -> WF#Professor2
  Shuffle and sort, then Reduce input: Department1 -> (SO#http://University0.edu, WF#Professor1); Department2 -> (WF#Professor2)
  Output: WF#Professor1]

Access Control in Our Architecture
- The access control module is linked to all the components of the MapReduce Framework.
[Figure: MapReduce Framework containing Query Rewriter, Query Plan Generator, and Plan Executor, each linked to Access Control.]

Motivation
- It's important to keep the data safe from unwanted access.
- Encryption can be used, but it has little or no semantic value.
- By issuing and manipulating different levels of access control, the agent could access the data intended for him or make inferences.

Access Control Terminology
- Access Tokens (AT): Denoted by integer numbers; allow agents to access security-relevant data.
- Access Token Tuples (ATT): Have the form <AccessToken, Element, ElementType, ElementName>, where Element can be Subject, Object, or Predicate, and ElementType can be described as URI, DataType, Literal, Model (Subject), or BlankNode.

Six Access Control Levels
- Predicate Data Access: Defined for a particular predicate. An agent can access the predicate file. For example, an agent possessing ATT <1, Predicate, isPaid> can access the entire predicate file isPaid.
- Predicate and Subject Data Access: More restrictive than the previous one. Combining one of these Subject ATTs with a Predicate data access ATT having the same AT grants the agent access to a specific subject of a specific predicate. For example, having ATTs <1, Predicate, isPaid> and <1, Subject, URI, MichaelScott> permits an agent with AT 1 to access a subject with URI MichaelScott of predicate isPaid.

Access Control Levels (Cont.)
- Predicate and Object: This access level permits a principal to extract the names of subjects satisfying a particular predicate and object.
- Subject Access: One of the less restrictive access control levels. The subject can be a URI, DataType, or BlankNode.
- Object Access: The object can be a URI, DataType, Literal, or BlankNode.

Access Control Levels (Cont.)
- Subject Model Level Access: This permits an agent to read all necessary predicate files to obtain all objects of a given subject. The ones which are URI objects obtained from the last step are treated as subjects to extract their respective predicates and objects. This
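
The Access Token Tuples from the Access Control Terminology slide, and the first two access levels (Predicate Data Access, and Predicate and Subject Data Access), can be sketched in Python. This is a minimal illustration and not the authors' implementation. The names `ATT`, `can_read`, and `visible`, the sample triples, token 2, and the subject `OtherEmployee` are hypothetical. Two rules are assumed here: the ElementType field is left empty for Predicate tuples, and a Subject ATT sharing a token with a Predicate ATT narrows that token's access to the named subject.

```python
from typing import List, NamedTuple, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class ATT(NamedTuple):
    """Access Token Tuple: <AccessToken, Element, ElementType, ElementName>."""
    token: int
    element: str                 # "Subject", "Object", or "Predicate"
    element_type: Optional[str]  # e.g. "URI"; None for predicates (assumption)
    name: str


def can_read(policy: List[ATT], tokens: Set[int], triple: Triple) -> bool:
    """Check one triple against the ATTs of every token the agent holds."""
    subject, predicate, _ = triple
    for token in tokens:
        held = [att for att in policy if att.token == token]
        # Predicate Data Access: the token must cover this predicate file.
        if not any(a.element == "Predicate" and a.name == predicate for a in held):
            continue
        # Predicate and Subject Data Access (assumed rule): Subject ATTs under
        # the same token restrict access to the listed subjects only.
        subjects = {a.name for a in held if a.element == "Subject"}
        if not subjects or subject in subjects:
            return True
    return False


def visible(policy: List[ATT], tokens: Set[int], triples: List[Triple]) -> List[Triple]:
    """Return the triples an agent holding `tokens` is allowed to read."""
    return [t for t in triples if can_read(policy, tokens, t)]


# Token 1 mirrors the slide's example; token 2 is a hypothetical broader token.
POLICY = [
    ATT(1, "Predicate", None, "isPaid"),
    ATT(1, "Subject", "URI", "MichaelScott"),
    ATT(2, "Predicate", None, "isPaid"),
]

# Hypothetical data, for illustration only.
TRIPLES = [
    ("MichaelScott", "isPaid", "5000"),
    ("OtherEmployee", "isPaid", "4000"),
    ("MichaelScott", "worksFor", "Department1"),
]

if __name__ == "__main__":
    print(visible(POLICY, {1}, TRIPLES))
    print(visible(POLICY, {2}, TRIPLES))
```

Under these assumptions, an agent holding token 1 sees only the isPaid triple for MichaelScott, while an agent holding token 2 sees every triple in the isPaid predicate file. The other access levels (object, subject, and subject model level) would need further rules that this sketch does not attempt.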



UTD CS 7301 - A Token-Based Access Control System for RDF Data in the Clouds
