UT Dallas CS 6350 - NOSQL-CAP


NoSQL Databases
Slides taken from J. Pokorný, KSI MFF UK (DATAKON 2011)

Cloud computing, cloud databases

Cloud computing:
- data-intensive applications running on hundreds of thousands of commodity servers and storage devices
- basic features: elasticity, fault tolerance, automatic provisioning

Cloud databases: traditional scaling up (adding new, expensive, big servers) is not possible because it
- requires a higher level of skills
- is not reliable in some cases

Architectural principle: scaling out (horizontal scaling) based on data partitioning, i.e. dividing the database across many inexpensive machines.

Technique: data sharding, i.e. horizontal partitioning of data (e.g. hash or range partitioning).

Consequences:
- parallel access must be managed in the application
- scales well for both reads and writes
- not transparent: the application needs to be partition-aware

Relaxing ACID properties

Cloud computing: ACID is hard to achieve; moreover, it is not always required, e.g. for blogs, status updates, product listings, etc.

Availability:
- traditionally thought of as the server or process being available 99.999% of the time
- in a large-scale system with many nodes, there is a high probability that some node is down or that the network is partitioned

Partition tolerance: ensures that write and read operations are redirected to available replicas when segments of the network become disconnected.

Eventual Consistency

Eventual consistency:
- when no updates occur for a long period of time, eventually all updates will propagate through the system and all the nodes will be consistent
- for a given accepted update and a given node, eventually either the update reaches the node or the node is removed from service

BASE (Basically Available, Soft state, Eventual consistency) properties, as opposed to ACID:
- Soft state: copies of a data item may be inconsistent
- Eventually Consistent: copies become consistent at some later time if there are no more updates to that data item
- Basically Available: faults are possible, but not a fault of the whole system

CAP Theorem

Suppose three properties of a system:
- Consistency: all copies have the same value
- Availability: the system can run even if parts have failed
- Partitions: the network can break into two or more parts, each with active systems that cannot influence the other parts

Brewer's CAP Theorem: for any system sharing data, it is impossible to guarantee all three of these properties simultaneously.

Very large systems will partition at some point, so it is necessary to decide between C and A:
- traditional DBMS prefer C over A and P
- most Web applications choose A (except in specific applications such as order processing)

From the FoundationDB white paper "The CAP Theorem" (foundationdb.com):
Brewer originally described this impossibility result as forcing a choice of "two out of the three" CAP properties, leaving three viable design options: CP, AP, and CA. However, further consideration shows that CA is not really a coherent option, because a system that is not Partition-tolerant will, by definition, be forced to give up Consistency or Availability during a partition. A more modern interpretation of the theorem is: during a network partition, a distributed system must choose either Consistency or Availability.

Drop A or C of ACID:
- relaxing C makes replication easy and facilitates fault tolerance
- relaxing A reduces (or eliminates) the need for distributed concurrency control

NoSQL databases

The name stands for Not Only SQL.

Common features:
- horizontally scalable
- mostly open source

More characteristics:
- non-relational; usually do not require a fixed table schema
- relax one or more of the ACID properties (see CAP theorem)
- replication support
- easy API (if SQL, then only a very restricted variant)

They do not fully support relational features:
- no join operations (except within partitions)
- no referential integrity constraints across partitions

Categories of NoSQL databases

- key-value stores
- column NoSQL databases
- document-based
- XML databases (myXMLDB, Tamino, Sedna)
- graph databases (neo4j, InfoGrid)

Key-Value Data Stores

Example: SimpleDB
- based on Amazon's Simple Storage Service (S3)
- items (which represent objects) have one or more (name, value) pairs, where the name denotes an attribute
- an attribute can have multiple values
- items are combined into domains

Column-oriented

- store data in column order
- allow key-value pairs to be stored (and retrieved on key) in a massively parallel system
- data model: families of attributes defined in a schema; new attributes can be added
- storing principle: big hashed distributed tables
- properties: partitioning (horizontally and/or vertically), high availability, etc., completely transparent to the application
- a better name: extendible records

Example: BigTable
(Figure: an example BigTable row with a Contents column and anchor columns (cnnsi.com, my.look.ca), with cell versions at timestamps t3, t5, t6, t8 and t9.)
- indexed by row key, column key and timestamp, i.e. (row: string, column: string, time: int64) -> string
- rows are ordered lexicographically by row key
- the row range of a table is dynamically partitioned; each row range is called a tablet
- column syntax is family:qualifier

A table representation of a row in BigTable:

Row key        Time stamp   Column name   Column family Grandchildren
http://ksi...  t1           Jack          Claire: 7
               t2           Jack          Claire: 7, Barbara: 6
               t3           Jack          Claire: 7, Barbara: 6, Magda: 3

Example: Cassandra
- keyspace: usually the name of the application, e.g. Twitter, Wordpress
- column family: a structure containing an unlimited number of rows
- column: a tuple with name, value and time stamp
- key: the name of a record
- super column: contains more columns

Document-based

- based on the JSON format: a data model which supports lists, maps, dates and Booleans, with nesting
- in reality: indexed semistructured documents

Example: Mongo

  { Name: "Jaroslav",
    Address: "Malostranské nám. 25, 118 00 Praha 1",
    Grandchildren: { Claire: 7, Barbara: 6, Magda: 3, Kirsten: 1, Otis: 3, Richard: 1 } }

Typical NoSQL API

Basic API access:
- get(key): extract the value given a key
- put(key, value): create or update the value given its key
- delete(key): remove the key and its associated value
- execute(key, operation, parameters)
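
The get, put and delete calls listed under the typical NoSQL API, together with the hash partitioning mentioned under data sharding, can be illustrated with a small in-memory sketch. This is a toy model under stated assumptions, not the API of any particular product: the class name `ShardedKVStore`, the helper `shard_for`, and the choice of SHA-256 as the hash are all hypothetical. The `execute` call is left out because the preview text does not describe it.

```python
import hashlib
from typing import Any, Dict, List, Optional


class ShardedKVStore:
    """Toy key-value store; each dict stands in for one machine (hypothetical)."""

    def __init__(self, num_shards: int = 4) -> None:
        if num_shards < 1:
            raise ValueError("num_shards must be at least 1")
        self.shards: List[Dict[str, Any]] = [{} for _ in range(num_shards)]

    def shard_for(self, key: str) -> int:
        # Hash partitioning: a stable digest of the key selects the shard.
        # (Python's built-in hash() is randomized per process for strings,
        # so a cryptographic digest is used here purely for stability.)
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def get(self, key: str) -> Optional[Any]:
        # Extract the value given a key; None if the key is absent.
        return self.shards[self.shard_for(key)].get(key)

    def put(self, key: str, value: Any) -> None:
        # Create or update the value given its key.
        self.shards[self.shard_for(key)][key] = value

    def delete(self, key: str) -> bool:
        # Remove the key and its value; report whether the key existed.
        shard = self.shards[self.shard_for(key)]
        if key in shard:
            del shard[key]
            return True
        return False


if __name__ == "__main__":
    store = ShardedKVStore(num_shards=3)
    store.put("user:1", {"Name": "Jaroslav"})
    print(store.shard_for("user:1"), store.get("user:1"))
    store.delete("user:1")
    print(store.get("user:1"))
```

Because the caller (or a client library like this one) decides which shard holds a key, the sketch also shows why sharding is "not transparent": changing `num_shards` changes where existing keys are looked up, so real systems need a rebalancing strategy that this example does not attempt.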

