UT Dallas CS 6350 - 07.NOSQL-CAP

Slide titles:
- Slide 1
- Cloud computing, cloud databases
- Cloud computing, cloud databases
- Relaxing ACID properties
- Eventual Consistency
- CAP Theorem
- https://foundationdb.com/white-papers/the-cap-theorem
- CAP Theorem
- CAP Theorem
- NoSQL databases
- Categories of NoSQL databases
- Categories of NoSQL databases
- Key-Value Data Stores
- Column-oriented*
- Column-oriented
- A table representation of a row in BigTable
- Column-oriented
- Document-based
- Typical NoSQL API
- Representatives of NoSQL databases key-valued
- Representatives of NoSQL databases column-oriented
- Representatives of NoSQL databases document-based
- Slide 23

NoSQL Databases
Slides taken from J. Pokorný, KSI MFF UK
DATAKON 2011, J. Pokorný

Cloud computing, cloud databases
Cloud computing:
- data-intensive applications on hundreds of thousands of commodity servers and storage devices
- basic features: elasticity, fault tolerance, automatic provisioning
Cloud databases: traditional scaling up (adding new, expensive big servers)
- is not possible
- requires a higher level of skills
- is not reliable in some cases
Architectural principle: scaling out (or horizontal scaling) based on data partitioning, i.e. dividing the database across many (inexpensive) machines

Cloud computing, cloud databases
Technique: data sharding, i.e. horizontal partitioning of data (e.g. hash or range partitioning)
Consequences:
- parallel access must be managed in the application
- scales well for both reads and writes
- not transparent: the application needs to be partition-aware

Relaxing ACID properties
Cloud computing: ACID is hard to achieve; moreover, it is not always required, e.g. for blogs, status updates, product listings, etc.
Availability:
- traditionally thought of as the server/process being available 99.999% of the time
- for a large-scale node system, there is a high probability that a node is down or that the network is partitioned
Partition tolerance ensures that write and read operations are redirected to available replicas when segments of the network become disconnected.

Eventual Consistency
Eventual consistency:
- when no updates occur for a long period of time, eventually all updates will propagate through the system and all the nodes will be consistent
- for a given accepted update and a given node, eventually either the update reaches the node or the node is removed from service
BASE (Basically Available, Soft state, Eventual consistency) properties, as opposed to ACID:
- Soft state: copies of a data item may be inconsistent
- Eventually consistent: copies become consistent at some later time if there are no more updates to that data item
- Basically available: faults are possible, but not a fault of the whole system

CAP Theorem
Suppose three properties of a system:
- Consistency (all copies have the same value)
- Availability (the system can run even if parts have failed)
- Partitions (the network can break into two or more parts, each with active systems that cannot influence the other parts)
Brewer’s CAP “Theorem”: for any system sharing data, it is impossible to guarantee all three of these properties simultaneously.
Very large systems will partition at some point:
- it is necessary to decide between C and A
- traditional DBMSs prefer C over A and P
- most Web applications choose A (except in specific applications such as order processing)

https://foundationdb.com/white-papers/the-cap-theorem
Brewer originally described this impossibility result as forcing a choice of "two out of the three" CAP properties, leaving three viable design options: CP, AP, and CA.
However, further consideration shows that CA is not really a coherent option because a system that is not Partition-tolerant will, by definition, be forced to give up Consistency or Availability during a partition. A more modern interpretation of the theorem is: during a network partition, a distributed system must choose either Consistency or Availability.

CAP Theorem
Drop A or C of ACID:
- relaxing C makes replication easy and facilitates fault tolerance
- relaxing A reduces (or eliminates) the need for distributed concurrency control

NoSQL databases
The name stands for Not Only SQL.
Common features:
- non-relational
- usually do not require a fixed table schema
- horizontally scalable
- mostly open source
More characteristics:
- relax one or more of the ACID properties (see CAP theorem)
- replication support
- easy API (if SQL, then only a very restricted variant of it)
Do not fully support relational features:
- no join operations (except within partitions)
- no referential integrity constraints across partitions

Categories of NoSQL databases
- key-value stores
- column NoSQL databases
- document-based
- XML databases (myXMLDB, Tamino, Sedna)
- graph databases (neo4j, InfoGrid)

Key-Value Data Stores
Example: SimpleDB
- based on Amazon’s Simple Storage Service (S3)
- items (representing objects) have one or more (name, value) pairs, where name denotes an attribute
- an attribute can have multiple values
- items are combined into domains
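
The item/attribute/domain model described for SimpleDB above can be illustrated with a small in-memory sketch. This is not the SimpleDB API: the class name `Domain`, its methods `put` and `get`, and the sample item and attribute names are all hypothetical, chosen only to show items holding multi-valued attributes inside a domain.

```python
from collections import defaultdict


class Domain:
    """Toy model of a domain: item name -> attribute name -> set of values.

    Hypothetical illustration only; this is not the SimpleDB client API.
    """

    def __init__(self, name):
        self.name = name
        self.items = defaultdict(lambda: defaultdict(set))

    def put(self, item, attribute, value):
        # An attribute can have multiple values, so values accumulate in a set.
        self.items[item][attribute].add(value)

    def get(self, item):
        # Return a plain-dict copy; an unknown item yields an empty dict.
        if item not in self.items:
            return {}
        return {attr: set(vals) for attr, vals in self.items[item].items()}


# Usage: one item with a two-valued attribute and a single-valued attribute.
products = Domain("products")
products.put("item1", "color", "red")
products.put("item1", "color", "blue")
products.put("item1", "size", "M")
print(products.get("item1"))
```

A set per attribute is one simple way to model "an attribute can have multiple values"; a real store would also need persistence and replication, which this sketch leaves out.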
Column-oriented*
- store data in column order
- allow key-value pairs to be stored (and retrieved on key) in a massively parallel system
- data model: families of attributes defined in a schema; new attributes can be added
- storing principle: big hashed distributed tables
- properties: partitioning (horizontally and/or vertically), high availability, etc., completely transparent to the application
* Better: extendible records

Column-oriented
Example: BigTable
- indexed by row key, column key and timestamp, i.e. (row: string, column: string, time: int64) → string
- rows are ordered in lexicographic order by row key
- the row range for a table is dynamically partitioned; each row range is called a tablet
- columns: syntax is family:qualifier
[Figure: a BigTable row with row key "mff.ksi.www"; columns "contents:", "anchor:cnnsi.com" and "anchor:my.look.ca"; cell values "<html>", "MFF" and "MFF.cz"; timestamps t3, t5, t6, t8, t9; label "column family"]

A table representation of a row in BigTable
Row key | Time stamp | Column name
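
The BigTable model described above, a sorted map from (row, column, time) to a string with row ranges split into tablets, can be sketched in a few lines. This is a toy in-memory illustration, not Google's implementation or API: the names `TinyBigTable`, `put`, `get`, `scan` and `tablet_for` are hypothetical, the split keys are fixed rather than dynamically chosen, and the pairing of timestamps with values is invented for the example (the flattened figure does not preserve it). Returning the newest version at or before a requested timestamp is likewise an assumption of this sketch.

```python
from bisect import bisect_right


class TinyBigTable:
    """Toy model of the map (row, column, time) -> string. Hypothetical names."""

    def __init__(self):
        # row key -> column key ("family:qualifier") -> {timestamp: value}
        self.rows = {}

    def put(self, row, column, timestamp, value):
        self.rows.setdefault(row, {}).setdefault(column, {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        """Newest value at or before `timestamp` (newest overall if None)."""
        versions = self.rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(candidates)] if candidates else None

    def scan(self):
        """Yield (row key, columns) in lexicographic row-key order."""
        for row in sorted(self.rows):
            yield row, self.rows[row]


def tablet_for(row, split_keys):
    """Index of the row range (tablet) holding `row`, given sorted split keys."""
    return bisect_right(split_keys, row)


# Usage with the row key and column names from the slide; timestamps and
# version contents are assumptions made for illustration.
table = TinyBigTable()
table.put("mff.ksi.www", "contents:", 3, "<html>v3")
table.put("mff.ksi.www", "contents:", 5, "<html>v5")
table.put("mff.ksi.www", "anchor:cnnsi.com", 9, "MFF")
print(table.get("mff.ksi.www", "contents:"))               # newest version
print(table.get("mff.ksi.www", "contents:", timestamp=4))  # version as of t=4
print(tablet_for("mff.ksi.www", ["g", "p"]))               # tablet index
```

Keeping row keys sorted is what makes range partitioning into tablets cheap: locating a tablet is a binary search over the split keys.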

