UT Dallas CS 6350 - 14. Cassandra


Prof. Latifur Khan
Department of Computer Science, School of Engineering
UT Dallas

Reference: http://jonathanhui.com/how-cassandra-read-persists-data-and-maintain-consistency

Architecture Overview: Membership

Cluster membership in Cassandra is based on Scuttlebutt, a very efficient anti-entropy, gossip-based mechanism.

- The gossip protocol is used for cluster membership.
- It is super lightweight, with mathematically provable properties.
- State is disseminated in O(log N) rounds, where N is the number of nodes in the cluster.
- Every T seconds, each member increments its heartbeat counter and selects one other member to send its list to.
- A member merges the received list with its own list.

(A toy sketch of one gossip round appears at the end of these notes.)

Architecture Overview: Data Model

A table in Cassandra is a distributed multi-dimensional map indexed by a key. The value is an object which is highly structured. Cassandra exposes two kinds of column families: simple column families and super column families.

Issues with today's workloads:
- Data: large and unstructured, with lots of random reads and writes; foreign keys are rarely needed.
- Need: incremental scalability, speed, no single point of failure, low TCO and administration; scale out, not up.

Schema-free, sparse table:
- Flexible column naming: you define the sort order.
- A row is not required to have a specific column just because another row does.

Data model hierarchy (easier to show from the bottom up): a Keyspace contains ColumnFamilies; a ColumnFamily contains Rows indexed by Key; a Row contains Columns, which are name-sorted name/value pairs. [Diagrams of a single column, a single row, and the full data model omitted; source: DataStax.]

Read and Write in Cassandra

[Diagram: a read in Cassandra. The client sends a query to the cluster; the closest replica (Replica A) returns the result, while digest queries are sent to Replica B and Replica C.]

Writes:
- Need to be lock-free and fast: no reads or disk seeks.
- The client sends the write to one front-end node in the Cassandra cluster, the Coordinator, which, via the partitioning function, sends it to all replica nodes responsible for the key.
- Always writable (Hinted Handoff): if any replica is down, the coordinator writes to all other replicas and keeps the write until the down replica comes back up. When all replicas are down, the Coordinator (front end) buffers writes for up to an hour.
- Provides atomicity for a given key (i.e., within a ColumnFamily).
- One ring per datacenter; the Coordinator can also send the write to one replica per remote datacenter.

Writes at a replica node. On receiving a write:
1. Log it in the on-disk commit log.
2. Make changes to the appropriate memtables (in-memory representations of multiple key-value pairs).

Later, when a memtable is full or old, flush it to disk as:
- a data file: an SSTable (Sorted String Table), a list of key-value pairs sorted by key;
- an index file: (SSTable key, position in data SSTable) pairs;
- and a Bloom filter.
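The write path just described (append to the commit log, apply to a memtable, flush full memtables to sorted SSTables) can be illustrated with a short sketch. This is a minimal toy model, not Cassandra's actual implementation; the class name ReplicaStore, the MEMTABLE_LIMIT threshold, and the in-memory stand-ins for on-disk structures are all invented for illustration.

    MEMTABLE_LIMIT = 4  # flush threshold; real nodes flush on size or age, this is illustrative

    class ReplicaStore:
        """Toy model of the per-replica write path."""

        def __init__(self):
            self.commit_log = []  # stand-in for the on-disk commit log
            self.memtable = {}    # in-memory key -> value map
            self.sstables = []    # each flushed SSTable: (key, value) pairs sorted by key

        def write(self, key, value):
            # 1. Log the mutation in the commit log (durability).
            self.commit_log.append((key, value))
            # 2. Apply it to the memtable: no reads or disk seeks needed.
            self.memtable[key] = value
            # Later, when the memtable is full (or old), flush it to disk.
            if len(self.memtable) >= MEMTABLE_LIMIT:
                self.flush()

        def flush(self):
            # An SSTable is a list of key-value pairs sorted by key.
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

        def read(self, key):
            # Reads consult the memtable and then SSTables, newest first;
            # one reason reads can be slower than writes before compaction.
            if key in self.memtable:
                return self.memtable[key]
            for sstable in reversed(self.sstables):
                for k, v in sstable:
                    if k == key:
                        return v
            return None

    store = ReplicaStore()
    for i in range(5):
        store.write("user%d" % i, "value%d" % i)
    print(store.read("user0"))  # "value0", found in the flushed SSTable

A real index file and Bloom filter would let the node skip SSTables that cannot contain the key instead of scanning them linearly, which is exactly the problem compaction (next) also addresses.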
Compaction

Data updates accumulate over time, and SSTables and logs need to be compacted: merge key updates, and so on. Reads need to touch the log and multiple SSTables, so they may be slower than writes.

Deletes and Reads

Delete: don't delete the item right away. Instead, add a tombstone to the log; compaction will later remove the tombstone and delete the item.

Read: similar to writes, except:
- The Coordinator can contact the closest replica (e.g., in the same rack).
- The Coordinator also fetches from multiple replicas and checks consistency in the background, initiating a read repair if any two values are different.
- This makes reads slower than writes, but still fast.
- Read repair uses gossip (remember this).

Bootstrapping

When a node starts for the first time, it chooses a random token for its position in the ring.

Scaling the Cluster

When a new node is added to the system, it gets assigned a token such that it can alleviate a heavily loaded node.

Local Persistence

The Cassandra system relies on the local file system for data persistence.

Partitioning

One of the key design features of Cassandra is the ability to scale incrementally.

[Diagram: partitioning and replication on the ring. Nodes A through F sit on a circular space; h(key1) and h(key2) map keys to positions on the ring, and with N = 3 each key is stored on three successive nodes.]

In Cassandra, the total data managed by the cluster is represented as a circular space, or ring. The ring is divided into ranges equal to the number of nodes, with each node being responsible for one or more ranges of the overall data. Before a node can join the ring, it must be assigned a token. The token determines the node's position on the ring and the range of data it is responsible for. [1]

Partitioning in a Single Data Center (continued)

Consider a cluster with 4 nodes where the row keys managed by the cluster are numbers in the range 0 to 100. Each node is assigned a token that represents a point in this range; in this simple example, the token values are 0, 25, 50, and 75. [1]

Partitioning: Replica Placement (continued)

In multi-data-center deployments, replica placement is calculated per data center. Additional replicas in the same data center are placed by walking the ring clockwise until reaching the first node in another rack. [1]

Replication and Replication Factor

Replication is the process of storing copies of data on multiple nodes. The total number of replicas across the cluster is referred to as the replication factor: a replication factor of 1 means that there is only one copy of each row, and a replication factor of 2 means two copies of each row.

About Client Requests

All nodes in Cassandra are peers, so a client read or write request can go to any node in the cluster. When a client connects to a node and issues a read or write request, that node serves as a proxy, the coordinator, for that particular operation. The job of the coordinator is to act between the client application and the nodes (replicas) that own the data being requested.

The coordinator sends the write request to all replicas that own the row being written. If all replica nodes are up and available, they will get the write regardless of the consistency level specified by the client; the write consistency level determines how many replica nodes must respond with a success acknowledgement for the write to be considered successful.

An Example Write

For example, in a single-data-center, 10-node cluster with a replication factor of 3, an incoming write will go to all 3 nodes that own the requested row.

[Diagram: the client sends a write to the cluster; the coordinator forwards it to the three replicas R1, R2, and R3 on the ring.]

[1] http://www.datastax.com/docs/0.8/cluster_architecture
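The token assignment and clockwise replica placement described above can be sketched in a few lines. This is a simplified single-data-center model using the lecture's example tokens (0, 25, 50, 75); the names TokenRing and find_replicas are invented for illustration, real Cassandra hashes each row key with a partitioner rather than taking an integer position directly, and rack and data-center awareness are omitted.

    from bisect import bisect_left

    class TokenRing:
        """Toy token ring: node positions are defined by their tokens."""

        def __init__(self, token_to_node, replication_factor):
            self.ring = sorted(token_to_node.items())  # [(token, node), ...]
            self.tokens = [t for t, _ in self.ring]
            self.rf = replication_factor

        def find_replicas(self, key_position):
            # Primary replica: the first node whose token is >= the key's
            # position on the ring, wrapping around past the largest token.
            i = bisect_left(self.tokens, key_position) % len(self.ring)
            # Additional replicas: walk the ring clockwise (no rack awareness).
            return [self.ring[(i + j) % len(self.ring)][1] for j in range(self.rf)]

    # The lecture's example: 4 nodes, row keys in the range 0..100,
    # tokens 0, 25, 50, and 75, with a replication factor of 3.
    ring = TokenRing({0: "node1", 25: "node2", 50: "node3", 75: "node4"},
                     replication_factor=3)
    print(ring.find_replicas(42))  # ['node3', 'node4', 'node1']
    print(ring.find_replicas(90))  # wraps around: ['node1', 'node2', 'node3']

Because every node knows the full token map, any node can act as the coordinator and route a request straight to the replicas that own the row.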

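Finally, here is the toy gossip-round sketch promised in the Membership section: every T seconds each member increments its own heartbeat counter and sends its membership list to one randomly chosen peer, which merges it with its own. This is a single-process simulation with invented names (GossipMember, gossip_round); real gossip runs over the network, and failure detection is layered on top of the heartbeat state.

    import random

    class GossipMember:
        """Toy gossip participant: tracks the highest heartbeat seen per member."""

        def __init__(self, name):
            self.name = name
            self.peers = []                # other members this one may gossip with
            self.heartbeats = {name: 0}    # member name -> highest heartbeat seen

        def gossip_round(self):
            # Every T seconds: increment our own heartbeat counter...
            self.heartbeats[self.name] += 1
            # ...and send our full list to one randomly selected member.
            random.choice(self.peers).merge(self.heartbeats)

        def merge(self, incoming):
            # Merge the received list with our own, keeping the freshest entries.
            for member, beat in incoming.items():
                if beat > self.heartbeats.get(member, -1):
                    self.heartbeats[member] = beat

    # A small cluster: state spreads to all N members in O(log N) rounds.
    members = [GossipMember("node%d" % i) for i in range(8)]
    for m in members:
        m.peers = [p for p in members if p is not m]
    for _ in range(10):
        for m in members:
            m.gossip_round()
    print(sorted(members[0].heartbeats))  # node0 has heard about every member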
