DOC PREVIEW
UT Dallas CS 6350 - 15.CQLExample

This preview shows page 1-2-3 out of 9 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 9 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Slide 1Slide 2Slide 3Slide 4Slide 5Slide 6Slide 7Slide 8Slide 9We’ll create and use atestkeyspace: cqlsh> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1} cqlsh> USE test;Go ahead and flip over to cassandra-cli and use the test keyspace there as well.[default] use test; In our first example, let’s look at the simple table composed of three fields using a simple primary key.cqlsh:test> CREATE TABLE example ( ... field1 int PRIMARY KEY, ... field2 int, ... field3 int); And we will toss in a few example rows:cqlsh:test> INSERT INTO example (field1, field2, field3) VALUES ( 1,2,3); cqlsh:test> INSERT INTO example (field1, field2, field3) VALUES ( 4,5,6); cqlsh:test> INSERT INTO example (field1, field2, field3) VALUES ( 7,8,9);And as we would expect, the resulting entry as viewed through CQL is:cqlsh:test> SELECT * FROM example; field1 | field2 | field3 --------+--------+-------- 1 | 2 | 3 4 | 5 | 6 7 | 8 | 9 However this looksverydifferent when viewed from cassandra-cli:[default@test] list example; ------------------- RowKey: 1 Þ(column=, value=, timestamp=1374546754299000) Þ(column=field2, value=00000002, timestamp=1374546754299000)Þ(column=field3, value=00000003, timestamp=1374546754299000) -------------------RowKey: 4 Þ(column=, value=, timestamp=1374546757815000) Þ(column=field2, value=00000005, timestamp=1374546757815000) Þ(column=field3, value=00000006, timestamp=1374546757815000) Þ------------------- RowKey: 7 Þ(column=, value=, timestamp=1374546761055000) Þ(column=field2, value=00000008, timestamp=1374546761055000) Þ(column=field3, value=00000009, timestamp=1374546761055000)The mapping from CQL statements to their internal representations:•The value of the CQL primary key is used internally as the row key (which in the new CQL paradigm is being called a “partition key”).•The names of the non-primary key CQL fields are used internally as columns names. The values of the non-primary key CQL fields are then internally stored as the corresponding column values.Consider the following table:CREATE TABLE tweets ( ... user text, ... time timestamp, ... tweet text, ... lat float, ... long float, ... PRIMARY KEY (user, time) ... ); Take special notice of how the primary key is defined. Again here we have a partition key,user. The partition key is always the first field in the primary key, and it can optionally be compound as in the previous example. We also have a clustering key, time.The clustering key or keys are the fields contained in the primary key asides from the partition key. It will become clear in a moment why the pieces of the primary key are labeled as “partition” or “clustering”.After entering in several items we select the contents of the table and here’s what we get:cqlsh:test> SELECT * FROM tweets; user | time | lat | long | tweet------------------+------------------------------------+---------+-----------+--------------------- softwaredoug | 2013-07-13 08:21:54-0400 | 38.162 | -78.549 | Having chest pain. softwaredoug | 2013-07-21 12:15:27-0400 | 38.093 | -78.573 | Speedo self shot. jnbrymn | 2013-06-29 20:53:15-0400 | 38.092 | -78.453 | I like programming. jnbrymn | 2013-07-14 22:55:45-0400 | 38.073 | -78.659 | Who likes cats? jnbrymn | 2013-07-24 06:23:54-0400 | 38.073 | -78.647 | My coffee is cold. What’s more, we can easily (and efficiently) select all tweets for a particular usercqlsh:test> SELECT * FROM tweets WHERE user='jnbrymn';And we can easily (and efficiently) select all tweets for a particular user within a particular time slice:cqlsh:test> SELECT * FROM tweets WHERE user='jnbrymn' AND time>='2013-07-01'; In all of the queries, it is also significant that we are efficiently retrieving all information associated with each tweet (tweet text, lat, long, …) without having to issue multiple queries.In order to understandwhythese queries are efficient, it’s informative to look again at the cassandra-cli listing of tweets:[default@test] list tweets; ------------------- RowKey: softwaredoug Þ(column=2013-07-13 08\:21\:54-0400:, value=, timestamp=1374673155373000) Þ(column=2013-07-13 08\:21\:54-0400:lat, value=4218a5e3, timestamp=1374673155373000) Þ(column=2013-07-13 08\:21\:54-0400:long, value=c29d1917, timestamp=1374673155373000) Þ(column=2013-07-13 08\:21\:54-0400:tweet, value=486176696e67206368657374207061696e2e, timestamp=1374673155373000) Þ(column=2013-07-21 12\:15\:27-0400:, value=, timestamp=1374673155407000) Þ(column=2013-07-21 12\:15\:27-0400:lat, value=42185f3b, timestamp=1374673155407000) Þ (column=2013-07-21 12\:15\:27-0400:long, value=c29d2560, timestamp=1374673155407000) Þ (column=2013-07-21 12\:15\:27-0400:tweet, value=53706565646f2073656c662073686f742e, timestamp=1374673155407000) Þ------------------- RowKey: jnbrymn Þ(column=2013-06-29 20\:53\:15-0400:, value=, timestamp=1374673155419000) Þ(column=2013-06-29 20\:53\:15-0400:lat, value=42185e35, timestamp=1374673155419000) Þ(column=2013-06-29 20\:53\:15-0400:long, value=c29ce7f0, timestamp=1374673155419000) Þ(column=2013-06-29 20\:53\:15-0400:tweet, value=49206c696b652070726f6772616d6d696e672e, timestamp=1374673155419000) Þ(column=2013-07-14 22\:55\:45-0400:, value=, timestamp=1374673155434000) Þ(column=2013-07-14 22\:55\:45-0400:lat, value=42184ac1, timestamp=1374673155434000) Þ (column=2013-07-14 22\:55\:45-0400:long, value=c29d5168, timestamp=1374673155434000) Þ(column=2013-07-14 22\:55\:45-0400:tweet, value=57686f206c696b657320636174733f, timestamp=1374673155434000) Þ (column=2013-07-24 06\:23\:54-0400:, value=, timestamp=1374673155485000) Þ(column=2013-07-24 06\:23\:54-0400:lat, value=42184ac1, timestamp=1374673155485000) Þ(column=2013-07-24 06\:23\:54-0400:long, value=c29d4b44, timestamp=1374673155485000) Þ(column=2013-07-24 06\:23\:54-0400:tweet, value=4d7920636f6666656520697320636f6c642e, timestamp=1374673155485000) We see here that internally in Cassandra, these tweets are stored in two different rows. The internal rows are keyed by the user names —the partition keys. This is why it is so efficient to retrieve all the tweets for a single user. The internal columns are named by the tweet times —the clustering keys. This is why it is so efficient to query for slices along the cluster keys. Also demonstrated here, the names of the non-private key fields are appended to the internal column names. This


View Full Document

UT Dallas CS 6350 - 15.CQLExample

Documents in this Course
HW3

HW3

5 pages

NOSQL-CAP

NOSQL-CAP

23 pages

BigTable

BigTable

39 pages

HW3

HW3

5 pages

Load more
Download 15.CQLExample
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view 15.CQLExample and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view 15.CQLExample 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?