UT Dallas CS 6350 - HBaseRevised - D3102927

School name University of Texas at Dallas

Course Cs 6350- Big Data Management and Analytics

Pages 29

This preview shows page 1-2-3-27-28-29 out of 29 pages.

HBASE: THE SCALABLE DATA STORE
An Introduction to HBase
XLDB Europe Workshop 2013, CERN, Geneva
James Kinley, EMEA Solutions Architect, Cloudera

"Apache HBase is the Hadoop database, a distributed, scalable, big data store."
(The Apache Software Foundation)

Why Hadoop and HBase?
- Datasets are constantly growing and intake soars
- Traditional databases are expensive to scale and inherently difficult to distribute
- Commodity hardware is cheap and powerful
Note: CERN stores 100PB of physics data, with 75PB generated in the past 3 years.

Hadoop
- Is designed to store and process extremely large datasets in batch
- Is not intended for realtime querying
- Does not support random access

History of Hadoop and HBase
Google solved its scalability problems:
- "The Google File System", published October 2003 (Hadoop DFS)
- "MapReduce: Simplified Data Processing on Large Clusters", published December 2004 (Hadoop MapReduce)
- "BigTable: A Distributed Storage System for Structured Data", published November 2006 (HBase)

What is HBase?
- Distributed
- Column Oriented
- Multi Dimensional
- High Availability (CAP)
- High Performance
- Storage System
Project goals: billions of rows, millions of columns, thousands of versions; petabytes of data stored across thousands of commodity servers.

HBase is not
- A SQL database
  - No native query engine, no SQL, no types, no joins
  - Transactions and secondary indexes only as add-ons, and these are immature
- A drop-in replacement for your RDBMS
You must be OK with the RDBMS anti-schema:
- Denormalized data
- Wide and sparsely populated tables
- Just say "no" to your DBA

HBase tables
(Slides 8 to 19 carry this title; only the title survives in the text preview.)

HBase tables
- Tables are sorted by Row Key in lexicographical order
- Table schema only defines its Column Families
  - Each family consists of any number of Columns
  - Each column consists of any number of Versions
  - Columns only exist when inserted (no NULLs)
  - Columns within a family are sorted and stored together
- Everything except the table name is byte[]
- (Table, Row Key, Family, Column, Timestamp) -> Value

HBase Architecture
- A table is made up of any number of regions
- A region is specified by its startKey and endKey
- Each region may live on a different node and is made up of several HDFS files and blocks
- Two types of node: Master and RegionServer
- Special tables -ROOT- and .META. store schema information and region locations
- The Master server monitors RegionServers and handles region assignment and load balancing
- Uses ZooKeeper for distributed coordination

HBase Architecture
(Slide 22 carries this title; only the title survives in the text preview.)

Impala
- Open source, general purpose SQL query engine
- Runs directly within Hadoop
  - Reads widely used Hadoop file formats and HBase tables
  - Talks to widely used Hadoop storage managers
  - Runs on the same nodes that run Hadoop processes
- High performance
  - C++ instead of Java
  - Runtime code generation (LLVM)
  - A completely new execution engine that doesn't build on MapReduce

Thank You
James Kinley, EMEA Solutions Architect, Cloudera
kinley@cloudera.com, @jrkinley
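The table model and region layout described in the slides can be made concrete with a small in-memory sketch. This is a toy, not the HBase client API: the names ToyTable, put, get, scan and find_region are hypothetical and chosen only for illustration. It assumes the mapping (row key, family, column, timestamp) -> value, rows ordered lexicographically by byte string, and regions identified by sorted start keys where each region ends at the next region's start key.

```python
import bisect
import time


class ToyTable:
    """In-memory toy model of one HBase-style table (not the HBase API)."""

    def __init__(self, families):
        # The schema fixes only the column families; columns appear on insert.
        self.families = set(families)
        # row key -> family -> column -> {timestamp: value}
        self.rows = {}

    def put(self, row, family, column, value, ts=None):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family!r}")
        ts = time.time_ns() if ts is None else ts
        cell = self.rows.setdefault(row, {}).setdefault(family, {}).setdefault(column, {})
        cell[ts] = value  # each write adds a version; older versions are kept

    def get(self, row, family, column, ts=None):
        """Newest value at or before ts, or None if the cell was never written."""
        cell = self.rows.get(row, {}).get(family, {}).get(column, {})
        candidates = [t for t in cell if ts is None or t <= ts]
        return cell[max(candidates)] if candidates else None

    def scan(self, start=b"", stop=None):
        """Yield (row key, row data) in byte order; start inclusive, stop exclusive."""
        for key in sorted(self.rows):
            if key >= start and (stop is None or key < stop):
                yield key, self.rows[key]


def find_region(start_keys, row):
    """Index of the region whose [startKey, next startKey) range contains row.

    start_keys must be sorted and begin with b"" (the first region has no lower bound).
    """
    return bisect.bisect_right(start_keys, row) - 1
```

With rows b"row10" and b"row2" stored, a scan returns b"row10" first, because ordering is by bytes rather than by number; this is why row keys are often zero-padded. Reading a column that was never written returns None without any stored placeholder, which mirrors the "no NULLs" point above. With start keys b"", b"g", b"p", find_region places b"hello" in the second region (index 1).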
