UT Dallas CS 6350 - BigdataIntro#0


Big Data Management and Analytics: Introduction
Spring 2015
Dr. Latifur Khan

Introduction to Big Data
  - What is Big Data?
  - What makes data "Big Data"?

Big Data Definition
  - No single standard definition.
  - Big Data is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it.

Characteristics of Big Data: 1. Scale (Volume)
  - Data volume: 44x increase from 2009 to 2020, from 0.8 zettabytes to 35 ZB.
  - Data volume is increasing exponentially.
  - Exponential increase in collected/generated data.

Characteristics of Big Data: 2. Complexity (Variety)
  - Various formats, types, and structures.
  - Text, numerical, images, audio, video, sequences, time series, social media data, multi-dimensional arrays, etc.
  - Static data vs. streaming data.
  - A single application can be generating/collecting many types of data.
  - To extract knowledge, all these types of data need to be linked together.

Characteristics of Big Data: 3. Speed (Velocity)
  - Data is being generated fast and needs to be processed fast.
  - Online data analytics.
  - Late decisions mean missed opportunities.
  - Examples:
      - E-promotions: based on your current location, your purchase history, and what you like, send promotions right now for the store next to you.
      - Healthcare monitoring: sensors monitor your activities and body; any abnormal measurements require immediate reaction.

Big Data: 3 V's

Some Make It 4 V's

Harnessing Big Data
  - OLTP: Online Transaction Processing (DBMSs)
  - OLAP: Online Analytical Processing (Data Warehousing)
  - RTAP: Real-Time Analytics Processing (Big Data architecture and technology)

Who's Generating Big Data
  - Mobile devices (tracking all objects all the time)
  - Social media and networks (all of us are generating data)
  - Scientific instruments (collecting all sorts of data)
  - Sensor technology and networks (measuring all kinds of data)
  - Progress and innovation are no longer hindered by the ability to collect data, but by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion.

The Model Has Changed
  - The model of generating/consuming data has changed.
  - Old model: few companies are generating data; all others are consuming data.
  - New model: all of us are generating data, and all of us are consuming data.

What's Driving Big Data
  - Big data analytics:
      - Optimizations and predictive analytics
      - Complex statistical analysis
      - All types of data, and many sources
      - Very large datasets
      - More of a real-time
  - Traditional analytics, by contrast:
      - Ad hoc querying and reporting
      - Data mining techniques
      - Structured data, typical sources
      - Small to mid-size datasets

Value of Big Data Analytics
  - Big data is more real-time in nature than traditional DW applications.
  - Traditional DW architectures (e.g., Exadata, Teradata) are not well suited for big data apps.
  - Shared-nothing, massively parallel processing, scale-out architectures are well suited for big data apps.

Challenges in Handling Big Data
  - The bottleneck is in technology: new architecture, algorithms, and techniques are needed.
  - Also in technical skills: experts in using the new technology and dealing with big data.

What Technology Do We Have for Big Data?

Big Data Technology

What You Will Learn
  - We focus on Hadoop/MapReduce technology, NoSQL (key-value, BigTable, Cassandra), Spark, Stream.
  - Learn the platform (how it is designed and works): how big data are managed in a scalable, efficient way.
  - Learn writing Hadoop jobs and Spark in different languages:
      - Programming languages: Java, C, Python, Scala
      - High-level languages: Apache Pig, Hive, CQL
  - Learn advanced analytics tools on top of Hadoop:
      - Mahout: data mining and machine learning tools over big data
      - Applications: recommendations
      - Graph processing: Pregel, Giraph
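
The slides name Hadoop MapReduce as the main technology of the course but do not show what a job looks like. As a rough illustration only, the sketch below simulates the three MapReduce phases (map, shuffle, reduce) for a word count inside a single Python process. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are hypothetical and chosen for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key.

    In a real cluster the framework does this across machines; here it is
    a plain in-memory dictionary (an assumption made for illustration).
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data is big", "data is everywhere"]  # hypothetical input
    print(word_count(sample))
```

Because each call to map_phase depends only on its own line and each call to reduce_phase only on its own key, both steps could in principle run on different machines, which is the property that scale-out, shared-nothing architectures rely on.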

