2/20/2017

Is Big Data Getting Too Big? (Dr. Joe Hanson, PhD)

o We all use technology, but do we understand how it works?
o Picture a grain of rice as 1 byte.
o Computers deal in a language of 1s and 0s to store their information and compute their instructions.
    o 8 of these 1/0 bits make a byte.
    o 1,000 bytes = 1 kB; 1,000 kB = 1 MB.
    o Today we use gigabytes and terabytes. 1 terabyte of seconds is about 32,000 years.
    o We are moving far beyond these scales to larger ones like petabytes. 1 petabyte (in rice grains) could cover Manhattan island.
    o All videos on YouTube are about 500 petabytes.
    o Google stores up to 10 exabytes of data.
    o Dealing with so much storage is becoming a serious challenge.
o Today we double our store of information every 2 to 3 years.
    o 2007: all the data we had ever saved was estimated at 300 exabytes.
    o 2013: it had grown to 1,200 exabytes.
    o The total amount of data on Earth quadrupled in 6 years.
    o The acceleration will continue to accelerate.
    o The radio telescopes that make up the Square Kilometre Array will generate an exabyte of astronomical data every 4 days.
o Future: we might live in zettabytes. 1 zettabyte (in rice grains) would fill the whole Pacific Ocean.
o BIG data
    o Much of it will be useless and hard to organize.
    o Quantity will lead to changes in the quality of how we live and understand our world.
    o Example: cave painting
        o Creation was slow and captured a limited amount of information.
        o A photograph is faster to make and contains much more detail.
        o Capturing a horse's motion is meaningful, but takes more data.
    o As we increase our resolution, dividing an experience into smaller amounts of time and detail, we extract more information, which means lots of data.
    o In our DNA: sequencing the first genome cost about 3 billion dollars in 2001. Today it costs about $1,000. It is cheaper to sequence a genome than to store the data.
o Future of storage
    o Practical problem: WHERE ARE WE GOING TO KEEP IT ALL?
    o 100 years from now, it is estimated we will be storing 42 yottabytes of data every year.
    o Using the technology that computers use today, we would need enough data centers to cover the surface area of 12 Jupiters, but DNA itself might hold the answer.
    o Harvard researchers:
        o The molecule has the potential to hold petabytes of data in just a few grams of genetic material.
        o This does not yet address how to store, read, or write it.
    o Might give a whole new definition to SAVING THE WORLD.
o Stay CURIOUS.

Big Data Is Better Data (Kenneth Cukier)

o America's favorite pie: apple
    o We know this because of data: supermarket sales.
    o When supermarkets started selling smaller, individual-size pies, apple fell to 4th or 5th place.
    o When you buy a big pie, the whole family has to agree; when you buy an individual pie, you can buy the one you want.
o More data lets us see something more.
    o It allows us to see new, better, different.
    o It allows us to see that America's favorite pie is not apple.
o Big data
    o An extremely important tool by which society is going to advance.
    o In the past we looked at small data and thought about what it would mean to understand the world. Now we have far more than we could ever have had before.
    o Having a large body of data lets us fundamentally do things that we could not do before.
    o Important and new.
    o The only way we are going to deal with global challenges (supplying people with food, energy, electricity, etc.) is through the effective use of data.
o What is new about it? Information.
    o In the past: in 1908, on the island of Crete, archaeologists discovered a clay disc dated to 2000 BC, about 4,000 years old.
        o It has inscriptions, but we do not know what they mean.
        o It shows how society stored and transmitted information: it is heavy, it does not store a lot, and the information is unchangeable.
    o Now we still store information on discs, but:
        o we can store more than ever before
        o searching, copying, sharing, and processing are all easier
        o we can reuse information for many uses
    o Information went from a stock to a flow, from stationary and static to fluid and dynamic. There is a liquidity to information.
    o The information Edward Snowden took from the National Security Agency can fit on a thumb drive and be shared at the speed of light.
o We are taking things that have always been informational but were never rendered into a data format, and putting them into data.
    o Example: the question of location
        o Martin Luther, 1500s: if we wanted to know where he was, we would have had to follow him at all times.
        o Today your phone carrier records where you have been at all times, for example if your cell phone has GPS.
        o Location has been datafied.
    o Example: posture
        o The way you sit is a function of your leg length and back.
        o Researchers in Tokyo are using it as a potential anti-theft device in cars.
        o The idea is that a carjacker sits behind the wheel and tries to leave, but the car recognizes that an unapproved driver is in there and shuts off.
        o If aggregated, the data could identify the telltale signs that best predict a car accident is going to take place in the next 5 seconds.
        o Datafied driver fatigue: the car could set off an internal alarm when it senses the driver slipping into that position.
o What is the VALUE of BIG DATA?
    o More information: we can do things we could not do before.
    o Area of machine learning
        o A branch of artificial intelligence, which itself is a branch of computer science.
        o Throw data at the problem and tell the computer to figure it out itself.
    o 1950s: an IBM computer scientist named Arthur Samuel
        o He liked to play checkers and wrote a program so that he could play against the computer.
        o He won, because the computer only knew what a legal move was, while he knew strategy.
        o He wrote a small sub-program that scored the probability that a given board configuration would lead to a winning board rather than a losing one.
        o Then he left the computer to play itself. As it collects more data, the accuracy of its predictions increases.
        o Then he plays it again, and he loses. He created a machine that surpasses his ability in a task that he taught it.
    o Self-driving cars
        o Not because we are better as a society at explaining how to drive, and not because memory, algorithms, or processors got better.
        o WE CHANGED THE NATURE OF THE PROBLEM: the car figures it out.
    o Machine learning is the basis of many things we do online: search engines, personalization algorithms, computer translation, voice recognition systems.
    o Biopsies
        o Researchers asked a computer to determine, by looking at the data and survival rates, whether cells are actually cancerous or not.
        o The machine identified the 12 telltale signs that the cells in a biopsy are indeed cancerous.
        o PROBLEM: the medical literature only knew 9 of them. The machine spotted 3 that people did not know to look for.
o DARK SIDES TO BIG DATA
    o We may be punished for predictions.
        o Police may use big data for their purposes, as in Minority Report.
        o Predictive policing, or algorithmic criminology: take a lot of data about where crimes have been, and we know where to set up patrols.
        o This will probably go down to the level of the individual, using data like high school transcripts, credit scores, etc., or Fitbit data showing aggressive thoughts.
    o Privacy
    o Safeguarding free will, moral choice, human volition, human agency
    o It is going to steal our jobs.
        o It will challenge white-collar, professional knowledge work in the 21st century (compare the Industrial Revolution).
    o We need to be careful and adjust big data for our human needs. We have to be the master of the technology, not its servant.
    o We are not yet good at handling all the data we can collect.
    o Going …
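
A quick check of the storage arithmetic in the Hanson notes above, as a minimal Python sketch. The function names are made up for illustration and are not from either talk. It assumes decimal units (each step is a factor of 1,000, as in the notes), reads "1 terabyte of seconds" as 10^12 seconds, and uses a 365.25-day year.

```python
import math

# Decimal byte units as used in the notes: each step up is a factor of 1,000.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

# Assumption: one year = 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def to_bytes(value, unit):
    """Convert a value in the given decimal unit to a number of bytes."""
    return value * 1000 ** UNITS.index(unit)


def years_for_seconds(seconds):
    """Convert a count of seconds to years."""
    return seconds / SECONDS_PER_YEAR


def doubling_time(start, end, years):
    """Doubling time implied by growing from start to end over `years`,
    assuming steady exponential growth."""
    return years / math.log2(end / start)


if __name__ == "__main__":
    # "1 terabyte of seconds": 10^12 seconds expressed in years.
    print(round(years_for_seconds(to_bytes(1, "TB"))), "years")
    # 300 EB (2007) to 1,200 EB (2013): implied doubling time in years.
    print(doubling_time(300, 1200, 6), "years per doubling")
    # 42 yottabytes written out in bytes.
    print(to_bytes(42, "YB"), "bytes")
```

Under these assumptions, 10^12 seconds comes to a little under 32,000 years, and quadrupling in 6 years works out to one doubling every 3 years, which sits at the upper end of the "every 2 to 3 years" figure in the notes.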


TAMU SOCI 210 - Big Data
