BIG DATA MANAGEMENT AND ANALYTICS (cs6350)
SPARK TUTORIAL

SPARK INSTALLATION

It works for both Linux and Windows operating systems.

  1. Go to https://spark.apache.org/downloads.html
  2. Choose a package type: "Pre-built for Hadoop 2.4 or later".
  3. Download the Spark file.
  4. Extract the file and change directory to bin.
  5. Run spark-shell.

Simple Scala Spark

    val a = Array(1, 2, 3, 2, 3, 4, 5, 2, 1, 2, 3, 4, 3, 4, 5)
    val r = sc.parallelize(a)
    val newr = r.map(x => (x, 1))
    newr.reduceByKey(_ + _).collect

The output will be:

    res6: Array[(Int, Int)] = Array((4,3), (1,2), (5,2), (2,4), (3,4))

Each pair is (value, number of occurrences). The order of the pairs is not guaranteed.

Word count program

    val in = sc.textFile("beeline")
    in.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _).collectAsMap()

Filter commands

Filter by zipcode:

    val lines = sc.textFile("users.dat")
    val ln = readLine()   // API to take input from the command line
    // The field separator was lost in the source; "::" is assumed here (MovieLens-style files).
    val linesZipcode = lines.filter(line => line.contains(ln)).map(line => line.split("::")).map(line => line(0)).collect

Finding average

Find the top 10 movies by average rating, in descending order of rating:

    val lines = sc.textFile("ratings.dat")
    // Sum of ratings per movie: (movie id, rating) pairs reduced by key.
    val sumratings = lines.map(line => line.split("::")).map(line => (line(1), line(2).toDouble)).reduceByKey(_ + _)
    // Number of ratings per movie.
    val counts = lines.map(line => line.split("::")).map(line => (line(1), 1)).reduceByKey(_ + _)

Defining functions in Scala

    def addInt(a: Int, b: Int): Int = {
      var sum: Int = 0
      sum = a + b
      return sum
    }

Applying functions

    val a = Array(1, 2, 3, 2, 3, 4, 5, 2, 1, 2, 3, 4, 3, 4, 5)
    val r = sc.parallelize(a)
    r.map(x => addInt(x, x)).collect

Stand-alone Scala programs

Create a folder structure as shown below:

    ./simple.sbt
    ./src
    ./src/main
    ./src/main/scala
    ./src/main/scala/SimpleApp.scala

In SimpleApp.scala, write your code:

    package org.apache.spark.examples.streaming

    /* SimpleApp.scala */
    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._
    import org.apache.spark.SparkConf
    import java.util.Properties

    object SimpleApp {
      def main(args: Array[String]) {
      }
    }

In simple.sbt, add the meta information, such as the main class:

    name := "Simple Project"

    version := "1.0"

    scalaVersion := "2.10.4"

    libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0"

    mainClass in (Compile, run) := Some("org.apache.spark.examples.streaming.Si...

Run the sbt commands to package and run your code:

    sbt/bin/sbt run
    sbt/bin/sbt package
    sbt/bin/sbt clean
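
The "Finding average" example above stops after computing the per-movie rating sums and counts. The remaining step (match sums with counts, divide, sort in descending order, keep the top entries) can be sketched in plain Scala collections, so it runs without a Spark installation. This is only an illustrative sketch: the object name TopAverages, the helper topAverages, the sample records, and the "::" field separator are all assumptions, not part of the original tutorial.

```scala
// Plain-Scala sketch (no Spark needed) of finishing the average-rating step.
// Assumptions: records look like UserID::MovieID::Rating::Timestamp, and the
// sample data below is made up for illustration.
object TopAverages {
  // Returns up to n (movieId, averageRating) pairs, highest average first.
  def topAverages(lines: Seq[String], n: Int): Seq[(String, Double)] = {
    val pairs = lines.map(_.split("::")).map(f => (f(1), f(2).toDouble))
    // Per-movie sum and count: the local analogue of the two reduceByKey calls.
    val sums = pairs.groupBy(_._1).map { case (id, rs) => (id, rs.map(_._2).sum) }
    val counts = pairs.groupBy(_._1).map { case (id, rs) => (id, rs.size) }
    // Match each sum with its count by movie id, then divide.
    val averages = sums.map { case (id, total) => (id, total / counts(id)) }
    // Sort by average, descending, and keep the first n.
    averages.toSeq.sortBy { case (_, avg) => -avg }.take(n)
  }

  def main(args: Array[String]): Unit = {
    val sample = Seq("1::10::5::0", "2::10::3::0", "1::20::2::0",
                     "3::30::4::0", "2::30::5::0")
    topAverages(sample, 2).foreach(println)
  }
}
```

In the Spark shell, the analogous steps on the sumratings and counts RDDs would be a join on the movie id, a per-key division, and a descending sort followed by take(10). How the original tutorial finishes this example is not shown in the source, so treat that mapping as an assumption.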