Instructions for using the Qloud (Hadoop 0.20) Cloud Infrastructure at CMUQ (The Qloud)

CMUQ has dedicated cloud infrastructure running on a 14-blade IBM BladeCenter with a total of 112 physical cores and 7 TB of storage. The cloud is managed using IBM Cloud 1.4 management software with Tivoli Provisioning Manager, which allows cloud resources to be provisioned via software. Users can request virtual machines (VMs) on the cloud with customized CPUs, RAM and disk space, as well as custom OS/software images. These machines run Red Hat Linux on top of the Xen virtualization platform and come pre-configured with Java and the Hadoop SDKs.

Since these resources are virtualized and provisioned through the management server, cloud resources can be accessed only through the Cloud Gateway (see Figure 1). From your computer, you will need to log on to the hadoop.qatar.cmu.edu server, which is where you will run Eclipse and communicate with the provisioned cloud. This machine does not run any of the Hadoop code; it just acts as a liaison with your provisioned cloud.

Figure 1 - CMU Cloud Infrastructure Logical View

Qloud User Requirements

1. Login account for the CMUQ network. Please contact someone if you need one. In this document, we will assume that this username is login1.
2. Login for the Cloud management software (provided to you along with this document). We will assume that this username is login2.
3. Basic knowledge of Unix commands and Unix text editors.
4. X-Win32 for Windows machines (can be downloaded from http://www.qatar.cmu.edu/myandrew/ - login required). You may also need WinSCP, PuTTY and other software to interact with the cloud servers from Windows machines.

Part I: Setting up your Cloud

1. Start X-Win32 and configure an SSH connection to hadoop.qatar.cmu.edu using login1. Start Firefox remotely on hadoop.qatar.cmu.edu by using the following command (if you are not connecting from Windows, see the note after this list):
      firefox &
2. In Firefox, choose Edit -> Preferences.
3. Click on Advanced, then Network, then Settings.
4. Choose Manual proxy configuration; under SOCKS Host enter "cloud-01-14.qatar.cmu.edu", and for the SOCKS port enter 3900.
5. Click OK, and then Close.
6. Go to the address http://10.160.0.100:9080/cloud/
7. Log in using the supplied username and password and request a Cloud.
8. Select the required dates (choose the Cloud for the semester duration). Include your name in the project name and choose Project Type: "Hadoop customized for CMU".
9. Confirm your choice; your cloud will be available in about 24 hours.
10. Check back in 24 hours using this website from hadoop.qatar.cmu.edu. Verify that you have the requested number of VMs and that they are all active. You may expand the information on each VM (node) to check its IP address and its Admin password. The last listed node is your master node.
11. You may use this web interface at any time to get the current status of your provisioned cloud.
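Note (alternative to step 1 for non-Windows machines): if you are connecting from a Linux or Mac machine instead of using X-Win32, an ordinary SSH session with X11 forwarding should give you the same remote Firefox session. This is only a sketch and assumes X11 forwarding is permitted on hadoop.qatar.cmu.edu:

      ssh -X login1@hadoop.qatar.cmu.edu    # log in with X11 forwarding enabled
      firefox &                             # Firefox runs remotely, displays locally

The SOCKS proxy settings in steps 2-5 are configured the same way in either case.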
Part II: Configuring Hadoop on your Cloud and command-line execution of MapReduce programs

In this section, you will be accessing your Cloud remotely through the master node. This can only be done through the cloud gateway server: cloud-01-14.qatar.cmu.edu.

1. SSH to cloud-01-14.qatar.cmu.edu using login2.
2. SSH to your master node by using the following command (use the IP address and Admin password of your master node that you got from the last step of the previous part):
      ssh <username>@<master node IP address>
3. Execute the following commands to start the Hadoop daemons on your Cloud:
      su - hadoop
      hadoop namenode -format
      start-all.sh
      hadoop dfs -chmod 777 /
   If, for some reason, you need to restart your Hadoop Cloud in the future, you may run these commands again but skip the namenode -format command.
4. Verify that Hadoop is working correctly on your Cloud by executing the Pi estimation example in the Hadoop examples JAR file:
      cd /hadoop/hadoop-0.20.1/
      hadoop jar hadoop-0.20.1-examples.jar pi 10 100
   This runs the Pi estimation code using random sampling; the first argument (10) is the number of maps and the second argument (100) is the number of samples per map. The source code of this program is available at src/examples/org/apache/hadoop/examples/PiEstimator.java in the current folder. While your job is running, you may use the JobTracker web interface to see the progress of the job. Open Firefox from hadoop.qatar.cmu.edu (as described in steps 1 and 2 of Part I) and go to the web address http://10.160.4.5:50030 (use the IP address of your own master node and port 50030). Browse through the JobTracker interface for more information regarding the jobs on your Hadoop system.

Part III: Using HDFS

In this section, you will be accessing your Cloud through hadoop.qatar.cmu.edu. For this, we first need to copy the configuration files from your Cloud to that machine and configure Hadoop to access your Cloud through a SOCKS proxy (cloud-01-14.qatar.cmu.edu).

1. SSH to cloud-01-14.qatar.cmu.edu using login2.
2. SSH to your master node by using the following command (use the IP address of your master node):
      ssh <username>@<master node IP address>
3. Execute the following commands to copy the configuration files to hadoop.qatar.cmu.edu:
      scp <username>@<master node IP address>:/hadoop/hadoop-0.20.1/conf/*-site.xml .
      scp *-site.xml hadoop.qatar.cmu.edu:hadoop-conf/
4. SSH to hadoop.qatar.cmu.edu using login1. Go to the directory hadoop-conf/. Using any available text editor (e.g., vim or nano), edit core-site.xml and add the following property before the last line (</configuration>):
      <property>
        <name>hadoop.socks.server</name>
        <value>cloud-01-14.qatar.cmu.edu:3900</value>
        <description>
          Address (host:port) of the SOCKS server to be used by the SocksSocketFactory.
        </description>
      </property>
5. Still editing core-site.xml, find the hadoop.rpc.socket.factory.class.ClientProtocol and hadoop.rpc.socket.factory.class.JobSubmissionProtocol properties. For both of them, change the value to org.apache.hadoop.net.SocksSocketFactory and remove the <final>true</final> element.
6. Edit mapred-site.xml, and add the following properties to the file before </configuration>:
      <property>
        <name>mapred.reduce.tasks</name>
        <value>4</value>
      </property>
      <property>
        <name>mapred.reduce.copy.backoff</name>
        <value>1</value>
      </property>
7. Run the following commands to set up your HDFS filesystem from this server. These commands create the input directory for the wordcount application to be run in Part V:
      hadoop dfs -mkdir /user
      hadoop dfs -mkdir /user/me
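As a quick sanity check that the new HDFS directory is usable, the sketch below stages a small local file into it and runs the wordcount program that ships in the examples JAR used in Part II. This is only an illustration, not the Part V assignment: the file name sample.txt and the output path /user/me/wc-out are made-up names, and the sketch assumes it is run on the master node as the hadoop user, where the examples JAR path from Part II is known.

      echo "hello qloud hello hadoop" > sample.txt       # small local test file (made-up content)
      hadoop dfs -put sample.txt /user/me/sample.txt     # copy it into HDFS
      hadoop dfs -ls /user/me                            # confirm the file is listed
      cd /hadoop/hadoop-0.20.1/
      hadoop jar hadoop-0.20.1-examples.jar wordcount /user/me/sample.txt /user/me/wc-out
      hadoop dfs -cat /user/me/wc-out/part-*             # each word with its count
      hadoop dfs -rmr /user/me/wc-out                    # remove the output before re-running

If the -cat step prints a count next to each word, HDFS and job submission are working end to end.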

