Unformatted text preview:

Lab 3I. Merging data setsII. Updating data setsIII. “Table Look Up”IV. A Little SAS Secret90-776 Manipulation of Large Data SetsLab 3March 31, 1999Major Skills covered in today’s lab:- Keeping and dropping variables - Subsetting – keeping certain observations- Merging data setsToday’s hints: A) If you are tying to get ideas for your project, try browsing through some of the links on the data sources page on the class web page.B) Check out the LABEL= command on page 725 of that handout from Monday’s class. Just like you can label variables and variable values, you can also give a whole data set a label. You may want to consider labeling your temporary data sets in today’s labsince I ask you to create a number of similar data sets.C) When you merge data sets, don’t forget to first sort all of the data sets by the variable(s) you are merging on.D) In today’s lab, it is important to remember to use the DATA= option with your procedures because you will be dealing with lots of data sets.I. Merging data setsDo our friends from lab1 have a weight problem? Let’s find out by merging in a data set that contains information about their heights and weights. We will start with the little example SAS data set from the first lab: l:\academic\90776\data\lab1.sd2. Write the SAS code to do the following:1) Bring in the SAS lab1 data set into a new temporary data set called NAMES. Keep only the NAME and SEX variables. (In this case, it doesn’t really matter whether youuse the KEEP statement (KEEP name sex;), or KEEP= (DATA names (keep=name sex) or (SET libref.lab1 (keep=name sex)).2) Create a permanent data set called WEIGHTS that reads in the following data (using INPUT and DATALINES). WEIGHT is a variable that measures that person’s weightin pounds. HEIGHT is a variable that measures the person’s height in feet and inches.The numbers after the decimal point are the number of inches. For example, 5.5 is NOT five and one-half feet. Instead, 5.5 means 5 feet, 5 inches. Therefore, it is IMPORTANT THAT YOU READ IN THE HEIGHT VARIABLE AS A CHARACTER VARIABLE (include a “$” after the variable name in the input statement).name weight heightAlex 130 5.8Alicia 119 5.1Lester 220 5.3Amir 187 6.0Trixi 150 5.6Becky 155 5.8Check your work by including a PROC PRINT of the WEIGHTS data set.3) Create a temporary data set called NW1 that merges NAMES and WEIGHTS by the common variable, NAME. (Did you remember to: 1) first sort the two data sets (by what?), 2) use the proper merge syntax: MERGE NAMES libref.WEIGHTS; and BYNAME;)4) Use PROC PRINT to look at the merged data set, NW1. 5) Notice that in NW1, there are some missing values for a number of the observations. Let’s remerge the data into a new temporary data set called NW2. This time, keep only the observations that are in the NAMES data set. (To do this, you need to include a (IN=varname) statement after the NAMES data set. You also need to include an IF varname statement).6) Use PROC PRINT to look at the merged data set, NW2.7) Notice that in NW2, you still have some missing values since the WEIGHTS data set does not contain weights for everyone in the NAMES data set. Let’s once again remerge the data into a new temporary data set called NW3. This time, keep only the observations that are in both data sets. (To do this, you need to include a (IN=varname) statement after both data sets. You also need to include an IF varname1 and varname2 statement).8) Use PROC PRINT to look at the merged data set, NW3. How does it look?II. Updating data setsBad news – the scale that Alicia and Amir were using was broken. They found new scales and re-weighed themselves. We need to update their weights in our data set.1) Create a temporary SAS data set called FIX that reads in the following information using INPUT and DATALINES:name weightAlicia 131Alex .Amir 2022) Create a new temporary data set, NW4, which updates NW3 with the information from the FIX data set. (Instead of MERGE, you will use UPDATE).3) Use PROC PRINT to view NW4.III. “Table Look Up”Sometimes, we want to add some information to a data set based on more than variable.The SAS data set l:\academic\90776\data\ideal.sd2 contains a list of ideal weights from the Metropolitan Life Insurance Company tables (1983) (http://www.drcapella.com/html/chart.htm). IDEAL.SD2 contains 3 variables: HEIGHT, SEX (both defined as above), and IDEAL (the “ideal” weight based on a person’s height and sex as reported in the Metropolitan Life table).1) Create a temporary data set called IDEALS that reads in the reads in the IDEAL.SD2 data only for people 6 feet tall or shorter. (In your data step, you need to include thestatement: IF HEIGHT <= “6.0”; Believe it or not, SAS can evaluate this character variable as if it were a numeric variable).2) Use the print procedure to view your IDEALS data set.3) Create a new temporary SAS data set, NW5, which merges NW4 with IDEALS. Merge the data sets on the HEIGHT variable and only keep the observations that are in NW4.4) Use the print procedure to view the NW5 data set. What do you notice about the ideal weight for Alex and Becky? What do you notice about the observations for Alicia and Amir?5) Alex and Beck have the same value for the ideal weight variable because we only merged on the height variable, and they both are the same height. But, looking at the IDEALS data set, we know that males and females have different ideal weights for the same height. Also, notice that the height for Alicia and Amir both merged to two different weights since there were two entries for their heights, one for men and one for women. Therefore, to do a proper table lookup, we need to merge both on HEIGHT and SEX. Resort NW4 and IDEALS by HEIGHT and SEX (SORT HEIGHT SEX;), then create a new temporary data set, NW6, that merges NW4 and IDEALS by HEIGHT and SEX (BY HEIGHT SEX;) for observations that are in the NW4 data set.6) Now print out your NW6 data set.IV. A Little SAS Secret When you are done with the above steps, click (in SAS) on Globals, Accessories, Games and see what


View Full Document

CMU PPP 90776 - lab3

Download lab3
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view lab3 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view lab3 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?