
17.871, Political Science Lab
Spring 2008
Problem set # 1: Using STATA
Handed out: Feb. 11, 2008
Due: Feb. 20, 2008, at the beginning of class.

For Parts I-III, turn in (1) a do file that contains the commands used to answer the problems (as well as your answers) and (2) a “log” file produced from running the do file.

Part I: Speed-dating data (one point each)

The speed-dating data come from studies conducted in New York City by Ray Fisman and Sheena Iyengar, an economist and a psychologist at Columbia University. If you're interested, they summarize their findings in this paper. You'll need to look at the codebook. Here's the abstract:

We study dating behavior using data from a Speed Dating experiment where we generate random matching of subjects and create random variation in the number of potential partners. Our design allows us to directly observe individual decisions rather than just final matches. Women put greater weight on the intelligence and the race of partner, while men respond more to physical attractiveness. Moreover, men do not value women's intelligence or ambition when it exceeds their own. Also, we find that women exhibit a preference for men who grew up in affluent neighborhoods. Finally, male selectivity is invariant to group size, while female selectivity is strongly increasing in group size.

1. Open the speed_dating file from the course folder or off the class website.
2. Recode the variable date so that the values roughly correspond to the number of dates per year (e.g., once a week = 52) and call this variable dates. Do this once with generate and replace. Drop this first variable (drop dates). Do this again with recode.
3. What's the modal category on the dates variable? (Hint: tabulate)
4. How many dates does the average participant go on each year? (Hint: summarize)
5. Who goes out on dates more often: men or women? (Hint: tabulate gender with sum(dates) as an option.)
6. How many men and how many women participated in the experiments?
7. During speed dating, are men or women more selective? (Hint: tabulate dec gender with the column option)
8. In waves 6-9, the experimenters used different scales for the preference questions. To simplify, drop waves six through nine. (Hint: drop if wave == 6)
9. Do men and women report placing similar weights on traits in potential partners? What's the biggest difference? (Hint: by gender, sort: sum attr1_s)
10. Which participant(s) (iid) sought the most matches? (Hint: first create a variable decisions that totals the number of decisions to pursue a match by each participant (dec) with the egen command: egen decisions = total(dec), by(iid).)
11. What was the maximum number of "matches" participants received across the speed dating rounds? (Hint: similar to 10.)
12. What was the highest success rate observed among participants? (Hint: create a new variable match_rate with generate that equals matches divided by decisions.)
13. The speed-dating data contains a variable that codes the median SAT score for participants’ undergraduate institutions. The variable, however, is not coded in numeric form. What form is it in? Convert it to a numeric variable. (Use describe to determine the variable's format. Use destring to convert the variable. You will have to use the ignore and replace options.)
14. Create a variable that equals 1 for the bottom third of participants’ undergraduate institutions based on the median SAT variable, 2 for the middle third, and 3 for the top third. First do so with recode, using its generate option to create the new variable. Drop this variable.
15. Now that you've practiced recoding, show how you can save yourself considerable time in the future by creating this variable again using xtile.
16. Does SAT tercile influence match_rate? (Hint: use one of the commands above.)
17. Create a new data set that contains the average ratings for each self-reported attribute (e.g., attr3_1) and the average ratings by partners for each participant (e.g., attr_o). (Hint: use the collapse command with the by option.)
18. How many unique participants are there in this new data set? Do any variables have missing data?
19. On what traits do participants’ self-reports tend to correspond with those of their partners? On what traits is there no correspondence? (Hint: corr.)
20. Using this same data set, generate a scatter plot of participants’ own attractiveness ratings by partners’ ratings of the attractiveness of these participants. So that you can see each point, add some randomness to each point with the jitter option. Would you describe the relationship as strong, moderate, or weak? No need to print the scatter plot. (Hint: scatter attr3_1 attr_o, jitter(10).)

Part II: Getting data into STATA (five points)

Data comes in many forms. Here's one way to get data into Stata. Using a text editor (such as EMACS), type the text from Exhibit 1 in the document “How to Use the STATA infile and infix Commands” into Athena and save it in a file named scores.dat in your home directory. Write the code that will create a STATA data set from this raw data and save it as a file called “scores.dta”.

Part III: Merging data (two points each)

Find two tables that interest you in the Statistical Abstract of the United States that meet the following criteria: (1) they have between 25 and 52 observations, (2) they have the same units of analysis (e.g., states, years, nations), and (3) the subject matter of the two tables is conceivably linked. You can find the Abstract here: http://www.census.gov/prod/www/statistical-abstract.html

1. Call these two tables Table A and Table B. Create a STATA data set that contains one variable, plus the identifying variable (like state or year), from Table A. Save it. Create a STATA data set that consists of one variable, plus the identifying variable (like state or year), from Table B. Save it.
2. Merge the two data sets. Save the merged dataset.
3. Test that you have successfully merged your data by tabulating _merge.
4. Are your merged variables actually related? Check by tabulating or correlating.
5. Write a short description (2 or 3 sentences) of the tables you got your data from.

Part IV: Research design (five points each)

Comment on the research designs of the following two studies. Discuss whether



MIT 17 871 - Problem Set #1
