22S:166 Computing in Statistics
Introduction to R
Lecture 5
September 6, 2006
Kate Cowles
374 SH, [email protected]

R is

• “an integrated suite of software facilities for data manipulation, calculation, and graphics display” (An Introduction to R, Venables, Ripley, and the R Core team)
  – data handling and storage capabilities
  – operators for calculations on arrays and matrices
  – data analysis tools
  – graphical capabilities
  – programming language
  – planned and coherent system
• an implementation of the S language
  – S language was developed at AT&T Bell Labs
    ∗ first version 1976
  – S-Plus is a commercial version of S (begun in 1987)
    ∗ sold and supported by Insightful Corp.
    ∗ GUI
    ∗ many formats supported for graphics export and data input/output
    ∗ runs on Windows, UNIX, Linux (not Macintosh)
• advantages of S
  – extendible
    ∗ users write new functions in the S language, just as developers do
    ∗ excellent documentation for adding functions to the system
    ∗ users can create their own data types
    ∗ huge international community of users constantly contributes new capabilities
    ∗ contrast with SAS
      · very hard to write new SAS procedures
      · users write in a different language (SAS macro or IML) than developers
  – high-level language
    ∗ only a few commands required to do complex things
  – language is connected to data while executing
    example (from Statistical Computing and Graphics course notes by Frank Harrell)

        if(is.factor(x) | is.character(x) |
           (is.numeric(x) & length(unique(x)) < 20))
          table(x) else quantile(x)

    computes quantiles of x if x is numeric and has at least 20 distinct values, a frequency table otherwise
  – object-oriented
    ∗ fewer commands to learn because the same command can be applied to different types of objects
  – Harrell: “best scientific graphics available”
    ∗ Harrell: “SAS graphics are ugly, inflexible, have poor defaults, difficult to program”

R

• international team of statisticians started developing R in the early 1990s
  – to provide open source alternative to S-Plus
  – to provide S implementation on Linux (not supported by S-Plus
    then)
• easy to download and install from web sites
• excellent documentation
• user-contributed libraries called packages expand capabilities
• runs on Windows, UNIX, Linux, Macintosh
• no GUI on most platforms
• fewer data import/export capabilities than S-Plus
  – although add-on packages provide more
  – no export specifically to PowerPoint

Starting and running R interactively on Linux

• recommendation: use a separate subdirectory for each major project you do with R
• in a terminal window, get into the desired subdirectory and start R by entering

      R

• R commands may be issued interactively
• to quit

      q()

  – follow prompts as to whether you want to save workspace
  – if you don’t save it, any new objects (data, functions, results) created during the current R session will be lost

• Part B of this lecture is from Maindonald, J.H. “Using R for Data Analysis and Graphics: Introduction, Code and Commentary,” available as contributed documentation
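
A minimal sketch of the "language is connected to data" and "object-oriented" points from earlier in these notes, to try at the R prompt. The function name describe_x and the three data vectors are made up for illustration; they are not from the lecture or from Harrell's notes. The scalar operators || and && are swapped in for | and &, which gives the same result here because every test returns a single TRUE or FALSE.

```r
# Hypothetical helper (the name describe_x is an assumption, not from the
# lecture): wraps the conditional from the Harrell example in a function.
describe_x <- function(x) {
  # Each test yields one TRUE/FALSE, so || and && are safe and short-circuit.
  if (is.factor(x) || is.character(x) ||
      (is.numeric(x) && length(unique(x)) < 20)) {
    table(x)      # categorical, or numeric with few distinct values
  } else {
    quantile(x)   # numeric with at least 20 distinct values
  }
}

# Made-up data for illustration only
grades  <- c("A", "B", "B", "C", "A", "B")   # character vector
counts  <- c(1, 2, 2, 3, 3, 3)               # numeric, 3 distinct values
heights <- seq(150, 190, by = 0.5)           # numeric, 81 distinct values

describe_x(grades)    # frequency table
describe_x(counts)    # frequency table
describe_x(heights)   # quantiles at 0%, 25%, 50%, 75%, 100%

# Object-oriented flavor: the same command adapts to the type of object.
summary(heights)          # numeric summary (min, quartiles, mean, max)
summary(factor(grades))   # counts per factor level
```

The choice between table() and quantile() is made while the code runs, by inspecting the data itself; nothing about x has to be declared in advance.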