SAS Procedures
Class 4

TITLE STATEMENT
• enclose the title text in quotes
• prints at the top of each page in the OUTPUT window
• limit of 10 titles (Title, Title2, ..., Title10)
• limit of 132 characters each
• stays in effect for the session until a new title statement is issued

TITLE STATEMENT
Syntax:
    TITLEn 'text string';
Example:
    Title1 'Example of title';
    Title2 'Second title';

SAS Formats
It is sometimes useful to store data in one way and display it in another. For example, dates can be stored as integers but displayed in human-readable form. A SAS format changes the way the data stored in a variable is displayed. There are two types of format:
• Internal formats (SAS already knows about these)
• User defined formats (you define these yourself)

Internal SAS formats: Class4_1.SAS
• A format statement tells SAS to use that format with one or more variables.

Permanent formats: Class4_2.SAS
• A format statement added to a data step permanently connects the format to a variable. The format information is stored in the dataset header.

User defined formats: Class4_3.SAS
• Define the format using proc format.
• Tell SAS to use the format with a specific variable by using the format statement, as before.

Without a format:
    proc freq data=mylib.nmes_tot;
      table male;
    run;

                               Cumulative    Cumulative
    male  Frequency   Percent   Frequency       Percent
    0          3097     56.31        3097         56.31
    1          2403     43.69        5500        100.00

User defined formats
• proc format defines the format genlb.
• the format statement applies the format to the variable male.

    proc format;
      value genlb 1='male'
                  0='female';
    proc freq data=mylib.nmes_tot;
      table male;
      format male genlb.;
    run;

                                 Cumulative    Cumulative
    male    Frequency   Percent   Frequency       Percent
    -----------------------------------------------------
    female       3097     56.31        3097         56.31
    male         2403     43.69        5500        100.00

User defined formats
Syntax:
    proc format <options>;
      value FormatName
        range1 = 'formatted value1'
        ...
        rangen = 'formatted valuen';
    run;
Used to define a format.

User defined formats: Example
    proc format;
      value gen 1 = 'male'
                2 = 'female';
      value age 10-29 = '10 - 29'
                30-39 = '30 - 39'
                40-49 = '40 - 49'
                50-75 = '50 - 75';
      value $dpt 'A' = 'Dept A.'
                 'B' = 'Dept B.';
• Defines three formats: gen, age and dpt.
• Format dpt is a character format, suitable for character variables.

format names:
• must be 8 or fewer characters long
• cannot end with a number
• character formats begin with a $
• cannot use a SAS internal format name
• refer to the format in a format statement by using the name followed by a period

format ranges
You can specify a range of values to be formatted in a given way.
    proc format;
      value age 10-29 = '10 - 29'
                30-39 = '30 - 39'
                40-49 = '40 - 49'
                50-75 = '50 - 75';
    run;
• ranges are inclusive
• you can use formats as look-up tables to categorize a variable

specifying format ranges
• low: lowest value (excludes missing)
• high: highest value
• other: all other values not listed (including missing values)
• value1 - value2 means [value1, value2]
• value1 -< value2 means [value1, value2)
• value1 <- value2 means (value1, value2]

Example: Class4_4.sas
    proc format;
      value expf low-<170  = '<170'
                 170-<629  = '[170,629)'
                 629-<1896 = '[629-1896)'
                 1896-high = '>=1896';
    proc print data=a; *1st 10 records;
      var totalexp_copy totalexp;
      format totalexp expf.;
    run;

    Obs    expend2    expend
      1    3341.89    >=1896
      2       0.00    <170
      3      27.00    <170
      4     621.54    [170,629)
      7     662.15    [629-1896)

Descriptive statistics
• exploratory data analysis is very important from many perspectives
• in SAS there are three procedures used routinely

    Procedure     Use
    means         Descriptive statistics for numeric data
    univariate    Descriptive statistics for numeric data
    freq          Tables for categorical data

proc freq
• produces frequency counts and cross-tabulation tables
• computes tests and measures of association
Syntax:
    proc freq <options>;
      tables requests / <options>;

proc freq
Example:
    proc freq data=mydata;
      tables gender race chd;
      tables gender * chd / chisq relrisk;
    run;
• data=mydata is an option
• chisq and relrisk are requests for statistics

Example data: NMES_TOT
The National Medical Expenditure Survey (1987).
Examine smoking and gender.
    Libname mylib 'd:\temp\sasclass';
    proc format;
      value smoke 0 = 'never'
                  1 = 'current'
                  2 = 'former';
      value gen 0 = 'female'
                1 = 'male';
    run;
This makes two formats, smoke and gen, for the smoking and gender variables.

Example data: Class4_5.sas
    proc freq data=mylib.nmes_tot;
      tables male*smoke / chisq;
      format male gen. smoke smoke.;
    run;
mylib is a libname (folder); nmes_tot is the dataset.
Check Output

proc univariate
• produces simple descriptive statistics
• use the PLOT option on the PROC statement to get:
    • stem-and-leaf plot
    • box plot
    • normal probability plot (QQ plot)
    • side by side box plots for by variable groups
Syntax:
    proc univariate <options>;
      var variables / <options>;

proc univariate: Class4_6.sas
Example:
    proc univariate data=mylib.nmes_tot plot;
      title "Univariate Output for Age";
      var lastage;
    run;
Check Output

proc means
• similar to univariate, but no plots
• nicer output, particularly for more than one variable
Syntax:
    proc means <options>;
      class varlist;
      var variables / <options>;
      by varlist;
      output out=outdata <options>;
    run;

proc means
options
• data=dataset
• statistic: the default is n mean std min max; others are nmiss range median clm
• noprint: suppress printing of output
statements
• class: statistics produced for each combination of class variables
• by: statistics produced for each combination of by variables
• output: produce an output dataset which contains the statistics

proc means: Class4_7.sas
Example:
    proc means data=mylib.nmes_tot noprint
               n mean std stderr range nmiss;
      class male;
      var lastage;
      output out=results n=nage mean=mage std=sage;
      format male gen.;
    run;
    proc print data=results;
    run;
Check Output

Exercise I

Modeling with SAS
• examine relationships between variables
• estimate parameters and their standard errors
• calculate predicted values
• evaluate the fit or lack of fit of a model
• test hypotheses
• design outcome

The linear model
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2) \]
Example:
\[ \text{Weight} = \beta_0 + \beta_1\,\text{Height} + \beta_2\,\text{Age} + \varepsilon \]
Note: outcome variable must be continuous and normal given
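The weight example above could be fit along the following lines. This is only a minimal sketch: the slides shown here do not name a modeling procedure, so the use of proc reg is an assumption, and the dataset work.bodysize with its six rows of values is made up purely for illustration.

```sas
/* Hypothetical data: the dataset name and every value below are invented for illustration */
data work.bodysize;
   input weight height age;
   datalines;
62.5 165 34
80.1 180 45
55.0 158 29
90.3 185 52
70.2 172 41
68.4 170 38
;
run;

/* Fit Weight = b0 + b1*Height + b2*Age + error by ordinary least squares */
proc reg data=work.bodysize;
   model weight = height age;   /* outcome on the left, predictors on the right */
run;
quit;                           /* proc reg is interactive, so quit ends it */
```

The printed parameter estimates correspond to the three beta terms in the weight equation, each with its standard error.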