Nov. 11, 2009    Lab 7    Econ240A    L. Phillips
Exploratory Data Analysis and Failure Times; Linear Probability Model

I. Failure Time Analysis: Cumulative Hazard Rate, Survivor Function

The data are times until failure, in hours, of the fans on diesel generators. The data were obtained from Wayne Nelson, Applied Life Data Analysis (1982), McGraw-Hill. There are 70 observations. Only twelve are actual times until failure. The other 58 observations are running times: fans that had not failed at the last time observed. Thus, for these 58 observations, all we know about the failure time is that it is longer than the last observed running time. These observations are called right censored.

Open the Excel data file Fanfail. The times in hours are in column B. The running times, i.e. the censored observations, are marked with a trailing symbol. The failure times, like the first observation of 450 hours, are exact. The observations are ordered from smallest to largest.

The second column, C, shows the number of fans failing or ending at that time. For example, at 450 hours one fan fails. The second observation, 460, was a fan that was lost track of at 460 hours, and so is a censored observation. Observations 3 and 4, at 1150 hours, are two fans that failed at that time, so 2 is the number ending for the third observation.

The third column, D, the number at risk, is the number of fans that could fail. For the first observation that number is 70. By the third observation the number at risk is 68, since the first fan failed at 450 hours and we lost track of the second fan at 460 hours. Then two fans fail at 1150 hours and another fan is lost track of at 1560 hours, so by the fifth observation the number at risk is 65.

The fourth column, E, is the number of hours that pass between failures, or the interval. The time until the first fan fails is 450 hours. Then another fan fails at 1150 hours, an interval of 700 hours later.

Open the Excel data file Fanfail II. The first four columns are filled in. The fifth column, F,
is the interval hazard rate, calculated for each of the fans failing by dividing the number failing in column C by the number at risk in column D. The sixth column, G, is the cumulative hazard rate, the running sum of the interval hazard rates. For example, for the third observation, the cumulative hazard function at 1150 hours, 0.0437, is the interval hazard rate of 0.0294 (2/68) for observation three plus the interval hazard rate of 0.0143 (1/70) for observation one at 450 hours.

The failure times and cumulative hazard rates for the 12 observations with exact failure times are listed in the Excel data file Fanfail III, below the data for the seventy observations. Since there were ties at 1150 hours and at 2070 hours, there are 10 distinct observed failure times. To produce a plot of the cumulative hazard function, select these ten observations for hours, control-click the cumulative hazard rate, and insert an xy scatterplot. The title is "Cumulative Hazard Function for Diesel Generator Fans", the x axis is Hours, and the y axis is Cumulative Hazard Function. Select the data points on the chart, insert a linear trendline and, using Options, show the coefficient of determination and the fitted parameters. Note that a linear fit to the data is pretty good.

The cumulative hazard function H(t) for the exponential distribution is H(t) = λt, so the estimated slope is the parameter λ of the exponential. Use Data Analysis, Regression in the Tools menu. The estimated intercept is not significantly different from zero, as expected. The estimated parameter λ is 3.89 x 10^-5 and is highly significant. The reciprocal of the parameter is the mean of the exponential distribution, so the estimated average time until failure is 25,707 hours. Since the exponential has no memory, the expected remaining time until failure for a fan on the generator is 1/λ, conditional on the hours it has already run.

The seventh column, H, is the hazard rate per
hour, multiplied by 100,000 to keep the numbers from being so small; it is calculated by dividing the interval hazard rate by the interval in hours. Note that this hazard rate ranges from 1 to 7.7, with the exception of the fan that has an interval of only 10 hours. Evidently, dividing by this small denominator causes a blip. Thus it would appear that the cumulative hazard rate provides a more reliable picture.

The survivor function is calculated by first determining the ratio in column I, which is the number at risk minus the number ending, divided by the number at risk. The survivor function is then obtained by multiplying this ratio by the previous survivor function value. At zero hours the survivor function is 1. The ten values of the survivor function are also listed below the data. Select hours, control-click the survivor function values, and insert an xy chart. The title is "Survivor Function for Diesel Generator Fan Failure", the x axis is Hours, and the y axis is Survivor Function. Select the data points and insert an exponential trendline with the coefficient of determination and estimated parameters. The survivor function for the exponential is S(t) = exp(-λt). The exponential fits the survivor function, but so would a linear trendline. The cumulative hazard function is probably more definitive, but both the survivor function and the cumulative hazard function are consistent with the exponential distribution fitting the data. The regressions for the ten points should be checked with Eviews, which is left as an exercise for Lab Seven.

II. Playing the Lottery, Yes or No: the Linear Probability Model

Open the Eviews file lottery that we used in Lab Six. The dependent variable was the percentage of income spent on the lottery. There were a number of zeros. Thus there are two groups: those who play the lottery and those who do not. This data can be recoded to ones for the players and zeros for
the rest. Use the GENR statement to generate a variable called BERN, for Bernoulli:

BERN = 0*(lottery=0) + 1*(lottery>0)

Select lottery and Bern and open this group, using the View menu, into the spreadsheet view, and confirm that the data has been mapped correctly into zeros and ones. Go to Procs in the menu and select Sort Series. In the window, type in Bern for the series to be sorted and sort in ascending order. Select lottery and Bern and open this
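
The recoding of lottery into BERN described above can also be sketched outside Eviews. The following is a minimal Python sketch, not the lab's own procedure: the values in the list are made up for illustration and are not the Lab Six data, and the function name recode_bernoulli is hypothetical. It maps positive percentages to one and zeros to zero, then sorts the pairs by BERN in ascending order so that the non-players appear first.

```python
# Illustrative percentages of income spent on the lottery.
# These values are invented for the sketch; they are NOT the Lab Six data.
lottery = [0.0, 2.5, 0.0, 7.0, 1.0, 0.0]


def recode_bernoulli(values):
    """Return 1 for each positive value (a player) and 0 otherwise (a non-player)."""
    return [1 if v > 0 else 0 for v in values]


bern = recode_bernoulli(lottery)

# Pair each BERN value with its lottery value and sort by BERN, ascending,
# which mirrors the "sort series" step: zeros first, then ones.
pairs = sorted(zip(bern, lottery), key=lambda pair: pair[0])

# Print the two columns side by side to check the mapping by eye.
for b, pct in pairs:
    print(b, pct)
```

Checking the printed pairs plays the same role as the spreadsheet view: every row with a zero percentage should carry BERN equal to 0, and every row with a positive percentage should carry BERN equal to 1.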