CS 350: Introduction to Software Engineering
Slide Set 4: Estimating with Probe II
C. M. Overstreet
Old Dominion University
Fall 2005

Announcements
- Prog. 2 should be graded by this weekend
- Study Abroad Fair: Tues. Oct 4, 10:30 - 1:30, North Mall of Webb
- Info on Scholarships
- Prog. 4 description available

Lecture Topics
- The prediction interval
- Organizing proxy data
- Estimating with limited data
- Estimating accuracy
- Estimating considerations

The Prediction Interval
- The prediction interval provides a likely range around the estimate.
  - A 70% prediction interval gives the range within which the actual size will likely fall 70% of the time.
  - The prediction interval is not a forecast, only an expectation.
  - It applies only if the estimate behaves like the historical data.
- It is calculated from the same data used to calculate the regression parameters.

The Range Calculation
- The range defines the likely error around the projection within which the actual value is likely to fall.
- Widely scattered data will have a wider range than closely bunched data.
- The variables are
  - n: the number of data points
  - σ: the standard deviation around the regression line
  - t(p, df): the t distribution value for probability p (70%) and df = n - 2 degrees of freedom
  - x: the data, where x_k is the estimate, x_i is a data point, and x_avg is the average of the data

$$\text{Range} = t(p, df)\,\sigma\,\sqrt{1 + \frac{1}{n} + \frac{(x_k - x_{avg})^2}{\sum_{i=1}^{n}(x_i - x_{avg})^2}}$$

The Standard Deviation Calculation
- The standard deviation measures the variability of the data around the regression line.
- Widely scattered data will have a higher standard deviation than closely bunched data.
- The standard deviation is the square root of the variance.

$$\text{Variance} = \sigma^2 = \frac{1}{n - 2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2$$

Calculate the Prediction Interval
- Calculate the prediction range for size and time for the example in lecture 3 (slides 42 and 43).
- Calculate the upper (UPI) and lower (LPI) prediction intervals for size.
  - UPI = P + Range = 538 + 235 = 773 LOC
  - LPI = P - Range (or 0) = 538 - 235 = 303 LOC
- Calculate the UPI and LPI for time.
  - UPI = Time + Range = 1186 + 431 = 1617 min.
  - LPI = Time - Range (or 0) = 1186 - 431 = 755 min.

Organizing Proxy Data -1
- To make an estimate
  - break the planned product into parts
  - relate these planned parts to parts that you have already built
  - use the sizes of the previously built parts to estimate the sizes of the new parts
- To do this, you need size ranges for the types of parts that you typically develop.
- For each product type, you also need size ranges to help you judge the sizes of the new parts.

Organizing Proxy Data -2
- To determine the size ranges, start with the part data.
- Assume that you have the following data.
  - class A, three items (or methods), 39 total LOC
  - class B, five items, 127 total LOC
  - class C, two items, 64 total LOC
  - class D, three items, 28 total LOC
  - class E, one item, 12 LOC
  - class F, two items, 21 total LOC
- The LOC per item is 13, 25.4, 32, 9.333, 12, 10.5.
- The objective is to define size ranges that approximate our intuitive feel for size.

Organizing Proxy Data -3
- To produce the size ranges, sort the data as follows.
- The sorted LOC per item data: 9.333, 10.5, 12, 13, 25.4, 32.
- Arrange these data as follows.
  - Pick the smallest item as very small: VS = 9.333.
  - Select the largest item as very large: VL = 32.
  - Pick the middle item as medium: M = 12 or 13.
  - For the small and large ranges, pick the midpoints between M and VS and between M and VL: S = 10.9 and L = 22.25 (taking M = 12.5).
- While these may be useful ranges, they are probably not stable. That is, additional data points will likely result in substantial size-range adjustments.

Intuitive Size Ranges -1
- In judging size, our intuition is generally based on a normal distribution.
  - That is, we think of something as being of average size if most such items are about that same size.
  - We consider something to be very large if it is larger than almost all items in its category.
- When items are distributed this way, it is called a normal distribution.
- With normally distributed data, the ranges should remain reasonably stable with the addition of new data points.

Intuitive Size Ranges -2
[Figure: a normal distribution]

Intuitive Size Ranges -3
- With a large volume of data, you could calculate the mean and standard deviation of that data.
- For the size ranges
  - Medium would be the mean value.
  - Large would be the mean plus one standard deviation.
  - Small would be the mean minus one standard deviation.
  - Very large would be the mean plus two standard deviations.
  - Very small would be the mean minus two standard deviations.
- This method would provide suitably intuitive size ranges if the data were normally distributed.

The Distribution of Size Data
- Program size data are not normally distributed.
  - many small values
  - a few large values
  - no negative values
- With size data, the mean minus one or two standard deviations often gives negative size values.
- The common strategy for dealing with such data is to treat them as log-normally distributed.

A Log-Normal Distribution
[Figure: a log-normal distribution]

The Log-Normal Distribution
To normalize size data, do the following:
1. Take the natural logarithm of the data.
2. Determine the mean and standard deviation of the log data.
3. Calculate the average, large, very large, small, and very small values for the log data.
4. Take the inverse log of the ranges to obtain the range size values.
This procedure will generally produce useful size ranges.

Organizing Proxy Data -4
- A mathematically precise way to determine the proxy size ranges is described in the text (pages 78-79).
- This simple way to determine these size ranges will work when you have lots of data. Otherwise, it can cause underestimates.
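The normal and log-normal procedures described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the part data are the six classes from the "Organizing Proxy Data -2" slide, the function names `normal_ranges` and `log_normal_ranges` are hypothetical, and the sample standard deviation (dividing by n - 1) is assumed, since the slides do not say which form to use.

```python
import math
import statistics

# Part data from the "Organizing Proxy Data -2" slide: class -> (items, total LOC).
PARTS = {
    "A": (3, 39), "B": (5, 127), "C": (2, 64),
    "D": (3, 28), "E": (1, 12), "F": (2, 21),
}

LABELS = ("VS", "S", "M", "L", "VL")
OFFSETS = (-2, -1, 0, 1, 2)  # standard deviations away from the mean


def normal_ranges(values):
    """Size ranges as mean + k standard deviations, k = -2..2."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation (n - 1)
    return {label: mean + k * sd for label, k in zip(LABELS, OFFSETS)}


def log_normal_ranges(values):
    """Steps 1-4 of the log-normal procedure: log, ranges, inverse log."""
    logs = [math.log(v) for v in values]
    return {label: math.exp(x) for label, x in normal_ranges(logs).items()}


loc_per_item = [loc / items for items, loc in PARTS.values()]
normal = normal_ranges(loc_per_item)
log_normal = log_normal_ranges(loc_per_item)

for name, ranges in (("Normal", normal), ("Log-Normal", log_normal)):
    print(name, {label: round(value, 2) for label, value in ranges.items()})
```

With these six data points the normal method gives a negative very-small size, while the log-normal method keeps every range positive, which is the motivation given on the "Distribution of Size Data" slide.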
Comparative estimating ranges

              VS      S       M       L       VL
Normal       -1.67    7.68   17.04   26.39   35.75
Log-Normal    5.55    9.19   15.22   25.21   41.75

Estimating with Limited Data -1
- Even after using PSP for many projects, you will have to make estimates with limited data when you
  - work in a new environment
  - use new tools or languages
  - change your process
  - do unfamiliar tasks
- Since estimates made with data are more accurate than guesses, use data whenever you can.
- Use the data carefully, since improper use can lead to serious errors.

Estimating with Limited Data -2
- Depending on the quality of your data, select one of the four PROBE estimating methods.
To use