CASE STUDY: CYCLONES

DEFINITION

Cyclones are defined as an atmospheric system in which the barometric pressure diminishes progressively to a minimum at the centre and toward which the winds blow spirally inward from all sides, resulting in a lifting of the air and eventually in clouds and precipitation.

Hurricanes are cyclones that originate in the tropics with wind speeds beyond 64 knots (74 mph, 119 km/h).

SOME BACKGROUND ON HURRICANES

1. Average insurance claims per year: 1 billion
2. Extreme violent hurricanes can exceed 10 billion
3. Example: Hurricane Andrew in 1992

GENERAL GOAL OF THE STUDY

1. Understand properties of cyclones based on the recorded variables
2. Predict
   a. Track of cyclones
   b. Probability of landfall

SOURCE OF THE DATA

US National Hurricane Center

Various web sites on tropical storms:
o US National Hurricane Center: http://www.nhc.noaa.gov
o Tropical Storm Page: http://www.solar.ifa.hawaii.edu/Tropical/tropical.html
o More data at http://www.bbsr.edu/rpi/research/demaria/demaria4.html

DATA DESCRIPTION

Number of variables: 18
o Date (3): Year, Month, Day
o Name, Number (2)
o Location (2): X (longitude), Y (latitude)
o Categorical (2): Stormtype, Landfall
o Continuous (9): Speeds, Angles, Distance

Number of cases: 1819, but only 334 different storms

Number of observations per storm: min = 1 (42, 12.5%), median = 5, max = 24

OVERVIEW OF THE MEASURES

[Figure: diagram of the measures. Labels: Trackspeed, Parallelspeed, Inwardspeed, Windspeed, Land, Sea, Coast, Trackangle, angle, Parallel to the Equator]

SUGGESTED APPROACHES

Approach | Reason | Type of question addressed
Calculate summaries of all variables | Extract scale, location and range information | What is the average windspeed of a hurricane in this dataset?
Draw distributions of variables | Understand asymmetry and outliers of the variables | Which variables are useful for a statistical model?
Plot interactions of variables | Understand interaction structure of the data | Which variables contribute information to a model?
Draw maps with hurricane locations | Understand geographical distribution | Where do hurricanes occur?
Draw tracks of hurricanes | Look for similar track types | What does a typical track of a hurricane look like? Are there different types of tracks/shapes?
Plot geographical distribution of variables, or link information of other variables into the scatterplot | What is the interaction of location with all the variables? | Do speeds and angles of measurements follow a geographical pattern?
Check accuracy of the data
Set up statistical model for landfall | Try to predict a landfall from single measurements | What is the probability that this hurricane will hit land?

ACTUAL APPROACHES

1. SUMMARIES

a. Year

    summary(Year)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       1945    1954    1964    1963    1971    1979

    barplot(table(Year))

b. Month

    summary(Month)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       1.00    8.00    9.00   8.769    9.00   12.00

    barplot(table(Month))

c. Name

    summary(Name)
        NOT      289     CAROL     28
        CHARLIE   36     GLADYS    28
        SUBTROP   35     GRETA     28
        EDITH     33     INGA      28
        BETSY     32     BECKY     27
        ANNA      30     ELLA      27
        ABLE      29     FLORA     27
        DOG       29     FRANCES   27

d. Location

    # The operator between 3 and Landfall was lost in extraction; "3 - Landfall"
    # is assumed (non-landfall green, landfall red, matching the histogram colours).
    plot(X, Y, col = 3 - Landfall)

Some example paths for Donna and Anna:

    plot(X, Y, col = 3 - Landfall)
    lines(X[Name == "DONNA"], Y[Name == "DONNA"], col = 2, lwd = 3)
    lines(X[Stormnumber == 275], Y[Stormnumber == 275], col = 3, lwd = 3)

e. Stormtype

    table(Stormtype)
    Stormtype
       1    3    5
    1674   76   69

f. Landfall

    table(Landfall)
    Landfall
       0    1
    1431  388

    table(LandfallN)
    LandfallN
      0   1
    234 100

    388/1431
    0.2711391
    100/234
    0.4273504

    barplot(table(Landfall))
    barplot(table(LandfallN))

g. Speeds

    hist(Windspeed, col = 3)
    hist(Windspeed[Landfall == T], add = T, col = 2)

h. Angles

i. Distance

2. RELATIONS BETWEEN VARIABLES

a. Checked all variables against Landfall, but the highlighted subgroup is hard to compare with the total in a histogram.
   1. Alternative: Boxplots
   2. Alternative: Spinograms

   [Figures: histogram and spinogram of Windspeed; histogram and spinogram of Trackangle]

b. Dependencies of the derived variables
   1. Inward speed and parallel speed are derived from track speed.
      [Figures: Trackspeed; Inwardspeed vs. Parallelspeed]
   2. Coast angle and distance angle are derived from track angle.
      [Figures: Trackangle; Distanceangle vs. Coastangle]
   3. Distance is related to longitude and latitude.
      [Figures: Distance; Latitude vs. Longitude]

c. Looking at 2-way interactions between continuous variables

d. Interactions in more than 2 dimensions: 2-d Tour

e. A SIMPLE MODEL FOR LANDFALL

    m1 <- lm(Landfall ~ 1)
    summary(m1)

    Call:
    lm(formula = Landfall ~ 1)

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 0.213304   0.009607    22.2   <2e-16

    # The column index reads "c 1 3 9 16" in the source; its separators were lost.
    # c(1:3, 9:16) is an assumed reading.
    add1(m1, cyclones[, c(1:3, 9:16)])

    Single term additions

    Model:
    Landfall ~ 1
                  Df Sum of Sq   RSS     AIC
    <none>                     305.2 -3244.8
    X              1      19.0 286.2 -3359.7
    Y              1      17.6 287.6 -3351.0
    Windspeed      1       1.2 304.1 -3249.8
    Trackspeed     1       6.0 299.3 -3278.8
    Trackangle     1       2.4 302.9 -3256.9
    Distance       1       3.2 302.0 -3262.1
    Coastangle     1      19.9 285.3 -3365.6
    Distanceangle  1      13.8 291.4 -3327.1
    Inwardspeed    1      16.7 288.5 -3345.3
    Parallelspeed  1      16.8 288.5 -3345.6

This stepwise regression yields the model

    Landfall ~ Coastangle + Inwardspeed + X + Windspeed + Y

with an R2 of 13.5%.

Prediction:

    hist(predict(m1, cyclones))

Confusion matrix:

    table(round(0.164 + predict(m1, cyclones)))
       0    1
    1430  389

    table(round(0.164 + predict(m1, cyclones)), Landfall)
         Landfall
             0    1
      0   1215  215
      1    216  173

    mosaicplot(table(round(0.164 + predict(m1, cyclones)), Landfall))

INVESTIGATION OF THE RESIDUALS

Geographic mapping of false positives / false negatives.

Approach: Partition the data into
o Gulf region
o Atlantic region
o North-east Atlantic region
removing the outlier Ginger.

R code:

    group <- rep(1, length(X))
    # The next two assignments lost their operators in extraction.
    # The surviving tokens are:
    #   group  Y  0.5825  X  22.5641  X  80  2
    #   group  Y  0.4806  X  7.8837  Distanceangle  90  group  2  Landfall  0  3
    group[Stormnumber == 310] <- 4
    barplot(table(group), col = 2:5)
    plot(X, Y, col = group + 1)

f. REESTIMATING TWO SEPARATE LOGISTIC MODELS

Logistic Regression

Ordinary linear model: Y = aX + b + e

Problem with Y: Y is dichotomous, but the linear fit will give estimates on -infinity to +infinity.

Solution: Introduce a sigmoid link function to map data from (-infinity, +infinity) to [0, 1]:

    ln(p / (1 - p)) = aX + b + e    (logit link)

Remarks:
o Other continuous link functions from (-infinity, +infinity) to [0, 1] are used as well.
o Parameter estimation no longer works with a simple solution of linear equations, but needs iterative optimization methods to find a solution.
o R2 cannot be extracted as in the ordinary least squares case.

i. Estimate for the Gulf region

    g1 <- glm(Landfall ~ 1, subset = group == 1, family = binomial)
    add1(g1, cyclones[, c(1:3, 9:16)])   # column index assumed, as above

    Landfall ~ Trackangle + Y + Inwardspeed + Coastangle

    table(round(0.07 + predict(g1, cyclones, type = "response")[group == 1]))
      0   1
    253 179

    table(round(0.07 + predict(g1, cyclones, type = "response")[group == 1]),
          Landfall[group == 1])
          0   1
      0 191  62
      1  62 117

    R2 = 20.8%

ii. Estimate for the Atlantic region

    a1 <- glm(Landfall ~ 1, subset = group == 2, family = binomial)
    add1(a1, cyclones[, c(1:3, 9:16)])   # column index assumed, as above

    Landfall ~ Coastangle + Parallelspeed + Windspeed + Y + X

    table(round(0.1809 ...
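
The logistic regression step described above can be sketched in R on simulated data. This is only an illustration under stated assumptions: the data frame `sim`, its columns `x` and `landfall`, the coefficients used to simulate them, and the 0.5 cut-off are all hypothetical and do not come from the cyclones dataset.

```r
# Minimal sketch on simulated data (NOT the cyclones data).
set.seed(1)
n <- 500
x <- rnorm(n)                           # hypothetical predictor
p <- plogis(-1 + 1.5 * x)               # inverse logit: maps the real line into (0, 1)
landfall <- rbinom(n, size = 1, prob = p)
sim <- data.frame(x = x, landfall = landfall)

# Ordinary linear fit: fitted values are not confined to [0, 1]
m_lin <- lm(landfall ~ x, data = sim)

# Logistic fit: the logit link keeps fitted probabilities inside (0, 1)
m_log <- glm(landfall ~ x, data = sim, family = binomial)
p_hat <- predict(m_log, sim, type = "response")

# The logit of the fitted probability equals the linear predictor a*x + b
eta <- predict(m_log, sim, type = "link")
max_gap <- max(abs(qlogis(p_hat) - eta))

# Confusion matrix at an assumed 0.5 cut-off
pred <- as.integer(p_hat > 0.5)
conf <- table(predicted = pred, observed = sim$landfall)

print(range(fitted(m_lin)))
print(range(p_hat))
print(conf)
```

Comparing the two printed ranges shows why the link function is introduced: only the logistic fit is guaranteed to return values that can be read as probabilities. A cut-off other than 0.5 can be used by changing the comparison in `pred`.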