Detecting Space-Time Clusters: Prior Work and New Directions

Daniel B. Neill    Andrew W. Moore
November 2004
CMU-CS-05-115

School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213

Abstract

The problem of space-time cluster detection arises in a variety of applications, including disease surveillance and brain imaging. In this work, we briefly review the state of the art in space-time cluster detection, focusing on space-time scan statistics, and we derive a number of new statistics. First, we distinguish between tests for clusters with higher disease rates inside the cluster than outside (as in the traditional spatial scan statistics framework) and tests for clusters with higher counts than expected (as is appropriate when inferring the expected counts in a region from the time series of past counts). Second, we distinguish between tests for “persistent” clusters (where the disease rate remains constant throughout the duration of a cluster) and tests for “emerging” clusters (where the disease rate increases monotonically through the duration of a cluster). These new statistics for spatio-temporal cluster detection will serve as the basis for our future work in detection of emerging space-time clusters.

Keywords: algorithms, biosurveillance, cluster detection, space-time scan statistics

1 Introduction

The problem of detecting space-time clusters arises in a variety of applications, including disease surveillance and brain imaging. In general, spatio-temporal methods can be divided into three classes: spatial modeling techniques such as “disease mapping,” where observed values are spatially smoothed to infer the distribution of values in space-time (e.g. Clayton and Kaldor, 1987; Besag et al., 1991); tests for a general tendency of the data to cluster (e.g. Knox, 1964; Mantel, 1967); and tests which attempt to infer the location of clusters (e.g. Kulldorff et al., 1998; Kulldorff, 2001; Kulldorff et al., 2004). We focus on the latter class of methods, since these are the only methods which allow us to both answer whether any significant clusters exist, and if so, identify these clusters.

Let us assume that we have a set of data collected at a set of discrete time steps $k = 1 \ldots k_{base}$, and at a set of discrete spatial locations $s_i$. For each $s_i$ at each time step $k$, we are given a count $c_i^k$ and (optionally) a baseline $b_i^k$. For example, in epidemiology, the counts may be the number of disease cases in a given spatial region over a given time interval, or some related observable quantity such as the number of Emergency Department visits or OTC drug sales. The baselines may be given (based on results from a control group, or an at-risk population derived from census data), or may be inferred based on the time series of counts. In all cases, we assume that counts $c_i^k$ are generated by some distribution with mean proportional to $b_i^k q_i^k$, where $q_i^k$ is the rate (or expected ratio of count to baseline). Our goal, then, is to find whether there is any region (set of locations $s_i$) and time interval ($k = k_{min} \ldots k_{max}$) for which the rates are significantly higher than expected; in epidemiology, this may correspond to a disease outbreak. Within this very general framework, there are a number of questions we can ask:

1. Which spatial regions to search over? We typically search over the set of all regions of some given shape and variable size. For simplicity, we assume here that the spatial locations $s_i$ are aggregated to a d-dimensional grid, and search over the set of all d-dimensional hyper-rectangular regions on the grid.

2. Which temporal intervals to search over? For prospective analysis, we search only over intervals ending at the present time, while for retrospective analysis we search over all intervals including those ending before the present time.

3. What distributions are assumed? For simplicity, we assume that $c_i^k$ are generated independently from Poisson distributions with mean $q_i^k b_i^k$. We could also take other factors such as extra-Poisson variation (overdispersion) and spatial correlation into account; we do this somewhat in the CATS and RATS methods discussed below, since these perform aggregation of counts at the level of grid cells and regions respectively. In the BATS method discussed below, which considers a separate time series for each building, we do not account for correlation. We can also use Normal distributions instead of Poisson to model distributions with dispersion different from the mean and spatially varying.

4. Do we want to infer baselines from the time series of previous counts, or are the baselines given? For the time being, we assume that baselines are given; we discuss methods of inferring baselines from previous counts in Section 3.

In any case, the value of the space-time statistic $D_{max}$ is taken to be the maximum over all spatial regions $S \subseteq G$ of $D(S)$, where $D(S)$ is the maximum $D_{k_{min}}^{k_{max}}(S)$ for all temporal intervals $k = k_{min} \ldots k_{max}$. For retrospective analysis, we have $1 \le k_{min} \le k_{max} \le k_{base}$; for prospective analysis, we have $1 \le k_{min} \le k_{max} = k_{base}$.

Now, in order to decide which statistic $D_{k_{min}}^{k_{max}}(S)$ to use, we must first decide what sort of regions we are looking for. In particular:

1. Do we want to detect regions such that the rates $c_i^k / b_i^k$ are significantly higher than some prior expectation $q_0$, or such that they are significantly higher inside the region than outside? We call the former “globally sensitive” tests, since they are sensitive to global increases in rate. For the latter, we must decide whether to adjust for the overall global rate (“globally adaptive” tests) or to adjust separately for each day’s rate (“daily adaptive” tests).

2. Do we expect the rate to be constant over the time duration of a cluster, or do we expect the rate to be increasing over the time duration of a cluster? In the first case, we have a test for persistent clusters, while in the second case, we have a test for emerging clusters. We can also make several other assumptions, such as a rate increasing according to some parametrized distribution (ex. linear increase, exponential increase).

Based on our answers to these two questions, we may define a number of different statistics, as defined in Section 2.

2 Space-time statistics

We first consider the case of a simple prospective space-time scan statistic, where we want to detect only if there are any space-time clusters on the present day $k = k_{base}$. In this case, we have the same statistics whether we assume that the cluster is persistent, emerging,
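
The search framework of Section 1 (a maximum over rectangular grid regions and over temporal intervals, prospective or retrospective) can be illustrated with a brute-force sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names `poisson_score` and `space_time_scan` are hypothetical, the grid is fixed at two dimensions, and the score is a standard Poisson log-likelihood ratio against a fixed expected rate $q_0$ (a "globally sensitive", persistent-cluster choice), used here as a stand-in for the statistic that the text goes on to define in Section 2.

```python
import math


def poisson_score(count, baseline, q0=1.0):
    """Assumed stand-in statistic: log-likelihood ratio for Poisson counts,
    testing rate > q0 against rate = q0 inside the space-time region."""
    expected = q0 * baseline
    if expected <= 0 or count <= expected:
        return 0.0
    return count * math.log(count / expected) + expected - count


def space_time_scan(counts, baselines, prospective=True, q0=1.0):
    """Exhaustive scan over all axis-aligned rectangles of a 2-D grid.

    counts[k][r][c] and baselines[k][r][c] hold the count and baseline of
    grid cell (r, c) at time step k + 1 (time steps are reported 1-based).
    Returns (D_max, (r1, r2, c1, c2), (k_min, k_max)); bounds are inclusive.
    """
    n_steps = len(counts)
    n_rows, n_cols = len(counts[0]), len(counts[0][0])
    best = (0.0, None, None)
    for r1 in range(n_rows):
        for r2 in range(r1, n_rows):
            for c1 in range(n_cols):
                for c2 in range(c1, n_cols):
                    cells = [(r, c) for r in range(r1, r2 + 1)
                             for c in range(c1, c2 + 1)]
                    # Region totals for every time step.
                    c_t = [sum(counts[k][r][c] for r, c in cells)
                           for k in range(n_steps)]
                    b_t = [sum(baselines[k][r][c] for r, c in cells)
                           for k in range(n_steps)]
                    # Prospective: intervals must end at the last time step.
                    # Retrospective: any end point is allowed.
                    ends = [n_steps] if prospective else range(1, n_steps + 1)
                    for k_max in ends:
                        total_c = total_b = 0.0
                        # Grow the interval backwards from k_max.
                        for k_min in range(k_max, 0, -1):
                            total_c += c_t[k_min - 1]
                            total_b += b_t[k_min - 1]
                            score = poisson_score(total_c, total_b, q0)
                            if score > best[0]:
                                best = (score, (r1, r2, c1, c2), (k_min, k_max))
    return best


# Toy data: 3 time steps on a 3x3 grid, with one elevated cell on the last step.
baselines = [[[10.0] * 3 for _ in range(3)] for _ in range(3)]
counts = [[[10.0] * 3 for _ in range(3)] for _ in range(3)]
counts[2][1][1] = 30.0
d_max, region, interval = space_time_scan(counts, baselines)
print(d_max, region, interval)
```

On this toy input the scan singles out the cell at row 1, column 1 (0-based) on the last time step. The exhaustive loops are meant only to make the definition of the maximum concrete; the cost grows quickly with grid size, and assessing the significance of the resulting maximum is a separate step not shown here.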

