Multivariate Gaussian Simulation Outside Arbitrary Ellipsoids


Nick Ellis and Ranjan Maitra∗

Abstract

Methods for simulation from multivariate Gaussian distributions restricted to be from outside an arbitrary ellipsoidal region are often needed in applications. A standard rejection algorithm that draws a sample from a multivariate Gaussian distribution and accepts it if it is outside the ellipsoid is often employed; however, this is computationally inefficient if the probability of that ellipsoid under the multivariate normal distribution is substantial. We provide a two-stage rejection sampling scheme for drawing samples from such a truncated distribution. Experiments show that the added complexity of the two-stage approach results in the standard algorithm being more efficient for small ellipsoids (i.e. with small rejection probability). However, as the size of the ellipsoid increases, the efficiency of the two-stage approach relative to the standard algorithm increases indefinitely. The relative efficiency also increases as the number of dimensions increases, as the centers of the ellipsoid and the multivariate Gaussian distribution come closer, and as the shape of the ellipsoid becomes more spherical. We provide results of simulation experiments conducted to quantify the relative efficiency over a range of parameter settings.

1 Introduction

The need to simulate from the extreme regions of a multivariate Gaussian distribution arises in a variety of applications. An example is in the context of environmental risk assessment where one may need to simulate from an extreme event to make inference on certain parameters (Hefferman and Tawn, 2004). Other application areas include environmental impact assessment (de Haan and de Ronde, 1998), contaminant modelling (Lockwood and Schervish, 2005) and strategies for financial management (Poon et al., 2004).
However, the primary motivation for our interest in this problem comes from the context of outlier detection within a proposed refinement of the multi-stage clustering procedure for massive datasets defined in Maitra (2001). The methodology adopted there is first to cluster a random sample of the dataset using some clustering technique, and then to identify representativeness of the clusters in the rest of the dataset using a likelihood-ratio test under the assumption that the clusters are from homogeneous multivariate Gaussian distributions. The rejected observations are then resampled, clustered, and the groups tested again for representativeness. The process is repeated until no further sampling is possible.

∗Nick Ellis is Natural Resource Modeler at CSIRO Marine and Atmospheric Research, 233 Middle St, Cleveland, QLD 4163, Australia. Ranjan Maitra is Associate Professor in the Department of Statistics and Statistical Laboratory, Iowa State University, Ames, IA 50011-1210, USA.

One refinement to the above methodology stems from the fact that, at each stage, the sample contains observations from groups that are too scarcely represented to be recognized from the sample as a separate cluster. These observations should be identified as outliers to the cluster they get assigned to, and should be removed from these groups before inference is performed on those groups. One way to test for the presence of outliers in Gaussian populations is to compute the multivariate sample kurtosis measure T of Mardia (1970, 1974, 1975) and use that to detect outliers (Schwager and Margolin, 1982), with the rejection region decided by simulation.

In the first stage of the clustering algorithm, this approach is straightforward, since the rejection region can be estimated using the quantiles of T, obtained from samples drawn from a standard multivariate Gaussian distribution.
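The first-stage calibration described above can be illustrated with a short sketch. It assumes that T is Mardia's sample kurtosis b_{2,p}, the average of the squared Mahalanobis distances raised to the second power, with the covariance estimated using divisor n; the excerpt does not spell out the exact form of T, so this choice is an assumption. The function names `mardia_kurtosis` and `simulated_quantiles`, the replication count and the seed are hypothetical and chosen only for illustration.

```python
import numpy as np


def mardia_kurtosis(x):
    """Mardia's multivariate sample kurtosis b_{2,p} for an (n, p) data array.

    Assumption: this is the statistic T referred to in the text.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    centered = x - x.mean(axis=0)
    # Covariance estimate with divisor n, as in Mardia's definition.
    cov = centered.T @ centered / n
    # Squared Mahalanobis distance of each observation from the sample mean.
    d2 = np.sum(centered * np.linalg.solve(cov, centered.T).T, axis=1)
    return float(np.mean(d2 ** 2))


def simulated_quantiles(n, p, probs, n_rep=2000, seed=0):
    """Estimate quantiles of the statistic under a standard p-variate Gaussian.

    Draws n_rep samples of size n and returns the empirical quantiles at probs.
    """
    rng = np.random.default_rng(seed)
    stats = np.array(
        [mardia_kurtosis(rng.standard_normal((n, p))) for _ in range(n_rep)]
    )
    return np.quantile(stats, probs)


if __name__ == "__main__":
    # Two-sided 5% rejection region for samples of size 200 in 3 dimensions.
    lower, upper = simulated_quantiles(n=200, p=3, probs=[0.025, 0.975])
    print(lower, upper)
```

Because the statistic is invariant under affine transformations of the data, quantiles simulated from the standard Gaussian apply to any unrestricted Gaussian cluster, which is why the first stage needs no special sampler.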
However, in subsequent stages, the observations in identified groups are no longer multivariate Gaussian. Rather, they are observations from a multivariate normal distribution restricted to be outside a union of ellipsoids, being the rejection regions of the previous stages. To detect outliers, one could use the same test statistic T, but now the sampling distribution of T must be estimated using simulations sampled outside these ellipsoids. While most of these ellipsoids will perhaps be far away from the support of the Gaussian distribution restricted to the cluster, there will possibly be a few ellipsoids with substantial overlap. Such a scenario presents a need for multivariate Gaussian simulation from outside arbitrary ellipsoids.

There has been a lot of interest in this and related problems. Many authors in particular have provided algorithms to compute the probability of a multivariate normally distributed random vector over different kinds of regions. For instance, Bohrer and Schervish (1981), building upon the work of Milton (1972), provided algorithms for calculating multivariate normal probabilities over rectangular regions. Schervish (1984) provided a faster algorithm for calculating such probabilities along with their error bounds. More recently, Lohr (1993) presented an algorithm to compute the multivariate normal probabilities inside general-shaped regions, of which the ellipsoid is a special case. On the other hand, interest in multi-Gaussian simulation from extreme regions is much more recent.
Hefferman and Tawn (2004) developed the theory for simulation from a general class of multivariate distributions restricted to component-wise extreme regions, and Lockwood and Schervish (2005) discussed strategies for MCMC sampling on component-wise censored data.

In this paper, we specialize to the case of sampling from a multivariate Gaussian distribution from extreme regions defined to be in the form of the complement to an arbitrary ellipsoid. Without loss of generality, we can assume a standard multivariate Gaussian distribution, since, under transformation to standard coordinates, the ellipsoid is mapped to another ellipsoid. Therefore we aim to simulate from the following p-variate density, given by

$$f(x) \propto \exp\left\{-\frac{x'x}{2}\right\}\,\mathbf{1}\left[(x-\mu)'\Gamma(x-\mu) > a\right].$$

Simulation from the above distribution can be done using crude rejection sampling. However, when the ellipsoid has probability close to one under the standard multivariate normal distribution, this approach can be extremely inefficient with most realizations being discarded rather than being accepted. The worst-case scenario is when $\Gamma \approx I$, $\mu \approx 0$ and $a$ is large. When $a = \chi^2_{p;0.99}$, for instance,
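The crude rejection scheme and the change to standard coordinates described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `to_standard_coordinates` and `crude_rejection`, the use of a Cholesky factor for the standardization, and the batch size are all hypothetical choices. The two-stage scheme proposed in the paper is not shown, since the excerpt does not describe it.

```python
import numpy as np


def to_standard_coordinates(mean, cov, center, shape):
    """Map an ellipsoid for y ~ N(mean, cov) into standard coordinates.

    With cov = L L' and x = L^{-1}(y - mean), the region
    (y - center)' shape (y - center) > a becomes (x - mu)' gamma (x - mu) > a,
    where mu = L^{-1}(center - mean) and gamma = L' shape L.
    """
    chol = np.linalg.cholesky(np.asarray(cov, dtype=float))
    diff = np.asarray(center, dtype=float) - np.asarray(mean, dtype=float)
    mu = np.linalg.solve(chol, diff)
    gamma = chol.T @ np.asarray(shape, dtype=float) @ chol
    return mu, gamma


def crude_rejection(n_samples, mu, gamma, a, seed=0, batch=10_000, max_batches=10_000):
    """Draw standard Gaussian vectors and keep those outside the ellipsoid.

    Returns the accepted samples and the empirical acceptance rate.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    p = mu.shape[0]
    kept, n_kept, n_drawn = [], 0, 0
    for _ in range(max_batches):
        x = rng.standard_normal((batch, p))
        diff = x - mu
        # Quadratic form (x - mu)' gamma (x - mu) for every row of the batch.
        q = np.einsum("ij,jk,ik->i", diff, gamma, diff)
        accepted = x[q > a]
        kept.append(accepted)
        n_kept += len(accepted)
        n_drawn += batch
        if n_kept >= n_samples:
            break
    else:
        raise RuntimeError("acceptance rate too low for crude rejection")
    return np.concatenate(kept)[:n_samples], n_kept / n_drawn


if __name__ == "__main__":
    # Sphere centred at the origin in 2 dimensions; a is chosen so that the
    # region outside it has probability exp(-a / 2) = 0.1.
    samples, rate = crude_rejection(1000, mu=[0.0, 0.0], gamma=np.eye(2), a=2 * np.log(10))
    print(samples.shape, rate)
```

When Γ = I and µ = 0 the quadratic form is chi-squared with p degrees of freedom, so the acceptance probability equals P(χ²_p > a). If a is the 0.99 quantile of that distribution, roughly 99 of every 100 draws are thrown away, which is the inefficiency the text points to.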

