Best Linear Estimators (ARE 210): BLE, BLUE and BLMSE
These notes address four questions:

1. How do we estimate the unknown parameters of a probability distribution?
2. What kinds of inferences can we make based on those parameter estimates?
3. Under what conditions is our rule for estimating these unknown parameters optimal in some reasonable sense?
4. When can we do better, and when not?

In these notes, I develop three kinds of estimators that are linear in the observations (the data), all of which can be thought of in terms of optimization theory.

The BLE (Best Linear Estimator) chooses a linear combination of the observations from a random sample to minimize the variance without constraint. The result is weight zero on each and every data point. While this estimator does in fact attain the global unrestricted minimum of the variance for an estimator (its variance is always zero!), it is biased: its bias is $-\mu$, so it is biased for every sample size and for any distribution whose mean is not zero.

The BLUE (Best Linear Unbiased Estimator) principle minimizes the variance of the chosen linear combination of the data subject to the constraint that the estimator must be unbiased.

The BLMSE (Best Linear Mean Squared Error Estimator) principle weights the squared bias equally with the variance in the objective function and seeks the unrestricted global minimum of the sum $\text{bias}^2 + \text{variance}$.

We begin by supposing that we have a random sample of i.i.d. random variables $y_1, y_2, \ldots, y_n$. Let the population mean of the underlying probability distribution for the $y$'s be $\mu$ and the population variance be $\sigma^2$. Both are unknown. For now, we make no further assumptions about the distribution; in particular, we do not assume that we know the functional form of the pdf (such as normal). We will, however, restrict our attention to linear combinations of the data, say

$$\hat\mu = \sum_{i=1}^n w_i y_i,$$

where the "weights" $w_i$ are choice variables, both to make calculating expectations simpler and to pose the estimation problem more sharply.

Writing the mean of $\hat\mu$ as

$$E(\hat\mu) = E\Big(\sum_{i=1}^n w_i y_i\Big) = \sum_{i=1}^n w_i E(y_i) = \mu \sum_{i=1}^n w_i, \tag{1}$$

the variance of $\hat\mu$ is

$$\sigma^2_{\hat\mu} = E\big\{[\hat\mu - E(\hat\mu)]^2\big\} = E\Big\{\Big(\sum_{i=1}^n w_i y_i - \mu \sum_{i=1}^n w_i\Big)^2\Big\} = E\Big\{\Big[\sum_{i=1}^n w_i (y_i - \mu)\Big]^2\Big\}. \tag{2}$$

We seek to choose the weights $w_i$, $i = 1, \ldots, n$, to minimize this function. Using the composite function theorem, the necessary first-order conditions are

$$\frac{\partial \sigma^2_{\hat\mu}}{\partial w_i} = 2E\Big[(y_i - \mu) \sum_{j=1}^n w_j (y_j - \mu)\Big] = 0 \quad \forall\, i = 1, \ldots, n. \tag{3}$$

Rearranging terms and using the fact that independence implies zero covariance,

$$\frac{\partial \sigma^2_{\hat\mu}}{\partial w_i} = 2\sum_{j=1}^n w_j E[(y_i - \mu)(y_j - \mu)] = 2\sigma^2 w_i = 0 \quad \forall\, i = 1, \ldots, n \tag{4}$$

if and only if $w_i = 0$ for all $i = 1, \ldots, n$. Thus the choice $\hat\mu = 0$ achieves the global unrestricted minimum variance of zero. But although this is a very precise estimator, it is likely to be inaccurate. It is prudent, therefore, to take the bias of an estimator into account. We now develop and discuss the statistical properties of two estimators that do: the BLUE and the BLMSE.

To obtain the BLUE, note that for $\hat\mu$ to be unbiased, it must satisfy $E(\hat\mu) = \mu$. Applying this condition to (1), we have

$$E(\hat\mu) = \mu \sum_{i=1}^n w_i = \mu \quad \text{if and only if} \quad \sum_{i=1}^n w_i = 1. \tag{5}$$

Thus, we now seek the weights $w_i$ that minimize the variance (2) subject to the adding-up condition (5) implied by unbiasedness.
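As a quick numerical illustration of the contrast between the BLE and an unbiased competitor, here is a minimal Monte Carlo sketch (not part of the original notes; the normal population, parameter values, sample size, and seed are all arbitrary choices for the illustration):

```python
import numpy as np

rng = np.random.default_rng(0)               # arbitrary seed, for reproducibility
mu, sigma, n, reps = 5.0, 2.0, 25, 10_000    # illustrative values only

ble_draws = np.empty(reps)    # BLE: every weight w_i = 0, so the estimate is always 0
mean_draws = np.empty(reps)   # equal weights w_i = 1/n, i.e. the sample mean

for r in range(reps):
    y = rng.normal(mu, sigma, size=n)   # i.i.d. sample; any distribution would do
    ble_draws[r] = np.sum(0.0 * y)      # weight zero on each and every data point
    mean_draws[r] = y.mean()

print("BLE:         bias ~", ble_draws.mean() - mu, "  variance ~", ble_draws.var())
print("sample mean: bias ~", mean_draws.mean() - mu, "  variance ~", mean_draws.var())
```

The BLE's simulated variance is exactly zero, but its bias equals $-\mu$; the sample mean is approximately unbiased with variance near $\sigma^2/n$. This is the precision-versus-accuracy trade-off that motivates the constrained problem solved next.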
To find these weights, we form the Lagrangean function

$$\mathcal{L} = E\Big\{\Big[\sum_{i=1}^n w_i (y_i - \mu)\Big]^2\Big\} + \lambda\Big(1 - \sum_{i=1}^n w_i\Big), \tag{6}$$

and find a saddle point of $\mathcal{L}$ (a relative minimum with respect to the $w$'s and a relative maximum with respect to $\lambda$). Since we do not have any inequality or sign restrictions on the choice variables, and since the Lagrangean is convex in $w$ (this is easy to prove, and you should do it as an exercise), the first-order conditions are both necessary and sufficient:

$$\frac{\partial \mathcal{L}}{\partial w_j} = 2E\Big[(y_j - \mu) \sum_{i=1}^n w_i (y_i - \mu)\Big] - \lambda = 0 \quad \forall\, j = 1, \ldots, n, \tag{7}$$

$$\frac{\partial \mathcal{L}}{\partial \lambda} = 1 - \sum_{i=1}^n w_i = 0. \tag{8}$$

Rearranging the terms inside the braces and then passing the expectation operator through by the distributive law, we can rewrite (7) as

$$\frac{\partial \mathcal{L}}{\partial w_j} = 2\sum_{i=1}^n w_i E[(y_i - \mu)(y_j - \mu)] = \lambda \quad \forall\, j = 1, \ldots, n. \tag{9}$$

Now, again using the fact that independence implies zero covariance, we obtain

$$\frac{\partial \mathcal{L}}{\partial w_j} = 2\sigma^2 w_j = \lambda \quad \forall\, j = 1, \ldots, n. \tag{10}$$

Solving for the $w_j$ terms, we have $w_j = \lambda/(2\sigma^2)$ for all $j$. Substituting this into (8) and solving for $\lambda$ gives

$$\sum_{j=1}^n w_j = \frac{n\lambda}{2\sigma^2} = 1 \quad \Rightarrow \quad \lambda = \frac{2\sigma^2}{n}.$$

Thus, the optimal weights for the BLUE for $\mu$ are $w_j = 1/n$, $j = 1, \ldots, n$. Finally, this implies that the best linear unbiased estimator for $\mu$, regardless of the underlying distribution of $y_1, y_2, \ldots, y_n$, is the sample mean,

$$\hat\mu = \frac{1}{n}\sum_{i=1}^n y_i = \bar{y}.$$

Before proceeding to the BLMSE, we briefly develop the statistical properties of the sample mean.

First, by construction, $\hat\mu = \bar{y}$ is unbiased. We can verify this using the linearity of the expectation operator:

$$E(\bar{y}) = E\Big(\frac{1}{n}\sum_{i=1}^n y_i\Big) = \frac{1}{n}\sum_{i=1}^n E(y_i) = \frac{n\mu}{n} = \mu. \tag{11}$$

Second, also by construction, $\hat\mu = \bar{y}$ has the smallest variance among all unbiased estimators for $\mu$ that are formed as linear combinations of the $y_i$. We can calculate its variance using the fact that the $y_i$ are statistically independent, and therefore uncorrelated:

$$E[(\bar{y} - \mu)^2] = E\Big[\Big(\frac{1}{n}\sum_{i=1}^n y_i - \mu\Big)^2\Big] = E\Big\{\Big[\frac{1}{n}\sum_{i=1}^n (y_i - \mu)\Big]^2\Big\}$$
$$= \frac{1}{n^2}\sum_{i=1}^n E[(y_i - \mu)^2] + \frac{2}{n^2}\sum_{i=1}^{n-1}\sum_{j=i+1}^n E[(y_i - \mu)(y_j - \mu)]$$
$$= \frac{1}{n^2}\sum_{i=1}^n \sigma^2 + \frac{2}{n^2}\sum_{i=1}^{n-1}\sum_{j=i+1}^n 0 = \frac{\sigma^2}{n}. \tag{12}$$

Finally, the sample-size-adjusted, mean-deviated random variable $\sqrt{n}(\bar{y} - \mu)$ has mean zero and variance $\sigma^2$ for every sample size $n$. As long as this variance is finite, it can be shown that as $n \to \infty$, this random variable converges to one with a normal distribution (also with mean zero and variance $\sigma^2$). This is one of the main justifications for using normal distribution theory and standard normal probability tables to construct confidence intervals and perform hypothesis tests for all sorts of probability distributions, as long as the sample size is reasonably large. A numerical check of these three properties appears below.
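Here is a short Monte Carlo sketch of (11), (12), and the limiting-normality claim (again, not from the notes; the exponential population, sample sizes, replication count, and seed are arbitrary illustrative choices):

```python
import numpy as np

# Check that ybar is unbiased with variance sigma^2/n, and that the
# normalized variable sqrt(n)*(ybar - mu) keeps variance sigma^2 as n grows.
# Exponential data are used deliberately: nothing relies on normality.
rng = np.random.default_rng(1)
mu, sigma2 = 1.0, 1.0          # exponential(scale=1): mean 1, variance 1
reps = 20_000

for n in (5, 50, 500):
    ybar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
    z = np.sqrt(n) * (ybar - mu)           # sqrt(n)(ybar - mu)
    print(f"n={n:4d}  E[ybar]~{ybar.mean():.4f}  Var[ybar]~{ybar.var():.5f}"
          f"  (sigma^2/n={sigma2/n:.5f})  Var[z]~{z.var():.4f}")
```

For each $n$, the simulated mean of $\bar{y}$ is close to $\mu$ and its variance tracks $\sigma^2/n$, while the variance of $\sqrt{n}(\bar{y} - \mu)$ stays near $\sigma^2$ regardless of $n$; a histogram of $z$ at $n = 500$ would look approximately normal even though the underlying data are exponential.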


The BLUE applies a lexicographic preference ordering