Unformatted text preview:

RECURSIVE ALGORITHM FOR SUMSOF SUBSETS• Computational complexity of:– |D|-conditional logistic likelihood– Finite population baseline odds incre-ment estimate.• Solution: Recursive algorithm.1COMPUTATIONALLY CHALLENGINGSUMS• |D|-Conditional logistic likelihood:l(β; D|D|) =ϕD(β)Ps⊂R:|s|=|D|ϕs(β)• Finite population estimator of the base-line odds:∆bΛ =n − (|D| − 1)n − |D|Ps⊂R:|s|=|D|−1ϕsPs⊂R:|s|=|D|ϕs.• Other sums in the variance expressions...2THE COMPUTATIONAL PROBLEM• The expressions involvePd⊂R:|d|=m.• If the number of cases in the risk set is“largish,” then these sums can have toomany terms for direct summing.• If there are 15 cases in a risk set of size100, the number of terms is10015≈ 2.53×1017!3SUMS THAT NEED TO BECOMPUTEDLet xibe a vector of values associated withunit i andxd=Xi∈dxi.B0(m, r) =Xd⊂r:|d|=mϕdB1(m, r, x) =Xd⊂r:|d|=mxdϕdB2(m, r, x) =Xd⊂r:|d|=mx⊗2dϕd.4HISTORY• Cox (1972) JRSS paper describing Coxregression.• Discussion includes Breslow estimator ofbaseline hazard.• Howard has discussion point on computa-tion with ties and describes the recursivealgorithms (but seems to have been “for-gotten.”)• Gail, Lubin, Rubenstein (1981) “rediscover”and implement.Cox DR. JRSS B 34:187-220, 1972.Gail et al. Biometrika68:703-707, 19815SPECIAL CASE BINOMIALCOEFFICIENTWhen the ϕi≡ 1 (β = 0),B0(m, r) =|r|m6THE RECURSIVE ALGORITHM: ANEXAMPLE• Consider the contribution of a particular element5 to the sum B0(3, {1, 2, 3, 4} ∪ {5}).B0(3, {1, 2, 3, 4} ∪ {5})= 123 + 124 + 134 + 234 + 125 + 135145 + 235 + 245 + 345= 123 + 124 + 134 + 234 + 5(12 + 13 + 14 + 23 + 24 + 34)= B0(3, {1, 2, 3, 4}) + ϕ5B0(2, {1, 2, 3, 4}).• Suggests, in general, thatB0(m, r ∪ {j}) = B0(m, r) + ϕjB0(m − 1, r).7THE RECURSIVE ALGORITHM FORSUMS OF PRODUCTS OF SETS OFSIZE 4 FROM 8B0(1, r1)↓&B0(1, r2) B0(2, r2)↓& ↓&B0(1, r3) B0(2, r3) B0(3, r3)↓& ↓& ↓&B0(1, r4) B0(2, r4) B0(3, r4) B0(4, r4)↓& ↓& ↓& ↓B0(1, r5) B0(2, r5) B0(3, r5) B0(4, r5)↓& ↓& ↓& ↓B0(1, r6) B0(2, r6) B0(3, r6) B0(4, r6)↓& ↓& ↓& ↓B0(1, r7) B0(2, r7) B0(3, r7) B0(4, r7)↓& ↓& ↓& ↓B0(1, r8) B0(2, r8) B0(3, r8) B0(4, r8)B0(m, r ∪ {j}) = B0(m, r) + ϕjB0(m − 1, r).8THE RECURSIONS FOR EACH TYPEOF SUMTheorem: LetB0(m, r) =Xd⊂r:|d|=mϕdB1(m, r) =Xd⊂r:|d|=m Xi∈dxi!ϕdB2(m, r) =Xd⊂r:|d|=m Xi∈dxi!⊗2ϕd.Then the following recursion formulas hold:B0(m, r ∪ {j}) = B0(m, r) + ϕjB0(m − 1, r)B1(m, r ∪ {j}) = B1(m, r) + ϕjB1(m − 1, r) + xjϕjB0(m − 1, r)B2(m, r ∪ {j}) = B2(m, r) + ϕjB2(m − 1, r)+ ϕj[xjB1(m − 1, r)T+ B1(m − 1, r)xTj] + ϕjx⊗2jB0(m − 1, r)9PROOF OF B1RECURSIONFORMULAB1(m, r ∪ {j}) = B1(m, r) + ϕjB1(m − 1, r) + xjϕjB0(m − 1, r).Proof: Splitting the outer sum into sets that do not contain jand those that do contain j,B1(m, r ∪ {j}) =Xd⊂r∪{j}:|d|=m Xi∈dxi!ϕd= terms without j + terms with j=Xd⊂r:|d|=m Xi∈dxi!ϕd+Xd⊂r:|d|=m−1 Xi∈dxi+ xj!ϕdϕj= B1(m, r) + ϕjXd⊂r:|d|=m−1 Xi∈dxi!ϕd+ ϕjxjXd⊂r:|d|=m−1ϕd= B1(m, r) + ϕjB1(m − 1, r) + ϕjxjB0(m − 1, r).Provides recursive method.10COMPUTATIONAL EFFICIENCY OFTHE RECURSIVE ALGORITHM• The algorithm requires the order of |R| ×|D| operations compared to the|R||D|op-erations required for direct summing.• So for a case-control set with 15 casesand 15 controls, the recursive algorithmrequires on the order of 30 × 15 = 450operations to compute the 150 millionterms in the sum.• SAS LOGIST (STRATA), and PHREG(TIES=DISCRETE) use the recursive al-gorithm to fit conditional logistic likeli-hood.11EXAMPLE: COMPUTE BASELINEODDS FROM RISK SET 6libname rs ’\book\miners\coh_rs’;* read in data and risk set and covariate information* into a single record;data riskset6 (keep=_setno _ncases _ntot z1-z1626);set rs.gr_miners_rs_5yr;by _setno;length _ncases _ntot 5;array z{1626} z1-z1626;retain i 0 z1-z1626 _ntot 1626 _ncases 60;if _setno ne 6 then delete;* put _setno 6 subject data into the array;i = i + 1;z[i] = cr500;if last._setno then output;run;Output data set has one record with cr500 for all 1626 members.12COMPUTE BASELINE ODDS:PRELIMINARIESdata baselineodds (keep=_setno _ncases _ntot bhat b0_bfd b0_bfd_1 baselineodds);set riskset6;retain bhat 1.45;******************************************* input arrays:z - covariate values (will expand to multiple covariates later);array zt[1,1626] z1-z1626;******************************************* temporary arrays:* phi - odds ratio (calculated from beta and z;array phi[1626] _temporary_;* recursion arrays, set size to at least the max number of cases;array b0a[1626] _temporary_;array b0b[-1:1626] _temporary_;* compute phi vector;do i = 1 to _ntot; phi[i] = exp(zt[1,i]*bhat); end;13DO THE RECURSION* initialize values;b0b[-1]=0; b0b[0]=1;* initialize m-1 level;b0a[1] = phi[1];do i = 2 to _ncases;b0a[i] = 0;end;* do the recursion;do k = 2 to _ntot;b0b[1] = b0a[1] + phi[k];do l = 2 to min(k,_ncases);b0b[l] = b0a[l] + b0a[l-1]*phi[k];end;* re-initialize the m-1 step array;do l = 1 to min(k,_ncases);b0a[l] = b0b[l];end;end; /* recursion section */14*********************************************************** Compute baseline log odds**********************************************************;b0_bfd = b0b[_ncases];b0_bfd_1 = b0b[_ncases-1];baselineodds = (b0b[_ncases-1]/b0b[_ncases])*(_ntot - (_ncases-1))/(_ntot-_ncases);run;15Baseline odds estimate from CPUM 5year 55-60 risk set_ncases _ntot bhat b0_bfd b0_bfd_1 baselineodds60 1626 1.45 1.9532E134 2.9979E132


View Full Document

USC PM 518a - 2011-06-03 recursive algorithm

Download 2011-06-03 recursive algorithm
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view 2011-06-03 recursive algorithm and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view 2011-06-03 recursive algorithm 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?