Berkeley STATISTICS 246 - Lecture Notes


1. Summarizing many probe intensities
Statistics 246, Spring 2006. Week 9, Lecture 2.

2. Summarizing 11-20 probe intensity pairs to give a measure of expression for a probe set
There are many low-level summaries in use. Here are a few:
• MAS5.0 (the Affymetrix method, which replaced the AvDiff described previously);
• dChip, described previously;
• RMA = Robust Multi-chip Analysis, to be described now;
• GCRMA (the successor to RMA); and
• PLIER (the successor to MAS5.0).

3. The MAS5.0 expression summary
Log(Signal Intensity) = TukeyBiweight{log(PMj - MMj*)}
where MMj* is a version of MMj that is never bigger than PMj (the details are not important), and TukeyBiweight is a robust/resistant average of the 11-20 quantities.
Exercise. Tukey's biweight location estimator is an M-estimator. What is its form?
There are at least two problems with this summary. First, it is still single-chip, so it cannot benefit from information across chips. Second, the robustness is not necessarily in the correct place. A probe may function perfectly well but have much higher intensities than the other probes. This procedure would downweight it, perhaps ignore it, for no good reason.

4. Some influence functions (courtesy of Philip Stark) [figure]

5. Some weight functions (courtesy of Philip Stark) [figure]

6. What RMA does: four steps
It uses only the PMs and ignores the MMs. Also, it:
• adjusts for background: call this -*BG;
• carries out quantile normalization of PM-*BG with chips in suitable sets: call the result n(PM-*BG);
• takes log2 of the normalized, background-adjusted PM;
• carries out a Robust Multi-chip linear model Analysis (RMA) of the quantities log2 n(PM-*BG).

7. Why ignore the MM values?
The reason was that it was hard to do better using them. They definitely have information - about both signal and noise - but using them without adding more noise (see below) seemed to be a challenge. GCRMA and other improved background-correction methods make use of MMs, though this hardly justifies having one MM for every PM.
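The exercise above asks for the form of Tukey's biweight location estimator. As a hedged sketch (not the MAS5.0 implementation; the tuning constant c = 4.685 and the fixed-MAD scale are conventional choices, assumed here), the estimator solves Σ ψ((xi - μ)/s) = 0 with ψ(u) = u(1 - (u/c)²)² for |u| ≤ c and 0 otherwise, and can be computed by iteratively reweighted averaging:

```python
import numpy as np

def tukey_biweight_location(x, c=4.685, tol=1e-6, max_iter=50):
    """Tukey biweight (bisquare) location M-estimate via IRWLS.

    psi(u) = u * (1 - (u/c)**2)**2 for |u| <= c, else 0,
    so the IRWLS weight is w(u) = (1 - (u/c)**2)**2 for |u| <= c, else 0.
    The scale is held fixed at the MAD (a common simplification).
    """
    x = np.asarray(x, dtype=float)
    mu = np.median(x)
    s = 1.4826 * np.median(np.abs(x - mu))  # MAD, scaled for the normal
    if s == 0:
        s = 1.0
    for _ in range(max_iter):
        u = (x - mu) / s
        w = np.where(np.abs(u) <= c, (1 - (u / c) ** 2) ** 2, 0.0)
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol:
            return mu_new
        mu = mu_new
    return mu

# One gross outlier gets weight zero, so the estimate stays near 10
# rather than being dragged toward the mean (about 16.7).
est = tukey_biweight_location([10.0, 10.2, 9.8, 10.1, 9.9, 50.0])
```

Because ψ vanishes beyond c, a grossly discrepant value receives weight exactly zero, which is also why the slide's complaint applies: a perfectly good but unusually bright probe gets downweighted too.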
A pool of control probes for use in estimating non-specific hybridization would do just as well, and this is now being used on some Affymetrix chips.

8. Why take log2?
Look at SDs from replicate chips. [figure]

9. Why Multi-chip Analysis?
To put each chip's values in the context of a set of similar values. We see that there are substantial probe effects in the responses (next slides), and that these are repeatable.
Such an obvious feature - the parallelism of probe response across chips in these figures - should be exploited in any analysis. This leads to an additive model on the log scale (or a multiplicative model on the intensity scale, as in dChip).

10. Probe intensity vs concentration, example 1
A glance at some raw data: a 20-probe spike-in set across 14 arrays. [figure: PM intensity vs concentration in pM = array]

11. Probe intensity vs concentration, example 2
A 16-probe spike-in set: 6 non-responding. [figure]

12. Why write log2 n(PM-*BG) ≈ chip effect + probe effect?
Because probe effects are additive on the log scale. The spike-in data set in the previous two slides showed it. Indeed, every set of experiments should exhibit this parallel behaviour across probes. Can you see why?

13. Why a Robust rather than Least Squares fit?
In these large bodies of data we see many (perhaps up to 10%) outliers: image artifacts, bad probes, bad chips, and so on. Robust summaries really can improve over the standard ones, by down-weighting outliers and leaving their effects visible in residuals.
Not only are the estimates of the quantities that matter better; we can also use aspects of the robust analysis to carry out quality assessment.

14. How Robust Multi-chip Analysis works
The analysis involves robustly fitting the following linear model for one probe set:
log2 n(PMij - *BG) = ai + bj + εij
where i labels chips, j labels probes, and Σj bj = 0 is imposed for identifiability. Here the εij are the errors, assumed iid with mean zero and constant variance σ².
Initially we used median polish, but the current implementation uses Huber's ψ.
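Huber's ψ, just mentioned, is linear in the middle and clipped beyond a tuning constant; the IRWLS weight it induces is ψ(u)/u. A minimal sketch (k = 1.345 is a conventional tuning constant for 95% normal efficiency, an assumption here, not necessarily the value used in the RMA implementation):

```python
import numpy as np

def huber_psi(u, k=1.345):
    """Huber's psi: the identity for |u| <= k, clipped at +/-k beyond."""
    return np.clip(u, -k, k)

def huber_weight(u, k=1.345):
    """IRWLS weight w(u) = psi(u)/u = min(1, k/|u|), with w(0) = 1."""
    au = np.abs(u)
    return np.where(au <= k, 1.0, k / np.maximum(au, 1e-12))

# Moderate standardized residuals keep full weight; gross ones are
# down-weighted but never discarded outright (unlike Tukey's biweight,
# whose weight reaches zero).
u = np.array([0.0, 1.0, 2.0, -5.0])
w = huber_weight(u)
```

The contrast with the biweight matters for the earlier complaint about MAS5.0: with Huber's ψ, even an extreme point retains a small positive weight.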
We'll describe both.

15. RMA in summary
• Background-correct PM.
• Carry out quantile normalization.
• Take log2.
• Under the additive model log2 n(PMij - *BG) = ai + bj + εij, estimate the chip effects ai and probe effects bj (with Σj bj = 0) using a robust/resistant method.

16. M-estimators
One can estimate the parameters of the model as solutions to
min over ai, bj of Σij ρ((Yij - ai - bj) / σ̂) = min over ai, bj of Σij ρ(uij),
where ρ is a symmetric, positive-definite function increasing less rapidly than x². Solutions to this minimization problem can be obtained by an Iteratively Reweighted Least Squares (IRWLS) procedure with weights
wij = ρ'(uij)/uij = ψ(uij)/uij.

17. Robust fit by IRWLS
At each iteration:
rij = Yij - current est(ai) - current est(bj);
S = k·MAD(rij), a robust estimate of the scale parameter σ;
uij = rij/S, the standardized residuals;
wij = ψ(uij)/uij, weights to reduce the effect of discrepant points on the next fit.
The next-step estimates are:
next est(ai) = weighted row i mean;
next est(bj) = weighted column j mean - overall weighted mean.

18. The way robustness works here
We look at the parameter estimates from a robust two-way analysis using median polish, and from the usual least squares, on data from 37 control probe sets across 6 chips. Overall we have 20x37 probe effects, 6x37 chip effects, and 20x6x38 residuals.
This was repeated for 37 randomly chosen probe sets, to check that the control probe sets were not atypical. They weren't.
After observing certain patterns, an explanation is offered of the way robustness works.

19. The two methods
Each fits the same additive model to probe-level data.
Median polish (mp) fits iteratively, successively removing row and column medians and accumulating the terms, until the process stabilizes.
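The median-polish iteration just described can be sketched as follows (a simplified version patterned after Tukey's classic algorithm, not the RMA source code):

```python
import numpy as np

def median_polish(x, max_iter=20, tol=1e-6):
    """Tukey's median polish for the additive model
    x[i, j] ~ overall + row[i] + col[j] + resid[i, j].

    Alternately sweep row and column medians out of the residuals,
    accumulating them in the effect terms, until the fit stabilizes.
    """
    resid = np.array(x, dtype=float)
    overall = 0.0
    row = np.zeros(resid.shape[0])
    col = np.zeros(resid.shape[1])
    for _ in range(max_iter):
        old = resid.copy()
        # Sweep row medians into the row effects.
        rdelta = np.median(resid, axis=1)
        resid -= rdelta[:, None]
        row += rdelta
        # Keep the column effects centred: move their median to `overall`.
        d = np.median(col)
        col -= d
        overall += d
        # Sweep column medians into the column effects.
        cdelta = np.median(resid, axis=0)
        resid -= cdelta[None, :]
        col += cdelta
        # Keep the row effects centred.
        d = np.median(row)
        row -= d
        overall += d
        if np.max(np.abs(resid - old)) < tol:
            break
    return overall, row, col, resid

# Exactly additive data (rows and columns differ by constants) is
# recovered with zero residuals.
x = np.array([[5.0, 7.0], [6.0, 8.0], [4.0, 6.0]])
t, r, c, e = median_polish(x)
```

Note that median polish enforces zero *medians*, not zero sums, on the effects, so its estimates need not coincide with the least-squares ones even on clean data.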
The residuals are what is left at the end.
Least squares (lm) uses the familiar closed-form estimates of the parameters a and b (what are they?), and again the residuals are what is left after subtracting them from the observations.

20. Two sets of residuals (control probes)
Note the slightly different shapes of their distributions. [figure]

21. Similarly with random probe sets. [figure]

22. Normal QQ plots of mp and lm residuals
Which has (slightly) fatter tails? [figure]

23. mp residuals have fatter tails than lm, but …
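For the question above: under the constraint Σj bj = 0, the closed-form least-squares estimates are âi = mean of row i and b̂j = mean of column j minus the grand mean. A small sketch with simulated (hypothetical) data also shows why lm residuals have thinner tails: a single gross outlier is smeared across its whole row and column rather than left intact in one residual.

```python
import numpy as np

def lm_two_way(Y):
    """Closed-form least-squares fit of Y[i, j] = a[i] + b[j] + e[i, j]
    under the constraint sum_j b[j] = 0:
        a_hat[i] = mean of row i,
        b_hat[j] = mean of column j minus the grand mean.
    """
    a = Y.mean(axis=1)
    b = Y.mean(axis=0) - Y.mean()
    resid = Y - a[:, None] - b[None, :]
    return a, b, resid

# Hypothetical chip and probe effects, small noise, one gross outlier.
rng = np.random.default_rng(0)
a_true = np.array([8.0, 9.0, 10.0, 11.0])       # 4 "chips"
b_true = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # 5 "probes"
Y = a_true[:, None] + b_true[None, :] + 0.01 * rng.standard_normal((4, 5))
Y[0, 0] += 6.0  # outlier

a, b, resid = lm_two_way(Y)
# Least squares smears the outlier: its own residual is only about
# 6 * (1 - 1/4 - 1/5 + 1/20) = 3.6, with the rest absorbed by the
# fitted effects and spread into the other cells of row 0 and column 0.
```

A resistant fit such as median polish would instead leave most of the 6 in the (0, 0) residual, which is why its residual distribution has the fatter tails seen in the QQ plots, and why robust residuals are useful for quality assessment.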

