ANOVA and the Bootstrap - Computational Diagnostics Group
Practical microarray analysis – resampling and the bootstrap

ANOVA and the Bootstrap
Ulrich Mansmann
[email protected]
Practical microarray analysis
Heidelberg, March 2003

Probe conservation and gene expression

Question of interest:
Does the time between probe harvesting and probe hybridisation have a relevant influence on gene expression? Are there degradation effects or time patterns?

Data:
Probes of three patients were hybridised at day 0, 1, and 2.
Nine Affymetrix chips with 12625 genes each.

Methodological approach:
Quantify time effects: general or gene-specific.
Is there evidence for time effects?
How to quantify variability between patients?
How to quantify variability over time (within patients)?
How to operationalise the idea of a relevant influence on gene expression?

The one gene scenario - Table

                     time
probe/patient     T1     T2     T3
P1                X11    X12    X13
P2                X21    X22    X23
P3                X31    X32    X33

The one gene scenario - Graph

[Figure: Affi-ID 36518_at. VSN transf. signal (y-axis, about 13.8 to 15.0) against time (x-axis, 1.0 to 3.0).]
First ideas on variability

Quantifying variability:

\[
SS = \sum_{i=1}^{3}\sum_{j=1}^{3} (x_{ij} - \bar{x})^2
\quad\text{with}\quad
\bar{x} = \frac{1}{9}\sum_{i=1}^{3}\sum_{j=1}^{3} x_{ij}, \text{ the global mean}
\]

Separate the variability between patients from the variability of individual time courses (within patients):

\[
\begin{aligned}
SS &= \sum_{i=1}^{3}\sum_{j=1}^{3} (x_{ij} - \bar{x})^2 \\
   &= \sum_{i=1}^{3}\sum_{j=1}^{3} (x_{ij} - \bar{x}_i + \bar{x}_i - \bar{x})^2
      \quad\text{with}\quad \bar{x}_i = \frac{1}{3}\sum_{j=1}^{3} x_{ij}, \text{ the mean patient value} \\
   &= \underbrace{\sum_{i=1}^{3}\sum_{j=1}^{3} (x_{ij} - \bar{x}_i)^2}_{\text{within patient variability}}
      + \underbrace{3\sum_{i=1}^{3} (\bar{x}_i - \bar{x})^2}_{\text{between patient variability}}
\end{aligned}
\]

The cross term vanishes in the last step because \( \sum_{j=1}^{3} (x_{ij} - \bar{x}_i) = 0 \) for every patient i.

ANOVA – Analysis of variance (1)

Analysis of variance (ANOVA) studies differences between the mean values of normally distributed data from groups.

We are interested in an analysis of \( y_{ij} = x_{ij} - \bar{x}_i \), the time course measurements adjusted for individual probe levels.

The biological variability between individual probes is not of interest for the next step and is therefore separated out. We only look at individually adjusted time courses.

Important boundary condition: \( \sum_{j=1}^{3} y_{ij} = 0 \) for all i (i = 1, 2, 3).

[Figure: Affi-ID 36518_at. VSN transf. signal (adjusted) (y-axis, about -0.4 to 0.8) against time (x-axis, 1.0 to 3.0).]

Is there evidence for a systematic pattern? What is a systematic pattern?
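The split of the total sum of squares into a within-patient and a between-patient part, and the patient-wise adjustment y_ij = x_ij - mean_i, can be checked numerically. The following is only a minimal sketch: the expression values in `x` are made up for illustration and are not taken from the slides, and the function names `decompose` and `adjust` are hypothetical.

```python
# Hypothetical expression values for one gene (NOT from the slides):
# rows = patients P1..P3, columns = time points T1..T3.
x = [
    [14.2, 14.6, 14.9],
    [13.9, 14.1, 14.5],
    [14.4, 14.3, 15.0],
]


def decompose(x):
    """Return (total, within, between) sums of squares for a patient-by-time table."""
    n_time = len(x[0])
    values = [v for row in x for v in row]
    grand_mean = sum(values) / len(values)
    patient_means = [sum(row) / n_time for row in x]
    total = sum((v - grand_mean) ** 2 for v in values)
    within = sum((v - m) ** 2 for row, m in zip(x, patient_means) for v in row)
    between = n_time * sum((m - grand_mean) ** 2 for m in patient_means)
    return total, within, between


def adjust(x):
    """Subtract each patient's mean from that patient's row: y_ij = x_ij - mean_i."""
    return [[v - sum(row) / len(row) for v in row] for row in x]


total, within, between = decompose(x)
y = adjust(x)

print("total SS          :", total)
print("within + between  :", within + between)
print("row sums of y     :", [sum(row) for row in y])
```

Up to floating-point rounding, the first two printed numbers should agree, and every row of `y` should sum to zero, which is the boundary condition stated above.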
ANOVA – Analysis of variance (2)

A formal view on the data of patient i (probe i):

\[
\begin{pmatrix} y_{i1} \\ y_{i2} \\ y_{i3} \end{pmatrix}
= a \cdot \begin{pmatrix} \frac{1}{\sqrt{2}} \\ 0 \\ -\frac{1}{\sqrt{2}} \end{pmatrix}
+ b \cdot \begin{pmatrix} \frac{1}{\sqrt{6}} \\ -\frac{2}{\sqrt{6}} \\ \frac{1}{\sqrt{6}} \end{pmatrix}
+ \begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \varepsilon_{i3} \end{pmatrix}
\]

A sequence of three measurements with mean 0 can be decomposed into a linear term, a quadratic term, and residuals.

How to calculate the coefficients a and b?

ANOVA – Analysis of variance (3)

\[
\begin{aligned}
y_{11} &= a \cdot 2^{-1/2} + b \cdot 6^{-1/2} + \varepsilon_{11} \\
y_{12} &= b \cdot (-2) \cdot 6^{-1/2} + \varepsilon_{12} \\
y_{13} &= a \cdot (-2^{-1/2}) + b \cdot 6^{-1/2} + \varepsilon_{13} \\
y_{21} &= a \cdot 2^{-1/2} + b \cdot 6^{-1/2} + \varepsilon_{21} \\
y_{22} &= b \cdot (-2) \cdot 6^{-1/2} + \varepsilon_{22} \\
y_{23} &= a \cdot (-2^{-1/2}) + b \cdot 6^{-1/2} + \varepsilon_{23} \\
y_{31} &= a \cdot 2^{-1/2} + b \cdot 6^{-1/2} + \varepsilon_{31} \\
y_{32} &= b \cdot (-2) \cdot 6^{-1/2} + \varepsilon_{32} \\
y_{33} &= a \cdot (-2^{-1/2}) + b \cdot 6^{-1/2} + \varepsilon_{33}
\end{aligned}
\]

Least squares estimates for the coefficients:

\[
a = \frac{\sqrt{2}}{3 \cdot 2} \sum_{i=1}^{3} (y_{i1} - y_{i3})
\quad\text{and}\quad
b = \frac{\sqrt{6}}{3 \cdot 4} \sum_{i=1}^{3} (y_{i1} + y_{i3} - y_{i2})
\]

Residuals \( \varepsilon_{ij} \) are calculated by taking the difference between observed and model-based values.
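The least squares estimates can be cross-checked against a direct projection of the adjusted data onto the linear and quadratic patterns. Because the two patterns are orthogonal, each coefficient is simply the inner product of the data with the pattern, divided by the pattern's squared length over all nine observations. The sketch below assumes hypothetical adjusted values `y` (each row sums to zero); the names `closed_form`, `projection`, and `residuals` are illustrative and not from the slides.

```python
import math

# Hypothetical adjusted time courses y_ij (NOT from the slides); each row sums to zero.
y = [
    [-0.35, 0.05, 0.30],
    [-0.25, -0.05, 0.30],
    [-0.15, -0.25, 0.40],
]

# Linear and quadratic patterns from the model; each has length 1 and they are orthogonal.
LINEAR = [1 / math.sqrt(2), 0.0, -1 / math.sqrt(2)]
QUADRATIC = [1 / math.sqrt(6), -2 / math.sqrt(6), 1 / math.sqrt(6)]


def closed_form(y):
    """Coefficients a and b from the closed-form sums (valid when each row sums to zero)."""
    a = math.sqrt(2) / (3 * 2) * sum(row[0] - row[2] for row in y)
    b = math.sqrt(6) / (3 * 4) * sum(row[0] + row[2] - row[1] for row in y)
    return a, b


def projection(y, pattern):
    """Least squares coefficient for one pattern: <y, pattern> / <pattern, pattern>."""
    numerator = sum(v * p for row in y for v, p in zip(row, pattern))
    denominator = len(y) * sum(p * p for p in pattern)
    return numerator / denominator


def residuals(y, a, b):
    """Observed minus model-based values for every patient and time point."""
    return [
        [v - a * l - b * q for v, l, q in zip(row, LINEAR, QUADRATIC)]
        for row in y
    ]


a, b = closed_form(y)
a_proj = projection(y, LINEAR)
b_proj = projection(y, QUADRATIC)
e = residuals(y, a, b)

print("a (closed form, projection):", a, a_proj)
print("b (closed form, projection):", b, b_proj)
```

The closed form for b relies on the boundary condition: with y_i1 + y_i2 + y_i3 = 0, the sum y_i1 + y_i3 - y_i2 equals -2 y_i2, so both routes give the same value only for mean-adjusted rows.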
ANOVA – Analysis of variance (4)

The final decomposition of the variance:

\[
\begin{aligned}
SS &= \sum_{i=1}^{3}\sum_{j=1}^{3} (x_{ij} - \bar{x})^2
    = \sum_{i=1}^{3}\sum_{j=1}^{3} (x_{ij} - \bar{x}_i)^2 + 3\sum_{i=1}^{3} (\bar{x}_i - \bar{x})^2 \\
   &= \underbrace{\sum_{i=1}^{3}\sum_{j=1}^{3} \varepsilon_{ij}^2}_{\text{unexplained variability}}
    + \underbrace{(3a^2 + 3b^2)}_{\text{model based variability}}
    + \underbrace{3\sum_{i=1}^{3} (\bar{x}_i - \bar{x})^2}_{\text{patient level variability}}
\end{aligned}
\]
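As a rough numerical check of the three-part decomposition, the components can be computed for a single gene and summed. This is a sketch under stated assumptions: the values in `x` are invented for illustration, the function name `variance_components` is hypothetical, and the factors of 3 are hard-coded for the three patients and three time points of this design.

```python
import math

# Hypothetical expression values for one gene (NOT from the slides):
# rows = patients P1..P3, columns = time points T1..T3.
x = [
    [14.2, 14.6, 14.9],
    [13.9, 14.1, 14.5],
    [14.4, 14.3, 15.0],
]

# Linear and quadratic patterns from the model (unit length, mutually orthogonal).
LINEAR = [1 / math.sqrt(2), 0.0, -1 / math.sqrt(2)]
QUADRATIC = [1 / math.sqrt(6), -2 / math.sqrt(6), 1 / math.sqrt(6)]


def variance_components(x):
    """Split the total sum of squares of a 3x3 table into its three parts."""
    values = [v for row in x for v in row]
    grand_mean = sum(values) / 9
    patient_means = [sum(row) / 3 for row in x]

    # Adjust each time course for its patient level.
    y = [[v - m for v in row] for row, m in zip(x, patient_means)]

    # Least squares coefficients: projection onto each pattern.
    # The squared length of each pattern over all 3 patients is 3.
    a = sum(v * p for row in y for v, p in zip(row, LINEAR)) / 3
    b = sum(v * p for row in y for v, p in zip(row, QUADRATIC)) / 3

    unexplained = sum(
        (v - a * l - b * q) ** 2
        for row in y
        for v, l, q in zip(row, LINEAR, QUADRATIC)
    )
    return {
        "total": sum((v - grand_mean) ** 2 for v in values),
        "unexplained": unexplained,
        "model": 3 * a ** 2 + 3 * b ** 2,
        "patient": 3 * sum((m - grand_mean) ** 2 for m in patient_means),
    }


parts = variance_components(x)
for name, value in parts.items():
    print(f"{name:12s} {value:.6f}")
```

Up to rounding, `total` should equal the sum of `unexplained`, `model`, and `patient`. The model term takes the simple form 3a² + 3b² because the two patterns are orthogonal and each contributes a squared length of 3 across the nine observations.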