
Normalization for cDNA Microarray Data
Yee Hwa Yang, Sandrine Dudoit, Percy Luu and Terry Speed
SPIE BIOS 2001, San Jose, CA, January 22, 2001

Normalization issues
- Within-slide
  - What genes to use
  - Location
  - Scale
- Paired-slides (dye swap)
  - Self-normalization
- Between slides

Within-Slide Normalization
- Normalization balances red and green intensities.
- Imbalances can be caused by
  - Different incorporation of dyes
  - Different amounts of mRNA
  - Different scanning parameters
- In practice, we usually need to increase the red intensity a bit to balance the green.

Methods?
log2(R/G) -> log2(R/G) - c = log2(R/(kG)), where c = log2(k)
- Standard practice (in most software): c is a constant such that the normalized log-ratios have zero mean or median.
- Our preference: c is a function of overall spot intensity and print-tip-group.

What genes to use?
- All genes on the array
- Constantly expressed genes (housekeeping)
- Controls
  - Spiked controls (e.g. plant genes)
  - Genomic DNA titration series
- Other set of genes

Experiment: KO #8
mRNA samples:
- R = Apo A1 KO mouse liver
- G = Control mouse liver
(All C57Bl/6)
Probes: ~6,000 cDNAs, including 200 related to lipid metabolism.

M = log2(R / G)
A = log2(R * G) / 2
Plot: M vs. A
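The M and A definitions on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `ma_values` and the sample intensities are made up, and background-corrected, strictly positive intensities are assumed.

```python
import math

def ma_values(red, green):
    """Return (M, A) lists for paired red/green spot intensities.

    M = log2(R / G) is the log-ratio, A = log2(R * G) / 2 is the
    average log-intensity. Assumes strictly positive intensities.
    """
    m_vals, a_vals = [], []
    for r, g in zip(red, green):
        if r <= 0 or g <= 0:
            raise ValueError("intensities must be positive")
        m_vals.append(math.log2(r / g))
        a_vals.append(0.5 * math.log2(r * g))
    return m_vals, a_vals

# Hypothetical intensities, for illustration only.
red = [400.0, 1600.0, 100.0]
green = [100.0, 1600.0, 400.0]
M, A = ma_values(red, green)
```

Swapping R and G for a spot flips the sign of M but leaves A unchanged, which is why the M vs. A plot separates the dye effect (vertical axis) from overall spot brightness (horizontal axis).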

Normalization - Median
- Assumption: changes roughly symmetric.
- First panel: smoothed densities of log2 G and log2 R.
- Second panel: M vs. A plot with the median set to zero.
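Median normalization subtracts one constant c from every log-ratio so that the normalized values have median zero. A minimal sketch, with a hypothetical function name and made-up log-ratios:

```python
import statistics

def median_normalize(m_vals):
    """Subtract the median log-ratio so the normalized M has median zero.

    Returns the normalized values and the constant c that was removed.
    """
    c = statistics.median(m_vals)
    return [m - c for m in m_vals], c

# Hypothetical log-ratios with an overall bias toward one channel.
M = [0.9, 0.4, 0.5, 0.1, 1.6]
M_norm, c = median_normalize(M)

# Subtracting c from log2(R/G) is the same as rescaling G by k = 2**c.
k = 2 ** c
```

Because c is a single number, this corrects only an overall imbalance between the channels; it cannot remove a bias that varies with intensity.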

Normalization - Lowess
- Global lowess
- Assumption: changes roughly symmetric at all intensities.
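Global lowess normalization replaces the constant c with a smooth curve c(A) fitted to the M vs. A scatter and subtracts it spot by spot. The sketch below is a simplified stand-in for lowess, not the routine used in the talk: it does a tricube-weighted local linear fit with no robustness iterations, and the names `lowess_fit`, `lowess_normalize` and the span `frac` are assumptions made for illustration.

```python
def lowess_fit(x, y, frac=0.4):
    """Simplified lowess: tricube-weighted local linear fit at each x[i].

    No robustness iterations are performed (full lowess adds them).
    """
    n = len(x)
    k = max(2, min(n, int(round(frac * n))))  # neighbours in each window
    fitted = []
    for i in range(n):
        dists = sorted(abs(xj - x[i]) for xj in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to k-th nearest point
        w = [max(0.0, 1.0 - (abs(xj - x[i]) / h) ** 3) ** 3 for xj in x]
        sw = sum(w)
        xbar = sum(wj * xj for wj, xj in zip(w, x)) / sw
        ybar = sum(wj * yj for wj, yj in zip(w, y)) / sw
        sxx = sum(wj * (xj - xbar) ** 2 for wj, xj in zip(w, x))
        sxy = sum(wj * (xj - xbar) * (yj - ybar)
                  for wj, xj, yj in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(ybar + slope * (x[i] - xbar))
    return fitted

def lowess_normalize(m_vals, a_vals, frac=0.4):
    """Subtract the fitted intensity-dependent location c(A) from M."""
    c_of_a = lowess_fit(a_vals, m_vals, frac)
    return [m - c for m, c in zip(m_vals, c_of_a)]

# Synthetic data: a dye bias that depends only on intensity A.
A = [6.0 + 0.5 * i for i in range(12)]
M = [0.3 * a - 2.0 for a in A]
M_norm = lowess_normalize(M, A)
```

In this synthetic case the bias is exactly linear in A, so the local fits recover it and the normalized log-ratios are essentially zero. On real data, a production lowess implementation with robustness iterations would be the safer choice, since a handful of truly changed genes should not drag the curve.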

Normalization - Print-tip-group
- Assumption: for every print-tip-group, changes roughly symmetric at all intensities.
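Print-tip-group normalization applies the location fit separately to the spots printed by each tip. The sketch below shows only the grouping logic with a pluggable fit function. The names are hypothetical, and `median_fit` is a deliberately simple stand-in: to follow the slide, a lowess fit of M on A would be passed in instead.

```python
import statistics
from collections import defaultdict

def normalize_by_group(m_vals, a_vals, groups, fit):
    """Apply a location fit separately within each print-tip-group.

    fit(a_sub, m_sub) must return the fitted location c for every spot
    in the subset, in the same order.
    """
    index = defaultdict(list)
    for i, g in enumerate(groups):
        index[g].append(i)
    out = [None] * len(m_vals)
    for idx in index.values():
        a_sub = [a_vals[i] for i in idx]
        m_sub = [m_vals[i] for i in idx]
        c_sub = fit(a_sub, m_sub)
        for i, c in zip(idx, c_sub):
            out[i] = m_vals[i] - c
    return out

def median_fit(a_sub, m_sub):
    """Stand-in fit (assumption): one constant per group, ignoring A."""
    c = statistics.median(m_sub)
    return [c] * len(m_sub)

# Hypothetical spots from two print tips with different offsets.
M = [0.5, 0.7, 0.6, -0.2, -0.4, -0.3]
A = [8.0, 9.0, 10.0, 8.5, 9.5, 10.5]
tips = [1, 1, 1, 2, 2, 2]
M_norm = normalize_by_group(M, A, tips, median_fit)
```

Keeping the fit pluggable makes the design choice explicit: the grouping is independent of the smoother, so the same function serves for median, global-style, or lowess fits within each group.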

M vs. A - after print-tip-group normalization

Effects of Location Normalization
- Before normalization
- After print-tip-group normalization

Within print-tip-group box plots for print-tip-group normalized M

Taking scale into account
Assumption: all print-tip-groups have the same spread.
- The true log-ratio is m_ij, where i indexes print-tip-groups and j indexes spots.
- The observed log-ratio is M_ij, where M_ij = a_i * m_ij.
- A robust estimate of a_i (up to a constant common to all groups) is
  MAD_i = median_j { |M_ij - median_j(M_ij)| }
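The scale step can be sketched as follows. The talk's summary says each group's MAD is adjusted to equal the geometric mean of the MADs over all print-tip-groups, so this sketch estimates a_i as MAD_i divided by that geometric mean and divides each log-ratio by it. Function names and data are hypothetical, and the log-ratios are assumed to be location-normalized already.

```python
import math
import statistics
from collections import defaultdict

def mad(values):
    """Median absolute deviation from the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])

def scale_normalize(m_vals, groups):
    """Rescale each group so its MAD equals the geometric mean of all MADs.

    Returns the rescaled log-ratios and the estimated scale factors a_i.
    """
    by_group = defaultdict(list)
    for m, g in zip(m_vals, groups):
        by_group[g].append(m)
    mads = {g: mad(v) for g, v in by_group.items()}
    if any(s == 0 for s in mads.values()):
        raise ValueError("a group has zero MAD; cannot rescale")
    geo_mean = math.exp(sum(math.log(s) for s in mads.values()) / len(mads))
    a_hat = {g: s / geo_mean for g, s in mads.items()}
    return [m / a_hat[g] for m, g in zip(m_vals, groups)], a_hat

# Hypothetical centred log-ratios: tip 2 is four times as spread out as tip 1.
M = [-1.0, 0.0, 1.0, -4.0, 0.0, 4.0]
tips = [1, 1, 1, 2, 2, 2]
M_scaled, a_hat = scale_normalize(M, tips)
```

Using the geometric mean makes the estimated factors multiply to one, so the rescaling changes the relative spread of the groups without inflating or shrinking the slide as a whole.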

Effect of location + scale normalization

Effect of location + scale normalization

Comparing different normalization methods

Follow-up Experiment
- 50 distinct clones with the largest absolute t-statistics from the first experiment.
- 72 other clones.
- Each clone spotted 8 times.
- Two hybridizations:
  - Slide 1: treatment (ttt) -> red, control (ctl) -> green.
  - Slide 2: treatment (ttt) -> green, control (ctl) -> red.

Follow-up Experiment

Paired-slides: dye swap
- Slide 1: M = log2(R/G) - c
- Slide 2: M' = log2(R'/G') - c'
Combine by subtracting the normalized log-ratios:
(M - M')/2 = [ (log2(R/G) - c) - (log2(R'/G') - c') ] / 2
           = [ log2(R/G) + log2(G'/R') ] / 2
           = [ log2(RG'/(GR')) ] / 2, provided c = c'.
Assumption: the separate normalizations are the same.
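The cancellation of c and c' can be checked numerically. In this sketch the function name and the intensities are made up: a spot with a true four-fold change is given the same dye bias of 0.5 on the log2 scale on both slides, and the combined value recovers log2(4) = 2 without either slide being normalized first.

```python
import math

def self_normalize(r1, g1, r2, g2):
    """Combine a dye-swap pair spot by spot: (M - M') / 2 on raw log-ratios.

    r1, g1 are the red/green intensities on slide 1; r2, g2 those on the
    dye-swapped slide 2. No per-slide normalization is applied.
    """
    combined = []
    for R, G, Rp, Gp in zip(r1, g1, r2, g2):
        m = math.log2(R / G)
        m_swap = math.log2(Rp / Gp)
        combined.append(0.5 * (m - m_swap))
    return combined

# Hypothetical spot: true log2 fold change 2, dye bias 0.5 on both slides.
bias = 0.5
r1, g1 = [100.0 * 2 ** (2.0 + bias)], [100.0]   # treatment labelled red
r2, g2 = [100.0 * 2 ** (-2.0 + bias)], [100.0]  # treatment labelled green
estimate = self_normalize(r1, g1, r2, g2)[0]
```

If the two slides had different biases, the difference (c - c')/2 would remain in the estimate, which is why the assumption that the separate normalizations are the same needs to be verified.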

Verify Assumption

Result of Self-Normalization
Plot of (M - M')/2 vs. (A + A')/2

Summary
Case 1: A few genes that are likely to change
- Within-slide:
  - Location: print-tip-group lowess normalization.
  - Scale: for all print-tip-groups, adjust the MAD to equal the geometric mean of the MADs over all print-tip-groups.
- Between slides (experiments):
  - An extension of within-slide scale normalization (future work).
Case 2: Many genes changing (paired-slides)
- Self-normalization: taking the difference of the two log-ratios.
- Check using controls or known information.

http://www.stat.berkeley.edu/users/terry/zarray/Html/
Technical Reports from Terry's group: http://www.stat.Berkeley.EDU/users/terry/zarray/Html/papersindex.html
- Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data
- Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments
- Comparison of methods for image analysis on cDNA microarray data
- Normalization for cDNA Microarray Data
Statistical software R: http://lib.stat.cmu.edu/R/CRAN/

Acknowledgments
Terry Speed, Sandrine Dudoit, Natalie Roberts, Ben Bolstad, Matt Callow (LBL), John Ngai's Lab (UCB), Percy Luu, Dave Lin, Vivian Pang, Elva Diaz