RESEARCH REPORT

Diagnostic plots for detecting outlying slides in a cDNA microarray experiment

Taesung Park 1, Sung-Gon Yi 1, SeungYeoun Lee 2, and Jae K. Lee 3

1 Seoul National University, 2 Sejong University, Seoul, Korea, and 3 University of Virginia, Charlottesville, VA, USA

BioTechniques 38:463-471 (March 2005)

Different sources of systematic and random error variations are often observed in cDNA microarray experiments. A simple scatter plot is commonly used to examine outlying slides that have unusual expression patterns or larger variability than other slides. These outlying slides tend to have large impacts on the subsequent analyses, such as identification of differentially expressed genes and clustering analysis. However, it is difficult to select outlying slides rigorously and consistently based on subjective human pattern recognition on their scatter plots. A graphical method and a rigorous diagnostic measure are proposed to detect outlying slides. The proposed graphical method is easy to implement and shown to be quite effective in detecting outlying slides in real microarray data sets. This diagnostic measure is also informative to compare variability among slides. Two cDNA microarray data sets are carefully examined to illustrate the proposed approach. A 3840-gene microarray experiment for neuronal differentiation of cortical stem cells and a 2076-gene microarray experiment for anticancer compound time-course expression of the NCI-60 cancer cell lines.

INTRODUCTION

Biological processes undergo complex interactions between many genes and gene products. Genome-wide microarray profiling technology has been recognized as a breakthrough to understand such complex gene regulations and interactions simultaneously in biology and medicine (1–4). A cDNA microarray slide consists of thousands of cDNA clones spotted on a high-density glass slide. Each slide is competitively hybridized with two independent mRNA samples, labeled with red (Cy™5) and green (Cy3) fluorescent dyes. Each cDNA clone’s (or gene’s) expression levels can then be measured by reading two fluorescence intensities in the green (G) and red (R) channels for the two RNA samples. The ratio of these two fluorescence intensities at each spot represents the relative abundance of the corresponding cDNA probe (5).

However, in cDNA microarray experiments, different sources of systematic and random error can arise. These may significantly affect the inference on the measured gene expression patterns. A normalization procedure and a variance-stabilizing transformation are commonly employed to remove (or minimize) the artifacts due to such error variation. Several normalization methods have been proposed using parametric and nonparametric statistical models (6–8). Those normalization methods mainly focus on adjusting for the location parameters such as means or medians. Larger variability is often observed at the low log-transformed intensity regions, because at low intensity levels, the background noise is a larger proportion of the observed expression intensity (i.e., lower signal-to-noise ratio), while at high levels of expression intensity, this background noise is dominated by the expression intensity. To obtain homogeneous variability across different intensity regions and genes, variance-stabilizing and other transformation approaches have been suggested, including generalized log transformation (9–14).

While these normalization approaches and variance-stabilizing transformations are useful for adjusting the bias of each individual slide, they do not provide a rigorous statistical criterion to detect outlying slides that have unusual expression patterns or show larger variability than other slides. At an earlier stage of analysis, each microarray slide is often examined graphically using the scatter plot between the two intensity channels to examine the overall patterns and variability. However, such examination is based on subjective human pattern recognition, and outlying slides can frequently enter the subsequent analysis, resulting in unreliable inference on the whole microarray study. Therefore, the main focus of this study is to identify the outlying slides that have unusual nonlinear expression patterns and/or larger variability than other slides in a microarray data set.

We propose the diagnostic plot (DP) approach that succinctly summarizes and detects outlying slides from a cDNA microarray study. The proposed DP is motivated by the observation that adjustment of nonlinear trends between the two intensity channels often results in different degrees of correlation between them. Figure 1 shows the log-scatter plots based on Lowess normalization for the slides from the rat neuronal