Methodological study of affine transformations of gene expression data with proposed normalization method

Henrik Bengtsson and Ola Hössjer

Centre for Mathematical Sciences
Mathematical Statistics
Lund Institute of Technology,
Lund University

ISSN 1403-9338
A detailed methodological study of affine models for gene expression data is carried out. We focus on two-channel comparative studies, although the findings are not restricted to them. We find that the affine model can explain many of the commonly observed discrepancies and systematic effects in gene expression levels and in the observed log-ratios. We also focus on data obtained with two-color spotted microarray technology, but most of the discussion applies equally well to other gene expression techniques, such as single-channel hybridization methods and quantitative real-time PCR. The affine model can also explain the non-linear systematic effects commonly observed when log-ratios obtained by different gene expression technologies are compared. A high-quality cDNA microarray data set is used to demonstrate the power of the affine model. In light of the affine model, the strengths and weaknesses of the most commonly used normalization methods are discussed. Based on the affine model, we propose a novel method to normalize one or multiple arrays simultaneously, where each array has been hybridized with one, two or more samples. A package with all the methods needed to read and normalize the data has been written in the R language and is freely available.
Key words:
microarrays; affine transformation; bias; logarithm; non-linear systematic effects; robust normalization; background correction
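
The abstract's claim that an affine model can produce non-linear systematic effects in log-ratios can be illustrated with a small numerical sketch. It assumes that each channel's observed signal is an offset plus a scale factor times the true expression level. The function names, the symbols, and all numeric values below are illustrative assumptions, not the authors' parameterization or their R package.

```python
import math


def observed(x, offset, scale):
    """Assumed affine model: observed signal = offset + scale * true signal."""
    return offset + scale * x


def log_ratio_and_intensity(x_red, x_green, off_red, sc_red, off_green, sc_green):
    """Return (M, A): the log2-ratio and the mean log2-intensity of two channels."""
    y_red = observed(x_red, off_red, sc_red)
    y_green = observed(x_green, off_green, sc_green)
    m = math.log2(y_red / y_green)
    a = 0.5 * math.log2(y_red * y_green)
    return m, a


# Hypothetical non-differentially expressed genes: the true level is the
# same in both channels, so the true log-ratio is exactly zero.
true_levels = [2.0 ** k for k in range(4, 17, 2)]

# Hypothetical channel offsets (200 vs 50) with equal scale factors.
curve = [log_ratio_and_intensity(x, x, 200.0, 1.0, 50.0, 1.0) for x in true_levels]

for x, (m, a) in zip(true_levels, curve):
    print(f"true level {x:8.0f}  A = {a:6.2f}  M = {m:6.3f}")
```

Under these assumptions the observed log-ratio M is far from zero at low intensities and approaches zero at high intensities, so it depends on intensity in a curved way even though the underlying transformation is affine. With both offsets set to zero, M reduces to the constant log2 of the ratio of the scale factors.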