Weighted Analysis of Microarray Experiments (WAME)

WAME was generalised step by step to its current form over three papers. For the main results, see the abstracts of the papers below.
The WAME procedure is available as an R package.

The R package

Note that this is an alpha release. Try help(package="WAME") for documented functions.

WAME 0.0.5 (2006-03-17)
WAME 0.0.7 (2006-10-24)
WAME 0.0.8 (2006-12-05) Primary changes: Included help. Try help(package="WAME") for documented functions.
WAME 0.0.9 (2006-12-12) Primary changes: Estimate of Sigma now includes error code from nlm as attribute.
WAME 0.0.10 (2006-12-14) Primary changes: Minor changes in cross.plot.
WAME 0.0.11 (2007-06-12) Minor changes.

Type the following line in R to install the WAME package:

install.packages("WAME", contriburl="http://wame.math.chalmers.se")

To update an installed copy, type:

update.packages(oldPkgs="WAME", contriburl="http://wame.math.chalmers.se")
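
Once installed, the package can be attached and its help index opened. The lines below are a minimal sketch, assuming the repository above is still reachable and the package installs on your version of R; the variable name wame_available is only illustrative.

```r
# Assumes the install.packages() call above completed without error.
wame_available <- requireNamespace("WAME", quietly = TRUE)

if (wame_available) {
  library(WAME)            # attach the package for this session
  help(package = "WAME")   # index of documented functions
} else {
  message("WAME is not installed; run the install line above first.")
}
```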

Send bug reports and suggestions to Anders Sjögren or Erik Kristiansson.

Weighted Analysis of General Microarray Experiments (2007)

Anders Sjögren^*, Erik Kristiansson, Mats Rudemo, and Olle Nerman,
Submitted to BMC Bioinformatics and available as preprint 2007:29, Mathematical Sciences, Chalmers University of Technology, ISSN 1652-9715.
^* Corresponding author: anders.sjogren@math.chalmers.se.

Abstract

Background: In DNA microarray experiments, measurements from different biological samples are often assumed to be independent and to have identical variance. For many datasets these assumptions have been shown to be invalid and typically lead to too optimistic p-values. A method called WAME has been proposed where a variance is estimated for each sample and a covariance is estimated for each pair of samples. The current version of WAME is, however, limited to experiments with paired design, e.g. two-channel microarrays.

Results: The WAME procedure is extended to general microarray experiments, making it capable of handling both one- and two-channel datasets. Two public one-channel datasets are analysed and WAME detects both unequal variances and correlations. WAME is compared to other common methods: fold-change ranking, ordinary linear model with t-tests, LIMMA and weighted LIMMA. The p-value distributions are shown to differ greatly between the examined methods. In a resampling-based simulation study, the p-values generated by WAME are found to be substantially more correct than the alternatives when a relatively small proportion of the genes is regulated. WAME is also shown to have higher power than the other methods. WAME is available as an R-package.

Conclusions: The WAME procedure is generalized and the limitation to paired-design microarray datasets is removed. The examined other methods produce invalid p-values in many cases, while WAME is shown to produce essentially valid p-values when a relatively small proportion of genes is regulated. WAME is also shown to have higher power than the examined alternative methods.
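
The claim in the abstract that assuming independent samples leads to too optimistic p-values can be illustrated with a toy simulation in base R. This is only a sketch under stated assumptions and is not taken from the paper: the number of genes, the sample size and the common correlation rho are invented for illustration, no gene is truly regulated, and the test is an ordinary one-sample t-test rather than WAME itself.

```r
set.seed(1)

n_genes <- 5000   # simulated genes, none truly regulated (assumption)
n       <- 6      # samples per gene, e.g. paired log-ratios (assumption)
rho     <- 0.3    # assumed common correlation between samples

# Equicorrelated null data: a component shared by all samples of a gene
# plus independent noise, so that each observation has variance 1.
shared <- rnorm(n_genes)
noise  <- matrix(rnorm(n_genes * n), nrow = n_genes, ncol = n)
x      <- sqrt(rho) * shared + sqrt(1 - rho) * noise

# Ordinary one-sample t-test per gene, wrongly assuming independence.
m      <- rowMeans(x)
s      <- apply(x, 1, sd)
t_stat <- m / (s / sqrt(n))
p      <- 2 * pt(-abs(t_stat), df = n - 1)

# Share of null genes declared significant at the nominal 5% level.
false_positive_rate <- mean(p < 0.05)
print(false_positive_rate)
```

With valid p-values, about 5% of the null genes would fall below 0.05. Under the positive correlation assumed here the share is clearly larger, because the t-test underestimates the variance of the mean.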

Quality Optimised Analysis of General Paired Microarray Experiments (2006)

Erik Kristiansson^*, Anders Sjögren, Mats Rudemo, and Olle Nerman,
Statistical Applications in Genetics and Molecular Biology: 5(1), Article 10.
^* Corresponding author: erikkr@math.chalmers.se.

Abstract

In microarray experiments, several steps may cause sub-optimal quality and the need for quality control is strong. Often the experiments are complex, with several conditions studied simultaneously. A generalised linear model for paired microarray experiments is proposed as a generalisation of the paired two-sample method by Kristiansson et al. 2005. Quality variation is modelled by different variance scales for different (pairs of) arrays, and shared sources of variation are modelled by covariances between arrays. The gene-wise variance estimates are moderated in an empirical Bayes approach. Due to correlations all data is typically used in the inference of any linear combination of parameters. Both real and simulated data are analysed. Unequal variances and strong correlations are found in real data, leading to further examination of the fit of the model and of the nature of the datasets in general. The empirical distributions of the test-statistics are found to have a considerably improved match to the null distribution compared to previous methods, which implies more correct p-values provided that most genes are non-differentially expressed. In fact, assuming independent observations with identical variances typically leads to optimistic p-values. The method is shown to perform better than the alternatives in the simulation study.

Supplementary figures

ApoAI cross-plot
Cardiac cross-plot

Weighted Analysis of Paired Microarray Experiments (2005)

Erik Kristiansson^*, Anders Sjögren^*⁺, Mats Rudemo, and Olle Nerman,
Statistical Applications in Genetics and Molecular Biology: 4(1), Article 30.
^* Both authors contributed equally, order was randomised.
⁺ Corresponding author: anders.sjogren@math.chalmers.se.

Abstract

In microarray experiments quality often varies, for example between samples and between arrays. The need for quality control is therefore strong. A statistical model and a corresponding analysis method is suggested for experiments with pairing, including designs with individuals observed before and after treatment and many experiments with two-colour spotted arrays. The model is of mixed type with some parameters estimated by an empirical Bayes method. Differences in quality are modelled by individual variances and correlations between repetitions. The method is applied to three real and several simulated datasets. Two of the real datasets are of Affymetrix type with patients profiled before and after treatment, and the third dataset is of two-colour spotted cDNA type. In all cases, the patients or arrays had different estimated variances, leading to distinctly unequal weights in the analysis. We suggest also plots which illustrate the variances and correlations that affect the weights computed by our analysis method. For simulated data the improvement relative to previously published methods without weighting is shown to be substantial.
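
As a reading aid, the way unequal variances and correlations turn into unequal weights can be sketched with the standard generalised least squares mean. The notation is an assumption made for illustration and is not quoted from the paper: X_g is the vector of n paired differences for gene g, with common mean \mu_g and covariance c_g \Sigma, where \Sigma holds the variances and covariances between repetitions and c_g is a gene-specific scale.

```latex
\hat{\mu}_g = w^{\top} X_g,
\qquad
w = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1}},
\qquad
\operatorname{Var}(\hat{\mu}_g) = \frac{c_g}{\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1}}
```

For example, with two uncorrelated repetitions and \Sigma = diag(1, 4), the weights are w = (4/5, 1/5), so the noisier repetition contributes only a fifth of the estimate. Whether the paper's estimator has exactly this form should be checked against the paper itself.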

The supplementary source code is now deprecated and the functionality is available through the R package above.