Return to the LLMPP Homepage
Interactively explore the figures
Search the LLMPP Microarray dataset
View figures from the paper
Data analysis methods and links
View additional tables & figures
Get help with interpretation of the data
Download the primary data
List of authors
Fluorescent images of hybridized microarrays were obtained using the
GenePix 4000 microarray scanner.
Images were analyzed with ScanAlyze.
Single spots or areas of the array with obvious blemishes were flagged and excluded
from subsequent analyses. Fluorescence ratios (along with numerous quality control parameters;
see ScanAlyze manual)
were stored in a custom database. Raw data files for each array containing all measured
values and manual flags are available here.
A set of clones that consistently behaved poorly across arrays was identified and
excluded from all analyses; a list of excluded elements is available
Fluorescence ratios were calibrated independently for each array by applying a
single scaling factor to all fluorescent ratios from each array; this scaling
factor was computed so that the median fluorescence ratio of well-measured spots on each array was 1.0.
All non-flagged array elements for which the fluorescent intensity in each channel was
greater than 1.4 times the local background were considered well-measured. The
ratio values were log-transformed (base 2) and stored in a table (rows=individual
cDNA clones, columns=single mRNA samples). Where samples had been analyzed on multiple arrays,
multiple observations for an array element for a single sample were averaged. Array elements
that were not well-measured on at least 80% of the 96 mRNA samples were excluded. Data for the
remaining genes were centered by subtracting (in log space) the median observed value, to
remove any effect of the amount of RNA in the reference pool. This dataset contains 4,026 array
elements and is available here.
Hierarchical clustering was applied to both axes using the weighted pair-group method with
centroid average (WPGMC) (Sneath and Sokal 1973) as implemented in the program
Cluster. The distance matrixes used were
Pearson correlation for clustering the arrays and the inner product of vectors
normalized to magnitude 1 for the genes (this is a slight variant of Pearson correlation;
see Cluster manual
computational details). The results were analyzed with TreeView.
The datasets used for Figs. 3
and Figs. 4 are also available.