J Appl Physiol 99: 397-413, 2005;
doi:10.1152/japplphysiol.00050.2005
8750-7587/05 $8.00
INVITED REVIEW
Transcriptional profiling of tissue plasticity: role of shifts in gene expression and technical limitations
Martin Flück,1 Christoph Däpp,1 Silvia Schmutz,1 Ernst Wit,2 and Hans Hoppeler1
1Department of Anatomy, Bern, Switzerland; and 2Department of Statistics, University of Glasgow, Glasgow, United Kingdom
ABSTRACT
Reprogramming of gene expression has been recognized as a main instructive modality for the adjustments of tissues to various kinds of stress. The recent application of gene expression profiling has provided a powerful tool to elucidate the molecular pathways underlying such tissue remodeling. However, the biological interpretations of expression profiling results critically depend on normalization of transcript signals to mRNA standards before statistical evaluation. A hypothesis is proposed whereby the "fluctuating nature" of gene expression represents an inherent limitation of the test system used to quantify RNA levels. Misinterpretation of gene expression data occurs when RNA quantities are normalized to a subset of mRNAs that are subject to strong regulation. The contention of contradictory biological outcomes using different RNA-normalization schemes is demonstrated in two models of skeletal muscle plasticity with data from custom-designed microarrays and biochemical and ultrastructural evidence for correspondingly altered RNA content and nucleolar activity. The prevalence of these biological constraints is underlined by a literature survey in different models of tissue plasticity with emphasis on the unique malleability of skeletal muscle. Finally, recommendations on the optimal experimental layout are given to control biological and technical variability in microarray and RT-PCR studies. It is proposed to approach normalization of transcript signals by measuring total RNA and DNA content per sample weight and by correcting for concurrently estimated endogenous standards such as major ribosomal RNAs and spiked RNA and DNA species. This allows for later conversion to diverse tissue-relevant references and should improve the physiological interpretations of phenotypic plasticity.
phenotype; standard; reference base; muscle; sampling
A GROWING BODY OF EVIDENCE implies that alterations in gene expression are involved to a significant extent in the unique response of cells and tissues to external stressors (18, 21, 41, 45, 54, 55, 86). Transcriptional profiling has evolved into a powerful tool to explore the molecular mechanisms underlying such adaptation. However, strong expressional adaptations, which may impede the RNA-based normalization of transcript signals, are often not taken into account in the biological interpretation of expression profiling experiments. This has been noted in recent review articles (21, 115) and is expressed in the current debate on standards of gene expression data (9, 13, 22, 36, 62–64, 91, 101, 115, 118, 122). To date, specific recommendations to correct this biological bias have not been formulated.
The main goal of this article is to lay out the molecular-biological and technical concepts that underlie the investigation of gene expressional alterations and biological processes of tissue adaptations. We will discuss the pertinent cell-biological factors that influence the outcome of transcriptional profiling and the ensuing biological adjustments. A particular focus will be on the pulsatile nature of nuclear gene expression, its implication for tissue sampling, as well as the concomitant bias for the study of skeletal muscle plasticity after the single or repeated impact of exercise stress. The focus on this latter phenomenon is motivated by the unique opportunity to investigate the specific molecular mechanisms underlying phenotypic plasticity in an adult, fully differentiated tissue (45). The investigation of skeletal muscle plasticity is of major clinical relevance as it holds the key for the understanding of systemic metabolic disease(s) and muscle disorders (28). Consequently, the possibility for repeated biopsy sampling may also provide important clues for the unraveling of the cellular strategies underlying adaptive processes in other human tissues.
Our own experimental data and a literature survey will be provided to underline the effect of an overshoot in total transcript number on the normalization and interpretation of expression data during tissue adaptation. Biological and technical limitations and issues related to interexperimental reproducibility are discussed, and an "optimal" layout of expression profiling experiments is proposed. It is concluded that, in tissue plasticity experiments, systematic tissue sampling has to precede RNA level determinations and that the estimated transcript levels have to be related to absolute reference parameters such as muscle weight, total RNA content, and nuclear number.
CONCEPT OF TRANSCRIPTIONAL REPROGRAMMING
Gene expression is the process by which genetic information is transcribed from deoxyribonucleic acid (DNA) into diffusible messenger ribonucleic acids (mRNA) (111). Nuclei and mitochondria are the cellular entities involved in the storage and retrieval of genetic information. Normally, nuclear-encoded mRNAs are exported into the cytoplasm, where they are translated into peptide chains by the ribosomal machinery. Ribosomes themselves consist largely (i.e., to 60–65%) of four ribosomal RNAs (18S, 28S, 5.8S, and 5S rRNA). These latter transcripts complex with multiple proteins to form a large and a small subunit (75, 82, 84). For mitochondrially encoded genes, translation takes place in the mitochondrial matrix by a mitochondrial ribosome pool that resembles the prokaryotic ribosomes (57, 84). mRNAs therefore provide the building plan for the biosynthesis of protein-based cellular structures and enzymes.
The synthetic activity of nuclei is subject to multiple levels of control and appears to be pulsatile in nature (16, 30). This noise is gene specific (117) and, for distinct genes, leads to variations of expression levels that may span orders of magnitude (96). A recent twin study suggests that factors involved in signaling and inflammation show the most substantial variations in gene transcript levels (112, 126). With respect to syncytial skeletal muscle fibers, these stochastic fluctuations in gene expression produce regional effects along the same muscle fiber and between different muscle fibers. For instance, Newlands et al. (103) found that both endogenous muscle-specific and housekeeping genes and transgenes are not expressed equivalently during muscle development and regeneration. Similarly, a high variability (i.e., patchy pattern) in the muscular gene expressional response of the immediate early genes c-jun and c-fos was observed after eccentric exercise (29, 114). This indicates that the transcriptional activity of the 1,000–10,000 myonuclei contained in an individual myofiber (and associated cells) is regulated independently and hence produces regional effects. Conversely, the even distribution of the highly abundant myosin mRNA along the myofiber (119) suggests that the consequences of stochastic gene expression depend on the abundance of a gene transcript. In this regard, it should be noted that the repeated impact of external stimuli, such as regular endurance exercise, improves the correlation of expression profiles in a steady state, probably because of enhanced uniformity of slow muscle fiber types (154).
EXPRESSION STUDIES IN TISSUE PLASTICITY
Since the first application of gene analytic tools in the late 1980s, a role of transcriptional reprogramming has been invoked in various phenotypical adjustments of mammalian tissues (18, 26, 40, 45, 78, 116, 125). Reprogramming of gene transcription therefore appears as the main instructive modality for adjustments of form and function of diverse tissues to external stimuli (18, 21, 41, 45, 54, 55, 86).
Gene expression profiling is hence considered a key technology for understanding the biology of tissue plasticity as well as pathological disorders (6, 21, 54, 55, 86, 156). The power of this approach has accelerated progress in understanding the interrelatedness of the complex biological processes in adapting tissue (29, 34, 108, 131, 154, 155). In particular, there has been a spurt of investigations into the molecular strategies underlying the unique malleability of skeletal muscle tissue, which specifically adjusts its structural-functional makeup in response to a variety of physiological stimuli.
BASIC TECHNIQUES FOR AND SHORTCOMINGS OF TRANSCRIPTIONAL PROFILING
Over the past few years, there have been massive improvements in the methodology to carry out gene expression analyses. Several high-throughput approaches are currently in routine use for gene expression profiling. The basic principle of all these techniques relies on the uniform chemical stability of deoxy- and ribonucleic acids and the specific ability of nucleic acids to form complementary double strands by hybridization. A short overview of the principal characteristics of the most popular gene expression profiling techniques, as well as their advantages and disadvantages, is presented in Table 1. For a more comprehensive description of these technologies, the reader is referred to the references cited in this table.
BIAS OF NORMALIZATION TO STANDARDS
Expression profiling techniques involve multiple steps, several of which are critical (summarized in Fig. 1). High-throughput expression technologies such as microarrays produce huge amounts of related data from a single experiment, posing major challenges for mining and statistical analysis of expression data as well as the biological interpretation (32, 85, 86).

Fig. 1. Common steps in transcriptional profiling. The sketch points out the dynamic relationship between RNA content, reaction efficiencies, data normalization, and statistical outcome. The steps where improvements are recommended are indicated in italics.
Major efforts have been devoted in the past years to develop statistical tools to adequately analyze microarray data (Table 2; Refs. 3, 56, 85, 115). It is therefore assumed that the scientific audience is aware of the "mathematical" issues. By contrast, biological considerations for data collection and treatment remain largely undefined or ignored (85, 104, 115, 128). A salient topic in this context is the normalization of gene expression data. Normalization of expression signals before statistical testing is generally recommended to reduce systematic experimental variation and to maximize the consistency between measurements (21, 115, 128). Biological guidelines for the normalization procedure have not been formulated so far (115, 128).
Today, different standards are used for PCR and microarray experiments. Common standards for normalization of PCR data include β-actin, β2-microglobulin, cyclophilin A, GAPDH, hypoxanthine guanine phosphoribosyltransferase mRNA, and 18S and 28S rRNA (14, 62, 91, 94, 101, 108, 114). Microarray data are frequently normalized to the (global) sum of hundreds to thousands of expression signals (21). Both normalization procedures fail when the content of total or messenger RNA changes massively in relation to tissue structure, and they may then lead to misinterpretations. This problem does not manifest itself as long as only RNA molecules are analyzed. The parallel estimation of structural adjustments, however, indicates important shifts in cell number and structural (volumetric) adaptations in response to various physiological stressors (see below). These observations imply that purely RNA-based normalization procedures may mask the identification of biological processes that underlie phenotypic responses of cells and tissues.
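The masking effect of a purely RNA-based reference can be illustrated with a minimal sketch. It assumes a hypothetical three-transcript "array" and an assumed scenario in which every transcript doubles per milligram of tissue; the transcript names, the values, and the helper functions `normalize_to_global_sum` and `fold_change` are illustrative only and are not taken from the data discussed here.

```python
def normalize_to_global_sum(signals):
    """Divide each signal by the sum of all signals on the (hypothetical) array."""
    total = sum(signals.values())
    return {gene: value / total for gene, value in signals.items()}


def fold_change(before, after):
    """Ratio after/before for every transcript."""
    return {gene: after[gene] / before[gene] for gene in before}


# Hypothetical transcript amounts per mg of tissue (arbitrary units).
control = {"transcript_a": 100.0, "transcript_b": 50.0, "transcript_c": 200.0}
# Assumed scenario: every transcript doubles per mg of tissue after the stimulus.
stimulated = {gene: 2.0 * value for gene, value in control.items()}

# Fold change relative to tissue weight: the doubling is visible.
per_tissue_fc = fold_change(control, stimulated)
# Fold change after global-sum normalization: the doubling cancels out.
global_sum_fc = fold_change(
    normalize_to_global_sum(control), normalize_to_global_sum(stimulated)
)

for gene in control:
    print(gene, per_tissue_fc[gene], round(global_sum_fc[gene], 6))
```

Under these assumptions the tissue-related fold change is 2 for every transcript, whereas the globally normalized fold change is 1, so a general rise in RNA content would go unnoticed.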
CASE FOR IMPORTANT SHIFTS IN THE REFERENCE BASE BY TRANSCRIPTOME AND STRUCTURE ANALYSIS
The inappropriateness of RNA-based normalization of expression signals is supported by our laboratory's recent data from a low-density ATLAS cDNA microarray filter approach (34). Nylon filters were custom designed to quantify the expression of muscle-relevant transcripts and various commonly used standards. The RT step was modified to permit the "copying" of primer-specified mRNA sequences and of the heavily abundant ribosomal 18S and 28S RNA species in two parallel RT reactions. The mRNA- and rRNA-derived labeled templates were mixed and hybridized to a nylon support carrying corresponding cDNA probes (Fig. 2). This approach offered a high specificity and allowed normalization of RNA levels in relation to commonly used mRNA standards (GAPDH, β-actin), ribosomal RNA species (18S, 28S), or total mRNA (i.e., total cDNA as spotted on the custom filter).

Fig. 2. Custom design for transcriptome analysis of skeletal muscle. A: experimental setup for the synthesis of cDNA templates from human mRNAs and 18S/28S ribosomal RNAs for the microarray analysis with low-density cDNA filters (BD Clontech Atlas cDNA microarrays). B: custom design of the human Atlas cDNA nylon array. The cDNA double spots for common standards (18S, 28S, GAPDH, β-actin) are denoted in gray, and some interesting sectors are indicated.
The use of this custom-designed microarray provided evidence for fine alterations of commonly used PCR mRNA standards relative to ribosomal 28S RNA in biopsies of human m. vastus lateralis during recovery from endurance exercise (Fig. 3, A and B, Fig. 4A). For example, the GAPDH mRNA level that has been proposed to serve as a stable endogenous RNA standard after endurance exercise (91) undergoes a 50% increase within 8 h of recovery from endurance training (Fig. 3A). The enhanced abundance of β-actin mRNA did not reach the level of statistical significance. Normalization to GAPDH led to a trend toward a reduction (40%, P = 0.09) of relative myosin heavy chain IIX mRNA levels 1 and 24 h after recovery from exercise (Fig. 3B). Conversely, normalization to ribosomal 28S rRNA or β-actin produced no significant trend. This argues that some members of the set of "accepted" transcriptional standards in skeletal muscle tissue (i.e., GAPDH) are altered as an acute consequence of the muscle's adaptation to endurance exercise.
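A short numerical sketch shows how a shifting standard alone can create an apparent change in a target transcript. Only the 50% rise of GAPDH is taken from the text; the assumption that the target and 28S rRNA stay constant, the unit starting values, and the helper names `relative_to_standard` and `apparent_fold_change` are hypothetical and serve illustration only.

```python
def relative_to_standard(levels, standard):
    """Express every level as a ratio to the chosen standard."""
    return {name: value / levels[standard] for name, value in levels.items()}


# Hypothetical levels before exercise (all set to 1, arbitrary units).
before = {"MHC_IIX": 1.0, "GAPDH": 1.0, "rRNA_28S": 1.0}
# Assumed levels after exercise: only GAPDH rises by 50%; the target
# transcript and 28S rRNA are assumed unchanged.
after = {"MHC_IIX": 1.0, "GAPDH": 1.5, "rRNA_28S": 1.0}


def apparent_fold_change(target, standard):
    """Fold change of the target after normalization to the given standard."""
    pre = relative_to_standard(before, standard)[target]
    post = relative_to_standard(after, standard)[target]
    return post / pre


vs_gapdh = apparent_fold_change("MHC_IIX", "GAPDH")
vs_28s = apparent_fold_change("MHC_IIX", "rRNA_28S")
print(f"normalized to GAPDH: {vs_gapdh:.3f}")
print(f"normalized to 28S rRNA: {vs_28s:.3f}")
```

In this constructed case the unchanged target appears reduced by about one-third when related to GAPDH, and unchanged when related to 28S rRNA.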

Fig. 3. Inconsistency between different normalization schemes. Influence of different normalization schemes in a human model (A and B) for endurance exercise and the mouse hindlimb suspension-reloading model (C and D). Data from custom-designed low-density cDNA microarrays were background corrected and normalized to single housekeepers (GAPDH, β-actin) or ribosomal 28S rRNA or the total cDNA signal. Finally, normalized values were related to the level before the intervention (set to 1). A and B: time course of transcriptional response in human m. vastus lateralis to 30 min of ergometer exercise at the anaerobic threshold. *, P < 0.05 and 0.10, respectively, vs. before the exercise (n = 6, one-way ANOVA, Fisher's post hoc test). C and D: effect of 7 days of unloading (by hindlimb suspension, white arrow) and subsequent reloading (black arrow) for 1 and 7 days on transcript levels in mouse m. soleus (n = 5, *, P < 0.05 vs. control or suspension, respectively, one-way ANOVA, Fisher's post hoc test). E and F: effect of un- and reloading on RNA concentration and content (E) and nuclear activity (F). F: representative examples of myonuclei in control, unloaded, and reloaded mouse muscle at same magnification. Microarray data sets have been deposited as entries GSE1293 and GSE2479 at Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo). Arrows and arrowheads indicate myonuclei and nucleoli, respectively. Bar, 5 µm. Note the reduction of nucleolar size in unloaded muscle and the nucleolar swelling with reloading.
Fig. 4. Strategy for the longitudinal study of expressional responses to exercise training in humans. A: outline of a study on the acute and steady-state gene response to exercise in humans. Muscle biopsies are harvested during the time course of recovery from a single bout of exercise ( , E) in the untrained state (U) or in a training steady state (T) after the repetitive impact of exercise. B: summary of the approach used for transcriptional profiling of skeletal muscle plasticity. The 3 steps considered as the main bias are numbered in boxes. These relate to the nonrepresentative sampling or bias in RNA normalization due to acute alterations in mRNA content or longer lasting shifts in total RNA levels.
In a similar experiment, a modified custom-designed cDNA microarray for mouse was developed to relate the transcript level alterations with muscle atrophy and recovery to adjustments of total RNA per muscle weight as well as to ultrastructural modifications (34). This complementary setup demonstrated that many mRNAs undergo transcriptional induction within the first 24 h of reloading of atrophied mouse m. soleus (34). Importantly, the total RNA content (and concentration) per muscle was also significantly increased after 7 days of recovery from muscular atrophy (Fig. 3E; Ref. 34). Consequently, different conclusions about the mechanodependent gene expression in mouse soleus muscle can be drawn with different schemes of mRNA normalization. Transcript level alterations with muscle reloading were more pronounced when related to 28S rRNA than when normalized to the total mRNA signal for the probed transcripts. In the case of the highly abundant myosin heavy chain I transcript, different alterations with reloading of atrophied muscle were noted when normalized to 28S rRNA, total cDNA, or GAPDH values (Fig. 3D). This discrepancy can be traced to a general increase of the less abundant mRNAs and GAPDH that were probed by the filter (see Fig. 3C, Ref. 34). Therefore, "global" normalization of highly abundant mRNAs to the signals of multiple rapidly adapting transcripts minimizes the apparent increase in highly abundant transcripts.
The notion of an important induction of total RNA content (i.e., total RNA per muscle) in the hindlimb suspension-reloading mouse model is supported by morphometric estimates. These indicate an important increase in interstitial cells with days of reloading (34). The ultrastructural analysis further demonstrates the appearance of prominent, swollen nucleoli at this time (Fig. 3F). The latter event is a hallmark of increased nucleolar activity and thus indicates increased production and maturation of ribosomal RNA (65). It thus becomes evident that different biological conclusions are drawn between two muscle states when changes in mRNA levels are related to standards that are themselves subject to change, or when different subsets of mRNAs are used for comparison.
IMPLICATIONS FOR EXPRESSIONAL INVESTIGATIONS ON TISSUE PLASTICITY
The contention of an influence of biological processes on normalization procedures is corroborated by the observation that significant total RNA level changes are a common adaptive response of tissues to physiological challenges or pathological processes (142). For instance, increased workload, dietary interventions, injury, and neoplastic transformation (cancer) are each known to cause marked alterations in RNA and protein content in various tissues (Table 3). Consequently, established RNA standards show considerable regulation during these biological processes (118, 122).
This contention is particularly supported by the analysis of the phenomenon of skeletal muscle plasticity. Various reports document important shifts in total RNA and the mRNA pool during the acute phase (i.e., the first days) of muscular adaptations (Table 4) and emphasize differences in total RNA in a steady state between different muscle phenotypes (Table 5). In particular, an overshoot of mRNA content is observed during the first few days after muscle loading due to strong induction of nuclear gene transcription (Table 4). Likewise, longer lasting pronounced alterations in the content of total RNA and mRNA may pose significant difficulties for proper normalization of gene expression data (see Table 5; Ref. 34). Similarly, significant alterations of housekeepers such as cyclophilin A and β-actin were indicated after a single bout of endurance ergometer exercise (62, 91, 101), and alterations of GAPDH, β2-microglobulin, and hypoxanthine guanine phosphoribosyltransferase were indicated with age and atrophy in rat soleus muscle (108). Hence, when transcriptional alterations in an acutely perturbed or altered steady state are analyzed (see Fig. 4), a universal RNA standard may simply not exist (reviewed in Ref. 64). In these cell-biological situations, the normalization of RNA signals to "fluctuating" or permanently shifted sets of RNA standards represents an important bias. Consequently, the statistically identified transcriptional alterations in distinct tissue models will be fraught with misinterpretations. The definition of appropriate controls (reference system) for data normalization is thus a key element to draw biologically relevant conclusions in expressional reprogramming experiments.
Table 4. Altered nuclear transcriptional activity and cell composition in the acute phase of muscular adaptations
At the heart of the matter, we find that RNA signals cannot easily be related to first, i.e., (volu)metric, principles and that the total RNA content (or absolute number of mRNA molecules) per tissue is rarely estimated in microarray experiments to circumvent the "reference trap" (19, 146). Rather, against biological wisdom, gene expression levels are often normalized to mRNA signals that may fluctuate with respect to main tissue parameters in RT-PCR experiments (122, 133) or that happen to be available on the microarray platform (4, 37, 41). In this manner, the problem posed by the dynamic relationship between changes in the pool of assessed RNAs, the modification of tissue makeup, and data normalization (see also Fig. 1) is not adequately addressed. Even the most recent microarray approaches that hold the "full genome on a chip" (i.e., all sets of transcripts in a cell) may not solve this problem, because the measured microarray signals do not represent the original proportion of transcripts owing to differences in RT efficiency for each transcript and the different binding affinity of each labeled cDNA target to the cDNA or oligonucleotide probe. Neither do they take into account eventual shifts in total RNA. Hence RNA signals that lack a meaningful unit are related to each other rather than to an absolute standard such as a structural entity. When strong RNA regulation is apparent, a complete understanding of the cellular basis of novel features of tissue adaptation therefore depends on complementing gene expressional results with a careful quantitative structural analysis by morphometric means (147).
PITFALLS OF EXPRESSION PROFILING TOOLS RELATING TO NORMALIZATION
There has been a recent interest in verifying the performance of different microarray platforms (10, 21, 90, 92, 135) and the applicability of ribosomal 18S or 28S RNA species as internal standards of RT-PCR experiments in human exercise studies (62, 91, 101). These reports point to a considerable mismatch between different microarray platforms and considerable variability of single PCR standards. Because of the prevalence of these issues in gene expression profiling, these matters are briefly discussed as they relate to the issue of RNA normalization.
Incompatible microarray results.
The comparison of mRNA levels from one experimental condition showed that considerable divergences exist across different cDNA and oligonucleotide microarray platforms (90, 135). This implies that the comparison of expression data for the same tissue state through different platforms may not be feasible.
This shortcoming is explained by a number of technical pitfalls of this technology that relate to the efficiency of reverse transcription and hybridization, as well as the difficulty of normalizing to a biologically valid reference (76, 104). The discrepancies in the comparison of Tan et al. (135) reflect the expected difference in the relative contribution of expression signals for a set of transcripts between oligonucleotide and cDNA microarrays. This bias differently influences the "pixel sum of the normalization factors" between the compared chip families. The discrepancies therefore relate to the reference trap (19) because only a small subset of the measured RNAs serving for the normalization was common to all platforms, i.e., 2,000 transcripts corresponding to 5–10% of all measured transcripts. Imbalances between the pixel sum of the common transcripts and the total sum of transcripts used for normalization on the different microarray platforms provoke shifts in the representation of each transcript signal relative to the true RNA amount. These shifts are the likely causes for the incongruence of expression levels between the oligonucleotide- and cDNA-spotted platforms (10). This example illustrates that normalized transcript signals that result from the relation to a set of mRNA signals, rather than to a muscle-relevant reference base, are not a characteristic of a tissue state. This is because these mRNA-normalized RNA signals largely depend on the normalization procedures and/or normalization factors offered by the technique.
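The dependence of a normalized signal on the probe set can be made concrete with a small sketch. It assumes one hypothetical sample with five transcripts and two hypothetical platforms that share only one probe; all names and amounts, including the helper `pixel_sum_normalized`, are invented for illustration and ignore probe-specific RT and hybridization efficiencies.

```python
def pixel_sum_normalized(amounts, probes):
    """Normalize each probed transcript to the summed signal of all probes
    present on one (hypothetical) platform."""
    total = sum(amounts[p] for p in probes)
    return {p: amounts[p] / total for p in probes}


# Hypothetical true transcript amounts in one and the same sample.
true_amounts = {"t1": 10.0, "t2": 20.0, "t3": 70.0, "t4": 100.0, "t5": 300.0}

# Two hypothetical platforms that have only transcript t1 in common.
platform_a = ["t1", "t2", "t3"]
platform_b = ["t1", "t4", "t5"]

signal_a = pixel_sum_normalized(true_amounts, platform_a)
signal_b = pixel_sum_normalized(true_amounts, platform_b)

print(f"t1 on platform A: {signal_a['t1']:.4f}")
print(f"t1 on platform B: {signal_b['t1']:.4f}")
```

Although the true amount of the shared transcript is identical, its normalized signal differs roughly fourfold between the two constructed platforms, simply because the pixel sums differ.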
The phenomenon is, however, much less pronounced when related microarray platforms, i.e., those spotted with long oligonucleotide probes (70-mer) or multiple short (25-mer) oligonucleotides for each gene (10), are compared or when the significantly altered transcripts of one treatment are compared between microarray platforms (10, 135).
RT-PCR and limitations of reverse transcription.
Recent characterizations imply that 18S and 28S rRNA levels are rather unstable per total RNA in human m. vastus lateralis and that the 28S RNA/total RNA ratio would show a largely variable increase immediately after 90 min of ergometer exercise at moderate intensity (62, 91, 101). These conclusions contradict kinetic considerations and biological observations on the accumulation rate of highly abundant transcripts. Drastic shifts in the major RNA pool of ribosomal 28S RNA species (i.e., 35%) would have to occur within 1 1/2 h (82, 84). Previous observations on acute muscular adaptations to heavy stimuli demonstrate, however, that such a significant increase in total RNA is expected after 1 day at the earliest (Table 4). Therefore, the cautious interpretation of the paradoxical results of these former RT-PCR studies (62, 91, 101) must incorporate possible technical limitations of the RT-PCR technology.
To resolve this issue, we have carried out similar real-time RT-PCR experiments. These indicate that the amount of the 28S rRNA template used in typical preparations, i.e., 0.25–2 µg of total RNA, does not convert proportionally into a corresponding cDNA amount during the reverse transcription reaction using 20 pmol of hexamer primers (data not shown). It appears that under these conditions there is simply not a 1:1 relation between the number of 28S rRNA molecules put into the RT and the relative number of 28S cDNA molecules synthesized during this step. Rather, the calculation shows that with each doubling of input RNA only 10% more cDNA is produced by reverse transcription. This indicates possible kinetic limitations of the reverse transcription reaction conditions for this highly abundant RNA. This conclusion is supported by theoretical considerations that take into account the binding specificity of the 4,096 possible hexaoligonucleotides for the 28S rRNA. The calculations indicate a low primer-to-template ratio of 3 to 25 under the given PCR constraints. Additional considerations on the equilibrium balance (137), the binding of multiple primers per same template, the template switching of the reverse transcriptase (39), and the moderate 6- to 48-fold stoichiometric excess of deoxynucleotides argue that kinetic constraints additionally impede the efficiency of the "conversion" of the highly abundant rRNA into cDNA. Thus kinetic considerations predict for the specific conditions (101) a nonrepresentative reverse transcription of this highly abundant 28S rRNA species into cDNA.
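The arithmetic behind this nonproportional conversion can be sketched as follows. The sketch assumes, for illustration only, that the stated gain of 10% more cDNA per doubling of input holds uniformly over the whole input range, which amounts to a power-law relation; the function name `cdna_yield_ratio` and this uniformity are assumptions, not measured properties.

```python
import math


def cdna_yield_ratio(input_fold, gain_per_doubling=1.10):
    """Relative cDNA yield for a given fold increase of input RNA, assuming
    the same gain for every doubling of input (illustrative assumption)."""
    doublings = math.log2(input_fold)
    return gain_per_doubling ** doublings


# Equivalent power-law exponent: cDNA ~ RNA ** exponent.
exponent = math.log2(1.10)

# Number of distinct hexamer sequences from four nucleotides.
n_hexamers = 4 ** 6

# Going from 0.25 to 2 ug of total RNA is an 8-fold (three doublings) increase.
ratio_8fold = cdna_yield_ratio(2.0 / 0.25)

print(f"power-law exponent: {exponent:.4f}")
print(f"possible hexamers: {n_hexamers}")
print(f"cDNA yield ratio for 8-fold more input: {ratio_8fold:.3f}")
```

Under this assumption an 8-fold increase in input RNA raises the detected 28S cDNA by only about one-third, whereas a proportional reaction (exponent 1) would raise it 8-fold.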
The contention of a kinetic limitation is supported by the observation in similar RT reactions that an increase in the amount of hexamers by two orders of magnitude improved the combined RT and PCR efficiency to near proportionality of input RNA and detected cDNA (91). Kinetic constraints must therefore be considered when heavily abundant transcripts are reverse transcribed, to reduce the variability of amplified cDNA such as that observed for 28S rRNA (91, 101).
Likewise, technical inaccuracies could explain the variability of ribosomal 18S and 28S rRNA level determinations as described by Jemiolo and Trappe (62). Their protocol mentions that rRNA and mRNA species were reverse transcribed from poly-A selected samples with oligo(dT) primers. However, the majority of ribosomal RNAs is primarily not polyadenylated (reviewed in Ref. 95). In fact, rRNAs are expected to be depleted by the poly-A selection, but at a variable yield (142). The observed inconsistent level of amplified 18S rRNA species from poly-A-enriched RNA (62) therefore matches expectations.
RECOMMENDATIONS FOR OPTIMAL EXPERIMENTAL LAYOUT OF GENE EXPRESSION STUDIES
In the following, experimental guidelines are given that are critical to reduce bias related to biological variability and that permit normalization of gene transcript levels to tissue-relevant references. These concern systematic tissue sampling and the collection of tissue-relevant references such as weight, volume, nuclear number, and RNA content. The proposals also include the use of appropriate endogenous controls and the spiking of exogenous RNAs to correct for differences in processing efficiencies and to allow backcalculation of RNA values to tissue references. These latter considerations have to be incorporated into the sampling as well as the processing of tissues and RNA molecules.
Subsequently, critical factors for the controlling of technical variance along with pertinent aspects for the statistical testing of expression profiling studies are briefly summarized. For complementary information, see Table 6.
Tissue sampling.
A strategy should be adopted that permits a reliable representation of the biological processes. The variability arising from differences in nuclear transcriptional activity due to regional variability in cellular composition and imposed stress (5, 22, 42, 114) and stochastic fluctuations in transcript expression (103) has to be taken into account. If the total tissue cannot be isolated, as for example in human experimentation, samples should be collected from equivalent, i.e., anatomically defined, portions of the tissue.
Pooling has been introduced recently as a novel approach to minimize the noise and costs of microarray analysis (54, 152) and to compare two skeletal muscle steady states (24). Because pooling also reduces the sample size (54), there is an important loss of information on the biological variability of the gene response. Care must therefore be taken when a heterogeneous transcriptional response is to be expected. Noncontinuity exists in dystrophic muscle, badly matched patient populations, and eccentric exercise (5, 44, 54, 114). When a pooling strategy is planned, systematic sampling and controlled mixing of samples have to be employed (see Ref. 152). For instance, equal amounts of tissue or RNA per biological replicate, for a sufficient number of pools per treatment, are critical to reveal a "stable" representation of the stochastic, adaptive process being studied.
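The loss of variability information through pooling can be illustrated with a minimal sketch. It assumes six hypothetical biological replicates, two of which respond strongly, that are mixed in equal amounts into two pools of three; the values and the helper `pool_equal_amounts` are invented for illustration and do not model measurement noise.

```python
from statistics import mean, pstdev


def pool_equal_amounts(replicates, pool_size):
    """Average consecutive groups of `pool_size` replicates, mimicking pools
    mixed from equal amounts of RNA per biological replicate."""
    if len(replicates) % pool_size != 0:
        raise ValueError("replicate count must be a multiple of pool_size")
    return [
        mean(replicates[i:i + pool_size])
        for i in range(0, len(replicates), pool_size)
    ]


# Hypothetical fold changes of one transcript; two replicates respond strongly.
replicates = [1.0, 1.0, 4.0, 1.0, 4.0, 1.0]
pools = pool_equal_amounts(replicates, 3)

print("pools:", pools)
print("mean of replicates:", mean(replicates), "mean of pools:", mean(pools))
print("spread of replicates:", round(pstdev(replicates), 3))
print("spread of pools:", pstdev(pools))
```

In this constructed case the pools reproduce the mean response but show no spread at all, so the heterogeneity between responders and nonresponders is no longer visible.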
Tissue-relevant references.
For the purpose of normalization to a biologically relevant standard, it is strongly advised to estimate the weight of the sample that is subjected to isolation of RNA and/or to analyze cellular changes by histology (145). For human experimentation, the tissue volume may be determined instead by physical measures such as computed tomography, magnetic resonance imaging, or dual-energy X-ray absorptiometry to scale the biopsy to total muscle volume. This consideration regarding the collection of tissue-relevant parameters equally applies to RNA isolation (see Table 6). The recommended improvements involve the sampling of an aliquot of the initial tissue extract for later estimation of DNA content and relation of RNA to DNA content via sensitive ribo- or picogreen fluorophores or optical density ratios (OD260/280). Surprisingly, such tissue-internal structural values are rarely reported along with gene expression data (M. Flück, personal observation). Consequently, the observed changes in "bare" mRNA values could reflect shifts in the proportion of RNA to tissue volume and need to be substantiated by validation with alternative techniques.
Endogenous control transcripts.
When shifts in the ratio of the total or messenger RNA pool vs. the tissue volume are evident (108, 142, 148), microarray expression signals should be related to tissue weight or nuclear number rather than being simply normalized to RNA pools. To achieve such normalization, we propose to relate RNA estimates to ribosomal 28S or 18S RNA. These polycistronically transcribed RNA species constitute the great majority, i.e., 70%, of the total RNA pool (82, 84), and therefore best track the total RNA content. Conversely, mRNA accounts for only 2% of the total RNA pool (84). Normalization to the value of a single ribosomal RNA is expected to be robust because this measure arises from the estimation of the largest RNA population. Our characterization of custom-designed low-density cDNA microarrays supports that these major RNA species are well suited for the later conversion to a muscle-relevant reference base in muscle plasticity models where pronounced RNA alterations are apparent (Figs. 2 and 3, Tables 3 and 4). For example, the multiplication of 28S-related signals with the simultaneously determined total amount of RNA per muscle weight (volume) converts the expression signal to kilogram or metric units (34). This reference system permits, under the assumption of a constant proportion of ribosomal per total RNA in muscle tissue (79), the conversion of transcript levels between different RNA and tissue references. The methodology potentially also allows interexperimental comparisons to be drawn. This approach would be of major relevance for many comparative questions such as, for example, the relationship between gene expression in slow- vs. fast-type muscle fibers, because these demonstrate different total RNA and nuclear content (52, 121).
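The conversion described above, i.e., multiplying a 28S-related signal by the total RNA content per muscle weight, can be sketched as follows. This is an illustrative calculation under the stated assumption of a constant proportion of ribosomal to total RNA; the function name `signal_per_tissue` and all numerical values are hypothetical and are not taken from the data of this review.

```python
def signal_per_tissue(transcript_signal, signal_28s, total_rna_ug, tissue_mg):
    """Convert a raw transcript signal into a tissue-related value.

    Step 1: relate the transcript signal to the 28S rRNA signal.
    Step 2: multiply by the total RNA content per mg of tissue.
    Assumes a constant proportion of ribosomal to total RNA.
    """
    if signal_28s <= 0 or tissue_mg <= 0:
        raise ValueError("28S signal and tissue weight must be positive")
    relative_to_28s = transcript_signal / signal_28s
    rna_per_mg = total_rna_ug / tissue_mg
    return relative_to_28s * rna_per_mg


# Hypothetical example: the 28S-related signal is identical in both samples,
# but the treated muscle contains twice as much total RNA per mg of tissue.
control = signal_per_tissue(200.0, 10000.0, total_rna_ug=10.0, tissue_mg=20.0)
treated = signal_per_tissue(200.0, 10000.0, total_rna_ug=20.0, tissue_mg=20.0)
fold_change = treated / control
```

In this assumed case the transcript appears unchanged when related to 28S rRNA alone, whereas the tissue-related value doubles, which is the kind of discrepancy between reference bases that the text warns about.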
Some degree of caution must be exercised because ribosomal RNAs cannot easily be used as a true internal standard. Because of kinetic constraints of the RT step, the cDNA synthesis must be run in two differently diluted samples in parallel (34; Fig. 2A), which increases the technical variability. Additionally, the measurement of ribosomal RNA species may require a custom layout of the microarray platform (see Fig. 2B). Because of these practical constraints, estimated RNA values may initially be related to another appropriate endogenous mRNA before they are related to a tissue parameter.
Provided that detrimental shifts in RNA to tissue volume (or nuclear number) can be ruled out, i.e., in the first hours after exercise (see Tables 3 and 4; Fig. 3), transcript signals from microarray studies may be normalized to a selected pool of measured mRNAs (64, 115, 158). In PCR experiments, transcript signals may be related to single or multiple stably expressed transcripts that were identified by routine testing (14, 108, 136, 143) or statistical modeling (4, 122, 133). Alternatively, the estimation of absolute transcript numbers may be another route for RNA normalization. PCR-based methods applying quantified homologous cDNA references have been described as useful (80, 114, 158). For the estimation of absolute transcript numbers in microarray experiments, a methodology incorporating statistical modeling may be suitable (46). Still, the latter approach would require relation to a concomitantly measured muscle structural feature.
In any case, the careful choice of the primers for reverse transcription is of prime importance to control the bias of possible normalization procedures. When shifts in mRNA content (per tissue) are expected, random primers rather than oligo(dT) primers ought to be used for initiation of cDNA synthesis (see Table 6). This permits later normalization to both mRNA and rRNA species and thus allows better control for shifts in the mRNA-to-total RNA or mRNA-to-tissue proportion.
Spiking of exogenous RNA.
To overcome the limitations in data normalization, it is proposed to track the content of spiked exogenous RNA and cDNA standards during RNA processing. This novel approach is useful to control for the loss of sample during enrichment of mRNA and to correct for shifts in the proportion of messenger to total RNA during sample processing (72, 142). Under the considerations given above, this referencing should also improve the relation of RNA signals to an objective parameter in SI base units such as mass or volume or nuclear number. Currently, such a reference system is not routinely available and demands the synthesis of a suitable RNA template and the setup of a custom PCR or microarray detection system (34, 158).
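How a spiked exogenous standard can be used to correct for losses during processing may be sketched as follows. This is a simplified illustration that assumes that the spike and the endogenous transcripts are lost in equal proportion; the function names, the transcript labels, and the values are hypothetical.

```python
def recovery_factor(spike_input, spike_measured):
    """Fraction of the spiked exogenous RNA recovered after processing."""
    if spike_input <= 0:
        raise ValueError("spike input must be positive")
    return spike_measured / spike_input


def correct_for_recovery(signals, spike_input, spike_measured):
    """Back-calculate transcript signals to the pre-processing level,
    assuming that all RNA species are lost in the same proportion."""
    factor = recovery_factor(spike_input, spike_measured)
    if factor <= 0:
        raise ValueError("no spike signal recovered")
    return {name: value / factor for name, value in signals.items()}


# Hypothetical sample: 100 units of spike added, 80 units recovered.
measured = {"transcript_a": 40.0, "transcript_b": 8.0}
corrected = correct_for_recovery(measured, spike_input=100.0, spike_measured=80.0)
```

The corrected values can then be related to the tissue weight or nuclear number of the starting material rather than to the RNA that survived processing.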
Technical variability.
Special emphasis should be put on the reduction of day-to-day variability. RNA samples that will be directly compared ought to be subjected to the same cycles of physical and chemical treatments. RT reactions should be assembled in parallel from the same master mix. This is the case, for example, for the time course of the gene response in an individual (see Fig. 4). It is also advisable to perform test-retest experiments to validate the reproducibility of the different processing steps. Exogenously added (i.e., spiked) RNA species may be used in conjunction with an optimized experimental design to allow a distinction between day-to-day technical variability in the different RNA processing steps and the biological variability of transcript levels in the samples (72, 142). Finally, kinetic limitations for highly abundant transcripts should be avoided by diluting the RNA sample where necessary and by adjusting primer and NTP concentrations. Possible nonproportionality between input RNA and synthesized 28S cDNA may be reduced by subjecting equal amounts of quantified total RNA to the RT reaction. For further background on the optimization of assays, the reader is referred to a more systematic article (8).
Statistical testing.
Several steps of the statistical analysis are common to most expression data analyses (Fig. 1) and should be carried out according to the available statistical advice (3, 152). Concerning the analysis of microarray data, an initial consideration is to assess the data in a descriptive way to gain an impression of the overall relationship of data sets, which is important for data mining (see Tables 2 and 6). A common aim is to identify differentially expressed transcripts and to filter interindividual association of gene transcripts to synexpression groups and/or phenotypical features (21, 85, 152). The identification of differentially expressed genes should be based on a test that best describes the distribution of the data and that develops sufficient statistical power (Table 2, Ref. 102). In this regard, variance stabilization by simple log-transformation, or by more complex transformations (58), is a generally accepted procedure to make the signal values more comparable in terms of variance before normalization and statistical analysis are carried out (85, 115). This avoids bias in the assignment of P values for a microarray and reduces the number of biological replicates needed to reach statistical significance.
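The effect of a simple log-transformation on the comparability of variances can be illustrated with a short example. This is a minimal sketch with hypothetical replicate signals in which the scatter is assumed to be multiplicative; it uses only the Python standard library and does not represent the more complex transformations cited above.

```python
import math
import statistics


def log2_transform(signals):
    """Log2-transform positive signal values."""
    if any(s <= 0 for s in signals):
        raise ValueError("signals must be positive before log-transformation")
    return [math.log2(s) for s in signals]


# Hypothetical replicate signals of a low- and a high-abundance transcript.
# Both scatter by the same factors (x0.5, x1, x2) around their central value.
low = [50.0, 100.0, 200.0]
high = [5000.0, 10000.0, 20000.0]

# On the raw scale, the spread grows with the signal level.
raw_sd_ratio = statistics.stdev(high) / statistics.stdev(low)

# On the log2 scale, both transcripts show the same spread.
log_sd_ratio = (statistics.stdev(log2_transform(high))
                / statistics.stdev(log2_transform(low)))
```

Under this assumption the standard deviation of the abundant transcript is 100-fold larger on the raw scale, whereas both transcripts have the same standard deviation after transformation, so that a common test can be applied across signal levels.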
Multiplicity error correction.
When multiple transcripts are measured in the absence of a specific hypothesis, it is important to perform a correction for the number of tests performed (33). This reduces the risk of assigning a false positive transcript alteration (i.e., type I error). Several multiplicity error corrections have been proposed (152). To balance the concomitant risk of falsely excluding real positives, i.e., type II errors, the less stringent false discovery rate adjustment rather than the more stringent familywise error rate correction is now widely applied (12). Finally, the choice of significance level may also depend on whether one wants to use microarrays for precise estimates of identified genes or as an exploratory tool in which the number of biological replicates is kept low and only the largest alterations are of interest (for further details, see Refs. 152, 159).
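The difference in stringency between a familywise error rate correction and a false discovery rate adjustment can be illustrated as follows. This sketch uses the Bonferroni correction and the Benjamini-Hochberg step-up procedure as common representatives of the two families; the text above does not prescribe a specific procedure, and the P values are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Familywise error rate control: reject if p <= alpha / number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """False discovery rate control (Benjamini-Hochberg step-up procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject all hypotheses up to and including rank k_max.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Hypothetical P values for ten transcripts.
p_values = [0.001, 0.008, 0.012, 0.030, 0.20, 0.45, 0.60, 0.74, 0.85, 0.95]
n_bonferroni = sum(bonferroni_reject(p_values))
n_fdr = sum(benjamini_hochberg_reject(p_values))
```

For these assumed P values the Bonferroni correction retains one transcript, whereas the false discovery rate adjustment retains three, which reflects the trade-off between type I and type II errors described above.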
These multiplicity error corrections do not normally take into account whether an a priori hypothesis exists for a possible change in expression levels. This issue is critical because each transcript that is not specifically altered by the manipulation under study still increases the number of statistical tests. Consequently, severe multiplicity error corrections bear the risk of excluding many real changes. When a hypothesis can be formulated for certain functional categories on acceptable a priori grounds, it is most effective to tailor the multiplicity error correction individually for each set of hypothesized transcripts.
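The consequence of tailoring the correction to a set of hypothesized transcripts can be sketched with a simple Bonferroni-type calculation. The array size, the transcript names, and the P values below are hypothetical, and the set of hypothesized transcripts is assumed to have been defined before the data were inspected.

```python
def significant_after_bonferroni(p_by_transcript, alpha=0.05):
    """Return the transcripts with p <= alpha / (number of tests in the set)."""
    m = len(p_by_transcript)
    return sorted(t for t, p in p_by_transcript.items() if p <= alpha / m)


# Hypothetical array with 1,000 measured transcripts, most of them unaffected.
all_p = {f"t{i}": 0.5 for i in range(1000)}
all_p.update({"t0": 0.004, "t1": 0.009, "t2": 0.2, "t3": 0.03, "t4": 0.6})

# Five transcripts of one functional category were hypothesized a priori.
hypothesized = ["t0", "t1", "t2", "t3", "t4"]
subset_p = {t: all_p[t] for t in hypothesized}

global_hits = significant_after_bonferroni(all_p)       # threshold 0.05 / 1000
tailored_hits = significant_after_bonferroni(subset_p)  # threshold 0.05 / 5
```

With the correction applied over all 1,000 tests none of the transcripts passes, whereas two of the five hypothesized transcripts pass when the correction is restricted to the hypothesized set.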
Lastly, we would like to comment on the intriguing fact that the issue of multiplicity error correction has so far been neglected in RT-PCR studies. This neglect rests on the argument that these studies are hypothesis driven. This is a sweeping argument, and it is the opinion of the authors that multiplicity error corrections also have to be considered in these kinds of studies when alterations of a multitude of genes are analyzed without an a priori hypothesis.
Verification.
Because the mRNA and total RNA pools may be altered to different relative extents and because of cross-hybridization phenomena, microarray results should be verified by complementary techniques such as PCR or in situ hybridization (see Table 6; Ref. 31). Occasionally it may be necessary to verify microarray data by Western blot analysis. However, protein levels do not necessarily follow an mRNA change, because protein turnover may be subject to additional regulation and because proteomic approaches may be biased by the selective extraction of individual molecules.
CONCLUSIONS
The incompatibility of results in regard to muscle plastic adaptations relates to a large extent to biological fluctuations and permanent shifts in the transcriptional activity of nuclei and in cell composition. In addition to technical variability, these factors impair RNA-based normalization approaches and prohibit cross-referencing of biological interpretations. Consequently, it is proposed to relate transcript expression levels to relevant reference bases such as muscle weight, volume, or nuclear content via correction to endogenous and spiked exogenous RNA standards. The outlined considerations and proposed guidelines apply in principle to transcription profiling in different models of tissue plasticity.
GRANTS
The work was supported by Swiss National Science Foundation Grant 3100-065276.
ACKNOWLEDGMENTS
The assistance of Eduard Babychuck in the translation of Russian scientific literature is gratefully recognized. We thank Thomas Aigner, Oliver Baum, and Dominique Desplanches for helpful suggestions. A special thanks goes to Dr. Rudolf Billeter for defining the technical baseline for the gene expression profiling studies of skeletal muscle in our laboratory.
FOOTNOTES
Address for reprint requests and other correspondence: M. Flück, Dept. of Anatomy, Baltzerstrasse 2, 3000 Bern 9, Switzerland (E-mail: martin.flueck{at}ana.unibe.ch)
REFERENCES
- Allen DL, Linderman JK, Roy RR, Bigbee AJ, Grindeland RE, Mukku V, and Edgerton VR. Apoptosis: a mechanism contributing to remodeling of skeletal muscle in response to hindlimb unweighting. Am J Physiol Cell Physiol 273: C579–C587, 1997.
- Alway SE, Gonyea WJ, and Davis ME. Muscle fiber formation and fiber hypertrophy during the onset of stretch-overload. Am J Physiol Cell Physiol 259: C92–C102, 1990.
- Anderle P, Duval M, Draghici S, Kuklin A, Littlejohn TG, Medrano JF, Vilanova D, and Roberts MA. Gene expression databases and data mining. Biotechniques Suppl: 36–44, 2003.
- Andersen CL, Jensen JL, and Orntoft TF. Normalization of real-time quantitative reverse transcription-PCR data: a model-based variance estimation approach to identify genes suited for normalization, applied to bladder and colon cancer data sets. Cancer Res 64: 5245–5250, 2004.
- Bakay M, Chen YW, Borup R, Zhao P, Nagaraju K, and Hoffman EP. Sources of variability and effect of experimental approach on expression profiling data interpretation. BMC Bioinformatics 3: 4, 2002.
- Baldwin KM. Research in the exercise sciences: where do we go from here? J Appl Physiol 88: 332–336, 2000.
- Bandopadhyay M and Ganguly AK. Putrescine, DNA, RNA and protein contents in human uterine, breast and rectal cancer. J Postgrad Med 46: 172–175, 2000.
- Bank HL and Schmehl MK. Parameters for evaluation of viability assays: accuracy, precision, specificity, sensitivity, and standardization. Cryobiology 26: 203–211, 1989.
- Barber RD, Harmer DW, Coleman RA, and Clark BJ. GAPDH as a housekeeping gene: analysis of GAPDH mRNA expression in a panel of 72 human tissues. Physiol Genomics 21: 389–395, 2005.
- Barczak A, Rodriguez MW, Hanspers K, Koth LL, Tai YC, Bolstad BM, Speed TP, and Erle DJ. Spotted long oligonucleotide arrays for human gene expression analysis. Genome Res 13: 1775–1785, 2003.
- Barnett JG, Holly RG, and Ashmore CR. Stretch-induced growth in chicken wing muscles: biochemical and morphological characterization. Am J Physiol Cell Physiol 239: C39–C46, 1980.
- Benjamini Y, Drai D, Elmer G, Kafkafi N, and Golani I. Controlling the false discovery rate in behavior genetics research. Behav Brain Res 125: 279–284, 2001.
- Bilban M, Buehler LK, Head S, Desoye G, and Quaranta V. Normalizing DNA microarray data. Curr Issues Mol Biol 4: 57–64, 2002.
- Birot OJ, Koulmann N, Peinnequin A, and Bigard XA. Exercise-induced expression of vascular endothelial growth factor mRNA in rat skeletal muscle is dependent on fibre type. J Physiol 552: 213–221, 2003.
- Blacksell SD, Khounsy S, and Westbury HA. The effect of sample degradation and RNA stabilization on classical swine fever virus RT-PCR and ELISA methods. J Virol Methods 118: 33–37, 2004.
- Blake WJ, Kaern M, Cantor CR, and Collins JJ. Noise in eukaryotic gene expression. Nature 422: 633–637, 2003.
- Bland JM and Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1: 307–310, 1986.
- Booth FW and Thomason DB. Molecular and cellular adaptation of muscle in response to exercise: perspectives of various models. Physiol Rev 71: 541–585, 1991.
- Braendgaard H and Gundersen HJ. The impact of recent stereological advances on quantitative studies of the nervous system. J Neurosci Methods 18: 39–78, 1986.
- Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, Stoeckert C, Aach J, Ansorge W, Ball CA, Causton HC, Gaasterland T, Glenisson P, Holstege FC, Kim IF, Markowitz V, Matese JC, Parkinson H, Robinson A, Sarkans U, Schulze-Kremer S, Stewart J, Taylor R, Vilo J, and Vingron M. Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nat Genet 29: 365–371, 2001.
- Butte A. The use and analysis of microarray data. Nat Rev Drug Discov 1: 951–960, 2002.
- Butte AJ, Dzau VJ, and Glueck SB. Further defining housekeeping, or "maintenance," genes. Focus on "A compendium of gene expression in normal human tissues." Physiol Genomics 7: 95–96, 2001.
- Cabric M and James NT. Morphometric analyses on the muscles of exercise trained and untrained dogs. Am J Anat 166: 359–368, 1983.
- Campbell WG, Gordon SE, Carlson CJ, Pattison JS, Hamilton MT, and Booth FW. Differential global gene expression in red and white skeletal muscle. Am J Physiol Cell Physiol 280: C763–C768, 2001.
- Cantin M, Solymoss B, Benchimol S, Desormeaux Y, Langlais J, and Ballak M. Metaplastic and mitotic activity of the ischemic (endocrine) kidney in experimental renal hypertension. Am J Pathol 96: 545–565, 1979.
- Carter EA, Hatz RA, Yarmush ML, and Tompkins RG. Injury-induced inhibition of small intestinal protein and nucleic acid synthesis. Gastroenterology 98: 1445–1451, 1990.
- Casella G and Berger RL. Statistical Inference. Belmont, CA: Duxbury, 1990.
- Chakravarthy MV and Booth FW. Eating, exercise, and "thrifty" genotypes: connecting the dots toward an evolutiona