MSTN genotype (g.66493737C/T) association with speed indices in Thoroughbred racehorses

Emmeline W. Hill, Rita G. Fonseca, Beatrice A. McGivney, Jingjing Gu, David E. MacHugh, Lisa M. Katz


Sequence variation at the equine myostatin gene (MSTN) locus has previously been shown to have a singular genomic influence on optimum race distance in Thoroughbred racehorses. Myostatin, encoded by the MSTN gene, is a member of the TGF-β superfamily that regulates skeletal muscle development in a range of mammalian species including the horse. In the Thoroughbred, the C-allele at the g.66493737C/T SNP has been found at significantly higher frequency in subgroups of the population that are suited to fast, short-distance sprint races, and it also influences body composition phenotypes. We investigated the influence of the g.66493737C/T SNP on speed indexes measured in a cohort of n = 85 Thoroughbred horses-in-training. We found significant associations between genotypes at the g.66493737C/T SNP and all measured speed variables: Dist6 [distance travelled during 6 s before and after maximal velocity (Vmax); P = 0.0040], Vmaxt (duration at Vmax; P = 0.0249), Vmax (P = 0.0265), Dist6b (distance travelled during 6 s before Vmax; P = 0.0032), and Dist6a (distance travelled during 6 s after Vmax; P = 0.0317). For each measure, horses with the C/C and C/T genotypes outperformed T/T horses, indicating the requirement for at least one C-allele to improve speed. For the most significantly associated variables (Dist6 and Dist6b) the C/C cohort performed better than the T/T cohort with the heterozygotes intermediate, indicating a dose-dependent manifestation. These findings clearly indicate that variation at the MSTN gene influences speed in Thoroughbred horses.

  • horse
  • myostatin
  • genetic association
  • performance

Thoroughbred horse racing is highly competitive and commands large financial investments. Currently, the evaluation of prospective racing success relies primarily on pedigree evaluation and observation of the physical characteristics of the horse. A range of scientific approaches have been used in an attempt to associate physiological characteristics with racing performance for the prediction of racing aptitude; these include upper airway function (15), lower limb radiographic appearance (36), heart size (43, 44), skeletal muscle fiber type (3, 33, 34, 44), musculoskeletal conformation (13), postexercise lactate concentration (9), speed at maximal heart rate (16), hematological measurements (32), and other physiological variables (18). However, there is no clear consensus among researchers regarding correlations between any of these parameters and racing potential.

The availability of the equine genome sequence (41) and the parallel development of molecular genomics platforms for the horse have rapidly enabled the identification of genomic sequence variants associated with athletic performance phenotypes in Thoroughbred horses (4, 17, 20–22, 38). Consensus is building that a sequence polymorphism in the first intron of the equine myostatin gene (g.66493737C/T) is a powerful predictor of suitability for racing at various distance ranges (4, 20, 22, 37, 38). Individuals homozygous for the C-allele (i.e., C/C) have been shown to compete preferentially in shorter distance races (1,000–1,600 m), whereas C/T horses are best suited to middle-distance races (1,400–2,400 m) and T/T horses have greater stamina and tend to excel in longer distance races (>2,000 m). A significant association has also been found between MSTN genotype and body composition measurements in two independent investigations, which have reported C/C horses having a significantly greater height-to-body mass ratio than C/T or T/T horses (20; Tozaki T, Sato F, Hill EW, Miyake T, Endo Y, Kakoi H, Gawahara H, Hirota K, Nakano Y, Nanbo Y, Kurosawa M, unpublished data). These findings are consistent with observations that sprint-oriented Thoroughbreds are generally more muscular and compact than individuals that are better suited to racing over longer distances. Furthermore, in a genome-wide association study using an SNP platform that assays 54,602 pan-genomic SNP markers, the g.66493737C/T SNP was confirmed as the most powerful predictor of optimum race distance in Thoroughbred horses, suggesting that it may have a functional role (22).

Desired performance characteristics for a Flat racehorse include acceleration rate, speed, and the ability to maintain maximum speed over a certain distance. While we previously reported no association between various speed parameters measured during training and racing performance in a group of Flat Thoroughbred racehorses, an association between acceleration rate and sprinting ability was observed (14). Considering the contribution of skeletal muscle to force generation and speed, for the present study we hypothesized that speed parameters, measured using field technologies (GPS) in a cohort of horses-in-training, may be influenced by g.66493737C/T genotypes at the myostatin locus.



This work was approved by the University College Dublin, Ireland, Animal Research Ethics Committee.

Study animals and training protocol.

A set of horses (n = 85) selected from a group of Thoroughbred racehorses (n = 102) evaluated from a single training yard for physiological performance parameters during training (March-November) in 2007 and 2008 (14) were included in the current study. All horses were trained for Flat racing. The horses included in the study population were chosen based on their training stage and fitness to form a homogeneous study group. The study cohort comprised n = 55 2 yr olds (n = 18 males and n = 37 females) and n = 30 3 yr olds (n = 11 males and n = 19 females). The criterion for inclusion in the study cohort was that each horse must have completed at least 2 work days [(WD), fast workout simulating a race] prior to the GPS recording, resulting in each horse having undergone ≥3 accumulated WD (accWD). The GPS data associated with the greatest number of accWD for each horse were used.
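The selection rule above (at least 3 accWD, and one recording per horse taken at the greatest accWD) can be illustrated with a minimal sketch. The `Recording` class, the `select_recordings` function, the horse identifiers, and the example values are all hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass
class Recording:
    horse_id: str  # hypothetical identifier, not from the study
    acc_wd: int    # accumulated work days (accWD) at the time of the GPS recording


def select_recordings(recordings, min_acc_wd=3):
    """Keep, for each horse, the recording with the greatest accWD.

    Recordings with fewer than min_acc_wd accumulated work days are ignored,
    so a horse with no qualifying recording is excluded altogether.
    """
    best = {}
    for rec in recordings:
        if rec.acc_wd < min_acc_wd:
            continue
        current = best.get(rec.horse_id)
        if current is None or rec.acc_wd > current.acc_wd:
            best[rec.horse_id] = rec
    return best


# Made-up example records.
recordings = [
    Recording("horse_A", 2),
    Recording("horse_A", 9),
    Recording("horse_A", 14),
    Recording("horse_B", 2),
    Recording("horse_C", 3),
]
selected = select_recordings(recordings)
for horse_id, rec in sorted(selected.items()):
    print(horse_id, rec.acc_wd)
```

In this toy input, the hypothetical horse_B has no recording with at least 3 accWD and therefore drops out.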

The training protocol for the horses was described previously (14). Briefly, horses were trained 6 days/wk on an outdoor all-weather gallop 1,500 m in length with a 2.7% incline for the final 800 m. The training program consisted of progressive stages gradually introducing “fast” workouts (WD) as training progressed. WD generally consisted of gallop distances of 800–1,000 m. Training was modified and adapted to each individual animal based on soundness, fitness, and aptitude. Following the onset of WD, horses were entered into competitive races dependent on their perceived fitness and performance. All decisions on the training and racing schedule were made by a single trainer. At the time, the trainer did not have access to genotype information for the horses.

Experimental protocol and data collection.

Measured data were recorded for horses undergoing a WD as previously described (14). Each jockey carried a hand-sized GPS unit (GPSports Systems SPI10). After data collection the GPS data were downloaded to an equine-specific software program (Race Watch Software, GPSports Systems SPI10). The GPS unit recorded variables including speed, time, and distance as well as the exact map of each horse's exercise. Prior to the onset of the study, the entire gallop had been prerecorded using one of the GPS units (14).

The total number of WD and races for each horse were recorded during the year. For each WD, the sex, age, total distance exercised, jockey, and gallop condition were recorded (14). Horses were ridden by one of six jockeys selected by the trainer. Depending on the jockey's weight, a lead bag was added to the saddle to equalize the weight carried between horses.


All speed measurements were recorded as previously described (14). Briefly, measurements were recorded from a distance of 800 m from the finish line as the total distance exercised on a WD differed slightly for each horse. Speed indexes evaluated included maximal velocity (Vmax), duration at Vmax (Vmaxt), distance (m) travelled during 6 s before Vmax (Dist6b), distance (m) travelled during 6 s after Vmax (Dist6a) and distance (m) travelled during 6 s before and after Vmax (Dist6). The Vmax zone is the range from which Vmaxt was obtained. Vmax zones were created to represent ranges of speeds since the Vmax tended to waver 1–2 m/s for each horse and allowed a clearer idea of the time spent at Vmax. The zones were zone 1 (14–14.4 m/s), zone 2 (14.5–15 m/s), zone 3 (15.1–15.6 m/s), zone 4 (15.7–16.1 m/s), zone 5 (16.2–16.6 m/s), and zone 6 (16.7–17.4 m/s).
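As an illustration of how the five speed indexes relate to one another, the following sketch derives them from an evenly sampled speed trace. It is not the Race Watch software's procedure: the 1 s sampling interval, the approximation of distance as speed multiplied by the time step, the exclusion of the Vmax sample itself from both 6 s windows, and the example trace are all assumptions. Only the zone boundaries come from the text.

```python
# Vmax zones from the text (m/s), treated as inclusive at 0.1 m/s resolution.
VMAX_ZONES = {
    1: (14.0, 14.4),
    2: (14.5, 15.0),
    3: (15.1, 15.6),
    4: (15.7, 16.1),
    5: (16.2, 16.6),
    6: (16.7, 17.4),
}


def zone_of(speed):
    """Return the Vmax zone number for a speed, or None if outside all zones."""
    s = round(speed, 1)  # assumption: speeds are compared at 0.1 m/s resolution
    for zone, (low, high) in VMAX_ZONES.items():
        if low <= s <= high:
            return zone
    return None


def speed_indexes(speeds, dt=1.0, window_s=6.0):
    """Compute Vmax, Vmaxt, Dist6b, Dist6a and Dist6 from sampled speeds (m/s)."""
    n = int(round(window_s / dt))  # number of samples in a 6 s window
    i_max = max(range(len(speeds)), key=lambda i: speeds[i])
    vmax = speeds[i_max]
    zone = zone_of(vmax)
    # Vmaxt: total time spent at speeds inside the zone that contains Vmax.
    vmaxt = sum(dt for s in speeds if zone is not None and zone_of(s) == zone)
    # Distances approximated as speed x time step over each window.
    dist6b = sum(s * dt for s in speeds[max(0, i_max - n):i_max])
    dist6a = sum(s * dt for s in speeds[i_max + 1:i_max + 1 + n])
    return {
        "Vmax": vmax,
        "zone": zone,
        "Vmaxt": vmaxt,
        "Dist6b": dist6b,
        "Dist6a": dist6a,
        "Dist6": dist6b + dist6a,
    }


# Made-up speed trace sampled once per second.
trace = [13.0, 14.2, 15.0, 15.8, 16.0, 16.3, 16.5, 16.6,
         16.4, 16.2, 16.0, 15.9, 15.7, 15.5, 15.0]
result = speed_indexes(trace)
print(result)
```

Under these assumptions Dist6 is simply the sum of Dist6b and Dist6a, which is one reason the distance variables would be expected to correlate strongly with each other.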

DNA extraction and MSTN polymorphism genotyping.

Genomic DNA was extracted from fresh whole blood using the Maxwell 16 automated DNA purification system (Promega, WI). Genotyping for the MSTN gene SNP g.66493737C/T was carried out using Taqman chemistry on the StepOnePlus Real-Time PCR System (Applied Biosystems). The assay consisted of forward primer 5′-CCAGGACTATTTGATAGCAGAGTCA, reverse primer 3′-GACACAACAGTTTCAAAATATTGTTCTCCTT, and two allele-specific fluorescent dye labeled probes (VIC-AATGCACCAAGTAATTT; 6-FAM-ATGCACCAAATAATTT).

Statistical analyses.

Tests of association were performed using the PLINK Version 1.05 software package (30, 31). The linear regression model was used to evaluate quantitative trait association at the g.66493737C/T SNP with the phenotypes: Vmax, Vmaxt, Dist6b, Dist6a, and Dist6. The following were included as covariates in the analyses as they had all been found to contribute to variation in speed indexes (14): sex, age, total distance exercised, accWD, jockey, and gallop condition. In the results, BETA is the regression coefficient; statistic refers to the t-statistic, which is computed by dividing the estimated value of the β-coefficient by its standard error; the TEST column is by default ADD, which refers to a standard additive test of association (the additive effect of allele dosage); and GENO_2DF refers to a 2 df joint test of both additive and dominance effects. Tests of association were also performed using the ANOVA test statistic.
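The additive test can be sketched in its simplest form as a regression of the phenotype on allele dosage, with the t-statistic obtained by dividing the coefficient by its standard error. This is a simplified illustration, not the PLINK implementation: it omits the covariates used in the study, it assumes dosage is coded as the number of C-alleles, and the function name `additive_test` and the phenotype values are made up.

```python
import math

# Assumption: additive coding as the count of C-alleles per genotype.
DOSAGE = {"T/T": 0, "C/T": 1, "C/C": 2}


def additive_test(genotypes, phenotype):
    """Simple linear regression of phenotype on allele dosage (no covariates).

    Returns (beta, standard_error, t_statistic), where the t-statistic is
    beta divided by its standard error.
    """
    x = [DOSAGE[g] for g in genotypes]
    y = list(phenotype)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    # Residual sum of squares with n - 2 degrees of freedom.
    rss = sum((yi - intercept - beta * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta, se, beta / se


# Made-up toy data (not study values): phenotype in metres.
genotypes = ["T/T", "T/T", "C/T", "C/T", "C/C", "C/C"]
dist6 = [191.0, 192.0, 193.0, 195.0, 195.0, 197.0]
beta, se, t_stat = additive_test(genotypes, dist6)
print(beta, se, t_stat)
```

With covariates, the same idea extends to a multiple regression in which the dosage coefficient is estimated jointly with the covariate terms.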


Training and exercise parameters.

The average number of accWD prior to the GPS recording was 14.1 ± 7 days. Thirty-seven (67.3%) 2 yr olds and 23 (76.7%) 3 yr olds had ≥10 accWD at the time of the GPS recording. Data from GPS recordings for WD exercise distances of 800 m (9% 2 yr olds, 3.3% 3 yr olds), 900 m (91% 2 yr olds, 43.3% 3 yr olds), 1,000 m (46.7% 3 yr olds), and 1,200 m (6.7% 3 yr olds) were used. Three of the six jockeys rode 76.5% of the exercise tests.

MSTN genotypes.

Genotypes at the MSTN gene locus g.66493737C/T were determined for all individuals in the study. There were 21 (24.7%) C/C, 44 (51.8%) C/T, and 20 (23.5%) T/T individuals, a distribution of genotypes typical of that previously observed among a large cohort of Flat racehorses (20).
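The genotype percentages follow directly from the counts, and the same counts give the allele frequencies. The short sketch below reproduces that arithmetic; the Hardy-Weinberg expectation and chi-square value at the end are an additional sanity check that is not reported in the text and is included here only as an assumption-labelled illustration.

```python
# Genotype counts reported in the text.
counts = {"C/C": 21, "C/T": 44, "T/T": 20}
n = sum(counts.values())

# Genotype percentages.
genotype_pct = {g: 100 * c / n for g, c in counts.items()}

# Allele frequencies: each C/C horse carries two C-alleles, each C/T horse one.
p_c = (2 * counts["C/C"] + counts["C/T"]) / (2 * n)
p_t = 1 - p_c

# Not from the paper: expected counts under Hardy-Weinberg proportions.
expected = {
    "C/C": n * p_c ** 2,
    "C/T": 2 * n * p_c * p_t,
    "T/T": n * p_t ** 2,
}
chi_sq = sum((counts[g] - expected[g]) ** 2 / expected[g] for g in counts)

for g in counts:
    print(g, round(genotype_pct[g], 1), round(expected[g], 1))
print("C-allele frequency:", round(p_c, 3))
print("chi-square:", round(chi_sq, 3))
```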

MSTN genotype association with speed indexes.

A strong correlation was observed among the phenotypic variables except Vmaxt (Table 1), and genotypes at the g.66493737C/T locus were significantly associated with all the measured variables: Dist6 (P = 0.0040), Dist6b (P = 0.0032), Vmaxt (P = 0.0249), Vmax (P = 0.0265), and Dist6a (P = 0.0317) (Table 2). For each speed index, the C/C and C/T cohorts outperformed the T/T cohort (Table 3), and for the most strongly associated measurements (Dist6 and Dist6b) the C/T measurements were intermediate to the homozygotes. The mean distance travelled was 3.8 m greater in the C/C (195.7 m) than in the T/T (191.9 m) cohort during the 6 s before and after Vmax (Dist6), and 2.2 m greater in the C/C (97.6 m) than in the T/T (95.35 m) cohort during the 6 s before Vmax (Dist6b). Vmax was 0.31 m/s greater for the C/C (16.6 m/s) cohort than the T/T (16.29 m/s) cohort, and Vmax was maintained (Vmaxt) for 2.05 s longer in the C/C (7.3 s) than the T/T (5.25 s) cohort. ANOVA test results are given in Table 4.
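The homozygote differences quoted above can be checked from the reported cohort means. The sketch below uses only the C/C and T/T means given in the text; the dictionary layout and variable names are illustrative.

```python
# Cohort means for C/C and T/T horses as reported in the text.
means = {
    "Dist6 (m)": {"C/C": 195.7, "T/T": 191.9},
    "Dist6b (m)": {"C/C": 97.6, "T/T": 95.35},
    "Vmax (m/s)": {"C/C": 16.6, "T/T": 16.29},
    "Vmaxt (s)": {"C/C": 7.3, "T/T": 5.25},
}

# Difference between homozygote cohorts, rounded to two decimals
# to suppress floating-point noise.
differences = {
    name: round(groups["C/C"] - groups["T/T"], 2)
    for name, groups in means.items()
}

for name, diff in differences.items():
    print(name, diff)
```

The Dist6b difference computed from the reported means is 2.25 m, which the text gives as 2.2 m.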

Table 1. Correlation coefficients among measured speed variables

Table 2. Genetic association test results for five speed variables and the MSTN g.66493737C/T SNP

Table 3. Association test means for speed indexes significantly influenced by genotypes at the MSTN g.66493737C/T SNP

Table 4. ANOVA results for speed variable association with MSTN genotype


In humans, variants in more than 200 genes have been reported to be associated with exercise traits (6) and have been described principally as segregating among elite athletes and sedentary members of the population (11, 12, 29). In particular, polymorphisms in the ACE (7, 8) and ACTN3 (10, 35, 42) genes have been most extensively described. Few reports describe genomic variation associated with a measured physiological index of prospective ability measured during running, cycling, or any other simulation of exercise (1). Recently, a genome-wide association study has reported significant associations with training responses to maximal oxygen consumption in a sedentary cohort of humans (5).

In this study, we investigated the effect of MSTN g.66493737C/T genotype on speed indexes in a cohort of Thoroughbred horses trained by the same trainer and maintained in the same environment. While the sample size is small compared with equivalent studies in human populations, the relatively low genomic and phenotypic variation observed in Thoroughbreds as a result of recent shared ancestry and intense selection for athletic traits means that smaller numbers of samples can provide useful data. Additionally, the homogeneity of the population is augmented by maintenance in the same environment, which minimizes nongenetic influences on performance variation, a situation that is not possible for human studies. The statistical analyses in this study were further empowered by including as covariates parameters that had previously been shown to influence speed variables in the same cohort of horses-in-training (14).

This study has demonstrated that genotypes at the MSTN g.66493737C/T locus have a significant influence in the determination of individual differences in speed. For all speed parameters that were significantly different among genotype cohorts, horses with the C/C genotype outperformed T/T horses, with measurements for C/T individuals in most cases intermediate to the homozygotes. As expected the measured speed variables were related and, therefore, given the strong correlation among phenotypes, the most significant observation (Dist6) describes the overall differences in speed among genotypes. For instance, C/C horses travelled 3.8 m farther than T/T horses during the 6 s before and after the achievement of Vmax (Fig. 1). In racing parlance, a “length” is the unit of distance that describes the distance between the winner of a race and horses subsequently placed. One length is equivalent to ∼2.4 m; therefore C/C horses on average had a 1.5 length advantage over T/T horses.
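The conversion from metres to racing lengths is a single division. The sketch below uses the 2.4 m length and the 3.8 m Dist6 difference quoted above; the function name `lengths_advantage` is illustrative.

```python
LENGTH_M = 2.4  # one racing length in metres, as stated in the text (approximate)


def lengths_advantage(distance_m, length_m=LENGTH_M):
    """Convert a distance advantage in metres into racing lengths."""
    return distance_m / length_m


# Dist6 difference between C/C and T/T horses reported in the text.
advantage = lengths_advantage(3.8)
print(round(advantage, 2))
```

The result is a little under 1.6 lengths, in line with the roughly 1.5 length advantage described.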

Fig. 1.

Relationship between the distance travelled during the 6 s before and after Vmax (Dist6) and genotypes at the MSTN locus. On average, the C/C horses travelled 3.8 m farther than T/T horses during the 6 s before and after the achievement of Vmax and C/T horses were intermediate to the homozygotes.

Characteristics desired in a Flat racehorse include not only the ability to obtain top speeds, but also the capability to quickly accelerate to that top speed and maintain it over a certain distance. In the current study, the C/C horses obtained significantly higher speeds than the T/T horses, supporting previous findings that horses that compete optimally over shorter distances (≤ 1400 m races) that require exceptional speed are more likely to have the C-allele compared with horses that perform optimally over longer distances (> 1600 m races) (20). Acceleration rates were quantified in the current study by evaluating the distance travelled 6 s before reaching maximal velocity (i.e., the distance covered during a set amount of time), whereas the ability to maintain speed was quantified by measuring the time spent at maximal velocity as well as the distance covered 6 s after reaching maximal velocity. Compared with the T/T cohort, the C/C horses travelled a significantly greater distance during the 6 s before and after reaching maximal velocity, which supports the enhanced acceleration capabilities of this group. Furthermore, the C/C cohort was shown to have a longer duration at Vmax (Vmaxt) than the T/T group, exhibiting the capabilities to maintain top speed for a longer period of time.

The mechanism by which the MSTN gene variant influences suitability for short or long distance racing is still unclear. However, gene expression studies have determined that MSTN mRNA transcripts are significantly altered in Thoroughbred skeletal muscle following training (26) and that g.66493737C/T genotype variation is significantly associated with MSTN mRNA content and influences the transcript response to training (25). The myostatin protein is a secreted growth factor expressed in skeletal muscle and adipose tissue that negatively regulates skeletal muscle mass (27). The protein binds to receptor complexes initiating a signaling cascade that results in the negative regulation of muscle development and growth (19, 23, 24, 40). It is likely that the reduction in MSTN gene expression impacts directly on muscle growth and development and it is possible that this may be regulated directly by the g.66493737C/T SNP that disrupts a putative transcription factor binding site (22). However, it is notable that myostatin has been reported to have a number of other nonhypertrophic functions, such as the regulation of muscle fiber type development (19) and oxidative enzyme activity resulting from increased mitochondrial content (2).

Interestingly, muscle hypertrophy, as a result of a dysfunctional myostatin protein, manifests in a dose-dependent fashion in other species; for example, mice heterozygous for the null mutation have intermediate muscle mass compared with homozygotes (27). Further evidence for a dose-dependent effect of myostatin comes from studies of the whippet dog breed where animals homozygous for a 2-bp nonsense deletion mutation in MSTN show extreme hypermuscularity and are not raced, while dogs heterozygous for this mutation have superior racing ability (28). In Thoroughbreds, it is noteworthy that C/T horses have phenotypes similar to C/C horses or are intermediate to the homozygotes, indicating that a single C-allele may be sufficient to improve speed variables.

Five studies have reported variation at the MSTN gene locus segregating among subgroups of the Thoroughbred population (4, 20, 22, 37, 38). While it is clear that this locus has a singular genomic influence on optimum race distance (4, 22, 38), it is likely that various combinations of favorable variants at many other gene loci (17, 21) segregate in a manner that contributes to the variation in elite performance ability within the distance subgroups. Notwithstanding this, the relationship between g.66493737C/T genotypes and optimum race distance has led to MSTN being referred to as “the speed gene” in Thoroughbreds. The data from the current study strongly support the inference that variation at this locus contributes to speed variables in a racehorse training environment.


R. G. Fonseca was supported by the Research Development Fund, University College Dublin. E. W. Hill was supported by a Science Foundation Ireland President of Ireland Young Researcher Award (04/YI1/B539).


Equinome Ltd. has been granted a license for commercial use of the data contained within patent applications: United States Provisional Serial Number 61/136553; Irish Patent Application Number 2008/0735 and 2010/0151; Patent Cooperation Treaty number PCT/IE2009/000062. The following authors are named on the applications: E. W. Hill, J. Gu, B. A. McGivney, L. M. Katz, and D. E. MacHugh. E. W. Hill and D. E. MacHugh are shareholders in Equinome Ltd.


Author contributions: E.W.H. and L.M.K. conception and design of research; E.W.H., R.G.F., B.A.M., J.G., and L.M.K. performed experiments; E.W.H., R.G.F., B.A.M., J.G., and L.M.K. analyzed data; E.W.H., B.A.M., and L.M.K. interpreted results of experiments; E.W.H., R.G.F., B.A.M., and L.M.K. prepared figures; E.W.H. and L.M.K. drafted manuscript; E.W.H., B.A.M., D.E.M., and L.M.K. edited and revised manuscript; E.W.H., R.G.F., B.A.M., J.G., D.E.M., and L.M.K. approved final version of manuscript.


We thank the trainer J. S. Bolger for access to horses and samples. We thank B. O'Connor, P. O'Donovan, and the staff at Glebe House Stables for assistance.

