## Abstract

Accurate prediction of the metabolic energy that walking requires can inform numerous health, bodily status, and fitness outcomes. We adopted a two-step approach to identifying a concise, generalized equation for predicting level human walking metabolism. Using literature-aggregated values we compared *1*) the predictive accuracy of three literature equations: American College of Sports Medicine (ACSM), Pandolf et al., and Height-Weight-Speed (HWS); and *2*) the goodness-of-fit possible from one- vs. two-component descriptions of walking metabolism. Literature metabolic rate values (*n* = 127; speed range = 0.4 to 1.9 m/s) were aggregated from 25 subject populations (*n* = 5-42) whose means spanned a 1.8-fold range of heights and a 4.2-fold range of weights. Population-specific resting metabolic rates (V̇o_{2}_{rest}) were determined using standardized equations. Our first finding was that the ACSM and Pandolf et al. equations underpredicted nearly all 127 literature-aggregated values. Consequently, their standard errors of estimate (SEE) were nearly four times greater than those of the HWS equation (4.51 and 4.39 vs. 1.13 ml O_{2}·kg^{−1}·min^{−1}, respectively). For our second comparison, empirical best-fit relationships for walking metabolism were derived from the data set in one- and two-component forms for three V̇o_{2}-speed model types: linear (∝V^{1.0}), exponential (∝V^{2.0}), and exponential/height (∝V^{2.0}/Ht). We found that the proportion of variance (*R*^{2}) accounted for, when averaged across the three model types, was substantially lower for one- vs. two-component versions (0.63 ± 0.1 vs. 0.90 ± 0.03) and the predictive errors were nearly twice as great (SEE = 2.22 vs. 1.21 ml O_{2}·kg^{−1}·min^{−1}). 
Our final analysis identified the following concise, generalized equation for predicting level human walking metabolism: V̇o_{2}_{total} = V̇o_{2}_{rest} + 3.85 + 5.97·V^{2}/Ht (where V is measured in m/s, Ht in meters, and V̇o_{2} in ml O_{2}·kg^{−1}·min^{−1}).

- walking economy
- generalized equation
- algorithm
- exercise metabolism
- wearable sensors

The metabolic energy that walking requires can be accurately measured, but it is difficult to predict in the absence of direct measurement. Two factors have prompted extensive efforts to develop predictive equations: the fundamental importance of walking metabolism to the body's health, fitness, and physiological status, and the impracticality of direct measurement under most circumstances. The large majority of the many predictive equations that currently exist were developed on small, homogeneous populations using best-fit approaches. This includes the two leading standardized equations (3, 36) that have been heavily used since their original formulations decades ago. Accordingly, the large majority of existing equations have not been validated beyond the test populations on whom they were developed, and many have not been validated at all. With the advent of affordable, wearable sensors capable of incorporating predictive algorithms, the importance and potential application of these algorithms have arguably never been greater. Nonetheless, a comprehensive assessment of the relative accuracy, or lack thereof, of the algorithms that currently exist is largely unavailable.

The two most established and commonly used predictive equations, the American College of Sports Medicine (ACSM) (3) and Pandolf et al. (36) equations, were developed using very small, homogeneous populations of adult men. Each partitions the body's total or gross, mass-specific metabolic rate during walking into resting and nonresting components, quantifying the latter as a single, speed-dependent component. However, they differ formulaically in how they do so: the ACSM equation quantifies the relationship between walking speed and metabolic rate as a linear function, whereas the Pandolf et al. equation uses an exponential description. The former equation is heavily used throughout health, fitness, and clinical communities; the latter is heavily used for military purposes. To date, independent assessments of the accuracy of these standard equations have been surprisingly limited given their broad acceptance and widespread use. Consequently, their general accuracy, the predictive consequences of their different mathematical forms, and their ability to generalize to populations other than adult men of average stature are largely unknown.

We recently formulated a model with potentially broader applicability to human walking metabolism. Our approach deviated from the long-standing practice of testing populations that are homogeneous with respect to both age and body size (3, 9, 16, 36, 48). We did so because our earlier work indicated that age, long considered to be a factor explaining the elevated metabolic requirements of children, has no quantifiable effect when body size and related gait mechanics are taken into account (49). Hence we used body size stratification as an experimental tool for model development to maximize the generalizability of our model. Formulaically, the Height-Weight-Speed (HWS) model that resulted, like the Pandolf et al. model, describes the metabolic rate vs. speed relationship as exponential, but it does so with two features that differ from literature norms (Fig. 1). The first is that walking metabolism is partitioned into two components: one that is primarily postural and constant across speed, and a second that is speed-dependent. The second differentiating feature is that the term describing speed-dependent increases in walking metabolic rates includes an inverse relationship to stature (V^{exp}/Ht). This second feature quantifies the economizing influence of stature on walking metabolism (15, 18, 31, 33, 49, 51).

In our original study, these features apparently allowed our HWS model to achieve greater overall accuracy than either the ACSM or Pandolf et al. models for the level condition investigated. For our sample of 78 subjects who varied substantially in body size (derivation group, *n* = 39; validation group, *n* = 39), the HWS model was able to predict measured metabolic rates to within an average of 8.1 ± 6.7%. Additionally, we found the predictive error of the HWS model to be one-third that of the two older standardized equations on our validation-group subjects, and less than half that when only larger or adult subjects were assessed. These results raised several questions regarding the basis of the predictive accuracy originally observed for our HWS model. First, from a technical standpoint, the relative accuracy of the HWS model was likely overrepresented because we evaluated our own model, but not the other two, with data acquired under identical conditions with the same instrumentation. Second, from a scientific standpoint, our original study did not reveal the extent to which the largely distinct features of the HWS model were responsible for the greater predictive accuracy achieved. Specifically, we did not know the predictive importance of *1*) the addition of a second metabolic component for walking metabolism, and *2*) the incorporation of height into predictions of speed-dependent increases in walking metabolic rates.

Our overall goal was to identify a concise, broadly accurate equation for predicting metabolic rates during level human walking. For this purpose, we used the existing literature to compile a data set that was well-stratified with respect to both the walking speeds and body sizes of the subject populations included. We used this data set to pursue our overall objective in two analytical steps. First, we assessed how accurately the aforementioned three equations were able to predict the fully independent metabolic rate values in the literature data set. Second, we used the different mathematical forms of these three equations to identify the elements that are essential for broad, accurate prediction. This included specifically evaluating whether the total or gross walking metabolic rates in the data set would be more accurately described when the walking, or nonresting portion, of the body's total metabolic rate consisted of two components rather than one. Accordingly, we formulated both one- and two-component versions of three V̇o_{2}-speed model types: linear (∝V^{1.0}), exponential (∝V^{2.0}), and exponential with an inverse relationship to height (∝V^{2.0}/Ht).

Our first hypothesis was that the error of the HWS model equation in predicting the literature data set values would be less than half that of both the ACSM and Pandolf et al. equations. Our second hypothesis was that accounting for 90% of the total data set variability would be possible when walking metabolism was quantified with two components, but not possible when quantified with only one.

## METHODS

### Experimental Design

We adopted a literature compilation approach to evaluating the relative accuracy with which formulaically different models predict human walking metabolism for several reasons. First, the existing literature is now sufficiently expansive to comprehensively incorporate the influences of height, weight, and speed on level walking metabolic rates. Second, the aggregation of means from many studies should mitigate measurement or condition-specific error from individual studies. Finally, contemporary digitizing techniques allow data published in graphic form to be extracted with a high degree of accuracy (21, 43). Collectively, these factors should allow for the aggregation of a robust, powerful data set for investigating our two hypotheses regarding energy cost of level human walking.

### Hypothesis Tests One and Two

Based largely on the prior results reported on 78 individuals who spanned a broad range of body sizes (51), we expected two hypothesis test outcomes. First, we expected that the error with which the aggregated literature means would be predicted by the ACSM and Pandolf et al. equations would be at least two times larger than the corresponding error of prediction of the HWS model equation using the standard error of estimate (SEE) as our evaluative standard. Second, we expected that the SEE would, on average, be at least twice as large when the walking portion of the body's total metabolic rate was modeled with one component rather than two. Furthermore, as general evaluative standards for whether each model was able to fit to the aggregated data set well, we set a priori thresholds of ≥90% of the total data set variance explained and ≤10% error in the accuracy of the estimates of individual data set values. These thresholds were set in accordance with our expectation that an accurate model should *1*) account for 90% of the total variability in the data set and *2*) have a predictive error of less than 10% of the grand mean of the 127 values in the data set. The first threshold was quantified using the *R*^{2} statistic, the second using the coefficient of variation, here calculated using the SEE divided by the grand mean of the values (*n* = 127) in the data set (i.e., SEE/grand mean × 100).
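The evaluative statistics described above can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used for the study: the function names are hypothetical, and the SEE divisor (n minus the number of fitted parameters) is an assumption, since the text does not state which convention was used. Note that *R*^{2} computed as 1 − SSE/SST can fall below zero when predictions are strongly biased.

```python
import math

def r_squared(actual, predicted):
    """Proportion of variance accounted for, computed as 1 - SSE/SST."""
    grand_mean = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - grand_mean) ** 2 for a in actual)
    return 1.0 - sse / sst

def standard_error_of_estimate(actual, predicted, n_params=2):
    """SEE = sqrt(SSE / (n - n_params)). The divisor is an assumption."""
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(sse / (len(actual) - n_params))

def coefficient_of_variation(see, grand_mean):
    """SEE expressed as a percentage of the grand mean of the data set."""
    return see / grand_mean * 100.0

def meets_thresholds(r2, cv_percent):
    """A priori standards from the text: R^2 >= 0.90 and CV <= 10%."""
    return r2 >= 0.90 and cv_percent <= 10.0
```

For example, `coefficient_of_variation(1.13, 14.0)` expresses an SEE of 1.13 ml O_{2}·kg^{−1}·min^{−1} relative to a grand mean of 14.0 ml O_{2}·kg^{−1}·min^{−1}.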

### Data Set Criteria

Our literature data set was strategically aggregated to broadly encompass the influences of height, weight, and speed on human walking metabolism. However, we did not seek to acquire all the valid data available from the literature. We avoided this because doing so would have skewed the data set toward the adult populations that are overrepresented in the literature. Accordingly, nearly all of the suitable literature data from subadult populations was included, whereas much of the data available from adult populations was not. The criteria for determining whether the literature values available qualified for inclusion were as follows. First, the mean height and weight of the group had to be reported in the original study. Second, metabolic means from a sufficient number of speeds to provide a minimum value for the energy expended per unit distance, or metabolic cost of transport, needed to be available. Third, to avoid speeds in the walk-run transition range that were too fast to be true walking speeds, we implemented a standardized maximum-speed cut-off using an analog of the Froude number (51) adapted from Alexander (2): (1) where walking speed is in units of m/s, height is in meters, and g is the gravitational constant in m/s^{2}. The Froude number is widely used to quantify speeds that are equivalent for walkers and runners who differ in body size. The standard Froude index does so using leg length. Here, as previously, we used a Froude number analog that substitutes height for leg length because studies on walking metabolism generally report the height means of the groups tested, but often do not report leg length. To avoid including values that were at or above the walk-run transition we removed data points with a Froude analog value of ≥0.65. We limited our analysis to subjects of normal weight, excluding groups of people who were classified as obese in their respective original publications. 
We excluded data from subject groups ≥65 years of age because the metabolic cost of walking is generally elevated in elderly subjects (35) for reasons that have not yet been fully identified. We included populations of a minimum age of 3–5 years because they exhibit adult-like patterns for gait and metabolism when leg length is taken into account (18, 49).

### Digitizing Process

Values for group means were acquired from the tables or figures in prior publications. Those data points acquired from figures were digitized in accordance with the highly accurate techniques now available (21, 43). Original illustrations were enlarged and oriented on a grid to allow precise vertical and horizontal line fits to the data point of interest. Line fits were extended to the *x*- and *y*-axes to determine the *x* and *y* values for each data point. Data point values were also determined using an automated digitizer available online [WebPlotDigitizer (40)].

### Data Set Characteristics

Using the inclusion criteria specified, our literature search from the early 1900s to the present yielded 25 subject groups from 10 publications (Table 1) spanning a 50-year period from 1960 to 2010. The number of subjects per population group ranged from 5 to 42. Mean age ranged from 5.2 to 40.7 years, mean height ranged from 1.03 to 1.82 m, and mean body mass ranged from 18.9 to 78.0 kg. Body mass index ranged from 15.5 to 25.4 kg/m^{2}. A minimum of four and a maximum of six metabolic rate values from different walking speeds were acquired from the different population groups (mean = 5.1 ± 0.7 values per population group), resulting in a total of 127 values in the final data set. Of the 127 group means included, 95 were acquired from subjects walking on treadmills, whereas 32 were acquired from overground walking at constant speeds. The grand mean for the rate of O_{2} uptake from the 127 values aggregated was 14.0 ml O_{2}·kg^{−1}·min^{−1}.

### Predictive Accuracy—Original HWS Model vs. ACSM and Pandolf et al.

According to the forms of the three respective literature equations provided in Table 2, literature values were predicted using the ACSM and Pandolf et al. equations on the basis of walking speed. For the original HWS model, literature values were predicted using walking speed, estimated V̇o_{2}_{rest}, and the mean height of each population group. The agreement between actual and predicted values across the three equations was evaluated using both the *R*^{2} statistic and SEE.

### Walking Metabolism Models

The specific forms of the one- and two-component descriptions used to model the walking, or nonresting portion, of gross walking metabolism were guided by both the primary literature traditions and our recent modeling efforts. Our recently introduced HWS model of walking metabolism appears schematically in Fig. 1 (51). Partitioning gross or total metabolic rates into a baseline component that corresponds to resting metabolic rate and an exercise component is a common practice (31, 41, 49–51). However, the HWS model is atypical in dividing the walking component of the body's total metabolic rate into two components: a constant, predominantly postural component, termed the minimum walking metabolic rate; and a second speed-dependent component. The novel component of the HWS model, the minimum-walk component, is attributed to the support and postural costs of the walking movement and is independent of walking speed (51). The speed-dependent component quantifies the simultaneous influences of walking speed, height, and gait mechanics as previously described (51). The HWS model incorporates body mass into the denominator of each metabolic component and takes the following form:

V̇o_{2}_{total} = V̇o_{2}_{rest} + C_{1}·V̇o_{2}_{rest} + C_{2}·V^{exp}/Ht (2)

where V̇o_{2}_{total} is the body's total rate of oxygen uptake; V̇o_{2}_{rest} is the body's supine resting rate of oxygen uptake; C_{1} is a coefficient that describes the minimum walking rate of oxygen uptake as a multiple of the resting rate; and C_{2} is a coefficient describing the speed-dependent increases in the rate of oxygen uptake as a function of walking velocity, V, raised to the exponent, exp, divided by the height (Ht) of the individual. Hence the sum of the model's second and third components represents the metabolic rate attributable to walking (V̇o_{2}_{walk}). All the terms in Eq. 2 are expressed in mass-specific units of oxygen uptake of ml O_{2}·kg^{−1}·min^{−1} in accordance with literature convention and for consistency with the original publication for the HWS model. The theoretical basis for the model, including its mass-specific form, has been previously provided (51). Per our scientific objectives, Fig. 1, Eq. 2, and our previous work, the term “metabolic rate” is used to refer to mass-specific rates of oxygen uptake throughout the manuscript.

### Resting Metabolic Rates

The resting portion of the gross or total walking metabolic rates in our literature data set was determined on the basis of height, weight, sex, and age for each of the 25 population means using the prediction equations of Schofield et al. (42). These equations have been extensively validated and are known to predict resting metabolic rates with a high degree of accuracy, typically in the range of 0.5 ml O_{2}·kg^{−1}·min^{−1} (19, 26, 38, 39, 47, 50). Because all of the predictive models tested incorporated the same Schofield-derived resting metabolic rate quantity, this portion of the total or gross metabolic rate attributed to V̇o_{2}_{rest} did not differ across all the model types tested. We did not use measured V̇o_{2}_{rest} data because these values were not reported in most of our literature sources. We converted the units of kJ/day from the Schofield equation to oxygen units of ml O_{2}·kg^{−1}·min^{−1} using the conversion factor of 20.1 J per milliliter of oxygen. Schofield equations modified to oxygen units appear in Table 4 of the original HWS manuscript (51).
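The unit conversion described above can be written out as a short worked example. The sketch below assumes only the conversion factor given in the text (20.1 J per milliliter of oxygen); the function name and the example inputs are hypothetical, and the Schofield coefficients themselves are not reproduced here.

```python
JOULES_PER_ML_O2 = 20.1        # conversion factor stated in the text
MINUTES_PER_DAY = 24 * 60

def kj_per_day_to_ml_o2_per_kg_min(kj_per_day, body_mass_kg):
    """Convert a daily resting energy estimate to ml O2 per kg per min."""
    joules_per_day = kj_per_day * 1000.0
    ml_o2_per_day = joules_per_day / JOULES_PER_ML_O2
    ml_o2_per_min = ml_o2_per_day / MINUTES_PER_DAY
    return ml_o2_per_min / body_mass_kg

# Hypothetical inputs chosen only for illustration: 7,000 kJ/day at 70 kg.
example_vo2_rest = kj_per_day_to_ml_o2_per_kg_min(7000.0, 70.0)
print(round(example_vo2_rest, 2))
```

With these illustrative inputs the result is close to 3.5 ml O_{2}·kg^{−1}·min^{−1}.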

### Modeling Iterations, Analyses, and Equations

Models of three basic forms for describing the metabolic rate vs. walking speed (V) relationship were evaluated: linear (∝V^{1.0}), exponential (∝V^{2.0}), and exponential with an inverse relationship to height (∝V^{2.0}/Ht). For each of the three model types, both one- and two-component versions were derived. The equations corresponding to these six different model derivations are provided in Table 3. The procedures used to determine the best fits of these model forms to the literature data set are described below.

### Model Best-Fit Procedures

For each of the three basic model forms, two separate model versions were derived: a first that treated net walking metabolism as a single entity, and a second that partitioned walking metabolism into two components, a constant, largely postural component and a separate speed-dependent component, in accordance with the schematic in Fig. 1. For consistency and ease of interpretation, the postural component of walking metabolism was modeled the same way across the three model types, specifically as a multiple of V̇o_{2}_{rest}, therefore equal to the quantity C_{1}·V̇o_{2}_{rest} according to Eq. 2.
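The six model derivations can be expressed compactly as one prediction function. The sketch below reflects our reading of the text rather than the exact entries of Table 3: the one-component versions are assumed to take the form V̇o_{2}_{rest} + C_{2}·f(V), and the two-component versions add the postural term C_{1}·V̇o_{2}_{rest}. All function and argument names are hypothetical.

```python
def speed_term(model_type, speed_m_s, height_m):
    """Speed-dependent predictor f(V) for each of the three model types."""
    if model_type == "linear":
        return speed_m_s                      # proportional to V^1.0
    if model_type == "exponential":
        return speed_m_s ** 2                 # proportional to V^2.0
    if model_type == "exponential_height":
        return speed_m_s ** 2 / height_m      # proportional to V^2.0 / Ht
    raise ValueError(f"unknown model type: {model_type}")

def predict_vo2_total(model_type, vo2_rest, speed_m_s, height_m, c2, c1=None):
    """Total VO2 in ml O2 per kg per min.

    c1=None gives the one-component version; otherwise the postural
    component is modeled as c1 * vo2_rest (two-component version).
    """
    postural = 0.0 if c1 is None else c1 * vo2_rest
    return vo2_rest + postural + c2 * speed_term(model_type, speed_m_s, height_m)
```

Passing `c1=None` versus a numeric `c1` is a design convenience that lets one function cover both versions of each model type.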

To maximize the fits provided by each equation, the coefficients derived were those that provided the best fit to the data points acquired from the literature sources (i.e., highest *R*^{2} value) across the range of height, weight, and walking speeds present. The coefficient describing the minimum walking metabolic rate (C_{1}) in the two-component models, and the coefficient describing the speed-dependent walking metabolic rate (C_{2}) in all models were specifically optimized to minimize the sum-squared error of prediction. The optimizer function in Microsoft Excel was used because of its ability to optimize a coefficient while holding other values such as estimated resting metabolic rate, walking velocity, and height fixed at their known values [Microsoft Excel Solver, Excel 2010 version (24)]. Once best-fit equations were derived they were used to estimate walking V̇o_{2} values for all 127 literature data points and subsequently plotted against walking speed.
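Because the exponent is fixed for each model type, the two-component models are linear in C_{1} and C_{2}, so the sum-squared error can also be minimized in closed form. The sketch below is a swapped-in alternative to the Excel Solver procedure described above, not the procedure itself; it solves the 2 × 2 normal equations directly. The function name and all data values are hypothetical and synthetic, generated from made-up coefficients only to show that the fit recovers them.

```python
def fit_two_component(vo2_rest, speed_terms, vo2_total):
    """Least-squares C1 and C2 for: total = rest + C1*rest + C2*speed_term."""
    # Net walking metabolism is what remains after subtracting rest.
    net = [y - r for y, r in zip(vo2_total, vo2_rest)]
    s_rr = sum(r * r for r in vo2_rest)
    s_xx = sum(x * x for x in speed_terms)
    s_rx = sum(r * x for r, x in zip(vo2_rest, speed_terms))
    s_rz = sum(r * z for r, z in zip(vo2_rest, net))
    s_xz = sum(x * z for x, z in zip(speed_terms, net))
    det = s_rr * s_xx - s_rx ** 2
    if det == 0:
        raise ValueError("predictors are collinear; coefficients not unique")
    c1 = (s_rz * s_xx - s_xz * s_rx) / det
    c2 = (s_rr * s_xz - s_rx * s_rz) / det
    return c1, c2

# Synthetic, noise-free illustration (values are NOT from the data set).
rest = [3.5, 4.0, 5.0, 6.0]
speeds = [0.5, 1.0, 1.5, 1.0]
heights = [1.8, 1.6, 1.3, 1.1]
terms = [v ** 2 / h for v, h in zip(speeds, heights)]
true_c1, true_c2 = 1.0, 5.0   # made-up coefficients
totals = [r + true_c1 * r + true_c2 * x for r, x in zip(rest, terms)]

c1, c2 = fit_two_component(rest, terms, totals)
print(round(c1, 6), round(c2, 6))
```

A numerical optimizer such as Solver should converge to the same coefficients for these linear-in-coefficient forms, provided the resting rates and speed terms are not collinear.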

We also tested a seventh model, the modified HWS model, which has only a two-component form and differed from the first six in that the minimum-walk component was modeled as a constant absolute value (in ml O_{2}·kg^{−1}·min^{−1}) across all group means rather than as a multiple of the group-specific V̇o_{2}_{rest} values. We did so because our prior results (51) raised the possibility of limited predictive bias being introduced by quantifying V̇o_{2}_{Walk Minimum} as a multiple of V̇o_{2}_{rest}. In this case, the equation consisted of V̇o_{2}_{rest}, a constant in ml O_{2}·kg^{−1}·min^{−1} that replaced the C_{1}·V̇o_{2}_{rest} term, and a coefficient (C_{2}) times walking velocity squared divided by height. Additionally, to further investigate the importance of height as a predictor we also analyzed the modified-HWS model without height in the equation.

### Validation of Modified-HWS Model on Data from Individual Subjects

Upon completing our model evaluation we tested how well the derived equation predicted values previously acquired from individual subjects. We did so using previously published level walking metabolic data collected from 57 individuals (30 men, 27 women) whose heights ranged from 1.07 to 1.89 m, and weights ranged from 15.9 to 88.95 kg (51).

### Derivation of a Final Generalized Equation

Once the essential elements for broad accurate prediction were formulaically established using the full data set, our final analytic step was to identify the best-fit coefficients for the equation form identified. We did so using only those values in the data set acquired using the gold standard technique for measurements of exercise metabolism, the Douglas bag method (13). For this purpose, we used all the Douglas bag values in the data set excepting those from Maffeis et al. (29), whose walking metabolic rate (V̇o_{2}_{walk}) values for subjects of 1.37 m in height were, for unknown reasons, substantially higher than those from other sources in the data set for subjects of similar height. The final equation was derived on 42 group mean values. For these 42 group means, subject height ranged from 1.19 to 1.73 m, weight ranged from 22.5 to 71.6 kg, and walking speed ranged from 0.44 to 1.80 m/s.
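For readers who wish to apply the generalized equation reported in the abstract (V̇o_{2}_{total} = V̇o_{2}_{rest} + 3.85 + 5.97·V^{2}/Ht), a minimal sketch follows. The function name and the example inputs are hypothetical; V̇o_{2}_{rest} is supplied by the caller, for instance from a standardized resting equation, and the input checks are an added assumption rather than part of the published method.

```python
def predict_walking_vo2(vo2_rest, speed_m_s, height_m):
    """Level walking VO2 (ml O2 per kg per min) from the generalized equation.

    vo2_rest: resting rate in ml O2 per kg per min (caller-supplied estimate)
    speed_m_s: walking speed in m/s
    height_m: stature in meters
    """
    if speed_m_s < 0 or height_m <= 0:
        raise ValueError("speed must be >= 0 and height must be > 0")
    minimum_walk = 3.85                                # constant postural term
    speed_dependent = 5.97 * speed_m_s ** 2 / height_m
    return vo2_rest + minimum_walk + speed_dependent

# Hypothetical adult: resting rate 3.5, walking at 1.3 m/s, 1.75 m tall.
print(round(predict_walking_vo2(3.5, 1.3, 1.75), 2))
```

The equation was derived for level walking within the speed and stature ranges of the derivation data, so predictions outside those ranges should be treated with caution.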

### Data Set Categorization by Stature

The 127 values for population group-mean metabolic rates in our aggregated data set appear in Fig. 2*A* as a function of walking speed. The influence of height on gross walking metabolic rates led us to classify these values by stature using a three-category scheme of short, intermediate, and tall. These stature classifications were not necessary for, and indeed were not part of, our formal hypothesis tests. Rather, we implemented these classifications to allow for visual evaluation of whether the different model versions evaluated fit the walking metabolic rate values equivalently across the different stature means present in the data set, or were biased toward shorter or taller individuals. The stature, weight, and age means of the populations in the short, intermediate, and tall groups appear in Table 4.

Also, for graphical purposes, within each height classification group, we determined representative metabolic rate vs. speed relationships as follows. For each of the three groups, we averaged the literature metabolic rate data points acquired to determine values at or near six speeds: 0.5, 0.8, 1.0, 1.3, 1.6, and 1.8 m/s. The precise speeds for the respective height groups varied slightly in accordance with the different protocol speeds administered in the different literature sources. This process allowed us to formulate trend lines for the metabolic rate vs. speed relationships that corresponded to the literature values for each of the three respective height classification groups (Fig. 2*B*). These trend lines, which appear in grayscale, were formulated to provide a visual reference for evaluating each model's ability to fit both the stature- and speed-variability present in the data set.

## RESULTS

### Digitizing Accuracy

When values were obtained by the grid technique, the absolute percent difference between the original numeric values [Fig. 1*A* in (49)] and those acquired via digitization was <1.00% for 17 of the 20 data points. Across these 20 data points, the error ranged from 0.03 to 2.43% with an overall mean of 0.65%. Using three different published graphs and original data sets [Fig. 1*A* in (49), Fig. 2*A* in (51), and Fig. 4*A* in (51)] with a combined 47 data points, the absolute percent difference between the measured data and the derived data was <0.60%. When using the automated digitizer, the original values and digitized values across the 47 data points agreed to within an average of 0.51% [WebPlotDigitizer (40)].
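The digitizing error statistic can be illustrated with a brief sketch. It assumes that each difference is expressed relative to the original published value, which the text implies but does not state explicitly; the function names and the example values are hypothetical.

```python
def absolute_percent_differences(original, digitized):
    """Per-point |digitized - original| as a percentage of the original."""
    return [abs(d - o) / abs(o) * 100.0 for o, d in zip(original, digitized)]

def mean_absolute_percent_difference(original, digitized):
    """Average of the per-point absolute percent differences."""
    diffs = absolute_percent_differences(original, digitized)
    return sum(diffs) / len(diffs)

# Made-up values for illustration only (not from the digitized figures).
published = [10.0, 12.5, 15.0, 20.0]
recovered = [10.05, 12.45, 15.1, 19.9]
print(round(mean_absolute_percent_difference(published, recovered), 2))
```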

#### Data presentation.

The actual vs. predicted values for the equations evaluated for both hypotheses are presented using the same graphical format. First, the values predicted for each of the 127 original literature means by the respective equations are plotted as a function of walking speed (Figs. 3–7, *left*). Second, the same respective, predicted values are subsequently plotted vs. their actual values (Figs. 3–7, *right*). The height classification trend lines from Fig. 2*B* appear in grayscale in those figures in which equation-predicted values are plotted in relation to speed (Fig. 3, *A*, *C*, and *E*; and Figs. 4–7, *A* and *C*).

The best-fit coefficients derived for the one- and two-component models of each of the three basic model types (linear, exponential, and exponential with height) appear in Table 5. The goodness-of-fit of each of the six equations derived is provided graphically in two formats that follow the original graphic presentations of the literature values appearing in Fig. 2, *A* and *B*. For each of the six respective best-fit equations derived, one- and two-component versions are vertically juxtaposed in the upper and lower portions of Figs. 4, 5, and 6. The one-component model forms and corresponding predictions appear at the *top* of each figure (*A* and *B*), whereas the two-component model forms appear at the *bottom* of each figure (*C* and *D*).

Each illustration of the goodness-of-fit between actual and predicted or estimated values includes an *R*^{2} value for the fit provided and the corresponding SEE. The grand mean for all of the values in the literature data set was 14.0 ml O_{2}·kg^{−1}·min^{−1}. Accordingly, those fits with SEE values below 1.40 ml O_{2}·kg^{−1}·min^{−1} met our criteria of a coefficient of variation of <10%.

### Hypothesis Test One: Predictive Accuracy of HWS vs. Standard Equations

The metabolic rates predicted for each of the 127 literature means using the ACSM (3), Pandolf et al. (36), and original HWS equations (51) appear as a function of walking speed in Fig. 3, *A*, *C*, and *E*, and in relation to the actual values in Fig. 3, *B*, *D*, and *F*. The ACSM and Pandolf et al. equations were largely unable to predict the 127 values in the literature data set. In both cases, the proportion of variance accounted for was less than zero, indicating that the error between predicted and actual values was greater than the total variability present in the data set. The HWS equation was considerably more accurate, accounting for just over 90% of the total variability present in these values.

Both the ACSM and Pandolf et al. equations result in significant underprediction of walking metabolic rate for all height groups; however, for the tall group, Pandolf et al. accurately predicts values at the intermediate and faster speeds, whereas the ACSM equation does not (Fig. 3, *A* and *C*). When plotting predicted vs. measured metabolic rates, the tendency toward underprediction by both equations is obvious. Almost all the data points fall on or below the line of identity for the ACSM equation (Fig. 3*B*), and all but a limited number of data points predicted by Pandolf et al. also fall below the line of identity (Fig. 3*D*). In contrast, the values predicted by the original HWS model fall relatively close to their respective height group trend lines (Fig. 3*E*) and to the line of identity (Fig. 3*F*). The predictive error (SEE) of the original HWS model equation was roughly one-fourth that of the ACSM and Pandolf et al. equations. Both ACSM and Pandolf et al. equations had greater predictive error than the benchmark SEE value (1.40 ml O_{2}·kg^{−1}·min^{−1}), whereas the HWS model equation had appreciably less.

We also tested the predictive accuracy of the ACSM, Pandolf et al., and HWS equations on the subset of literature values from adults only because both ACSM and Pandolf et al. were designed to serve adult-only populations. For adult groups, the *R*^{2} value for measured vs. predicted data points was 0.50 and the SEE was 2.69 ml O_{2}·kg^{−1}·min^{−1} using the ACSM equation. For the Pandolf et al. equation, the *R*^{2} value was 0.71 and SEE was 2.06 ml O_{2}·kg^{−1}·min^{−1} for the adult values. For both ACSM and Pandolf et al. equations almost all the adult values in the data set were underpredicted. In contrast, the HWS equation resulted in an *R*^{2} of 0.90 and SEE of 1.21 ml O_{2}·kg^{−1}·min^{−1} in predicting the same adult values.

### Hypothesis Test Two: One- vs. Two-Component Models

#### Linear model results.

The best-fits resulting from the linear forms of one- and two-component metabolic rate vs. speed models appear in Fig. 4. The best-fit from the one-component model slightly underpredicted the values of the shorter groups of subjects and overpredicted the values of taller ones (Fig. 4*A*). The stature-biased predictions were largely absent in the best-fit predicted values from the two-component form of the linear model (Fig. 4*C*). Both the one- and two-component model forms exhibited speed-dependent bias (Fig. 4, *B* and *D*). In both cases, the metabolic rate means at relatively slow and fast walking speeds tended to be underpredicted, whereas those at intermediate walking speeds were generally overpredicted. The goodness-of-fit and SEE of the two- vs. one-component model demonstrated only marginally better agreement (Δ*R*^{2} = 0.03 and ΔSEE = 0.14 ml O_{2}·kg^{−1}·min^{−1}), primarily because the addition of the second component allowed the stature-related stratification of the literature means to be fit somewhat more closely.

#### Exponential model results.

The best-fit equation predictions from the one- and two-component exponential models appear in Fig. 5. The one-component exponential model fit the literature values relatively poorly and was the only one of the six best-fit equations that did not account for at least half of the total variance present in the literature data set (*R*^{2} < 0.50; Fig. 5, *A* and *B*). The literature means at the slowest walking speeds were predicted least accurately, with predictions that were consistently lower than the actual values. The predictions at faster speeds were also in error, being generally higher than the actual values (Fig. 5*B*). In contrast, the best-fit relationship from the two-component exponential model accounted for >90% of the total variance present in the literature values (Fig. 5*D*) because the addition of the minimum walking component substantially improved the agreement with the actual values at all speeds, particularly the slower ones (Fig. 5*C*). The agreement between the two-component, best-fit estimated and actual values indicated slight speed-dependent bias, with most of the values at the slower and faster speeds falling just below the line of identity, indicating slight underprediction (Fig. 5*D*).

#### Exponential model with height results.

The best-fit equation estimations derived for the one- and two-component exponential models with height included appear in Fig. 6. The addition of height to the exponential model improved the one-component fit to the literature values slightly but did not improve the two-component fit at all. The one-component exponential model with height substantially underestimated the literature values at the slower walking speeds across all three height classification groups (Fig. 6, *A* and *B*). Best-fit estimations at the faster speeds were more accurate but tended toward overestimation, particularly for the values at the fastest walking speeds. The two-component exponential model with height provided a substantially better fit than the one-component model, primarily by improving the accuracy of the estimations for the values at the slowest walking speeds (Fig. 6, *C* and *D*), although a slight tendency toward underestimation among the tallest groups remained. The estimations of the intermediate and faster speed values were generally accurate and without obvious speed-dependent bias or trends. The two-component model with height was the second of the six best-fit equations derived that was able to capture greater than 90% of the total variance present among the literature data set means.

#### Modified HWS model.

The best-fit equation resulting from modeling the minimum-walk component as a constant absolute metabolic rate value was: (3)

By comparison, the two-component model with the minimum-walk component treated as a constant absolute value resulted in an SEE of 1.00 ml O_{2}·kg^{−1}·min^{−1} (Fig. 7*D*), indicating that this refinement of our HWS model improved the goodness-of-fit to our literature data set. Modifying the first of the two walking, or nonresting, components of the HWS model removed the slight bias toward underprediction at the slowest walking speeds that was present in both of the exponential two-component models. For each of the three height-classification groups, the literature means at the slow, moderate, and faster walking speeds all conformed closely to the corresponding height classification trend lines (Fig. 7*C*). The error present in the values predicted from the modified HWS model vs. the actual literature values was small and equally distributed above and below the line of identity across the full range of walking speeds and metabolic rate values.

The modified-HWS model equation was able to fit the literature values more closely when speed-dependent increases in walking metabolic rates were described as an exponential function with an inverse relationship to height vs. without. When height was not included as a predictor, the SEE was 1.5 times greater and the proportion of variance accounted for was nearly 10% lower (Fig. 7, *B* and *D*). When height was absent from this form of the model, values were overestimated for the tallest groups and underestimated for the shortest groups (Fig. 7*A*), with the greatest disagreement occurring at the fastest speeds for the shortest subject populations.

#### Final generalized equation for level walking metabolism.

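The best-fit equation derived in the form identified as having the least predictive bias, using only the subset of literature values acquired from Douglas bags (and therefore considered to be most valid), was:

V̇o_{2}_{total} = V̇o_{2}_{rest} + 3.85 + 5.97·V^{2}/Ht (4)

where V is measured in m/s, Ht in meters, and V̇o_{2} in ml O_{2}·kg^{−1}·min^{−1}.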
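As a minimal sketch, the generalized equation stated in the Abstract (V̇o_{2}_{total} = V̇o_{2}_{rest} + 3.85 + 5.97·V^{2}/Ht) can be evaluated as follows. The function name, argument names, and the example walker are illustrative assumptions and are not taken from the study; the resting value of 3.3 ml O_{2}·kg^{−1}·min^{−1} is the global adult constant mentioned in the Discussion.

```python
def walking_vo2_total(speed_m_s, height_m, vo2_rest):
    """Predicted gross metabolic rate (ml O2/kg/min) for level walking.

    Implements the equation quoted in the Abstract:
        VO2_total = VO2_rest + 3.85 + 5.97 * V**2 / Ht
    with V in m/s, Ht in m, and VO2 in ml O2/kg/min.
    """
    if speed_m_s < 0 or height_m <= 0:
        raise ValueError("speed must be >= 0 and height must be > 0")
    min_walk = 3.85      # constant minimum-walk component
    speed_coef = 5.97    # coefficient of the speed-dependent component
    return vo2_rest + min_walk + speed_coef * speed_m_s ** 2 / height_m


# Hypothetical adult: 1.75 m tall, walking at 1.3 m/s, resting rate 3.3.
example = walking_vo2_total(speed_m_s=1.3, height_m=1.75, vo2_rest=3.3)
print(round(example, 2))
```

The data set used to derive the equation spanned speeds of 0.4 to 1.9 m/s, so predictions outside that range would be extrapolations.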

## DISCUSSION

Our two-step strategy for identifying the elements that are essential for accurate generalized prediction of level human walking metabolism was indeed effective. Our first test revealed that the two leading standardized equations that predict walking metabolism are inadequate for humans of different body sizes walking across a broad range of speeds on level surfaces. Our second test identified the quantitative elements that are required for accurate generalized predictions, but lacking in the leading standardized equations. In both cases, the primary conclusion supported is that accurate generalized predictions are possible when the body's walking, or nonresting, metabolism is quantified with two components, but not possible when quantified with only one.

Quantitatively, our first hypothesis test indicated that the group-mean metabolic rate values in our data set were predicted four times more accurately (SEE) by the HWS model equation, which includes two components for walking metabolism, vs. the leading literature equations (3, 36) that include only one (Fig. 3). The relative differences in predictive error of the leading standard equations vs. that of the HWS equation identified here actually exceed those we previously reported (51) on individual data. This quantification of larger differences in the independent data set compiled here indicates that our original study did not in fact overrepresent across-equation differences in predictive accuracy as it might have. Our second hypothesis test outcome indicated that in each of the three model forms, the best fits possible to the literature data were unable to account for either the speed- or stature-related variance present when they included only one component to describe the walking, or the nonresting, portion of the body's total metabolism (Figs. 4–6, *A* and *B*). In two out of three model types evaluated, the SEE was more than twice as large when walking metabolism was modeled with one component vs. two. Collectively, across the three model types on average, the one- vs. two-metabolic component versions provided poorer overall fits (Δ*R*^{2} = −0.27) with substantially larger predictive errors (ΔSEE = +1.21 ml O_{2}·kg^{−1}·min^{−1}).
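For readers who wish to reproduce this kind of comparison, the sketch below shows one common way to compute *R*^{2} and SEE from paired actual and predicted values. The exact definitions used in the study are not given in this section, so the formulas here (SEE as the root of the residual sum of squares divided by *n* minus the number of fitted parameters) are assumptions, and the four value pairs are invented purely for illustration.

```python
import math


def r_squared(actual, predicted):
    """Proportion of variance accounted for: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def standard_error_of_estimate(actual, predicted, n_params=0):
    """Root of the residual sum of squares over (n - n_params).

    n_params=0 suits an equation taken from the literature without
    refitting; a fitted model would pass its number of coefficients.
    """
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(ss_res / (len(actual) - n_params))


# Invented values (ml O2/kg/min), not data from the study.
actual = [8.0, 10.0, 12.0, 15.0]
predicted = [8.5, 9.5, 12.5, 14.5]

r2 = r_squared(actual, predicted)
see = standard_error_of_estimate(actual, predicted)
print(round(r2, 3), round(see, 3))
```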

### Hypothesis Test One: Predictive Accuracy of Existing One- vs. Two-Component Equations

Our first hypothesis test revealed that predictions of level human walking metabolism were roughly four times more accurate when based on our two-component HWS model equation vs. the leading one-component equations from the literature (Fig. 3). Certainly, some of the difference in predictive accuracy observed across these equations would be expected given *1*) the presence of an additional metabolic component in the HWS model, *2*) the formulation and validation of the HWS model on a data set similarly heterogeneous to that aggregated here, and *3*) the broader conditional objectives the ACSM and Pandolf et al. equations were meant to serve for adult-only populations (3, 36). The first two factors we have addressed throughout the manuscript, the last we were able to partially address here. For the 66 adult-population values in our literature data set, the respective predictive errors of the ACSM and Pandolf et al. equations were 2.2 and 1.7 times greater than that of the HWS model equation. Thus much of the difference in predictive error we identified for the entire data set was also present even when comparisons were limited to values from the adult-only populations that the ACSM and Pandolf et al. equations were meant to serve.

A primary contributor to the relative predictive errors we report is the substantial skew with which the leading literature equations predict the existing literature data. Of the 254 total values predicted by the ACSM and Pandolf et al. equations combined, only seven were overpredicted, whereas the remaining 247 values were underpredicted by these two equations (Fig. 3, *B* and *C*). Although the tendency toward underprediction has been noted previously for the ACSM equation (1, 8, 10, 25, 51), the substantial intrinsic bias of both equations has not been previously documented. Almost certainly, a portion of the bias identified is attributable to the narrow original derivations of the respective equations. The Pandolf et al. equation was derived from data from six male soldiers of similar body size (36). The level portion of the ACSM equation was derived from only three adult men (22). The skew and systematic error now evident for both of these widely used equations under level conditions highlight a significant weakness in this heavily researched area: leading generalized equations were derived from populations that were too small and homogeneous to provide broadly accurate predictions.

### Hypothesis Test Two: One vs. Two Metabolic Components for Walking Metabolism?

The distribution of values in our literature data set (Fig. 2, *A* and *B*) requires accurate formulaic descriptions to account for two visually evident features: *1*) the near-baseline differences in total metabolic rates at slow walking speeds that are related to stature and mass, and *2*) the curvilinear increases in walking metabolic rates across speed that have modest slope differences across stature groups (Fig. 2, *A* and *B*). The first feature is unsurprising given that the greater mass-specific rates of resting metabolism in shorter, less-massive individuals are well established (42). The constant V̇o_{2}_{rest} values that we used across all our model iterations (Table 1, Fig. 2*B*, *horizontal lines*) addressed this reasonably because this single factor accounted for an appreciable portion of the across-group differences in total metabolic rates at slower walking speeds. The second feature posed the perhaps greater quantitative challenge of simultaneously describing a metabolic rate vs. speed relationship that is curvilinear for all of the population groups in the data set, but with greater stratification than can be accounted for by differences in resting metabolism alone. One consequence of these distribution features was that best-fit differences between one- and two-component model versions differed substantially by model type.

The form of our linear model resulted in almost no difference in the relative goodness of the fits provided between the one- and two-component versions. In both cases, group-specific differences in V̇o_{2}_{rest} values allowed group differences at slower speeds to be reasonably approximated. Metabolic rate increases across speed were described by slope values that differed little between one- and two-component best-fit versions (Table 5). Consequently, both the goodness-of-fit increase (Δ*R*^{2} = +0.03) and predictive error decrease (ΔSEE = −0.14 ml O_{2}·kg^{−1}·min^{−1}) that resulted from the addition of a second component were negligible. The inability of a linear model to describe the curvilinear metabolic rate vs. speed relationship resulted in substantial speed-dependent error in the best-fits achieved by both versions.

In contrast to the similar best-fits observed across one- and two-component versions of the linear model type, respective best-fit differences across both exponential model types were large. For both the exponential models with and without height included, describing the entire walking, or nonresting, portion of the total metabolic rate as a single component substantially overpredicted the slope of the metabolic rate vs. speed relationship. The exaggerated slopes forced by single-component exponential fits resulted in large underestimations of metabolic rate values at slower speeds and overestimations at faster ones (Figs. 5*A* and 6*A*, respectively). The addition of the second component to both model types reduced the slopes (Table 5) to align more closely with those actually present (Figs. 5*C* and 6*C*). Improved slope alignment also allowed model best-fit estimations at slower walking speeds to align more closely with the actual values of the different height groups. Consequently, both two-component exponential model versions were able to account for >90% of the total data set variance with predictive errors ≤10% of the grand mean.

Although the best-fits provided by our two-component exponential models exceeded our general criteria for good overall fits, both exhibited modest speed-dependent estimation bias vs. the line of identity. Slower-speed values were underestimated by both versions, whereas faster speed values were underestimated by the exponential model version that did not include height (Figs. 5*D* and 6*D*). The nature of the speed-dependent estimation bias present was consistent with our postulation that modeling the minimum-walk component as a multiple of group-specific V̇o_{2}_{rest} values might overestimate across-group differences at slower walking speeds. Accordingly, we modified our two-component exponential model with height (Eq. 3) to describe the minimum-walk component using the same constant (ml O_{2}·kg^{−1}·min^{−1}) across all 25 population groups. The resulting best-fit exhibited little to no discernible speed-dependent bias vs. the line of identity, accounted for a slightly greater proportion of the total variance, and had the lowest predictive error of all seven of our best-fit modeling iterations (Fig. 7*D*).
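The modified model form described above (a constant minimum-walk component plus a speed-dependent component proportional to V^{2}/Ht) is linear in its two coefficients, so it can be fit by ordinary least squares on the nonresting rate. The sketch below illustrates this under stated assumptions: the fitting procedure actually used in the study is not described in this section, the helper name `fit_line` is hypothetical, and the rows are synthetic, noise-free values generated from the coefficients quoted in the Abstract, so the fit simply recovers them as a sanity check.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic rows: (speed in m/s, height in m, resting rate in ml O2/kg/min).
# Heights and resting rates are hypothetical, not values from the data set.
rows = [
    (0.5, 1.2, 5.0), (1.0, 1.2, 5.0), (1.5, 1.2, 5.0),
    (0.5, 1.8, 3.3), (1.0, 1.8, 3.3), (1.8, 1.8, 3.3),
]

# Totals generated from the Abstract's equation (no measurement noise).
totals = [rest + 3.85 + 5.97 * v * v / ht for v, ht, rest in rows]

# Predictor is V^2/Ht; response is the nonresting rate (total - rest).
x = [v * v / ht for v, ht, _ in rows]
y = [total - rest for total, (_, _, rest) in zip(totals, rows)]

min_walk, speed_coef = fit_line(x, y)
print(round(min_walk, 2), round(speed_coef, 2))
```

With real measurements the recovered coefficients would differ from those used to generate these rows, and the residuals could then be inspected for the speed- or stature-dependent bias discussed in the text.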

### Body-Size Dependency of Walking Economy: Is Height Needed for Predictive Accuracy?

On the basis of our previous findings, we anticipated that the model forms with the speed-dependent component expressed inversely to height would better describe walking metabolic rates than models that did not include this feature. However, across the two-component exponential models, this was not the case. The best-fit provided by the two-component model that did incorporate height to describe speed-dependent increases in walking metabolic rates was no better than the best-fit provided by the two-component exponential model that did not (Fig. 5*D* vs. Fig. 6*D*). This result may at first seem inconsistent with two previous findings: *1*) of a near-inverse relationship between stature and the energy walkers expend per unit distance (49); and *2*) the predictive accuracy previously achieved when incorporating height in the same formulaic manner (51). However, the cursory interpretation that height is unimportant is incorrect.

The consistency with which gross, mass-specific walking metabolic rates have been reported to be inversely related to height has been essentially absolute (15, 18, 29, 31, 33, 49, 51). Indeed, this consistency is responsible for the distribution of walking metabolic rate values by stature classification groups evident in our literature data set (Fig. 2). Although a portion of the stature-related differences in the gross metabolic rates consistently observed results from the greater resting metabolic rates of smaller, less massive individuals, the majority of the difference is attributable to the walking or nonresting portion [Fig. 2 here, Fig. 1 in (49)]. Indeed, when we analyzed the current walking metabolic rate data (i.e., V̇o_{2}_{total} − V̇o_{2}_{rest} per Eq. 1) to quantify the height-transport cost relationship (COT, O_{2}/kg·m) at the mechanically equivalent walking speeds that different-sized individuals typically self-select per our earlier analysis (49), we again found a large, negative, and nearly inverse relationship (COT ∝Ht^{−0.77}). Thus on a strictly biological level, our results here and those from previous literature are consistent in indicating that height is a fundamental determinant of level human walking economy.
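A scaling exponent such as the one reported above (COT ∝ Ht^{−0.77}) is commonly estimated as the slope of a least-squares line through log-transformed values. The sketch below shows that approach; whether the study used this exact method is an assumption, the function name is hypothetical, and the heights and cost values are synthetic (generated from an exact power law with an arbitrary scale factor of 0.2) rather than drawn from the literature data set.

```python
import math


def power_law_exponent(x, y):
    """Slope of the least-squares line through (ln x, ln y).

    For y = c * x**b this slope equals the exponent b.
    """
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    return sxy / sxx


# Synthetic heights (m) and transport costs following COT = 0.2 * Ht**-0.77.
heights = [1.1, 1.3, 1.5, 1.7, 1.9]
cot = [0.2 * h ** -0.77 for h in heights]

exponent = power_law_exponent(heights, cot)
print(round(exponent, 2))
```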

The relative unimportance of height as a predictor across our exponential models resulted from the majority of the height-related variance being accounted for in the first of our two model components of walking metabolism. Because this first component, the minimum-walk component, was modeled as a multiple of resting metabolic rates that vary by stature and mass (Table 3), the minimum-walk component values across different stature groups were greater for shorter populations and smaller for taller ones, thereby accounting for much of the stature-related variance in the walking metabolic rates. Thus the incorporation of height into the second and speed-dependent metabolic component (as V^{2}/Ht) did not improve the best-fit provided because the stature-related variance had already been largely accounted for. This effect becomes visually apparent in the modified HWS iterations that treated the minimum-walk component as a constant and therefore did not indirectly incorporate the influence of height into this portion of walking metabolism (Fig. 7). The version of this iteration that did not include a height-related treatment in either walking component was unable to account for stature-related variation in metabolic rates across all speeds, particularly faster ones (Fig. 7, *A* and *B*). However, when height was added to the speed-dependent component of this model version, the best-fits for populations of different statures across all speeds were the closest of all of our model iterations (Fig. 7, *C* and *D*).

### Why Are Two Nonresting Components Needed to Describe Human Walking Metabolism?

The most immediate scientific question raised by our conclusion that two metabolic components are required to adequately describe human walking metabolism is: To what extent does the two-component conclusion from our whole-body modeling approach correspond to internal physiological reality? Clearly, the ability to selectively activate tissues and tissue compartments within the muscular system across speed makes our simplified two-component description theoretically possible. The first component purportedly represents a constant minimum walking metabolic “baseline” set by the volume of muscle recruited largely to satisfy postural requirements. The second component purportedly represents speed-dependent increases in metabolic rates resulting from disproportionate increases in muscular activation across faster walking speeds. However, direct evaluation of these ideas is limited by the inability to measure metabolic rates within the body on a tissue-by-tissue basis.

Given this basic measurement limitation, investigators have used a number of indirect approaches to infer differential tissue contributions to the body's total walking metabolic rates. Of these, two partially overlapping approaches have been the most informative. The more direct of the two has been the use of surface electromyography (EMG) to measure the electrical activity of individual muscles (11, 12, 23, 34). The EMG studies available provide reasonable support for disproportionate increases in neuromuscular activity at faster walking speeds. Several of these indicate that hip flexor and extensor muscles that are largely inactive at slower walking speeds become activated at faster ones. These studies also demonstrate disproportionately large increases in knee extensor muscle activity across faster speeds that coincide with the greater knee extensor moments also observed (6, 23, 34). A less direct but more comprehensive approach is the detailed musculoskeletal modeling (20, 34, 37) that has advanced rapidly in the last decade. This approach uses forward dynamic simulations based on extensive kinematic, anatomical, neural, and physiological inputs. When applied across walking speeds, these detailed models also identify de novo recruitment of hip flexor and extensor muscles, and disproportionate increases in knee extensor muscle activation across the faster walking speeds.

Finally, we note that earlier investigators who adopted a combined theoretical-empirical approach to modeling whole-body walking metabolic rates reached a two-component conclusion similar to ours. Workman and Armstrong (53) used an energy-per-step framework in conjunction with direct measurements to develop their original equation for predicting level walking metabolic rates. Their lengthy equation, which also incorporates height, has been shown to be relatively accurate across a range of walking speeds and for different individuals (51). A retrospective scientific consideration of their work by the original authors led them to conclusions that have been largely overlooked, but are highly relevant here. Both the conceptual model [Fig. 5 in (52)] and quantitative conclusions they ultimately reached agree closely with ours even though the data and theoretical framework used were fully independent.

### Summary and Future Recommendations:

On a practical level, empirical evidence now exists to conclude that the HWS model offers a more accurate alternative to the ACSM or Pandolf et al. equations for nonobese children and adults (<65 years) walking on firm, level surfaces. In addition to the accuracy already noted, our modified HWS equation offers several other noteworthy features. First, the HWS equation (Eq. 3) appears to be similarly accurate for both treadmill and overground conditions because we found virtually no difference in the goodness-of-fit (Eq. 3 and Fig. 7*C*) between the values from these conditions (Δ*R*^{2} = 0.03 and ΔSEE = 0.15 ml O_{2}·kg^{−1}·min^{−1}). Second, two of the three metabolic terms in the HWS equations presented can be combined by adding a minimum-walk component constant to a population-specific V̇o_{2}_{rest} value to provide a concise two-term equation. Finally, for adult-only populations, the need for population-specific V̇o_{2}_{rest} values can be eliminated without any appreciable loss of accuracy. Simply using a global V̇o_{2}_{rest} constant of 3.3 ml O_{2}·kg^{−1}·min^{−1}, added to the 3.85 ml O_{2}·kg^{−1}·min^{−1} constant for the minimum-walk component from Eq. 4, yields the following equation:

V̇o_{2}_{total} = 7.15 + 5.97·V^{2}/Ht (5)

which fits our adult literature data set values with an *R*^{2} >0.93 and an SEE of <1.0 ml O_{2}·kg^{−1}·min^{−1}.

Our expectation is that our final modified-HWS equations (Eqs. 4 and 5) will prove to be accurate in subsequent validations, but a degree of caution is warranted. We opted to derive the final form using only the Douglas bag values in our literature data set because measurements from both laboratory and portable metabolic systems are typically slightly but systematically higher than Douglas bag values. Accordingly, Eq. 4 represents our best present generalized equation to describe what the actual O_{2} uptake values during level walking will be. To assess this expectation on values acquired from individuals per the expected use of the equation, we evaluated Eq. 4 using data previously acquired from individual subjects (51). The resulting predictions conformed to the level of accuracy expected, with a resultant *R*^{2} of 0.88 and SEE of 1.35 ml O_{2}·kg^{−1}·min^{−1} for a height, weight, and age-stratified group of 57 subjects. However, we note that the laboratory metabolic system we use provides excellent agreement with Douglas bag values, whereas many others do not (4, 17). Accordingly, users of those metabolic systems that tend to provide slightly but systematically higher values may experience better agreement with the best-fit equation derived on the full literature data set (Eq. 3), which includes values acquired from these systems. However, given the general acceptance of the Douglas bag method as the most valid technique, we believe Eq. 4 represents the concise, broadly accurate generalized equation we sought to identify.

### Concluding Remarks

The emergence of an accurate, robust generalized relationship to predict the energy expended during level human walking is arguably scientifically overdue in addition to being practically opportune. From a basic standpoint, human walking has been studied far more extensively than any other animal gait. Nonetheless, a concise relationship with predictive capabilities that could generalize across walking speeds and body sizes had not emerged (51) due to a tradition of focusing heavily on across-population differences rather than identifying the truly generalized relationships. This record contrasts with the robust, generalized relationships that exist for mammalian metabolism at rest (28) and during locomotion (44–46). In each of these cases, generalized relationships formulated on the basis of large data sets were established decades ago. Indeed, our experimental strategy here drew directly upon the comparative tradition of maximizing both speed and body-size related influences on locomotor metabolism.

From a practical standpoint, the relationship we report could be used to predict walking energy expenditure either with or without contemporary technology because the inputs required are minimal. Although present use is limited to level surfaces, under such conditions only velocity data are required if height and body weight are known. Thus low-tech field uses can be implemented with only time and distance inputs to compute an average velocity. In fitness settings, the equations we present here (Eqs. 3 and 4) could be used during level treadmill walking at known speeds. Across settings, technology-enabled implementations abound because many wearable sensors now provide velocity data (14). These include global positioning systems, geo-locating smart phones, and precision pedometers that determine speed from a variety of technologies and which are available from numerous manufacturers.
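The low-tech field use described above can be sketched as follows: an average velocity is computed from distance and elapsed time, passed to the generalized equation from the Abstract, and scaled by body mass to give an absolute oxygen uptake rate. All function names and the example walk are hypothetical, and the default resting value of 3.3 ml O_{2}·kg^{−1}·min^{−1} is the global adult constant mentioned in the Summary, which would not suit children.

```python
def average_speed(distance_m, time_s):
    """Average walking velocity (m/s) from distance and elapsed time."""
    if time_s <= 0:
        raise ValueError("elapsed time must be positive")
    return distance_m / time_s


def predicted_rate(distance_m, time_s, height_m, vo2_rest=3.3):
    """Mass-specific rate (ml O2/kg/min) from the Abstract's equation:
    VO2_total = VO2_rest + 3.85 + 5.97 * V**2 / Ht.
    """
    v = average_speed(distance_m, time_s)
    return vo2_rest + 3.85 + 5.97 * v ** 2 / height_m


def absolute_rate(rate_ml_kg_min, body_mass_kg):
    """Convert a mass-specific rate to an absolute rate (ml O2/min)."""
    return rate_ml_kg_min * body_mass_kg


# Hypothetical level walk: 1000 m in 800 s by a 1.70 m, 70 kg adult.
speed = average_speed(1000.0, 800.0)
rate = predicted_rate(1000.0, 800.0, height_m=1.70)
total = absolute_rate(rate, body_mass_kg=70.0)
print(speed, round(rate, 2), round(total, 1))
```

Because the equation is nonlinear in velocity, an average velocity taken over a walk with large speed changes would tend to understate the time-averaged metabolic rate; shorter timing intervals would reduce that effect.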

As health, medicine, military, and personal monitoring merge with mobile technologies, the accuracy of the generalized relationships available will be a primary determinant of the validity of the data streams that will inevitably become both widely available and heavily used.

## GRANTS

This work was made possible in part by U.S. Department of Defense Medical and Materiel Command Grant DAMD17-03-2-005 and Award W81XWH-12-2-0013, and by internal funds from Southern Methodist University to P.G. Weyand.

## DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the authors.

## AUTHOR CONTRIBUTIONS

L.W.L. and P.G.W. conception and design of research; L.W.L. and P.G.W. performed experiments; L.W.L. and P.G.W. analyzed data; L.W.L. and P.G.W. interpreted results of experiments; L.W.L. and P.G.W. prepared figures; L.W.L. and P.G.W. drafted manuscript; L.W.L. and P.G.W. edited and revised manuscript; L.W.L. and P.G.W. approved final version of manuscript.

## ACKNOWLEDGMENTS

We thank Dr. Kyle Roberts for statistical input and consultation, and Drs. Laurence Ryan and Kenneth Clark for scientific input and suggestions.

- Copyright © 2016 the American Physiological Society