INNOVATIVE METHODOLOGY
1Department of Biomedical Engineering, Vanderbilt University; 2Department of Medicine, Division of Gastroenterology, Vanderbilt University Medical Center, Nashville, Tennessee; and 3National Institute of Diabetes and Digestive and Kidney Diseases/Clinical Endocrinology Branch, National Institutes of Health, Bethesda, Maryland
Submitted 19 April 2007; accepted in final form 16 July 2007
| ABSTRACT |
|---|
physical activity; actigraph; IDEEA monitor; accelerometer; indirect calorimeter
Early modeling approaches relating activity counts and EE typically assumed a linear relationship between the activity count values and EE measured using indirect calorimeters (12, 15, 18, 24). Linear regression fits were used because of their computational simplicity and their ability to characterize well the energy costs of moderate-intensity ambulatory activities (walking and jogging). Although models based on this strategy provided an excellent first approximation of the relationship between acceleration signals and EE, they have generalized poorly to different PA types and subject populations (27). This is because the models were predominantly developed using short protocols containing set paces of dynamic PA and were developed on homogeneous subject populations. The estimation accuracy of generalized linear models also varies greatly between subjects with different personal characteristics (for example, age, height, body mass) because identical accelerations, and therefore identical activity count values, may not carry the same metabolic cost for these individuals.
Several investigators have sought to improve model accuracy by increasing the amount of information gathered during each measurement epoch. This effort has included adding acceleration dimensions at the hip (8, 20), adding sensors to the limbs (wrist and ankle) for more complete movement detection (8, 15), and coupling physical and physiological information, such as heart rate, near-body temperature, and skin impedance (6, 16, 22). With the use of the additional data collected by these devices, more mathematically sophisticated and in some cases more accurate models relating acceleration and energy expenditure have been developed, such as multiple linear regressions (14) and generalized nonlinear models (8, 21). Recently, a new model for EE estimation was developed that called for recording data in finer time intervals (1 s) using a uniaxial accelerometer rather than collecting more channels or types of sensor data (10). In this model, the minute-by-minute coefficient of variability (CV) was computed with 10-s data segments. This CV was used as an initial discrimination tool to determine which of two nonlinear models should be applied to the minute of data. This modeling approach was made possible in part because of improvements in the data storage capacity and battery life of modern accelerometers. Increases in the amount of data acquired from each minute of PA open the field to new analytic solution techniques that rely on multiple measurements acquired from each minute of measured activity.
By increasing the number of acceleration samples per minute, more analytically sophisticated approaches, relying on automated pattern recognition and machine learning, have been applied to several aspects of PA monitoring. The majority of this work has focused on identifying postures (7), locations within a finite space (17), or PA types (3, 19). High probabilities of correct identifications have been shown for several PA types and activity contexts. It has also been shown that the speed and incline of self-paced free-living walking can be estimated with the use of data acquired from treadmill walking using similar analytic frameworks (2). To our knowledge, however, no group has used accelerometer data, coupled with machine-learning algorithms, to predict minute-by-minute EE.
The purpose of this study was to expand on existing EE modeling techniques by capturing raw (32 Hz) acceleration signals from a biaxial accelerometer worn at the hip. We propose a feature extraction scheme in which the dense acceleration signals are reduced to a small number of simple-to-compute statistical parameters (features) that are well correlated with the minute-by-minute EE measured by a whole-room indirect calorimeter. The reduced signal information and subject demographics (sex, age, height, weight, body-mass index, and racial/ethnic background) were used to develop an artificial neural network (ANN) model to estimate minute-by-minute EE. Results of the ANN model were compared with both a traditional accelerometer regression equation and the proprietary output of a commercially available accelerometer array.
| METHODS |
|---|
One hundred and two healthy adults (46 men, 55 women) between the ages of 18 and 70 years completed this study. Subjects were nonsmokers, were free of diseases and medications known to alter metabolic rate, and had no major orthopedic limitations. The characteristics of these subjects are shown in Table 1.
Volunteers were recruited from the middle Tennessee area using flyers, e-mail distribution lists, and personal contact. Before participation, all subjects signed an informed consent document approved by the Vanderbilt University Committee for the Protection of Human Subjects. Each subject was asked to stay in the room calorimeter for 24 h while minute-by-minute activity data were acquired with multiple commercially available accelerometry-based PA monitors. Each subject was asked to engage in two structured activity intervals. The morning activity period included self-paced walking and jogging (both in the room and on the treadmill), and the afternoon activity period contained sedentary activities, such as deskwork, along with stationary biking (Fig. 1). Each prescribed activity was performed for 10 min followed by a 10-min rest period to allow the metabolic rate to return to baseline between intervals and to allow post hoc discrimination between activity types. During times when no activity was prescribed, subjects were encouraged to engage in their normal daily PA routine as much as possible. Each subject's height and body mass were measured on the morning of the study visit.
Activity-energy measurement system. EE was computed on a minute-by-minute basis by the Vanderbilt University room calorimeter, which is located within the Vanderbilt General Clinical Research Center. This system measures oxygen consumption and carbon dioxide production with high accuracy (system error of <1%). The room calorimeter is an air-tight environmental room measuring 2.5 x 3.4 x 2.4 m. The calorimeter is equipped with a toilet and sink, desk, chair, telephone, television, DVD player, stereo system, bed, treadmill, and exercise bike. Although the calorimeter floor contains a force plate and the room has several event markers, information from these systems was not utilized for these experiments. Technical details of the calorimeter have been previously reported (23).
Accelerometers. Subjects were outfitted with both the ActiGraph (Fort Walton Beach, FL) uniaxial accelerometer and a custom-designed activity monitor, which is a derivative of the commercially available IDEEA (MiniSun, Fresno, CA) monitor. The commercial IDEEA monitor consists of an array of five accelerometers (20 x 15 x 4 mm, 2 g) attached to the skin via hypoallergenic tape at the sternum, midthigh, and bottom of each foot. Each sensor is wired to a hip pack that synchronizes the signals from each channel and stores the data. Although high accuracy for the IDEEA PA type identification routine has been published (29), the study designed to validate the EE estimation routine contained only walking, jogging, and lying down (28), and the EE estimation approach has therefore not been subjected to a rigorous validation in estimation of EE associated with other PA types.
The custom IDEEA monitor used in these experiments includes all of the sensors from the original configuration but adds recording capability at the hip pack (biaxial, anterior/posterior, and medial/lateral), on each upper arm (uniaxial), and on the top of each hand (biaxial). Raw data (32 Hz) are collected at each of the custom sites with data reported separately for each axis of each sensor, and integrated signals are recorded by the original IDEEA sensors (Fig. 2). In this configuration, data can be acquired continuously throughout our study visits (21 h). To our knowledge, none of the commonly used commercially available PA monitors can record raw data for this length of time. Both the ActiGraph and the IDEEA hip pack were worn on a snug elastic band with both monitors located at the right hip. For this study, we used only the raw data from the hip sensors (biaxial) for analyses. Because most investigators only collect data at the hip and it would be ideal to collect field data from only one site to minimize the inconvenience to the subject, we felt it was important to explore model developments that could be applied to traditional hip-mounted accelerometers before expanding our study goals to include multisite analysis. Stationary biking was removed from the analysis for all monitors.
Modeling Approach
ANN modeling was selected to relate the features of the raw acceleration signal to measured EE on a minute-by-minute basis. ANN modeling is an information-processing paradigm inspired by the way the densely interconnected, parallel structure of the mammalian brain processes information (13). Models are developed using a learning process in which a series of connection weights, analogous to synapses, are tied to a series of processing elements, analogous to neurons. As the ANN is presented with input-output pairs, the weight values are adjusted until an optimal solution, in our case prediction of minute-by-minute EE, is achieved. ANN is a good candidate model when there are a large number of inputs for a small number of outputs or when the ideal functional form of the solution is not known (4).
To implement ANN, we begin by specifying the number of inputs (acceleration or subject characteristic terms), the number of weight values (interactions between the terms), the architecture of the model, and the number and type of output parameters (EE, a single continuous variable) (Fig. 3). A single neuron receives multiple inputs, which represent characteristics of the acceleration signal or the subjects themselves. The relative importance of each input is specified by a weight value. A single neuron is not capable of solving difficult problems because it may not allow for all required nonlinearities or interaction terms, so multiple neurons are arranged into computational layers linked by transfer functions.
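The layered arrangement described above can be illustrated with a minimal sketch of a forward pass through a network with one hidden layer. The layer sizes, the tanh transfer function, the linear output neuron, and the random weight values are illustrative assumptions only; they are not the architecture or the weights of the model developed in this study.

```python
import math
import random


def neuron(inputs, weights, bias, transfer):
    """Weighted sum of the inputs passed through a transfer function."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(total)


def forward(inputs, hidden_layer, output_neuron):
    """One hidden layer (tanh, assumed) feeding a single linear output (EE)."""
    hidden = [neuron(inputs, w, b, math.tanh) for w, b in hidden_layer]
    out_weights, out_bias = output_neuron
    return neuron(hidden, out_weights, out_bias, lambda z: z)


def random_network(n_inputs, n_hidden, seed=0):
    """Hypothetical helper: untrained weights, for demonstration only."""
    rng = random.Random(seed)
    hidden_layer = [
        ([rng.uniform(-1, 1) for _ in range(n_inputs)], rng.uniform(-1, 1))
        for _ in range(n_hidden)
    ]
    output_neuron = ([rng.uniform(-1, 1) for _ in range(n_hidden)],
                     rng.uniform(-1, 1))
    return hidden_layer, output_neuron


if __name__ == "__main__":
    # 15 inputs and 8 hidden neurons are assumed sizes, not values from the study.
    hidden_layer, output_neuron = random_network(n_inputs=15, n_hidden=8)
    example_minute = [0.1] * 15
    print(forward(example_minute, hidden_layer, output_neuron))
```

Training would then adjust the weights and biases so that the output approaches the calorimeter-measured EE for each minute; that step is omitted from this sketch.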
Validation was performed by leave-one-subject-out cross validation. In this approach, the total data were divided into a training set (n = 101) and a testing set (n = 1). The training set data were used to optimize the model to the estimation of minute-by-minute EE, whereas the test set data, which were not used to derive the model, were used to assess the performance of the model on new data (25). This process was repeated 102 times so that the model performance on each subject's data could be assessed. For each validation step, training ended when the error on the validation set failed to decrease by more than 1e-6 per iteration (after an initial drop), the error gradient fell below 1e-6, or 5,000 iterations were reached.
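The fold construction and the three stopping conditions described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, each minute of data is assumed to carry a subject identifier, and the "initial drop" is approximated by requiring at least two recorded error values before the error-change test applies.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx), one fold per unique subject.

    subject_ids holds one entry per minute of data, so all minutes from the
    held-out subject go to the test set and none leak into training.
    """
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


def should_stop(error_history, gradient_norm, tol=1e-6, max_iter=5000):
    """Return True when any of the three stopping conditions is met.

    error_history: validation error after each completed iteration.
    gradient_norm: magnitude of the current error gradient.
    """
    if len(error_history) >= max_iter:
        return True  # iteration limit reached
    if gradient_norm < tol:
        return True  # error gradient has flattened
    if len(error_history) >= 2:
        decrease = error_history[-2] - error_history[-1]
        if decrease <= tol:
            return True  # error failed to decrease by more than tol
    return False


if __name__ == "__main__":
    minutes = ["s01", "s01", "s02", "s03", "s03", "s03"]
    for held_out, train_idx, test_idx in leave_one_subject_out(minutes):
        print(held_out, len(train_idx), len(test_idx))
```

Splitting by subject rather than by minute is the design choice that matters here: it keeps every minute from the test subject out of training, so the test error reflects performance on a new individual.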
Feature extraction. Feature extraction is the key step in preparing raw data for ANN modeling. The purpose of this step is data reduction. In this study, 1,920 (32 samples/s x 60 s/min) data points are collected by each IDEEA sensor channel for each minute of study data collected. These values all correspond to a single measurement made by the indirect calorimeter. This amount of information quickly becomes cumbersome to analyze; more importantly, redundant information is likely contained in the acceleration signals. It is therefore vital that the raw data are reduced into a small number of parameters that carry the most relevant information. We chose to reduce the data into a series of parameters that we felt were both statistically relevant and physically meaningful. Eleven parameters were extracted from each channel of raw data [median, integral, peak intensity, interquartile interval, skew, kurtosis, peak CV over any 10 s of data, lowest 10-s CV, mean absolute error (MAE), sum of signal power above 0.7 Hz, and sum of signal power below 0.7 Hz]. The signal power cutoff of 0.7 Hz was determined by optimizing the division in the power spectral density between walking and sedentary tasks in a sample of 10 subjects not used for model development. The 11 computed acceleration parameters were then analyzed based on their correlations with one another to eliminate redundant information. This step reduced the inputs to five for each hip sensor channel. These consisted of the peak value, the interquartile interval, the lowest coefficient of variability when each minute of data was analyzed in 10-s increments, the sum of the signal power below 0.7 Hz, and the sum of the signal power above 0.7 Hz. These data were joined in the input set by the subjects' sex, age, height, body mass, and ethnic background because these features have been shown to impact resting metabolic rate and can be easily measured or self-reported (11).
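The five retained per-channel features can be sketched for one minute of raw data as follows. This is an illustration rather than the study's implementation (which used Matlab). Several details are assumptions: the peak is taken as the largest absolute value, the CV is the standard deviation divided by the absolute mean, the mean is removed before the spectrum is computed, and a naive discrete Fourier transform is used only to keep the sketch free of dependencies (an FFT routine would normally be used). All function names are hypothetical.

```python
import cmath
import math
import statistics

FS = 32          # sampling rate in Hz (from the text)
CUTOFF_HZ = 0.7  # band split in Hz (from the text)


def window_cv(samples):
    """Coefficient of variation (SD / |mean|); inf if the mean is zero."""
    mean = statistics.fmean(samples)
    if mean == 0:
        return math.inf
    return statistics.pstdev(samples) / abs(mean)


def band_powers(samples, fs=FS, cutoff=CUTOFF_HZ):
    """Sum of squared DFT magnitudes below and above the cutoff.

    The mean is removed first (assumption), so the DC term is excluded.
    """
    n = len(samples)
    mean = statistics.fmean(samples)
    centered = [x - mean for x in samples]
    low = high = 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if k * fs / n < cutoff:
            low += power
        else:
            high += power
    return low, high


def minute_features(samples, fs=FS, window_s=10):
    """Reduce one channel of raw samples to the five retained features."""
    q1, _, q3 = statistics.quantiles(samples, n=4)
    step = fs * window_s
    windows = [samples[i:i + step]
               for i in range(0, len(samples) - step + 1, step)]
    low, high = band_powers(samples, fs)
    return {
        "peak": max(abs(x) for x in samples),
        "iqr": q3 - q1,
        "min_cv": min(window_cv(w) for w in windows),
        "power_low": low,
        "power_high": high,
    }


if __name__ == "__main__":
    # Synthetic 2-Hz oscillation around an offset of 1.0 for one minute.
    minute = [1.0 + 0.5 * math.sin(2 * math.pi * 2 * t / FS)
              for t in range(FS * 60)]
    print(minute_features(minute))
```

For a full minute at 32 Hz the function receives 1,920 samples per channel and returns five numbers, which is the scale of data reduction the text describes.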
Feature extraction was designed such that the final inputs to the model are quantities researchers are generally familiar with (at least conceptually). Additionally, by using a small number of easily computed data features, the storage requirements for any future activity monitors would be minimized because raw data would not need to be stored, only the relevant computed parameters. This effectively minimizes the amount of internal storage capacity required of the accelerometer while maintaining the quality of information derived from the raw signal. Model development and feature extraction were performed with Matlab 7.01 (Mathworks, Natick, MA).
Statistical Analysis
Data are presented as means, standard deviations, and total ranges. Models (AGFW, proprietary IDEEA, ANN) were compared on a per-subject basis according to the MAE (Eq. 1), the mean squared error (MSE) (Eq. 2), the absolute percent difference between each model and the measured total energy expenditure (TEE), and the squared Pearson's correlation coefficient (r²) for each subject over the entire study duration, using ANOVA with post hoc Tukey tests. Bland-Altman plots (5) were used to examine trends in total EE estimation relative to the calorimeter.
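| $\mathrm{MAE} = \dfrac{1}{n}\sum_{i=1}^{n}\lvert EE_{\mathrm{measured},i} - EE_{\mathrm{estimated},i}\rvert$ | (1) |
| $\mathrm{MSE} = \dfrac{1}{n}\sum_{i=1}^{n}\left(EE_{\mathrm{measured},i} - EE_{\mathrm{estimated},i}\right)^{2}$ | (2) |
where $EE_{\mathrm{measured},i}$ and $EE_{\mathrm{estimated},i}$ are the calorimeter-measured and model-estimated EE for minute $i$, and $n$ is the number of minutes analyzed for the subject.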
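The four per-subject comparison metrics named above can be sketched as follows, assuming their standard definitions and that both series are aligned minute by minute. The function names are hypothetical, and the example values are made up for illustration; they are not data from this study.

```python
import math


def mae(measured, estimated):
    """Mean absolute error between minute-by-minute EE series."""
    return sum(abs(m - e) for m, e in zip(measured, estimated)) / len(measured)


def mse(measured, estimated):
    """Mean squared error between minute-by-minute EE series."""
    return sum((m - e) ** 2 for m, e in zip(measured, estimated)) / len(measured)


def abs_percent_diff_total(measured, estimated):
    """Absolute percent difference between estimated and measured total EE."""
    total_measured = sum(measured)
    return abs(sum(estimated) - total_measured) / total_measured * 100.0


def r_squared(measured, estimated):
    """Squared Pearson correlation coefficient."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_e = sum(estimated) / n
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(measured, estimated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov ** 2 / (var_m * var_e)


if __name__ == "__main__":
    # Illustrative values only (kcal/min).
    measured = [1.0, 2.0, 3.0, 4.0]
    estimated = [1.5, 2.0, 2.5, 5.0]
    print(mae(measured, estimated), mse(measured, estimated))
    print(abs_percent_diff_total(measured, estimated))
    print(r_squared(measured, estimated))
```

A high r² can coexist with a large MAE or MSE, because correlation is insensitive to scale and offset errors; this is why the magnitude-based metrics are reported alongside it.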
| RESULTS |
|---|
50 iterations (Fig. 5, inset). The number of iterations to solution convergence can be altered through the model learning rate, which was set at 0.01 for this model. A linear regression was performed on the MSE vs. number of iterations to convergence. The slope of this regression was not significantly different from zero (P = 0.9214), suggesting that the number of iterations to model convergence did not significantly impact the final error.
| DISCUSSION |
|---|
Although the ActiGraph, proprietary IDEEA model, and the ANN model all had high correlation with the TEE on a minute-by-minute basis, this is more reflective of the capability of the accelerometer to determine whether any motion is present than of the model accurately reflecting minute-by-minute EE intensities. When measures that reflect the magnitude of the EE differences observed on a minute-by-minute basis (MAE or MSE) are considered, significant reductions in error were observed in the IDEEA model relative to AGFW and in the ANN relative to both of the other models. The cumulative effect of these error reductions can be observed in the absolute percent difference between measured and estimated TEE. There is a reduction of nearly 13% between AGFW and the ANN and 5% between the IDEEA monitor and the ANN. The mean of the difference in TEE was also greatly reduced by the ANN relative to the IDEEA and AGFW, with the mean difference in the ANN model being only 21 kcal, suggesting that the model has corrected for some of the baseline offset (resting EE) problems that have been previously observed using the IDEEA model. In the future, the ANN approach should be compared with other nonlinear EE estimation approaches to test the capabilities of the ANN relative to more sophisticated modeling approaches using the ActiGraph.
One of the biggest challenges involved in generalized modeling with accelerometers has been the large standard deviation of estimations between subjects. In this study, we found high individual estimation errors by AG and IDEEA compared with the measured EE [95% CI (–835 to 125) and (–187 to 647) kcal/day, respectively]. These values are beyond the treatment effect that we typically target in sustainable weight loss interventions, which is 100–250 kcal/day. Thus reducing this measurement error has crucial clinical implications. Large variability in performance across subjects may be related to the fact that standard regression approaches do not have sufficient flexibility to alter estimations when the same acceleration count value is achieved by a subject whose personal characteristics are different from those used for model development. The standard deviation in the TEE observed with the ANN, which allows interactions between characteristics and acceleration terms, showed a reduction of 50% relative to AGFW and nearly 45% relative to the proprietary IDEEA model. The IDEEA uses similar characteristics to those used in the ANN and still exhibited a higher variability.
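The between-subject spread discussed above can be summarized with Bland-Altman style statistics: the mean difference (bias) between estimated and measured TEE and an interval of the bias plus or minus 1.96 standard deviations of the per-subject differences. Whether the intervals reported here were computed in exactly this way is an assumption, and the values in the example are made up for illustration.

```python
import statistics


def bland_altman(measured, estimated, z=1.96):
    """Return (bias, lower, upper) for per-subject TEE differences.

    Differences are estimated minus measured; the limits are
    bias +/- z * sample standard deviation of the differences.
    """
    diffs = [e - m for m, e in zip(measured, estimated)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - z * sd, bias + z * sd


if __name__ == "__main__":
    # Illustrative per-subject TEE values in kcal/day, not study data.
    measured = [2000.0, 2200.0, 2500.0]
    estimated = [2100.0, 2200.0, 2600.0]
    print(bland_altman(measured, estimated))
```

Limits that are wider than the 100–250 kcal/day effect targeted by weight loss interventions indicate that a model cannot resolve changes of that size for an individual subject.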
ANN has a number of attractive features for the energy expenditure estimation problem, such as the flexibility of estimations across subjects, allowing interactions between all input terms, and its ability to map multiple inputs (acceleration terms) to a single output (EE) without prespecifying a functional form (linear, logistic, and so forth). The major disadvantages of ANN approaches are the computational complexity of the models, which may require long training times, and a relatively large number of free parameters. Additionally, model training requires a large number of labeled examples (acceleration data from a diverse sample of PA types for which the EE is known), which requires a long data collection period before models can be developed. These models are also more difficult to disseminate to potential users, which would require developers to create macros or other simple-to-use tools for distribution. This is in contrast to generalized linear and nonlinear models, which can easily be presented in a manuscript and implemented by most researchers.
Although a classical ANN may seem like a black box solution technique, we attempted to minimize this appearance by carefully selecting model inputs that make sense in the context of the EE estimation problem. We chose terms that represent the magnitude and frequency of movements and the variability in motion patterns, which may be characteristic of certain PA types. This feature extraction process does require the model developer to make decisions about what data features may be of interest. Alternatively, feature extraction can be performed by a standard data reduction technique such as principal component analysis, which achieves data reduction by combining parameters that are linearly related. This process maximizes the amount of information retained from the data while eliminating repetitious measurements. The advantage of this technique is its capability to succinctly and consistently reduce data to the desired proportion of the total data variance. The disadvantage is that the reduced data are not composed of characteristics that would be familiar to researchers; rather, the data contain features representing agglomerations of measurements.
Perhaps the most challenging aspects of model development are collecting appropriate model training data and validation. Because there are literally hundreds of modes of PA that individuals may engage in and at least that many profiles for subjects' metabolic responses to exercise, models will tend to generalize best to data sets composed of activities similar to those that were used for the original model development. We have attempted to mitigate this factor by 1) asking subjects to self-pace activities and 2) capturing spontaneous bouts of PA. These two steps allow the collected data to be both diverse in intensity composition and representative of the activity patterns our subjects would normally engage in. To minimize the potential error increases associated with applying our model to new subjects (generalization errors), a leave-one-subject-out cross-validation procedure was selected. This technique allows the bulk of the collected data to be used in model development, in contrast to a split-sample validation, in which a much larger percentage of the total data is withheld. The data from the validation sample may include unique features that would have affected the model development had they been available, so it is desirable to use as much data as possible for model training. The model presented here, however, is meant only to prove that, in principle, a high-dimensional modeling approach such as ANN coupled with feature extraction from raw (32 Hz) acceleration signals can be used for EE estimation on a minute-by-minute basis, and the specific weighting coefficients should not be viewed as final. A split-sample validation should also be implemented once a larger data sample has been collected to characterize the model performance more independently.
In conclusion, accelerometers have long been considered a promising tool for estimating EE because of their relatively low price, ease of use, and ability to record for many days at a time. This potential has not been fully met to date because of limitations in our ability to relate the output variables from the monitors to EE. Collecting raw acceleration data can improve the precision of EE estimation by giving researchers the flexibility to identify relevant parameters during the feature extraction phase and by opening the field to high-dimensional modeling techniques such as ANN, which can generate more flexible estimations than more traditional modeling techniques. This study has shown a proof of concept that, by applying feature extraction and ANN models to biaxial acceleration data acquired at the hip, minute-by-minute and total EE estimations can be improved. Data from additional subjects and modes of PA should be acquired both to validate the current model and to develop a more robust algorithm.
| FOOTNOTES |
|---|
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.