## Abstract

Asthma and COPD are chronic respiratory diseases that fluctuate widely with regard to clinical symptoms and airway obstruction, complicating treatment and prediction of exacerbations. Time series of respiratory impedance obtained by the forced oscillation technique are a convenient tool to study the respiratory system with high temporal resolution. In previous studies it was suggested that power-law-like fluctuations exist also in the healthy lung and that respiratory system impedance variability differs in asthma. In this study we elucidate such differences in a population of well-characterized subjects with asthma (*n* = 13, GINA 1+2), COPD (*n* = 12, GOLD I+II), and controls (*n* = 10) from time series at single frequency (12 min, *f* = 8 Hz). Maximum likelihood estimation did not rule out power-law behavior, accepting the null hypothesis in 17/35 cases (*P* > 0.05) and with significant differences in exponents for COPD (*P* < 0.03). Detrended fluctuation analysis exhibited scaling exponents close to 0.5, indicating few correlations, with no differences between groups (*P* > 0.14). In a second approach, we considered asthma and COPD as dynamic diseases, corresponding to changes of unknown parameters in a deterministic system. The similarity in shape between the combined probability distributions of normalized resistance and reactance was quantified by Wasserstein distances and reliably distinguished the two diseases (cross-validated predictive accuracy 0.80; sensitivity 0.83, specificity 0.77 for COPD). Wasserstein distances between 3+3 dimensional phase space reconstructions resulted in marginally better classification (accuracy 0.84, sensitivity 0.83, specificity 0.85). These latter findings suggest that the dynamics of respiratory impedance contain valuable information for the diagnosis and monitoring of patients with asthma and COPD, whereas the value of the stochastic approach is not clear presently.

- detrended fluctuation analysis
- forced oscillation technique
- monitoring airway obstruction
- nonlinear dynamics
- Wasserstein distances

asthma and chronic obstructive pulmonary disease (COPD) are the two most common chronic respiratory diseases, affecting millions of people worldwide (19, 20). A characteristic of these diseases is their variability in clinical symptoms and in the degree of airway obstruction. Airway caliber is fluctuating due to a complex interplay of environmental and endogenous stimuli and the dynamics of airway smooth muscle (ASM). In addition, these dynamics are affected by inflammatory processes that are difficult to assess noninvasively (40). Recent models also indicate that bronchoconstriction might lead to self-organized cascades of ventilatory breakdown (57), and the sporadic nature of this phenomenon might explain the difficulties in predicting life-threatening exacerbations in both diseases.

Daily measurements of peak expiratory flow (PEF), on the other hand, exhibit long-range correlations over the course of months, indicating the existence of memory in the respiratory system (14, 15). Interestingly, the variability in these PEF time series seems to correlate with predictability of airway obstruction, where increased airway instability occurs when correlations become weaker. Although the physiological origin of these long-range baseline correlations is mostly unknown, these findings highlight the importance of fluctuations in the respiratory system and suggest the existence of similar phenomena also on shorter time scales. Indeed, a well known example is provided by interbreath intervals that exhibit characteristic scaling exponents (13, 42, 53).

Here we focus on signals obtained by the forced oscillation technique (FOT). Due to its relative ease, noninvasiveness, and good tolerance, FOT has become a valuable tool to investigate airway properties (38). A small pressure perturbation is superimposed on the inflowing air and the integrated response of the airways recorded. Assuming a linear response allows to estimate (input) respiratory system impedance (*Z*rs_{in}), a complex quantity whose real and imaginary parts are respiratory resistance (*R*rs) and reactance (*X*rs), such that *Z*rs_{in} = *R*rs + j *X*rs (32). Commonly, only the magnitude (*R*rs^{2} +*X*rs^{2})^{1/2} is referred to as impedance *Z*rs, complemented by the phase angle arctan(*R*rs/*X*rs), and we also adopt this convention in the rest of the paper. Clinically, these quantities have proven useful to assess and study lung diseases (21), and this has been investigated in a plethora of studies. However, apart from a few exceptions, only the mean values of *Z*rs, averaged over a few breathing cycles, are used (28). It is the aim of the present study to investigate whether more involved analysis techniques allow to distinguish between asthma, COPD, or controls with increased accuracy, compared to just using mean *Z*rs.

For completeness, we mention that factors influencing *Z*rs include exercise, posture, and sympathetic tone (4). Within-breath measurements of *Z*rs show a marked biphasic pattern that is the result of volume and flow dependence (9, 55) with slightly distinct behavior for inspiratory and expiratory phases (39). This modulation is partially attributed for by interference with the larynx and glottis (51), but also hints at hysteresis, i.e., dependence on the volume history, in the respiratory system (59). On top of these, *Z*rs is influenced by ventilatory inhomogeneities (17), airway caliber (43), interactions between airways (61), and various artifacts, the most problematic being the upper airways shunt (5, 38).

Separation of *Z*rs into contributions from various airways and tissue components is possible by the use of mathematical compartment models (12). However, this necessitates the use of complex excitation signals and estimation procedures and imposes limits on the temporal resolution, which is on the order of the inverse of the excitation frequency. Therefore, single frequency excitation signals are preferred to track *Z*rs in real time, and the role of deep inspirations and methacholine challenge has been studied thereby (29, 49, 55).

From the above we postulate that the temporal course of *Z*rs should contain valuable information that is differentially affected by respiratory diseases. In contrast to most studies that consider significance probabilities of differences on the group level, we estimate predictive accuracy when classifying individual subjects with regard to group membership. Employing cross-validated linear discriminant analysis (LDA) allows to quantify the amount of functionally differentiated information contained in the *Z*rs signals.

We present results obtained by two distinct approaches. In the first, *Z*rs time series are considered to arise from a stochastic process and we analyze their “noise” component (9) by distributional analysis of fluctuations, similar to the analysis of Que et al. (45). On the basis of recent recommendations, we employ state-of-the-art maximum likelihood estimation (8, 60). Furthermore, we employ detrended fluctuation analysis (DFA) (41) to quantify the extent to which *Z*rs signals are correlated in time and how self-similar these fluctuations appear; to our knowledge this method has not been applied to impedance measurements before.

In a second approach, the *Z*rs signal is considered to arise from a dynamic system. Thereby we will assume the respiratory system to contain a deterministic component. Moreover, we will make the assumption that the dynamic evolution of this deterministic component changes in distinct ways in airways influenced by respiratory diseases. This idea of a “dynamic disease” postulates that diseases correspond to changes in control parameters that subsequently modify the dynamic behavior of the system (18). Using the methods of nonlinear time series analysis (25, 52), we quantify differences between phase space reconstructions of normalized impedance time series.

## METHODS

### Subjects

We analyzed data collected during the baseline phase of a previous study (50). The population consisted of 13 patients with intermittent and mild persistent asthma, characterized by GINA guidelines as step 1 and 2 (19) asthma; 12 patients with mild to moderate COPD, characterized as GOLD type I and II (20); and 12 healthy control subjects.

The patients with asthma were all nonsmokers or ex-smokers with less than five pack years exposure and had a history of episodic wheezing or chest tightness. Baseline forced expiratory volume in 1 s (FEV_{1}) was >70% of predicted, and the provocative concentration of methacholine for a 20% fall in FEV_{1} (PC_{20}) was <8 mg/ml. All asthma patients were atopic, which was determined by one or more positive skin prick tests against 10 common aeroallergens.

The COPD patients were all smokers or ex-smokers with more than ten pack years exposure and had a history of chronic cough or dyspnea. Their FEV_{1}/FVC ratio was <70% predicted postbronchodilator, and the reversibility of FEV_{1} by salbutamol was <12% of predicted.

All patients were clinically stable, used β_{2}-agonists on demand only, and had no history of respiratory tract infection or other relevant diseases up to 2 wk prior to the study. None of the asthma or COPD patients had used inhaled or oral corticosteroids up to 3 mo prior to the study.

The healthy controls had no history of respiratory symptoms and were nonsmokers or ex-smokers with less than five pack years exposure. Baseline FEV_{1} was >80% of predicted and PC_{20} methacholine was >16 mg/ml. They showed no positive reaction to the skin prick test.

The baseline characteristics of the three groups have been listed previously (50). The Institutional Review Board for Human Studies approved the protocol, and the subjects gave their written informed consent before entering the study.

### Forced Oscillation Method

A forced oscillation device (Woolcock Institute) with a fixed oscillation frequency of 8 Hz and an amplitude of ±1 cmH_{2}O was used after calibration with tubes of known resistance. Subjects breathed through an antibacterial filter with a resistance of 0.2 cmH_{2}O·l^{−1}·s and respiratory flow was measured by a Fleisch pneumotachograph (diameter 50 mm, Vitalograph, Maids Moreton, UK). Differential pressure was measured by a ±2.5 cmH_{2}O solid-state transducer (Sursense DCAL4; Honeywell Sensing and Control). A similar transducer with a higher range (±12.5 cmH_{2}O) was used to measure mouth pressure (50). The pressure generator was connected by a 50-cm-long inertive tube.

The pressure and flow signals were digitized at 400 Hz, and the resulting time series were transformed to the time-frequency domain by a maximal overlap discrete Fourier transform that acts as a band-pass filter for the frequency 8 Hz (filter width 100 samples, i.e., a time window of 0.25 s). Time- and frequency-dependent complex respiratory impedance *Z*rs_{in} was then estimated, based on the hypothesis that random errors occur in both pressure and flow signals, by a total least squares (TLS) fit, which is equivalent to maximum likelihood estimation. Further details can be found in the online supplement of Slats et al. (50).

### Time Series and Artifact Removal

Respiratory impedance was measured three times during 60 s of tidal breathing on four distinct days during the course of a few weeks, yielding 12 time series per subject. Before further analysis the signals were downsampled to 16 Hz, i.e., the Nyquist frequency for the applied pressure oscillation, to decrease the computational effort (this is admissible here where we were mainly interested in properties for larger time scales). Artifacts were removed automatically by a custom-written algorithm. Each respiratory cycle was considered individually and rejected if *1*) negative resistance values occurred or the TLS estimation did not converge (indicative of artifacts) or *2*) flow limitation occurred. The latter was detected if the difference in mean *X*rs for inspiratory and expiratory phases was >3 cmH_{2}O·l^{−1}·s or if the difference in peak *X*rs for inspiratory and expiratory phases exceeded 8 cmH_{2}O·l^{−1}·s (10; with slightly adjusted threshold values). These events occurred infrequently and only a few percent of breathing cycles were thereby rejected.

### Analysis

#### Statistical properties.

In addition to mean and standard deviation (SD) over time, the time series were characterized by two functions of higher moments. Excess kurtosis, defined as the fourth central moment divided by the square of the variance minus 3, was used as a measure of peakedness of the distributions, and skewness, defined as the third central moment divided by the third power of SD, was used as a measure of asymmetry of the distributions. Distributions of parameters are shown in box and whisker plots for each group separately (with labels A: asthma, C: COPD, N: controls), that also show significance probabilities of differences in mean (unpaired two-sample nonparametric Wilcoxon tests) for all pairwise contrasts and additionally between controls and a “diseased” *group D* that comprises both asthmatics and COPD patients. Statistical significance was tested at the *P* = 0.05 level.

#### Power-law analysis.

Fluctuations of *Z*rs were defined by
*f*(x) of *Z*var follows a power-law if it is proportional to x^{α}, with an exponent α ≤ −1. Thereby, it is assumed that power-law behavior is only present for values x ≥ x_{min} for a finite threshold x_{min} > 0. The probability density of the two-parameter power-law distribution is then
_{min} is estimated by optimizing the goodness-of-fit of the estimated power-law with the empirical distribution of the data, quantified by the Kolmogorov-Smirnov statistic D (8). To avoid spurious minima for very short tails, only values of x_{min} were considered such that at least 200 values of *Z*var remained in the tail. Figure 1 shows an example of the estimation procedure. To compare with results obtained in Que et al. (45), we also extrapolate the probability distribution for unit fluctuations, i.e., consider log_{10} *f*(1) = log_{10} (−α−1)+(α+1) log_{10}(x_{min}).

To test whether the power-law hypothesis is a viable description of the data, we employed the permutation test of Clauset et al. (8). A number of synthetic dataset were generated with similar distributional properties as the original data. The power-law estimation was repeated independently for each of these and the fraction of values of the Kolmogorov-Smirnov statistic D that were larger than its value for the observed data forms the significance probability (*P* value) of this test. For a conservative test, if *P* < 0.05 the null hypothesis of power-law behavior was rejected, otherwise power-law behavior was accepted to be a valid model of the experimental data. We employed this test with 100 Monte Carlo datasets for each *Z*var time series, but did not correct for multiple comparisons (*n* = 35). Note that accepting the null hypothesis does not prove the existence of power-law behavior, but the compatibility of this model, i.e., that there is no significant evidence against the power-law hypothesis.

As an alternative to the power-law behavior we fitted two- and three-parameter log-normal distributions (47) to both the tail and the complete set of *Z*var values and considered the three-parameter truncated power-law distribution (1) that introduces an additional upper bound x_{max} and allows for general exponents α < 0. The significance of an improvement in fitting the data was assessed by the asymptotic likelihood ratio test.

#### Detrended fluctuation analysis.

Individual time series of *Z*rs were submitted to DFA (41) to assess self-similar behavior. Thereby, the deviations of the X(*i*)=*Z*var(*i*) time series from the mean were first integrated,
*Y*(*i*) was then divided into *N _{s}*=int(

*N*/

*s*) nonoverlapping segments of length

*s*. Since the length

*N*of the time series is usually not a multiple of the scale

*s*, a short part at the end of the profile may remain. In order not to disregard this part of the series, the procedure was repeated starting from the opposite end, leading to a total of 2

*N*segments (24). For each such segment either a linear (DFA1) or a quadratic (DFA2) trend was estimated by least-squares regression and subtracted from the data. The squares of the residuals were integrated and divided by the length to yield the mean-square error

_{s}*F*

^{2}(

*j*,

*s*) of the

*j*th segment at scale

*s*. The second-order fluctuation function is given by the total root-mean square (RMS) error,

*F*(

*s*) was assessed in a double logarithmic plot for a variety of scales. The smallest scale considered was

*s*= 8 samples. The scale was successively increased by a factor of √2 until

*s*was at most half of the length of the time series.

Scaling behavior results in a line in the double logarithmic plot of *F*(*s*) against *s*, which was estimated by weighted linear regression. Weights proportional to the inverse of scale were used to account for the fact that the larger scales are estimated from less segments, with increased variance. The occurrence of two separate scaling regimes (crossover phenomenon) was tested by assuming each scale individually as a change point and separately fitting lines to *F*(*s*) for both scales smaller and for scales larger. If the introduction of a change point led to an decrease in total RMS error of more than a factor 5, the existence of a crossover phenomenon was assumed. Figure 5*A* shows an example.

#### Wasserstein distances.

The time series were alternatively assumed to be projections of a dynamic system, considering *R*rs and *X*rs as two dynamic variables. In the simplest case the two-dimensional scatterplot of *R*rs vs. *X*rs captures the dynamic range of the impedance over the course of time. To allow for comparisons of distinct systems and to put the two parameters on equal standing, we normalized their marginal distributions independently to zero mean and unit variance and considered the resulting two-dimensional joint probability distributions (Fig. 2).

To quantify differences between two such empirical probability distributions we employed quadratic Wasserstein distances (35, 36). Assume X(*i*) and Y(*i*) {*i*=1, …, *N*} to be two vector-valued time series (here with normalized *R*rs and *X*rs as its two components) of the same length *N*. In this case the quadratic Wasserstein distance between X and Y reduces to the optimal transportation problem
*N*}. The value W_{2}(X,Y) can be interpreted as the average work (in terms of distance) needed to transform one distribution into the other after optimally matching sample points. This convex optimization problem was solved numerically by a network simplex algorithm (30).

Since the computation of W_{2}(X,Y) is time consuming, and the two series X and Y do not have the same length in general, we bootstrapped all distances by randomly sampling 512 points from both probability distributions a total of K=25 times each, using the mean of the K distance values so obtained as an unbiased estimator of W_{2}(X,Y).

#### Nonlinear analysis.

Generalizing the approach of the previous section, we recovered the state space of the impedance dynamics by delay vector embedding, as common in nonlinear time series analysis (25, 52). Let X(*i*) be a (normalized) time series, then
_{m}(*i*) contain essentially new information on the state of the system, and a convenient choice for the optimal lag τ is the decorrelation time, i.e., the lag at which the autocorrelation function (ACF) of X falls to 1/e.

The embedding dimension m should be large enough to unfold the dynamics properly and was here estimated by the method of false nearest neighbors (FNN) (26). If the distance between a point X_{m}(*i*) and its nearest neighbor X_{m}[k(*i*)] increases by a factor of more than 10 when increasing the embedding dimension m, then X_{m}[k(*i*)] is classified as a (relative) false neighbor. The optimal embedding dimension was assessed as the value of m where the fraction of FNN dropped below the 1 percent threshold.

Combining two m-dimensional delay vector series for *R*rs and *X*rs into a 2m dimensional vector-valued time series *X*_{2m}, we again employed the Wasserstein distances (see above) in 2m dimensions to robustly quantify differences in the shape of their impedance dynamics for each pair of subjects. This approach was pioneered by Moeckel and Murray (34).

#### Multidimensional scaling.

Measuring distances between phase space distributions for each pair of subjects (*n* = 35) leads to an n-by-n matrix of distances D* _{ij}* = W

_{2}(X

*,X*

_{i}*). To analyze these statistically it is advantageous to represent these distances by points in an Euclidean space R*

_{j}^{k}, which is achieved by metric multidimensional scaling (MDS) (3). The coordinates of points in the space R

^{k}are ordered according to their contribution to the variance of the distances and can often be assigned functional meaning or lead to the formulation of hypotheses about it (35). This dimension reduction method introduces misrepresentation errors, which were quantified by stress-per-point; for the

*i*th point with k coordinates x

*the stress-per-point is the average of (|x*

_{i}*−x*

_{i}*| −D*

_{j}*)*

_{ij}^{2}over all points

*j*≠

*i*.

#### Discriminant analysis.

To assess the *predictive* value of observed differences between groups of subjects, we employed linear discriminant analysis (LDA) as implemented in the statistical software R (56). A linear discriminant function is obtained that projects each of the k data items (from c distinct groups) onto r numerical scores, where r = min(c−1,k), corresponding to the optimal linear separation between the classes. The latter is given by a linear combination that maximizes a generalized signal-to-noise ratio, leading to r canonical variates that identify a vector subspace containing the variability across the c classes. From the r scores data items are classified according to the nearest group centroid, measured in terms of Mahalanobis distance. These centroids are estimated from the sample covariance matrix, and LDA makes the assumption that there are no differences between the group covariances. Here we will use LDA on the k MDS coordinates (derived from the Wasserstein distances), an idea that goes back to Anderson and Robinson (2).

Results are given in terms of total accuracy, i.e., the fraction of correctly identified group memberships, and for binary classifications additionally in terms of sensitivity and specificity relative to the diagnosis of a positive condition (mostly COPD in the following) against a negative (mostly asthma). Let TP denote the number of true positives, FP the number of false positives, and TN, FN analogously for the negatives. Then

#### Cross-validation and worst-case classifier.

All classification methods suffer from the fact that resubstitution accuracy, i.e., predictive accuracy of the data used to derive a classifier, invariably improves as the prediction model becomes more complex. Eventually the prediction model can distinguish data items by irrelevant chance differences, i.e., by using the noise in measurements to classify. To control for such overfitting we employed leave-one-out cross validation. When using LDA on coordinates derived by MDS of Wasserstein distances, it is then necessary to estimate the MDS coordinates of a time series in the MDS space defined by the remaining *n*-1 time series. This is achieved by estimating the fallible scalar products by a nonlinear optimization procedure (54) and is the reason why we consider metric MDS here instead of nonmetric generalizations (3).

All classification accuracies should be judged in the light of *1*) the null accuracy, achieved by randomly “guessing” class membership, of 1/3=0.33 and *2*) the worst-case classification accuracies, classifying all subjects as belonging to the largest group, which were 13/35=0.37 (all 3 groups) and 13/25=0.52 (asthma vs. COPD).

#### Multiple response permutation procedures.

Apart from being used for classification, the Wasserstein distances were assessed in terms of how significant was the separation between classes. This was achieved by a multiple response permutation procedure (MRPP) (33). The test statistic δ is the overall weighted mean of within-group means of all pairwise distances. It is first calculated under the known true group assignment and then by randomly permuting the group labels. The number of values of δ obtained under the permutation distribution that are larger than the original δ defines the significance probability (*P* value) of this test. All MRPP tests were run with 10^{5} randomly generated permutations.

Additionally, the chance-corrected within-group agreement

#### Software used.

All data analysis was performed in the statistical computing environment R (46) using the *MASS* package (56), the *fractal* package (W. Constantine and D. Percival, unpublished), the *vegan* package (J Oksanen, R Kindt, P Legendre, B O'Hara and MHH Steven, unpublished), and the *ROCR* package (48). Note that the implementation of DFA in *fractal* is faulty and was replaced by a custom-written algorithm that can be obtained from the authors on request. The two matrices of Wasserstein distances (one shown in Fig. 6*A*) are available as data supplements in the online journal.

## RESULTS

### Statistical Results and Predictive Accuracies

The mean values of *Z*rs, *R*rs, and *X*rs are shown in Fig. 3. There were significant differences between COPD patients and asthmatics (*P* = 0.035) or healthy controls (*P* = 0.014) in mean *Z*rs, but not between asthmatics and controls (*P* = 0.61). This increase in mean *Z*rs was reflected in significant decreases in *X*rs for the COPD group (*P* = 0.0055/*P* = 0.0016) and to a lesser degree in increases in *R*rs (*P* = 0.053/0.014). Although some of these differences were highly significant (e.g., mean *X*rs for COPD vs. controls), classification of COPD vs. controls by LDA of mean *X*rs only achieved an accuracy of 0.77 and accuracies of 0.73 for both mean *R*rs and *Z*rs. Discrimination between asthma and COPD was possible with accuracies of 0.72 (mean *X*rs) and 0.64 (mean *R*rs or *Z*rs).

Mean ln*Z*rs and ln*Z*rsSD were considered for completeness [to compare with the results of (11, 44)] and differed significantly between asthma and COPD (*P* = 0.046/*P* = 0.030) and led to similar classification (ln*Z*rs) or slightly worse (ln*Z*rsSD) classification results. No significant differences between asthmatics and controls were detected (*P* = 0.49/*P* = 0.70), consistent with the findings of (11), but COPD showed marginal increases in ln*Z*rs (*P* = 0.017), but not in ln*Z*rs variability (*P* = 0.081). Regarding higher moments, there were significant differences in kurtosis (peakedness) and skewness (asymmetry) of *Z*rs between COPD and the other groups (A: *P* = 0.0037/*P* = 0.040, N: *P* = 0.043/*P* = 0.025). These allowed for an accuracy of 0.68 (kurtosis) and 0.64 (skewness) when distinguishing asthma from COPD.

### Power-Law Analysis

Estimated power-law exponents and thresholds are shown in Fig. 4 for the *Z*var fluctuations. There were significant differences in exponents between COPD and asthma or controls (*P* = 0.018/*P* = 0.023), with slightly smaller exponents α in COPD, indicating that relatively larger fluctuations in COPD occur less often than in asthmatics or healthy controls. The threshold x_{min} was significantly higher in COPD than for the other groups (*P* < 0.008). The latter could be explained by a slightly larger variability in COPD (see Fig. 3*F*) and resulted in classification accuracies of 0.72 (asthma/COPD) and 0.73 (COPD/controls). The logarithm of the extrapolated probability density at *Z*var=1.0 (cmH_{2}O·l^{−1}·s)^{2} showed a significant increase for COPD with respect to the other groups (*P* < 0.002; Fig. 4*D*), and this indeed resulted in classification accuracies of 0.80 (against asthma) and 0.82 (against controls), whereas classification of asthma vs. controls was more or less random with an accuracy of 0.57, in contrast to the earlier findings of Que et al. (45).

The null hypothesis of power-law behavior was accepted for 17/35 subjects, distributed almost evenly among the three groups (Fig. 4*C*). Fitting a three- or two-parameter log-normal distribution to the same data in the tail of the *Z*var distribution resulted in a comparable fit, with an insignificant likelihood ratio in all cases. The truncated power-law distribution generally resulted in the best fit of the tail data, but also with an insignificant increase in likelihood. It can be concluded that both log-normal and power-law distributions are plausible models for the tails (compare Fig. 1*D*).

Regarding the complete set of *Z*var values, the log-normal distribution achieved the largest likelihood. However, this is still not a significant improvement over the power-law distribution, which seems to model the more extreme values better (see Fig. 1*B*).

### Detrended Fluctuation Analysis

Although DFA is to a certain extent robust against the removal of segments (7, 31), it is advisable to analyze only contiguous data. In contrast to the previous section we therefore employed DFA on individual 1-min measurements only, with ∼800–900 data points each. The quasi-periodic nature of the breathing cycle introduces spurious residuals due to the detrending for scales that are smaller than the average breathing period (e.g., 37). In particular, the scaling exponents obtained from DFA1 and DFA2, respectively, differ largely. Above this period, the scaling behavior changes (Fig. 5*A*) and results for DFA1 and DFA2 mostly agree. We chose the values of the more robust DFA1 for the larger scales and averaged this for all 12 measurements of each subject.

There were no significant differences in scaling exponents between groups (*P* > 0.14), compare Fig. 5*B*, although it seems that scaling behavior might be closer to the value 0.5 of Brownian motion for COPD than for asthma and controls.

### Distance-Based Analysis

The Wasserstein distances for the 1+1 dimensional joint probability distributions of *R*rs and *X*rs (both normalized independently) allowed to distinguish asthma and COPD with above chance accuracies. The eigenvalue distribution indicated that the distances can be represented reasonably well in k=2 dimensions (explaining a fraction 0.65 of their variance) with an intermediate misrepresentation error (a fraction of 0.10 of the average distance) more or less uniformly distributed among the points. The group structure in this functional space was significantly clustered (*P* = 0.002), but a within-group agreement A=0.06 suggests that only ∼6% of the variance among distances is explained by group structure. Including more reconstruction dimensions, the cross-validated classification accuracies decreased. LDA in two MDS dimensions classified with accuracy 0.51 in the full contrast and with accuracy 0.76 between asthma/COPD. The best asthma/COPD classification was achieved in just one dimension, leading to an accuracy of 0.80, sensitivity 0.83, and specificity 0.77.

### Nonlinear Analysis

Assuming the *R*rs and *X*rs time series to result from an underlying dynamic system, the proper time lag for delay vector reconstruction was assessed by the decorrelation time of the autocorrelation functions, with mean values of 14 ± 13 SD and 12 ± 9 SD, respectively. Due to the high variability, and since stochastic contributions to the signal might bias these estimates to larger values, the median values of 10 (for *R*rs and *X*rs alike) were chosen, corresponding to 0.625 s as characteristic time scale of the impedance dynamics, i.e., about one-fourth of a breathing cycle. Assessment of FNN suggested an embedding dimension of three to four (FNN *R*rs: relative 3.8 ± 0.6 SD, *X*rs: relative 3.9 ± 0.7 SD) and m=3 was chosen, as balancing the influence of noise seemed more important than improved resolution of the dynamics.

As in the 1+1 dimensional case, we quantified differences between the 3+3 dimensional delay vector distributions of *R*rs (3 delay coordinates) and *X*rs (the other 3 coordinates), normalizing the two to zero mean and unit variance independently. Results are shown in Fig. 6. The eigenvector distribution (Fig. 6*C*) suggests that although two dimensions captured most of the variance of the distances (a fraction of 0.48), quite a few more are needed to represent the distances faithfully. Indeed, for a two-dimensional MDS reconstruction the misrepresentation error was relatively large (Fig. 6*B*, about a fraction of 0.16 of the average distances). The group structure was still significant (*P* = 0.006; Fig. 6*D*), even under a lower within-group agreement A=0.023. The classification accuracies for the full contrast attained their maximum of 0.54 for two dimensions and for the asthma/COPD contrast in five reconstruction dimensions (Fig. 6*E*), which resulted in an accuracy of 0.84, sensitivity 0.83, and specificity 0.85 in five reconstruction dimensions (Fig. 6, *G–H*).

## DISCUSSION

We have attempted to distinguish between asthma, COPD, and healthy controls either by assessing fluctuations and scaling behavior or by robustly comparing probability distributions of the dynamic behavior of *R*rs and *X*rs, implicitly assuming an underlying dynamic system.

### Main Findings

Evidence for the controversial power-law hypothesis (45) was found. That is, the power-law null hypothesis could not be rejected for 17/35 subjects at the 5 percent significance level, and their *Z*var fluctuations were consistent with power-law behavior when this was fitted to the tail of the distributions. However, although there was no evidence against the power-law distribution, the two- or three-parameter log-normal distribution described the tail almost equally well. Without larger time series it is difficult to conclude this issue.

Consistent with earlier findings we did not detect significant changes between power-law exponents for asthmatics vs. controls (*P* > 0.99), but COPD showed significantly different exponents. In contrast to Que et al. (45), we did not detect significant differences in power-law intercepts between asthmatics and controls, although the extrapolated intercepts were significantly larger for COPD. The earlier analysis was done with methods that are now known to be potentially unreliable (60), and these earlier findings should therefore carefully be reconsidered. The final analysis of this data in Que et al. (44) in terms of log-normal distributions is consistent with our results.

Detrended fluctuation analysis did not obtain any significant differences in scaling exponents. Moreover, the scaling exponents were close to 0.5, the value obtained for Brownian motion, although it seems that exponents in COPD might be somewhat closer to 0.5 than for asthma and controls, indicating increased randomness. Due to the quasi-periodic nature of breathing, DFA exhibited two different scaling regimes. For scales lower than the average breathing period the DFA exponent depends strongly on the method of detrending and has to be considered unreliable. For scales above the breathing period, we have considered the exponents to be reliable. At even larger time scales it seems likely that respiratory impedance exhibits yet another crossover into scaling exponents significantly larger than 0.5, since similar phenomena were found in time series of tidal breathing parameters (6). However, to conclude this issue necessitates much longer recordings than presently available.

The distance-based analysis between probability distributions further evidenced that there exist subtle differences in respiratory properties. Since the *R*rs and *X*rs time series were normalized for this analysis, only differences in the shape of the dynamic behavior were thereby quantified. Interestingly, these were sufficiently large to allow robust (cross-validated) classification of 80 percent of subjects in the asthma/COPD contrast, which was better than classification based on mean *Z*rs, ln*Z*rsSD, skewness and kurtosis of *Z*rs, etc., individually. Only the estimated intercept of the power-law behavior of the tail resulted in similar classification between asthma and COPD. This finding confirms our hypothesis that the two diseases differentially affect the within-breath dynamics of respiratory impedance.

Regarding the 3+3 dimensional delay embedding and its Wasserstein distances, these did only improve classification marginally (to 84 percent in the asthma/COPD contrast) with respect to the 1+1 dimensional distributions. In the light of the largely increased noise level (due to the sparseness of delay vectors) this indicates that such delay reconstruction might possibly incorporate additional information that is not present when only using the 1+1 dimensional distributions. However, it seems necessary to reduce the influence of noise considerably before this could be convincingly demonstrated and is left to future studies.

In contrast to the largely successful classification of asthma vs. COPD, predictive classification of asthmatics vs. healthy controls was problematic due to large overlap between those two groups, in both the statistical properties of *Z*rs as well as their dynamic behavior.

### Clinical Implications

The distance-based time series analysis of respiratory impedance led to a correct distinction between patients with asthma and COPD in at least 80 percent of cases, i.e., the forced oscillation technique can capture discriminative aspects of airway disease from recordings during simple, tidal breathing. The differential diagnosis of asthma and COPD can be a challenge in clinical practice (23) as it appears that both diseases can exhibit overlapping pathological and physiological features (16). Part of this overlap may be due to real coexistence of both diseases in some patients, whereas in others the current diagnostic techniques apparently fail to pick up the difference (16). Alternatively, it cannot be excluded that the classical diagnoses of asthma and COPD are not capturing existing and clinically relevant phenotypic differences among patients with obstructive airway diseases. Our data suggest that characterization of patients based on evidence from objective measurements, such as respiratory impedance time series, may become more informative in clinical assessment than traditional diagnostic labels.

Our patients were used as a so-called “training set” (27), thereby being representative of gold-standard patients of either disease. The presently observed discriminative capacity of the dynamic time series analysis is, therefore, promising with regard to differential diagnosis and monitoring of asthma and COPD. The fully noninvasive nature of the measurements, without the requirement of artificial breathing maneuvers, offers great prospect for clinical application in chronic, often elderly patients. However, this still requires validation experiments, in independently recruited patients with an intention-to-diagnose (27), to establish the diagnostic accuracy of the dynamic time series analysis of respiratory impedance in clinical practice.

### Conclusion

Instead of evaluating *Z*rs signals with respect to the mechanical properties of airways, we have attempted a stochastic and nonlinear analysis. The distance analysis showed that there exist subtle differences in these signals that can only be partially attributed to statistical properties, such that the nature of the differential behavior of respiratory impedance is mostly unclear. Self-similar fluctuations were not detected in the signals at this time scale, and the evidence for power-law behavior was not conclusive. The distance-based analysis, however, has proved useful and detected clustering in functional space, indicating functional changes in respiratory impedance that are characteristic with respect to disease. Reverse engineering of these patterns is a possibility, since the interpolation properties of Wasserstein distances (58, ch. 5.1), in combination with nonlinear modeling techniques (22), principally allow to compute characteristic dynamic models for each group of subjects. This would potentially lead to further insights into how the respiratory system is affected in disease and possibly also allow to assess and track changes in airway caliber over the course of time.

## DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the authors.

## ACKNOWLEDGMENTS

M. Muskulus acknowledges support from the Netherlands Foundation for Scientific Research under grant no. 635.100.006. A. M. Slats has been supported by a fellowship from the Netherlands Asthma Foundation (3.2.02.34).

Present address of M. Muskulus: Department of Civil and Transport Engineering IVT-BAT, Norwegian University of Science and Technology NTNU, Lerkendalsbygget, 7491 Trondheim, Norway (e-mail: michael.muskulus{at}ntnu.no).

- Copyright © 2010 the American Physiological Society