Efficiency Issues Among Statistical Methods for Demonstrating Efficacy of Caries Prevention
1 Department of Dental Public Health Sciences and Correspondence: * corresponding author, llman{at}u.washington.edu
ABSTRACT Although repeated tooth-surface-specific information is commonly collected during a longitudinal caries clinical trial, traditional methods often make limited use of the repeated measures. Newer methods of analysis, such as methods based on time-to-event and methods for longitudinal or clustered data, have the potential to increase the efficiency of the statistical analysis. We compare a range of analytical methods from the traditional analysis based only on the number of caries onsets to newer methods that incorporate time at risk and surface-specific information, such as Poisson regression methods for clustered data, with respect to the efficiency of treatment comparisons. Under most circumstances, the greatest gain in efficiency associated with time-to-event methods will be due to the ability of subjects to contribute caries onsets to the analysis until they are lost from the study. Incorporating the number of surfaces at risk, the surface time at risk, and surface-specific characteristics will typically produce only a modest gain in efficiency.
Key Words: Poisson regression; survival analysis; correlated data
INTRODUCTION Traditional methods of analysis of caries clinical trials are based on the number of decayed, filled, and missing teeth or surfaces (DMF), and comparisons between treatments for caries prevention typically are based on the change in DMF between a baseline visit and a final follow-up visit. Typically, subjects who are lost to follow-up before the final follow-up visit are excluded from the analysis, because the DMF change score cannot be computed (Hannigan et al., 2001). The drawbacks to using the DMF change score to estimate caries incidence have been enumerated in the literature, as have potential modifications to the DMF score to improve the estimation of caries occurrence (Hujoel et al., 1994; Beck et al., 1995, 1997; Lawrence et al., 1996; Burt, 1997; Kingman and Selwitz, 1997; Spencer, 1997; Slade and Caplan, 1999). Ideally, the calculation of caries occurrence in a clinical trial should take into account the number of caries onsets, the number of surfaces at risk, and the time that the surface was at risk. This information is not always present in the DMF change scores (Hujoel et al., 1994). Although repeated surface-specific information is often collected during a longitudinal clinical trial, traditional methods make limited use of these data. Newer methods of analysis, such as methods based on time-to-event and methods for longitudinal or clustered data, have the potential to increase the efficiency and sensitivity of the statistical analysis for detection of treatment effects as compared with traditional methods (DeRouen et al., 1995; Beck et al., 1997; Spencer, 1997; Hannigan et al., 2001). The focus of this paper is on efficiency issues related to the use of (1) the number of caries onsets, (2) the number of surfaces at risk, and (3) the surface time at risk in the analysis of caries clinical trials.
For this discussion, the traditional analysis will be considered to be based on the change in the DMF score between a baseline visit and a final follow-up visit, and the change score will measure the number of new caries (i.e., new decayed surfaces or new filled or missing surfaces due to caries) for subjects who are present at both the baseline and final follow-up visit (i.e., caries onsets from subjects who are lost to follow-up before the final follow-up visit are excluded from the analysis). We review the conditions under which methods that incorporate interim caries responses, surface-specific information, or surface time at risk, such as Poisson regression and methods for survival data, will be more efficient, as compared with the traditional analysis, for demonstrating a treatment effect for caries prevention, and discuss whether these conditions are likely to be met in the typical caries clinical trial.
METHODS The typical clinical caries trial involves randomization of subjects to two or more treatment groups with annual repeated measurements on caries status taken at multiple teeth or surfaces for each subject. A commonly used epidemiological method to compare counts or incidence rates is Poisson regression. Since caries data typically exhibit more variation than expected from the Poisson distribution, Poisson regression methods that allow for extra-variation or overdispersion are recommended to compare caries incidence rates. The outcome in a Poisson regression analysis is the number of incident or new caries (i.e., new decayed surfaces or new filled or missing surfaces due to caries), modeled relative to the surface time at risk. Given that measurements are usually recorded on an annual basis, the actual time of new caries or censoring is not known, but is usually estimated as the midpoint between two follow-up times (Hujoel et al., 1994; Beck et al., 1997; Slade and Caplan, 1999).
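The midpoint convention described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function names, the annual exam schedule, and the choice to censor a subject who drops out at the last attended exam are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def surface_time_at_risk(visit_years: Sequence[float],
                         caries_seen: Sequence[bool]) -> Tuple[float, bool]:
    """Return (years at risk, onset observed) for one tooth surface.

    visit_years: scheduled exam times in years, baseline first.
    caries_seen: caries status at each attended exam, baseline first.
                 It may be shorter than visit_years if the subject dropped out.
    """
    if caries_seen[0]:
        raise ValueError("surface must be sound at baseline to be at risk")
    for k in range(1, len(caries_seen)):
        if caries_seen[k]:
            # Onset occurred somewhere between exams k-1 and k:
            # impute the midpoint of that interval.
            return (visit_years[k - 1] + visit_years[k]) / 2.0, True
    # No onset observed (assumption): censor at the last attended exam.
    return visit_years[len(caries_seen) - 1], False


def incidence_rate(surfaces: List[Tuple[float, bool]]) -> float:
    """Crude incidence rate: caries onsets per surface-year at risk."""
    onsets = sum(1 for _, onset in surfaces if onset)
    years = sum(years for years, _ in surfaces)
    return onsets / years


# Hypothetical data: three surfaces examined at baseline and years 1 to 3.
visits = [0.0, 1.0, 2.0, 3.0]
surfaces = [
    surface_time_at_risk(visits, [False, False, True]),          # onset in year 2
    surface_time_at_risk(visits, [False, False, False, False]),  # sound throughout
    surface_time_at_risk(visits, [False, False]),                # lost after year 1
]
rate = incidence_rate(surfaces)  # 1 onset over 5.5 surface-years
```

The pair returned for each surface is the kind of count-plus-exposure record that a Poisson regression for rates would take as input, with the years at risk entering as the exposure.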
With the Poisson regression methods, the surface or tooth can be used as the unit of analysis, with the use of generalized estimating equations (Liang and Zeger, 1986) or generalized linear mixed models (Breslow and Clayton, 1993; Littell et al., 1996), or the subject can be treated as the unit of analysis with generalized estimating equations (Hujoel et al., 1994) or log-linear regression (Breslow, 1984). An advantage of the generalized estimating equations implementation of the Poisson regression method is that, for valid inference, only the regression model describing how the caries onset rate depends on the covariates included in the model needs to be correctly specified (typically only an indicator for the treatment group is in the model), whereas the caries onsets do not have to exhibit Poisson variance or have a Poisson distribution. Another class of methods that takes into account the surface time at risk consists of the various methods for survival data that have been developed for non-clustered interval-censored or discrete failure-time data (e.g., the discrete analogue of the Cox proportional hazards model, accelerated failure-time model for interval censoring). The survival time methods use the surface or tooth as the unit of analysis, and treatment differences are estimated by standard survival methods based on an assumption that all observations are independent. Although teeth or surfaces within the same subject usually are correlated, the estimates of the treatment differences are still consistent as long as the model for the failure time or hazard is correctly specified (Wei et al., 1989). However, an empirically based covariance estimator or a covariance estimator based on re-sampling (e.g., jackknife or bootstrap estimator) is used to estimate the standard errors of the estimated treatment differences, and hence, to perform valid hypothesis-testing (Wei et al., 1989; Lipsitz and Parzen, 1996; Hannigan et al., 2001).
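To make the two kinds of standard error concrete, the following Python sketch applies them to the simplest case: a Poisson rate comparison between two treatment groups, with the subject as the cluster. It is a hedged illustration rather than the method of any cited paper. It assumes a model with only a treatment indicator and an independence working correlation, in which case the estimated rate in each group is total onsets divided by total time at risk; the data and function names are hypothetical.

```python
import math
from typing import Callable, List, Tuple

# One record per subject: (caries onsets, surface-years at risk).
Subject = Tuple[int, float]


def log_rate(group: List[Subject]) -> float:
    """Log of the pooled rate; requires at least one onset in the group."""
    return math.log(sum(y for y, _ in group) / sum(t for _, t in group))


def sandwich_var_log_rate(group: List[Subject]) -> float:
    """Empirical (sandwich) variance of the log rate, clustering on subject."""
    total_onsets = sum(y for y, _ in group)
    rate = total_onsets / sum(t for _, t in group)
    # Squared subject-level residuals; no Poisson variance is assumed.
    meat = sum((y - rate * t) ** 2 for y, t in group)
    return meat / total_onsets ** 2


def jackknife_var_log_rate(group: List[Subject]) -> float:
    """Leave-one-subject-out jackknife variance of the log rate."""
    n = len(group)
    leave_one_out = [log_rate(group[:i] + group[i + 1:]) for i in range(n)]
    mean = sum(leave_one_out) / n
    return (n - 1) / n * sum((v - mean) ** 2 for v in leave_one_out)


def log_rate_ratio(treated: List[Subject], control: List[Subject],
                   var_fn: Callable[[List[Subject]], float]) -> Tuple[float, float]:
    """Return (log rate ratio, standard error) for treated versus control."""
    estimate = log_rate(treated) - log_rate(control)
    se = math.sqrt(var_fn(treated) + var_fn(control))
    return estimate, se


# Hypothetical subject-level data, for illustration only.
treated = [(1, 40.0), (0, 38.0), (2, 41.0), (0, 20.0), (1, 39.0)]
control = [(3, 40.0), (1, 37.0), (4, 42.0), (0, 19.0), (2, 40.0)]

est_sandwich, se_sandwich = log_rate_ratio(treated, control, sandwich_var_log_rate)
est_jackknife, se_jackknife = log_rate_ratio(treated, control, jackknife_var_log_rate)
```

Both calls return the same point estimate and differ only in the standard error. Neither variance function relies on the onsets having Poisson variance or on the surfaces within a subject being independent.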
Both approaches give valid standard error estimates regardless of the true correlation between surfaces or teeth within a subject. Hannigan et al. (2001) demonstrate the use of an accelerated failure-time model for interval-censored time-to-event data based on a log-logistic distribution using a jackknife method to estimate the standard errors for the analysis of caries data. Another method is a discrete-time version of the familiar Cox proportional hazards model, which can be estimated with generalized estimating equations by fitting a linear model to the complementary log-log transformation of the probability of new caries occurrence (Abbott, 1985). A special consideration for time-to-event analysis of caries data is that, although the actual time to caries onset and censoring is continuous, the observed time to caries onset is discrete or grouped due to annual follow-up, which makes fitting time-to-event data more complicated. If survival methods do prove to offer a more efficient means for the analysis of caries clinical trials, an area requiring further study is the validity of the assumptions about the censoring distribution of the caries onsets and other assumptions required by the methods for interval-censored failure-time data.
RESULTS
A potential gain in efficiency of a time-to-event analysis, such as the Poisson regression and survival data methods described above, over a traditional analysis is that caries onsets from subjects or surfaces will contribute to the time-to-event analysis until the time they are lost from the study. The amount of efficiency gained by the time-to-event analysis will depend on the rate and timing of attrition of subjects from the study. Subjects who are lost to follow-up before the first follow-up visit are typically excluded from the time-to-event analysis (Hujoel et al., 1994; Hannigan et al., 2001). Hence, if all attrition occurs before the first follow-up visit, the time-to-event analysis would likely not be more efficient than a traditional analysis. However, if the attrition were relatively constant over the course of a clinical trial, one would expect to gain efficiency by using the partial data from subjects lost from the study. As an illustration of the potential gains in efficiency, Table 1
Another potential gain in efficiency of the time-to-event methods is the ability to use the surface or tooth as the unit of analysis, which allows surface-specific information to be taken into account in the analysis (e.g., number of surfaces at risk per subject, correlation between surfaces, surface-level characteristics). The comparison between the efficiency of a subject-unit analysis and surface-unit analysis for the Poisson regression methods with generalized estimating equations is analogous to the comparison of the efficiency of the generalized estimating equations for different specifications of the working correlation or weighting matrix. Specifically, the subject-unit analysis corresponds to the use of an Independence working correlation specification, and the surface-unit analysis corresponds to the use of a non-Independence working correlation specification (e.g., an exchangeable working correlation would weight the observations within a subject, assuming a common correlation between surfaces). In general, the efficiency will depend on the number of surfaces per subject, the correlation between surfaces within a subject, what surface and subject characteristics are included in the regression model, and the size of the treatment differences (Mancl and Leroux, 1996). To demonstrate the potential gain in efficiency of a surface-unit analysis that takes into account the number of surfaces at risk per subject and the correlation between surfaces, Table 2
A noteworthy advantage of a surface-unit analysis over a subject-unit analysis is the ability to adjust for surface-level (and subject-level) characteristics, if there is imbalance between treatment groups, and for secondary analyses investigating the caries susceptibilities of different tooth surfaces. Also, there is the potential to increase the efficiency of the treatment comparisons by incorporating surface-level characteristics that are predictive of new caries (Grainger et al., 1984; Kingman, 1984). However, it is unlikely that these potential gains would be realized in the planning stages of a caries clinical trial, since they are difficult to quantify, and, in the absence of an imbalance in surface-level characteristics between treatment groups, it is also unlikely that the primary analysis of treatment efficacy would take advantage of the surface-level data. Another potential gain in efficiency of the time-to-event methods could be due to use of the surface time at risk. Dean and Balshaw (1997) have investigated the efficiency lost by analyzing only the counts (e.g., caries onsets) rather than also incorporating the time at risk (e.g., time at risk for caries onset) into Poisson and overdispersed Poisson regression models for a subject-unit analysis. In the case of overdispersion, only minimal loss of efficiency for treatment differences is shown if the follow-up times are balanced between the treatment groups (worst asymptotic relative efficiency of an extreme case was > 95%). As long as the follow-up times are not extremely imbalanced over the treatment groups (i.e., no one treatment group contains only the smallest or largest follow-up times), the estimates based only on the counts retain very high efficiency. Follow-up times of a randomized caries clinical trial would be expected to be fairly similar between treatment groups, given the relatively short follow-up and small-to-modest treatment effects.
A large treatment effect could cause an imbalance in follow-up times, but this would not necessarily imply a loss of efficiency with the use of only the counts, since the efficiency of the count-only-based analysis increases as the treatment difference increases (Dean and Balshaw, 1997). Hence, the gain in efficiency by taking into account the surface time at risk for caries onset will most likely be modest for the typical caries clinical trial. A simplified explanation for these findings, for time-to-event methods that assume a constant treatment effect over time, is that it is the number of events that determines the efficiency of the analysis rather than the time at risk. Given the limited potential for increasing the efficiency of the statistical analysis by incorporating the time-at-risk for caries onset, newer methods for repeated events modeling that do not use the time-at-risk (e.g., generalized estimating equations and generalized linear mixed-effect models) could possibly be used for the analysis of caries clinical trials (e.g., analysis based on the rate of change in a DMF score). These methods would have advantages over the traditional methods (with respect to efficiency gains) similar to those of the time-to-event methods. For example, the generalized estimating equations and generalized linear mixed-effect methods do not require subjects to have complete follow-up data, and hence, subjects could contribute partial follow-up data to the analysis.
DISCUSSION Traditional analysis of caries clinical trials based on a change in the DMF score between a baseline visit and a final follow-up visit can suffer from a lack of interpretability for caries occurrence (Hujoel et al., 1994; Slade, 1999). Also, subjects who are lost from the study before the final follow-up visit are typically excluded from the traditional analysis, because the DMF change score cannot be computed. Hence, the traditional analysis does not make efficient use of all caries onsets.
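A rough sense of how many caries onsets the traditional analysis discards can be obtained from a back-of-the-envelope calculation, sketched here in Python. This is not a reconstruction of the paper's tables. It assumes annual exams, a constant probability of a subject being retained from one exam to the next, dropout unrelated to caries, and a constant onset rate, so that expected onsets are proportional to expected years of observed follow-up. All names and numbers are hypothetical.

```python
def expected_years_all_available(retention: float, n_years: int) -> float:
    """Expected follow-up years per randomized subject when every subject
    contributes until the last attended annual exam."""
    # Year k is observed only if the subject attends exam k.
    return sum(retention ** k for k in range(1, n_years + 1))


def expected_years_completers_only(retention: float, n_years: int) -> float:
    """Expected follow-up years per randomized subject when only subjects
    who attend the final exam are analyzed."""
    return n_years * retention ** n_years


def usable_onset_ratio(retention: float, n_years: int) -> float:
    """Ratio of expected onsets used by an all-available-data analysis
    to those used by a completers-only analysis."""
    return (expected_years_all_available(retention, n_years)
            / expected_years_completers_only(retention, n_years))


# Hypothetical 3-year trial with 90% of subjects retained each year.
ratio = usable_onset_ratio(0.9, 3)  # roughly 1.12
```

Under these assumptions the ratio equals 1 when there is no attrition and grows as retention falls or the trial lengthens, which is the mechanism behind the efficiency gain attributed to time-to-event methods.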
Newer methods of analysis, such as methods based on time-to-event and methods for clustered data, can be used to estimate caries occurrence and demonstrate treatment effects for caries prevention based on standard epidemiological methods to estimate incidence or hazard rates. In contrast to a traditional analysis that excludes data from subjects lost from the study by the final follow-up visit, time-to-event methods allow caries onsets to contribute to the analysis until the subject is lost from the study. Hence, by using all the caries onsets available, the time-to-event analysis can result in notable reductions in the sample size required to demonstrate a treatment effect, and the expected reduction is straightforward to calculate (e.g., Table 1). In addition, time-to-event methods in which the surface is the unit of analysis can be used to adjust for surface-level characteristics, if there is imbalance between treatment groups, and for secondary analyses investigating the caries susceptibilities of different tooth surfaces. However, the gain in efficiency due to the use of surface-specific information (e.g., number of surfaces at risk per subject, surface-level characteristics) will most likely be small under most circumstances. Further potential drawbacks include more intense monitoring and data collection as well as distributional assumptions. Given that study attrition appears to have the greatest impact on efficiency, a topic for further research is whether recent methods for handling missing data in longitudinal studies, such as multiple imputation and inverse probability of censoring weighted estimators, can be used to further increase the efficiency of the time-to-event analysis (Robins et al., 1995; Schafer, 1997).
As to the availability of software for fitting a time-to-event analysis, the methods using generalized estimating equations and standard survival data methods are available in major statistical software packages, such as SAS (SAS Institute Inc., Cary, NC, USA) and Stata (Stata Statistical Software, College Station, TX, USA).
FOOTNOTES Presented at the International Consensus Workshop on Caries Clinical Trials, Glasgow, Scotland, January 7–10, 2002.
Journal of Dental Research, Vol. 83, No. suppl 1, C95-C98 (2004)