Journal of Dental Research
ARTICLES

Efficiency Issues Among Statistical Methods for Demonstrating Efficacy of Caries Prevention

L.A. Mancl1,*, P.P. Hujoel1 and T.A. DeRouen1,2

1 Department of Dental Public Health Sciences and
2 Department of Biostatistics, University of Washington, Box 357475, Seattle, WA 98195-7475;

Correspondence: * corresponding author, llman{at}u.washington.edu

ABSTRACT

Although repeated tooth-surface-specific information is commonly collected during a longitudinal caries clinical trial, traditional methods often make limited use of the repeated measures. Newer methods of analysis, such as methods based on time-to-event and methods for longitudinal or clustered data, have the potential to increase the efficiency of the statistical analysis. We compare a range of analytical methods from the traditional analysis based only on the number of caries onsets to newer methods that incorporate time at risk and surface-specific information, such as Poisson regression methods for clustered data, with respect to the efficiency of treatment comparisons. Under most circumstances, the greatest gain in efficiency associated with time-to-event methods will be due to the ability of subjects to contribute caries onsets to the analysis until they are lost from the study. Incorporating the number of surfaces at risk, the surface time at risk, and surface-specific characteristics will typically produce only a modest gain in efficiency.

Key Words: Poisson regression • survival analysis • correlated data

INTRODUCTION

Traditional methods of analysis of caries clinical trials are based on the number of decayed, filled, and missing teeth or surfaces (DMF), and comparisons between treatments for caries prevention typically are based on the change in DMF between a baseline visit and a final follow-up visit. Typically, subjects who are lost to follow-up before the final follow-up visit are excluded from the analysis, because the DMF change score cannot be computed (Hannigan et al., 2001). The drawbacks to using the DMF change score to estimate caries incidence have been enumerated in the literature, as have potential modifications to the DMF score to improve the estimation of caries occurrence (Hujoel et al., 1994; Beck et al., 1995, 1997; Lawrence et al., 1996; Burt, 1997; Kingman and Selwitz, 1997; Spencer, 1997; Slade and Caplan, 1999). Ideally, the calculation of caries occurrence in a clinical trial should take into account the number of caries onsets, the number of surfaces at risk, and the time that the surface was at risk. This information is not always present in the DMF change scores (Hujoel et al., 1994).

Although repeated surface-specific information is often collected during a longitudinal clinical trial, traditional methods make limited use of these data. Newer methods of analysis, such as methods based on time-to-event and methods for longitudinal or clustered data, have the potential to increase the efficiency and sensitivity of the statistical analysis for detection of treatment effects as compared with traditional methods (DeRouen et al., 1995; Beck et al., 1997; Spencer, 1997; Hannigan et al., 2001). The focus of this paper is on efficiency issues related to the use of (1) the number of caries onsets, (2) the number of surfaces at risk, and (3) the surface time at risk in the analysis of caries clinical trials. For this discussion, the traditional analysis will be considered to be based on the change in the DMF score between a baseline visit and a final follow-up visit, and the change score will measure the number of new caries (i.e., new decayed surfaces or new filled or missing surfaces due to caries) for subjects who are present at both the baseline and final follow-up visit (i.e., caries onsets from subjects who are lost to follow-up before the final follow-up visit are excluded from the analysis). We review the conditions under which methods that incorporate interim caries responses, surface-specific information, or surface time at risk, such as Poisson regression and methods for survival data, will be more efficient, as compared with the traditional analysis, for demonstrating a treatment effect for caries prevention, and discuss whether these conditions are likely to be met in the typical caries clinical trial.

METHODS

The typical clinical caries trial involves randomization of subjects to two or more treatment groups with annual repeated measurements on caries status taken at multiple teeth or surfaces for each subject. A commonly used epidemiological method to compare counts or incidence rates is Poisson regression. Since caries data typically exhibit more variation than expected from the Poisson distribution, Poisson regression methods that allow for extra-variation or overdispersion are recommended to compare caries incidence rates. The outcome measures in a Poisson regression analysis are the number of incident or new caries (i.e., new decayed surfaces or new filled or missing surfaces due to caries) and the surface time at risk. Given that measurements are usually recorded on an annual basis, the actual time of new caries or censoring is not known, but is usually estimated as the midpoint between two follow-up times (Hujoel et al., 1994; Beck et al., 1997; Slade and Caplan, 1999). With the Poisson regression methods, the surface or tooth can be used as the unit of analysis, with the use of generalized estimating equations (Liang and Zeger, 1986) or generalized linear mixed models (Breslow and Clayton, 1993; Littell et al., 1996), or the subject can be treated as the unit of analysis with generalized estimating equations (Hujoel et al., 1994) or log-linear regression (Breslow, 1984). An advantage of the generalized estimating equations implementation of the Poisson regression method is that, for valid inference, only the regression model describing how the caries onset rate depends on the covariates needs to be correctly specified (typically only an indicator for the treatment group is in the model), whereas the caries onsets do not have to exhibit Poisson variance or have a Poisson distribution.
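The midpoint convention described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names, the argument layout, and the optional `next_scheduled` argument are hypothetical, and each surface is assumed to be sound at time 0.

```python
from typing import List, Optional, Tuple


def surface_onset_and_time(
    exam_times: List[float],
    carious: List[bool],
    next_scheduled: Optional[float] = None,
) -> Tuple[int, float]:
    """Return (onset indicator, time at risk in years) for one surface.

    exam_times: increasing follow-up times at which the surface was examined.
    carious: caries status recorded at each of those examinations.
    next_scheduled: time of the next planned visit that was missed, if any
        (hypothetical argument used to place censoring at a midpoint).
    """
    previous = 0.0  # assumption: surface is sound at baseline (time 0)
    for time, status in zip(exam_times, carious):
        if status:
            # Onset is placed at the midpoint of the interval in which
            # caries was first observed.
            return 1, (previous + time) / 2.0
        previous = time
    if next_scheduled is not None:
        # Lost to follow-up: censor midway to the missed visit.
        return 0, (previous + next_scheduled) / 2.0
    # Otherwise censor at the last examination attended.
    return 0, previous


def subject_totals(surfaces: List[Tuple[int, float]]) -> Tuple[int, float]:
    """Sum onsets and surface time at risk over a subject's surfaces."""
    onsets = sum(event for event, _ in surfaces)
    time_at_risk = sum(time for _, time in surfaces)
    return onsets, time_at_risk
```

The subject totals are the two quantities a subject-unit Poisson regression would use: the count of onsets as the response and the logarithm of the surface time at risk as the offset.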

Another class of methods that take into account the surface time at risk are the various methods for survival data that have been developed for non-clustered interval-censored or discrete failure-time data (e.g., the discrete analogue of the Cox proportional hazards model, accelerated failure-time model for interval censoring). The survival time methods use the surface or tooth as the unit of analysis, and treatment differences are estimated by standard survival methods based on an assumption that all observations are independent. Although teeth or surfaces within the same subject usually are correlated, the estimates of the treatment differences are still consistent as long as the model for the failure time or hazard is correctly specified (Wei et al., 1989). However, an empirically based covariance estimator or a covariance estimator based on re-sampling (e.g., jackknife or bootstrap estimator) is used to estimate the standard errors of the estimated treatment differences, and hence, to perform valid hypothesis-testing (Wei et al., 1989; Lipsitz and Parzen, 1996; Hannigan et al., 2001). Both approaches give valid standard error estimates regardless of the true correlation between surfaces or teeth within a subject. Hannigan et al. (2001) demonstrate the use of an accelerated failure-time model for interval-censored time-to-event data based on a log-logistic distribution, using a jackknife method to estimate the standard errors for the analysis of caries data. Another method is a discrete-time version of the familiar Cox proportional hazards model, which can be estimated, based on generalized estimating equations, by fitting a linear model to the complementary log-log transformation of the probability of new caries occurrence (Abbott, 1985).
A special consideration for time-to-event analysis of caries data is that, although the actual time to caries onset and censoring is continuous, the observed time to caries onset is discrete or grouped due to annual follow-up, which makes fitting time-to-event data more complicated. If survival methods do prove to offer a more efficient means for the analysis of caries clinical trials, an area requiring further study is the validity of the assumptions about the censoring distribution of the caries onsets and other assumptions required by the methods for interval-censored failure-time data.
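The reason the complementary log-log transformation pairs naturally with grouped follow-up can be checked numerically. The sketch below assumes proportional hazards within an annual interval, so that the treated-group survival over the interval is the control survival raised to the power of the hazard ratio. The baseline probability and hazard ratio are hypothetical values chosen for illustration, not figures from any trial.

```python
import math


def cloglog(p: float) -> float:
    """Complementary log-log transformation of a probability."""
    return math.log(-math.log(1.0 - p))


def interval_onset_prob(control_prob: float, hazard_ratio: float) -> float:
    """Onset probability in one interval under proportional hazards.

    Survival over the interval in the treated group equals the control
    survival raised to the power of the hazard ratio.
    """
    return 1.0 - (1.0 - control_prob) ** hazard_ratio


control_prob = 0.10   # hypothetical annual onset probability, control group
hazard_ratio = 0.75   # hypothetical treatment effect

treated_prob = interval_onset_prob(control_prob, hazard_ratio)

# On the complementary log-log scale the treatment effect is the log
# hazard ratio, whatever the baseline probability in the interval.
effect = cloglog(treated_prob) - cloglog(control_prob)
print(effect, math.log(hazard_ratio))
```

Under this assumption, the treatment coefficient of a linear model on the complementary log-log scale has the same interpretation as the coefficient in a continuous-time Cox model, even though onsets are observed only at annual visits.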

RESULTS

A potential gain in efficiency of a time-to-event analysis, such as the Poisson regression and survival data methods described above, over a traditional analysis is that caries onsets from subjects or surfaces will contribute to the time-to-event analysis until the time they are lost from the study. The amount of efficiency gained by the time-to-event analysis will depend on the rate and timing of attrition of subjects from the study. Subjects who are lost to follow-up before the first follow-up visit are typically excluded from the time-to-event analysis (Hujoel et al., 1994; Hannigan et al., 2001). Hence, if all attrition occurs before the first follow-up visit, the time-to-event analysis would likely not be more efficient than a traditional analysis. However, if the attrition were relatively constant over the course of a clinical trial, one would expect to gain efficiency by using the partial data from subjects lost from the study. As an illustration of the potential gains in efficiency, Table 1 shows the percentage by which the sample size would need to be increased to take into account a constant annual attrition rate of 2.5% to 10%, based on a traditional analysis that excludes data from subjects who are not present at the final follow-up visit and a time-to-event analysis that assumes a constant treatment effect over time. For example, if the annual attrition rate is 5% and there are 3 annual follow-up visits, this implies a 14.3% loss in the number of subjects, and the sample size would need to be multiplied by 1/0.857 for an increase of 16.6%; whereas the same attrition rate would imply only a 10% loss in the number of caries onsets or amount of surface time at risk, and hence, the sample size for a time-to-event analysis would need to be multiplied only by 1/0.903 for an increase of 10.7%.
Given that caries clinical trials often involve relatively large sample sizes, even a modest percent reduction in the sample size (from 5% to 10%) would probably be a notable efficiency gain.
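The worked example above can be reproduced with a short calculation. The sketch below rests on an assumption that is not spelled out in the text: the fraction of caries onsets retained by a time-to-event analysis is taken to be the average, over the annual visits, of the fraction of subjects still in the study at each visit. That assumption recovers the 0.857 and 0.903 factors quoted for 5% annual attrition over 3 years; the function names are hypothetical.

```python
def retained_at_final_visit(annual_attrition: float, n_visits: int) -> float:
    """Fraction of subjects still present at the final follow-up visit."""
    return (1.0 - annual_attrition) ** n_visits


def average_retained(annual_attrition: float, n_visits: int) -> float:
    """Average fraction of subjects present across the annual visits.

    Assumed here to approximate the fraction of caries onsets (or surface
    time at risk) that a time-to-event analysis can still use.
    """
    fractions = [(1.0 - annual_attrition) ** k for k in range(1, n_visits + 1)]
    return sum(fractions) / n_visits


def percent_increase(retained_fraction: float) -> float:
    """Percent increase in sample size needed to offset the loss."""
    return 100.0 * (1.0 / retained_fraction - 1.0)


traditional = percent_increase(retained_at_final_visit(0.05, 3))
time_to_event = percent_increase(average_retained(0.05, 3))
print(f"traditional: {traditional:.1f}%  time-to-event: {time_to_event:.1f}%")

# Illustrative grid of attrition rates and follow-up lengths.
for rate in (0.025, 0.05, 0.10):
    for years in (2, 3):
        trad = percent_increase(retained_at_final_visit(rate, years))
        tte = percent_increase(average_retained(rate, years))
        print(f"{rate:.3f} {years} {trad:.1f} {tte:.1f}")
```

The gap between the two percentages widens as the attrition rate or the number of follow-up visits increases, which is the efficiency gain attributed to using partial follow-up data.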


Table 1. Percent Increase in Sample Size Necessary to Account for an Annual Attrition Rate of 2.5% to 10% for a 2- or 3-year Follow-up Based on a Traditional Analysis or Time-to-Event Analysis
 
Another potential gain in efficiency of the time-to-event methods is the ability to use the surface or tooth as the unit of analysis, which allows surface-specific information to be taken into account in the analysis (e.g., number of surfaces at risk per subject, correlation between surfaces, surface-level characteristics). The comparison between the efficiency of a subject-unit analysis and surface-unit analysis for the Poisson regression methods with generalized estimating equations is analogous to the comparison of the efficiency of the generalized estimating equations for different specifications of the working correlation or weighting matrix. Specifically, the subject-unit analysis corresponds to the use of an ‘Independence’ working correlation specification, and the surface-unit analysis corresponds to the use of a non-Independence working correlation specification (e.g., an exchangeable working correlation would weight the observations within a subject, assuming a common correlation between surfaces). In general, the efficiency will depend on the number of surfaces per subject, the correlation between surfaces within a subject, what surface and subject characteristics are included in the regression model, and the size of the treatment differences (Mancl and Leroux, 1996). To demonstrate the potential gain in efficiency of a surface-unit analysis that takes into account the number of surfaces at risk per subject and the correlation between surfaces, Table 2 shows the efficiency of a subject-unit vs. a surface-unit analysis for estimating a difference between two treatments, not adjusting for any other surface or subject characteristic, assuming no treatment difference (i.e., worst-case scenario for efficiency of a subject-unit analysis) and a common correlation of 0.01 to 0.2 between surfaces.
In a typical caries clinical trial measuring caries onset, the average correlation between surfaces within the same subject is usually low, 0.05 or less, whereas higher correlation values, 0.2 or higher, might be observed in animal studies involving experimentally induced dental disease (Hujoel et al., 1994). Table 2 shows that when the number of surfaces at risk is the same for all subjects (i.e., the coefficient of variation is zero), there is no loss in efficiency in estimating a treatment difference by using a subject-unit analysis vs. a surface-unit analysis. For a coefficient of variation of the number of surfaces per subject of 0.0 to 0.3 and a small to moderate correlation between surfaces, a subject-unit analysis retains high efficiency. A low coefficient of variation (0.0 to 0.2) would be expected in a caries clinical trial involving young to middle-aged subjects, because a larger coefficient of variation is possible only when there is extreme variation in the number of surfaces per subject. Hence, for a typical caries clinical trial, where there is usually a relatively small variation between the number of surfaces per subject and a weak correlation between surfaces, there will be only a modest gain in efficiency due to taking into account the number of surfaces at risk per subject and correlation between surfaces.
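The dependence of this efficiency on the spread in the number of surfaces and on the correlation can be illustrated with a standard result for exchangeably correlated clusters. The sketch below is not the calculation behind Table 2 and is not claimed to reproduce its entries. It assumes a common variance, a known common correlation `rho`, and estimation of a single group mean; with no treatment difference and the same distribution of surfaces per subject in each arm, the same ratio applies to the treatment difference. The function names and example values are hypothetical.

```python
import statistics
from typing import Sequence


def relative_efficiency(surfaces_per_subject: Sequence[int], rho: float) -> float:
    """Efficiency of an independence-weighted mean relative to the
    optimally weighted (exchangeable) mean.

    Independence weighting gives variance proportional to
        sum(n_i * (1 + (n_i - 1) * rho)) / (sum(n_i)) ** 2,
    and optimal weighting gives variance proportional to
        1 / sum(n_i / (1 + (n_i - 1) * rho)).
    The ratio of the second to the first is returned (at most 1).
    """
    total = sum(surfaces_per_subject)
    inflated = sum(n * (1.0 + (n - 1) * rho) for n in surfaces_per_subject)
    information = sum(n / (1.0 + (n - 1) * rho) for n in surfaces_per_subject)
    return total ** 2 / (inflated * information)


def coefficient_of_variation(surfaces_per_subject: Sequence[int]) -> float:
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(surfaces_per_subject) / statistics.mean(surfaces_per_subject)


equal = [100] * 10          # hypothetical: same number of surfaces for all
unequal = [60, 100, 140]    # hypothetical: moderate spread

print(coefficient_of_variation(equal), relative_efficiency(equal, 0.05))
print(coefficient_of_variation(unequal), relative_efficiency(unequal, 0.05))
```

When every subject contributes the same number of surfaces, the two weightings coincide and the ratio is exactly 1, which matches the zero coefficient-of-variation case described above.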


Table 2. Efficiency of a Subject-unit Analysis Compared with a Surface-unit Analysis for a Treatment Difference
 
A noteworthy advantage of a surface-unit analysis over a subject-unit is the ability to adjust for surface-level (and subject-level) characteristics, if there is imbalance between treatment groups, and for secondary analyses investigating the caries susceptibilities of different tooth surfaces. Also, there is the potential to increase the efficiency of the treatment comparisons by incorporating surface-level characteristics that are predictive of new caries (Grainger et al., 1984; Kingman, 1984). However, it is unlikely that these potential gains would be realized in the planning stages of a caries clinical trial, since they are difficult to quantify, and, in the absence of an imbalance in surface-level characteristics between treatment groups, it is also unlikely that the primary analysis of treatment efficacy would take advantage of the surface-level data.

Another potential gain in efficiency of the time-to-event methods could be due to use of the surface time at risk. Dean and Balshaw (1997) have investigated the efficiency lost by analyzing only the counts (e.g., caries onsets) rather than also incorporating the time at risk (e.g., time at risk for caries onset) into Poisson and overdispersed Poisson regression models for a subject-unit analysis. In the case of overdispersion, only minimal loss of efficiency for treatment differences is shown if the follow-up times are balanced between the treatment groups (worst asymptotic relative efficiency of an extreme case was > 95%). As long as the follow-up times are not extremely imbalanced over the treatment groups (i.e., no one treatment group contains only the smallest or largest follow-up times), the estimates based only on the counts retain very high efficiency. Follow-up times of a randomized caries clinical trial would be expected to be fairly similar between treatment groups, given the relatively short follow-up and small-to-modest treatment effects. A large treatment effect could cause an imbalance in follow-up times, but this would not necessarily imply a loss of efficiency with the use of only the counts, since the efficiency of the count-only-based analysis increases as the treatment difference increases (Dean and Balshaw, 1997). Hence, the gain in efficiency by taking into account the surface time at risk for caries onset will most likely be modest for the typical caries clinical trial. A simplified explanation for these findings, for time-to-event methods that assume a constant treatment effect over time, is that it is the number of events, rather than the time at risk, that determines the efficiency of the analysis.

Given the limited potential for increasing the efficiency of the statistical analysis by incorporating the time at risk for caries onset, newer methods for repeated events modeling that do not use the time at risk (e.g., generalized estimating equations and generalized linear mixed-effect models) could possibly be used for the analysis of caries clinical trials (e.g., analysis based on the rate of change in a DMF score). These methods would have advantages over the traditional methods (with respect to efficiency gains) similar to those of the time-to-event methods. For example, the generalized estimating equations and generalized linear mixed-effect methods do not require subjects to have complete follow-up data, and hence, subjects could contribute partial follow-up data to the analysis.

DISCUSSION

Traditional analysis of caries clinical trials based on a change in the DMF score between a baseline visit and a final follow-up visit can suffer from a lack of interpretability for caries occurrence (Hujoel et al., 1994; Slade and Caplan, 1999). Also, subjects who are lost from the study before the final follow-up visit are typically excluded from the traditional analysis, because the DMF change score cannot be computed. Hence, the traditional analysis does not make efficient use of all caries onsets.

Newer methods of analysis, such as methods based on time-to-event and methods for clustered data, can be used to estimate caries occurrence and demonstrate treatment effects for caries prevention based on standard epidemiological methods to estimate incidence or hazard rates. In contrast to a traditional analysis that excludes data from subjects lost from the study by the final follow-up visit, time-to-event methods allow caries onsets to contribute to the analysis until the subject is lost from the study. Hence, by using all the caries onsets available, the time-to-event analysis can result in notable reductions in the sample size required to demonstrate a treatment effect, and the expected reduction is straightforward to calculate (e.g., Table 1).

In addition, time-to-event methods in which the surface is the unit of analysis can be used to adjust for surface-level characteristics, if there is imbalance between treatment groups, and for secondary analyses investigating the caries susceptibilities of different tooth surfaces. However, the gain in efficiency due to the use of surface-specific information (e.g., number of surfaces at risk per subject, surface-level characteristics) will most likely be small under most circumstances. Further potential drawbacks include more intense monitoring and data collection as well as distributional assumptions.

Given that study attrition appears to have the greatest impact on efficiency, a topic for further research is whether recent methods for handling missing data in longitudinal studies, such as multiple imputation and inverse probability of censoring weighted estimators, can be used to further increase the efficiency of the time-to-event analysis (Robins et al., 1995; Schafer, 1997). As to the availability of software for fitting a time-to-event analysis, the methods using generalized estimating equations and standard survival data methods are available in major statistical software packages, such as SAS (SAS Institute Inc., Cary, NC, USA) and Stata (Stata Statistical Software, College Station, TX, USA).

FOOTNOTES

Presented at the International Consensus Workshop on Caries Clinical Trials, Glasgow, Scotland, January 7–10, 2002.

REFERENCES

  • Abbott RD (1985). Logistic regression in survival analysis. Am J Epidemiol 121:465–471.
  • Beck JD, Lawrence HP, Koch GG (1995). A method for adjusting caries increments for reversals due to examiner misclassification. Community Dent Oral Epidemiol 23:321–330.
  • Beck JD, Lawrence HP, Koch GG (1997). Analytic approaches to longitudinal caries data in adults. Community Dent Oral Epidemiol 25:42–51.
  • Breslow NE (1984). Extra-Poisson variability in log-linear modeling. Appl Statist 33:38–44.
  • Breslow NE, Clayton DG (1993). Approximate inference in generalized linear mixed models. J Am Stat Assoc 88:9–25.
  • Burt BA (1997). How useful are cross-sectional data from surveys of dental caries? Community Dent Oral Epidemiol 25:36–41.
  • Dean CB, Balshaw R (1997). Efficiency lost by analyzing counts rather than event times in Poisson and overdispersed Poisson regression models. J Am Stat Assoc 92:1387–1398.
  • DeRouen TA, Hujoel PP, Mancl LA (1995). Statistical issues in periodontal research. J Dent Res 74:1731–1737.
  • Grainger RM, Lehnhoff RW, Bollmer BW, Zacherl WA (1984). Analysis of covariance in dental caries clinical trials. J Dent Res 63(Spec Iss):766–772.
  • Hannigan A, O’Mullane DM, Barry D, Schäfer F, Roberts AJ (2001). A re-analysis of a caries clinical trial by survival analysis. J Dent Res 80:427–431.
  • Hujoel PP, Isokangas PJ, Tiekso J, Davis S, Lamont RJ, DeRouen TA, et al. (1994). A re-analysis of caries rates in a preventive trial using Poisson regression models. J Dent Res 73:573–579.
  • Kingman A (1984). Stratification methods in caries clinical trials. J Dent Res 63(Spec Iss):773–777.
  • Kingman A, Selwitz RH (1997). Proposed methods for improving the efficiency of the DMFS index in assessing initiation and progression of dental caries. Community Dent Oral Epidemiol 25:60–68.
  • Lawrence HP, Beck JD, Hunt RJ, Koch GG (1996). Adjustment of the M-component of the DMFS index for prevalence studies of older adults. Community Dent Oral Epidemiol 24:322–331.
  • Liang KY, Zeger SL (1986). Longitudinal data analysis using generalized linear models. Biometrika 73:13–22.
  • Lipsitz SR, Parzen M (1996). A jackknife estimator of variance for Cox regression for correlated survival data. Biometrics 52:291–298.
  • Littell RC, Milliken GA, Stroup WW, Wolfinger RD (1996). SAS system for mixed models. Cary, NC: SAS Institute Inc.
  • Mancl LA, Leroux BG (1996). Efficiency of regression estimates for clustered data. Biometrics 52:500–511.
  • Robins JM, Rotnitzky A, Zhao LP (1995). Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. J Am Stat Assoc 90:106–121.
  • Schafer JL (1997). Analysis of incomplete multivariate data. London: Chapman & Hall.
  • Slade GD, Caplan DJ (1999). Methodological issues in longitudinal epidemiologic studies of dental caries. Community Dent Oral Epidemiol 27:236–248.
  • Spencer AJ (1997). Skewed distributions—new outcome measure. Community Dent Oral Epidemiol 25:52–59.
  • Wei LJ, Lin DY, Weissfeld L (1989). Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J Am Stat Assoc 84:1065–1073.

Journal of Dental Research, Vol. 83, No. suppl 1, C95-C98 (2004)
DOI: 10.1177/154405910408301S19





